Mining Vehicle CAN Logs for Relationships Between Message Sequences
To protect themselves from hackers and competitors, vehicle manufacturers obfuscate the Controller Area Network (CAN) data sent over their vehicles' internal networks. The obfuscation rules differ between the makes and models of today's vehicles. The resulting inability to understand message semantics has become a major inhibitor to developing techniques for vehicle security and to other types of automotive research. A considerable amount of research has sought to decode the obfuscation rules, but until now each researcher has largely had to design their work independently of the work done by their predecessors. Further, to the best of our knowledge, no attempt to derive the semantic meanings of messages has focused on the sequence in which messages occur. In this project, we use various data mining techniques to derive sequence rules from logs of recorded CAN messages. We use the decoding rules for a Tesla Model 3 to understand the relationship between the definitions of CAN messages and the order in which they occur in that vehicle. The models we create can be used in the future to extract general relationships between mapped CAN data and the sequence in which those messages occur, and these relationships can later be used to understand unmapped data. Additionally, using synthetic data, we show how these models can identify anomalies that might indicate intrusion into or malfunction of the vehicle.
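As a rough illustration of what a sequence rule over CAN messages might look like, the sketch below mines simple "ID a is followed by ID b" rules from a log of arbitration IDs and then flags transitions that break them. It is a minimal sketch under stated assumptions, not the method used in this project: the function names, the support and confidence thresholds, and the message IDs are all hypothetical, and the log is synthetic rather than taken from a Tesla Model 3.

```python
from collections import Counter


def mine_transition_rules(ids, min_support=2, min_confidence=0.5):
    """Return {(a, b): confidence} for adjacent-ID pairs a -> b.

    Confidence is the fraction of occurrences of a (as a predecessor)
    that are immediately followed by b. Thresholds are illustrative.
    """
    pair_counts = Counter(zip(ids, ids[1:]))
    prev_counts = Counter(ids[:-1])
    rules = {}
    for (a, b), n in pair_counts.items():
        confidence = n / prev_counts[a]
        if n >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def find_anomalies(ids, rules):
    """Return indices i where ids[i-1] has known rules but
    the transition ids[i-1] -> ids[i] is not one of them."""
    antecedents = {a for a, _ in rules}
    return [
        i
        for i in range(1, len(ids))
        if ids[i - 1] in antecedents and (ids[i - 1], ids[i]) not in rules
    ]


# Synthetic training log: three made-up IDs repeating in a fixed cycle.
training_log = [0x101, 0x102, 0x103] * 4
rules = mine_transition_rules(training_log)

# Synthetic test log with one injected, previously unseen ID (0x7FF).
test_log = [0x101, 0x102, 0x103, 0x101, 0x7FF, 0x102, 0x103]
anomalies = find_anomalies(test_log, rules)
print(anomalies)  # -> [4]
```

Only the transition into the injected message is flagged here; the transition out of it is skipped because the unseen ID has no learned rules. A real log would interleave many periodic messages, so rules over longer windows or per-ID timing would likely be needed.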