Mining Heterogeneous Graph for Patterns and Anomalies
Abstract
Knowledge discovery from disparate data sources can yield a better understanding of the real world. One way to discover patterns and anomalies is to represent the data as a graph. However, a full understanding of the patterns and anomalies associated with a person, place, or activity cannot be realized through a single graph. For instance, in social media, one can discover interesting patterns in an individual's behavior through a single account, but better insight into their overall behavior comes from examining all of their social media actions simultaneously. Graphs are a logical choice for representing such data. In this project we will investigate a novel framework capable of discovering patterns in multiple graphs. Our objective is to show not only that known patterns and anomalies in individual sources can still be discovered efficiently, but also that new patterns and anomalies spanning multiple data sources can be identified. To evaluate our methods, we will use real-world data collected over several months from multiple sources, consisting of top daily news stories and their associated tweets.