Calculating Binary-Address Distance Between Functions: A New Approach to Rare Itemset Mining


  • Austin Tice


Programs commonly follow many implicit programming rules, most of which are not explicitly defined by developers. For a security analyst, it is vital to discover and understand these implicit rules. Previous work on this topic has dealt specifically with itemsets that have arbitrarily large support, which yields fewer false positives. However, in many cases it is trivial to determine implicit rules for itemsets with high support. In existing tools, an itemset of low support is often judged not actionable and is therefore disregarded. This paper proposes a new, automated method for finding implicit programming rules with low support, namely rare itemsets, by adding a heuristic that computes the binary-address distance between functions of interest. The motivation for this research is the expectation that it will increase the speed at which reverse engineers and vulnerability researchers can find software vulnerabilities in programs.





Computer Science