Presented at the NSA Science of Security Quarterly Meeting, November 2016.
Given the ever-increasing number of research tools that automatically generate inputs to test Android applications (or simply apps), researchers recently asked the question "Are we there yet?" (in terms of the practicality of the tools). By conducting an empirical study of the various tools, the researchers found that Monkey (the most widely used tool of this category in industrial settings) outperformed all of the research tools in the study. In this paper, we present two significant extensions of that study. First, we conduct the first industrial case study of applying Monkey against WeChat, a popular messenger app with over 762 million monthly active users, and report empirical findings on Monkey's limitations in an industrial setting. Second, we develop a new approach to address major limitations of Monkey and achieve substantial code-coverage improvements over Monkey. We conclude the paper with empirical insights for future enhancements to both Monkey and our approach.