Visible to the public Biblio

Filters: Keyword is code obfuscation techniques  [Clear All Filters]
2020-08-10
Kim, Byoungchul, Jung, Jaemin, Han, Sangchul, Jeon, Soyeon, Cho, Seong-je, Choi, Jongmoo.  2019.  A New Technique for Detecting Android App Clones Using Implicit Intent and Method Information. 2019 Eleventh International Conference on Ubiquitous and Future Networks (ICUFN). :478–483.
Detecting repackaged apps is one of the important issues in the Android ecosystem. Many attackers usually reverse engineer a legitimate app, modify or embed malicious codes into the app, repackage and distribute it in the online markets. They also employ code obfuscation techniques to hide app cloning or repackaging. In this paper, we propose a new technique for detecting repackaged Android apps, which is robust to code obfuscation. The technique analyzes the similarity of Android apps based on the method call information of component classes that receive implicit intents. We developed a tool Calldroid that implemented the proposed technique, and evaluated it on apps transformed using well-known obfuscators. The evaluation results showed that the proposed technique can effectively detect repackaged apps.
2020-02-10
Ding, Steven H. H., Fung, Benjamin C. M., Charland, Philippe.  2019.  Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization. 2019 IEEE Symposium on Security and Privacy (SP). :472–489.

Reverse engineering is a manually intensive but necessary technique for understanding the inner workings of new malware, finding vulnerabilities in existing systems, and detecting patent infringements in released software. An assembly clone search engine facilitates the work of reverse engineers by identifying those duplicated or known parts. However, it is challenging to design a robust clone search engine, since there exist various compiler optimization options and code obfuscation techniques that make logically similar assembly functions appear to be very different. A practical clone search engine relies on a robust vector representation of assembly code. However, the existing clone search approaches, which rely on a manual feature engineering process to form a feature vector for an assembly function, fail to consider the relationships between features and identify those unique patterns that can statistically distinguish assembly functions. To address this problem, we propose to jointly learn the lexical semantic relationships and the vector representation of assembly functions based on assembly code. We have developed an assembly code representation learning model \textbackslashemphAsm2Vec. It only needs assembly code as input and does not require any prior knowledge such as the correct mapping between assembly functions. It can find and incorporate rich semantic relationships among tokens appearing in assembly code. We conduct extensive experiments and benchmark the learning model with state-of-the-art static and dynamic clone search approaches. We show that the learned representation is more robust and significantly outperforms existing methods against changes introduced by obfuscation and optimizations.