Visible to the public Biblio

Filters: Keyword is binary code  [Clear All Filters]
2022-08-02
Karthikeyan, P., Anandaraj, S.P., Vignesh, R., Poornima, S..  2021.  Review on Trustworthy Analysis in binary code. 2021 7th International Conference on Advanced Computing and Communication Systems (ICACCS). 1:1386—1389.
The software industry is dominating many are like health care, finance, agriculture and entertainment. Software security has become an essential issue-outsider libraries, which assume a significant part in programming. The finding weaknesses in the binary code is a significant issue that presently cannot seem to be handled, as showed by numerous weaknesses wrote about an everyday schedule. Software seller sells the software to the client if the client wants to check the software's vulnerability it is a cumbersome task. Presently many deep learning-based methods also introduced to find the security weakness in the binary code. This paper present the merits and demerits of binary code analysis used by a different method.
2022-03-10
Zhang, Zhongtang, Liu, Shengli, Yang, Qichao, Guo, Shichen.  2021.  Semantic Understanding of Source and Binary Code based on Natural Language Processing. 2021 IEEE 4th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC). 4:2010—2016.
With the development of open source projects, a large number of open source codes will be reused in binary software, and bugs in source codes will also be introduced into binary codes. In order to detect the reused open source codes in binary codes, it is sometimes necessary to compare and analyze the similarity between source codes and binary codes. One of the main challenge is that the compilation process can generate different binary code representations for the same source code, such as different compiler versions, compilation optimization options and target architectures, which greatly increases the difficulty of semantic similarity detection between source code and binary code. In order to solve the influence of the compilation process on the comparison of semantic similarity of codes, this paper transforms the source code and binary code into LLVM intermediate representation (LLVM IR), which is a universal intermediate representation independent of source code and binary code. We carry out semantic feature extraction and embedding training on LLVM IR based on natural language processing model. Experimental results show that LLVM IR eliminates the influence of compilation on the syntax differences between source code and binary code, and the semantic features of code are well represented and preserved.
2021-03-04
Sun, H., Liu, L., Feng, L., Gu, Y. X..  2014.  Introducing Code Assets of a New White-Box Security Modeling Language. 2014 IEEE 38th International Computer Software and Applications Conference Workshops. :116—121.

This paper argues about a new conceptual modeling language for the White-Box (WB) security analysis. In the WB security domain, an attacker may have access to the inner structure of an application or even the entire binary code. It becomes pretty easy for attackers to inspect, reverse engineer, and tamper the application with the information they steal. The basis of this paper is the 14 patterns developed by a leading provider of software protection technologies and solutions. We provide a part of a new modeling language named i-WBS (White-Box Security) to describe problems of WB security better. The essence of White-Box security problem is code security. We made the new modeling language focus on code more than ever before. In this way, developers who are not security experts can easily understand what they need to really protect.

2021-02-01
Calhoun, C. S., Reinhart, J., Alarcon, G. A., Capiola, A..  2020.  Establishing Trust in Binary Analysis in Software Development and Applications. 2020 IEEE International Conference on Human-Machine Systems (ICHMS). :1–4.
The current exploratory study examined software programmer trust in binary analysis techniques used to evaluate and understand binary code components. Experienced software developers participated in knowledge elicitations to identify factors affecting trust in tools and methods used for understanding binary code behavior and minimizing potential security vulnerabilities. Developer perceptions of trust in those tools to assess implementation risk in binary components were captured across a variety of application contexts. The software developers reported source security and vulnerability reports provided the best insight and awareness of potential issues or shortcomings in binary code. Further, applications where the potential impact to systems and data loss is high require relying on more than one type of analysis to ensure the binary component is sound. The findings suggest binary analysis is viable for identifying issues and potential vulnerabilities as part of a comprehensive solution for understanding binary code behavior and security vulnerabilities, but relying simply on binary analysis tools and binary release metadata appears insufficient to ensure a secure solution.
2020-02-17
Letychevskyi, Oleksandr, Peschanenko, Volodymyr, Radchenko, Viktor, Hryniuk, Yaroslav, Yakovlev, Viktor.  2019.  Algebraic Patterns of Vulnerabilities in Binary Code. 2019 10th International Conference on Dependable Systems, Services and Technologies (DESSERT). :70–73.
This paper presents an algebraic approach for formalizing and detecting vulnerabilities in binary code. It uses behaviour algebra equations for creating patterns of vulnerabilities and algebraic matching methods for vulnerability detection. Algebraic matching is based on symbolic modelling. This paper considers a known vulnerability, buffer overflow, as an example to demonstrate an algebraic approach for pattern creation.
Letychevskyi, Oleksandr.  2019.  Two-Level Algebraic Method for Detection of Vulnerabilities in Binary Code. 2019 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS). 2:1074–1077.
This study introduces formal methods for detection of vulnerabilities in binary code. It considers the transformation of binary code into behavior algebra expressions and formalization of vulnerabilities. The detection method has two levels: behavior matching and symbolic execution with vulnerability pattern matching. This enables more efficient performance.
2018-06-11
Deng, H., Xie, H., Ma, W., Mao, Z., Zhou, C..  2017.  Double-bit quantization and weighting for nearest neighbor search. 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :1717–1721.

Binary embedding is an effective way for nearest neighbor (NN) search as binary code is storage efficient and fast to compute. It tries to convert real-value signatures into binary codes while preserving similarity of the original data. However, it greatly decreases the discriminability of original signatures due to the huge loss of information. In this paper, we propose a novel method double-bit quantization and weighting (DBQW) to solve the problem by mapping each dimension to double-bit binary code and assigning different weights according to their spatial relationship. The proposed method is applicable to a wide variety of embedding techniques, such as SH, PCA-ITQ and PCA-RR. Experimental comparisons on two datasets show that DBQW for NN search can achieve remarkable improvements in query accuracy compared to original binary embedding methods.

2018-06-07
Xu, Xiaojun, Liu, Chang, Feng, Qian, Yin, Heng, Song, Le, Song, Dawn.  2017.  Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection. Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. :363–376.

The problem of cross-platform binary code similarity detection aims at detecting whether two binary functions coming from different platforms are similar or not. It has many security applications, including plagiarism detection, malware detection, vulnerability search, etc. Existing approaches rely on approximate graph-matching algorithms, which are inevitably slow and sometimes inaccurate, and hard to adapt to a new task. To address these issues, in this work, we propose a novel neural network-based approach to compute the embedding, i.e., a numeric vector, based on the control flow graph of each binary function, then the similarity detection can be done efficiently by measuring the distance between the embeddings for two functions. We implement a prototype called Gemini. Our extensive evaluation shows that Gemini outperforms the state-of-the-art approaches by large margins with respect to similarity detection accuracy. Further, Gemini can speed up prior art's embedding generation time by 3 to 4 orders of magnitude and reduce the required training time from more than 1 week down to 30 minutes to 10 hours. Our real world case studies demonstrate that Gemini can identify significantly more vulnerable firmware images than the state-of-the-art, i.e., Genius. Our research showcases a successful application of deep learning on computer security problems.

2018-02-06
Detken, K. O., Jahnke, M., Rix, T., Rein, A..  2017.  Software-Design for Internal Security Checks with Dynamic Integrity Measurement (DIM). 2017 9th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS). 1:367–373.

Most security software tools try to detect malicious components by cryptographic hashes, signatures or based on their behavior. The former, is a widely adopted approach based on Integrity Measurement Architecture (IMA) enabling appraisal and attestation of system components. The latter, however, may induce a very long time until misbehavior of a component leads to a successful detection. Another approach is a Dynamic Runtime Attestation (DRA) based on the comparison of binary code loaded in the memory and well-known references. Since DRA is a complex approach, involving multiple related components and often complex attestation strategies, a flexible and extensible architecture is needed. In a cooperation project an architecture was designed and a Proof of Concept (PoC) successfully developed and evaluated. To achieve needed flexibility and extensibility, the implementation facilitates central components providing attestation strategies (guidelines). These guidelines define and implement the necessary steps for all relevant attestation operations, i.e. measurement, reference generation and verification.

2018-01-10
Shen, Fumin, Gao, Xin, Liu, Li, Yang, Yang, Shen, Heng Tao.  2017.  Deep Asymmetric Pairwise Hashing. Proceedings of the 2017 ACM on Multimedia Conference. :1522–1530.
Recently, deep neural networks based hashing methods have greatly improved the multimedia retrieval performance by simultaneously learning feature representations and binary hash functions. Inspired by the latest advance in the asymmetric hashing scheme, in this work, we propose a novel Deep Asymmetric Pairwise Hashing approach (DAPH) for supervised hashing. The core idea is that two deep convolutional models are jointly trained such that their output codes for a pair of images can well reveal the similarity indicated by their semantic labels. A pairwise loss is elaborately designed to preserve the pairwise similarities between images as well as incorporating the independence and balance hash code learning criteria. By taking advantage of the flexibility of asymmetric hash functions, we devise an efficient alternating algorithm to optimize the asymmetric deep hash functions and high-quality binary code jointly. Experiments on three image benchmarks show that DAPH achieves the state-of-the-art performance on large-scale image retrieval.