Visible to the public Biblio

Filters: Keyword is simHash  [Clear All Filters]
2021-05-05
Zhang, Yunan, Xu, Aidong Xu, Jiang, Yixin.  2020.  Scalable and Accurate Binary Code Search Method Based on Simhash and Partial Trace. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :818—826.

Binary code search has received much attention recently due to its impactful applications, e.g., plagiarism detection, malware detection and software vulnerability auditing. However, developing an effective binary code search tool is challenging due to the gigantic syntax and structural differences in binaries resulted from different compilers, compiler options and malware family. In this paper, we propose a scalable and accurate binary search engine which performs syntactic matching by combining a set of key techniques to address the challenges above. The key contribution is binary code searching technique which combined function filtering and partial trace method to match the function code relatively quick and accurate. In addition, a simhash and basic information based function filtering is proposed to dramatically reduce the irrelevant target functions. Besides, we introduce a partial trace method for matching the shortlisted function accurately. The experimental results show that our method can find similar functions, even with the presence of program structure distortion, in a scalable manner.

2020-07-06
Ben, Yongming, Han, Yanni, Cai, Ning, An, Wei, Xu, Zhen.  2019.  An Online System Dependency Graph Anomaly Detection based on Extended Weisfeiler-Lehman Kernel. MILCOM 2019 - 2019 IEEE Military Communications Conference (MILCOM). :1–6.
Modern operating systems are typical multitasking systems: Running multiple tasks at the same time. Therefore, a large number of system calls belonging to different processes are invoked at the same time. By associating these invocations, one can construct the system dependency graph. In rapidly evolving system dependency graphs, how to quickly find outliers is an urgent issue for intrusion detection. Clustering analysis based on graph similarity will help solve this problem. In this paper, an extended Weisfeiler-Lehman(WL) kernel is proposed. Firstly, an embedded vector with indefinite dimensions is constructed based on the original dependency graph. Then, the vector is compressed with Simhash to generate a fingerprint. Finally, anomaly detection based on clustering is carried out according to these fingerprints. Our scheme can achieve prominent detection with high efficiency. For validation, we choose StreamSpot, a relevant prior work, to act as benchmark, and use the same data set as it to carry out evaluations. Experiments show that our scheme can achieve the highest detection precision of 98% while maintaining a perfect recall performance. Moreover, both quantitative and visual comparisons demonstrate the outperforming clustering effect of our scheme than StreamSpot.
2020-07-03
Suo, Yucong, Zhang, Chen, Xi, Xiaoyun, Wang, Xinyi, Zou, Zhiqiang.  2019.  Video Data Hierarchical Retrieval via Deep Hash Method. 2019 IEEE 11th International Conference on Communication Software and Networks (ICCSN). :709—714.

Video retrieval technology faces a series of challenges with the tremendous growth in the number of videos. In order to improve the retrieval performance in efficiency and accuracy, a novel deep hash method for video data hierarchical retrieval is proposed in this paper. The approach first uses cluster-based method to extract key frames, which reduces the workload of subsequent work. On the basis of this, high-level semantical features are extracted from VGG16, a widely used deep convolutional neural network (deep CNN) model. Then we utilize a hierarchical retrieval strategy to improve the retrieval performance, roughly can be categorized as coarse search and fine search. In coarse search, we modify simHash to learn hash codes for faster speed, and in fine search, we use the Euclidean distance to achieve higher accuracy. Finally, we compare our approach with other two methods through practical experiments on two videos, and the results demonstrate that our approach has better retrieval effect.