Biblio

Found 141 results

Filters: Keyword is Libraries
2021-05-20
Usher, Will, Pascucci, Valerio.  2020.  Interactive Visualization of Terascale Data in the Browser: Fact or Fiction? 2020 IEEE 10th Symposium on Large Data Analysis and Visualization (LDAV). :27–36.

Information visualization applications have become ubiquitous, in no small part thanks to the ease of wide distribution and deployment to users enabled by the web browser. Scientific visualization applications, relying on native code libraries and parallel processing, have been less suited to such widespread distribution, as browsers do not provide the required libraries or compute capabilities. In this paper, we revisit this gap in visualization technologies and explore how new web technologies, WebAssembly and WebGPU, can be used to deploy powerful visualization solutions for large-scale scientific data in the browser. In particular, we evaluate the programming effort required to bring scientific visualization applications to the browser through these technologies and assess their competitiveness against classic native solutions. As a main example, we present a new GPU-driven isosurface extraction method for block-compressed data sets, that is suitable for interactive isosurface computation on large volumes in resource-constrained environments, such as the browser. We conclude that web browsers are on the verge of becoming a competitive platform for even the most demanding scientific visualization tasks, such as interactive visualization of isosurfaces from a 1TB DNS simulation. We call on researchers and developers to consider investing in a community software stack to ease use of these upcoming browser features to bring accessible scientific visualization to the browser.

2021-05-03
Mishra, Shachee, Polychronakis, Michalis.  2020.  Saffire: Context-sensitive Function Specialization against Code Reuse Attacks. 2020 IEEE European Symposium on Security and Privacy (EuroS&P). :17–33.
The sophistication and complexity of recent exploitation techniques, which rely on memory disclosure and whole-function reuse to bypass address space layout randomization and control flow integrity, is indicative of the effect that the combination of exploit mitigations has in challenging the construction of reliable exploits. In addition to software diversification and control flow enforcement, recent efforts have focused on the complementary approach of code and API specialization to restrict further the critical operations that an attacker can perform as part of a code reuse exploit. In this paper we propose Saffire, a compiler-level defense against code reuse attacks. For each calling context of a critical function, Saffire creates a specialized and hardened replica of the function with a restricted interface that can accommodate only that particular invocation. This is achieved by applying static argument binding, to eliminate arguments with static values and concretize them within the function body, and dynamic argument binding, which applies a narrow-scope form of data flow integrity to restrict the acceptable values of arguments that cannot be statically derived. We have implemented Saffire on top of LLVM, and applied it to a set of 11 applications, including Nginx, Firefox, and Chrome. The results of our experimental evaluation with a set of 17 real-world ROP exploits and three whole-function reuse exploits demonstrate the effectiveness of Saffire in preventing these attacks while incurring a negligible runtime overhead.
2021-03-29
Normatov, S., Rakhmatullaev, M..  2020.  Expert system with Fuzzy logic for protecting Scientific Information Resources. 2020 International Conference on Information Science and Communications Technologies (ICISCT). :1–4.

Analysis of the state of research on the protection of valuable scientific and educational databases, library resources, information centers, and publishers shows the importance of information security, especially in corporate information networks and data exchange systems. Corporate library networks include dozens and even hundreds of libraries for active information exchange, and these libraries are equipped with information security tools to varying degrees. The purpose of the research is to create effective methods and tools to protect the databases of scientific and educational resources from unauthorized access in libraries and library networks using fuzzy logic methods.

2021-03-22
Ban, T. Q., Nguyen, T. T. T., Long, V. T., Dung, P. D., Tung, B. T..  2020.  A Benchmarking of the Effectiveness of Modular Exponentiation Algorithms using the library GMP in C language. 2020 International Conference on Computational Intelligence (ICCI). :237–241.
This research aims to implement different modular exponentiation algorithms, evaluate their average complexity, and compare it to the theoretical value. We use the GMP library to implement seven modular exponentiation algorithms: Left-to-right Square and Multiply, Right-to-left Square and Multiply, Left-to-right Signed Digit Square and Multiply, Left-to-right Square and Multiply Always, Right-to-left Square and Multiply Always, Montgomery Ladder, and Joye Ladder. We choose an exponent bit length of 1024 bits, execute each algorithm on many exponent values, and count the average number of squares and the average number of multiplications. Whenever relevant, our programs check the consistency relations between the registers at the end of the exponentiation.
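The counting experiment described in this abstract can be illustrated with a minimal sketch. It is written in Python with built-in integers rather than GMP in C, covers only two of the seven algorithms, and is not the authors' code; the function names, the seed, and the choice of modulus are assumptions made for illustration.

```python
import random


def square_and_multiply_ltr(base, exponent, modulus):
    """Left-to-right square and multiply.

    Returns (result, squares, multiplications) so the operation
    counts can be averaged over many exponents.
    """
    result = 1 % modulus
    squares = 0
    multiplications = 0
    # Scan the exponent bits from most significant to least significant.
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus
        squares += 1
        if bit == "1":
            result = (result * base) % modulus
            multiplications += 1
    return result, squares, multiplications


def montgomery_ladder(base, exponent, modulus):
    """Montgomery ladder; returns both registers (r0, r1).

    The registers keep the relation r1 == r0 * base (mod modulus),
    which can serve as a consistency check at the end.
    """
    r0, r1 = 1 % modulus, base % modulus
    for bit in bin(exponent)[2:]:
        if bit == "1":
            r0 = (r0 * r1) % modulus
            r1 = (r1 * r1) % modulus
        else:
            r1 = (r0 * r1) % modulus
            r0 = (r0 * r0) % modulus
    return r0, r1


def average_counts(trials=100, bits=1024, seed=0):
    """Average squares and multiplications over random exponents.

    Hypothetical harness: exponents have exactly `bits` bits and the
    modulus is a random odd number of the same size.
    """
    rng = random.Random(seed)
    top_bit = 1 << (bits - 1)
    total_squares = 0
    total_multiplications = 0
    for _ in range(trials):
        exponent = rng.getrandbits(bits) | top_bit
        modulus = rng.getrandbits(bits) | top_bit | 1
        base = rng.randrange(2, modulus)
        result, squares, mults = square_and_multiply_ltr(base, exponent, modulus)
        assert result == pow(base, exponent, modulus)
        total_squares += squares
        total_multiplications += mults
    return total_squares / trials, total_multiplications / trials


if __name__ == "__main__":
    print(average_counts())
```

Because the top bit of each exponent is forced to 1 in this harness, the number of squares per run equals the bit length, while the number of multiplications equals the number of set bits in the exponent.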
2021-03-15
Staicu, C.-A., Torp, M. T., Schäfer, M., Møller, A., Pradel, M..  2020.  Extracting Taint Specifications for JavaScript Libraries. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :198–209.

Modern JavaScript applications extensively depend on third-party libraries. Especially for the Node.js platform, vulnerabilities can have severe consequences to the security of applications, resulting in, e.g., cross-site scripting and command injection attacks. Existing static analysis tools that have been developed to automatically detect such issues are either too coarse-grained, looking only at package dependency structure while ignoring dataflow, or rely on manually written taint specifications for the most popular libraries to ensure analysis scalability. In this work, we propose a technique for automatically extracting taint specifications for JavaScript libraries, based on a dynamic analysis that leverages the existing test suites of the libraries and their available clients in the npm repository. Due to the dynamic nature of JavaScript, mapping observations from dynamic analysis to taint specifications that fit into a static analysis is non-trivial. Our main insight is that this challenge can be addressed by a combination of an access path mechanism that identifies entry and exit points, and the use of membranes around the libraries of interest. We show that our approach is effective at inferring useful taint specifications at scale. Our prototype tool automatically extracts 146 additional taint sinks and 7 840 propagation summaries spanning 1 393 npm modules. By integrating the extracted specifications into a commercial, state-of-the-art static analysis, 136 new alerts are produced, many of which correspond to likely security vulnerabilities. Moreover, many important specifications that were originally manually written are among the ones that our tool can now extract automatically.

Perkins, J., Eikenberry, J., Coglio, A., Rinard, M..  2020.  Comprehensive Java Metadata Tracking for Attack Detection and Repair. 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). :39–51.

We present ClearTrack, a system that tracks metadata for each primitive value in Java programs to detect and nullify a range of vulnerabilities such as integer overflow/underflow and SQL/command injection vulnerabilities. Contributions include new techniques for eliminating false positives associated with benign integer overflows and underflows, new metadata-aware techniques for detecting and nullifying SQL/command injection attacks, and results from an independent evaluation team. These results show that 1) ClearTrack operates successfully on Java programs comprising hundreds of thousands of lines of code (including instrumented jar files and Java system libraries, the majority of the applications comprise over 3 million lines of code), 2) because of computations such as cryptography and hash table calculations, these applications perform millions of benign integer overflows and underflows, and 3) ClearTrack successfully detects and nullifies all tested integer overflow and underflow and SQL/command injection vulnerabilities in the benchmark applications.

Cortiñas, C. T., Vassena, M., Russo, A..  2020.  Securing Asynchronous Exceptions. 2020 IEEE 33rd Computer Security Foundations Symposium (CSF). :214–229.

Language-based information-flow control (IFC) techniques often rely on special purpose, ad-hoc primitives to address different covert channels that originate in the runtime system, beyond the scope of language constructs. Since these piecemeal solutions may not compose securely, there is a need for a unified mechanism to control covert channels. As a first step towards this goal, we argue for the design of a general interface that allows programs to safely interact with the runtime system and the available computing resources. To coordinate the communication between programs and the runtime system, we propose the use of asynchronous exceptions (interrupts), which, to the best of our knowledge, have not been considered before in the context of IFC languages. Since asynchronous exceptions can be raised at any point during execution (often due to the occurrence of an external event), threads must temporarily mask them out when manipulating locks and shared data structures to avoid deadlocks and, therefore, breaking program invariants. Crucially, the naive combination of asynchronous exceptions with existing features of IFC languages (e.g., concurrency and synchronization variables) may open up new possibilities of information leakage. In this paper, we present MACasync, a concurrent, statically enforced IFC language that, as a novelty, features asynchronous exceptions. We show how asynchronous exceptions easily enable (out of the box) useful programming patterns like speculative execution and some degree of resource management. We prove that programs in MACasync satisfy progress-sensitive non-interference and mechanize our formal claims in the Agda proof assistant.

2021-03-04
Kalin, J., Ciolino, M., Noever, D., Dozier, G..  2020.  Black Box to White Box: Discover Model Characteristics Based on Strategic Probing. 2020 Third International Conference on Artificial Intelligence for Industries (AI4I). :60–63.

In Machine Learning, White Box Adversarial Attacks rely on underlying knowledge about the model attributes. This work focuses on discovering two distinct pieces of model information: the underlying architecture and the primary training dataset. With the process in this paper, a structured set of input probes and the output of the model become the training data for a deep classifier. Two subdomains in Machine Learning are explored: image-based classifiers and text transformers with GPT-2. With image classification, the focus is on exploring commonly deployed architectures and datasets available in popular public libraries. Using a single transformer architecture with multiple levels of parameters, text generation is explored by fine-tuning on different datasets. The datasets explored in image and text are each distinguishable from one another. Diversity in text transformer outputs implies further research is needed to successfully classify architecture attribution in the text domain.

2021-02-23
Gamba, J., Rashed, M., Razaghpanah, A., Tapiador, J., Vallina-Rodriguez, N..  2020.  An Analysis of Pre-installed Android Software. 2020 IEEE Symposium on Security and Privacy (SP). :1039–1055.

The open-source nature of the Android OS makes it possible for manufacturers to ship custom versions of the OS along with a set of pre-installed apps, often for product differentiation. Some device vendors have recently come under scrutiny for potentially invasive private data collection practices and other potentially harmful or unwanted behavior of the preinstalled apps on their devices. Yet, the landscape of preinstalled software in Android has largely remained unexplored, particularly in terms of the security and privacy implications of such customizations. In this paper, we present the first large-scale study of pre-installed software on Android devices from more than 200 vendors. Our work relies on a large dataset of real-world Android firmware acquired worldwide using crowd-sourcing methods. This allows us to answer questions related to the stakeholders involved in the supply chain, from device manufacturers and mobile network operators to third-party organizations like advertising and tracking services, and social network platforms. Our study allows us to also uncover relationships between these actors, which seem to revolve primarily around advertising and data-driven services. Overall, the supply chain around Android's open source model lacks transparency and has facilitated potentially harmful behaviors and backdoored access to sensitive data and services without user consent or awareness. We conclude the paper with recommendations to improve transparency, attribution, and accountability in the Android ecosystem.

2021-02-22
Haile, J., Havens, S..  2020.  Identifying Ubiquitious Third-Party Libraries in Compiled Executables Using Annotated and Translated Disassembled Code with Supervised Machine Learning. 2020 IEEE Security and Privacy Workshops (SPW). :157–162.
The size and complexity of the software ecosystem is a major challenge for vendors, asset owners and cybersecurity professionals who need to understand the security posture of these systems. Annotated and Translated Disassembled Code is a graph-based datastore designed to organize firmware and software analysis data across builds, packages and systems, providing a highly scalable platform enabling automated binary software analysis tasks including corpora construction and storage for machine learning. This paper describes an approach for the identification of ubiquitous third-party libraries in firmware and software using Annotated and Translated Disassembled Code and supervised machine learning. Annotated and Translated Disassembled Code provides matched libraries, function names and addresses of previously unidentified code in software as it is being automatically analyzed. This data can be ingested by other software analysis tools to improve accuracy and save time. Defenders can add the identified libraries to their vulnerability searches and add effective detection and mitigation into their operating environment.
Lansley, M., Kapetanakis, S., Polatidis, N..  2020.  SEADer++ v2: Detecting Social Engineering Attacks using Natural Language Processing and Machine Learning. 2020 International Conference on INnovations in Intelligent SysTems and Applications (INISTA). :1–6.
Social engineering attacks are well known attacks in the cyberspace and relatively easy to try and implement because no technical knowledge is required. In various online environments such as business domains where customers talk through a chat service with employees or in social networks potential hackers can try to manipulate other people by employing social attacks against them to gain information that will benefit them in future attacks. Thus, we have used a number of natural language processing steps and a machine learning algorithm to identify potential attacks. The proposed method has been tested on a semi-synthetic dataset and it is shown to be both practical and effective.
Gündoğan, C., Amsüss, C., Schmidt, T. C., Wählisch, M..  2020.  IoT Content Object Security with OSCORE and NDN: A First Experimental Comparison. 2020 IFIP Networking Conference (Networking). :19–27.
The emerging Internet of Things (IoT) challenges the end-to-end transport of the Internet by low power lossy links and gateways that perform protocol translations. Protocols such as CoAP or MQTT-SN are degraded by the overhead of DTLS sessions, which in common deployment protect content transfer only up to the gateway. To preserve content security end-to-end via gateways and proxies, the IETF recently developed Object Security for Constrained RESTful Environments (OSCORE), which extends CoAP with content object security features commonly known from Information Centric Networks (ICN). This paper presents a comparative analysis of protocol stacks that protect request-response transactions. We measure protocol performances of CoAP over DTLS, OSCORE, and the information-centric Named Data Networking (NDN) protocol on a large-scale IoT testbed in single- and multi-hop scenarios. Our findings indicate that (a) OSCORE improves on CoAP over DTLS in error-prone wireless regimes due to omitting the overhead of maintaining security sessions at endpoints, and (b) NDN attains superior robustness and reliability due to its intrinsic network caches and hop-wise retransmissions.
2021-02-10
Tanana, D., Tanana, G..  2020.  Advanced Behavior-Based Technique for Cryptojacking Malware Detection. 2020 14th International Conference on Signal Processing and Communication Systems (ICSPCS). :1–4.
With the rising value and popularity of cryptocurrencies, they inevitably attract cybercriminals seeking illicit profits within the blockchain ecosystem. Two of the most popular methods are ransomware and cryptojacking. Ransomware, being the first and more obvious threat, has been extensively studied in the past. In contrast, scientists have often neglected cryptojacking, because it is less obvious and less harmful than ransomware. In this paper, we propose an enhanced detection program to combat cryptojacking, briefly touch on the history of cryptojacking (also known as malicious mining), and review the most notable previous attempts to detect and combat it. The review includes our previous work on malicious mining detection; our current detection program builds on its previous iteration, which mostly used CPU usage heuristics to detect cryptojacking. We include additional metrics for malicious mining detection, such as network usage and calls to cryptographic libraries, which result in a 93% detection rate against the selected cryptojacking samples, compared to the 81% rate achieved in previous work. Finally, we discuss generalization of the proposed detection technique to include GPU cryptojackers.
2021-02-08
Wang, Y., Wen, M., Liu, Y., Wang, Y., Li, Z., Wang, C., Yu, H., Cheung, S.-C., Xu, C., Zhu, Z..  2020.  Watchman: Monitoring Dependency Conflicts for Python Library Ecosystem. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :125–135.
The PyPI ecosystem has indexed millions of Python libraries to allow developers to automatically download and install dependencies of their projects based on the specified version constraints. Despite the convenience brought by automation, version constraints in Python projects can easily conflict, resulting in build failures. We refer to such conflicts as Dependency Conflict (DC) issues. Although DC issues are common in Python projects, developers lack tool support to gain a comprehensive knowledge for diagnosing the root causes of these issues. In this paper, we conducted an empirical study on 235 real-world DC issues. We studied the manifestation patterns and fixing strategies of these issues and found several key factors that can lead to DC issues and their regressions. Based on our findings, we designed and implemented Watchman, a technique to continuously monitor dependency conflicts for the PyPI ecosystem. In our evaluation, Watchman analyzed PyPI snapshots between 11 Jul 2019 and 16 Aug 2019, and found 117 potential DC issues. We reported these issues to the developers of the corresponding projects. So far, 63 issues have been confirmed, 38 of which have been quickly fixed by applying our suggested patches.
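To make the notion of a Dependency Conflict concrete, here is a minimal sketch of two requirers whose version constraints on the same library cannot be satisfied together. It is not Watchman's technique, which the abstract does not detail; the package names, version lists, and helper functions are hypothetical, and the constraint parser handles only a simplified subset of real version specifiers (plain dotted integers with up to four components).

```python
import operator

# Comparison operators supported by this simplified parser.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '1.2' into (1, 2, 0, 0) so versions compare component-wise."""
    parts = tuple(int(part) for part in text.strip().split("."))
    return parts + (0,) * (4 - len(parts))


def satisfies(version, spec):
    """Check a version against a comma-separated spec such as '>=1.0,<2.0'."""
    for clause in spec.split(","):
        clause = clause.strip()
        # Two-character operators are tried first so '>=' is not read as '>'.
        for symbol in ("==", "!=", ">=", "<=", ">", "<"):
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not OPS[symbol](parse_version(version), bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported constraint: {clause!r}")
    return True


def compatible_versions(available, constraints):
    """Return the available versions accepted by every requirer.

    `constraints` maps a requirer name to its spec; an empty result
    means the constraints conflict for the given release list.
    """
    return [
        version
        for version in available
        if all(satisfies(version, spec) for spec in constraints.values())
    ]


if __name__ == "__main__":
    releases = ["1.0", "1.5", "2.0", "2.1"]  # hypothetical releases of one library
    constraints = {"my_app": ">=2.0", "some_plugin": "<2.0"}  # hypothetical requirers
    print(compatible_versions(releases, constraints) or "dependency conflict")
```

Since the set of releases changes over time, a check of this kind would need to be repeated whenever a new version is published, which is one reason continuous monitoring is relevant.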
2021-02-03
Lee, J..  2020.  CanvasMirror: Secure Integration of Third-Party Libraries in a WebVR Environment. 2020 50th Annual IEEE-IFIP International Conference on Dependable Systems and Networks-Supplemental Volume (DSN-S). :75–76.

Web technology has evolved to offer 360-degree immersive browsing experiences. This new technology, called WebVR, enables virtual reality by rendering a three-dimensional world on an HTML canvas. Unfortunately, there exists no browser-supported way of sharing this canvas between different parties. As a result, third-party library providers with ill intent (e.g., stealing sensitive information from end-users) can easily distort the entire WebVR site. To mitigate the new threats posed in WebVR, we propose CanvasMirror, which allows publishers to specify the behaviors of third-party libraries and enforce this specification. We show that CanvasMirror effectively separates the third-party context from the host origin by leveraging the privilege separation technique and safely integrates VR contents on a shared canvas.

Martin, S., Parra, G., Cubillo, J., Quintana, B., Gil, R., Perez, C., Castro, M..  2020.  Design of an Augmented Reality System for Immersive Learning of Digital Electronic. 2020 XIV Technologies Applied to Electronics Teaching Conference (TAEE). :1–6.

This article describes the development of two mobile applications for learning Digital Electronics. The first application is an interactive app for iOS where you can study the different digital circuits, and which will serve as the basis for the second: a game of questions in augmented reality.

2021-02-01
Wu, L., Chen, X., Meng, L., Meng, X..  2020.  Multitask Adversarial Learning for Chinese Font Style Transfer. 2020 International Joint Conference on Neural Networks (IJCNN). :1–8.
Style transfer between Chinese fonts is challenging due to both the complexity of Chinese characters and the significant difference between fonts. Existing algorithms for this task typically learn a mapping between the reference and target fonts for each character. Subsequently, this mapping is used to generate the characters that do not exist in the target font. However, the characters available for training are unlikely to cover all fine-grained parts of the missing characters, leading to the overfitting problem. As a result, the generated characters of the target font may suffer problems of incomplete or even radicals and dirty dots. To address this problem, this paper presents a multi-task adversarial learning approach, termed MTfontGAN, to generate more vivid Chinese characters. MTfontGAN learns to transfer a reference font to multiple target ones simultaneously. An alignment is imposed on the encoders of different tasks to make them focus on the important parts of the characters in general style transfer. Such cross-task interactions at the feature level effectively improve the generalization capability of MTfontGAN. The performance of MTfontGAN is evaluated on three Chinese font datasets. Experimental results show that MTfontGAN outperforms the state-of-the-art algorithms in a single-task setting. More importantly, increasing the number of tasks leads to better performance in all of them.
Calhoun, C. S., Reinhart, J., Alarcon, G. A., Capiola, A..  2020.  Establishing Trust in Binary Analysis in Software Development and Applications. 2020 IEEE International Conference on Human-Machine Systems (ICHMS). :1–4.
The current exploratory study examined software programmer trust in binary analysis techniques used to evaluate and understand binary code components. Experienced software developers participated in knowledge elicitations to identify factors affecting trust in tools and methods used for understanding binary code behavior and minimizing potential security vulnerabilities. Developer perceptions of trust in those tools to assess implementation risk in binary components were captured across a variety of application contexts. The software developers reported source security and vulnerability reports provided the best insight and awareness of potential issues or shortcomings in binary code. Further, applications where the potential impact to systems and data loss is high require relying on more than one type of analysis to ensure the binary component is sound. The findings suggest binary analysis is viable for identifying issues and potential vulnerabilities as part of a comprehensive solution for understanding binary code behavior and security vulnerabilities, but relying simply on binary analysis tools and binary release metadata appears insufficient to ensure a secure solution.
2021-01-28
Salib, E. H., Aboutabl, M. S..  2020.  Hands-on Undergraduate Labs on Anonymity Cryptographic Algorithms. 2020 IEEE Frontiers in Education Conference (FIE). :1–9.

This is an innovative practice full paper. In past projects, we have successfully used a private TOR (anonymity network) platform that enabled our students to explore the end-to-end inner workings of the TOR anonymity network through a number of controlled hands-on lab assignments. These have satisfied the needs of curriculum focusing on networking functions and algorithms. To be able to extend the use and application of the private TOR platform into cryptography courses, there is a desperate need to enhance the platform to allow the development of hands-on lab assignments on the cryptographic algorithms and methods utilized in the creation of TOR secure connections and end-to-end circuits for anonymity. In tackling this challenge, and since TOR is open source software, we identify the cryptographic functions called by the TOR algorithms in the process of establishing TLS connections and creating end-to-end TOR circuits as well as tearing them down. We instrumented these functions with the appropriate code to log the cryptographic keys dynamically created at all nodes involved in the creation of the end-to-end circuit between the Client and the exit relay (connected to the target server). We implemented a set of pedagogical lab assignments on a private TOR platform and present them in this paper. Using these assignments, students are able to investigate and validate the cryptographic procedures applied in the establishment of the initial TLS connection, the creation of the first leg of a TOR circuit, as well as extending the circuit through additional relays (at least two relays). More advanced assignments are created to challenge the students to unwrap the traffic sent from the Client to the exit relay at all onion skin layers and compare it with the actual traffic delivered to the target server.

2021-01-20
Mavroudis, V., Svenda, P..  2020.  JCMathLib: Wrapper Cryptographic Library for Transparent and Certifiable JavaCard Applets. 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW). :89–96.

The JavaCard multi-application platform is now deployed to over twenty billion smartcards, used in various applications ranging from banking payments and authentication tokens to SIM cards and electronic documents. In most of those use cases, access to various cryptographic primitives is required. The standard JavaCard API provides a basic level of access to such functionality (e.g., RSA encryption) but does not expose low-level cryptographic primitives (e.g., elliptic curve operations) and essential data types (e.g., Integers). Developers can access such features only through proprietary, manufacturer-specific APIs. Unfortunately, such APIs significantly reduce the interoperability and certification transparency of the software produced as they require non-disclosure agreements (NDA) that prohibit public sharing of the applet's source code. We introduce JCMathLib, an open library that provides an intermediate layer realizing essential data types and low-level cryptographic primitives from high-level operations. To achieve this, we introduce a series of optimization techniques for resource-constrained platforms that make optimal use of the underlying hardware, while having a small memory footprint. To the best of our knowledge, it is the first generic library for low-level cryptographic operations in JavaCards that does not rely on a proprietary API. Without any disclosure limitations, JCMathLib has the potential to increase transparency by enabling open code sharing, release of research prototypes, and public code audits. Moreover, JCMathLib can help resolve the conflict between strict open-source licenses such as GPL and proprietary APIs available only under an NDA. This is of particular importance due to the introduction of JavaCard API v3.1, which targets specifically IoT devices, where open-source development might be more common than in the relatively closed world of government-issued electronic documents.

Mindermann, K., Wagner, S..  2020.  Fluid Intelligence Doesn't Matter! Effects of Code Examples on the Usability of Crypto APIs. 2020 IEEE/ACM 42nd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion). :306–307.

Context: Programmers frequently look for the code of previously solved problems that they can adapt for their own problem. Despite existing example code on the web, on sites like Stack Overflow, cryptographic Application Programming Interfaces (APIs) are commonly misused. There is little known about what makes examples helpful for developers in using crypto APIs. Analogical problem solving is a psychological theory that investigates how people use known solutions to solve new problems. There is evidence that the capacity to reason and solve novel problems, a.k.a. Fluid Intelligence (Gf), and structurally and procedurally similar solutions support problem solving. Aim: Our goal is to understand whether similarity and Gf also have an effect in the context of using cryptographic APIs with the help of code examples. Method: We conducted a controlled experiment with 76 student participants developing with or without procedurally similar examples and with one of two Java crypto libraries, and measured the Gf of the participants as well as the effect on usability (effectiveness, efficiency, satisfaction) and security bugs. Results: We observed a strong effect of code examples with a high procedural similarity on all dependent variables. Fluid intelligence Gf had no effect. It also made no difference which library the participants used. Conclusions: Example code must be highly similar to a concrete solution, not very abstract and generic, to have a positive effect in a development task.

2020-12-17
Sun, P., Garcia, L., Salles-Loustau, G., Zonouz, S..  2020.  Hybrid Firmware Analysis for Known Mobile and IoT Security Vulnerabilities. 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). :373–384.

Mobile and IoT operating systems, and their ensuing software updates, are usually distributed as binary files. Given that these binary files are commonly closed source, users or businesses who want to assess the security of the software need to rely on reverse engineering. Further, verifying the correct application of the latest software patches in a given binary is an open problem. The regular application of software patches is a central pillar for improving mobile and IoT device security. This requires developers, integrators, and vendors to propagate patches to all affected devices in a timely and coordinated fashion. In practice, vendors follow different and sometimes improper security update agendas for both mobile and IoT products. Moreover, previous studies revealed the existence of a hidden patch gap: several vendors falsely reported that they patched vulnerabilities. Therefore, techniques to verify whether vulnerabilities have been patched or not in a given binary are essential. Deep learning approaches have been shown to be promising for static binary analyses with respect to inferring binary similarity as well as vulnerability detection. However, these approaches fail to capture the dynamic behavior of these systems, and, as a result, they may inundate the analysis with false positives when performing vulnerability discovery in the wild. In particular, they cannot capture the fine-grained characteristics necessary to distinguish whether a vulnerability has been patched or not. In this paper, we present PATCHECKO, a vulnerability and patch presence detection framework for executable binaries. PATCHECKO relies on a hybrid, cross-platform binary code similarity analysis that combines deep learning-based static binary analysis with dynamic binary analysis. PATCHECKO does not require access to the source code of the target binary nor that of vulnerable functions.
We evaluate PATCHECKO on the most recent Google Pixel 2 smartphone and the Android Things IoT firmware images, within which 25 known CVE vulnerabilities have been previously reported and patched. Our deep learning model shows a vulnerability detection accuracy of over 93%. We further prune the candidates found by the deep learning stage, which include false positives, via dynamic binary analysis. Consequently, PATCHECKO successfully identifies the correct matches among the candidate functions in the top 3 ranked outcomes 100% of the time. Furthermore, PATCHECKO's differential engine distinguishes between functions that are still vulnerable and those that are patched with an accuracy of 96%.

2020-12-11
Sabek, I., Chandramouli, B., Minhas, U. F..  2019.  CRA: Enabling Data-Intensive Applications in Containerized Environments. 2019 IEEE 35th International Conference on Data Engineering (ICDE). :1762–1765.

Today, a modern data center hosts a wide variety of applications comprising batch, interactive, machine learning, and streaming applications. In this paper, we factor out the commonalities in a large majority of these applications into a generic dataflow layer called Common Runtime for Applications (CRA). In parallel, another trend, containerization technologies (e.g., Docker), has taken a serious hold on cloud-scale data centers, with direct implications for building the next generation of data center applications. Container orchestrators (e.g., Kubernetes) have made deployment a lot easier, and they solve many infrastructure-level problems, e.g., service discovery, auto-restart, and replication. For best-in-class performance, there is a need to marry the next generation of applications with containerization technologies. To that end, CRA leverages and builds upon the containerization and resource orchestration capabilities of Kubernetes/Docker, and makes it easy to build a wide range of cloud-edge applications on top. To the best of our knowledge, we are the first to present a cloud-native runtime for building data center applications. We show the efficiency of CRA through various micro-benchmarking experiments.

Zhou, Y., Zeng, Z..  2019.  Info-Retrieval with Relevance Feedback using Hybrid Learning Scheme for RS Image. 2019 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC). :135–138.

Relevance feedback can be considered a learning problem. It has been used extensively to improve the performance of multimedia information retrieval. In this paper, after discussing relevance feedback in content-based image retrieval (CBIR), we propose a hybrid learning scheme for multi-target retrieval (MTR) with relevance feedback. Suppose that an object-level symbolic image database (SID), combining image metadata and a feature model, has been constructed. During an interactive query for remote sensing images, we calculate the similarity metric to obtain the relevant image sets from the image library. To further improve the precision of image retrieval, a parameter of the hybrid learning scheme also needs to be chosen. As a result, our hybrid learning scheme consists of an expectation maximization algorithm (EMA), used for retrieving the most relevant images from the SID, and a support vector machine (SVM) with relevance feedback, used for learning the feedback information. Experimental results show that our hybrid learning scheme with relevance feedback on MTR can improve performance and accuracy compared with the basic algorithms.

Huang, Y., Wang, Y..  2019.  Multi-format speech perception hashing based on time-frequency parameter fusion of energy zero ratio and frequency band variance. 2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE). :243–251.

To address the problems of existing speech content authentication algorithms, such as support for only a single format, lack of generality, low security, and low accuracy of small-scale tamper detection and localization, a multi-format speech perception hashing algorithm based on time-frequency parameter fusion of energy zero ratio and frequency band variance is proposed. First, the algorithm preprocesses the speech signal and calculates the short-time logarithmic energy, zero-crossing rate, and frequency band variance of each speech fragment. It then calculates the energy-to-zero ratio of each frame, fuses the time-frequency features by mean filtering, and constructs the hash sequence from the time-frequency parameters using a difference hashing method. Finally, the hash sequence is scrambled, preserving its length, with a logistic chaotic map, so as to improve the security of the hash sequence during transmission. Experiments show that the proposed algorithm is robust, discriminative, and key-dependent.
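
The pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact definition of the energy-to-zero ratio, the frame sizes, or the logistic map parameters, so all of those (and every function name) are assumptions made here. The frequency band variance feature and its fusion step are omitted to keep the sketch short.

```python
import math

def frames(signal, size, hop):
    """Split a signal into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def energy_zero_ratio(frame, eps=1e-3):
    """Assumed definition: short-time log energy over zero-crossing rate."""
    log_energy = math.log1p(sum(s * s for s in frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (len(frame) - 1)
    return log_energy / (zcr + eps)

def mean_filter(values, width=3):
    """Moving average with the window clipped at both ends."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def difference_hash(values):
    """One bit per adjacent pair: 1 if the feature rises, else 0."""
    return [1 if b > a else 0 for a, b in zip(values, values[1:])]

def logistic_permutation(n, key, mu=3.99):
    """Derive a permutation of range(n) from a logistic map seeded by `key`.

    `key` (the initial value, in (0, 1)) and `mu` are hypothetical choices.
    """
    x, seq = key, []
    for _ in range(n):
        x = mu * x * (1.0 - x)
        seq.append(x)
    return sorted(range(n), key=seq.__getitem__)

def scramble(bits, key):
    """Reorder the bits with the key-dependent permutation (same length)."""
    return [bits[p] for p in logistic_permutation(len(bits), key)]

def unscramble(bits, key):
    """Invert `scramble` for a receiver that knows the key."""
    out = [0] * len(bits)
    for i, p in enumerate(logistic_permutation(len(bits), key)):
        out[p] = bits[i]
    return out

def perception_hash(signal, key, size=64, hop=32):
    """Features -> mean filtering -> difference hash -> scrambling."""
    feats = [energy_zero_ratio(f) for f in frames(signal, size, hop)]
    return scramble(difference_hash(mean_filter(feats)), key)

def hamming_distance(a, b):
    """Normalised Hamming distance between two equal-length bit lists."""
    return sum(x != y for x, y in zip(a, b)) / len(a)
```

Under these assumptions, two recordings would be compared by computing `hamming_distance` between their hashes and checking it against a threshold chosen empirically; a receiver holding the same key can call `unscramble` to recover the unscrambled bit order.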