Visible to the public Biblio

Filters: Keyword is program debugging  [Clear All Filters]
2019-07-01
Clemente, C. J., Jaafar, F., Malik, Y..  2018.  Is Predicting Software Security Bugs Using Deep Learning Better Than the Traditional Machine Learning Algorithms? 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS). :95–102.

Software insecurity is being identified as one of the leading causes of security breaches. In this paper, we revisited one of the strategies in solving software insecurity, which is the use of software quality metrics. We utilized a multilayer deep feedforward network in examining whether there is a combination of metrics that can predict the appearance of security-related bugs. We also applied the traditional machine learning algorithms such as decision tree, random forest, naïve bayes, and support vector machines and compared the results with that of the Deep Learning technique. The results have successfully demonstrated that it was possible to develop an effective predictive model to forecast software insecurity based on the software metrics and using Deep Learning. All the models generated have shown an accuracy of more than sixty percent with Deep Learning leading the list. This finding proved that utilizing Deep Learning methods and a combination of software metrics can be tapped to create a better forecasting model thereby aiding software developers in predicting security bugs.

2019-02-14
Peng, H., Shoshitaishvili, Y., Payer, M..  2018.  T-Fuzz: Fuzzing by Program Transformation. 2018 IEEE Symposium on Security and Privacy (SP). :697-710.

Fuzzing is a simple yet effective approach to discover software bugs utilizing randomly generated inputs. However, it is limited by coverage and cannot find bugs hidden in deep execution paths of the program because the randomly generated inputs fail complex sanity checks, e.g., checks on magic values, checksums, or hashes. To improve coverage, existing approaches rely on imprecise heuristics or complex input mutation techniques (e.g., symbolic execution or taint analysis) to bypass sanity checks. Our novel method tackles coverage from a different angle: by removing sanity checks in the target program. T-Fuzz leverages a coverage-guided fuzzer to generate inputs. Whenever the fuzzer can no longer trigger new code paths, a light-weight, dynamic tracing based technique detects the input checks that the fuzzer-generated inputs fail. These checks are then removed from the target program. Fuzzing then continues on the transformed program, allowing the code protected by the removed checks to be triggered and potential bugs discovered. Fuzzing transformed programs to find bugs poses two challenges: (1) removal of checks leads to over-approximation and false positives, and (2) even for true bugs, the crashing input on the transformed program may not trigger the bug in the original program. As an auxiliary post-processing step, T-Fuzz leverages a symbolic execution-based approach to filter out false positives and reproduce true bugs in the original program. By transforming the program as well as mutating the input, T-Fuzz covers more code and finds more true bugs than any existing technique. We have evaluated T-Fuzz on the DARPA Cyber Grand Challenge dataset, LAVA-M dataset and 4 real-world programs (pngfix, tiffinfo, magick and pdftohtml). For the CGC dataset, T-Fuzz finds bugs in 166 binaries, Driller in 121, and AFL in 105. In addition, found 3 new bugs in previously-fuzzed programs and libraries.

2019-01-16
Gao, J., Lanchantin, J., Soffa, M. L., Qi, Y..  2018.  Black-Box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers. 2018 IEEE Security and Privacy Workshops (SPW). :50–56.

Although various techniques have been proposed to generate adversarial samples for white-box attacks on text, little attention has been paid to a black-box attack, which is a more realistic scenario. In this paper, we present a novel algorithm, DeepWordBug, to effectively generate small text perturbations in a black-box setting that forces a deep-learning classifier to misclassify a text input. We develop novel scoring strategies to find the most important words to modify such that the deep classifier makes a wrong prediction. Simple character-level transformations are applied to the highest-ranked words in order to minimize the edit distance of the perturbation. We evaluated DeepWordBug on two real-world text datasets: Enron spam emails and IMDB movie reviews. Our experimental results indicate that DeepWordBug can reduce the classification accuracy from 99% to 40% on Enron and from 87% to 26% on IMDB. Our results strongly demonstrate that the generated adversarial sequences from a deep-learning model can similarly evade other deep models.

2018-06-20
Aslanyan, H., Avetisyan, A., Arutunian, M., Keropyan, G., Kurmangaleev, S., Vardanyan, V..  2017.  Scalable Framework for Accurate Binary Code Comparison. 2017 Ivannikov ISPRAS Open Conference (ISPRAS). :34–38.
Comparison of two binary files has many practical applications: the ability to detect programmatic changes between two versions, the ability to find old versions of statically linked libraries to prevent the use of well-known bugs, malware analysis, etc. In this article, a framework for comparison of binary files is presented. Framework uses IdaPro [1] disassembler and Binnavi [2] platform to recover structure of the target program and represent it as a call graph (CG). A program dependence graph (PDG) corresponds to each vertex of the CG. The proposed comparison algorithm consists of two main stages. At the first stage, several heuristics are applied to find the exact matches. Two functions are matched if at least one of the calculated heuristics is the same and unique in both binaries. At the second stage, backward and forward slicing is applied on matched vertices of CG to find further matches. According to empiric results heuristic method is effective and has high matching quality for unchanged or slightly modified functions. As a contradiction, to match heavily modified functions, binary code clone detection is used and it is based on finding maximum common subgraph for pair of PDGs. To achieve high performance on extensive binaries, the whole matching process is parallelized. The framework is tested on the number of real world libraries, such as python, openssh, openssl, libxml2, rsync, php, etc. Results show that in most cases more than 95% functions are truly matched. The tool is scalable due to parallelization of functions matching process and generation of PDGs and CGs.
2018-06-11
Kwon, H., Harris, W., Esmaeilzadeh, H..  2017.  Proving Flow Security of Sequential Logic via Automatically-Synthesized Relational Invariants. 2017 IEEE 30th Computer Security Foundations Symposium (CSF). :420–435.

Due to the proliferation of reprogrammable hardware, core designs built from modules drawn from a variety of sources execute with direct access to critical system resources. Expressing guarantees that such modules satisfy, in particular the dynamic conditions under which they release information about their unbounded streams of inputs, and automatically proving that they satisfy such guarantees, is an open and critical problem.,,To address these challenges, we propose a domain-specific language, named STREAMS, for expressing information-flow policies with declassification over unbounded input streams. We also introduce a novel algorithm, named SIMAREL, that given a core design C and STREAMS policy P, automatically proves or falsifies that C satisfies P. The key technical insight behind the design of SIMAREL is a novel algorithm for efficiently synthesizing relational invariants over pairs of circuit executions.,,We expressed expected behavior of cores designed independently for research and production as STREAMS policies and used SIMAREL to check if each core satisfies its policy. SIMAREL proved that half of the cores satisfied expected behavior, but found unexpected information leaks in six open-source designs: an Ethernet controller, a flash memory controller, an SD-card storage manager, a robotics controller, a digital-signal processing (DSP) module, and a debugging interface.

2018-06-07
Nashaat, M., Ali, K., Miller, J..  2017.  Detecting Security Vulnerabilities in Object-Oriented PHP Programs. 2017 IEEE 17th International Working Conference on Source Code Analysis and Manipulation (SCAM). :159–164.

PHP is one of the most popular web development tools in use today. A major concern though is the improper and insecure uses of the language by application developers, motivating the development of various static analyses that detect security vulnerabilities in PHP programs. However, many of these approaches do not handle recent, important PHP features such as object orientation, which greatly limits the use of such approaches in practice. In this paper, we present OOPIXY, a security analysis tool that extends the PHP security analyzer PIXY to support reasoning about object-oriented features in PHP applications. Our empirical evaluation shows that OOPIXY detects 88% of security vulnerabilities found in micro benchmarks. When used on real-world PHP applications, OOPIXY detects security vulnerabilities that could not be detected using state-of-the-art tools, retaining a high level of precision. We have contacted the maintainers of those applications, and two applications' development teams verified the correctness of our findings. They are currently working on fixing the bugs that lead to those vulnerabilities.

2018-03-26
Aslan, Ö, Samet, R..  2017.  Mitigating Cyber Security Attacks by Being Aware of Vulnerabilities and Bugs. 2017 International Conference on Cyberworlds (CW). :222–225.

Because the Internet makes human lives easier, many devices are connected to the Internet daily. The private data of individuals and large companies, including health-related data, user bank accounts, and military and manufacturing data, are increasingly accessible via the Internet. Because almost all data is now accessible through the Internet, protecting these valuable assets has become a major concern. The goal of cyber security is to protect such assets from unauthorized use. Attackers use automated tools and manual techniques to penetrate systems by exploiting existing vulnerabilities and software bugs. To provide good enough security; attack methodologies, vulnerability concepts and defence strategies should be thoroughly investigated. The main purpose of this study is to show that the patches released for existing vulnerabilities at the operating system (OS) level and in software programs does not completely prevent cyber-attack. Instead, producing specific patches for each company and fixing software bugs by being aware of the software running on each specific system can provide a better result. This study also demonstrates that firewalls, antivirus software, Windows Defender and other prevention techniques are not sufficient to prevent attacks. Instead, this study examines different aspects of penetration testing to determine vulnerable applications and hosts using the Nmap and Metasploit frameworks. For a test case, a virtualized system is used that includes different versions of Windows and Linux OS.

2018-02-21
Bojanova, I., Black, P. E., Yesha, Y..  2017.  Cryptography classes in bugs framework (BF): Encryption bugs (ENC), verification bugs (VRF), and key management bugs (KMN). 2017 IEEE 28th Annual Software Technology Conference (STC). :1–8.

Accurate, precise, and unambiguous definitions of software weaknesses (bugs) and clear descriptions of software vulnerabilities are vital for building the foundations of cybersecurity. The Bugs Framework (BF) comprises rigorous definitions and (static) attributes of bug classes, along with their related dynamic properties, such as proximate, secondary and tertiary causes, consequences, and sites. This paper presents an overview of previously developed BF classes and the new cryptography related classes: Encryption Bugs (ENC), Verification Bugs (VRF), and Key Management Bugs (KMN). We analyze corresponding vulnerabilities and provide their clear descriptions by applying the BF taxonomy. We also discuss the lessons learned and share our plans for expanding BF.

2018-02-02
Tramèr, F., Atlidakis, V., Geambasu, R., Hsu, D., Hubaux, J. P., Humbert, M., Juels, A., Lin, H..  2017.  FairTest: Discovering Unwarranted Associations in Data-Driven Applications. 2017 IEEE European Symposium on Security and Privacy (EuroS P). :401–416.

In a world where traditional notions of privacy are increasingly challenged by the myriad companies that collect and analyze our data, it is important that decision-making entities are held accountable for unfair treatments arising from irresponsible data usage. Unfortunately, a lack of appropriate methodologies and tools means that even identifying unfair or discriminatory effects can be a challenge in practice. We introduce the unwarranted associations (UA) framework, a principled methodology for the discovery of unfair, discriminatory, or offensive user treatment in data-driven applications. The UA framework unifies and rationalizes a number of prior attempts at formalizing algorithmic fairness. It uniquely combines multiple investigative primitives and fairness metrics with broad applicability, granular exploration of unfair treatment in user subgroups, and incorporation of natural notions of utility that may account for observed disparities. We instantiate the UA framework in FairTest, the first comprehensive tool that helps developers check data-driven applications for unfair user treatment. It enables scalable and statistically rigorous investigation of associations between application outcomes (such as prices or premiums) and sensitive user attributes (such as race or gender). Furthermore, FairTest provides debugging capabilities that let programmers rule out potential confounders for observed unfair effects. We report on use of FairTest to investigate and in some cases address disparate impact, offensive labeling, and uneven rates of algorithmic error in four data-driven applications. As examples, our results reveal subtle biases against older populations in the distribution of error in a predictive health application and offensive racial labeling in an image tagger.

2017-12-28
Poon, W. N., Bennin, K. E., Huang, J., Phannachitta, P., Keung, J. W..  2017.  Cross-Project Defect Prediction Using a Credibility Theory Based Naive Bayes Classifier. 2017 IEEE International Conference on Software Quality, Reliability and Security (QRS). :434–441.

Several defect prediction models proposed are effective when historical datasets are available. Defect prediction becomes difficult when no historical data exist. Cross-project defect prediction (CPDP), which uses projects from other sources/companies to predict the defects in the target projects proposed in recent studies has shown promising results. However, the performance of most CPDP approaches are still beyond satisfactory mainly due to distribution mismatch between the source and target projects. In this study, a credibility theory based Naïve Bayes (CNB) classifier is proposed to establish a novel reweighting mechanism between the source projects and target projects so that the source data could simultaneously adapt to the target data distribution and retain its own pattern. Our experimental results show that the feasibility of the novel algorithm design and demonstrate the significant improvement in terms of the performance metrics considered achieved by CNB over other CPDP approaches.

2017-11-20
Kaur, R., Singh, A., Singh, S., Sharma, S..  2016.  Security of software defined networks: Taxonomic modeling, key components and open research area. 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT). :2832–2839.

Software defined networking promises network operators to dramatically simplify network management. It provides flexibility and innovation through network programmability. With SDN, network management moves from codifying functionality in terms of low-level device configuration to building software that facilitates network management and debugging[1]. SDN provides new techniques to solve long-standing problems in networking like routing by separating the complexity of state distribution from network specification. Despite all the hype surrounding SDNs, exploiting its full potential is demanding. Security is still the major issue and a striking challenge that reduces the growth of SDNs. Moreover the introduction of various architectural components and up cycling of novel entities of SDN poses new security issues and threats. SDN is considered as major target for digital threats and cyber-attacks[2] and have more devastating effects than simple networks. Initial SDN design doesn't considered security as its part; therefore, it must be raised on the agenda. This article discusses the security solutions proposed to secure SDNs. We categorize the security solutions in the article by presenting a thematic taxonomy based on SDN architectural layers/interfaces[3], security measures and goals, simulation framework. Moreover, the literature also points out the possible attacks[2] targeting different layers/interfaces of SDNs. For securing SDNs, the potential requirements and their key enablers are also identified and presented. Also, the articles sketch the design of secure and dependable SDNs. At last, we discuss open issues and challenges of SDN security that may be rated appropriate to be handled by professionals and researchers in the future.

2015-05-05
Petullo, W.M., Wenyuan Fei, Solworth, J.A., Gavlin, P..  2014.  Ethos' Deeply Integrated Distributed Types. Security and Privacy Workshops (SPW), 2014 IEEE. :167-180.

Programming languages have long incorporated type safety, increasing their level of abstraction and thus aiding programmers. Type safety eliminates whole classes of security-sensitive bugs, replacing the tedious and error-prone search for such bugs in each application with verifying the correctness of the type system. Despite their benefits, these protections often end at the process boundary, that is, type safety holds within a program but usually not to the file system or communication with other programs. Existing operating system approaches to bridge this gap require the use of a single programming language or common language runtime. We describe the deep integration of type safety in Ethos, a clean-slate operating system which requires that all program input and output satisfy a recognizer before applications are permitted to further process it. Ethos types are multilingual and runtime-agnostic, and each has an automatically generated unique type identifier. Ethos bridges the type-safety gap between programs by (1) providing a convenient mechanism for specifying the types each program may produce or consume, (2) ensuring that each type has a single, distributed-system-wide recognizer implementation, and (3) inescapably enforcing these type constraints.
 

2014-09-17
Szekeres, L., Payer, M., Tao Wei, Song, D..  2013.  SoK: Eternal War in Memory. Security and Privacy (SP), 2013 IEEE Symposium on. :48-62.

Memory corruption bugs in software written in low-level languages like C or C++ are one of the oldest problems in computer security. The lack of safety in these languages allows attackers to alter the program's behavior or take full control over it by hijacking its control flow. This problem has existed for more than 30 years and a vast number of potential solutions have been proposed, yet memory corruption attacks continue to pose a serious threat. Real world exploits show that all currently deployed protections can be defeated. This paper sheds light on the primary reasons for this by describing attacks that succeed on today's systems. We systematize the current knowledge about various protection techniques by setting up a general model for memory corruption attacks. Using this model we show what policies can stop which attacks. The model identifies weaknesses of currently deployed techniques, as well as other proposed protections enforcing stricter policies. We analyze the reasons why protection mechanisms implementing stricter polices are not deployed. To achieve wide adoption, protection mechanisms must support a multitude of features and must satisfy a host of requirements. Especially important is performance, as experience shows that only solutions whose overhead is in reasonable bounds get deployed. A comparison of different enforceable policies helps designers of new protection mechanisms in finding the balance between effectiveness (security) and efficiency. We identify some open research problems, and provide suggestions on improving the adoption of newer techniques.