Biblio
The article considers the approach to static analysis of program code and the general principles of static analyzer operation. The authors identify the most important syntactic and semantic information in the programs, which can be used to find errors in the source code. The general methodology for development of diagnostic rules is proposed, which will improve the efficiency of static code analyzers.
Code smells may be introduced in software due to market rivalry, work pressure deadline, improper functioning, skills or inexperience of software developers. Code smells indicate problems in design or code which makes software hard to change and maintain. Detecting code smells could reduce the effort of developers, resources and cost of the software. Many researchers have proposed different techniques like DETEX for detecting code smells which have limited precision and recall. To overcome these limitations, a new technique named as SVMCSD has been proposed for the detection of code smells, based on support vector machine learning technique. Four code smells are specified namely God Class, Feature Envy, Data Class and Long Method and the proposed technique is validated on two open source systems namely ArgoUML and Xerces. The accuracy of SVMCSD is found to be better than DETEX in terms of two metrics, precision and recall, when applied on a subset of a system. While considering the entire system, SVMCSD detect more occurrences of code smells than DETEX.
Software metrics are widely used to measure the quality of software and to give an early indication of the efficiency of the development process in industry. There are many well-established frameworks for measuring the quality of source code through metrics, but limited attention has been paid to the quality of software models. In this article, we evaluate the quality of state machine models specified using the Analytical Software Design (ASD) tooling. We discuss how we applied a number of metrics to ASD models in an industrial setting and report about results and lessons learned while collecting these metrics. Furthermore, we recommend some quality limits for each metric and validate them on models developed in a number of industrial projects.
Detecting software security vulnerabilities and distinguishing vulnerable from non-vulnerable code is anything but simple. Most of the time, vulnerabilities remain undisclosed until they are exposed, for instance, by an attack during the software operational phase. Software metrics are widely-used indicators of software quality, but the question is whether they can be used to distinguish vulnerable software units from the non-vulnerable ones during development. In this paper, we perform an exploratory study on software metrics, their interdependency, and their relation with security vulnerabilities. We aim at understanding: i) the correlation between software architectural characteristics, represented in the form of software metrics, and the number of vulnerabilities; and ii) which are the most informative and discriminative metrics that allow identifying vulnerable units of code. To achieve these goals, we use, respectively, correlation coefficients and heuristic search techniques. Our analysis is carried out on a dataset that includes software metrics and reported security vulnerabilities, exposed by security attacks, for all functions, classes, and files of five widely used projects. Results show: i) a strong correlation between several project-level metrics and the number of vulnerabilities, ii) the possibility of using a group of metrics, at both file and function levels, to distinguish vulnerable and non-vulnerable code with a high level of accuracy.
Choosing how to write natural language scenarios is challenging, because stakeholders may over-generalize their descriptions or overlook or be unaware of alternate scenarios. In security, for example, this can result in weak security constraints that are too general, or missing constraints. Another challenge is that analysts are unclear on where to stop generating new scenarios. In this paper, we introduce the Multifactor Quality Method (MQM) to help requirements analysts to empirically collect system constraints in scenarios based on elicited expert preferences. The method combines quantitative statistical analysis to measure system quality with qualitative coding to extract new requirements. The method is bootstrapped with minimal analyst expertise in the domain affected by the quality area, and then guides an analyst toward selecting expert-recommended requirements to monotonically increase system quality. We report the results of applying the method to security. This include 550 requirements elicited from 69 security experts during a bootstrapping stage, and subsequent evaluation of these results in a verification stage with 45 security experts to measure the overall improvement of the new requirements. Security experts in our studies have an average of 10 years of experience. Our results show that using our method, we detect an increase in the security quality ratings collected in the verification stage. Finally, we discuss how our proposed method helps to improve security requirements elicitation, analysis, and measurement.
Creating and implementing fault-tolerant distributed algorithms is a challenging task in highly safety-critical industries. Using formal methods supports design and development of complex algorithms. However, formal methods are often perceived as an unjustifiable overhead. This paper presents the experience and insights when using TLA+ and PlusCal to model and develop fault-tolerant and safety-critical modules for TAS Control Platform, a platform for railway control applications up to safety integrity level (SIL) 4. We show how formal methods helped us improve the correctness of the algorithms, improved development efficiency and how part of the gap between model and implementation has been closed by translation to C code. Additionally, we describe how we gained trust in the formal model and tools by following a specific design process called property-driven design, which also implicitly addresses software quality metrics such as code coverage metrics.
Software has emerged as a significant part of many domains, including financial service platforms, social networks and vehicle control. Standards organizations have responded to this by creating regulations to address issues such as safety and privacy. In this context, compliance of software with standards has emerged as a key issue. For software development organizations, compliance is a complex and costly goal to achieve and is often accomplished by producing so-called assurance cases, which demonstrate that the system indeed satisfies the property imposed by a standard (e.g., safety, privacy, security). As systems and standards undergo evolution for a variety of reasons, maintaining assurance cases multiplies the effort. In this work, we propose to exploit the connection between the field of model management and the problem of compliance management and propose methods that use model management techniques to address compliance scenarios such as assurance case evolution and reuse. For validation, we ground our approaches on the automotive domain and the ISO 26262 standard for functional safety of road vehicles.
Secure by design is an approach to developing secure software systems from the ground up. In such approach, the alternate security tactics are first thought, among them, the best are selected and enforced by the architecture design, and then used as guiding principles for developers. Thus, design flaws in the architecture of a software system mean that successful attacks could result in enormous consequences. Therefore, secure by design shifts the main focus of software assurance from finding security bugs to identifying architectural flaws in the design. Current research in software security has been neglecting vulnerabilities which are caused by flaws in a software architecture design and/or deteriorations of the implementation of the architectural decisions. In this paper, we present the concept of Common Architectural Weakness Enumeration (CAWE), a catalog which enumerates common types of vulnerabilities rooted in the architecture of a software and provides mitigation techniques to address them. The CAWE catalog organizes the architectural flaws according to known security tactics. We developed an interactive web-based solution which helps designers and developers explore this catalog based on architectural choices made in their project. CAWE catalog contains 224 weaknesses related to security architecture. Through this catalog, we aim to promote the awareness of security architectural flaws and stimulate the security design thinking of developers, software engineers, and architects.
The correct prediction of faulty modules or classes has a number of advantages such as improving the quality of software and assigning capable development resources to fix such faults. There have been different kinds of fault/defect prediction models proposed in literature, but a great majority of them makes use of static code metrics as independent variables for making predictions. Recently, process metrics have gained a considerable attention as alternative metrics to use for making trust-worthy predictions. The objective of this paper is to investigate different combinations of static code and process metrics for evaluating fault prediction performance. We have used publicly available data sets, along with a frequently used classifier, Naive Bayes, to run our experiments. We have, both statistically and visually, analyzed our experimental results. The statistical analysis showed evidence against any significant difference in fault prediction performances for a variety of different combinations of metrics. This reinforced earlier research results that process metrics are as good as predictors of fault proneness as static code metrics. Furthermore, the visual inspection of box plots revealed that the best set of metrics for fault prediction is a mix of both static code and process metrics. We also presented evidence in support of some process metrics being more discriminating than others and thus making them as good predictors to use.
Many fault-proneness prediction models have been proposed in literature to identify fault-prone code in software systems. Most of the approaches use fault data history and supervised learning algorithms to build these models. However, since fault data history is not always available, some approaches also suggest using semi-supervised or unsupervised fault-proneness prediction models. The HySOM model, proposed in literature, uses function-level source code metrics to predict fault-prone functions in software systems, without using any fault data. In this paper, we adapt the HySOM approach for object-oriented software systems to predict fault-prone code at class-level granularity using object-oriented source code metrics. This adaptation makes it easier to prioritize the efforts of the testing team as unit tests are often written for classes in object-oriented software systems, and not for methods. Our adaptation also generalizes one main element of the HySOM model, which is the calculation of the source code metrics threshold values. We conducted an empirical study using 12 public datasets. Results show that the adaptation of the HySOM model for class-level fault-proneness prediction improves the consistency and the performance of the model. We additionally compared the performance of the adapted model to supervised approaches based on the Naive Bayes Network, ANN and Random Forest algorithms.
Several defect prediction models proposed are effective when historical datasets are available. Defect prediction becomes difficult when no historical data exist. Cross-project defect prediction (CPDP), which uses projects from other sources/companies to predict the defects in the target projects proposed in recent studies has shown promising results. However, the performance of most CPDP approaches are still beyond satisfactory mainly due to distribution mismatch between the source and target projects. In this study, a credibility theory based Naïve Bayes (CNB) classifier is proposed to establish a novel reweighting mechanism between the source projects and target projects so that the source data could simultaneously adapt to the target data distribution and retain its own pattern. Our experimental results show that the feasibility of the novel algorithm design and demonstrate the significant improvement in terms of the performance metrics considered achieved by CNB over other CPDP approaches.
Software security is an important aspect of ensuring software quality. Early detection of vulnerable code during development is essential for the developers to make cost and time effective software testing. The traditional software metrics are used for early detection of software vulnerability, but they are not directly related to code constructs and do not specify any particular granularity level. The goal of this study is to help developers evaluate software security using class-level traceable patterns called micro patterns to reduce security risks. The concept of micro patterns is similar to design patterns, but they can be automatically recognized and mined from source code. If micro patterns can better predict vulnerable classes compared to traditional software metrics, they can be used in developing a vulnerability prediction model. This study explores the performance of class-level patterns in vulnerability prediction and compares them with traditional class-level software metrics. We studied security vulnerabilities as reported for one major release of Apache Tomcat, Apache Camel and three stand-alone Java web applications. We used machine learning techniques for predicting vulnerabilities using micro patterns and class-level metrics as features. We found that micro patterns have higher recall in detecting vulnerable classes than the software metrics.
Blockchain is an emerging technology for decentralized and transactional data sharing across a large network of untrusted participants. It enables new forms of distributed software architectures, where components can find agreements on their shared states without trusting a central integration point or any particular participating components. Considering the blockchain as a software connector helps make explicitly important architectural considerations on the resulting performance and quality attributes (for example, security, privacy, scalability and sustainability) of the system. Based on our experience in several projects using blockchain, in this paper we provide rationales to support the architectural decision on whether to employ a decentralized blockchain as opposed to other software solutions, like traditional shared data storage. Additionally, we explore specific implications of using the blockchain as a software connector including design trade-offs regarding quality attributes.
Code coverage is a widely used measure to determine how thoroughly an application is tested. There are many tools available for different languages. However, to the best of our knowledge, most of them focus on unit testing and ignore end-to-end tests with ui- or web tests. Furthermore, there is no support for determining code coverage of transcompiled cross-platform applications. This kind of application is written in one language, but compiled to and executed in a different programming language. Besides, it may run on a different platform. In this paper, we propose a new code coverage testing method that calculates the code coverage of any kind of test (unit-, integration- or ui-/web-test) for any type of (transcompiled) applications (desktop, web or mobile application). Developers obtain information about which parts of the source code are uncovered by tests. The basis of our approach is generic and may be applied in numerous programming languages based on an abstract syntax tree. We present our approach for any-kind-applications developed in Java and evaluate our tool on a web application created with Google Web Toolkit, on standard desktop applications, and on some small Java applications that use the Swing library to create user interfaces. Our results show that our tool is able to judge the code coverage of any kind of test. In particular, our tool is independent of the unit- or ui-/web test-framework in use. The runtime performance is promising although it is not as fast as already existing tools in the area of unit-testing.
Software fault prediction is a significant part of software quality assurance and it is commonly used to detect faulty software modules based on software measurement data. Several machine learning based approaches have been proposed for generating predictive models from collected data, although none has become standard given the specificities of each software project. Hence, we believe that recommending the best algorithm for each project is much more important and useful than developing a single algorithm for being used in any project. For achieving that goal, we propose in this paper a novel framework for recommending machine learning algorithms that is capable of automatically identifying the most suitable algorithm according to the software project that is being considered. Our solution, namely SFP-MLF, makes use of the meta-learning paradigm in order to learn the best learner for a particular project. Results show that the SFP-MLF framework provides both the best single algorithm recommendation and also the best ranking recommendation for the software fault prediction problem.
Software components are software units designed to interact with other independently developed software components. These components are assembled by third parties into software applications. The success of final software applications largely depends upon the selection of appropriate and easy to fit components in software application according to the need of customer. It is primary requirement to evaluate the quality of components before using them in the final software application system. All the quality characteristics may not be of same significance for a particular software application of a specific domain. Therefore, it is necessary to identify only those characteristics/ sub-characteristics, which may have higher importance over the others. Analytical Network Process (ANP) is used to solve the decision problem, where attributes of decision parameters form dependency networks. The objective of this paper is to propose ANP based model to prioritize the characteristics /sub-characteristics of quality and to o estimate the numeric value of software quality.
- « first
- ‹ previous
- 1
- 2
- 3