Highly Configurable Systems - January 2016

Public Audience
Purpose: To highlight progress. Information is generally at a higher level which is accessible to the interested public.

PI(s): Jurgen Pfeffer
Co-PI(s): Christian Kastner

1) HARD PROBLEM(S) ADDRESSED (with short descriptions)

  • Scalability and composability: Isolating configuration options or controlling their interactions will lead us toward composable analysis with regard to configuration options.
  • Predictive security metrics: To what degree can configuration-related metrics indicate implementations that are more prone to vulnerabilities or in which vulnerabilities have more severe consequences?

2) PUBLICATIONS

Report papers written as a result of this research. If accepted by or submitted to a journal, state which journal. If presented at a conference, state which conference.

1. Kastner, Christian & Pfeffer, Jurgen (2014). Limiting Recertification in Highly Configurable Systems: Analyzing Interactions and Isolation among Configuration Options. HotSoS 2014: 2014 Symposium and Bootcamp on the Science of Security, April 8-9, Raleigh, NC.

2. Ferreira, Gabriel & Kastner, Christian & Pfeffer, Jurgen & Apel, Sven (2015). Characterizing Configuration Complexity in Highly-Configurable Systems with Variational Call Graphs. HotSoS 2015: 2015 Symposium and Bootcamp on the Science of Security, April 21-22, Urbana-Champaign, IL.

3. S. Zhou, J. Al-Kofahi, T. Nguyen, C. Kastner, and S. Nadi. Extracting Configuration Knowledge from Build Files with Symbolic Analysis. In Proceedings of the 3rd International Workshop on Release Engineering (Releng), New York, NY: ACM Press, May 2015.
Essential infrastructure for accurate analysis of configurable systems. A lot of configuration knowledge is stored in build system scripts. Without extracting that information, analyses that reason about feasible configurations within source code produce false positives and false negatives.

4. S. Nadi, T. Berger, C. Kastner, and K. Czarnecki. Where do Configuration Constraints Stem From? An Extraction Approach and an Empirical Study. IEEE Transactions on Software Engineering (TSE), 2015.
Targeted at understanding configuration knowledge within the code base.


5. F. Medeiros, C. Kastner, M. Ribeiro, S. Nadi, and R. Gheyi. The Love/Hate Relationship with The C Preprocessor: An Interview Study. In Proceedings of the 29th European Conference on Object-Oriented Programming (ECOOP), Berlin/Heidelberg: Springer-Verlag, 2015.
An interview study of why developers still use #ifdefs in source code and how they use them, even though #ifdefs increase complexity and can be quite dangerous. Programming guidelines for the preprocessor are often suggested but rarely enforced systematically. A key insight is that developers are very reluctant to adopt more advanced alternatives (many of which have been proposed in research) that could prevent whole classes of problems. This points to a technology transfer problem that might be challenging to address. Certification and automatic tooling to check guidelines might provide a strategy to foster adoption of more secure coding practices.

6. C. Hunsen, J. Siegmund, O. Lessenich, S. Apel, B. Zhang, C. Kastner, and M. Becker. Preprocessor-Based Variability in Open-Source and Industrial Software Systems: An Empirical Study. Empirical Software Engineering (ESE), Special Issue on Empirical Evidence on Software Product Line Engineering, 2015.
A study analyzing whether open source and industrial configurable systems share similar characteristics. The study shows that they are similar with regard to preprocessor usage. This means that our results are much more likely to transfer from open source systems to industrial systems.

3) KEY HIGHLIGHTS

  • Our results show that our configuration complexity constructs add information to the well-established metrics.

  • New extraction of the Linux kernel call graph, with increased accuracy enabled by the previously implemented pointer analysis.

  • Extraction and quantification of the Linux kernel feature model. The hierarchy of the tree-based model was used to assign each feature a value that represents how frequently it is included when tailoring a configuration of the Linux kernel.

  • Statistical hypothesis tests on files and functions in the Linux kernel: mean comparisons (t-test) between vulnerable and non-vulnerable files. The goal was to use size metrics (LOC, graph density) as proxies for the complexity of files, and to check whether these metrics can distinguish vulnerable from non-vulnerable files.
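
The mean comparison in the last highlight can be illustrated with a minimal sketch. It assumes Welch's variant of the t-test (which does not require equal variances); the report does not say which variant was used. The function names `welch_t` and `graph_density` and all LOC values are hypothetical and invented for illustration. They are not project data or project code.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors, based on sample variances (n - 1 denominator).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


def graph_density(num_nodes, num_edges):
    """Density of a directed graph without self-loops: edges / (n * (n - 1))."""
    if num_nodes < 2:
        return 0.0
    return num_edges / (num_nodes * (num_nodes - 1))


# Hypothetical LOC values per file, invented for illustration only.
vulnerable_loc = [1200, 950, 1800, 1430, 1100]
non_vulnerable_loc = [400, 620, 380, 510, 700, 450]

t_stat, dof = welch_t(vulnerable_loc, non_vulnerable_loc)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

A large absolute t value relative to the t distribution with `dof` degrees of freedom would suggest that the two groups differ in mean size. The same comparison could be run on per-file graph density values instead of LOC.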