Highly Configurable Systems - July 2015
Public Audience
Purpose: To highlight progress. Information is generally presented at a high level that is accessible to the interested public.
PI(s): Jurgen Pfeffer
Co-PI(s): Christian Kastner
1) HARD PROBLEM(S) ADDRESSED (with short descriptions)
- Scalability and composability: Isolating configuration options or controlling their interactions will lead us toward composable analysis with regard to configuration options.
- Predictive security metrics: To what degree can configuration-related metrics indicate implementations that are more prone to vulnerabilities or in which vulnerabilities have more severe consequences?
2) PUBLICATIONS
Report papers written as a result of this research. If a paper was accepted by or submitted to a journal, name the journal. If it was presented at a conference, name the conference.
1. Kaestner, Christian & Pfeffer, Juergen (2014). Limiting Recertification in Highly Configurable Systems: Analyzing Interactions and Isolation among Configuration Options. HotSoS 2014: 2014 Symposium and Bootcamp on the Science of Security, April 8-9, Raleigh, NC.
2. Ferreira, Gabriel & Kastner, Christian & Pfeffer, Jurgen & Apel, Sven (2015). Characterizing Configuration Complexity in Highly-Configurable Systems with Variational Call Graphs. HotSoS 2015: 2015 Symposium and Bootcamp on the Science of Security, April 21-22, Urbana-Champaign, IL.
3. S. Zhou, J. Al-Kofahi, T. Nguyen, C. Kastner, and S. Nadi. Extracting Configuration Knowledge from Build Files with Symbolic Analysis. In Proceedings of the 3rd International Workshop on Release Engineering (Releng), New York, NY: ACM Press, May 2015.
Essential infrastructure for accurate analysis of configurable systems. A lot of configuration knowledge is stored in build system scripts. Without extracting that information, analyses that reason about feasible configurations within the source code produce both false positives and false negatives.
4. S. Nadi, T. Berger, C. Kastner, and K. Czarnecki. Where do Configuration Constraints Stem From? An Extraction Approach and an Empirical Study. IEEE Transactions on Software Engineering (TSE), 2015.
Targeted at understanding configuration knowledge within the code base.
5. F. Medeiros, C. Kastner, M. Ribeiro, S. Nadi, and R. Gheyi. The Love/Hate Relationship with The C Preprocessor: An Interview Study. In Proceedings of the 29th European Conference on Object-Oriented Programming (ECOOP), Berlin/Heidelberg: Springer-Verlag, 2015.
A study of why developers still use #ifdefs in source code and how they use them, even though #ifdefs increase complexity and can be quite dangerous. Programming guidelines for the preprocessor are often suggested but rarely enforced systematically. A key insight is that developers are very reluctant to adopt more advanced alternatives (many of which have been suggested in research) that could prevent whole classes of problems. This points to a technology transfer problem, which might be challenging to address. Certification and automatic tooling to check guidelines might provide a strategy to foster adoption of more secure coding practices.
6. C. Hunsen, J. Siegmund, O. Lessenich, S. Apel, B. Zhang, C. Kastner, and M. Becker. Preprocessor-Based Variability in Open-Source and Industrial Software Systems: An Empirical Study. Empirical Software Engineering (ESE), Special Issue on Empirical Evidence on Software Product Line Engineering, 2015.
A study analyzing whether open source and industrial configurable systems share similar characteristics. The study shows that they are similar with regard to preprocessor usage. This means that our results are much more likely to transfer from open source systems to industrial systems.
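The point made under item 3, that ignoring build-system knowledge distorts which configurations are considered, can be illustrated with a minimal Python sketch. It assumes a single hypothetical source file with two #ifdef options and a build rule that compiles the file only when a third option is enabled. The option names and the build rule are invented for illustration and are not taken from the paper or from any real build script.

```python
from itertools import product

# Hypothetical options guarding code inside one source file (assumed names).
SOURCE_OPTIONS = ["CONFIG_A", "CONFIG_B"]
# Hypothetical option that the build script requires before compiling the file.
BUILD_OPTION = "CONFIG_DRIVER"


def all_configurations(options):
    """Enumerate every on/off assignment for the given options."""
    for values in product([False, True], repeat=len(options)):
        yield dict(zip(options, values))


def file_is_compiled(config):
    """Assumed build rule: the file is compiled only if BUILD_OPTION is on."""
    return config[BUILD_OPTION]


options = SOURCE_OPTIONS + [BUILD_OPTION]
naive = list(all_configurations(options))
feasible = [c for c in naive if file_is_compiled(c)]

print(len(naive), "configurations without build knowledge")
print(len(feasible), "configurations in which the file is actually compiled")
```

In this toy setting, an analysis that ignores the build rule would also examine the configurations in which the file is never compiled, so any finding it reports for those configurations would be a false positive.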
3) KEY HIGHLIGHTS
- We finished implementing a variational version of a pointer analysis designed for building call graphs.
- We extracted the variational control-flow graph of the Linux kernel, resulting in 250,000+ nodes (functions) and 700,000+ connections (calls) among the nodes.
- We collected Linux CVEs and mapped them to our data.
- We started network analysis and network visualization on the Linux graph and began correlating the resulting metrics with known errors.
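The kind of analysis described in the highlights above can be sketched on a toy example. The Python snippet below is a minimal illustration, not the project's tooling: it assumes that each call edge carries a presence condition (modeled here simply as the set of options that must all be enabled), computes two simple per-function metrics, and compares them between functions with and without a known vulnerability. The function names, option names, and the vulnerable set are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Toy variational call graph: (caller, callee, presence condition).
# A presence condition is modeled here as the set of options that must all
# be enabled for the call to exist; an empty set means "always present".
EDGES = [
    ("main", "init", frozenset()),
    ("main", "net_send", frozenset({"CONFIG_NET"})),
    ("net_send", "encrypt", frozenset({"CONFIG_NET", "CONFIG_CRYPTO"})),
    ("net_send", "log", frozenset({"CONFIG_DEBUG"})),
    ("init", "log", frozenset({"CONFIG_DEBUG"})),
]

# Hypothetical set of functions linked to a known vulnerability report.
VULNERABLE = {"net_send"}


def node_metrics(edges):
    """Compute degree and number of distinct options touching each function."""
    degree = defaultdict(int)
    options = defaultdict(set)
    for caller, callee, condition in edges:
        for node in (caller, callee):
            degree[node] += 1
            options[node] |= condition
    return {n: {"degree": degree[n], "options": len(options[n])} for n in degree}


def compare(metrics, vulnerable, key):
    """Mean of one metric for vulnerable vs. other functions."""
    vuln = [m[key] for n, m in metrics.items() if n in vulnerable]
    rest = [m[key] for n, m in metrics.items() if n not in vulnerable]
    return mean(vuln), mean(rest)


metrics = node_metrics(EDGES)
for key in ("degree", "options"):
    vuln_mean, rest_mean = compare(metrics, VULNERABLE, key)
    print(f"{key}: vulnerable={vuln_mean:.2f} other={rest_mean:.2f}")
```

On a graph the size of the Linux kernel, a real study would need proper statistical tests and richer network metrics. The sketch only shows the shape of the data and of the comparison.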