Biblio
Context:
The ‘as code’ suffix in infrastructure as code (IaC) refers to applying software engineering activities, such as version control, to maintain IaC scripts. Without these activities, defects with serious consequences may be introduced into IaC scripts. A systematic investigation of development anti-patterns for IaC scripts can guide practitioners in identifying activities to avoid defects in IaC scripts. Development anti-patterns are recurring development activities that relate to defective IaC scripts.
Goal:
The goal of this paper is to help practitioners improve the quality of infrastructure as code (IaC) scripts by identifying development activities that relate to defective IaC scripts.
Methodology:
We identify development anti-patterns by adopting a mixed-methods approach, where we apply quantitative analysis with 2,138 open source IaC scripts and conduct a survey with 51 practitioners.
Findings:
We observe five development activities to be related to defective IaC scripts in our quantitative analysis. We identify five development anti-patterns, namely ‘boss is not around’, ‘many cooks spoil’, ‘minors are spoiler’, ‘silos’, and ‘unfocused contribution’.
Conclusion:
Our identified development anti-patterns suggest the importance of ‘as code’ activities in IaC because these activities relate to the quality of IaC scripts.
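To make the flavor of these anti-patterns concrete, here is a minimal Python sketch (ours, not the paper's) of how a ‘minors are spoiler’-style signal might be computed from a script's commit history; the 5% cutoff and the function name are illustrative assumptions, not the paper's operationalization.

```python
from collections import Counter

def minor_contributor_ratio(commit_authors, threshold=0.05):
    # Fraction of commits authored by "minor" contributors: authors who
    # wrote less than `threshold` of the script's commits. The 5% cutoff
    # is an illustrative assumption, not the paper's exact definition.
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    minor = sum(c for c in counts.values() if c / total < threshold)
    return minor / total

# Hypothetical commit author list for one IaC script
authors = ["alice"] * 40 + ["bob"] * 8 + ["carol"] + ["dan"]
print(minor_contributor_ratio(authors))  # 0.04: 2 of 50 commits by minor authors
```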
Defects in infrastructure as code (IaC) scripts can have serious
consequences, for example, creating large-scale system outages. A
taxonomy of IaC defects can be useful for understanding the nature
of defects, and identifying activities needed to fix and prevent
defects in IaC scripts. The goal of this paper is to help practitioners
improve the quality of infrastructure as code (IaC) scripts by developing
a defect taxonomy for IaC scripts through qualitative analysis.
We develop a taxonomy of IaC defects by applying qualitative analysis
on 1,448 defect-related commits collected from open source
software (OSS) repositories of the OpenStack organization. We conduct
a survey with 66 practitioners to assess if they agree with the
identified defect categories included in our taxonomy. We quantify
the frequency of identified defect categories by analyzing 80,425
commits collected from 291 OSS repositories spanning 2005 to 2019.
Our defect taxonomy for IaC consists of eight categories, including
a category specific to IaC called idempotency (i.e., defects that
lead to incorrect system provisioning when the same IaC script is
executed multiple times). We observe that the 66 surveyed practitioners
agree most with idempotency. The most frequent defect category
is configuration data, i.e., providing erroneous configuration data
in IaC scripts. Our taxonomy and the quantified frequency of the
defect categories may help in advancing the science of IaC script
quality.
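As an illustration of the idempotency category, the following minimal Python sketch (ours, not drawn from the paper) contrasts a non-idempotent configuration step, which corrupts state when a script is executed twice, with an idempotent one that converges to the same state on every run.

```python
def append_config_line(config, line):
    # Non-idempotent: every execution appends again,
    # so repeated runs duplicate the setting.
    config.append(line)

def ensure_config_line(config, line):
    # Idempotent: the system converges to the same state
    # no matter how many times the step is executed.
    if line not in config:
        config.append(line)

state = []
for _ in range(2):                      # simulate running the script twice
    append_config_line(state, "port=8080")
print(state)                            # ['port=8080', 'port=8080'] -> defect

state = []
for _ in range(2):
    ensure_config_line(state, "port=8080")
print(state)                            # ['port=8080'] -> same state either way
```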
Proactive security reviews and test efforts are a necessary component of the software development lifecycle. Resource limitations often preclude reviewing the entire code base. Making informed decisions on what code to review can improve a team's ability to find and remove vulnerabilities. Risk-based attack surface approximation (RASA) is a technique that uses crash dump stack traces to predict what code may contain exploitable vulnerabilities. The goal of this research is to help software development teams prioritize security efforts by the efficient development of a risk-based attack surface approximation. We explore the use of RASA using Mozilla Firefox and Microsoft Windows stack traces from crash dumps. We create RASA at the file level for Firefox, in which 15.8% of the files included in the approximation contained 73.6% of the vulnerabilities seen for the product. We also explore the effect of random sampling of crashes on the approximation, as it may be impractical for organizations to store and process every crash received. We find that 10-fold random sampling of crashes at a rate of 10% resulted in 3% fewer vulnerabilities identified than using the entire set of stack traces for Mozilla Firefox. Sampling crashes in Windows 8.1 at a rate of 40% resulted in insignificant differences in vulnerability and file coverage compared to a rate of 100%.
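A minimal Python sketch of the core RASA idea under stated assumptions: stack traces are available as lists of file paths, and a file is on the approximated attack surface if it appears on at least one sampled trace. The parsing format, file names, and vulnerability set are hypothetical.

```python
import random

def attack_surface_files(stack_traces, sample_rate=1.0, seed=0):
    # File-level RASA: a file is on the approximated attack surface
    # if it appears on at least one (sampled) crash stack trace.
    rng = random.Random(seed)
    sampled = [t for t in stack_traces if rng.random() < sample_rate]
    return {frame for trace in sampled for frame in trace}

# Hypothetical crash dumps: each trace is the list of source files on the stack.
traces = [
    ["dom/events.cpp", "js/engine.cpp"],
    ["gfx/render.cpp", "dom/events.cpp"],
    ["js/engine.cpp"],
]
surface = attack_surface_files(traces, sample_rate=1.0)
known_vulnerable = {"dom/events.cpp", "net/http.cpp"}
print(surface)
print(f"vulnerability coverage: {len(surface & known_vulnerable) / len(known_vulnerable):.0%}")
```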
Massively Open Online Courses (MOOCs) provide a unique opportunity to reach out to students who would not normally be reached by alleviating the need to be physically present in the classroom. However, teaching software security coursework outside of a classroom setting can be challenging. What are the challenges when converting security material from an on-campus course to the MOOC format? The goal of this research is to assist educators in constructing software security coursework by providing a comparison of classroom courses and MOOCs. In this work, we compare demographic information, student motivations, and student results from an on-campus software security course and a MOOC version of the same course. We found that the two populations of students differed, with the MOOC reaching a more diverse set of students than the on-campus course. We found that students in the on-campus course had higher quiz scores, on average, than students in the MOOC. Finally, we document our experience running the courses and what we would do differently to assist future educators constructing similar MOOC's.
Security requirements around software systems have become more stringent as society becomes more interconnected via the Internet. New ways of prioritizing security efforts are needed so security professionals can use their time effectively to find security vulnerabilities or prevent them from occurring in the first place. The goal of this work is to help software development teams prioritize security efforts by approximating the attack surface of a software system via stack trace analysis. Automated attack surface approximation is a technique that uses crash dump stack traces to predict what code may contain exploitable vulnerabilities. If a code entity (a binary, file, or function) appears on stack traces, then Attack Surface Approximation (ASA) considers that code entity to be on the attack surface of the software system. We also explore whether the number of appearances of code on stack traces correlates with where security vulnerabilities are found. To date, feasibility studies of ASA have been performed on Windows 8 and 8.1 and on Mozilla Firefox. The results from these studies indicate that ASA may be useful for practitioners trying to secure their software systems. We are now working towards establishing the ground truth of what the attack surface of software systems is, along with looking at how ASA could change over time, among other metrics.
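One way to probe the appearance-count question is a rank correlation between per-file trace appearances and vulnerability counts. The sketch below (ours, with hypothetical data) uses scipy.stats.spearmanr for this; it is an illustration of the analysis, not the study's actual pipeline.

```python
from scipy.stats import spearmanr

# Hypothetical per-file data: stack-trace appearance counts and known
# vulnerability counts. A real study would mine these from crash dumps
# and vulnerability databases.
appearances = {"parser.c": 120, "net.c": 45, "ui.c": 3, "util.c": 1}
vuln_counts = {"parser.c": 7, "net.c": 2, "ui.c": 1, "util.c": 0}

files = sorted(appearances)
rho, p = spearmanr([appearances[f] for f in files],
                   [vuln_counts[f] for f in files])
print(f"Spearman rho = {rho:.2f} (p = {p:.2f})")
```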
Even though build automation tools help to reduce errors and rapid releases of software changes, use of build automation tools is not widespread amongst software practitioners. Software practitioners perceive build automation tools as complex, which can hinder the adoption of these tools. How well founded such perception is, can be determined by systematic exploration of adoption factors that influence usage of build automation tools. The goal of this paper is to aid software practitioners in increasing their usage of build automation tools by identifying the adoption factors that influence usage of these tools. We conducted a survey to empirically identify the adoption factors that influence usage of build automation tools. We obtained survey responses from 268 software professionals who work at NestedApps, Red Hat, as well as contribute to open source software. We observe that adoption factors related to complexity do not have the strongest influence on usage of build automation tools. Instead, we observe compatibility-related adoption factors, such as adjustment with existing tools, and adjustment with practitioner's existing workflow, to have influence on usage of build automation tools with greater importance. Findings from our paper suggest that usage of build automation tools might increase if: build automation tools fit well with practitioners' existing workflow and tool usage; and usage of build automation tools are made more visible among practitioners' peers.
Android applications pose security and privacy risks for end-users. These risks are often quantified by performing dynamic analysis and permission analysis of the Android applications after release. Predicting the security and privacy risks associated with Android applications at early stages of application development, e.g., when the developer(s) are writing the code of the application, might help Android application developers release applications that pose less security and privacy risk to end-users. The goal of this paper is to aid Android application developers in assessing the security and privacy risk associated with Android applications by using static code metrics as predictors. In our paper, we consider the security and privacy risk of an Android application as how susceptible the application is to leaking private information of end-users and to releasing vulnerabilities. We investigate how effectively static code metrics extracted from the source code of Android applications can be used to predict the security and privacy risk of Android applications. We collected 21 static code metrics from 1,407 Android applications and used the collected static code metrics to predict the security and privacy risk of the applications. As the oracle of security and privacy risk, we used Androrisk, a tool that quantifies the amount of security and privacy risk of an Android application using analysis of Android permissions and dynamic analysis. To accomplish our goal, we used statistical learners such as the radial-based support vector machine (r-SVM). For r-SVM, we observe a precision of 0.83. Findings from our paper suggest that, with proper selection of static code metrics, r-SVM can be used effectively to predict the security and privacy risk of Android applications.
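A minimal sketch of this kind of prediction setup using scikit-learn's RBF-kernel SVM. The feature matrix, labels, and data-generating rule below are synthetic stand-ins; in the paper's setup the labels would come from Androrisk, and the 21 columns would be real static code metrics.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score

# Synthetic stand-in: 1,407 apps x 21 static code metrics. The toy label
# rule below substitutes for the Androrisk oracle used in the paper.
rng = np.random.default_rng(0)
X = rng.normal(size=(1407, 21))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1407) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))  # radial-basis SVM
model.fit(X_tr, y_tr)
print(f"precision: {precision_score(y_te, model.predict(X_te)):.2f}")
```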
Intrusion Detection Systems (IDSs) are crucial security mechanisms widely deployed for critical network protection. However, conventional IDSs become incompetent due to the rapid growth in network size and the sophistication of large-scale attacks. To mitigate this problem, Collaborative IDSs (CIDSs) have been proposed in the literature. In CIDSs, a number of IDSs exchange their intrusion alerts and other relevant data so as to achieve better intrusion detection performance. Nevertheless, the required information exchange may result in privacy leakage, especially when these IDSs belong to different self-interested organizations. In order to obtain a quantitative understanding of the fundamental tradeoff between intrusion detection accuracy and the organizations' privacy, a repeated two-layer single-leader multi-follower game is proposed in this work. Based on our game-theoretic analysis, we derive the expected behaviors of both the attacker and the IDSs and obtain the utility-privacy tradeoff curve. In addition, the existence of a Nash equilibrium (NE) is proved, and an asynchronous dynamic update algorithm is proposed to compute the optimal collaboration strategies of IDSs. Finally, simulation results are presented to validate the analysis.
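The paper's repeated two-layer single-leader multi-follower game is beyond a short sketch, but the asynchronous update idea can be illustrated with a toy game: each IDS picks a sharing level trading detection benefit against privacy cost, and players best-respond one at a time. All payoffs, coefficients, and strategy levels below are illustrative assumptions, not the paper's model.

```python
# Each IDS chooses how much alert data to share: 0.0 (nothing) .. 1.0 (all).
LEVELS = [0.0, 0.25, 0.5, 0.75, 1.0]

def utility(i, strategy):
    # Toy payoff: detection benefit grows with total sharing, while the
    # privacy cost grows with a player's own sharing. Coefficients are
    # illustrative, not derived from the paper's game.
    detection_benefit = sum(strategy)
    privacy_cost = 2.0 * strategy[i] ** 2
    return detection_benefit - privacy_cost

def asynchronous_best_response(n=3, rounds=20):
    strategy = [0.0] * n
    for _ in range(rounds):
        for i in range(n):  # players update one at a time (asynchronously)
            strategy[i] = max(
                LEVELS,
                key=lambda s: utility(i, strategy[:i] + [s] + strategy[i + 1:]),
            )
    return strategy

# Converges to a Nash equilibrium of this toy game: no IDS can gain by
# unilaterally changing its sharing level.
print(asynchronous_best_response())  # [0.25, 0.25, 0.25]
```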
Scientific advancement is fueled by solid fundamental research, followed by replication, meta-analysis, and theory building. To support such advancement, researchers and government agencies have been working towards a "science of security". As in other sciences, security science requires high-quality fundamental research addressing important problems and reporting approaches that capture the information necessary for replication, meta-analysis, and theory building. The goal of this paper is to aid security researchers in establishing a baseline of the state of scientific reporting in security through an analysis of indicators of scientific research as reported in top security conferences, specifically the 2015 ACM CCS and 2016 IEEE S&P proceedings. To conduct this analysis, we employed a series of rubrics to analyze the completeness of information reported in papers relative to the type of evaluation used (e.g., empirical study, proof, discussion). Our findings indicate that some important information is often missing from papers, including explicit documentation of research objectives and threats to validity. Our findings also show a relatively small number of replications reported in the literature. We hope that this initial analysis will serve as a baseline against which we can measure the advancement of the science of security.
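A rubric analysis of this kind reduces to tallying, per reporting item, the fraction of papers that document it. A minimal sketch with hypothetical rubric data (the item names are examples drawn from the abstract, not the paper's full rubric):

```python
# Hypothetical rubric data: which reporting items each paper documents.
papers = [
    {"objectives": True,  "threats_to_validity": False, "replication": False},
    {"objectives": False, "threats_to_validity": False, "replication": False},
    {"objectives": True,  "threats_to_validity": True,  "replication": True},
]

for item in papers[0]:
    rate = sum(p[item] for p in papers) / len(papers)
    print(f"{item}: reported in {rate:.0%} of papers")
```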
When encountering a packet flow for which it has no covering rule, a software-defined networking (SDN) switch requests an appropriate rule from its controller; this request delays the routing of the flow until the controller responds. We show that this delay gives rise to a timing side channel in which an attacker can test for the recent occurrence of a target flow by judiciously probing the switch with forged flows and using the delays they suffer to discern whether covering rules were previously installed in the switch. We develop a Markov model of an SDN switch to permit the attacker to select the best probe (or probes) to infer whether a target flow has recently occurred. Our model captures complexities related to rule evictions to make room for other rules; rule timeouts due to inactivity; the presence of multiple rules that apply to overlapping sets of flows; and rule priorities. We show that our model permits detection of target flows with considerable accuracy in many cases.
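The core inference behind this side channel can be sketched without the paper's full Markov model: a probe that hits an installed covering rule is forwarded at switch speed, while a miss incurs a controller round trip, so the observed delay separates the two cases. The latency values and threshold below are illustrative assumptions.

```python
def rule_was_installed(probe_delay_ms, switch_only_ms=1.0, controller_rtt_ms=10.0):
    # If a covering rule is already in the flow table, the forged probe is
    # forwarded at switch speed; a miss adds a controller round trip. A
    # delay below the midpoint therefore suggests the target flow occurred
    # recently. All latency values are illustrative.
    threshold = switch_only_ms + controller_rtt_ms / 2
    return probe_delay_ms < threshold

print(rule_was_installed(1.3))   # True  -> covering rule present
print(rule_was_installed(11.8))  # False -> rule had to be fetched
```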
Objective: To evaluate the effectiveness of domain highlighting in helping users identify whether webpages are legitimate or spurious.
Background: As a component of the URL, a domain name can be overlooked. Consequently, browsers highlight the domain name to help users identify which website they are visiting. Nevertheless, few studies have assessed the effectiveness of domain highlighting, and the only formal study confounded highlighting with instructions to look at the address bar.
Method: Two phishing detection experiments were conducted. Experiment 1 was run online: Participants judged the legitimacy of webpages in two phases. In phase one, participants were to judge the legitimacy based on any information on the webpage, whereas in phase two they were to focus on the address bar. Whether the domain was highlighted was also varied. Experiment 2 was conducted similarly but with participants in a laboratory setting, which allowed tracking of fixations.
Results: Participants differentiated the legitimate and fraudulent webpages better than chance. There was some benefit of attending to the address bar, but domain highlighting did not provide effective protection against phishing attacks. Analysis of eye-gaze fixation measures was in agreement with the task performance, but heat-map results revealed that participants’ visual attention was attracted by the domain highlighting.
Conclusion: Failure to detect many fraudulent webpages even when the domain was highlighted implies that users lacked knowledge of webpage security cues or how to use those cues.
Application: Potential applications include development of phishing-prevention training incorporating domain highlighting with other methods to help users identify phishing webpages.
The current approach to protect users from phishing attacks is to display a warning when the webpage is considered suspicious. We hypothesize that users are capable of making correct informed decisions when the warning also conveys the reasons why it is displayed. We chose to use traffic rankings of domains, which can be easily described to users, as a warning trigger and evaluated the effect of the phishing warning message and phishing training. The evaluation was conducted in a field experiment. We found that knowledge gained from the training enhances the effectiveness of phishing warnings, as the number of participants being phished was reduced. However, the knowledge by itself was not sufficient to provide phishing protection. We suggest that integrating training in the warning interface, involving traffic ranking in phishing detection, and explaining why warnings are generated will improve current phishing defense.
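The traffic-ranking trigger described here amounts to warning on domains outside a top-ranked set. A minimal sketch under stated assumptions (the ranking source, threshold, and domains are all hypothetical):

```python
def should_warn(domain, rankings, rank_threshold=100_000):
    # Warn for domains outside the top-ranked set. `rankings` maps
    # domain -> traffic rank (lower = more popular); unranked domains
    # are treated as suspicious. The threshold is illustrative.
    rank = rankings.get(domain)
    return rank is None or rank > rank_threshold

rankings = {"example.com": 1_200, "paypal.com": 90}
print(should_warn("paypal.com", rankings))        # False: highly ranked domain
print(should_warn("paypa1-login.xyz", rankings))  # True: unranked, show warning
```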
Social network services enable users to conveniently share personal information. Often, the information shared concerns other people, especially other members of the social network service. In such situations, two or more people can have conflicting privacy preferences; thus, an appropriate sharing policy may not be apparent. We identify such situations as multiuser privacy scenarios. Current approaches propose finding a sharing policy through preference aggregation. However, studies suggest that users feel more confident in their decisions regarding sharing when they know the reasons behind each other's preferences. The goals of this paper are (1) understanding how people decide the appropriate sharing policy in multiuser scenarios where arguments are employed, and (2) developing a computational model to predict an appropriate sharing policy for a given scenario. We report on a study that involved a survey of 988 Amazon MTurk users about a variety of multiuser scenarios and the optimal sharing policy for each scenario. Our evaluation of the participants' responses reveals that contextual factors, user preferences, and arguments influence the optimal sharing policy in a multiuser scenario. We develop and evaluate an inference model that predicts the optimal sharing policy given the three types of features. We analyze the predictions of our inference model to uncover potential scenario types that lead to incorrect predictions, and to enhance our understanding of when multiuser scenarios are more or less prone to dispute.
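An inference model over the three feature types can be sketched as a small classifier. The encoding, feature names, and labels below are illustrative assumptions, not the paper's scheme or data.

```python
from sklearn.tree import DecisionTreeClassifier

# Each row encodes one multiuser scenario with the three feature types the
# paper identifies: contextual factors, user preferences, and arguments.
# [sensitivity (0-2), users objecting (count), strongest argument (0-2)]
X = [[2, 2, 2], [0, 0, 0], [1, 1, 2], [0, 1, 0], [2, 0, 1], [1, 2, 2]]
y = ["deny", "share", "deny", "share", "share", "deny"]

model = DecisionTreeClassifier(random_state=0).fit(X, y)
print(model.predict([[2, 1, 2]]))  # ['deny'] for a sensitive, disputed scenario
```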
To appear
The ineffectiveness of phishing warnings has been attributed to users' poor comprehension of the warning. However, the effectiveness of a phishing warning is typically evaluated at the time when users interact with a suspected phishing webpage, which we call the effect with the phishing warning. Nevertheless, users' improved phishing detection when the warning is absent, which we call the effect of the warning, is the ultimate goal of preventing users from falling for phishing scams. We conducted an online study to evaluate the effect with and of several phishing warning variations, varying the point at which the warning was presented and whether procedural knowledge instruction was included in the warning interface. The current Chrome phishing warning was included as a control. 360 Amazon Mechanical Turk workers made decisions about 10 login webpages (8 authentic, 2 fraudulent) with the aid of a warning (phase one). After a short distractor task, the workers made the same decisions about 10 different login webpages (8 authentic, 2 fraudulent) without a warning (phase two). In phase one, the compliance rates with the two proposed warning interfaces (98% and 94%) were similar to that of the Chrome warning (98%), regardless of when the warning was presented. In phase two (without warning), performance was better for the condition in which the warning with procedural knowledge instruction was presented before the phishing webpage in phase one, suggesting a better effect of the warning than for the other conditions. With procedural knowledge of how to determine a webpage’s legitimacy, users identified phishing webpages more accurately even without the warning being presented.
Many previous studies have shown that trust in automation mediates the effectiveness of automation in maintaining performance, and one critical factor that affects trust is the reliability of the automated system. In the cyber domain, automated systems are pervasive, yet the involvement of human trust has not been studied as extensively as in other domains such as transportation.
In the current study, we used a phishing email identification task (with an automated phishing detection assistant) as a testbed to study human trust in automation in the cyber domain. More specifically, we systematically investigated the influence of “description” (i.e., whether the user was informed about the actual reliability of the automated system) and “experience” (i.e., whether the user was provided feedback on their choices), in addition to the reliability level of the automated phishing detection system. These factors were varied across different conditions of response bias (false alarms vs. misses) and task difficulty (easy vs. difficult), which a pilot study suggested may be critical. Measures of user performance and trust were compared across conditions. The measures of interest were human trust in the warning (a subjective rating of how trustworthy the warning system is), human reliance on the automated system (an objective measure of whether the participants complied with the system’s warnings), and performance (the overall quality of the decisions made).