EAGER: Effective Detection of Vulnerabilities and Linguistic Stratification in Open Source Software

Submitted by Raul Aranovich on Tue, 01/02/2018 - 3:04pm

Project Details

Lead PI

Raul Aranovich

Co-PIs

Vladimir Filkov

Premkumar Devanbu

Performance Period

Oct 01, 2014 - Sep 30, 2017

Institution(s)

University of California-Davis

Award Number

1445079

Outcomes Report URL

https://www.research.gov/research-portal/appmanager/base/desktop?_nfpb=true&_win...

Software vulnerabilities are weaknesses in code that cybercriminals may exploit to harm a system. Because they often do not hinder a program's functionality, they are difficult to detect. This project focuses on developing methods to identify the "weak spots" in a program, where vulnerabilities are more likely to occur.

The approach to detecting weak spots is based on the novel idea of examining the linguistic patterns that code developers use in Open-Source Software (OSS) online communities. Using a combination of natural language processing methods and sociolinguistic analyses, the PIs investigate the links between a programmer's role within a social hierarchy of trust and influence and his or her skill in producing code that avoids vulnerabilities and adheres to communal cybersecurity standards. The research results in a faster way to identify vulnerabilities, thereby helping to make programs safer. It also contributes to an understanding of the natural properties of code and the social dynamics of communication in online groups, laying the foundation for further research into linguistic aspects of software engineering.
