Data science

TWC: Small: User Behavior Modeling and Prediction in Anonymous Social Networks

Human beings are diverse, and their online behavior is often unpredictable. In today's data-driven world, providers of online services are collecting detailed and comprehensive server-side traces of user activity. These records include timestamped logs of the actions users take, often called clickstreams. Given their scale and level of detail, clickstreams present an enormous opportunity for research into user behavioral analysis and modeling.

EAGER: Collaborative: PRICE: Using process tracing to improve household IoT users' privacy decisions

Household Internet-of-Things (IoT) devices are intended to collect information in the home and to communicate with each other, to create powerful new applications that support our day-to-day activities. Existing research suggests that users have a difficult time selecting their privacy settings on such devices. The goal of this project is to investigate how, why and when privacy decisions of household IoT users are suboptimal, and to use the insights from this research to create and test a simple single user interface that integrates privacy settings across all devices within a household.

TWC: Small: Statistical Models for Opinion Spam Detection Leveraging Linguistic and Behavioral Cues

Online opinions now play a pivotal role in decision making and influence a wide spectrum of our lives. Choices of restaurants at which to dine, places to stay, universities to attend, books to read, doctors to consult, and even political candidates to vote for are largely influenced by crowdsourced opinions. However, it is estimated that up to 30% of reviews on websites are fake. As a larger part of the US economy becomes driven by social opinions, fake reviews pose a serious risk to the general public (e.g., by misleading people into choosing low-quality products, services, or doctors).

TWC: Medium: Collaborative: Online Social Network Fraud and Attack Research and Identification

Online social networks (OSNs) face various forms of fraud and attacks, such as spam, denial of service, Sybil attacks, and viral marketing. In order to build trustworthy and secure OSNs, it has become critical to develop techniques to analyze and detect OSN fraud and attacks. Existing OSN security approaches usually target a specific type of OSN fraud or attack and often fall short of detecting more complex attacks such as collusive attacks that involve many fraudulent OSN accounts, or dynamic attacks that encompass multiple attack phases over time.

EAGER: Guaranteed-Secure and Searchable Genomic Data Repositories

Publicly available and searchable genomic data banks could revolutionize clinical and research settings, but privacy concerns about releasing such information currently prevent their use. This project aims to address these concerns by providing new mechanisms by which individuals can donate their genomic information to a data bank in such a way that third parties querying the data bank, such as doctors or researchers, are guaranteed to learn only those aggregate functions of the population's data that the individuals authorize.

TTP: Small: Collaborative: Defending Against Website Fingerprinting in Tor

The more people use the Internet, the more they risk sharing information they don't want other people to know. Tor is a technology that every day helps millions of people protect their privacy online. Tor users -- ranging from ordinary citizens to companies with valuable intellectual property -- gain protection for the content of their online messages and activities, as well as whom they interact with and when. For the most part, Tor is very secure. However, it has a known vulnerability to an attack called website fingerprinting.

EAGER: Collaborative: Algorithmic Framework for Anomaly Detection in Interdependent Networks

Modern critical infrastructure relies on successful interdependent function among many different types of networks. For example, the Internet depends on access to the power grid, which in turn depends on the power-grid communication network and the energy production network. For this reason, network science researchers have begun examining the robustness of critical infrastructure as a network of networks, or a multilayer network. Research in network anomaly detection systems has focused on single network structures (specifically, the Internet as a single network).

EAGER: Leveling the Digital Playing Field for the Job Seeker

This project aims to assess how online data impacts the hiring process. In an ideal situation, one might imagine that employers hire the most skilled applicant, but sociological research indicates that this may not be the case. A job applicant's similarity to the interviewer in class background and class-based leisure activities often matters to employers as much as, or more than, the applicant's skills or work experience.

TWC: Small: Unsupervised and Statistical Natural Language Processing Techniques for Automatic Phishing and Opinion Spam Detection

In phishing, an attacker tries to steal sensitive information, such as bank or credit card account numbers and login credentials, from Internet users. US society and the US economy are increasingly dependent on the Internet and the web, which are plagued by phishing. One popular phishing method is to create a site that mimics a legitimate site and then lure users to it via email, which is by far the most popular medium for enticing unsuspecting users to a phishing site.

SaTC-BSF: TWC: Small: Using Individual Differences to Personalize Security Mitigations

Over the past decade, people have realized that failure to account for human factors has resulted in many software security problems. Yet, when software does feature user-centric design, it takes into account average user behavior rather than catering to the individual. Thus, systems designers have gone from designing for security experts to now appealing to the least common denominator.