Bullying has lasting negative psychological and physical effects on victims, bystanders, and bullies alike; online settings can magnify both the scale and impact of these effects, as anonymity can embolden people to make hostile posts about individuals or groups. This project aims to reduce the prevalence of such posts through the design of active, automated "bystander interventions" in online comment threads. Bystander interventions, in which one or more witnesses to a bullying incident pressure the bully to stop, are often effective in schoolyards, but witnesses are frequently reluctant to intervene in online settings. Instead, a computer program could post comments containing these interventions, potentially reducing follow-on aggression from the original poster or from others who might pile on, provided that bullies perceive the posts as coming from human bystanders and that bullies posting under pseudonyms react to bystander interventions as they do in face-to-face confrontations. The project will proceed in three main stages. The first stage involves improving cyberbullying detection through better handling of the non-standard language use associated with bullying in a particular commenting system. The second stage involves developing a dialogue system that acts like a human bystander, creating messages that look appropriate in the context of a given comment thread and that contain psychologically valid bystander interventions. The third stage involves deploying the tool on a large video-sharing site and monitoring its ability to detect and, through interventions, mitigate further bullying. If successful, the project could meaningfully reduce online aggression in social media systems while lessening the need for (and possible harms to) human moderators; the tools will also be released to the community to support other research on how chatbots and humans might interact in online comment threads.

The work on detection aims to advance natural language processing (NLP) and computational pragmatics, particularly around non-canonical language use, because state-of-the-art bullying detection schemes typically use bag-of-words approaches that do not consider the linguistic and structural features of cyberbullying. The team will explore how to identify both explicit indicators of bullying, by developing topic models over richer features in which particular topics are more strongly associated with bullying, and implicit indicators, by looking for words whose use in a given context diverges from their use in other contexts. The context will be represented as a subspace spanned by the surrounding words, with the words themselves represented as low-dimensional word embeddings (see the first sketch below).

The dialogue generation portion of the project will characterize and represent properties of effective bystander interventions from the psychology literature. This representation will drive a dialogue manager designed to generate bystander responses automatically, so that the responses are both believable and contain features known to be effective in reducing bullying online (see the second sketch below). These components will first be evaluated through offline testing, using comment data labeled for bullying content and human ratings of the generated dialogue. Once a reasonably effective pipeline has been built, it will be evaluated in a series of online experiments in which comment threads are monitored and automated bystander responses are generated for some, but not all, threads detected as containing bullying.
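The implicit-indicator idea is described only at a high level in this summary. As a purely illustrative sketch (not the project's implementation), the divergence of a word from its context can be measured by how far its embedding lies from the subspace spanned by the embeddings of its neighboring words; the window size, threshold, and embedding lookup below are assumed placeholders.

```python
# Illustrative sketch: score how far a word's embedding lies from the
# subspace spanned by its context words' embeddings.
import numpy as np

def context_divergence(target_vec, context_vecs):
    """Distance from target_vec to its orthogonal projection onto the
    subspace spanned by the context word embeddings."""
    C = np.stack(context_vecs, axis=1)  # d x k matrix of context embeddings
    # Least-squares projection of the target onto span(C)
    coeffs, *_ = np.linalg.lstsq(C, target_vec, rcond=None)
    projection = C @ coeffs
    return float(np.linalg.norm(target_vec - projection))

def flag_implicit_indicators(tokens, embeddings, window=3, threshold=0.6):
    """Flag tokens whose in-context use diverges from their usual embedding,
    a possible implicit signal of non-canonical (e.g., coded or sarcastic)
    language. `embeddings` maps token -> d-dimensional vector; the window
    size and threshold are arbitrary placeholders."""
    flags = []
    for i, tok in enumerate(tokens):
        if tok not in embeddings:
            continue
        context = [embeddings[t] for j, t in enumerate(tokens)
                   if j != i and abs(j - i) <= window and t in embeddings]
        if len(context) < 2:
            continue
        score = context_divergence(embeddings[tok], context)
        if score > threshold:
            flags.append((tok, score))
    return flags
```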
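The dialogue manager is likewise described only at a high level; one minimal, hypothetical realization is a template-based generator that fills intervention strategies drawn from the bystander literature with details of the detected thread. The strategy names and wording below are illustrative assumptions, not validated interventions or the project's design.

```python
# Illustrative sketch of a template-based bystander-response generator;
# strategies and wording are placeholders, not validated interventions.
import random

# Candidate intervention strategies, loosely inspired by the bystander
# literature (e.g., empathy appeals, social-norm reminders).
TEMPLATES = {
    "empathy":     "Hey {author}, that comment could really hurt {target}. Maybe ease up?",
    "social_norm": "Most people here keep it civil; this thread doesn't need attacks on {target}.",
    "support":     "{target}, don't take that to heart. Not everyone agrees with {author}.",
}

def generate_bystander_response(thread_context, strategy=None, rng=random):
    """Pick an intervention strategy and instantiate it with thread details.
    `thread_context` is assumed to carry the offending comment's author and
    the apparent target of the bullying."""
    strategy = strategy or rng.choice(list(TEMPLATES))
    return TEMPLATES[strategy].format(
        author=thread_context["author"],
        target=thread_context["target"],
    )

# Example: a detected bullying comment by "user123" aimed at "new_poster"
print(generate_bystander_response({"author": "user123", "target": "new_poster"}))
```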
The software will log the monitored threads and any generated responses, along with commenter behavior both before and after the automated bystander response in each thread; these data will allow the team to evaluate the impact of the bystander intervention on bullying incidents later in the thread.
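As a concrete illustration of what such logging might capture, a thread-level record could store the detection outcome, whether an automated response was posted, and the comments before and after the intervention point. The fields and the simple outcome measure below are assumptions, not the project's actual schema.

```python
# Hypothetical logging record for one monitored thread; field names are
# illustrative, not the project's actual schema.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ThreadLog:
    thread_id: str
    detected_bullying: bool            # did the detector flag this thread?
    intervened: bool                   # was an automated bystander response posted?
    generated_response: Optional[str]  # the response text, if any
    comments_before: List[str] = field(default_factory=list)
    comments_after: List[str] = field(default_factory=list)

def aggression_delta(log: ThreadLog, is_aggressive) -> int:
    """Crude outcome measure: change in the count of aggressive comments
    after the intervention point, where `is_aggressive` is some classifier
    function returning True/False for a comment."""
    before = sum(is_aggressive(c) for c in log.comments_before)
    after = sum(is_aggressive(c) for c in log.comments_after)
    return after - before
```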