Biblio
The results of recent experiments have suggested that code stylometry can successfully identify the author of short programs from among hundreds of candidates with up to 98% precision. This potential ability to discern the programmer of a code sample from a large group of possible authors could have concerning consequences for the open-source community at large, particularly those contributors that may wish to remain anonymous. Recent international events have suggested the developers of certain anti-censorship and anti-surveillance tools are being targeted by their governments and forced to delete their repositories or face prosecution. In light of this threat to the freedom and privacy of individual programmers around the world, we devised a tool, Style Counsel, to aid programmers in obfuscating their inherent style and imitating another, overt, author's style in order to protect their anonymity from this forensic technique. Our system utilizes the implicit rules encoded in the decision points of a random forest ensemble in order to derive a set of recommendations to present to the user detailing how to achieve this obfuscation and mimicry attack.
Social robots may make use of social abilities such as persuasion, commanding obedience, and lying. Meanwhile, the field of computer security and privacy has shown that these interpersonal skills can be applied by humans to perform social engineering attacks. Social engineering attacks are the deliberate application of manipulative social skills by an individual in an attempt to achieve a goal by convincing others to do or say things that may or may not be in their best interests. In our work we argue that robot social engineering attacks are already possible and that defenses should be developed to protect against these attacks. We do this by defining what a robot social engineer is, outlining how previous research has demonstrated robot social engineering, and discussing the risks that can accompany robot social engineering attacks.
One of the most challenging issues facing academic conferences and educational institutions today is plagiarism detection. Typically, these entities wish to ensure that the work products submitted to them have not been plagiarized from another source (e.g., authors submitting identical papers to multiple journals). Assembling large centralized databases of documents dramatically improves the effectiveness of plagiarism detection techniques, but introduces a number of privacy and legal issues: all document contents must be completely revealed to the database operator, making it an attractive target for abuse or attack. Moreover, this content aggregation involves the disclosure of potentially sensitive private content, and in some cases this disclosure may be prohibited by law. In this work, we introduce Elxa, the first scalable centralized plagiarism detection system that protects the privacy of the submissions. Elxa incorporates techniques from the current state of the art in plagiarism detection, as evaluated by the information retrieval community. Our system is designed to be operated on existing cloud computing infrastructure, and to provide incentives for the untrusted database operator to maintain the availability of the network. Elxa can be used to detect plagiarism in student work, duplicate paper submissions (and their associated peer reviews), similarities between confidential reports (e.g., malware summaries), or any approximate text reuse within a network of private documents. We implement a prototype using the Hadoop MapReduce framework, and demonstrate that it is feasible to achieve competitive detection effectiveness in the private setting.