Biblio

Filters: Keyword is WikiLeaks dataset
2018-01-10
Alzhrani, K., Rudd, E. M., Chow, C. E., Boult, T. E. 2017. Automated U.S diplomatic cables security classification: Topic model pruning vs. classification based on clusters. 2017 IEEE International Symposium on Technologies for Homeland Security (HST). :1–6.
The U.S. Government has been the target of cyberattacks from all over the world. Just recently, former President Obama accused the Russian government of leaking emails to WikiLeaks and declared that the U.S. might be forced to respond. While Russia denied involvement, it is clear that the U.S. has to take defensive measures to protect its data infrastructure. Insider threats have caused other leaks of sensitive information too, including the infamous Edward Snowden incident. Most of the recent leaks were in the form of text. Due to the nature of text data, security classifications are assigned manually. In an adversarial environment, insiders can leak texts through email, printers, or any untrusted channel. The optimal defense is to automatically detect the security class of unstructured text and enforce the appropriate protection mechanism without degrading services or daily tasks. Unfortunately, existing Data Leak Prevention (DLP) systems are not well suited for detecting unstructured texts. In this paper, we compare two recent approaches in the literature for text security classification, evaluating them on actual sensitive text data from the WikiLeaks dataset.
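To make the idea of automatically assigning a security class to unstructured text concrete, here is a minimal sketch of a bag-of-words classifier. It is not either of the two approaches the paper compares (topic model pruning and cluster-based classification); it swaps in a plain multinomial Naive Bayes model with add-one smoothing purely for illustration. The class name `NaiveBayesTextClassifier`, the labels `SECRET` and `UNCLASSIFIED`, and the four training sentences are all hypothetical and are not drawn from the WikiLeaks dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; a real system would need far more care.
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing (illustrative only)."""

    def fit(self, documents, labels):
        self.class_doc_counts = Counter(labels)   # documents seen per class
        self.word_counts = defaultdict(Counter)   # per-class word frequencies
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / self.total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, invented for this sketch.
TRAIN = [
    ("embassy meeting notes on covert source identities", "SECRET"),
    ("source identities and covert operation schedule", "SECRET"),
    ("press release on public cultural exchange event", "UNCLASSIFIED"),
    ("public schedule for cultural event and press briefing", "UNCLASSIFIED"),
]

classifier = NaiveBayesTextClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)

if __name__ == "__main__":
    print(classifier.predict("covert source identities"))
    print(classifier.predict("public press briefing"))
```

In a deployment like the one the abstract describes, the predicted label would then drive a protection mechanism, for example blocking an outgoing email. That enforcement step is outside the scope of this sketch.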