Biblio

Filters: Keyword is belief propagation
2020-03-16
Ren, Wenyu, Yu, Tuo, Yardley, Timothy, Nahrstedt, Klara.  2019.  CAPTAR: Causal-Polytree-based Anomaly Reasoning for SCADA Networks. 2019 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). :1–7.
The Supervisory Control and Data Acquisition (SCADA) system is the most commonly used industrial control system but is subject to a wide range of serious threats. Intrusion detection systems are deployed to promote the security of SCADA systems, but they continuously generate a tremendous number of alerts without further comprehending them. There is a need for an efficient system to correlate alerts and discover attack strategies to provide explainable situational awareness to SCADA operators. In this paper, we present a causal-polytree-based anomaly reasoning framework for SCADA networks, named CAPTAR. CAPTAR takes the meta-alerts from our previous anomaly detection framework EDMAND, correlates them using a naive Bayes classifier, and matches them to predefined causal polytrees. Utilizing Bayesian inference on the causal polytrees, CAPTAR can produce a high-level view of the security state of the protected SCADA network. Experiments on a prototype of CAPTAR demonstrate its anomaly reasoning ability and its capability to satisfy the real-time reasoning requirement.
2019-03-06
Khalil, Issa M., Guan, Bei, Nabeel, Mohamed, Yu, Ting.  2018.  A Domain Is Only As Good As Its Buddies: Detecting Stealthy Malicious Domains via Graph Inference. Proceedings of the Eighth ACM Conference on Data and Application Security and Privacy. :330–341.

Inference-based techniques are one of the major approaches to analyzing DNS data and detecting malicious domains. The key idea of inference techniques is to first define associations between domains based on features extracted from DNS data. Then, an inference algorithm is deployed to infer potential malicious domains based on their direct/indirect associations with known malicious ones. The way associations are defined is key to the effectiveness of an inference technique. It is desirable for an association scheme to be both accurate (i.e., avoid falsely associating domains with no meaningful connections) and to have good coverage (i.e., identify all associations between domains with meaningful connections). Due to the limited scope of information provided by DNS data, it becomes a challenge to design an association scheme that achieves both high accuracy and good coverage. In this paper, we propose a new approach to identify domains controlled by the same entity. Our key idea is an in-depth analysis of active DNS data to accurately separate public IPs from dedicated ones, which enables us to build high-quality associations between domains. Our scheme avoids the pitfall of naive approaches that rely on the weak "co-IP" relationship of domains (i.e., two domains are resolved to the same IP), which results in low detection accuracy, and, meanwhile, identifies many meaningful connections between domains that are discarded by existing state-of-the-art approaches. Our experimental results show that the proposed approach not only significantly improves the domain coverage compared to existing approaches but also achieves better detection accuracy. Existing path-based inference algorithms are specifically designed for DNS data analysis. They are effective but computationally expensive. To further demonstrate the strength of our domain association scheme as well as improve the inference efficiency, we construct a new domain-IP graph that can work well with the generic belief propagation algorithm. Through comprehensive experiments, we show that this approach offers significant efficiency and scalability improvement with only a minor impact on detection accuracy, which suggests that such a combination could offer a good tradeoff for malicious domain detection in practice.

2018-06-20
Tran, H., Nguyen, A., Vo, P., Vu, T..  2017.  DNS graph mining for malicious domain detection. 2017 IEEE International Conference on Big Data (Big Data). :4680–4685.

Because malicious domains are a vital component of a variety of cyber attacks, malicious domain detection has become a hot topic in cyber security. Several recent techniques identify malicious domains through analysis of DNS data, because DNS data contains much global information that attackers cannot affect. Attackers routinely recycle resources, so they frequently change domain-IP resolutions and create new domains to avoid detection. As a result, multiple malicious domains are hosted on the same IPs, and multiple IPs simultaneously host the same malicious domains, which creates an intrinsic association among them. Hence, labeled domains, which can be traced back through the query history of all domains, can be used to verify and work out these associations. Graphs appear to be the best candidate for representing this relationship, and many high-performance graph algorithms have been developed. A graph-based interface can be developed and transformed into the graph mining task of inferring graph nodes' reputation scores using improvements of the belief propagation algorithm. The higher a node's reputation score, the higher its inferred probability of being malicious. For demonstration, this paper proposes a malicious domain detection technique and evaluates it on a real-world dataset. The dataset is collected from DNS data servers and is used to build a DNS graph. The proposed technique achieves an accuracy of over 98.3%, with precision and recall of 99.1% and 98.6%, respectively. In particular, with a small set of labeled domains (legitimate and malicious), the technique can discover a large set of potential malicious domains. The results indicate that the method is strongly effective in detecting malicious domains.
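
To make the idea of inferring node reputation scores by belief propagation concrete, here is a minimal sketch of standard sum-product (loopy) belief propagation on a small undirected domain-IP graph with two states per node (0 = benign, 1 = malicious). It is not the improved algorithm from the paper above. The function name `belief_propagation`, the `same` edge-agreement parameter, the prior values, and the example hostnames are all assumptions chosen for illustration.

```python
def belief_propagation(edges, priors, same=0.6, iters=20):
    """Sum-product loopy BP on an undirected graph with binary node states.

    edges  : iterable of (u, v) pairs, e.g. (domain, ip)
    priors : dict node -> prior P(malicious); unlabeled nodes default to 0.5
    same   : assumed probability that two linked nodes share the same state
    Returns a dict node -> belief P(malicious).
    """
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)

    # Node potentials: (P(benign), P(malicious)).
    phi = {n: (1.0 - priors.get(n, 0.5), priors.get(n, 0.5)) for n in nbrs}
    # Edge potential psi[x_u][x_v]: linked nodes tend to agree.
    psi = ((same, 1.0 - same), (1.0 - same, same))

    # One message per directed edge, initialised to uniform.
    msgs = {(u, v): (0.5, 0.5) for u in nbrs for v in nbrs[u]}

    for _ in range(iters):
        new = {}
        for (u, v) in msgs:
            out = []
            for xv in (0, 1):
                total = 0.0
                for xu in (0, 1):
                    prod = phi[u][xu] * psi[xu][xv]
                    # Multiply in messages from u's other neighbours.
                    for w in nbrs[u]:
                        if w != v:
                            prod *= msgs[(w, u)][xu]
                    total += prod
                out.append(total)
            z = out[0] + out[1]
            new[(u, v)] = (out[0] / z, out[1] / z)
        msgs = new

    beliefs = {}
    for n in nbrs:
        b0, b1 = phi[n]
        for w in nbrs[n]:
            b0 *= msgs[(w, n)][0]
            b1 *= msgs[(w, n)][1]
        beliefs[n] = b1 / (b0 + b1)
    return beliefs


# Hypothetical example: one labeled malicious and one labeled benign domain.
edges = [("bad.example", "ip1"), ("unknown.example", "ip1"),
         ("good.example", "ip2"), ("other.example", "ip2")]
priors = {"bad.example": 0.99, "good.example": 0.01}
scores = belief_propagation(edges, priors)
```

In this toy graph the unlabeled domain that shares an IP with the labeled malicious domain ends up with a belief above 0.5, and the one that shares an IP with the labeled benign domain ends up below 0.5. Because the example graph has no cycles, the beliefs are exact marginals; on graphs with cycles the same procedure yields only an approximation.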

2018-04-02
Cheng, Q., Kwiat, K., Kamhoua, C. A., Njilla, L..  2017.  Attack Graph Based Network Risk Assessment: Exact Inference vs Region-Based Approximation. 2017 IEEE 18th International Symposium on High Assurance Systems Engineering (HASE). :84–87.

Quantitative risk assessment is a critical first step in risk management and assured design of networked computer systems. It is challenging to evaluate the marginal probabilities of target states/conditions when using a probabilistic attack graph to represent all possible attack paths and the probabilistic cause-consequence relations among nodes. The brute force approach has exponential complexity, and the belief propagation method gives only an approximation when the corresponding factor graph has cycles. To improve the approximation accuracy, a region-based method is adopted, which clusters some highly dependent nodes into regions and passes messages among regions. Experiments are conducted to compare the performance of the different methods.
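
As an illustration of the brute-force baseline mentioned in this abstract, the sketch below computes exact marginal compromise probabilities on a tiny probabilistic attack graph by enumerating all 2^n joint states. The noisy-OR combination of incoming exploits, the function name `exact_marginals`, and the four-node example graph with its probabilities are assumptions made for this example and are not taken from the paper.

```python
from itertools import product


def exact_marginals(nodes, parents, edge_prob, prior):
    """Exact P(node compromised) by enumerating every joint state.

    nodes     : list of node names
    parents   : dict node -> list of parent nodes (missing/empty for roots)
    edge_prob : dict (parent, child) -> P(exploit succeeds | parent compromised)
    prior     : dict root node -> P(compromised)
    Cost grows as 2 ** len(nodes), which is why this only works for small graphs.
    """
    marg = {n: 0.0 for n in nodes}
    for states in product((0, 1), repeat=len(nodes)):
        x = dict(zip(nodes, states))
        p = 1.0
        for n in nodes:
            if not parents.get(n):
                p1 = prior[n]
            else:
                # Noisy-OR (assumed): the node stays safe only if every
                # exploit from a compromised parent fails.
                fail = 1.0
                for q in parents[n]:
                    if x[q]:
                        fail *= 1.0 - edge_prob[(q, n)]
                p1 = 1.0 - fail
            p *= p1 if x[n] else 1.0 - p1
        for n in nodes:
            if x[n]:
                marg[n] += p
    return marg


# Hypothetical graph: the attacker can reach the database via two paths,
# so the underlying undirected graph contains a cycle.
nodes = ["attacker", "web", "vpn", "db"]
parents = {"web": ["attacker"], "vpn": ["attacker"], "db": ["web", "vpn"]}
edge_prob = {("attacker", "web"): 0.8, ("attacker", "vpn"): 0.5,
             ("web", "db"): 0.5, ("vpn", "db"): 0.5}
marginals = exact_marginals(nodes, parents, edge_prob, {"attacker": 1.0})
```

For this example the marginal for `db` works out to 1 - (0.2 + 0.8 * 0.5) * (0.5 + 0.5 * 0.5) = 0.55. Each extra node doubles the number of enumerated states, which is the exponential cost that motivates approximate methods such as belief propagation and region-based message passing.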

2017-08-02
Jang, Min-Hee, Faloutsos, Christos, Kim, Sang-Wook, Kang, U, Ha, Jiwoon.  2016.  PIN-TRUST: Fast Trust Propagation Exploiting Positive, Implicit, and Negative Information. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. :629–638.

Given "who-trusts/distrusts-whom" information, how can we propagate the trust and distrust? With the appearance of fraudsters in social network sites, the importance of trust prediction has increased. Most such methods use only explicit and implicit trust information (e.g., if Smith likes several of Johnson's reviews, then Smith implicitly trusts Johnson), but they do not consider distrust. In this paper, we propose PIN-TRUST, a novel method to handle all three types of interaction information: explicit trust, implicit trust, and explicit distrust. The novelties of our method are the following: (a) it is carefully designed to take into account positive, implicit, and negative information, (b) it is scalable (i.e., linear on the input size), and (c) most importantly, it is effective and accurate. Our extensive experiments with a real dataset, Epinions.com data, of 100K nodes and 1M edges, confirm that PIN-TRUST is scalable and outperforms existing methods in terms of prediction accuracy, achieving up to 50.4% relative improvement.

2017-07-24
Chakrabarti, Aniket, Marwah, Manish, Arlitt, Martin.  2016.  Robust Anomaly Detection for Large-Scale Sensor Data. Proceedings of the 3rd ACM International Conference on Systems for Energy-Efficient Built Environments. :31–40.

Large scale sensor networks are ubiquitous nowadays. An important objective of deploying sensors is to detect anomalies in the monitored system or infrastructure, which allows remedial measures to be taken to prevent failures, inefficiencies, and security breaches. Most existing sensor anomaly detection methods are local, i.e., they do not capture the global dependency structure of the sensors, nor do they perform well in the presence of missing or erroneous data. In this paper, we propose an anomaly detection technique for large scale sensor data that leverages relationships between sensors to improve robustness even when data is missing or erroneous. We develop a probabilistic graphical model-based global outlier detection technique that represents a sensor network as a pairwise Markov Random Field and uses graphical model inference to detect anomalies. We show our model is more robust than local models, and detects anomalies with 90% accuracy even when 50% of sensors are erroneous. We also build a synthetic graphical model generator that preserves statistical properties of a real data set to test our outlier detection technique at scale.
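
The following sketch illustrates, under simplifying assumptions, how a pairwise Markov Random Field can let neighbouring sensors compensate for a missing reading. It computes exact marginals by enumeration on a very small network, which stands in for the graphical model inference used in the paper. The function name `mrf_marginals`, the `agree` edge potential, and the evidence values are hypothetical.

```python
from itertools import product


def mrf_marginals(sensors, edges, evidence, agree=0.8):
    """Exact P(sensor is anomalous) in a small pairwise MRF by enumeration.

    sensors  : list of sensor names
    edges    : list of (a, b) pairs of related sensors
    evidence : dict sensor -> (likelihood if normal, likelihood if anomalous);
               sensors with missing data are omitted and get a flat potential
    agree    : assumed potential for two related sensors sharing a state
    """
    total = 0.0
    marg = {s: 0.0 for s in sensors}
    for states in product((0, 1), repeat=len(sensors)):
        x = dict(zip(sensors, states))
        w = 1.0
        # Node potentials from the observed readings.
        for s in sensors:
            w *= evidence.get(s, (1.0, 1.0))[x[s]]
        # Pairwise potentials: related sensors tend to be in the same state.
        for a, b in edges:
            w *= agree if x[a] == x[b] else 1.0 - agree
        total += w
        for s in sensors:
            if x[s]:
                marg[s] += w
    return {s: marg[s] / total for s in sensors}


# Hypothetical chain s1 - s2 - s3 where the reading of s2 is missing.
sensors = ["s1", "s2", "s3"]
edges = [("s1", "s2"), ("s2", "s3")]
evidence = {"s1": (0.1, 0.9), "s3": (0.1, 0.9)}
posterior = mrf_marginals(sensors, edges, evidence)
```

Here `s2` has no reading of its own, yet its posterior probability of being anomalous rises to about 0.89 because both neighbours look anomalous. A purely local detector would have no basis for a decision about `s2`. Enumeration is only feasible for a handful of sensors, so a large network would need an approximate inference method instead.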

2017-02-14
Oprea, A., Li, Z., Yen, T. F., Chin, S. H., Alrwais, S.  2015.  Detection of Early-Stage Enterprise Infection by Mining Large-Scale Log Data. 2015 45th Annual IEEE/IFIP International Conference on Dependable Systems and Networks. :45–56.

Recent years have seen the rise of sophisticated attacks, including advanced persistent threats (APT), which pose severe risks to organizations and governments. Additionally, new malware strains appear at a higher rate than ever before. Since much of this malware evades existing security products, traditional defenses deployed by enterprises today often fail to detect infections at an early stage. We address the problem of detecting early-stage APT infection by proposing a new framework based on belief propagation inspired by graph theory. We demonstrate that our techniques perform well on two large datasets. We achieve high accuracy on two months of DNS logs released by Los Alamos National Lab (LANL), which include APT infection attacks simulated by LANL domain experts. We also apply our algorithms to 38TB of web proxy logs collected at the border of a large enterprise and identify hundreds of malicious domains overlooked by state-of-the-art security products.