Biblio

Filters: Author is Wu, Zhonghai
2023-07-21
Xin, Wu, Shen, Qingni, Feng, Ke, Xia, Yutang, Wu, Zhonghai, Lin, Zhenghao.  2022.  Personalized User Profiles-based Insider Threat Detection for Distributed File System. 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :1441–1446.
In recent years, data security incidents caused by insider threats in distributed file systems have attracted the attention of academia and industry. The most common way to detect insider threats is based on user profiles. Through analysis, we find that detection based on existing user profiles is not efficient enough and produces many false positives when a stable user profile has not yet been formed. In this work, we propose personalized user profiles and design an insider threat detection framework that can intelligently detect insider threats in real time to secure distributed file systems. To generate personalized user profiles, we propose a time window-based clustering algorithm and a weighted kernel density estimation algorithm. Compared with non-personalized user profiles, both the Recall and Precision of insider threat detection based on personalized user profiles improve, raising their harmonic mean, F1, to 96.52%. Meanwhile, to reduce false positives, we put forward operation recommendations based on user similarity to predict new operations that users will perform in the future. The false positive rate (FPR) is reduced to 1.54% and the false positive identification rate (FPIR) is as high as 92.62%. Furthermore, to mitigate the risks caused by inaccurate authorization for users, we present user tags based on operation content and permission. The experimental results show that our proposed framework can detect insider threats more effectively and precisely, with lower FPR and high FPIR.
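For readers unfamiliar with the two quantities named in this abstract, the following is a minimal sketch of a generic weighted Gaussian kernel density estimate and of F1 as the harmonic mean of Precision and Recall. It is not the authors' algorithm: the Gaussian kernel, the bandwidth, the `access_hours` samples, and the `recency_weights` values are all assumptions made for illustration.

```python
import math

def weighted_kde(x, samples, weights, bandwidth):
    """Evaluate a weighted Gaussian kernel density estimate at x."""
    if len(samples) != len(weights):
        raise ValueError("samples and weights must have the same length")
    total = sum(weights)
    if total <= 0 or bandwidth <= 0:
        raise ValueError("weights must sum to a positive value and bandwidth must be positive")
    norm = 1.0 / (bandwidth * math.sqrt(2.0 * math.pi))
    density = 0.0
    for s, w in zip(samples, weights):
        z = (x - s) / bandwidth
        # Each sample contributes a Gaussian bump scaled by its normalized weight.
        density += (w / total) * norm * math.exp(-0.5 * z * z)
    return density

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Hypothetical profile: hours of day at which a user accessed files,
# with larger weights on more recent observations (an assumption).
access_hours = [9.0, 9.5, 10.0, 14.0]
recency_weights = [1.0, 1.0, 2.0, 4.0]

typical = weighted_kde(14.0, access_hours, recency_weights, bandwidth=1.0)
unusual = weighted_kde(3.0, access_hours, recency_weights, bandwidth=1.0)
print(typical > unusual)
```

In a profile of this kind, an operation that falls in a low-density region would be a candidate anomaly; how the paper chooses weights and thresholds is not stated in the abstract.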
2022-06-06
Zhang, Xinyuan, Liu, Hongzhi, Wu, Zhonghai.  2020.  Noise Reduction Framework for Distantly Supervised Relation Extraction with Human in the Loop. 2020 IEEE 10th International Conference on Electronics Information and Emergency Communication (ICEIEC). :1–4.
Distant supervision is a widely used data labeling method for relation extraction. While aligning a knowledge base with the corpus, distant supervision produces a mass of wrong labels, which are defined as noise. The pattern-based denoising model has achieved great progress in selecting trustable sentences (instances). However, writing relation-specific patterns relies heavily on expert knowledge and is labor-intensive. To solve these problems, we propose a noise reduction framework, NOIR, to iteratively select trustable sentences with a little help from a human. Under the guidance of experts, the iterative process can avoid semantic drift. Besides, NOIR can help experts discover relation-specific tokens that are hard to think of. Experimental results on three real-world datasets show the effectiveness of the proposed method compared with state-of-the-art methods.
2020-09-14
Wu, Pengfei, Deng, Robert, Shen, Qingni, Liu, Ximeng, Li, Qi, Wu, Zhonghai.  2019.  ObliComm: Towards Building an Efficient Oblivious Communication System. IEEE Transactions on Dependable and Secure Computing. :1–1.
Anonymous Communication (AC) hides traffic patterns and protects message metadata from being leaked during message transmission. Many practical AC systems have been proposed aiming to reduce communication latency and support a large number of users. However, designing AC systems that possess a strong security property and at the same time achieve optimal performance (i.e., the lowest latency or highest horizontal scalability) has been a challenging problem. In this paper, we propose the ObliComm framework, which consists of six modular AC subroutines. We also present a strong security definition for AC, named oblivious communication, encompassing confidentiality, unobservability, and a new requirement, sending-and-receiving operation hiding. The AC subroutines in ObliComm allow for modular construction of oblivious communication systems in different network topologies. All constructed systems satisfy the oblivious communication definition and are provably secure in the universal composability (UC) framework. Additionally, we model the relationship between the network topology and communication measurements by queuing theory, which enables the system's efficiency to be optimized and estimated by quantitative analysis and calculation. Through theoretical analyses and empirical experiments, we demonstrate the efficiency of our scheme and the soundness of the queuing model.
2019-10-08
Jiang, Zhengshen, Liu, Hongzhi, Fu, Bin, Wu, Zhonghai, Zhang, Tao.  2018.  Recommendation in Heterogeneous Information Networks Based on Generalized Random Walk Model and Bayesian Personalized Ranking. Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. :288–296.

Recommendation based on heterogeneous information network (HIN) is attracting more and more attention due to its ability to emulate collaborative filtering, content-based filtering, context-aware recommendation and combinations of any of these recommendation semantics. Random walk-based methods are usually used to mine the paths, weigh the paths, and compute the closeness or relevance between two nodes in a HIN. A key to the success of these methods is how to properly set the weights of links in a HIN. In existing methods, the weights of links are mostly set heuristically. In this paper, we propose a Bayesian Personalized Ranking (BPR) based machine learning method, called HeteLearn, to learn the weights of links in a HIN. In order to model user preferences for personalized recommendation, we also propose a generalized random walk with restart model on HINs. We evaluate the proposed method in a personalized recommendation task and a tag recommendation task. Experimental results show that our method performs significantly better than both the traditional collaborative filtering and the state-of-the-art HIN-based recommendation methods.
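To make the role of link weights concrete, here is a minimal sketch of a standard random walk with restart on a small weighted graph. It is not HeteLearn or the paper's generalized model: the graph, the node names, the hand-set link weights (which the paper would instead learn with BPR), and the restart probability of 0.15 are all assumptions for illustration.

```python
def random_walk_with_restart(adj, start, restart=0.15, max_iters=200, tol=1e-12):
    """Power iteration for random walk with restart.

    adj maps each node to a dict of {neighbor: link weight}; every
    neighbor must itself appear as a key of adj.
    """
    nodes = list(adj)
    scores = {n: 0.0 for n in nodes}
    scores[start] = 1.0
    for _ in range(max_iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            total = sum(adj[u].values())
            if total == 0:
                # Node without outgoing links: send its mass back to the start.
                nxt[start] += (1.0 - restart) * scores[u]
                continue
            for v, w in adj[u].items():
                # Move along each link in proportion to its weight.
                nxt[v] += (1.0 - restart) * scores[u] * w / total
        # With probability `restart`, the walker jumps back to the start node.
        nxt[start] += restart * sum(scores.values())
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:
            break
    return scores

# Hypothetical toy network: one user, two items, one shared tag.
adj = {
    "user": {"item_a": 3.0, "item_b": 1.0},
    "item_a": {"user": 1.0, "tag": 1.0},
    "item_b": {"user": 1.0, "tag": 1.0},
    "tag": {"item_a": 1.0, "item_b": 1.0},
}
scores = random_walk_with_restart(adj, "user")
ranked_items = sorted(["item_a", "item_b"], key=scores.get, reverse=True)
print(ranked_items)
```

Because the two items differ only in the weight of the link from the user, the item behind the heavier link receives the higher relevance score, which is why the choice of link weights matters for the resulting ranking.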