Biblio

Filters: Author is Li, Z.
2021-04-27
Gui, J., Li, D., Chen, Z., Rhee, J., Xiao, X., Zhang, M., Jee, K., Li, Z., Chen, H..  2020.  APTrace: A Responsive System for Agile Enterprise Level Causality Analysis. 2020 IEEE 36th International Conference on Data Engineering (ICDE). :1701–1712.
While backtracking analysis has been successful in assisting the investigation of complex security attacks, it faces a critical dependency explosion problem. To address this problem, security analysts currently need to tune backtracking analysis manually with different case-specific heuristics. However, existing systems fail to fulfill two important system requirements to achieve effective backtracking analysis. First, flexible abstractions are needed to express various types of heuristics. Second, the system needs to be responsive in providing updates so that the progress of backtracking analysis can be frequently inspected, which typically involves multiple rounds of manual tuning. In this paper, we propose a novel system, APTrace, to meet both of the above requirements. As we demonstrate in the evaluation, security analysts can effectively express heuristics to reduce more than 99.5% of irrelevant events in the backtracking analysis of real-world attack cases. To improve the responsiveness of backtracking analysis, we present a novel execution-window partitioning algorithm that significantly reduces the waiting time between two consecutive updates (notably, a 57-fold reduction for the top 1% of waiting times).
2021-02-08
Wang, Y., Wen, M., Liu, Y., Wang, Y., Li, Z., Wang, C., Yu, H., Cheung, S.-C., Xu, C., Zhu, Z..  2020.  Watchman: Monitoring Dependency Conflicts for Python Library Ecosystem. 2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE). :125–135.
The PyPI ecosystem has indexed millions of Python libraries to allow developers to automatically download and install dependencies of their projects based on the specified version constraints. Despite the convenience brought by automation, version constraints in Python projects can easily conflict, resulting in build failures. We refer to such conflicts as Dependency Conflict (DC) issues. Although DC issues are common in Python projects, developers lack tool support to gain a comprehensive knowledge for diagnosing the root causes of these issues. In this paper, we conducted an empirical study on 235 real-world DC issues. We studied the manifestation patterns and fixing strategies of these issues and found several key factors that can lead to DC issues and their regressions. Based on our findings, we designed and implemented Watchman, a technique to continuously monitor dependency conflicts for the PyPI ecosystem. In our evaluation, Watchman analyzed PyPI snapshots between 11 Jul 2019 and 16 Aug 2019, and found 117 potential DC issues. We reported these issues to the developers of the corresponding projects. So far, 63 issues have been confirmed, 38 of which have been quickly fixed by applying our suggested patches.
2021-04-27
Yang, Y., Lu, K., Cheng, H., Fu, M., Li, Z..  2020.  Time-controlled Regular Language Search over Encrypted Big Data. 2020 IEEE 9th Joint International Information Technology and Artificial Intelligence Conference (ITAIC). 9:1041–1045.

The rapid development of cloud computing and the arrival of the big data era have brought users and the cloud closer together. Cloud computing has powerful data computing and data storage capabilities, which can ubiquitously provide users with resources. However, users do not fully trust the cloud server's storage services, so much data is encrypted before being uploaded to the cloud. Searchable encryption can protect the confidentiality of data and provide encrypted data retrieval functions. In this paper, we propose a time-controlled searchable encryption scheme with regular language over encrypted big data, which provides flexible search patterns and convenient data sharing. Our solution allows users who hold the data's secret keys to generate trapdoors by themselves, while users without those keys can generate trapdoors with the help of a trusted third party without revealing the data owner's secret key. Our system uses a time-controlled mechanism to collect keywords queried by users and ensures that the querying user's identity is not directly exposed. The obtained keywords are the basis for subsequent big data analysis. We conducted a security analysis of the proposed scheme and proved that the scheme is secure. The simulation experiment and comparison of our scheme show that the system has feasible efficiency.

2021-01-28
Zhang, M., Wei, T., Li, Z., Zhou, Z..  2020.  A service-oriented adaptive anonymity algorithm. 2020 39th Chinese Control Conference (CCC). :7626–7631.

Recently, a large number of research studies on privacy-preserving data publishing have been conducted. We find that most K-anonymity algorithms, when used in a service-oriented setting, fail to consider the distribution characteristics of attribute values in the data and the differences in contribution value among quasi-identifier attributes. In this paper, the importance of the distribution characteristics of attribute values and of the differences in contribution value of quasi-identifier attributes to anonymization results is illustrated. In order to maximize the utility of released data, a service-oriented adaptive anonymity algorithm is proposed. We establish a model of reaction dispersion degree to quantify the characteristics of attribute value distribution and introduce the concept of utility weight related to the contribution value of quasi-identifier attributes. The priority coefficient and the characterization coefficient of partition quality are defined to adaptively optimize the strategies for selecting the dimension and splitting value in the anonymity group partition process, which reduces unnecessary information loss and thereby further improves the utility of anonymized data. The rationality and validity of the algorithm are verified by theoretical analysis and multiple experiments.

2021-03-04
Tang, R., Yang, Z., Li, Z., Meng, W., Wang, H., Li, Q., Sun, Y., Pei, D., Wei, T., Xu, Y. et al..  2020.  ZeroWall: Detecting Zero-Day Web Attacks through Encoder-Decoder Recurrent Neural Networks. IEEE INFOCOM 2020 - IEEE Conference on Computer Communications. :2479–2488.

Zero-day Web attacks are arguably the most serious threats to Web security, but are very challenging to detect because they are not seen or known previously and thus cannot be detected by widely-deployed signature-based Web Application Firewalls (WAFs). This paper proposes ZeroWall, an unsupervised approach that works in a pipeline with an existing WAF to effectively detect zero-day Web attacks. Using historical Web requests allowed by an existing signature-based WAF, a vast majority of which are assumed to be benign, ZeroWall trains a self-translation machine using an encoder-decoder recurrent neural network to capture the syntax and semantic patterns of benign requests. In real-time detection, a zero-day attack request (which the WAF fails to detect) is not understood well by the self-translation machine, cannot be translated back to its original request by the machine, and is thus declared an attack. In our evaluation using 8 real-world traces of 1.4 billion Web requests, ZeroWall successfully detects real zero-day attacks missed by existing WAFs and achieves high F1-scores over 0.98, which significantly outperforms all baseline approaches.

2021-02-22
Han, Z., Wang, F., Li, Z..  2020.  Research on Nearest Neighbor Data Association Algorithm Based on Target “Dynamic” Monitoring Model. 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC). 1:665–668.
In order to solve the problem that the Nearest Neighbor Data Association (NNDA) algorithm cannot detect the “dynamic” change of the target, this paper proposes a nearest neighbor data association algorithm based on the Target “Dynamic” Monitoring Model (TDMM). First, the gate searching and updating of targets are completed based on TDMM; then the NNDA algorithm is used to associate the target data and update the tracks. Finally, the NNDA algorithm based on TDMM is realized by simulation. The experimental results show that the proposed algorithm can achieve “dynamic” monitoring in multi-target data association, and has clearer advantages than Multiple Hypothesis Tracking (MHT) in timeliness and association performance.
2019-02-22
Liao, X., Yu, Y., Li, B., Li, Z., Qin, Z..  2019.  A New Payload Partition Strategy in Color Image Steganography. IEEE Transactions on Circuits and Systems for Video Technology. :1–1.

In traditional steganographic schemes, payloads are assigned equally to the three RGB channels of a true color image. In fact, the security of color image steganography relates not only to data-embedding algorithms but also to the payload partition. How to exploit inter-channel correlations to allocate payload for performance enhancement is still an open issue in color image steganography. In this paper, a novel channel-dependent payload partition strategy based on amplifying channel modification probabilities is proposed, so as to adaptively assign the embedding capacity among RGB channels. The modification probabilities of three corresponding pixels in RGB channels are simultaneously increased, and thus the embedding impacts could be clustered, in order to improve the empirical steganographic security against channel co-occurrence detection. Experimental results show that new color image steganographic schemes incorporating the proposed strategy can effectively concentrate the embedding changes mainly in textured regions, and achieve better performance in resisting modern color image steganalysis.

2020-12-01
Chen, S., Hu, W., Li, Z..  2019.  High Performance Data Encryption with AES Implementation on FPGA. 2019 IEEE 5th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :149–153.

Nowadays big data is getting more and more attention in both academic and industrial research. With the development of big data, people pay more attention to data security. A significant feature of big data is the large size of the data. In order to improve the encryption speed for large volumes of data, this paper uses deep pipelining and full expansion technology to implement the AES encryption algorithm on FPGA. The design achieves a throughput of 31.30 Gbps with a minimum latency of 0.134 µs. This design can quickly encrypt large amounts of data and provide technical support for the development of big data.

2019-01-21
Zhang, Z., Li, Z., Xia, C., Cui, J., Ma, J..  2018.  H-Securebox: A Hardened Memory Data Protection Framework on ARM Devices. 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC). :325–332.

ARM devices (mobile phones, IoT devices) are becoming more popular in our daily life due to their low power consumption and cost. These devices carry a huge amount of users' private information, which attracts attackers' attention and increases the security risk. Operating systems (e.g., Android, Linux) provide many memory data protection strategies for users' private information. However, a monolithic OS may contain security vulnerabilities that an attacker can exploit to gain root or even kernel privilege. Once the attacker obtains kernel privilege, all data protection strategies are defeated and users' private information can be taken away. In this paper, we propose a hardened memory data protection framework called H-Securebox to defeat kernel-level memory data theft attacks. H-Securebox leverages the ARM hardware virtualization technique to protect data in memory with hypervisor privilege. We designed three types of H-Securebox for developers to use. Although the attacker may have kernel privilege, she cannot touch private data inside H-Securebox, since hypervisor privilege is higher than kernel privilege. With an implementation of the H-Securebox system assisted by a tiny hypervisor on a Raspberry Pi 2 development board, we measure the performance overhead of our system and perform security evaluations. The results show that the overhead is negligible and that a malicious application with root or kernel privilege cannot access the private data protected by our system.

2019-06-24
You, Y., Li, Z., Oechtering, T. J..  2018.  Optimal Privacy-Enhancing And Cost-Efficient Energy Management Strategies For Smart Grid Consumers. 2018 IEEE Statistical Signal Processing Workshop (SSP). :826–830.

The design of optimal energy management strategies that trade off consumers' privacy and expected energy cost by using an energy storage is studied. The Kullback-Leibler divergence rate is used to assess the privacy risk of unauthorized testing of consumers' behavior. We further show how this design problem can be formulated as a belief state Markov decision process problem so that standard tools of the Markov decision process framework can be utilized, and the optimal solution can be obtained by using Bellman dynamic programming. Finally, we illustrate the privacy enhancement and cost saving with numerical examples.

2019-03-11
Li, Z., Xie, X., Ma, X., Guan, Z..  2018.  Trustworthiness Optimization of Industrial Cluster Network Platform Based on Blockchain. 2018 8th International Conference on Logistics, Informatics and Service Sciences (LISS). :1–6.

The industrial cluster is an important organizational form and carrier for the development of small and medium-sized enterprises, and the information service platform is an important facility of an industrial cluster. Improving the credibility of the network platform is conducive to eliminating the adverse effects of distrust and information asymmetry on industrial clusters. The decentralization, transparency, openness, and intangibility of blockchain technology make it an inevitable choice for trustworthiness optimization of industrial cluster network platforms. This paper first studies the trust standard of the industrial cluster network platform and constructs a new trusted framework for it. The paper then focuses on trustworthiness optimization of the data layer and application layer of the platform. The purpose of this paper is to build an industrial cluster network platform with data access, information trustworthiness, function availability, high speed, and low consumption, and to promote the sustainable and efficient development of industrial clusters.

2019-04-01
Liu, F., Li, Z., Li, X., Lv, T..  2018.  A Text-Based CAPTCHA Cracking System with Generative Adversarial Networks. 2018 IEEE International Symposium on Multimedia (ISM). :192–193.
As a multimedia security mechanism, CAPTCHAs are Completely Automated Public Turing tests to tell Computers and Humans Apart. Although cracking CAPTCHAs has been explored for many years, it is still a challenging problem in practice. In this demo, we present a text-based CAPTCHA cracking system using convolutional neural networks (CNN). To solve the small sample problem, we propose to combine conditional deep convolutional generative adversarial networks (cDCGAN) and CNN, which yields a tremendous improvement in accuracy. In addition, we select multiple models with low Pearson correlation coefficients for a majority voting ensemble, which further improves the accuracy. The experimental results show that the system has great advantages and provides a new means for cracking CAPTCHAs.
Li, Z., Liao, Q..  2018.  CAPTCHA: Machine or Human Solvers? A Game-Theoretical Analysis. 2018 5th IEEE International Conference on Cyber Security and Cloud Computing (CSCloud)/2018 4th IEEE International Conference on Edge Computing and Scalable Cloud (EdgeCom). :18–23.
CAPTCHAs have become a ubiquitous defense used to protect open web resources from being exploited at scale. Traditionally, attackers have developed automatic programs known as CAPTCHA solvers to bypass the mechanism. With the presence of cheap labor in developing countries, hackers now have the option to use human solvers. In this research, we develop a game-theoretical framework to model the interactions between the defender and the attacker regarding the design and countermeasures of CAPTCHA systems. With the result of equilibrium analysis, both parties can determine the optimal allocation of software-based or human-based CAPTCHA solvers. Counterintuitively, instead of following the traditional wisdom of making CAPTCHAs harder and harder, it may be in the best interest of the defender to make CAPTCHAs easier. We further suggest a welfare-improving CAPTCHA business model by involving decentralized cryptocurrency computation.
2017-12-12
Ktob, A., Li, Z..  2017.  The Arabic Knowledge Graph: Opportunities and Challenges. 2017 IEEE 11th International Conference on Semantic Computing (ICSC). :48–52.

The Semantic Web has brought forth the idea of computing with knowledge, hence attributing the ability of thinking to machines. Knowledge Graphs represent a major advancement in the construction of the Web of Data, where machines are context-aware when answering users' queries. The English Knowledge Graph was a milestone realized by Google in 2012. Even though it is a useful source of information for English users and applications, it does not offer much for Arabic users and applications. In this paper, we investigate the different challenges and opportunities inherent in the life-cycle of the construction of the Arabic Knowledge Graph (AKG) while following some best practices and techniques. Additionally, this work suggests some potential solutions to these challenges. The proprietary nature of data creates a major obstacle to harvesting it. Moreover, when Arabic data is openly available, it is generally in an unstructured form which requires further processing. The complexity of the Arabic language itself creates a further problem for any automatic or semi-automatic extraction process. Therefore, the usage of NLP techniques is a feasible solution. Some preliminary results are presented later in this paper. The AKG has very promising outcomes for the Semantic Web in general and the Arabic community in particular. The goal of the Arabic Knowledge Graph is mainly the integration of the different isolated datasets available on the Web. Later, it can be used in both the academic sector (by providing a large dataset for many different research fields and enhancing discovery) and the commercial sector (by improving search engines, providing metadata, and interlinking businesses).

2017-12-20
Wang, M., Li, Z., Lin, Y..  2017.  A Distributed Intrusion Detection System for Cognitive Radio Networks Based on Evidence Theory. 2017 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C). :226–232.

Reliable detection of intrusion is the basis of safety in cognitive radio networks (CRNs). So far, few scholars have applied intrusion detection systems (IDSs) to combat intrusion against CRNs. In order to improve the performance of intrusion detection in CRNs, a distributed intrusion detection scheme is proposed. In this paper, a method based on Dempster-Shafer (D-S) evidence theory to detect intrusion in CRNs is put forward, in which the detection data and credibility of different local IDS agents are combined by D-S in the cooperative detection center, so that different local detection decisions are taken into consideration in the final decision. The effectiveness of the proposed scheme is verified by simulation, and the results show a noticeable performance improvement of the proposed scheme over the traditional method.
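
As a rough illustration of the evidence-combination step named in this abstract, the following sketch applies Dempster's rule of combination to the reports of two local agents. It is not the authors' scheme: the hypothesis names, the mass values, and the function `combine_dempster` are assumptions made for the example, and the credibility weighting of agents mentioned in the abstract is not modeled.

```python
from itertools import product

def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1],
    and the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict and cannot be combined")
    # Renormalize by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}

# Hypothetical reports from two local IDS agents (illustrative values only).
ATTACK = frozenset({"attack"})
NORMAL = frozenset({"normal"})
EITHER = ATTACK | NORMAL  # mass left uncommitted by an agent

agent_a = {ATTACK: 0.6, NORMAL: 0.1, EITHER: 0.3}
agent_b = {ATTACK: 0.5, NORMAL: 0.2, EITHER: 0.3}
fused = combine_dempster(agent_a, agent_b)
```

With these illustrative inputs, the fused belief in an attack is higher than either agent's individual belief, because the two agents' evidence points the same way.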

2018-06-11
Yang, C., Li, Z., Qu, W., Liu, Z., Qi, H..  2017.  Grid-Based Indexing and Search Algorithms for Large-Scale and High-Dimensional Data. 2017 14th International Symposium on Pervasive Systems, Algorithms and Networks 2017 11th International Conference on Frontier of Computer Science and Technology 2017 Third International Symposium of Creative Computing (ISPAN-FCST-ISCC). :46–51.

The rapid development of the Internet has recently resulted in massive information overload. This information is usually represented by high-dimensional feature vectors in many related applications such as recognition, classification and retrieval. These applications usually need efficient indexing and search methods for such large-scale and high-dimensional databases, which is typically a challenging task. Some efforts have been made that solve this problem to some extent. However, most of them are implemented on a single machine, which is not suitable for handling large-scale databases. In this paper, we present a novel data index structure and nearest neighbor search algorithm implemented on Apache Spark. We impose a grid on the database and index data by non-empty grid cells. This grid-based index structure is simple and easy to implement in parallel. Moreover, we propose to build a scalable KNN graph on the grids, which increases the efficiency of this index structure at a low cost in parallel implementation. Finally, experiments are conducted on both public databases and synthetic databases, showing that the proposed methods achieve overall high performance in both efficiency and accuracy.
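
The idea of indexing data by non-empty grid cells can be sketched as follows. This is a minimal single-machine illustration, not the Spark implementation or the KNN graph described in the abstract; the function names, the uniform `cell_size`, and the choice to search only the query's cell and its adjacent cells are assumptions. Enumerating adjacent cells grows exponentially with dimension, so the sketch is only practical for low-dimensional data.

```python
import math
from collections import defaultdict
from itertools import product

def cell_of(point, cell_size):
    """Map a point to the integer coordinates of its grid cell."""
    return tuple(math.floor(x / cell_size) for x in point)

def build_grid_index(points, cell_size):
    """Index point positions by grid cell; only non-empty cells are stored."""
    index = defaultdict(list)
    for i, p in enumerate(points):
        index[cell_of(p, cell_size)].append(i)
    return index

def nearest_in_neighborhood(query, points, index, cell_size):
    """Approximate nearest neighbor: scan the query's cell and adjacent cells."""
    home = cell_of(query, cell_size)
    best, best_dist = None, math.inf
    for offset in product((-1, 0, 1), repeat=len(query)):
        cell = tuple(h + o for h, o in zip(home, offset))
        for i in index.get(cell, ()):
            d = math.dist(query, points[i])
            if d < best_dist:
                best, best_dist = i, d
    return best, best_dist

# Illustrative 2-D data (values are made up for the example).
points = [(0.1, 0.2), (0.9, 0.8), (2.5, 2.5)]
index = build_grid_index(points, cell_size=1.0)
best, best_dist = nearest_in_neighborhood((0.8, 0.9), points, index, cell_size=1.0)
```

Because empty cells are never stored, the index size depends on the number of occupied cells rather than on the size of the full grid.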

2018-05-16
Liren, Z., Xin, Y., Yang, P., Li, Z..  2017.  Magnetic performance measurement and mathematical model establishment of main core of magnetic modulator. 2017 13th IEEE International Conference on Electronic Measurement Instruments (ICEMI). :12–16.

In order to investigate how the applied DC current, excitation source, excitation loop current, sensitivity, and induced voltage of the detecting winding relate to and affect the performance of a magnetic modulator, this paper measures the initial permeability, maximum permeability, saturation magnetic induction intensity, remanent magnetic induction intensity, coercivity, saturated magnetic field intensity, magnetization curve, permeability curve and hysteresis loop of the modulator's main core, 1J85 permalloy, using the ballistic method. On this basis, the curve fitting tool of MATLAB and a multiple regression method are used to comprehensively compare and analyze the sum of squares due to error (SSE), coefficient of determination (R-square), degree-of-freedom adjusted coefficient of determination (Adjusted R-square), and root mean squared error (RMSE) of the fitting results. Finally, a B-H curve mathematical model is established based on the sum of an arc-hyperbolic sine function and a polynomial.

2017-12-28
Guo, J., Li, Z..  2017.  A Mean-Covariance Decomposition Modeling Method for Battery Capacity Prognostics. 2017 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC). :549–556.

Lithium-ion batteries usually degrade to an unacceptable capacity level after hundreds or even thousands of cycles. The continuously observed capacity fade data over time and their internal structure can be informative for constructing capacity fade models. This paper applies a mean-covariance decomposition modeling method to analyze the capacity fade data. The proposed approach directly examines the variances and correlations in the data of interest and expresses the correlation matrix in hyper-spherical coordinates using angles and trigonometric functions. The proposed method is applied to model and predict key battery performance metrics using testing data under various testing conditions.

2018-05-01
Li, Z., Beugnon, S., Puech, W., Bors, A. G..  2017.  Rethinking the High Capacity 3D Steganography: Increasing Its Resistance to Steganalysis. 2017 IEEE International Conference on Image Processing (ICIP). :510–514.

3D steganography is used to embed or hide information in 3D objects without causing visible or machine-detectable modifications. In this paper we rethink a high-capacity 3D steganography method based on Hamiltonian path quantization, and increase its resistance to steganalysis. We analyze the parameters that may influence the distortion of a 3D shape as well as the resistance of the steganography to 3D steganalysis. According to the experimental results, the proposed high-capacity 3D steganographic method has an increased resistance to steganalysis.

2018-02-28
Ma, G., Li, X., Pei, Q., Li, Z..  2017.  A Security Routing Protocol for Internet of Things Based on RPL. 2017 International Conference on Networking and Network Applications (NaNA). :209–213.

RPL is a lightweight IPv6 network routing protocol designed by the IETF, which can make full use of the energy and computing resources of intelligent devices to build a flexible topological structure. This paper analyzes the security problems of RPL, sets up a test network to test RPL network security, and proposes an RPL-based security routing protocol, M-RPL. The routing protocol establishes a hierarchical clustering network topology in which the intelligent devices of the network establish backup paths in different clusters during the route discovery phase, and it enables these backup paths to ensure data routing when the network is compromised. A test prototype network is set up, and several attacks against the routing protocols in the network are simulated. The test results show that the M-RPL network can effectively resist the routing attacks. M-RPL provides a solution for ensuring Internet of Things (IoT) security.

2018-11-19
Huang, H., Wang, H., Luo, W., Ma, L., Jiang, W., Zhu, X., Li, Z., Liu, W..  2017.  Real-Time Neural Style Transfer for Videos. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). :7044–7052.

Recent research endeavors have shown the potential of using feed-forward convolutional neural networks to accomplish fast style transfer for images. In this work, we take one step further to explore the possibility of exploiting a feed-forward network to perform style transfer for videos and simultaneously maintain temporal consistency among stylized video frames. Our feed-forward network is trained by enforcing the outputs of consecutive frames to be both well stylized and temporally consistent. More specifically, a hybrid loss is proposed to capitalize on the content information of input frames, the style information of a given style image, and the temporal information of consecutive frames. To calculate the temporal loss during the training stage, a novel two-frame synergic training mechanism is proposed. Compared with directly applying an existing image style transfer method to videos, our proposed method employs the trained network to yield temporally consistent stylized videos which are much more visually pleasant. In contrast to the prior video style transfer method which relies on time-consuming optimization on the fly, our method runs in real time while generating competitive visual results.

2018-09-28
Li, Z., Li, S..  2017.  Random forest algorithm under differential privacy. 2017 IEEE 17th International Conference on Communication Technology (ICCT). :1901–1905.

To reduce the risk of data privacy disclosure in the classification process, this paper proposes a Random Forest algorithm under differential privacy named DPRF-gini. When building a decision tree, the algorithm first perturbs feature selection and attribute partitioning using the exponential mechanism, and then meets the requirement of differential privacy by adding Laplace noise to the leaf nodes. Empirical results show that, compared with the original algorithm, protection of data privacy is further enhanced while the accuracy of the algorithm is slightly reduced.
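
The leaf-node step mentioned in this abstract can be sketched with the standard Laplace mechanism. This is an illustration under stated assumptions rather than the DPRF-gini algorithm itself: it assumes each leaf releases per-class counts with sensitivity 1 and a per-leaf privacy budget `epsilon`, and it does not cover the exponential-mechanism steps or how the paper divides its budget. The function names are hypothetical.

```python
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_leaf_label(class_counts, epsilon, rng):
    """Perturb a leaf's class counts and return the noisy majority label.

    Assumes adding or removing one record changes a single count by 1,
    so the noise scale is sensitivity / epsilon = 1 / epsilon.
    """
    scale = 1.0 / epsilon
    noisy = {label: count + laplace_noise(scale, rng)
             for label, count in class_counts.items()}
    return max(noisy, key=noisy.get), noisy

# Illustrative leaf with made-up counts; the seed only makes the run repeatable.
rng = random.Random(0)
label, noisy_counts = noisy_leaf_label({"yes": 1000, "no": 10}, epsilon=1.0, rng=rng)
```

A smaller `epsilon` gives a larger noise scale, which matches the trade-off reported in the abstract: stronger privacy protection at the price of some accuracy.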

2018-02-15
Wang, C., Lizana, F. R., Li, Z., Peterchev, A. V., Goetz, S. M..  2017.  Submodule short-circuit fault diagnosis based on wavelet transform and support vector machines for modular multilevel converter with series and parallel connectivity. IECON 2017 - 43rd Annual Conference of the IEEE Industrial Electronics Society. :3239–3244.

The modular multilevel converter with series and parallel connectivity was shown to provide advantages in several industrial applications. Its reliability largely depends on the absence of failures in the power semiconductors. We propose and analyze a fault-diagnosis technique to identify shorted switches based on features generated through wavelet transform of the converter output and subsequent classification in support vector machines. The multi-class support vector machine is trained with multiple recordings of the output of each fault condition as well as the converter under normal operation. Simulation results reveal that the proposed method has high classification latency and high robustness. Except for the monitoring of the output, which is required for the converter control in any case, this method does not require additional module sensors.

2017-02-27
Li, Z., Oechtering, T. J..  2015.  Privacy on hypothesis testing in smart grids. 2015 IEEE Information Theory Workshop - Fall (ITW). :337–341.

In this paper, we study the problem of privacy information leakage in a smart grid. The privacy risk is assumed to be caused by an unauthorized binary hypothesis testing of the consumer's behaviour based on the smart meter readings of energy supplies from the energy provider. Additional energy supplies are produced by an alternative energy source. A controller equipped with an energy storage device manages the energy inflows to satisfy the energy demand of the consumer. We study the optimal energy control strategy which minimizes the asymptotic exponential decay rate of the minimum Type II error probability in the unauthorized hypothesis testing to suppress the privacy risk. Our study shows that the cardinality of the energy supplies from the energy provider for the optimal control strategy is no more than two. This result implies a simple objective of the optimal energy control strategy. When additional side information is available to the adversary, the optimal control strategy and privacy risk are compared with the case in which only the smart meter readings are leaked to the adversary.