Biblio
With the rapid progress of informatization construction in power business, data resource has become the basic strategic resource of the power industry and innovative element in power production. The security protection of data in power business is particularly important in the informatization construction of power business. In order to implement data security protection, transparent encryption is one of the fifteen key technical standards in the Construction Guideline of the Standard Network Data Security System. However, data storage in the encrypted state is bound to affect the security audit of data to a certain extent. Based on this problem, this paper proposes a scheme to audit the sensitivity of the power business data under the protection of encryption to achieve an efficient sensitivity audit of ciphertext data with the premise of not revealing the decryption key or data information. Through a security demonstration, this paper fully proves that this solution is secure under the known plaintext attacks.
The regularity of devastating cyber-attacks has made cybersecurity a grand societal challenge. Many cybersecurity professionals are closely examining the international Dark Web to proactively pinpoint potential cyber threats. Despite its potential, the Dark Web contains hundreds of thousands of non-English posts. While machine translation is the prevailing approach to process non-English text, applying MT on hacker forum text results in mistranslations. In this study, we draw upon Long-Short Term Memory (LSTM), Cross-Lingual Knowledge Transfer (CLKT), and Generative Adversarial Networks (GANs) principles to design a novel Adversarial CLKT (A-CLKT) approach. A-CLKT operates on untranslated text to retain the original semantics of the language and leverages the collective knowledge about cyber threats across languages to create a language invariant representation without any manual feature engineering or external resources. Three experiments demonstrate how A-CLKT outperforms state-of-the-art machine learning, deep learning, and CLKT algorithms in identifying cyber-threats in French and Russian forums.
Cyber threat intelligence (CTI) necessitates automated monitoring of dark web platforms (e.g., Dark Net Markets and carding shops) on a large scale. While there are existing methods for collecting data from the surface web, large-scale dark web data collection is commonly hindered by anti-crawling measures. Text-based CAPTCHA serves as the most prohibitive type of these measures. Text-based CAPTCHA requires the user to recognize a combination of hard-to-read characters. Dark web CAPTCHA patterns are intentionally designed to have additional background noise and variable character length to prevent automated CAPTCHA breaking. Existing CAPTCHA breaking methods cannot remedy these challenges and are therefore not applicable to the dark web. In this study, we propose a novel framework for breaking text-based CAPTCHA in the dark web. The proposed framework utilizes Generative Adversarial Network (GAN) to counteract dark web-specific background noise and leverages an enhanced character segmentation algorithm. Our proposed method was evaluated on both benchmark and dark web CAPTCHA testbeds. The proposed method significantly outperformed the state-of-the-art baseline methods on all datasets, achieving over 92.08% success rate on dark web testbeds. Our research enables the CTI community to develop advanced capabilities of large-scale dark web monitoring.
Personally identifiable information (PII) has become a major target of cyber-attacks, causing severe losses to data breach victims. To protect data breach victims, researchers focus on collecting exposed PII to assess privacy risk and identify at-risk individuals. However, existing studies mostly rely on exposed PII collected from either the dark web or the surface web. Due to the wide exposure of PII on both the dark web and surface web, collecting from only the dark web or the surface web could result in an underestimation of privacy risk. Despite its research and practical value, jointly collecting PII from both sources is a non-trivial task. In this paper, we summarize our effort to systematically identify, collect, and monitor a total of 1,212,004,819 exposed PII records across both the dark web and surface web. Our effort resulted in 5.8 million stolen SSNs, 845,000 stolen credit/debit cards, and 1.2 billion stolen account credentials. From the surface web, we identified and collected over 1.3 million PII records of the victims whose PII is exposed on the dark web. To the best of our knowledge, this is the largest academic collection of exposed PII, which, if properly anonymized, enables various privacy research inquiries, including assessing privacy risk and identifying at-risk populations.
RISC-V is free and open standard instruction set architecture following reduced instruction set computer principle. Because of its openness and scalability, RISC-V has been adapted not only for embedded CPUs such as mobile and IoT market, but also for heavy-workload CPUs such as the data center or super computing field. On top of it, Robotics is also a good application of RISC-V because security and reliability become crucial issues of robotics system. These problems could be solved by enthusiastic open source community members as they have shown on open source operating system. However, running RISC-V on local FPGA becomes harder than before because now RISC-V foundation are focusing on cloud-based FPGA environment. We have experienced that recently released OS and toolchains for RISC-V are not working well on the previous CPU image for local FPGA. In this paper we design the local FPGA platform for RISC-V processor and run the robotics application on mainstream Robot Operating System on top of the RISC-V processor. This platform allow us to explore the architecture space of RISC-V CPU for robotics application, and get the insight of the RISC-V CPU architecture for optimal performance and the secure system.
Cyber-Physical Systems (CPSs) are engineered systems seamlessly integrating computational algorithms and physical components. CPS advances offer numerous benefits to domains such as health, transportation, smart homes and manufacturing. Despite these advances, the overall cybersecurity posture of CPS devices remains unclear. In this paper, we provide knowledge on how to improve CPS resiliency by evaluating and comparing the accuracy, and scalability of two popular vulnerability assessment tools, Nessus and OpenVAS. Accuracy and suitability are evaluated with a diverse sample of pre-defined vulnerabilities in Industrial Control Systems (ICS), smart cars, smart home devices, and a smart water system. Scalability is evaluated using a large-scale vulnerability assessment of 1,000 Internet accessible CPS devices found on Shodan, the search engine for the Internet of Things (IoT). Assessment results indicate several CPS devices from major vendors suffer from critical vulnerabilities such as unsupported operating systems, OpenSSH vulnerabilities allowing unauthorized information disclosure, and PHP vulnerabilities susceptible to denial of service attacks.
With the development of Internet technology, software vulnerabilities have become a major threat to current computer security. In this work, we propose the vulnerability detection for source code using Contextual LSTM. Compared with CNN and LSTM, we evaluated the CLSTM on 23185 programs, which are collected from SARD. We extracted the features through the program slicing. Based on the features, we used the natural language processing to analysis programs with source code. The experimental results demonstrate that CLSTM has the best performance for vulnerability detection, reaching the accuracy of 96.711% and the F1 score of 0.96984.
As more and more technologies to store and analyze massive amount of data become available, it is extremely important to make privacy-sensitive data de-identified so that further analysis can be conducted by different parties. For example, data needs to go through data de-identification process before being transferred to institutes for further value added analysis. As such, privacy protection issues associated with the release of data and data mining have become a popular field of study in the domain of big data. As a strict and verifiable definition of privacy, differential privacy has attracted noteworthy attention and widespread research in recent years. Nevertheless, differential privacy is not practical for most applications due to its performance of synthetic dataset generation for data query. Moreover, the definition of data protection by randomized noise in native differential privacy is abstract to users. Therefore, we design a pragmatic DP-based data de-identification protection and risk of data disclosure estimation system, in which a DP-based noise addition mechanism is applied to generate synthetic datasets. Furthermore, the risk of data disclosure to these synthetic datasets can be evaluated before releasing to buyers/consumers.
With the advancement of sensor electronic devices, wireless sensor networks have attracted more and more attention. Range query has become a significant part of sensor networks due to its availability and convenience. However, It is challenging to process range query while still protecting sensitive data from disclosure. Existing work mainly focuses on privacy- preserving range query, but neglects the damage of collusion attacks, probability attacks and differential attacks. In this paper, we propose a privacy- preserving, energy-efficient and multi-dimensional range query protocol called PERQ, which not only achieves data privacy, but also considers collusion attacks, probability attacks and differential attacks. Generalized distance-based and modular arithmetic range query mechanism are used. In addition, a novel cyclic modular verification scheme is proposed to verify the data integrity. Extensive theoretical analysis and experimental results confirm the high performance of PERQ in terms of energy efficiency, security and accountability requirements.
By applying power usage statistics from smart meters, users are able to save energy in their homes or control smart appliances via home automation systems. However, owing to security and privacy concerns, it is recommended that smart meters (SM) should not have direct communication with smart appliances. In this paper, we propose a design for a smart meter gateway (SMGW) associated with a two-phase authentication mechanism and key management scheme to link a smart grid with smart appliances. With placement of the SMGW, we can reduce the design complexity of SMs as well as enhance the strength of security.
Cloud computing has emerged as a compelling vision for managing data and delivering query answering capability over the internet. This new way of computing also poses a real risk of disclosing confidential information to the cloud. Searchable encryption addresses this issue by allowing the cloud to compute the answer to a query based on the cipher texts of data and queries. Thanks to its inner product preservation property, the asymmetric scalar-product-preserving encryption (ASPE) has been adopted and enhanced in a growing number of works toperform a variety of queries and tasks in the cloud computingsetting. However, the security property of ASPE and its enhancedschemes has not been studied carefully. In this paper, we show acomplete disclosure of ASPE and several previously unknownsecurity risks of its enhanced schemes. Meanwhile, efficientalgorithms are proposed to learn the plaintext of data and queriesencrypted by these schemes with little or no knowledge beyondthe ciphertexts. We demonstrate these risks on real data sets.
Ransomware techniques have evolved over time with the most resilient attacks making data recovery practically impossible. This has driven countermeasures to shift towards recovery against prevention but in this paper, we model ransomware attacks from an infection vector point of view. We follow the basic infection chain of crypto ransomware and use Bayesian network statistics to infer some of the most common ransomware infection vectors. We also employ the use of attack and sensor nodes to capture uncertainty in the Bayesian network.
Complex systems are prevalent in many fields such as finance, security and industry. A fundamental problem in system management is to perform diagnosis in case of system failure such that the causal anomalies, i.e., root causes, can be identified for system debugging and repair. Recently, invariant network has proven a powerful tool in characterizing complex system behaviors. In an invariant network, a node represents a system component, and an edge indicates a stable interaction between two components. Recent approaches have shown that by modeling fault propagation in the invariant network, causal anomalies can be effectively discovered. Despite their success, the existing methods have a major limitation: they typically assume there is only a single and global fault propagation in the entire network. However, in real-world large-scale complex systems, it's more common for multiple fault propagations to grow simultaneously and locally within different node clusters and jointly define the system failure status. Inspired by this key observation, we propose a two-phase framework to identify and rank causal anomalies. In the first phase, a probabilistic clustering is performed to uncover impaired node clusters in the invariant network. Then, in the second phase, a low-rank network diffusion model is designed to backtrack causal anomalies in different impaired clusters. Extensive experimental results on real-life datasets demonstrate the effectiveness of our method.
Security evaluation of diverse SDN frameworks is of significant importance to design resilient systems and deal with attacks. Focused on SDN scenarios, a game-theoretic model is proposed to analyze their security performance in existing SDN architectures. The model can describe specific traits in different structures, represent several types of information of players (attacker and defender) and quantitatively calculate systems' reliability. Simulation results illustrate dynamic SDN structures have distinct security improvement over static ones. Besides, effective dynamic scheduling mechanisms adopted in dynamic systems can enhance their security further.
The anonymizing network Tor is examined as one method of anonymizing port scanning tools and avoiding identification and retaliation. Performing anonymized port scans through Tor is possible using Nmap, but parallelization of the scanning processes is required to accelerate the scan rate.
Cybersecurity is a problem of growing relevance that impacts all facets of society. As a result, many researchers have become interested in studying cybercriminals and online hacker communities in order to develop more effective cyber defenses. In particular, analysis of hacker community contents may reveal existing and emerging threats that pose great risk to individuals, businesses, and government. Thus, we are interested in developing an automated methodology for identifying tangible and verifiable evidence of potential threats within hacker forums, IRC channels, and carding shops. To identify threats, we couple machine learning methodology with information retrieval techniques. Our approach allows us to distill potential threats from the entirety of collected hacker contents. We present several examples of identified threats found through our analysis techniques. Results suggest that hacker communities can be analyzed to aid in cyber threat detection, thus providing promising direction for future work.