Biblio

Found 478 results

Filters: Keyword is Big Data
2022-06-09
Wang, Jun, Wang, Wen, Wu, Dan, Lei, Ting, Liu, DunNan, Li, PeiJun, Su, Shu.  2021.  Research on Business Model of Internet of Vehicles Platform Based on Token Economy. 2021 2nd International Conference on Big Data Economy and Information Management (BDEIM). :120–124.
With the increasing number of electric vehicles, the scale of the market is also growing. In the past, the electric vehicle market suffered from problems such as opaque information, numerous levels, and data leakage, which were criticized for their impact on the overall development and policies of the electric vehicle industry. In view of the transparency and security problems in big data management transactions of the Internet of Vehicles, this paper reviews the commercial operation framework of the Internet of Vehicles Platform, analyzes the feasibility and necessity of establishing a token system for the platform, and constructs the token economic system architecture of the Internet of Vehicles Platform and its development path.
2022-06-08
Zhang, Guangxin, Zhao, Liying, Qiao, Dongliang, Shang, Ziwen, Huang, Rui.  2021.  Design of transmission line safety early warning system based on big data variable analysis. 2021 International Conference on Intelligent Transportation, Big Data & Smart City (ICITBS). :90–93.
In order to improve the accuracy and efficiency of transmission line safety early warning, a transmission line safety early warning system based on big data variable analysis is proposed. Firstly, the overall architecture of the system is designed under the B/S architecture. Secondly, in the hardware part of the system, the security data real-time monitoring module, data transmission module and security warning module are designed to meet the functional requirements of the system. Finally, in the system software design part, the big data variable analysis method is used to calculate the hidden danger of transmission line safety, so as to improve the effectiveness of transmission safety early warning. The experimental results show that, compared with the traditional security early warning system, the early warning accuracy and efficiency of the designed system are significantly improved, which can ensure the safe operation of the transmission line.
Xue, Bi.  2021.  Information Fusion and Intelligent Management of Industrial Internet of Things under the Background of Big Data. 2021 13th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA). :68–71.
This paper summarizes the types and contents of enterprise big data information, analyzes the demand for and characteristics of enterprise shared data information based on the Internet of things, and analyzes the current situation of enterprise big data fusion at home and abroad. Firstly, drawing on the idea of the Internet of things, the intelligent sensor is used as the key component of data acquisition, and multi-energy data acquisition technology is discussed. Then, taking the data information of entity enterprises as the research object, a low-energy-consumption transmission method based on a data fusion mechanism for the industrial ubiquitous Internet of things is proposed. Finally, a network monitoring and data fusion platform for the industrial Internet of things is implemented. Monitoring node networking and platform usability tests are also performed. It is shown that the scheme can achieve multi-parameter, real-time, highly reliable intelligent network management.
Yang, Ruxia, Gao, Xianzhou, Gao, Peng.  2021.  Research on Intelligent Recognition and Tracking Technology of Sensitive Data for Electric Power Big Data. 2021 13th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA). :229–234.
Current protection of sensitive power data relies on classification and grading. At the company level, classification and grading mainly take the form of written specifications. The actual classification and grading of data is carried out manually, which is laborious and time-consuming, while traditional data identification relies mainly on rules, offers a low level of automation and intelligence, and has many problems with recognition accuracy. Data classification and grading is the basis of data security protection. Sensitive data identification is the key to data classification and grading, and it is also the first step toward accurate data security protection. This paper proposes an intelligent identification and tracking technology for sensitive data in electric power big data, which can improve data classification and grading capability and provide support for the accurate implementation of data security capabilities.
Imtiaz, Sayem Mohammad, Sultana, Kazi Zakia, Varde, Aparna S..  2021.  Mining Learner-friendly Security Patterns from Huge Published Histories of Software Applications for an Intelligent Tutoring System in Secure Coding. 2021 IEEE International Conference on Big Data (Big Data). :4869–4876.

Security patterns are proven solutions to recurring problems in software development. The growing importance of secure software development has introduced diverse research efforts on security patterns that mostly focused on classification schemes, evolution and evaluation of the patterns. Despite a huge mature history of research and popularity among researchers, security patterns have not fully penetrated software development practices. Besides, software security education has not been benefited by these patterns though a commonly stated motivation is the dissemination of expert knowledge and experience. This is because the patterns lack a simple embodiment to help students learn about vulnerable code, and to guide new developers on secure coding. In order to address this problem, we propose to conduct intelligent data mining in the context of software engineering to discover learner-friendly software security patterns. Our proposed model entails knowledge discovery from large scale published real-world vulnerability histories in software applications. We harness association rule mining for frequent pattern discovery to mine easily comprehensible and explainable learner-friendly rules, mainly of the type "flaw implies fix" and "attack type implies flaw", so as to enhance training in secure coding which in turn would augment secure software development. We propose to build a learner-friendly intelligent tutoring system (ITS) based on the newly discovered security patterns and rules explored. We present our proposed model based on association rule mining in secure software development with the goal of building this ITS. Our proposed model and prototype experiments are discussed in this paper along with challenges and ongoing work.
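The "flaw implies fix" rules mentioned in the abstract above can be illustrated with a minimal sketch of support and confidence, the two standard measures in association rule mining. This is not the authors' model: the function name `flaw_implies_fix`, the thresholds, and the toy records are all hypothetical, and a real pipeline would mine frequent itemsets from published vulnerability histories rather than from hand-written pairs.

```python
from collections import Counter


def flaw_implies_fix(records, min_support=0.3, min_confidence=0.6):
    """Return rules (flaw, fix, support, confidence) that pass both thresholds.

    records: list of (flaw, fix) tuples, one per hypothetical vulnerability report.
    support    = share of all records that contain both the flaw and the fix
    confidence = share of records with the flaw that also contain the fix
    """
    total = len(records)
    flaw_counts = Counter(flaw for flaw, _ in records)
    pair_counts = Counter(records)

    rules = []
    for (flaw, fix), count in pair_counts.items():
        support = count / total
        confidence = count / flaw_counts[flaw]
        if support >= min_support and confidence >= min_confidence:
            rules.append((flaw, fix, support, confidence))
    # Strongest rules first; ties are broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


# Hypothetical toy data, invented purely for illustration.
TOY_RECORDS = [
    ("unsanitized-sql", "parameterized-query"),
    ("unsanitized-sql", "parameterized-query"),
    ("unsanitized-sql", "input-validation"),
    ("unchecked-buffer", "bounds-check"),
    ("unchecked-buffer", "bounds-check"),
]

if __name__ == "__main__":
    for flaw, fix, support, confidence in flaw_implies_fix(TOY_RECORDS):
        print(f"{flaw} => {fix}  support={support:.2f} confidence={confidence:.2f}")
```

With the toy data, the rule "unsanitized-sql implies input-validation" is dropped because it appears in only one of five records, which is the kind of filtering that keeps the surviving rules short and easy to explain to learners.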

Guo, Jiansheng, Qi, Liang, Suo, Jiao.  2021.  Research on Data Classification of Intelligent Connected Vehicles Based on Scenarios. 2021 International Conference on E-Commerce and E-Management (ICECEM). :153–158.
The intelligent connected vehicle industry has entered a period of opportunity: industry data is accumulating rapidly, and the formulation of industry standards to regulate big data management and application is imminent. As the basis of data security, data classification has received unprecedented attention. By reviewing the research and development status of data classification in various industries, this article combines industry characteristics and re-examines the framework of industry data classification from the aspects of information security and data assetization, and tries to find the balance point between data security and data value. The intelligent connected vehicle industry provides support for big data applications. This article combines the characteristics of the connected vehicle industry, re-examines the data characteristics of the intelligent connected vehicle industry from the two aspects of information security and data assetization, and eventually proposes a scene-based hierarchical framework. The framework includes the complete classification process, model, and quantifiable parameters, which provides a solution and theoretical endorsement for the construction of a big data automatic classification system for the intelligent connected vehicle industry and safe data open applications.
2022-06-06
Uchida, Hikaru, Matsubara, Masaki, Wakabayashi, Kei, Morishima, Atsuyuki.  2020.  Human-in-the-loop Approach towards Dual Process AI Decisions. 2020 IEEE International Conference on Big Data (Big Data). :3096–3098.
How to develop AI systems that can explain how they made decisions is one of the important and hot topics today. Inspired by the dual-process theory in psychology, this paper proposes a human-in-the-loop approach to develop System-2 AI that makes an inference logically and outputs interpretable explanation. Our proposed method first asks crowd workers to raise understandable features of objects of multiple classes and collect training data from the Internet to generate classifiers for the features. Logical decision rules with the set of generated classifiers can explain why each object is of a particular class. In our preliminary experiment, we applied our method to an image classification of Asian national flags and examined the effectiveness and issues of our method. In our future studies, we plan to combine the System-2 AI with System-1 AI (e.g., neural networks) to efficiently output decisions.
Yeruva, Vijaya Kumari, Chandrashekar, Mayanka, Lee, Yugyung, Rydberg-Cox, Jeff, Blanton, Virginia, Oyler, Nathan A.  2020.  Interpretation of Sentiment Analysis with Human-in-the-Loop. 2020 IEEE International Conference on Big Data (Big Data). :3099–3108.
Human-in-the-Loop has been receiving special attention from the data science and machine learning community. It is essential to realize the advantages of human feedback and the pressing need for manual annotation to improve machine learning performance. Recent advancements in natural language processing (NLP) and machine learning have created unique challenges and opportunities for digital humanities research. In particular, there are ample opportunities for NLP and machine learning researchers to analyze data from literary texts and use these complex source texts to broaden our understanding of human sentiment using the human-in-the-loop approach. This paper presents our understanding of how human annotators differ from machine annotators in sentiment analysis tasks and how these differences can contribute to designing systems for the "human in the loop" sentiment analysis in complex, unstructured texts. We further explore the challenges and benefits of the human-machine collaboration for sentiment analysis using a case study in Greek tragedy and address some open questions about collaborative annotation for sentiments in literary texts. We focus primarily on (i) an analysis of the challenges in sentiment analysis tasks for humans and machines, and (ii) whether consistent annotation results are generated from multiple human annotators and multiple machine annotators. For human annotators, we have used a survey-based approach with about 60 college students. We have selected six popular sentiment analysis tools for machine annotators, including VADER, CoreNLP's sentiment annotator, TextBlob, LIME, Glove+LSTM, and RoBERTa. We have conducted a qualitative and quantitative evaluation with the human-in-the-loop approach and confirmed our observations on sentiment tasks using the Greek tragedy case study.
Peng, Liwen, Zhu, Xiaolin, Zhang, Peng.  2021.  A Framework for Mobile Forensics Based on Clustering of Big Data. 2021 IEEE 4th International Conference on Electronics Technology (ICET). :1300–1303.
With the rapid development of wireless networks and smart mobile equipment, many lawbreakers employ mobile devices to destroy and steal important information and property from other persons. In order to fight criminal acts efficiently, public security organs need to collect evidence from the crime tools and submit it to the court. Meanwhile, with the development of internal storage technology, law enforcement officials collect large amounts of information from smart mobile equipment. To handle these huge amounts of data, we propose a framework that combines distributed clustering methods to analyze data sets. The model splits massive data into smaller pieces and uses a clustering method to analyze each piece on a separate machine, which addresses the problem of data volume and makes forensic investigation work more effective.
2022-05-24
Liu, Yizhong, Xia, Yu, Liu, Jianwei, Hei, Yiming.  2021.  A Secure and Decentralized Reconfiguration Protocol For Sharding Blockchains. 2021 7th IEEE Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :111–116.
Most present reconfiguration methods in sharding blockchains rely on a secure randomness, whose generation might be complicated. Besides, a reference committee is usually in charge of the reconfiguration, making the process not decentralized. To address the above issues, this paper proposes a secure and decentralized shard reconfiguration protocol, which allows each shard to complete the selection and confirmation of its own shard members in turn. The PoW mining puzzle is calculated using the public key hash value in the member list confirmed by the last shard. Through the mining and shard member list commitment process, each shard can update its members safely and efficiently once in a while. Furthermore, it is proved that our protocol satisfies the safety, consistency, liveness, and decentralization properties. The honest member proportion in each confirmed shard member list is guaranteed to exceed a certain safety threshold, and all honest nodes have an identical view on the list. The reconfiguration is ensured to make progress, and each node has the same right to participate in the process. Our secure and decentralized shard reconfiguration protocol could be applied to all committee-based sharding blockchains.
2022-05-19
Zhang, Xiangyu, Yang, Jianfeng, Li, Xiumei, Liu, Minghao, Kang, Ruichun, Wang, Runmin.  2021.  Deeply Multi-channel guided Fusion Mechanism for Natural Scene Text Detection. 2021 7th International Conference on Big Data and Information Analytics (BigDIA). :149–156.
Scene text detection methods have developed greatly in the past few years. However, because of the diversity of text backgrounds in natural scenes, previous methods often fail when detecting more complicated text instances (e.g., super-long text and arbitrarily shaped text). In this paper, a text detection method based on multi-channel bounding box fusion is designed to address the problem. Firstly, a convolutional neural network is used as the basic network for feature extraction, producing shallow text feature maps and deep semantic text feature maps. Secondly, the whole convolutional network is used to upsample the feature maps and fuse them at each layer, so as to obtain pixel-level text and non-text classification results. Then, two independent text detection box channels are designed: a bounding box regression channel and a channel that obtains the bounding box directly from the score map. Finally, the result is obtained by applying the multi-channel bounding box fusion mechanism to the detection boxes of the two channels. Experiments on ICDAR2013 and ICDAR2015 demonstrate that the proposed method achieves competitive results in scene text detection.
Zhang, Cheng, Yamana, Hayato.  2021.  Improving Text Classification Using Knowledge in Labels. 2021 IEEE 6th International Conference on Big Data Analytics (ICBDA). :193–197.
Various algorithms and models have been proposed to address text classification tasks; however, they rarely consider incorporating the additional knowledge hidden in class labels. We argue that hidden information in class labels leads to better classification accuracy. In this study, instead of encoding the labels into numerical values, we incorporated the knowledge in the labels into the original model without changing the model architecture. We combined the output of an original classification model with the relatedness calculated based on the embeddings of a sequence and a keyword set. A keyword set is a word set to represent knowledge in the labels. Usually, it is generated from the classes while it could also be customized by the users. The experimental results show that our proposed method achieved statistically significant improvements in text classification tasks. The source code and experimental details of this study can be found on GitHub: https://github.com/HeroadZ/KiL.
Kuilboer, Jean-Pierre, Stull, Tristan.  2021.  Text Analytics and Big Data in the Financial domain. 2021 16th Iberian Conference on Information Systems and Technologies (CISTI). :1–4.
This research attempts to provide some insights on the application of text mining and Natural Language Processing (NLP). The application domain is consumer complaints about financial institutions in the USA. As an advanced analytics discipline embedded within the Big Data paradigm, the practice of text analytics contains elements of emergent knowledge processes. Since our experiment should be able to scale up we make use of a pipeline based on Spark-NLP. The usage scenario is adapting the model to a specific industrial context and using the dataset offered by the "Consumer Financial Protection Bureau" to illustrate the application.
Ndichu, Samuel, Ban, Tao, Takahashi, Takeshi, Inoue, Daisuke.  2021.  A Machine Learning Approach to Detection of Critical Alerts from Imbalanced Multi-Appliance Threat Alert Logs. 2021 IEEE International Conference on Big Data (Big Data). :2119–2127.
The extraordinary number of alerts generated by network intrusion detection systems (NIDS) can desensitize security analysts tasked with incident response. Security information and event management systems (SIEMs) perform some rudimentary automation but cannot replicate the decision-making process of a skilled analyst. Machine learning and artificial intelligence (AI) can detect patterns in data with appropriate training. In practice, the majority of the alert data comprises false alerts, and true alerts form only a small proportion. Consequently, a naive engine that classifies all security alerts into the majority class can yield a superficial high accuracy close to 100%. Without any correction for the class imbalance, the false alerts will dominate algorithmic predictions resulting in poor generalization performance. We propose a machine-learning approach to address the class imbalance problem in multi-appliance security alert data and automate the security alert analysis process performed in security operations centers (SOCs). We first used the neighborhood cleaning rule (NCR) to identify and remove ambiguous, noisy, and redundant false alerts. Then, we applied the support vector machine synthetic minority oversampling technique (SVMSMOTE) to generate synthetic training true alerts. Finally, we fit and evaluated the decision tree and random forest classifiers. In the experiments, using alert data from eight security appliances, we demonstrated that the proposed method can significantly reduce the need for manual auditing, decreasing the number of uninspected alerts and achieving a performance of 99.524% in recall.
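The abstract's point that a naive majority-class engine reaches near-100% accuracy while being useless can be checked with a few lines of arithmetic. The sketch below is only an illustration under invented numbers (5 true alerts among 1,000); it does not reproduce the paper's NCR, SVMSMOTE, or classifier pipeline, and the helper name `accuracy_and_recall` is hypothetical.

```python
def accuracy_and_recall(y_true, y_pred):
    """Compute accuracy and recall for binary labels (1 = true alert, 0 = false alert)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)

    accuracy = correct / len(y_true)
    # Recall is undefined without positives; report 0.0 in that case.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return accuracy, recall


# Hypothetical imbalanced alert log: 5 true alerts among 1,000 alerts.
Y_TRUE = [1] * 5 + [0] * 995
# A naive engine that assigns every alert to the majority (false-alert) class.
Y_NAIVE = [0] * 1000

if __name__ == "__main__":
    acc, rec = accuracy_and_recall(Y_TRUE, Y_NAIVE)
    print(f"accuracy={acc:.3f} recall={rec:.3f}")
```

The naive engine scores 99.5% accuracy yet finds none of the true alerts, which is why the abstract reports recall rather than accuracy as the headline metric.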
2022-05-10
Ye, YuGuang.  2021.  Research on the Security Defense Strategy of Smart City's Substitution Computer Network in Big Data. 2021 5th International Conference on Electronics, Communication and Aerospace Technology (ICECA). :1428–1431.
With the rapid development of the information technology era, the era of big data has also arrived. While computer networks are promoting the prosperity and development of society, their applications have become more extensive and in-depth. Smart city video surveillance systems have entered an era of networked surveillance and business integration. The problems are also endless. This article discusses computer network security in the era of big data, hoping to help strengthen the security of computer networks in our country. This paper studies the computer network security prevention strategies of smart cities in the era of big data.
Ahakonye, Love Allen Chijioke, Amaizu, Gabriel Chukwunonso, Nwakanma, Cosmas Ifeanyi, Lee, Jae Min, Kim, Dong-Seong.  2021.  Enhanced Vulnerability Detection in SCADA Systems using Hyper-Parameter-Tuned Ensemble Learning. 2021 International Conference on Information and Communication Technology Convergence (ICTC). :458–461.
The growing inter-dependency and intricacy of Supervisory Control and Data Acquisition (SCADA) systems in industrial operations increases the likelihood of vulnerability to malicious threats, and machine learning approaches have been used extensively in research on vulnerability detection. To improve security, an enhanced vulnerability detection scheme using hyper-parameter-tuned machine learning is proposed for early detection, classification, and mitigation in SCADA communication and transmission networks by classifying DNS traffic as benign or malicious. The proposed scheme, an ensemble optimizer (GentleBoost) with hyper-parameter tuning, performed comparatively well. In the simulation results, the proposed scheme performed outstandingly within the shortest possible time, with an accuracy of 99.49%, a precision of 99.23%, and a recall rate of 99.75%. The model was also compared with other contemporary algorithms and outperformed all of them, proving to be an approach for keeping abreast of SCADA network vulnerabilities and attacks.
Tao, Yunting, Kong, Fanyu, Yu, Jia, Xu, Qiuliang.  2021.  Modification and Performance Improvement of Paillier Homomorphic Cryptosystem. 2021 IEEE 19th International Conference on Embedded and Ubiquitous Computing (EUC). :131–136.
Data security and privacy have become an important problem while big data systems are growing dramatically fast in various application fields. Paillier additive homomorphic cryptosystem is widely used in information security fields such as big data security, communication security, cloud computing security, and artificial intelligence security. However, how to improve its computational performance is one of the most critical problems in practice. In this paper, we propose two modifications to improve the performance of the Paillier cryptosystem. Firstly, we introduce a key generation method to generate the private key with low Hamming weight, and this can be used to accelerate the decryption computation of the Paillier cryptosystem. Secondly, we propose an acceleration method based on Hensel lifting in the Paillier cryptosystem. This method can obtain a faster and improved decryption process by showing the mathematical analysis of the decryption algorithm.
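For readers unfamiliar with the additive homomorphism that the abstract above relies on, here is a minimal sketch of the textbook Paillier scheme with the common choice g = n + 1. It does not implement the paper's modifications (low-Hamming-weight private keys or Hensel-lifting acceleration), the primes 293 and 433 are toy values chosen only so the example runs instantly, and the function names are hypothetical. It is not secure for real use.

```python
import math
import secrets


def keygen(p, q):
    """Build a toy Paillier key pair from two distinct primes p and q."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    # With g = n + 1, L(g^lam mod n^2) = lam mod n, so mu is the inverse of lam mod n.
    mu = pow(lam, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r."""
    n, g = pub
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(pub, priv, c):
    """Recover m as L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    n, _ = pub
    return (c1 * c2) % (n * n)


if __name__ == "__main__":
    pub, priv = keygen(293, 433)  # toy primes, far too small for real security
    c1, c2 = encrypt(pub, 1234), encrypt(pub, 5678)
    print(decrypt(pub, priv, add_encrypted(pub, c1, c2)))  # prints 6912
```

Decryption is dominated by the modular exponentiation `pow(c, lam, n * n)`, which is the step that approaches such as the one described in the abstract aim to speed up.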
2022-05-09
Zobaed, Sakib M, Salehi, Mohsen Amini, Buyya, Rajkumar.  2021.  SAED: Edge-Based Intelligence for Privacy-Preserving Enterprise Search on the Cloud. 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid). :366–375.
Cloud-based enterprise search services (e.g., AWS Kendra) have been entrancing big data owners by offering convenient and real-time search solutions to them. However, the problem is that individuals and organizations possessing confidential big data are hesitant to embrace such services due to valid data privacy concerns. In addition, to offer an intelligent search, these services access the user’s search history, which further jeopardizes his/her privacy. To overcome the privacy problem, the main idea of this research is to separate the intelligence aspect of the search from its pattern matching aspect. According to this idea, the search intelligence is provided by an on-premises edge tier and the shared cloud tier only serves as an exhaustive pattern matching search utility. We propose the Smartness at Edge (SAED) mechanism, which offers intelligence in the form of semantic and personalized search at the edge tier while maintaining privacy of the search on the cloud tier. At the edge tier, SAED uses a knowledge-based lexical database to expand the query and cover its semantics. SAED personalizes the search via an RNN model that can learn the user’s interest. A word embedding model is used to retrieve documents based on their semantic relevance to the search query. SAED is generic and can be plugged into existing enterprise search systems and enable them to offer intelligent and privacy-preserving search without enforcing any change on them. Evaluation results on two enterprise search systems under real settings and verified by human users demonstrate that SAED can improve the relevancy of the retrieved results by on average ≈24% for plain-text and ≈75% for encrypted generic datasets.
2022-05-06
Bhagavan, Srini, Gharibi, Mohamed, Rao, Praveen.  2021.  FedSmarteum: Secure Federated Matrix Factorization Using Smart Contracts for Multi-Cloud Supply Chain. 2021 IEEE International Conference on Big Data (Big Data). :4054–4063.
With increased awareness comes unprecedented expectations. We live in a digital, cloud era wherein the underlying information architectures are expected to be elastic, secure, resilient, and handle petabyte scaling. The expectation of epic proportions from the next generation of the data frameworks is to not only do all of the above but also build it on a foundation of trust and explainability across multi-organization business networks. From cloud providers to automobile industries or even vaccine manufacturers, components are often sourced by a complex, not fully digitized thread of disjoint suppliers. Building Machine Learning and AI-based order fulfillment and predictive models, and remediating issues, is a challenge for multi-organization supply chain automation. We posit that Federated Learning in conjunction with blockchain and smart contracts are technologies primed to tackle data privacy and centralization challenges. In this paper, motivated by challenges in the industry, we propose a decentralized distributed system in conjunction with a recommendation system model (Matrix Factorization) that is trained using Federated Learning on an Ethereum blockchain network. We leverage smart contracts that allow decentralized serverless aggregation to update localized item vectors. Furthermore, we utilize Homomorphic Encryption (HE) to allow sharing the encrypted gradients over the network while maintaining their privacy. Based on our results, we argue that training a model over a serverless Blockchain network using smart contracts will provide the same accuracy as in a centralized model while maintaining our serverless model privacy and reducing the overhead communication to a central server. Finally, we assert that such a system provides transparency, audit readiness, and deep insights into supply chain operations for enterprise cloud customers, resulting in cost savings and higher Quality of Service (QoS).
Nayak, Lipsa, Jayalakshmi, V..  2021.  A Study of Securing Healthcare Big Data using DNA Encoding based ECC. 2021 6th International Conference on Inventive Computation Technologies (ICICT). :348–352.
The IT world is migrating towards utilizing cloud computing as an essential platform for storing and exchanging data. With the amelioration of technology, a colossal amount of data is being generated over time. Cloud computing provides an enormous data storage capacity, with virtualized resources and the flexibility of accessing data without time and place restrictions. Healthcare industries spawn intense amounts of data from various medical instruments and digital records of patients. To access data remotely from any geographical location, the healthcare industry is moving towards cloud computing. EHR and PHR are patients' digital records, which include sensitive patient information. Despite all the proficient services provided by cloud computing, security is a primary concern for various organizations. To address the security issue, several cryptographic techniques have been implemented by researchers worldwide. In this paper, a vigorous cryptographic method is discussed, which is implemented by combining DNA cryptography and Elliptic Curve Cryptography to protect sensitive data in the cloud.
2022-05-05
Salman, Zainab, Hammad, Mustafa, Al-Omary, Alauddin Yousif.  2021.  A Homomorphic Cloud Framework for Big Data Analytics Based on Elliptic Curve Cryptography. 2021 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT). :7–11.
Homomorphic Encryption (HE) is a sophisticated and powerful cryptography system that can preserve the privacy of data in all cases, whether the data is at rest or being processed and computed on. All the computations needed by the user or the provider can be done on the encrypted data without any need to decrypt it. However, HE has overheads such as big key sizes and long ciphertexts, and as a result long execution times. This paper proposes a novel solution for big data analytics based on clustering and Elliptic Curve Cryptography (ECC). The Extremely Distributed Clustering technique (EDC) has been used to divide big data into several subsets of cloud computing nodes. Different clustering techniques were investigated, and it was found that using hybrid techniques can improve the performance and efficiency of big data analytics while the data remains protected and privacy is preserved using ECC.
2022-04-26
Zhai, Hongqun, Zhang, Juan.  2021.  Research on Application of Radio Frequency Identification Technology in Intelligent Maritime Supervision. 2021 IEEE International Conference on Data Science and Computer Application (ICDSCA). :433–436.

The increasing volume of domestic and foreign trade brings new challenges to the efficiency and safety supervision of transportation. With the rapid development of Internet technology, it has opened up a new era of intelligent Internet of Things and the modern marine Internet of Vessels. Radio Frequency Identification technology strengthens the intelligent navigation and management of ships through the unique identification function of “label is object, object is label”. Intelligent Internet of Vessels can achieve the function of “limited electronic monitoring and unlimited electronic deterrence” combined with marine big data and Cyber Physical Systems, and further improve the level of modern maritime supervision and service.

2022-04-25
Dijk, Allard.  2021.  Detection of Advanced Persistent Threats using Artificial Intelligence for Deep Packet Inspection. 2021 IEEE International Conference on Big Data (Big Data). :2092–2097.

Advanced persistent threats (APTs) are stealthy threat actors with the skills to gain covert control of the computer network for an extended period of time. They are the highest cyber attack risk factor for large companies and states. A successful attack via an APT can cost millions of dollars, can disrupt civil life and has the capabilities to do physical damage. APT groups are typically state-sponsored and are considered the most effective and skilled cyber attackers. APT attacks are executed in several stages, as pointed out in the Lockheed Martin cyber kill chain (CKC). Each of these APT stages can potentially be identified as patterns in network traffic. Using the "APT-2020" dataset, which compiles the characteristics and stages of an APT, we carried out experiments on the detection of anomalous traffic for all APT stages. We compare several artificial intelligence models, like a stacked autoencoder, a recurrent neural network and a one class state vector machine, and show significant improvements on detection in the data exfiltration stage. This dataset is the first to have a data exfiltration stage included to experiment on. According to APT-2020’s authors, current models have the biggest challenge specific to this stage. We introduce a method to successfully detect data exfiltration by analyzing the payload of the network traffic flow. This flow-based deep packet inspection approach improves detection compared to other state-of-the-art methods.

2022-04-18
Shi, Pinyi, Song, Yongwook, Fei, Zongming, Griffioen, James.  2021.  Checking Network Security Policy Violations via Natural Language Questions. 2021 International Conference on Computer Communications and Networks (ICCCN). :1–9.
Network security policies provide high-level directives regarding acceptable and unacceptable use of the network. Organizations specify these high-level directives in policy documents written using human-readable natural language. The challenge is to convert these natural language policies to the network configurations/specifications needed to enforce the policy. Network administrators, who are responsible for enforcing the policies, typically translate the policies manually, which is a challenging and error-prone process. As a result, network operators (as well as the policy authors) often want to verify that network policies are being correctly enforced. In this paper, we propose Network Policy Conversation Engine (NPCE), a system designed to help network operators (or policy writers) interact with the network using natural language (similar to the language used in the network policy statements themselves) to understand whether policies are being correctly enforced. The system leverages emerging big data collection and analysis techniques to record flow and packet level activity throughout the network that can be used to answer users' policy questions. The system also takes advantage of recent advances in Natural Language Processing (NLP) to translate natural language policy questions into the corresponding network queries. To evaluate our system, we demonstrate a wide range of policy questions – inspired by actual network policies posted on university websites – that can be asked of the system to determine if a policy violation has occurred.
Vijayalakshmi, K., Jayalakshmi, V..  2021.  Identifying Considerable Anomalies and Conflicts in ABAC Security Policies. 2021 5th International Conference on Intelligent Computing and Control Systems (ICICCS). :1273–1280.
Nowadays, the security of shared resources and big data is an important and critical issue. With the growth of information technology and social networks, data and resources are shared in distributed environments such as cloud and fog computing. Various access control models protect shared resources from unauthorized users and malicious intruders. Although the attribute-based access control (ABAC) model meets the complex security requirements of today's new computing technologies, considerable anomalies and conflicts in ABAC policies affect the efficiency of the security system. One important and difficult task is policy validation, which detects and eliminates anomalies and conflicts in policies. Although previous research identified anomalies, it failed to detect and analyze all considerable anomalies, which leaves systems vulnerable to hacks and attacks. The primary objective of this paper is to study and analyze the possible anomalies and conflicts in ABAC security policies. We discuss and analyze considerable conflicts in policies based on previous research. This paper provides a detailed review of anomalies and conflicts in security policies.