Biblio
Filters: First Letter Of Last Name is T
Empirical Research on Multifactor Quantitative Stock Selection Strategy Based on Machine Learning. 2022 3rd International Conference on Pattern Recognition and Machine Learning (PRML). :380—383.
2022. Stock selection strategy design based on machine learning and multi-factor analysis is a research hotspot in the quantitative investment field. In this paper, four machine learning algorithms (support vector machine, gradient boosting regression, random forest, and linear regression) are used to predict the rise and fall of stocks, taking stock fundamentals as input variables. A portfolio strategy is constructed on this basis, and the stock selection strategy is then further optimized. The empirical results show that the multifactor quantitative stock selection strategy selects stocks effectively, and that yield performance is best under the support vector machine algorithm. As the number of factors increases, the fitting degree and the yield are inversely related under the various algorithms.
Disparity Analysis Between the Assembly and Byte Malware Samples with Deep Autoencoders. 2022 19th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). :1—4.
2022. Malware attacks in the cyber world continue to increase despite the efforts of Malware analysts to combat this problem. Recently, Malware samples have been presented as binary sequences and assembly codes. However, most researchers focus only on the raw Malware sequence in their proposed solutions, ignoring that the assembly codes may contain important details that enable rapid Malware detection. In this work, we leveraged the capabilities of deep autoencoders to investigate the presence of feature disparities between the assembly and raw binary Malware samples. First, we treated the task as an outlier detection problem to investigate whether the autoencoder would identify and justify features as samples from the same family. Second, we added noise to all samples and used a deep autoencoder to reconstruct the original samples by denoising. Experiments with the Microsoft Malware dataset showed that the byte samples' features differed from those of the assembly code samples.
Investigation Malware Analysis Depend on Reverse Engineering Using IDAPro. 2022 8th International Conference on Contemporary Information Technology and Mathematics (ICCITM). :227—231.
2022. Any software that runs malicious payloads on victims’ computers is referred to as malware. It is an increasing threat that costs people, businesses, and organizations a lot of money. Attacks on security have developed significantly in recent years. Malware may spread through both offline and online media, such as chat, SMS, and spam (email or social media), and because it has a built-in defensive mechanism it may conceal itself from antivirus software or even corrupt it. As a result, there is an urgent need to detect and prevent malware before it damages critical assets around the world. Many different techniques and tools are used to combat malware. In this paper, malware samples were analyzed in the VirtualBox environment using in-depth analysis based on reverse engineering with advanced static malware analysis techniques. The results obtained from the malware analysis represent a set of valuable information that anti-malware and antivirus companies need in order to update their products.
A Survey of Explainable Graph Neural Networks for Cyber Malware Analysis. 2022 IEEE International Conference on Big Data (Big Data). :2932—2939.
2022. Malicious cybersecurity activities have become increasingly worrisome for individuals and companies alike. While machine learning methods like Graph Neural Networks (GNNs) have proven successful on the malware detection task, their output is often difficult to understand. Explainable malware detection methods are needed to automatically identify malicious programs and present results to malware analysts in a way that is human interpretable. In this survey, we outline a number of GNN explainability methods and compare their performance on a real-world malware detection dataset. Specifically, we formulated the detection problem as a graph classification problem on the malware Control Flow Graphs (CFGs). We find that gradient-based methods outperform perturbation-based methods in terms of computational expense and performance on explainer-specific metrics (e.g., Fidelity and Sparsity). Our results provide insights into designing new GNN-based models for cyber malware detection and attribution.
Detecting Malware Using Graph Embedding and DNN. 2022 International Conference on Blockchain Technology and Information Security (ICBCTIS). :28—31.
2022. The popularity of intelligent terminals has made the malware problem increasingly serious. Among the many features of an application, the call graph can accurately express the application's behavior. The rapid development of graph neural networks in recent years provides a new approach to analyzing applications for malicious behavior using call graphs as features. However, problems such as low accuracy remain. This paper establishes a large-scale data set containing more than 40,000 samples, selects the class call graph extracted from the application as the feature, and uses graph embedding combined with a deep neural network to detect malware. The experimental results show that the proposed detection model achieves an accuracy of 97.7%, a precision of 96.6%, a recall of 96.8%, and an F1-score of 96.4%, which is better than the existing Markov-chain-based detection model and the graph embedding detection model.
Poster: Flexible Function Estimation of IoT Malware Using Graph Embedding Technique. 2022 IEEE Symposium on Computers and Communications (ISCC). :1—3.
2022. Most IoT malware consists of variants generated by editing and reusing parts of the functions in publicly available source code. In our previous study, we proposed a method to estimate the functions of a specimen using the Function Call Sequence Graph (FCSG), which is a directed graph of the execution sequence of function calls. In the FCSG-based method, the subgraph corresponding to a malware functionality is manually created and called a signature-FCSG. Specimens containing the signature-FCSG are expected to have the corresponding functionality. However, this method cannot detect specimens whose subgraph differs slightly from the signature-FCSG. This paper finds that such specimens are nevertheless expected to have the same functionality as the signature-FCSG. They need more flexible signature matching, and we propose a graph embedding technique to realize it.
Analysis of the Optimized KNN Algorithm for the Data Security of DR Service. 2022 IEEE 6th Conference on Energy Internet and Energy System Integration (EI2). :1634–1637.
2022. The data of large-scale distributed demand-side IoT devices are gradually being migrated to the cloud. This cloud deployment mode makes it convenient for IoT devices to participate in the interaction between supply and demand, but it also exposes various vulnerabilities of IoT devices to the Internet, where hackers can easily access and manipulate them to launch large-scale DDoS attacks. As an easy-to-understand supervised classification algorithm, KNN can obtain accurate classification results without much parameter tuning, and it has produced many research results in the field of DDoS detection. However, on high-dimensional data this method has a high computational cost and is not practical. To address this disadvantage, this chapter explores the potential of the classical KNN algorithm in terms of data storage structure, K-nearest-neighbor search, and hyperparameter optimization, and proposes an improved KNN algorithm for DDoS attack detection on demand-side IoT devices.
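As a rough illustration of the classical KNN classifier that this abstract takes as its starting point, the following Python sketch performs a brute-force nearest-neighbor vote. It is not the paper's improved algorithm; the function name `knn_predict`, the two traffic features, and the labels are hypothetical and chosen only for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Brute-force search: compute the Euclidean distance to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature traffic records: (packets per second, distinct source IPs).
train_points = [(10, 2), (12, 3), (11, 2), (900, 400), (950, 420), (870, 390)]
train_labels = ["normal", "normal", "normal", "ddos", "ddos", "ddos"]

label_low = knn_predict(train_points, train_labels, (15, 4))
label_high = knn_predict(train_points, train_labels, (880, 405))
```

Every query here scans all training points in every dimension, which is the kind of cost on high-dimensional data that the abstract describes as impractical.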
Application of Intelligent Transportation System Data using Big Data Technologies. 2022 Innovations in Intelligent Systems and Applications Conference (ASYU). :1–6.
2022. Problems such as the growth in the number of private vehicles along with the population, rising environmental pollution, unmet infrastructure and resource needs, and declining time efficiency in cities have put local governments, cities, and countries in search of solutions. Smart city and intelligent transportation concepts try to solve these problems by using information and communication technologies in line with those needs. When designing intelligent transportation systems (ITS), big data should be handled, beyond traditional methods, in a state-of-the-art and appropriate way with the help of methods such as artificial intelligence, machine learning, and deep learning. In this study, a data-driven decision support system model was established to help the business make strategic decisions using intelligent transportation data and to contribute to the elimination of public transportation problems in the city. The model has been established using big data technologies and business intelligence technologies: a decision support system comprising a data sources layer, a data ingestion/collection layer, a data storage and processing layer, a data analytics layer, an application/presentation layer, a developer layer, and a data management/data security layer. The decision support system was modeled using ITS data supported by big data technologies, for cases where the traditional structure could not find a solution. This paper aims to create a basis for future studies seeking solutions to the problems of integration, storage, processing, and analysis of big data, and to add the missing value to the literature within the framework of the model. We address the gap in the literature, remedy the lack of a model to use before existing data sets are applied to the business intelligence architecture, and provide a model study ahead of the application to be carried out by the authors.
ISSN: 2770-7946
Access Control Audit and Traceability Forensics Technology Based on Blockchain. 2022 4th International Conference on Frontiers Technology of Information and Computer (ICFTIC). :932—937.
2022. Access control includes authorization by security administrators and access by users. To address the difficulty of storing log information and the ease of tampering with it, which hinder the auditing and traceability forensics of authorization and access in cross-domain scenarios, we propose a blockchain-based method for access control auditing and traceability forensics. Its core is the Ethereum blockchain and the InterPlanetary File System (IPFS), and its main function is to store access control log information and trace forensics. Owing to the technical characteristics of blockchain, such as openness, transparency, and collective maintenance, blockchain-based storage of log information metadata meets the requirements of distribution and trustworthiness, and the exit of any node does not affect the operation of the whole system. At the same time, by storing log information in the blockchain structure and using mappings, it is easy to locate the suspicious authorization or judgment that led to a permission leak, so that security administrators can quickly identify its cause. Using this distributed storage structure for security audits provides stronger resistance to attacks and risks.
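The tamper-evidence property that motivates storing access logs in a blockchain structure can be sketched with a plain hash chain. The Python example below is only an illustration under stated assumptions: it does not use Ethereum or IPFS, and the helper names `append_entry` and `verify_chain` and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder back-link for the first entry

def _entry_hash(record, prev_hash):
    """SHA-256 over the record together with the previous entry's hash."""
    payload = json.dumps({"record": record, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_entry(chain, record):
    """Append a log record, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "record": record,
        "prev_hash": prev_hash,
        "hash": _entry_hash(record, prev_hash),
    })

def verify_chain(chain):
    """Return True only if every entry's hash and back-link are consistent."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(entry["record"], entry["prev_hash"]):
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"user": "alice", "action": "grant", "resource": "db1"})
append_entry(log, {"user": "bob", "action": "read", "resource": "db1"})
intact = verify_chain(log)

# Tampering with an earlier record breaks verification.
log[0]["record"]["action"] = "revoke"
tampered_ok = verify_chain(log)
```

In a real deployment the chain would be replicated across nodes, so an attacker could not simply recompute every later hash after altering a record.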
Automatic labeling of the elements of a vulnerability report CVE with NLP. 2022 IEEE 23rd International Conference on Information Reuse and Integration for Data Science (IRI). :164—165.
2022. Common Vulnerabilities and Exposures (CVE) databases contain information about vulnerabilities of software products and source code. If individual elements of CVE descriptions can be extracted and structured, then the data can be used to search and analyze CVE descriptions. Herein we propose a method to label each element in CVE descriptions by applying Named Entity Recognition (NER). For NER, we used BERT, a transformer-based natural language processing model. Using NER with machine learning can label information from CVE descriptions even if there are some distortions in the data. An experiment involving manually prepared label information for 1000 CVE descriptions shows that the labeling accuracy of the proposed method is about 0.81 for precision and about 0.89 for recall. In addition, we devise a way to train the data by dividing it into labels. Our proposed method can be used to label each element automatically from CVE descriptions.
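The abstract reports precision and recall but no combined score. As a worked example, and assuming both figures are measured over the same set of labels, the F1 score implied by the approximate values can be computed as their harmonic mean. This is a derived figure, not one stated by the authors.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Approximate values quoted in the abstract.
f1 = f1_score(0.81, 0.89)
print(round(f1, 3))  # prints 0.848
```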
Implementation of Physical Layer Security into 5G NR Systems and E2E Latency Assessment. GLOBECOM 2022 - 2022 IEEE Global Communications Conference. :4044—4050.
2022. This paper assesses the impact on the performance that information-theoretic physical layer security (IT-PLS) introduces when integrated into a 5G New Radio (NR) system. For this, we implement a wiretap code for IT-PLS based on a modular coding scheme that uses a universal-hash function in its security layer. The main advantage of this approach lies in its flexible integration into the lower layers of the 5G NR protocol stack without affecting the communication's reliability. Specifically, we use IT-PLS to secure the transmission of downlink control information by integrating an extra pre-coding security layer as part of the physical downlink control channel (PDCCH) procedures, thus not requiring any change of the 3GPP 38 series standard. We conduct experiments using a real-time open-source 5G NR standalone implementation and use software-defined radios for over-the-air transmissions in a controlled laboratory environment. The overhead added by IT-PLS is determined in terms of the latency introduced into the system, which is measured at the physical layer for an end-to-end (E2E) connection between the gNB and the user equipment.
A Named In-Network Computing Service Deployment Scheme for NDN-Enabled Software Router. 2022 5th International Conference on Hot Information-Centric Networking (HotICN). :25–29.
2022. Named in-network computing is an emerging technology of Named Data Networking (NDN). By deploying named computing services/functions on an NDN router, the router can use its free resources to provide nearby computation for users while relieving the pressure on the cloud and the network edge. Thanks to named addressing, named computing services/functions can be easily discovered and migrated in the network. Integrating the computing services into the software router as Virtual Machines (VMs) is a feasible way to implement named in-network computing, but how to deploy the service VMs effectively to optimize the local processing capability is still a challenge. Focusing on this problem, we first give the design of an NDN-enabled software router and then propose a service-earning-based named service deployment scheme (SE-NSD). For the available service VMs, SE-NSD considers not only their popularity but also their service earnings (amount of data processed per CPU cycle). By modeling the deployment problem as a knapsack problem, SE-NSD determines the optimal service VM deployment scheme. The simulation results show that, compared with the popularity-based deployment scheme, SE-NSD improves in-network computing capability by about 30% while slightly reducing the service invocation RTT for users.
ISSN: 2831-4395
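The knapsack formulation mentioned in the SE-NSD abstract can be illustrated with a standard 0/1 knapsack dynamic program. This is a minimal sketch rather than the authors' scheme: the function `select_services`, the service names, the integer resource costs, and the earning values are all hypothetical, and the paper's actual cost and earning model is not reproduced here.

```python
def select_services(services, capacity):
    """0/1 knapsack: pick services maximizing total earning within `capacity`.

    `services` is a list of (name, cost, earning) tuples with integer costs.
    Returns (total earning, list of chosen service names).
    """
    # best[c] = (total earning, chosen names) using at most c resource units.
    best = [(0, [])] * (capacity + 1)
    for name, cost, earning in services:
        # Iterate capacity downward so each service is chosen at most once.
        for c in range(capacity, cost - 1, -1):
            candidate = best[c - cost][0] + earning
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - cost][1] + [name])
    return best[capacity]

# Hypothetical service VMs: (name, resource cost, service earning).
services = [
    ("transcode", 4, 40),
    ("compress", 3, 50),
    ("filter", 2, 30),
    ("encrypt", 5, 70),
]
total_earning, chosen = select_services(services, capacity=8)
```

With a capacity of 8 units, the sketch selects `compress` and `encrypt` for a total earning of 120, whereas a purely greedy choice by a single attribute is not guaranteed to find the best combination.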
Analytical Choice of an Effective Cyber Security Structure with Artificial Intelligence in Industrial Control Systems. 2022 10th International Scientific Conference on Computer Science (COMSCI). :1–6.
2022. The new paradigm of industrial development, called Industry 4.0, faces the problems of Cybersecurity and, as has already happened in Information Systems, focuses on the use of Artificial Intelligence tools. Building on their experience of using these tools to increase the resilience of Information Systems against Cyber threats, the authors of this article approach the choice of an effective structure for the Cyber-protection of Industrial Systems, primarily by analyzing the objective differences between Industrial Systems and Information Systems. A number of analyses show that a decentralized architecture for managing large-scale industrial processes is more resilient than a centralized management architecture. These considerations give the project team sufficient grounds to prefer the decentralized structure with flock behavior for further research and experiments. The challenge is to determine the indicators that serve to assess and compare the impacts on the controlled elements.
Semi-supervised Trojan Nets Classification Using Anomaly Detection Based on SCOAP Features. 2022 IEEE International Symposium on Circuits and Systems (ISCAS). :2423—2427.
2022. Recently, hardware Trojan has become a serious security concern in the integrated circuit (IC) industry. Due to the globalization of semiconductor design and fabrication processes, ICs are highly vulnerable to hardware Trojan insertion by malicious third-party vendors. Therefore, the development of effective hardware Trojan detection techniques is necessary. Testability measures have been proven to be efficient features for Trojan nets classification. However, most of the existing machine-learning-based techniques use supervised learning methods, which involve time-consuming training processes, need to deal with the class imbalance problem, and are not pragmatic in real-world situations. Furthermore, no works have explored the use of anomaly detection for hardware Trojan detection tasks. This paper proposes a semi-supervised hardware Trojan detection method at the gate level using anomaly detection. We ameliorate the existing computation of the Sandia Controllability/Observability Analysis Program (SCOAP) values by considering all types of D flip-flops and adopt semi-supervised anomaly detection techniques to detect Trojan nets. Finally, a novel topology-based location analysis is utilized to improve the detection performance. Testing on 17 Trust-Hub Trojan benchmarks, the proposed method achieves an overall 99.47% true positive rate (TPR), 99.99% true negative rate (TNR), and 99.99% accuracy.
The Use of Blockchain for Digital Identity Management in Healthcare. 2022 10th International Conference on Cyber and IT Service Management (CITSM). :1—6.
2022. Digitalization has occurred in almost all industries, including the health industry. Patients' medical records are now easier to access and manage because all related data are stored in data storages or repositories. However, this system is still under development as the number of patients continues to increase. Lack of standardization might lead to patients losing their right to control their own data. Therefore, implementing a private blockchain system with the Self-Sovereign Identity (SSI) concept for identity management in the health industry is a viable notion. With SSI, patients benefit from having control over their own medical records, which are stored under a higher security protocol, while healthcare providers benefit in the Know Your Customer (KYC) process when they handle new patients who move from other healthcare providers. This shortens or eliminates the process of updating patients' medical records from previous healthcare providers. Therefore, we suggest several flows for implementing blockchain for digital identity in the healthcare industry to help overcome the lack of patient data control and KYC in the current system. Nevertheless, implementing blockchain in the health industry requires full attention from the surrounding system and stakeholders to be realized.
Library of Fully Homomorphic Encryption on a Microcontroller. 2022 International Conference on Smart Information Systems and Technologies (SIST). :1—5.
2022. Fully homomorphic encryption technologies make it possible to operate on encrypted data without disclosing it, so they have a lot of potential for solving personal data storage and processing issues. Because of the increased interest in these technologies, various software tools and libraries that support fully homomorphic encryption have emerged. However, because this area of cryptography is still in its early stages, standards and recommendations for the use of fully homomorphic encryption algorithms are still being developed. The paper presents the main areas of application of homomorphic encryption and analyzes existing developments in the field. The analysis showed that existing library implementations do not support the division and subtraction operations. It revealed the need to develop a fully homomorphic encryption library that supports all four arithmetic operations on encrypted data (addition, subtraction, multiplication, and division), as well as the relevance of developing an in-house implementation of a homomorphic encryption library on integers. A fully homomorphic encryption library is then implemented in C++ and on an ESP32 microcontroller. The ability to perform the four operations on encrypted data will expand the scope of application of homomorphic encryption. A method of homomorphic division and subtraction is proposed that allows these operations to be performed on homomorphically encrypted data. The level of security, the types of operations executed, the maximum length of operands, and the algorithm's running time are described as a result of numerical experiments with the parameters.
A Study on a DDH-Based Keyed Homomorphic Encryption Suitable to Machine Learning in the Cloud. 2022 IEEE International Conference on Consumer Electronics – Taiwan. :167—168.
2022. Homomorphic encryption is suitable for machine learning in the cloud, such as privacy-preserving machine learning. However, ordinary homomorphic public key encryption has the problem that public key holders can generate ciphertexts and anyone can execute homomorphic operations. In this paper, we propose a solution based on the Keyed Homomorphic-Public Key Encryption proposed by Emura et al.
Improving Anomaly Detection with a Self-Supervised Task Based on Generative Adversarial Network. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :3563–3567.
2022. Existing anomaly detection models based on generative adversarial networks succeed in detecting abnormal images under insufficient annotation of anomalous samples. However, existing models cannot accurately identify anomalous samples that are close to the normal samples. We assume that the main reason is that these methods ignore the diversity of patterns in normal samples. To alleviate this issue, this paper proposes a novel anomaly detection framework based on a generative adversarial network, called ADe-GAN. More concretely, we construct a self-supervised learning task to fully explore the pattern information and latent representations of input images. In the model inference stage, we design a new abnormality score approach that jointly considers the pattern information and reconstruction errors to improve the performance of anomaly detection. Extensive experiments show that ADe-GAN outperforms the state-of-the-art methods on several real-world datasets.
ISSN: 2379-190X
Adversarial AutoEncoder and Generative Adversarial Networks for Semi-Supervised Learning Intrusion Detection System. 2022 RIVF International Conference on Computing and Communication Technologies (RIVF). :584–589.
2022. As one of the defensive solutions against cyberattacks, an Intrusion Detection System (IDS) plays an important role in observing the network state and alerting on suspicious actions that can break down the system. There have been many attempts to adopt Machine Learning (ML) in IDS to achieve high performance in intrusion detection. However, all of them necessitate a large amount of labeled data. In addition, labeling attack data is a time-consuming and expensive human-labor operation, which makes existing ML methods difficult to deploy in a new system or yields lower results due to a lack of labels on pre-trained data. To address these issues, we propose a semi-supervised IDS model that leverages Generative Adversarial Networks (GANs) and Adversarial AutoEncoder (AAE), called a semi-supervised adversarial autoencoder (SAAE). Our SAAE experimental results on two public datasets for benchmarking ML-based IDS, NF-CSE-CIC-IDS2018 and NF-UNSW-NB15, demonstrate the effectiveness of AAE and GAN when only a small amount of labeled data is used. In particular, our approach outperforms other ML methods with the highest detection rates in spite of the scarcity of labeled data for model training, even with only 1% labeled data.
ISSN: 2162-786X
Security-Alert Screening with Oversampling Based on Conditional Generative Adversarial Networks. 2022 17th Asia Joint Conference on Information Security (AsiaJCIS). :1–7.
2022. Imbalanced class distribution can cause information loss and missed/false alarms for deep learning and machine-learning algorithms. The detection performance of traditional intrusion detection systems tends to degenerate due to skewed class distribution caused by the uneven allocation of observations among different kinds of attacks. To combat class imbalance and improve network intrusion detection performance, we adopt the conditional generative adversarial network (CTGAN), which enables the generation of samples of specific classes of interest. CTGAN builds on the generative adversarial network (GAN) architecture to model tabular data and generate high-quality synthetic data by conditionally sampling rows from the generated model. Oversampling using CTGAN adds instances to the minority class so that the majority and minority classes are equally represented. The generated security alerts are used for training classifiers that realize critical alert detection. The proposed scheme is evaluated on a real-world dataset collected from the security operation center of a large enterprise. The experimental results show that detection accuracy can be substantially improved when CTGAN is adopted to produce a balanced security-alert dataset. We believe the proposed CTGAN-based approach can cast new light on building effective systems for critical alert detection with reduced missed/false alarms.
ISSN: 2765-9712
Real-Time FPGA Investigation of Interplay Between Probabilistic Shaping and Forward Error Correction. Journal of Lightwave Technology. 40:1339—1345.
2022. In this work, we implement a complete probabilistic amplitude shaping (PAS) architecture on a field-programmable gate array (FPGA) platform to study the interplay between probabilistic shaping (PS) and forward error correction (FEC). Owing to its fully parallelized input–output interfaces based on look-up tables (LUTs) and its low computational complexity without high-precision multiplication, hierarchical distribution matching (HiDM) is chosen as the solution for real-time probabilistic shaping. In terms of FEC, we select two of the mainstream soft-decision forward error correction (SD-FEC) algorithms currently used in optical communication systems, namely Open FEC (OFEC) and soft-decision quasi-cyclic low-density parity-check (SD-QC-LDPC) codes. Through FPGA experimental investigation, we studied the impact of probabilistic shaping on OFEC and LDPC, respectively, based on PS-16QAM under moderate shaping, and also the impact of probabilistic shaping on the LDPC code based on PS-64QAM under weak/strong shaping. The FPGA experimental results show that if the pre-FEC bit error rate (BER) is used as the predictor, moderate shaping induces no degradation in OFEC performance, while strong shaping slightly degrades the error correction performance of LDPC. Nevertheless, there is no error floor when the output BER is around $10^{-15}$. However, if normalized generalized mutual information (NGMI) is selected as the predictor, the performance degradation of LDPC becomes insignificant, which means that pre-FEC BER may not be a good predictor for LDPC in the probabilistic shaping scenario. We also studied the impact of residual errors after FEC decoding on HiDM. The FPGA experimental results show that the BER after HiDM decoding increases by less than a factor of 10 compared with the post-FEC BER.
Investigation of Potential FEC Schemes for 800G-ZR Forward Error Correction. 2022 Optical Fiber Communications Conference and Exhibition (OFC). :1—3.
2022. With a record 400Gbps 100-piece-FPGA implementation, we investigate the performance of potential FEC schemes for OIF-800GZR. By comparing the power dissipation and the correction threshold at a BER of $10^{-15}$, we propose the simplified OFEC for the 800G-ZR FEC.
Low-complexity Forward Error Correction For 800G Unamplified Campus Link. 2022 20th International Conference on Optical Communications and Networks (ICOCN). :1—3.
2022. The discussion about forward error correction (FEC) used for 800G unamplified link (800LR) is ongoing. Aiming at two potential options for FEC bit error ratio (BER) threshold, we propose two FEC schemes, respectively based on channel-polarized (CP) multilevel coding (MLC) and bit interleaved coded modulation (BICM), with the same inner FEC code. The field-programmable gate array (FPGA) verification results indicate that with the same FEC overhead (OH), proposed CP-MLC outperforms BICM scheme with less resource and power consumption.
Access Distribution to the Evaluation System Based on Fuzzy Logic. 2022 12th International Conference on Advanced Computer Information Technologies (ACIT). :564—567.
2022. To control users’ access to an information system, it is necessary to develop a security system that can work in real time and be easily reconfigured. This problem can be solved using fuzzy logic. In this paper, the authors propose a fuzzy system for distributing access to a student assessment system that takes into account the user's access level, the identifier, and the risk of attack during the request. This approach makes it possible to process fuzzy or incomplete information about the user and to provide a sufficient level of protection for confidential information.
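A fuzzy access decision of the kind outlined in this abstract can be sketched with triangular membership functions and a single rule. The Python example below is only illustrative: the membership shapes, the single "high level AND low risk" rule, and the names `triangular` and `access_score` are assumptions, and the identifier input mentioned in the abstract is omitted for brevity.

```python
def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at `peak`."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def access_score(access_level, attack_risk):
    """Degree to which access should be granted; both inputs lie in [0, 1]."""
    # Fuzzify the inputs (the membership shapes are illustrative assumptions).
    level_high = triangular(access_level, 0.4, 1.0, 1.6)
    risk_low = triangular(attack_risk, -0.6, 0.0, 0.6)
    # Rule: IF level is high AND risk is low THEN grant (AND modeled as min).
    return min(level_high, risk_low)

trusted = access_score(access_level=0.9, attack_risk=0.1)
suspicious = access_score(access_level=0.9, attack_risk=0.5)
```

The same user receives a lower grant degree when the estimated attack risk rises, which is how a fuzzy rule base can weigh imprecise inputs instead of applying a hard threshold.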
Analysis Of The Small UAV Trajectory Detection Algorithm Based On The “l/n-d” Criterion Using Kalman Filtering Due To FMCW Radar Data. 2022 IEEE 16th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET). :741—745.
2022. FMCW radar systems are a promising means of detecting small UAVs. Small UAVs with an RCS on the order of $10^{-3}$ to $10^{-1}$ m$^2$ are characterized by a low SNR (less than 10 dB). To ensure an acceptable probability of detection in the resolution element (more than 0.9), the detection threshold has to be reduced. However, this leads to a significant increase in the probability of false alarms (more than $10^{-3}$) and is accompanied by the appearance of a large number of false plots. The work describes an algorithm for detecting the trajectory of a small UAV based on an “l/n-d” criterion using Kalman filtering in a spherical coordinate system from FMCW radar data. Statistical analysis of algorithms based on two types of criteria, “3/5-2” and “5/9-2”, is performed. It is shown that the algorithms make it possible to achieve a probability of target trajectory detection greater than 0.9 and a low probability of false detection of the target trajectory (less than $10^{-4}$) with a false alarm probability in the resolution element of $10^{-3}$ to $10^{-2}$.
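The abstract does not define the “l/n-d” criterion. One common reading, assumed here, is that a trajectory is confirmed once at least l plots are associated within the last n scans, and a tentative trajectory is dropped after d consecutive misses. The Python sketch below implements only that counting logic. The function name `detect_trajectory` is hypothetical, and the Kalman filtering and plot association steps are not modeled.

```python
def detect_trajectory(hits, l, n, d):
    """Return the scan index at which a track is confirmed, or None.

    `hits` holds one boolean per scan (True = a plot was associated).
    Assumed rule: confirm on l hits within the last n scans,
    drop the tentative track after d consecutive misses.
    """
    window = []   # association results of the most recent scans (at most n)
    misses = 0    # current run of consecutive misses
    for scan, hit in enumerate(hits):
        if not window and not hit:
            continue  # a tentative track can only start with a plot
        window = (window + [hit])[-n:]
        misses = 0 if hit else misses + 1
        if misses >= d:
            window, misses = [], 0  # drop the tentative track
            continue
        if sum(window) >= l:
            return scan
    return None

# "3/5-2" criterion on two hypothetical plot sequences.
confirmed_at = detect_trajectory([True, False, True, False, True], l=3, n=5, d=2)
dropped = detect_trajectory([True, False, False, True, True, False, False], l=3, n=5, d=2)
```

Under this reading, the first sequence is confirmed on the fifth scan (index 4), while the second never accumulates three hits before two consecutive misses reset it.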