Biblio

Filters: Keyword is Information Reuse and Security
2021-04-08
Ayub, M. A., Continella, A., Siraj, A..  2020.  An I/O Request Packet (IRP) Driven Effective Ransomware Detection Scheme using Artificial Neural Network. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :319–324.
In recent times, there has been a global surge of ransomware attacks targeted at industries of various types and sizes, from retail to critical infrastructure. Ransomware researchers are constantly coming across new kinds of ransomware samples every day and discovering novel ransomware families out in the wild. To mitigate this ever-growing menace, academia and industry-based security researchers have been utilizing unique ways to defend against this type of cyber-attack. I/O Request Packet (IRP), a low-level file system I/O log, is a newly found research paradigm for defense against ransomware that is being explored frequently. As such, in this study, to learn granular-level, actionable insights into ransomware behavior, we analyze the IRP logs of 272 ransomware samples belonging to 18 different ransomware families captured during individual execution. We further our analysis by building an effective Artificial Neural Network (ANN) structure for successful ransomware detection by learning the underlying patterns of the IRP logs. We evaluate the ANN model with three different experimental settings to prove the effectiveness of our approach. The model demonstrates outstanding performance in terms of accuracy, precision score, recall score, and F1 score, i.e., in the range of 99.7%±0.2%.
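As a hedged illustration of the kind of classifier described above (not the authors' implementation), the sketch below trains a small fully connected network on a hypothetical table of per-sample IRP-derived features; the feature layout, sizes, and labels are placeholders.

```python
# Minimal sketch: a small fully connected network over hypothetical
# per-sample IRP-derived features (e.g., counts of IRP major-function
# types, flag frequencies). Features and labels are placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
X = rng.random((1000, 30))      # 30 hypothetical IRP-log features per sample
y = rng.integers(0, 2, 1000)    # 1 = ransomware, 0 = benign (placeholder labels)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                           stratify=y, random_state=0)
scaler = StandardScaler().fit(X_tr)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), activation="relu",
                    max_iter=300, random_state=0)
clf.fit(scaler.transform(X_tr), y_tr)
print(classification_report(y_te, clf.predict(scaler.transform(X_te))))
```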
Cheng, J., He, R., Yuepeng, E., Wu, Y., You, J., Li, T..  2020.  Real-Time Encrypted Traffic Classification via Lightweight Neural Networks. GLOBECOM 2020 - 2020 IEEE Global Communications Conference. :1–6.
The fast growth of encrypted traffic places pressing demands on the efficiency of traffic classification. Although deep learning models perform well in this classification, they sacrifice efficiency to obtain high-precision results. To reduce resource and time consumption, a novel and lightweight model is proposed in this paper. Our design principle is to “maximize the reuse of thin modules”. A thin module adopts multi-head attention and a 1D convolutional network. Attributed to the one-step interaction of all packets and the parallelized computation of the multi-head attention mechanism, a key advantage of our model is that the number of parameters and the running time are significantly reduced. In addition, the effectiveness and efficiency of 1D convolutional networks are proved in traffic classification. Moreover, the proposed model can work in a real-time manner, since only three consecutive packets of a flow are needed. To improve the stability of the model, the designed network is trained with the aid of ResNet, layer normalization and learning rate warmup. The proposed model outperforms state-of-the-art deep-learning-based works on two public datasets. The results show that our model has higher accuracy and running efficiency, while the number of parameters used is 1.8% of that of the 1D convolutional network and the training time is halved.
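The abstract's "thin module" combines multi-head attention with a 1D convolution over the first three packets of a flow. The following PyTorch sketch is only illustrative; the layer sizes, the residual/normalization placement, and the packet-embedding step are assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class ThinModule(nn.Module):
    """Illustrative block: multi-head self-attention over packet embeddings
    followed by a 1D convolution. All dimensions are placeholders."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=3, padding=1)

    def forward(self, x):                 # x: (batch, packets, d_model)
        a, _ = self.attn(x, x, x)         # one-step interaction of all packets
        x = self.norm(x + a)              # residual connection + layer norm
        c = self.conv(x.transpose(1, 2))  # Conv1d expects (batch, channels, length)
        return torch.relu(c).transpose(1, 2)

# Toy input: a batch of 8 flows, each represented by its first 3 packets
# embedded as 64-dimensional vectors (the embedding step is not shown).
block = ThinModule()
print(block(torch.randn(8, 3, 64)).shape)  # torch.Size([8, 3, 64])
```
Reusing (stacking) this one block, rather than growing a deep stack of distinct layers, is what keeps the parameter count small.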
Bouzar-Benlabiod, L., Rubin, S. H., Belaidi, K., Haddar, N. E..  2020.  RNN-VED for Reducing False Positive Alerts in Host-based Anomaly Detection Systems. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :17–24.
Host-based Intrusion Detection Systems (HIDS) are often based on anomaly detection. Several studies deal with anomaly detection by analyzing system-call traces and obtain good detection rates but also a high rate of false positives. In this paper, we propose a new anomaly detection approach applied to system-call traces. The normal behavior learning is done using a sequence-to-sequence model based on a Variational Encoder-Decoder (VED) architecture that integrates Recurrent Neural Network (RNN) cells. We exploit the semantics behind the invoking order of system-calls, which are then seen as sentences. A preprocessing phase is added to structure and optimize the model's input-data representation. After the learning step, a one-class classification is run to categorize the sequences as normal or abnormal. The architecture may also be used for predicting abnormal behaviors. The tests are performed on the ADFA-LD dataset.
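A hedged sketch of the surrounding pipeline only (the VED model itself is not reproduced): system-call names are mapped to integer indices, and a trace is flagged as abnormal when a scoring function exceeds a threshold learned on normal traces; the placeholder scorer below simply counts calls unseen during training.

```python
# Encode system-call traces as fixed-length integer sequences and apply a
# one-class decision via an anomaly-score threshold. The real scorer would be
# the reconstruction error of the paper's VED model; here it is a placeholder.
def build_vocab(traces):
    vocab = {"<pad>": 0, "<unk>": 1}
    for trace in traces:
        for call in trace:
            vocab.setdefault(call, len(vocab))
    return vocab

def encode(trace, vocab, max_len=50):
    ids = [vocab.get(c, vocab["<unk>"]) for c in trace][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))

def is_anomalous(trace, vocab, score_fn, threshold):
    return score_fn(encode(trace, vocab)) > threshold

normal_traces = [["open", "read", "write", "close"], ["open", "mmap", "close"]]
vocab = build_vocab(normal_traces)
score_fn = lambda ids: sum(i == vocab["<unk>"] for i in ids) / len(ids)
print(is_anomalous(["open", "ptrace", "write"], vocab, score_fn, threshold=0.01))
# True: "ptrace" was never observed in the normal traces
```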
Ameer, S., Benson, J., Sandhu, R..  2020.  The EGRBAC Model for Smart Home IoT. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :457–462.
The Internet of Things (IoT) is enabling smart homes, where multiple users with complex social relationships interact with smart devices. This requires sophisticated access control specification and enforcement models, which are currently lacking. In this paper, we introduce the extended generalized role based access control (EGRBAC) model for smart home IoT. We provide a formal definition for EGRBAC and illustrate its features with a use case. A proof-of-concept demonstration utilizing AWS-IoT Greengrass is discussed in the appendix. EGRBAC is a first step in developing a comprehensive family of access control models for smart home IoT.
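EGRBAC is defined formally in the paper; purely to convey the flavor of role- and environment-gated access in a smart home, the toy check below grants an operation only when the user's role, the device's role, and the current environment condition line up. The names and structure are illustrative assumptions, not the paper's model.

```python
# Toy illustration only: role-and-environment gated access to smart-home devices.
USER_ROLES = {"alice": {"parent"}, "bob": {"child"}}
DEVICE_ROLES = {"front_door_lock": "security_device", "tv": "entertainment_device"}

# (user role, device role, environment condition) -> permitted operations
PERMISSIONS = {
    ("parent", "security_device", "any"): {"lock", "unlock"},
    ("child", "entertainment_device", "daytime"): {"power_on", "power_off"},
}

def allowed(user, device, operation, environment):
    device_role = DEVICE_ROLES.get(device)
    for role in USER_ROLES.get(user, set()):
        for env in (environment, "any"):
            if operation in PERMISSIONS.get((role, device_role, env), set()):
                return True
    return False

print(allowed("bob", "tv", "power_on", "daytime"))             # True
print(allowed("bob", "front_door_lock", "unlock", "daytime"))  # False
```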
Xingjie, F., Guogenp, W., Shibin, Z., Chenhao.  2020.  Industrial Control System Intrusion Detection Model based on LSTM Attack Tree. 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). :255–260.
With the rapid development of the Industrial Internet, the network security risks faced by industrial control systems (ICSs) are becoming more and more intense, making effective security protection of industrial control systems extremely urgent. Compared with traditional networks, industrial control systems have some unique characteristics, which means traditional intrusion detection systems cannot be directly reused on them. Targeting industrial control systems, this paper constructs all attack paths from the hacker's perspective through an attack tree model, uses the LSTM algorithm to identify and classify attack behavior, and then further classifies attack events by extracting atomic actions. Finally, the results are traced back through the constructed attack tree model to predict the next attack step. The results show that the model performs well on attack recognition and can effectively analyze the hacker's attack path and predict the next attack target.
Wang, P., Zhang, J., Wang, S., Wu, D..  2020.  Quantitative Assessment on the Limitations of Code Randomization for Legacy Binaries. 2020 IEEE European Symposium on Security and Privacy (EuroS P). :1–16.
Software development and deployment are generally fast-paced practices, yet to date there is still a significant amount of legacy software running in various critical industries with lifespans of years or even decades. As the source code of some legacy software has become unavailable, it is difficult for maintainers to actively patch the vulnerabilities, leaving the outdated binaries appealing targets of advanced security attacks. One of the most powerful attacks today is code reuse, a technique that can circumvent most existing system-level security facilities. While there have been various countermeasures against code reuse, applying them to sourceless software appears to be exceptionally challenging. Fine-grained code randomization is considered to be an effective strategy to impede modern code-reuse attacks. To apply it to legacy software, a technique called binary rewriting is employed to directly reconstruct binaries without symbol or relocation information. However, we found that current rewriting-based randomization techniques, regardless of their designs and implementations, share a common security defect such that the randomized binaries may remain vulnerable in certain cases. Indeed, our finding does not invalidate fine-grained code randomization as a meaningful defense against code-reuse attacks, for it significantly raises the bar for exploits to be successful. Nevertheless, it is critical for the maintainers of legacy software systems to be aware of this problem and obtain a quantitative assessment of the risks in adopting a potentially incomprehensive defense. In this paper, we conducted a systematic investigation into the effectiveness of randomization techniques designed for hardening outdated binaries. We studied various state-of-the-art, fine-grained randomization tools, confirming that all of them can leave a certain part of the retrofitted binary code still reusable. To quantify the risks, we proposed a set of concrete criteria to classify gadgets immune to rewriting-based randomization and investigated their availability and capability.
Westland, T., Niu, N., Jha, R., Kapp, D., Kebede, T..  2020.  Relating the Empirical Foundations of Attack Generation and Vulnerability Discovery. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :37–44.
Automatically generating exploits for attacks receives much attention in security testing and auditing. However, little is known about the continuous effect of automatic attack generation and detection. In this paper, we develop an analytic model to understand the cost-benefit tradeoffs in light of the process of vulnerability discovery. We develop a three-phased model, suggesting that cumulative malware detection has a productive period before the rate of gain flattens. As the detection mechanisms co-evolve, the gain will likely increase. We evaluate our analytic model by using an anti-virus tool to detect thousands of automatically created Trojans. The anti-virus scanning results over five months show the validity of the model and point out future research directions.
Walia, K. S., Shenoy, S., Cheng, Y..  2020.  An Empirical Analysis on the Usability and Security of Passwords. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :1–8.
Security and usability are two essential aspects of a system, but they usually move in opposite directions. Sometimes, to achieve security, usability has to be compromised, and vice versa. Password-based authentication systems require both security and usability. However, to increase password security, absurd rules are introduced, which often drive users to compromise the usability of their passwords. Users tend to forget complex passwords and use techniques such as writing them down, reusing them, and storing them in vulnerable ways. Enhancing the strength while maintaining the usability of a password has become one of the biggest challenges for users and security experts. In this paper, we define the pronounceability of a password as a means to measure how easy it is to memorize, an aspect we associate with usability. We examine a dataset of more than 7 million passwords to determine whether the user-generated passwords are secure. Moreover, we convert the user-generated passwords into phonemes and measure the pronounceability of the phoneme-based representations. We then establish a relationship between the two and suggest how password creation strategies can be adapted to better align with both security and usability.
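The paper measures pronounceability over phoneme representations; as a loose stand-in (not the paper's metric), the sketch below scores a password by how regularly vowels and consonants alternate in its letters, a rough proxy for how easy a string is to say aloud.

```python
# Hedged stand-in for a pronounceability score: rewards vowel/consonant
# alternation in the raw characters instead of working on phonemes.
VOWELS = set("aeiou")

def pronounceability(password: str) -> float:
    letters = [c.lower() for c in password if c.isalpha()]
    if len(letters) < 2:
        return 0.0
    transitions = sum((a in VOWELS) != (b in VOWELS)
                      for a, b in zip(letters, letters[1:]))
    return transitions / (len(letters) - 1)   # 1.0 = perfectly alternating

for pw in ["banana42", "Tr0ub4dor&3", "xkqzwrtp"]:
    print(pw, round(pronounceability(pw), 2))
```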
Ekşim, A., Demirci, T..  2020.  Ultimate Secrecy in Cooperative and Multi-hop Wireless Communications. 2020 XXXIIIrd General Assembly and Scientific Symposium of the International Union of Radio Science. :1–4.
In this work, communication secrecy in cooperative and multi-hop wireless communications is examined for various radio frequencies. Attenuation lines and ranges of both detection and ultimate secrecy regions were calculated for the cooperative communication channel and the multi-hop channel with various numbers of hops. From the results, frequency ranges with the highest potential for the bandwidth-saving method known as frequency reuse were determined and compared to the point-to-point channel. Frequencies with the highest attenuation were derived and their ranges of both detection and ultimate secrecy were calculated. Point-to-point, cooperative and multi-hop channels were compared in terms of ultimate secrecy ranges. Multi-hop channel measurements were made with different numbers of hops, and the relation between the number of hops and communication security is examined. Ultimate secrecy ranges were calculated up to 1 Terahertz and found to be less than 13 meters in the 550-565 GHz frequency range. Therefore, for short-range wireless communication systems such as indoor and in-device communication systems (board-to-board or chip-to-chip communications), it is shown that various bands in the Terahertz band can be used to reuse the same frequency in different locations to obtain high security and high bandwidth.
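The secrecy ranges above are driven by frequency-dependent attenuation. As a partial illustration only, the snippet below computes the free-space path loss term, which grows with frequency; the molecular-absorption peaks (such as the one around 550-565 GHz) that produce the very short secrecy ranges require an atmospheric attenuation model (e.g., ITU-R P.676) that is not reproduced here.

```python
import math

def free_space_path_loss_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB: 20*log10(4*pi*d*f/c). Atmospheric
    (molecular) absorption, which dominates near sub-THz peaks, is omitted."""
    c = 299_792_458.0
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / c)

for f_ghz in (2.4, 60, 300, 557):
    loss = free_space_path_loss_db(13, f_ghz * 1e9)   # 13 m, cf. the secrecy range
    print(f"{f_ghz} GHz over 13 m: {loss:.1f} dB")
```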
Feng, X., Wang, D., Lin, Z., Kuang, X., Zhao, G..  2020.  Enhancing Randomization Entropy of x86-64 Code while Preserving Semantic Consistency. 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). :1–12.

Code randomization is considered the basis of mitigation against code reuse attacks, fundamentally supporting some recent proposals such as execute-only memory (XOM) that aim at dynamic return-oriented programming (ROP) attacks. However, existing code randomization methods struggle to achieve a good balance between high randomization entropy and semantic consistency. In particular, they often ignore code semantic consistency, incurring performance loss and incompatibility with current security schemes, e.g., control flow integrity (CFI). In this paper, we present an enhanced code randomization method termed HCRESC, which can improve the randomization entropy significantly while ensuring semantic consistency between variants and the original code. HCRESC reschedules instructions within the range of functions rather than basic blocks, thus producing more variants of the original code and preserving the code's semantics. We implement HCRESC on the Linux platform for the x86-64 architecture and demonstrate that HCRESC can increase the randomization entropy of x86-64 code by more than 120% compared with existing methods while leaving the control flow and size of the code unaltered.

2021-03-29
Yilmaz, I., Masum, R., Siraj, A..  2020.  Addressing Imbalanced Data Problem with Generative Adversarial Network For Intrusion Detection. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :25–30.

Machine learning techniques help to understand underlying patterns in datasets to develop defense mechanisms against cyber attacks. The Multilayer Perceptron (MLP) is a machine learning technique used to distinguish attack from benign data. However, it is difficult to construct any effective model when there are imbalances in the dataset that prevent proper classification of attack samples in the data. In this research, we use the UGR'16 dataset and conduct data wrangling initially. This technique helps to prepare a test set from the original dataset to train the neural network model effectively. We experimented with a series of inputs of varying sizes (i.e. 10000, 50000, 1 million) to observe the performance of the MLP neural network model with distribution of features over accuracy. Later, we use a Generative Adversarial Network (GAN) model that produces samples of different attack labels (e.g. blacklist, anomaly spam, ssh scan) for balancing the dataset. These samples are generated based on data from the UGR'16 dataset. Further experiments with the MLP neural network model show that a balanced attack sample dataset, made possible with GAN, produces more accurate results than an imbalanced one.
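The sketch below is a deliberately minimal vanilla GAN over numeric feature vectors of a single minority class; the dimensions, learning rates, and training length are illustrative assumptions and not the configuration used in the paper.

```python
# Minimal vanilla GAN over feature vectors of one minority (attack) class.
import torch
import torch.nn as nn

FEATURES, NOISE = 20, 16
G = nn.Sequential(nn.Linear(NOISE, 64), nn.ReLU(), nn.Linear(64, FEATURES))
D = nn.Sequential(nn.Linear(FEATURES, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

real_minority = torch.randn(256, FEATURES)   # placeholder for real attack samples

for epoch in range(200):
    # Discriminator step: real samples labeled 1, generated samples labeled 0.
    fake = G(torch.randn(len(real_minority), NOISE)).detach()
    d_loss = bce(D(real_minority), torch.ones(len(real_minority), 1)) \
           + bce(D(fake), torch.zeros(len(fake), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to make D label generated samples as real.
    z = torch.randn(len(real_minority), NOISE)
    g_loss = bce(D(G(z)), torch.ones(len(z), 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# Synthetic minority samples to append before retraining the MLP classifier.
synthetic = G(torch.randn(500, NOISE)).detach()
print(synthetic.shape)   # torch.Size([500, 20])
```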

2020-11-09
Pflanzner, T., Feher, Z., Kertesz, A..  2019.  A Crawling Approach to Facilitate Open IoT Data Archiving and Reuse. 2019 Sixth International Conference on Internet of Things: Systems, Management and Security (IOTSMS). :235–242.
Several cloud providers have started to offer specific data management services in response to the new trend called the Internet of Things (IoT). In recent years, we have already seen that cloud computing has managed to serve IoT needs for data retrieval, processing and visualization transparently for the user side. IoT-Cloud systems for smart cities and smart regions can be very complex, therefore their design and analysis should be supported by means of simulation. Nevertheless, the models used in simulation environments should be as close as possible to real-world utilization to provide reliable results. To facilitate such simulations, in earlier works we proposed an IoT trace archiving service called SUMMON that can be used to gather real-world datasets and to reuse them for simulation experiments. In this paper we provide an extension to SUMMON with an automated web crawling service that gathers IoT and sensor data from publicly available websites. We introduce the architecture and operation of this approach, and exemplify its utilization with three use cases. The provided archiving solution can be used by simulators to perform realistic evaluations.
Bouzar-Benlabiod, L., Méziani, L., Rubin, S. H., Belaidi, K., Haddar, N. E..  2019.  Variational Encoder-Decoder Recurrent Neural Network (VED-RNN) for Anomaly Prediction in a Host Environment. 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI). :75–82.
Intrusion detection systems (IDS) are important security tools. NIDS monitor network traffic and HIDS monitor local activity. HIDS are often based on anomaly detection. Several studies deal with anomaly detection using system-call traces. In this paper, we propose an anomaly detection and prediction approach. System-call traces, invoked by the running programs, are analyzed in real time. For prediction, we use a sequence-to-sequence model based on a variational encoder-decoder (VED) and variants of Recurrent Neural Networks (RNN), architectures that have shown strong performance in natural language processing. To make the analogy, we exploit the semantics behind the invoking order of system-calls, which are then seen as sentences. A preprocessing phase is added to optimize the prediction model's input data representation. A one-class classification is done to categorize the sequences into normal or abnormal. Tests are performed on the ADFA-LD dataset and show the advantage of prediction for the intrusion detection/prediction task.
Ekşim, A., Demirci, T..  2019.  Ultimate Secrecy in Wireless Communications. 2019 11th International Conference on Electrical and Electronics Engineering (ELECO). :682–686.
In this work, communication secrecy in the physical layer is examined for various radio frequencies. Frequencies with the highest level of secrecy in the 1-1000 GHz range and their level of communication secrecy are derived. The concept of ultimate secrecy in wireless communications is proposed. Attenuation lines and ranges of both detection and ultimate secrecy are calculated for transmitter powers from 1 W to 1000 W. From the results, frequencies with the highest potential for the bandwidth-saving method known as frequency reuse are identified. Commonly used secrecy benchmarks for the given conditions are calculated. Frequencies with the highest attenuation are identified and their ranges of both detection and ultimate secrecy are calculated.
Farhadi, M., Haddad, H., Shahriar, H..  2019.  Compliance Checking of Open Source EHR Applications for HIPAA and ONC Security and Privacy Requirements. 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC). 1:704–713.
Electronic Health Record (EHR) applications are digital versions of paper-based patient health information. They are increasingly adopted to improve quality in healthcare, such as convenient access to histories of patient medication and clinic visits, easier follow-up of patient treatment plans, and a more precise medical decision-making process. EHR applications are guided by measures of the Health Insurance Portability and Accountability Act (HIPAA) to ensure confidentiality, integrity, and availability. Furthermore, the Office of the National Coordinator (ONC) for Health Information Technology (HIT) defines certification criteria for the usability of EHRs. A compliance checking approach attempts to identify whether or not an adopted EHR application meets the security and privacy criteria. There is no study in the literature that examines whether traditional static code analysis-based vulnerability discovery can assist in compliance checking of the regulatory requirements of HIPAA and ONC. This paper attempts to address this issue. We identify security and privacy requirements for HIPAA technical requirements, identify a subset of ONC criteria related to security and privacy, and then evaluate EHR applications for security vulnerabilities. Finally, we propose mitigations of the security issues towards better compliance and to help practitioners reuse open source tools towards certification compliance.
Zhu, L., Zhang, Z., Xia, G., Jiang, C..  2019.  Research on Vulnerability Ontology Model. 2019 IEEE 8th Joint International Information Technology and Artificial Intelligence Conference (ITAIC). :657–661.
In order to standardize and describe vulnerability information in as much detail as possible and realize knowledge sharing, reuse and extension at the semantic level, a vulnerability ontology is constructed based on public information security databases such as CVE, CWE and CAPEC and industry public standards like CVSS. By analyzing the relationship between the vulnerability class and the weakness class, inference rules are defined to realize knowledge inference from a vulnerability instance to its consequence and from one vulnerability instance to another. The experimental results show that this model can analyze the causal and congeneric relationships between vulnerability instances, which is helpful for repairing vulnerabilities and predicting attacks.
Fischer, T., Lesjak, C., Pirker, D., Steger, C..  2019.  RPC Based Framework for Partitioning IoT Security Software for Trusted Execution Environments. 2019 IEEE 10th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON). :0430–0435.
Partitioning security components of IoT devices to enable the use of Trusted Execution Environments adds resilience against side-channel attacks. Devices are hardened against extraction of sensitive information, but at the same time additional effort must be spent for the integration of the TEE and software partitioning. To perform partitioning, the developer typically inserts Remote Procedure Calls into the software. Existing RPC-based solutions require the developer to write Interface Definition Language files to generate RPC stubs. In this work, we present an RPC-based framework that supports software partitioning via a graphical user interface. The framework extracts required information about the interfaces from source-code header files to eliminate the need for IDL files. With this approach the TEE integration time is reduced and reuse of existing libraries is supported. We evaluate a Proof-of-Concept by partitioning a TLS library for IoT devices and compare our approach to other RPC-based solutions.
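As a rough sketch of what extracting interface information from headers involves (a real implementation would use a proper C parser such as libclang rather than a regular expression), the snippet below pulls function prototypes out of a hypothetical header and lists the stubs that would be generated.

```python
# Hedged sketch: extract C function prototypes with a regex; the header
# content, function names, and types below are hypothetical.
import re

HEADER = """
int tls_handshake(tls_ctx *ctx);
size_t tls_write(tls_ctx *ctx, const unsigned char *buf, size_t len);
void tls_free(tls_ctx *ctx);
"""

PROTO = re.compile(r"^\s*([\w\s\*]+?)\s+(\w+)\s*\(([^)]*)\)\s*;", re.MULTILINE)

for ret, name, params in PROTO.findall(HEADER):
    print(f"RPC stub needed for {name}({params.strip()}) -> {ret.strip()}")
```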
Kemp, C., Calvert, C., Khoshgoftaar, T..  2018.  Utilizing Netflow Data to Detect Slow Read Attacks. 2018 IEEE International Conference on Information Reuse and Integration (IRI). :108–116.
Attackers can leverage several techniques to compromise computer networks, ranging from sophisticated malware to DDoS (Distributed Denial of Service) attacks that target the application layer. Application layer DDoS attacks, such as Slow Read, are implemented with just enough traffic to tie up CPU or memory resources, causing web and application servers to go offline. Such attacks can mimic legitimate network requests, making them difficult to detect. They also utilize less volume than traditional DDoS attacks. These low-volume attack methods can often go undetected by network security solutions until it is too late. In this paper, we explore the use of machine learners for detecting Slow Read DDoS attacks on web servers at the application layer. Our approach uses a generated dataset based upon Netflow data collected at the application layer in a live network environment. Our Netflow data uses the IP Flow Information Export (IPFIX) standard, providing significant flexibility and features. These Netflow features scale to a growing amount of traffic and have worked well in our previous DDoS work detecting evasion techniques. Our generated dataset consists of real-world network data collected from a production network. We use eight different classifiers to build Slow Read attack detection models. Our wide selection of learners provides us with a more comprehensive analysis of Slow Read detection models. Experimental results show that the machine learners were quite successful in identifying the Slow Read attacks with a high detection and low false alarm rate. The experiment demonstrates that our chosen Netflow features are discriminative enough to detect such attacks accurately.
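The shape of such a multi-classifier comparison can be sketched as follows; the flow features, labels, and the particular learners are placeholders and are not taken from the paper's eight-classifier setup.

```python
# Illustrative comparison of a few classifiers on placeholder flow features.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.random((2000, 12))       # e.g., packet counts, byte counts, durations
y = rng.integers(0, 2, 2000)     # 1 = Slow Read attack flow, 0 = normal flow

learners = {
    "decision_tree": DecisionTreeClassifier(),
    "random_forest": RandomForestClassifier(n_estimators=100),
    "naive_bayes": GaussianNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}
for name, model in learners.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"{name}: AUC = {auc:.3f}")
```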
Ya'u, B. I., Nordin, A., Salleh, N., Aliyu, I..  2018.  Requirements Patterns Structure for Specifying and Reusing Software Product Line Requirements. 2018 International Conference on Information and Communication Technology for the Muslim World (ICT4M). :185–190.
A well-defined structure is essential in all software development, providing an avenue for smooth execution of the processes involved during the various software development phases. One of the potential benefits provided by a well-defined structure is systematic reuse of software artifacts. The requirements pattern approach provides guidelines and modalities that enable a systematic way of specifying and documenting requirements, which in turn supports systematic reuse. Although there is a great deal of research concerning requirements patterns in the literature, the research focus is not on requirements engineering (RE) activities of software product line engineering (SPLE). In this paper, we propose a software requirements pattern (SRP) structure based on the RePa Requirements Pattern Template, adapted to best suit RE activities in SPLE. With this requirements pattern structure, RE activities such as elicitation and identification of common and variable requirements, as well as specification, documentation, and reuse in SPLE, could be substantially improved.
Zhang, T., Wang, R., Ding, J., Li, X., Li, B..  2018.  Face Recognition Based on Densely Connected Convolutional Networks. 2018 IEEE Fourth International Conference on Multimedia Big Data (BigMM). :1–6.
Face recognition methods based on convolutional neural networks have achieved great success. Existing models usually use the residual network as the core architecture. The residual network is good at reusing features, but it is difficult for it to explore new features, whereas a densely connected network can be used to explore new features. We propose a face recognition model named Dense Face to explore the performance of densely connected networks in face recognition. The model is based on a densely connected convolutional neural network and composed of Dense Block layers, transition layers and a classification layer. The model was trained with the joint supervision of center loss and softmax loss through feature normalization, enabling the convolutional neural network to learn more discriminative features. The Dense Face model was trained using the publicly available CASIA-WebFace dataset and was tested on the LFW and CAS-PEAL-R1 datasets. Experimental results show that the densely connected convolutional neural network achieves higher face verification accuracy and better robustness than other models such as VGG Face and ResNet.
Muller, T., Walz, A., Kiefer, M., Doran, H. Dermot, Sikora, A..  2018.  Challenges and prospects of communication security in real-time ethernet automation systems. 2018 14th IEEE International Workshop on Factory Communication Systems (WFCS). :1–9.
Real-Time Ethernet has become the major communication technology for modern automation and industrial control systems. On the one hand, this trend increases the need for an automation-friendly security solution, as such networks can no longer be considered sufficiently isolated. On the other hand, it shows that, despite diverging requirements, the domain of Operational Technology (OT) can derive advantage from high-volume technology of the Information Technology (IT) domain. Based on these two sides of the same coin, we study the challenges and prospects of approaches to communication security in real-time Ethernet automation systems. In order to capitalize on the expertise aggregated in decades of research and development, we put a special focus on the reuse of well-established security technology from the IT domain. We argue that enhancing such technology to become automation-friendly is likely to result in more robust and secure designs than greenfield designs. Because of its widespread deployment and the (to this date) nonexistence of a consistent security architecture, we use PROFINET as a showcase for our considerations. Security requirements for this technology are defined and different well-known solutions are examined according to their suitability for PROFINET. Based on these findings, we elaborate the necessary adaptations for deployment on PROFINET.
Ankam, D., Bouguila, N..  2018.  Compositional Data Analysis with PLS-DA and Security Applications. 2018 IEEE International Conference on Information Reuse and Integration (IRI). :338–345.
In compositional data, the relative proportions of the components contain the important relevant information. In such cases, Euclidean distance fails to capture variation when used within data science models and approaches such as partial least squares discriminant analysis (PLS-DA). Indeed, the Euclidean distance implicitly assumes that the data is normally distributed, which is not the case for compositional vectors. The Aitchison transformation has been considered a standard in compositional data analysis. In this paper, we consider two other transformation methods, the isometric log-ratio (ILR) transformation and the data-based power (alpha) transformation, before feeding the data to the PLS-DA algorithm for classification [1]. In order to investigate the merits of both methods, we apply them to two challenging information system security applications, namely spam filtering and intrusion detection.
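One common ILR construction and a PLS-DA step can be sketched generically as follows; this is not the authors' code, and the alpha transformation mentioned in the abstract is omitted.

```python
# Generic sketch: isometric log-ratio (ILR) transform using one standard
# orthonormal basis, followed by PLS-DA as PLS regression on one-hot labels.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def ilr(X):
    """Map strictly positive compositions (rows summing to 1) into R^(D-1)."""
    X = np.asarray(X, dtype=float)
    n, D = X.shape
    Z = np.empty((n, D - 1))
    for i in range(1, D):
        gm = np.exp(np.log(X[:, :i]).mean(axis=1))  # geometric mean of first i parts
        Z[:, i - 1] = np.sqrt(i / (i + 1)) * np.log(gm / X[:, i])
    return Z

rng = np.random.default_rng(2)
X = rng.dirichlet(alpha=np.ones(5), size=300)   # 5-part compositional vectors
y = rng.integers(0, 2, 300)                     # two classes (e.g., spam vs. ham)

Z = ilr(X)
pls = PLSRegression(n_components=2).fit(Z, np.eye(2)[y])   # one-hot targets
pred = pls.predict(Z).argmax(axis=1)                       # PLS-DA decision rule
print("training accuracy:", (pred == y).mean())
```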
Yang, J., Kang, X., Wong, E. K., Shi, Y..  2018.  Deep Learning with Feature Reuse for JPEG Image Steganalysis. 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). :533–538.
It is challenging to detect weak hidden information in a JPEG compressed image. In this paper, we propose a 32-layer convolutional neural network (CNN) with feature reuse by concatenating all features from previous layers. The proposed method improves the flow of gradients and information, and the shared features and bottleneck layers in the proposed CNN model further reduce the number of parameters dramatically. The experimental results show that the proposed method significantly reduces the detection error rate compared with existing JPEG steganalysis methods, e.g. the state-of-the-art XuNet method and the conventional SCA-GFR method. Compared with the XuNet method and the conventional SCA-GFR method in detecting J-UNIWARD at 0.1 bpnzAC (bits per non-zero AC DCT coefficient), the proposed method reduces the detection error rate by 4.33% and 6.55% respectively.
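Feature reuse by concatenation can be illustrated with a minimal dense block; the channel counts and depth below are placeholders, and this is not the 32-layer network from the paper.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer receives the concatenation of all previous feature maps,
    so earlier features are reused directly by later layers."""
    def __init__(self, in_channels=8, growth=8, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(channels, growth, kernel_size=3, padding=1),
                nn.BatchNorm2d(growth),
                nn.ReLU(inplace=True),
            ))
            channels += growth

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            out = layer(torch.cat(features, dim=1))  # reuse all earlier features
            features.append(out)
        return torch.cat(features, dim=1)

block = DenseBlock()
print(block(torch.randn(1, 8, 64, 64)).shape)  # torch.Size([1, 32, 64, 64])
```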
Wheelus, C., Bou-Harb, E., Zhu, X..  2018.  Tackling Class Imbalance in Cyber Security Datasets. 2018 IEEE International Conference on Information Reuse and Integration (IRI). :229–232.
It is clear that cyber-attacks are a danger that must be addressed with great resolve, as they threaten the information infrastructure upon which we all depend. Many studies have been published reporting varying levels of success with machine learning approaches to combating cyber-attacks, but many modern studies still focus on training and evaluating with very outdated datasets containing old attacks that are no longer a threat, and also lack data on new attacks. Recent datasets like UNSW-NB15 and SANTA have been produced to address this problem. Even so, these modern datasets suffer from class imbalance, which reduces the efficacy of predictive models trained using them. Herein we evaluate several pre-processing methods for addressing the class imbalance problem, using several of the most popular machine learning algorithms and a variant of UNSW-NB15 based upon the attributes from the SANTA dataset.
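One widely used pre-processing option for class imbalance is synthetic minority oversampling; the sketch below applies imbalanced-learn's SMOTE to placeholder data, without claiming it is among the specific methods evaluated in the paper.

```python
# Illustrative rebalancing with SMOTE on placeholder data.
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(3)
X = rng.random((1000, 10))
y = np.array([0] * 950 + [1] * 50)    # 5% minority class, as in skewed IDS data

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("before:", Counter(y), "after:", Counter(y_res))
```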
Göktaş, E., Kollenda, B., Koppe, P., Bosman, E., Portokalidis, G., Holz, T., Bos, H., Giuffrida, C..  2018.  Position-Independent Code Reuse: On the Effectiveness of ASLR in the Absence of Information Disclosure. 2018 IEEE European Symposium on Security and Privacy (EuroS P). :227–242.
Address-space layout randomization (ASLR) is a well-established defense against code-reuse attacks. However, it can be completely bypassed by just-in-time code-reuse attacks that rely on information disclosure of code addresses via memory or side-channel exposure. To address this fundamental weakness, much recent research has focused on detecting and mitigating information disclosure, the assumption being that if we perfect such techniques, we will not only maintain layout secrecy but also stop code reuse. In this paper, we demonstrate that an advanced attacker can mount practical code-reuse attacks even in the complete absence of information disclosure. To this end, we present Position-Independent Code-Reuse Attacks, a new class of code-reuse attacks relying on the relative rather than absolute location of code gadgets in memory. By means of memory massaging, the attacker first makes the victim program generate a rudimentary ROP payload (for instance, containing code pointers that target instructions "close" to relevant gadgets). Afterwards, the addresses in this payload are patched with small offsets via relative memory writes. To establish the practicality of such attacks, we present multiple Position-Independent ROP exploits against real-world software. After showing that we can bypass ASLR in current systems without requiring information disclosures, we evaluate the impact of our technique on other defenses, such as fine-grained ASLR, multi-variant execution, execute-only memory and re-randomization. We conclude by discussing potential mitigations.