A Modern Implementation of System Call Sequence Based Host-based Intrusion Detection Systems

Submitted by grigby1 on Tue, 08/17/2021 - 3:57pm

Title	A Modern Implementation of System Call Sequence Based Host-based Intrusion Detection Systems
Publication Type	Conference Paper
Year of Publication	2020
Authors	Byrnes, Jeffrey; Hoang, Thomas; Mehta, Nihal Nitin; Cheng, Yuan
Conference Name	2020 Second IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA)
Date Published	Oct. 2020
Publisher	IEEE
ISBN Number	978-1-7281-8543-9
Keywords	anomaly detection, Band-pass filters, composability, Computational modeling, hidden Markov model, Hidden Markov models, host-based intrusion detection, Intrusion detection, Kernel, Linux operating system, Linux Operating System Security, Metrics, pubcrawl, resilience, Resiliency, Standards, system calls
Abstract	Much research is concentrated on improving models for host-based intrusion detection systems (HIDS). Typically, such research aims at improving a model's results (e.g., reducing the false positive rate) in the familiar static training/testing environment using the standard data sources. Matching advancements in the machine learning community, researchers in the syscall HIDS domain have developed many complex and powerful syscall-based models to serve as anomaly detectors. These models typically show an impressive level of accuracy while emphasizing a low false positive rate. However, with each proposed model iteration, we get further from the setting in which these models are intended to operate. As kernels become more ornate and hardened, the implementation space for anomaly detection models is narrowing. Furthermore, the rapid advancement of operating systems, and the complexity it introduces, means that the sometimes decades-old datasets have long been obsolete. In this paper, we attempt to bridge the gap between theoretical models and their intended application environments by examining the recent Linux kernel 5.7.0-rc1. In this setting, we examine the feasibility of syscall-based HIDS in modern operating systems and the constraints imposed on the HIDS developer. We discuss how recent changes to the kernel have eliminated the previous syscall trace collection method of writing syscall table wrappers, and propose a new approach to generate data and place our detection model. Furthermore, we present the specific execution time and memory constraints that models must meet in order to be operable within their intended settings. Finally, we conclude with preliminary results from our model, which primarily show that in-kernel machine learning models are feasible, depending on their complexity.
URL	https://ieeexplore.ieee.org/document/9325401
DOI	10.1109/TPS-ISA50397.2020.00037
Citation Key	byrnes_modern_2020