Towards a neural language model for signature extraction from forensic logs

Submitted by K_Hooper on Wed, 01/10/2018 - 11:14am

Title	Towards a neural language model for signature extraction from forensic logs
Publication Type	Conference Paper
Year of Publication	2017
Authors	Thaler, S., Menkovski, V., Petkovic, M.
Conference Name	2017 5th International Symposium on Digital Forensic and Security (ISDFS)
Date Published	April
Keywords	Clustering algorithms, complex relationship learning, Data analysis, digital forensics, error-prone, forensic log analysis, Forensics, handcrafted algorithms, heuristics, Human Behavior, knowledge based systems, learning (artificial intelligence), log line clustering, log message, natural language processing, natural language text, neural language model, neural nets, Neural networks, nonmutable part identification, pattern clustering, Predictive models, pubcrawl, Resiliency, rule-based approaches, rule-based systems, Scalability, signature extraction frameworks, Software, text analysis, use cases
Abstract	Signature extraction is a critical preprocessing step in forensic log analysis because it enables sophisticated analysis techniques to be applied to logs. Currently, most signature extraction frameworks use either rule-based approaches or handcrafted algorithms. Rule-based systems are error-prone and require high maintenance effort. Handcrafted algorithms use heuristics and tend to work well only for specialized use cases. In this paper, we present a novel approach to extracting signatures from forensic logs that is based on a neural language model. This language model learns to identify mutable and non-mutable parts in a log message. We use this information to extract signatures. Neural language models have been shown to work extremely well for learning complex relationships in natural language text. We experimentally demonstrate that our model can detect which parts are mutable with an accuracy of 86.4%. We also show how extracted signatures can be used for clustering log lines.
DOI	10.1109/ISDFS.2017.7916497
Citation Key	thaler_towards_2017
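
The abstract describes turning per-token mutable/non-mutable labels into signatures and then clustering log lines by signature. The sketch below illustrates only that last step and is not the authors' implementation: the mutable flags are assigned by hand rather than predicted by a neural language model, and the names `extract_signature`, `cluster_by_signature`, `WILDCARD`, and the example log lines are all hypothetical.

```python
from collections import defaultdict

# Hypothetical placeholder that stands in for every mutable token.
WILDCARD = "<*>"


def extract_signature(tokens, mutable_flags):
    """Build a signature by replacing mutable tokens with a wildcard.

    `mutable_flags` is assumed to be given; in the paper this information
    would come from the neural language model.
    """
    if len(tokens) != len(mutable_flags):
        raise ValueError("tokens and mutable_flags must have equal length")
    return " ".join(
        WILDCARD if is_mutable else token
        for token, is_mutable in zip(tokens, mutable_flags)
    )


def cluster_by_signature(labelled_lines):
    """Group log lines that share the same extracted signature."""
    clusters = defaultdict(list)
    for tokens, mutable_flags in labelled_lines:
        signature = extract_signature(tokens, mutable_flags)
        clusters[signature].append(" ".join(tokens))
    return dict(clusters)


# Made-up log lines with hand-assigned mutable flags (illustration only).
EXAMPLE_LINES = [
    ("Accepted password for alice from 10.0.0.5".split(),
     [False, False, False, True, False, True]),
    ("Accepted password for bob from 10.0.0.9".split(),
     [False, False, False, True, False, True]),
    ("Connection closed by 10.0.0.5".split(),
     [False, False, False, True]),
]

if __name__ == "__main__":
    for signature, lines in cluster_by_signature(EXAMPLE_LINES).items():
        print(signature, "->", len(lines), "line(s)")
```

Under these assumptions, the two login lines collapse into one signature, while the connection line forms its own cluster.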
