Visible to the public Biblio

Filters: Keyword is log line clustering  [Clear All Filters]
2018-01-10
Thaler, S., Menkonvski, V., Petkovic, M..  2017.  Towards a neural language model for signature extraction from forensic logs. 2017 5th International Symposium on Digital Forensic and Security (ISDFS). :1–6.
Signature extraction is a critical preprocessing step in forensic log analysis because it enables sophisticated analysis techniques to be applied to logs. Currently, most signature extraction frameworks either use rule-based approaches or handcrafted algorithms. Rule-based systems are error-prone and require high maintenance effort. Hand-crafted algorithms use heuristics and tend to work well only for specialized use cases. In this paper we present a novel approach to extract signatures from forensic logs that is based on a neural language model. This language model learns to identify mutable and non-mutable parts in a log message. We use this information to extract signatures. Neural language models have shown to work extremely well for learning complex relationships in natural language text. We experimentally demonstrate that our model can detect which parts are mutable with an accuracy of 86.4%. We also show how extracted signatures can be used for clustering log lines.