Biblio
While automated methods are the first line of defense for detecting attacks on webservers, a human agent is required to understand the attacker's intent and the attack process. The goal of this research is to understand the value of various log fields and the cognitive processes by which log information is grouped, searched, and correlated. Such knowledge will enable the development of human-focused log file investigation technologies. We performed controlled experiments with 65 subjects (IT professionals and novices) who investigated excerpts from six webserver log files. Quantitative and qualitative data were gathered to: 1) analyze subject accuracy in identifying malicious activity; 2) identify the most useful pieces of log file information; and 3) understand the techniques and strategies used by subjects to process the information. Statistically significant effects were observed in the accuracy of identifying attacks and time taken depending on the type of attack. Systematic differences were also observed in the log fields used by high-performing and low-performing groups. The findings include: 1) new insights into how specific log data fields are used to effectively assess potentially malicious activity; 2) obfuscating factors in log data from a human cognitive perspective; and 3) practical implications for tools to support log file investigations.