Biblio
The malicious JavaScript is a common springboard for attackers to launch several types of network attacks, such as Drive-by-Download and malicious PDF delivery attack. In order to elude detection of signature matching, malicious JavaScript is often packed (so-called "obfuscation") with diversified algorithms therefore the occurrence of obfuscation is always a good pointer for potential maliciousness. In this investigation, we propose a light weight approach for quickly filtering obfuscated JavaScript by a novel method of tokenizing JavaScript text at letter level and information-theoretic measures, based on the previous work in the domain of detecting obfuscated malicious code as well as the pattern analysis of natural languages. The new approach is apparently time efficient compared to existing systems since it processes much less objects while it is also proved to be able to reach the acceptable detection accuracies.