Biblio
Whenever an Internet user visits a website, a scripting language known as JavaScript runs in the background. The embedding of malicious activities within such scripts poses a great threat to the cyber world. Attackers take advantage of the dynamic nature of JavaScript and embed malicious code within websites to download malware and damage the host. The authors of these scripts obfuscate them to shield them from detection by malware detectors. In this paper, we propose a novel technique for analysing and detecting malicious JavaScript using a sandbox-assisted ensemble model. We extract the payload using the malware-jail sandbox to obtain the real script. Upon obtaining the extracted script, we analyse it to define the features needed for creating the dataset, and we compute Pearson's r between every pair of features for feature selection. An ensemble model consisting of Sequential Minimal Optimization (SMO), Voted Perceptron, and the AdaBoost algorithm is used with a voting technique to detect malicious JavaScript. Experimental results show that the proposed model can detect obfuscated and de-obfuscated malicious JavaScript with an accuracy of 99.6% and a detection time of 0.03 s. Our model outperforms other state-of-the-art models in accuracy while requiring the least training and detection time.
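A minimal Python sketch of the kind of pipeline this abstract describes, under several assumptions: scikit-learn's SVC stands in for SMO (its solver is SMO-style), a plain Perceptron stands in for Voted Perceptron, and the correlation threshold and placeholder data are invented for illustration rather than taken from the paper.

```python
# Hypothetical sketch: Pearson-correlation-based feature selection followed by
# a hard-voting ensemble of SVC (SMO-style), Perceptron, and AdaBoost.
import numpy as np
from scipy.stats import pearsonr
from sklearn.svm import SVC                      # SMO-style SVM solver
from sklearn.linear_model import Perceptron      # stand-in for Voted Perceptron
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier

def drop_correlated(X, threshold=0.9):
    """Keep one feature out of every highly correlated pair (|r| > threshold)."""
    keep = list(range(X.shape[1]))
    for i in range(X.shape[1]):
        if i not in keep:
            continue
        for j in range(i + 1, X.shape[1]):
            if j in keep and abs(pearsonr(X[:, i], X[:, j])[0]) > threshold:
                keep.remove(j)
    return keep

X, y = np.random.rand(200, 12), np.random.randint(0, 2, 200)  # placeholder data
cols = drop_correlated(X)

ensemble = VotingClassifier(
    estimators=[("smo", SVC()), ("vp", Perceptron()), ("ada", AdaBoostClassifier())],
    voting="hard",   # majority vote, since Perceptron has no predict_proba
)
ensemble.fit(X[:, cols], y)
print(ensemble.predict(X[:5, cols]))
```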
JavaScript is a popular attack vector for delivering malicious payloads to unsuspecting Internet users. Authors of this malicious JavaScript often employ numerous obfuscation techniques to prevent automatic detection by antivirus software and to hinder manual analysis by professional malware analysts. Consequently, this paper presents SAFE-DEOBS, a JavaScript deobfuscation tool that we have built. The aim of SAFE-DEOBS is to automatically deobfuscate JavaScript malware so that an analyst can more rapidly determine the malicious script's intent. This is achieved through a number of static analyses inspired by techniques from compiler theory. We demonstrate the utility of SAFE-DEOBS through a case study on real-world JavaScript malware, and show that it is a useful addition to a malware analyst's toolset.
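The abstract only says the deobfuscation is built from compiler-style static analyses; the Python toy below illustrates one such pass, constant folding of string concatenations, on an invented expression-tree representation (the real tool rewrites JavaScript ASTs, and its actual passes are not described here).

```python
# Toy illustration of one compiler-style pass used by deobfuscators:
# constant folding, which collapses 'eva' + 'l' + '(code)' into 'eval(code)'.
# The node classes are invented for this sketch.
from dataclasses import dataclass

@dataclass
class Lit:            # literal value, e.g. a string fragment
    value: str

@dataclass
class Concat:         # binary '+' node
    left: object
    right: object

def fold(node):
    """Recursively replace Concat(Lit, Lit) with a single Lit."""
    if isinstance(node, Concat):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, Lit) and isinstance(right, Lit):
            return Lit(left.value + right.value)
        return Concat(left, right)
    return node

expr = Concat(Concat(Lit("eva"), Lit("l")), Lit("(code)"))   # obfuscated call name
print(fold(expr))     # Lit(value='eval(code)')
```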
A biometric system for controlling access to information resources has been developed. The software and hardware complex is designed to protect information resources and personal data from unauthorized access using the principle of user authentication by fingerprints. In the developed complex, the traditional entry of a login and password is replaced by placing a finger on a fingerprint scanner. The system automatically recognizes the fingerprint and grants access to the information resource, encrypts personal data, and automates the authorization process on the web resource. The web application was implemented using the Bootstrap framework, the 000webhost web server, the phpMyAdmin database server, the PHP scripting language, and the HTML hypertext markup language, along with cascading style sheets and embedded scripts (JavaScript), resulting in a full-fledged website and a Google Chrome extension that can be integrated into other systems. A structural schematic diagram was produced, and a design for the device is proposed. The algorithm of the program's operation was developed, and the program for the device's operation was written in the C language.
The early discovery of security bugs in JavaScript (JS) engines is crucial for protecting Internet users from adversaries abusing zero-day vulnerabilities. Browser vendors, bug bounty hunters, and security researchers have been eager to find such security bugs by leveraging state-of-the-art fuzzers as well as their domain expertise. They report a bug when observing a crash after executing a JS test, since a crash is an early indicator of a potential bug. However, it is difficult to identify whether such a crash indeed invokes a security bug in a JS engine. Thus, unskilled bug reporters are unable to assess the security severity of the new bugs behind their JS engine crashes. Today, this classification of a reported security bug is completely manual, depending on verdicts from JS engine vendors. We investigated the feasibility of applying various machine learning classifiers to determine whether an observed crash triggers a security bug. We designed and implemented CRScope, which classifies security and non-security bugs from given crash-dump files. Our experimental results on 766 crash instances demonstrate that CRScope achieved 0.85, 0.89, and 0.93 Area Under Curve (AUC) for Chakra, V8, and SpiderMonkey crashes, respectively. CRScope also achieved 0.84, 0.89, and 0.95 precision for Chakra, V8, and SpiderMonkey crashes, respectively. This outperforms the previous study and existing tools, including Exploitable and AddressSanitizer. CRScope is capable of learning domain-specific expertise from past verdicts on reported bugs and automatically classifying JS engine security bugs, which helps make the classification of security bugs more scalable.
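The abstract does not state which features or learners CRScope uses, so the sketch below only illustrates the general shape of such a classifier: bag-of-words features over crash-dump text, a simple model trained on past verdicts, and AUC as the reported metric. All sample dump texts and labels are placeholders.

```python
# Illustrative (not CRScope's actual) pipeline: vectorise crash-dump text,
# train a classifier on past security/non-security verdicts, report AUC.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

crash_dumps = [
    "SEGV on unknown address while writing to 0x41414141",      # placeholder dump texts
    "assertion failure: !isLive(cell) during GC sweep",
    "heap-buffer-overflow READ of size 8 in TypedArray load",
    "out of memory while allocating a parse node",
] * 50
labels = [1, 0, 1, 0] * 50   # 1 = security bug, 0 = non-security (past vendor verdicts)

X = TfidfVectorizer().fit_transform(crash_dumps)
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, stratify=labels, random_state=0)

clf = LogisticRegression().fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```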
The rapid rise of cyber-crime activities and the growing number of devices threatened by them place software security issues in the spotlight. As around 90% of all attacks exploit known types of security issues, finding vulnerable components and applying existing mitigation techniques is a viable practical approach for fighting cyber-crime. In this paper, we investigate how state-of-the-art machine learning techniques, including a popular deep learning algorithm, perform in predicting functions with possible security vulnerabilities in JavaScript programs. We applied 8 machine learning algorithms to build prediction models using a new dataset constructed for this research from the vulnerability information in the public databases of the Node Security Project and the Snyk platform, together with code-fixing patches from GitHub. We used static source code metrics as predictors and an extensive grid-search algorithm to find the best performing models. We also examined the effect of various re-sampling strategies for handling the imbalanced nature of the dataset. The best performing algorithm was KNN, which created a model for predicting vulnerable functions with an F-measure of 0.76 (0.91 precision and 0.66 recall). Moreover, deep learning, tree- and forest-based classifiers, and SVM were competitive, with F-measures over 0.70. Although the F-measures did not vary significantly across re-sampling strategies, the distribution of precision and recall did change: using no re-sampling produced models that prefer high precision, while the re-sampling strategies balanced the two measures.
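A rough sketch of the experimental setup described above, with clearly labelled assumptions: the static metrics and labels are random placeholders, SMOTE (from the imbalanced-learn package) stands in for the paper's re-sampling strategies, and the grid search covers only the number of neighbours.

```python
# Assumed setup: static code metrics as predictors, KNN as the learner,
# grid search over k, optional SMOTE re-sampling for the class imbalance.
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import f1_score, precision_score, recall_score
from imblearn.over_sampling import SMOTE   # assumes imbalanced-learn is installed

rng = np.random.default_rng(0)
X = rng.random((500, 5))                   # placeholder metrics: LOC, McCabe, nesting, params, fan-out
y = (rng.random(500) < 0.1).astype(int)    # ~10% vulnerable functions: imbalanced

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
X_tr, y_tr = SMOTE(random_state=0).fit_resample(X_tr, y_tr)   # balance the training set

grid = GridSearchCV(KNeighborsClassifier(), {"n_neighbors": [1, 3, 5, 9]}, scoring="f1")
grid.fit(X_tr, y_tr)
pred = grid.predict(X_te)
print(precision_score(y_te, pred), recall_score(y_te, pred), f1_score(y_te, pred))
```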
In this paper we examine the use of covert channels based on CPU load in order to achieve persistent user identification across browser sessions. In particular, we demonstrate that an HTML5 video, a GIF image, or CSS animations on a webpage can be used to force the CPU to produce a sequence of distinct load levels, even without JavaScript or any client-side code. These load levels can then be captured either by another browsing session, running on the same or a different browser in parallel to the browsing session we want to identify, or by a malicious app installed on the device. To get a good estimation of the CPU load caused by the target session, the receiver can observe system statistics about CPU activity (app), or constantly measure the time it takes to execute a known code segment (app and browser). Furthermore, for mobile devices we propose a sensor-based approach to estimate the CPU load, based on exploiting disturbances of the magnetometer sensor data caused by high CPU activity. Captured loads can be decoded and translated into an identifying bit string, which is transmitted back to the attacker. Due to the way the loads are produced, these methods are applicable even in highly restrictive browsers, such as the Tor Browser, and run unnoticeably to the end user. Therefore, unlike existing ways of web tracking, our methods circumvent most of the existing countermeasures, as they store the identifying information outside the browsing session being targeted. Finally, we also thoroughly evaluate and assess each presented method of generating and receiving the signal, and provide an overview of potential countermeasures.
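The paper generates the load levels without any client-side code (via video, GIF, or CSS animations); the Python sketch below only illustrates the encoding and decoding idea, with a busy loop as the sender, a timed probe workload as the receiver, and an arbitrary slot length and threshold. A real attack would also need a synchronisation preamble.

```python
# Toy covert channel: the sender forces high or low CPU load in fixed time
# slots; the receiver recovers bits by timing a fixed probe workload per slot.
import time, threading

SLOT = 0.25   # seconds per transmitted bit (arbitrary choice)

def send(bits):
    for b in bits:
        end = time.time() + SLOT
        if b == 1:
            while time.time() < end:
                pass              # busy loop -> high CPU load
        else:
            time.sleep(SLOT)      # idle -> low CPU load

def receive(n_bits, threshold=None):
    samples = []
    for _ in range(n_bits):
        start = time.perf_counter()
        sum(i * i for i in range(200_000))          # fixed probe workload
        samples.append(time.perf_counter() - start)
        # pad the rest of the slot so sender and receiver stay roughly aligned
        time.sleep(max(0.0, SLOT - (time.perf_counter() - start)))
    cut = threshold or (min(samples) + max(samples)) / 2
    return [1 if s > cut else 0 for s in samples]

threading.Thread(target=send, args=([1, 0, 1, 1, 0],)).start()
print(receive(5))   # ideally recovers [1, 0, 1, 1, 0]
```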
JavaScript (JS) is frequently used to add functionality and enhance the usability of web applications. Despite the many advantages and usefulness of JS, an unfortunate fact is that many recent cyberattacks, such as drive-by-download attacks, exploit vulnerabilities via JS code. In general, malicious JS code is not easy to detect, because it sneakily exploits vulnerabilities in browsers and plugin software and attacks visitors of a web site without their knowledge. To protect users from such threats, an accurate detection system for malicious JS is needed. Conventional approaches often employ signature- and heuristic-based methods, which are prone to zero-day attacks, i.e., causing many false negatives and/or false positives. To address this problem, this paper adopts a machine-learning approach to feature learning called Doc2Vec, a neural network model that can learn the context information of texts. The extracted features are given to a classifier model (e.g., SVM or neural networks), which judges the maliciousness of a JS code. In the performance evaluation, we use the D3M dataset (Drive-by-Download Data by Marionette) for malicious JS code and JSUPACK for benign code, for both training and testing purposes. We then compare the performance to other feature learning methods. Our experimental results show that the proposed Doc2Vec features provide better accuracy and faster classification in malicious JS code detection than conventional approaches.
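A compact sketch of the described pipeline, with assumptions: whitespace tokenisation, a tiny placeholder corpus instead of the D3M/JSUPACK datasets, and roughly default Doc2Vec and SVM hyper-parameters, none of which are the paper's actual settings.

```python
# Sketch: embed JS source with gensim's Doc2Vec, feed the vectors to an SVM.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.svm import SVC

scripts = [
    ("var x = document.getElementById('a');", 0),     # benign (placeholder)
    ("eval(unescape('%75%6e%70%61%63%6b'));", 1),     # malicious (placeholder)
] * 20
corpus = [TaggedDocument(words=src.split(), tags=[i]) for i, (src, _) in enumerate(scripts)]

d2v = Doc2Vec(corpus, vector_size=32, min_count=1, epochs=40)
X = [d2v.infer_vector(src.split()) for src, _ in scripts]
y = [label for _, label in scripts]

clf = SVC().fit(X, y)
print(clf.predict([d2v.infer_vector("eval(payload)".split())]))
```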
Client-side JavaScript has become ubiquitous in web applications to improve user experience and reduce server load. However, since clients are untrusted, servers cannot rely on the confidentiality or integrity of client-side JavaScript code and the data that it operates on. For example, client-side input validation must be repeated at server side, and confidential business logic cannot be offloaded. In this paper, we present TrustJS, a framework that enables trustworthy execution of security-sensitive JavaScript inside commodity browsers. TrustJS leverages trusted hardware support provided by Intel SGX to protect the client-side execution of JavaScript, enabling a flexible partitioning of web application code. We present the design of TrustJS and provide initial evaluation results, showing that trustworthy JavaScript offloading can further improve user experience and conserve more server resources.
Developers of applications relying on cryptographic libraries can easily make mistakes in their use. Popular dynamic languages such as JavaScript make testing or verifying such applications particularly challenging. In this paper, we present our ongoing work toward a methodology for automatically checking security properties in JavaScript code. Our main idea is to attach security annotations to values that encode properties of interest. We illustrate our idea using examples and, as an initial step in our line of work, we present a formalization of security annotations in a statically typed lambda calculus. As next steps, we will translate our annotations to a dynamically typed formalization of JavaScript such as λJS and implement a runtime-checked type extension using code instrumentation for full JavaScript.
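The paper formalises security annotations in a statically typed lambda calculus; the sketch below mimics the idea dynamically in Python, with invented annotation names ("encrypted") and an invented sink (send_over_network) that checks for the annotations it requires before proceeding.

```python
# Mock-up of the security-annotation idea: values carry a set of annotations,
# and sensitive operations check for the annotations they require.
class Annotated:
    def __init__(self, value, annotations=frozenset()):
        self.value = value
        self.annotations = frozenset(annotations)

def encrypt(a: Annotated) -> Annotated:
    ciphertext = bytes(b ^ 0x55 for b in a.value)    # toy "encryption"
    return Annotated(ciphertext, a.annotations | {"encrypted"})

def send_over_network(a: Annotated):
    if "encrypted" not in a.annotations:
        raise TypeError("refusing to send a value lacking the 'encrypted' annotation")
    print("sending", a.value)

msg = Annotated(b"card number")
send_over_network(encrypt(msg))   # ok: annotation present
send_over_network(msg)            # raises TypeError: missing annotation
```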
In recent years, with the advances in JavaScript engines and the adoption of HTML5 APIs, web applications have begun to shift their functionality from the server side towards the client side, resulting in dense and complex interactions with HTML documents through the Document Object Model (DOM). As a consequence, client-side vulnerabilities are becoming more and more prevalent. In this paper, we focus on DOM-sourced Cross-site Scripting (XSS), a severe but not well-studied kind of vulnerability that appears in browser extensions. Compared with conventional DOM-based XSS, DOM-sourced XSS introduces a new attack surface in which the DOM itself can become a vulnerable source, in addition to common sources such as URLs and form inputs. To discover such vulnerabilities, we propose a detection framework employing hybrid analysis with two phases. The first phase is a lightweight static analysis consisting of a text filter and an abstract syntax tree parser, which produces potentially vulnerable candidates. The second phase is dynamic symbolic execution with an additional component named the shadow DOM, which generates a document as a proof-of-concept exploit. In our large-scale real-world experiment, 58 previously unknown DOM-sourced XSS vulnerabilities were discovered in user scripts of the popular browser extension Greasemonkey.
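Only the first, lightweight phase is sketched below, and in simplified form: a regex text filter that flags user scripts reading from the DOM and writing to an HTML-injection sink. The source and sink lists are illustrative; the actual framework also parses the abstract syntax tree, and its second phase applies dynamic symbolic execution with a shadow DOM.

```python
# Simplified stand-in for the framework's lightweight static filter:
# flag scripts that both read from the DOM and write to an injection sink.
import re

DOM_SOURCES = [r"document\.getElementById", r"querySelector", r"\.textContent", r"\.innerText"]
INJECTION_SINKS = [r"\.innerHTML\s*=", r"document\.write\(", r"\$\(.+\)\.html\("]

def is_candidate(js_source: str) -> bool:
    reads_dom = any(re.search(p, js_source) for p in DOM_SOURCES)
    writes_sink = any(re.search(p, js_source) for p in INJECTION_SINKS)
    return reads_dom and writes_sink

userscript = "var t = document.getElementById('msg').textContent; box.innerHTML = t;"
print(is_candidate(userscript))   # True -> pass on to the dynamic phase
```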
Taint analysis has been used in numerous scripting languages such as Perl and Ruby to defend against various forms of code injection attacks, such as cross-site scripting (XSS) and SQL injection. However, most taint analysis systems simply fail when tainted information is used in a possibly unsafe manner. In this paper, we explore how precise taint tracking can be used to secure web content. Rather than simply crashing, we propose that a library-writer-defined sanitization function can instead be applied to the tainted portions of a string. With this approach, library writers or framework developers can design their tools to be resilient even if inexperienced developers misuse these libraries in unsafe ways. In other words, developer mistakes do not have to result in system crashes to guarantee security. We implement both coarse-grained and precise taint tracking in JavaScript, and show how our precise taint tracking API can be used to defend against SQL injection and XSS attacks. We further evaluate the performance of this approach, showing that precise taint tracking involves an overhead of approximately 22%.
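The paper's implementation is in JavaScript; the Python mock-up below captures the core idea only: a string type that records which character ranges are tainted, plus a sink helper that applies a library-writer-defined sanitiser just to those ranges instead of aborting. The TaintedStr class, html.escape sanitiser, and example markup are all illustrative.

```python
# Mock-up of precise taint tracking: taint is tracked per character range,
# and a sink sanitises only the tainted spans.
import html

class TaintedStr:
    def __init__(self, text, tainted_ranges=()):
        self.text = text
        self.tainted_ranges = list(tainted_ranges)   # list of (start, end) pairs

    def __add__(self, other):
        if isinstance(other, TaintedStr):
            offset = len(self.text)
            shifted = [(s + offset, e + offset) for s, e in other.tainted_ranges]
            return TaintedStr(self.text + other.text, self.tainted_ranges + shifted)
        return TaintedStr(self.text + other, self.tainted_ranges)

def sanitize(ts: TaintedStr, sanitizer) -> str:
    """Apply the library-defined sanitizer only to the tainted spans."""
    out, last = [], 0
    for start, end in sorted(ts.tainted_ranges):
        out.append(ts.text[last:start])
        out.append(sanitizer(ts.text[start:end]))
        last = end
    out.append(ts.text[last:])
    return "".join(out)

payload = "<script>alert(1)</script>"
user_input = TaintedStr(payload, [(0, len(payload))])   # fully tainted user data
page = TaintedStr("<p>") + user_input + "</p>"
print(sanitize(page, html.escape))   # tainted span escaped, surrounding markup kept
```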