Biblio

List
Filter

Found 346 results

Filters: Keyword is data mining [Clear All Filters]

2020-07-10

Yulianto, Arief Dwi, Sukarno, Parman, Warrdana, Aulia Arif, Makky, Muhammad Al. 2019. Mitigation of Cryptojacking Attacks Using Taint Analysis. 2019 4th International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE). :234—238.

Cryptojacking (also called malicious cryptocurrency mining or cryptomining) is a new threat model using CPU resources covertly “mining” a cryptocurrency in the browser. The impact is a surge in CPU Usage and slows the system performance. In this research, in-browsercryptojacking mitigation has been built as an extension in Google Chrome using Taint analysis method. The method used in this research is attack modeling with abuse case using the Man-In-The-Middle (MITM) attack as a testing for mitigation. The proposed model is designed so that users will be notified if a cryptojacking attack occurs. Hence, the user is able to check the script characteristics that run on the website background. The results of this research show that the taint analysis is a promising method to mitigate cryptojacking attacks. From 100 random sample websites, the taint analysis method can detect 19 websites that are infcted by cryptojacking.

Tahir, Rashid, Durrani, Sultan, Ahmed, Faizan, Saeed, Hammas, Zaffar, Fareed, Ilyas, Saqib. 2019. The Browsers Strike Back: Countering Cryptojacking and Parasitic Miners on the Web. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications. :703—711.

With the recent boom in the cryptocurrency market, hackers have been on the lookout to find novel ways of commandeering users' machine for covert and stealthy mining operations. In an attempt to expose such under-the-hood practices, this paper explores the issue of browser cryptojacking, whereby miners are secretly deployed inside browser code without the knowledge of the user. To this end, we analyze the top 50k websites from Alexa and find a noticeable percentage of sites that are indulging in this exploitative exercise often using heavily obfuscated code. Furthermore, mining prevention plug-ins, such as NoMiner, fail to flag such cleverly concealed instances. Hence, we propose a machine learning solution based on hardware-assisted profiling of browser code in real-time. A fine-grained micro-architectural footprint allows us to classify mining applications with \textbackslashtextgreater99% accuracy and even flags them if the mining code has been heavily obfuscated or encrypted. We build our own browser extension and show that it outperforms other plug-ins. The proposed design has negligible overhead on the user's machine and works for all standard off-the-shelf CPUs.

2020-07-09

Fahrenkrog-Petersen, Stephan A., van der Aa, Han, Weidlich, Matthias. 2019. PRETSA: Event Log Sanitization for Privacy-aware Process Discovery. 2019 International Conference on Process Mining (ICPM). :1—8.

Event logs that originate from information systems enable comprehensive analysis of business processes, e.g., by process model discovery. However, logs potentially contain sensitive information about individual employees involved in process execution that are only partially hidden by an obfuscation of the event data. In this paper, we therefore address the risk of privacy-disclosure attacks on event logs with pseudonymized employee information. To this end, we introduce PRETSA, a novel algorithm for event log sanitization that provides privacy guarantees in terms of k-anonymity and t-closeness. It thereby avoids disclosure of employee identities, their membership in the event log, and their characterization based on sensitive attributes, such as performance information. Through step-wise transformations of a prefix-tree representation of an event log, we maintain its high utility for discovery of a performance-annotated process model. Experiments with real-world data demonstrate that sanitization with PRETSA yields event logs of higher utility compared to methods that exploit frequency-based filtering, while providing the same privacy guarantees.

2020-07-06

Mason, Andrew, Zhao, Yifan, He, Hongmei, Gompelman, Raymon, Mandava, Srikanth. 2019. Online Anomaly Detection of Time Series at Scale. 2019 International Conference on Cyber Situational Awareness, Data Analytics And Assessment (Cyber SA). :1–8.

Cyber breaches can result in disruption to business operations, reputation damage as well as directly affecting the financial stability of the targeted corporations, with potential impacts on future profits and stock values. Automatic network-stream monitoring becomes necessary for cyber situation awareness, and time-series anomaly detection plays an important role in network stream monitoring. This study surveyed recent research on time-series analysis methods in respect of parametric and non-parametric techniques, and popular machine learning platforms for data analysis on streaming data on both single server and cloud computing environments. We believe it provides a good reference for researchers in both academia and industry to select suitable (time series) data analysis techniques, and computing platforms, dependent on the data scale and real-time requirements.

2020-07-03

Jia, Guanbo, Miller, Paul, Hong, Xin, Kalutarage, Harsha, Ban, Tao. 2019. Anomaly Detection in Network Traffic Using Dynamic Graph Mining with a Sparse Autoencoder. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :458—465.

Network based attacks on ecommerce websites can have serious economic consequences. Hence, anomaly detection in dynamic network traffic has become an increasingly important research topic in recent years. This paper proposes a novel dynamic Graph and sparse Autoencoder based Anomaly Detection algorithm named GAAD. In GAAD, the network traffic over contiguous time intervals is first modelled as a series of dynamic bipartite graph increments. One mode projection is performed on each bipartite graph increment and the adjacency matrix derived. Columns of the resultant adjacency matrix are then used to train a sparse autoencoder to reconstruct it. The sum of squared errors between the reconstructed approximation and original adjacency matrix is then calculated. An online learning algorithm is then used to estimate a Gaussian distribution that models the error distribution. Outlier error values are deemed to represent anomalous traffic flows corresponding to possible attacks. In the experiment, a network emulator was used to generate representative ecommerce traffic flows over a time period of 225 minutes with five attacks injected, including SYN scans, host emulation and DDoS attacks. ROC curves were generated to investigate the influence of the autoencoder hyper-parameters. It was found that increasing the number of hidden nodes and their activation level, and increasing sparseness resulted in improved performance. Analysis showed that the sparse autoencoder was unable to encode the highly structured adjacency matrix structures associated with attacks, hence they were detected as anomalies. In contrast, SVD and variants, such as the compact matrix decomposition, were found to accurately encode the attack matrices, hence they went undetected.

2020-06-26

Jaiswal, Prajwal Kumar, Das, Sayari, Panigrahi, Bijaya Ketan. 2019. PMU Based Data Driven Approach For Online Dynamic Security Assessment in Power Systems. 2019 20th International Conference on Intelligent System Application to Power Systems (ISAP). :1—7.

This paper presents a methodology for utilizing Phasor Measurement units (PMUs) for procuring real time synchronized measurements for assessing the security of the power system dynamically. The concept of wide-area dynamic security assessment considers transient instability in the proposed methodology. Intelligent framework based approach for online dynamic security assessment has been suggested wherein the database consisting of critical features associated with the system is generated for a wide range of contingencies, which is utilized to build the data mining model. This data mining model along with the synchronized phasor measurements is expected to assist the system operator in assessing the security of the system pertaining to a particular contingency, thereby also creating possibility of incorporating control and preventive measures in order to avoid any unforeseen instability in the system. The proposed technique has been implemented on IEEE 39 bus system for accurately indicating the security of the system and is found to be quite robust in the case of noise in the measurement data obtained from the PMUs.

2020-06-08

Sun, Wenhua, Wang, Xiaojuan, Jin, Lei. 2019. An Efficient Hash-Tree-Based Algorithm in Mining Sequential Patterns with Topology Constraint. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). :2782–2789.

Warnings happen a lot in real transmission networks. These warnings can affect people's lives. It is significant to analyze the alarm association rules in the network. Many algorithms can help solve this problem but not considering the actual physical significance. Therefore, in this study, we mine the association rules in warning weblogs based on a sequential mining algorithm (GSP) with topology structure. We define a topology constraint from network physical connection data. Under the topology constraint, network nodes have topology relation if they are directly connected or have a common adjacency node. In addition, due to the large amount of data, we implement the hash-tree search method to improve the mining efficiency. The theoretical solution is feasible and the simulation results verify our method. In simulation, the topology constraint improves the accuracy for 86%-96% and decreases the run time greatly at the same time. The hash-tree based mining results show that hash tree efficiency improvements are in 3-30% while the number of patterns remains unchanged. In conclusion, using our method can mine association rules efficiently and accurately in warning weblogs.

Hu, Qin, Wang, Shengling, Cheng, Xiuzhen. 2019. A Game Theoretic Analysis on Block Withholding Attacks Using the Zero-Determinant Strategy. 2019 IEEE/ACM 27th International Symposium on Quality of Service (IWQoS). :1–10.

In Bitcoin's incentive system that supports open mining pools, block withholding attacks incur huge security threats. In this paper, we investigate the mutual attacks among pools as this determines the macroscopic utility of the whole distributed system. Existing studies on pools' interactive attacks usually employ the conventional game theory, where the strategies of the players are considered pure and equal, neglecting the existence of powerful strategies and the corresponding favorable game results. In this study, we take advantage of the Zero-Determinant (ZD) strategy to analyze the block withholding attack between any two pools, where the ZD adopter has the unilateral control on the expected payoffs of its opponent and itself. In this case, we are faced with the following questions: who can adopt the ZD strategy? individually or simultaneously? what can the ZD player achieve? In order to answer these questions, we derive the conditions under which two pools can individually or simultaneously employ the ZD strategy and demonstrate the effectiveness. To the best of our knowledge, we are the first to use the ZD strategy to analyze the block withholding attack among pools.

2020-06-03

Cedillo, Priscila, Camacho, Jessica, Campos, Karina, Bermeo, Alexandra. 2019. A Forensics Activity Logger to Extract User Activity from Mobile Devices. 2019 Sixth International Conference on eDemocracy eGovernment (ICEDEG). :286—290.

Nowadays, mobile devices have become one of the most popular instruments used by a person on its regular life, mainly due to the importance of their applications. In that context, mobile devices store user's personal information and even more data, becoming a personal tracker for daily activities that provides important information about the user. Derived from this gathering of information, many tools are available to use on mobile devices, with the restrain that each tool only provides isolated information about a specific application or activity. Therefore, the present work proposes a tool that allows investigators to obtain a complete report and timeline of the activities that were performed on the device. This report incorporates the information provided by many sources into a unique set of data. Also, by means of an example, it is presented the operation of the solution, which shows the feasibility in the use of this tool and shows the way in which investigators have to apply the tool.

Qawasmeh, Ethar, Al-Saleh, Mohammed I., Al-Sharif, Ziad A.. 2019. Towards a Generic Approach for Memory Forensics. 2019 Sixth HCT Information Technology Trends (ITT). :094—098.

The era of information technology has, unfortunately, contributed to the tremendous rise in the number of criminal activities. However, digital artifacts can be utilized in convicting cybercriminal and exposing their activities. The digital forensics science concerns about all aspects related to cybercrimes. It seeks digital evidence by following standard methodologies to be admitted in court rooms. This paper concerns about memory forensics for the unique artifacts it holds. Memory contains information about the current state of systems and applications. Moreover, an application's data explains how a criminal has been interacting the application just before the memory is acquired. Memory forensics at the application level is currently random and cumbersome. Targeting specific applications is what forensic researchers and practitioner are currently striving to provide. This paper suggests a general solution to investigate any application. Our solution aims to utilize an application's data structures and variables' information in the investigation process. This is because an application's data has to be stored and retrieved in the means of variables. Data structures and variables' information can be generated by compilers for debugging purposes. We show that an application's information is a valuable resource to the investigator.

2020-05-22

Platonov, A.V., Poleschuk, E.A., Bessmertny, I. A., Gafurov, N. R.. 2018. Using quantum mechanical framework for language modeling and information retrieval. 2018 IEEE 12th International Conference on Application of Information and Communication Technologies (AICT). :1—4.

This article shows the analogy between natural language texts and quantum-like systems on the example of the Bell test calculating. The applicability of the well-known Bell test for texts in Russian is investigated. The possibility of using this test for the text separation on the topics corresponding to the user query in information retrieval system is shown.

Geetha, R, Rekha, Pasupuleti, Karthika, S. 2018. Twitter Opinion Mining and Boosting Using Sentiment Analysis. 2018 International Conference on Computer, Communication, and Signal Processing (ICCCSP). :1—4.

Social media has been one of the most efficacious and precise by speakers of public opinion. A strategy which sanctions the utilization and illustration of twitter data to conclude public conviction is discussed in this paper. Sentiments on exclusive entities with diverse strengths and intenseness are stated by public, where these sentiments are strenuously cognate to their personal mood and emotions. To examine the sentiments from natural language texts, addressing various opinions, a lot of methods and lexical resources have been propounded. A path for boosting twitter sentiment classification using various sentiment proportions as meta-level features has been proposed by this article. Analysis of tweets was done on the product iPhone 6.

Khadilkar, Kunal, Kulkarni, Siddhivinayak, Bone, Poojarani. 2018. Plagiarism Detection Using Semantic Knowledge Graphs. 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA). :1—6.

Every day, huge amounts of unstructured text is getting generated. Most of this data is in the form of essays, research papers, patents, scholastic articles, book chapters etc. Many plagiarism softwares are being developed to be used in order to reduce the stealing and plagiarizing of Intellectual Property (IP). Current plagiarism softwares are mainly using string matching algorithms to detect copying of text from another source. The drawback of some of such plagiarism softwares is their inability to detect plagiarism when the structure of the sentence is changed. Replacement of keywords by their synonyms also fails to be detected by these softwares. This paper proposes a new method to detect such plagiarism using semantic knowledge graphs. The method uses Named Entity Recognition as well as semantic similarity between sentences to detect possible cases of plagiarism. The doubtful cases are visualized using semantic Knowledge Graphs for thorough analysis of authenticity. Rules for active and passive voice have also been considered in the proposed methodology.

Devarakonda, Ranjeet, Giansiracusa, Michael, Kumar, Jitendra. 2018. Machine Learning and Social Media to Mine and Disseminate Big Scientific Data. 2018 IEEE International Conference on Big Data (Big Data). :5312—5315.

One of the challenges in supplying the communities with wider access to scientific databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it might not be practical to make the public aware of the structure of databases. There is a need for novice users to query relational databases using their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide a more intuitive method for generating database queries and delivering responses. Through social media, which makes it possible to interact with a wide section of the population, and with the help of natural language processing, researchers at the Atmospheric Radiation Measurement (ARM) Data Center at Oak Ridge National Laboratory (ORNL) have developed a concept to enable easy search and retrieval of data from several environmental data centers for the scientific community through social media.Using a machine learning framework that maps natural language text to thousands of datasets, instruments, variables, and data streams, the prototype system would allow users to request data through Twitter and receive a link (via tweet) to applicable data results on the project's search catalog tailored to their key words. This automated identification of relevant data from various petascale archives at ORNL could increase convenience, access, and use of the project's data by the broader community. In this paper we discuss how some data-intensive projects at ORNL are using innovative ways to help in data discovery.

Yan, Donghui, Wang, Yingjie, Wang, Jin, Wang, Honggang, Li, Zhenpeng. 2018. K-nearest Neighbor Search by Random Projection Forests. 2018 IEEE International Conference on Big Data (Big Data). :4775—4781.

K-nearest neighbor (kNN) search has wide applications in many areas, including data mining, machine learning, statistics and many applied domains. Inspired by the success of ensemble methods and the flexibility of tree-based methodology, we propose random projection forests, rpForests, for kNN search. rpForests finds kNNs by aggregating results from an ensemble of random projection trees with each constructed recursively through a series of carefully chosen random projections. rpForests achieves a remarkable accuracy in terms of fast decay in the missing rate of kNNs and that of discrepancy in the kNN distances. rpForests has a very low computational complexity. The ensemble nature of rpForests makes it easily run in parallel on multicore or clustered computers; the running time is expected to be nearly inversely proportional to the number of cores or machines. We give theoretical insights by showing the exponential decay of the probability that neighboring points would be separated by ensemble random projection trees when the ensemble size increases. Our theory can be used to refine the choice of random projections in the growth of trees, and experiments show that the effect is remarkable.

Ahsan, Ramoza, Bashir, Muzammil, Neamtu, Rodica, Rundensteiner, Elke A., Sarkozy, Gabor. 2019. Nearest Neighbor Subsequence Search in Time Series Data. 2019 IEEE International Conference on Big Data (Big Data). :2057—2066.

Continuous growth in sensor data and other temporal sequence data necessitates efficient retrieval and similarity search support on these big time series datasets. However, finding exact similarity results, especially at the granularity of subsequences, is known to be prohibitively costly for large data sets. In this paper, we thus propose an efficient framework for solving this exact subsequence similarity match problem, called TINN (TIme series Nearest Neighbor search). Exploiting the range interval diversity properties of time series datasets, TINN captures similarity at two levels of abstraction, namely, relationships among subsequences within each long time series and relationships across distinct time series in the data set. These relationships are compactly organized in an augmented relationship graph model, with the former relationships encoded in similarity vectors at TINN nodes and the later captured by augmented edge types in the TINN Graph. Query processing strategy deploy novel pruning techniques on the TINN Graph, including node skipping, vertical and horizontal pruning, to significantly reduce the number of time series as well as subsequences to be explored. Comprehensive experiments on synthetic and real world time series data demonstrate that our TINN model consistently outperforms state-of-the-art approaches while still guaranteeing to retrieve exact matches.

2020-05-18

Bakhtin, Vadim V., Isaeva, Ekaterina V.. 2019. New TSBuilder: Shifting towards Cognition. 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). :179–181.

The paper reviews a project on the automation of term system construction. TSBuilder (Term System Builder) was developed in 2014 as a multilayer Rosenblatt's perceptron for supervised machine learning, namely 1-3 word terms identification in natural language texts and their rigid categorization. The program is being modified to reduce the rigidity of categorization which will bring text mining more in line with human thinking.We are expanding the range of parameters (semantical, morphological, and syntactical) for categorization, removing the restriction of the term length of three words, using convolution on a continuous sequence of terms, and present the probabilities of a term falling into different categories. The neural network will not assign a single category to a term but give N answers (where N is the number of predefined classes), each of which O ∈ [0, 1] is the probability of the term to belong to a given class.

Kermani, Fatemeh Hojati, Ghanbari, Shirin. 2019. Extractive Persian Summarizer for News Websites. 2019 5th International Conference on Web Research (ICWR). :85–89.

Automatic extractive text summarization is the process of condensing textual information while preserving the important concepts. The proposed method after performing pre-processing on input Persian news articles generates a feature vector of salient sentences from a combination of statistical, semantic and heuristic methods and that are scored and concatenated accordingly. The scoring of the salient features is based on the article's title, proper nouns, pronouns, sentence length, keywords, topic words, sentence position, English words, and quotations. Experimental results on measurements including recall, F-measure, ROUGE-N are presented and compared to other Persian summarizers and shown to provide higher performance.

2020-05-11

Memon, Raheel Ahmed, Li, Jianping, Ahmed, Junaid, Khan, Asif, Nazir, M. Irshad, Mangrio, M. Ismail. 2018. Modeling of Blockchain Based Systems Using Queuing Theory Simulation. 2018 15th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP). :107–111.

Blockchain is the one of leading technology of this time; it has started to revolutionize several fields like, finance, business, industry, smart home, healthcare, social networks, Internet and the Internet of Things. It has many benefits like, decentralized network, robustness, availability, stability, anonymity, auditability and accountability. The applications of Blockchain are emerging, and it is found that most of the work is focused on its engineering implementation. While the theoretical part is very less considered and explored. In this paper we implemented the simulation of mining process in Blockchain based systems using queuing theory. We took the parameters of one of the mature Cryptocurrency, Bitcoin's real data and simulated using M/M/n/L queuing system in JSIMgraph. We have achieved realistic results; and expect that it will open up new research direction in theoretical research of Blockchain based systems.

Nagamani, Ch., Chittineni, Suneetha. 2018. Network Intrusion Detection Mechanisms Using Outlier Detection. 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT). :1468–1473.

The recognition of intrusions has increased impressive enthusiasm for information mining with the acknowledgment that anomalies can be the key disclosure to be produced using extensive network databases. Intrusions emerge because of different reasons, for example, mechanical deficiencies, changes in framework conduct, fake conduct, human blunder and instrument mistake. Surely, for some applications the revelation of Intrusions prompts more intriguing and helpful outcomes than the disclosure of inliers. Discovery of anomalies can prompt recognizable proof of framework blames with the goal that executives can take preventive measures previously they heighten. A network database framework comprises of a sorted out posting of pages alongside programming to control the network information. This database framework has been intended to empower network operations, oversee accumulations of information, show scientific outcomes and to get to these information utilizing networks. It likewise empowers network clients to gather limitless measure of information on unbounded territories of utilization, break down it and return it into helpful data. Network databases are ordinarily used to help information control utilizing dynamic capacities on sites or for putting away area subordinate data. This database holds a surrogate for each network route. The formation of these surrogates is called ordering and each network database does this errand in an unexpected way. In this paper, a structure for compelling access control and Intrusion Detection using outliers has been proposed and used to give viable Security to network databases. The design of this framework comprises of two noteworthy subsystems to be specific, Access Control Subsystem and Intrusion Detection Subsystem. In this paper preprocessing module is considered which clarifies the preparing of preprocessing the accessible information. And rain forest method is discussed which is used for intrusion detection.

2020-05-08

Katasev, Alexey S., Emaletdinova, Lilia Yu., Kataseva, Dina V.. 2018. Neural Network Spam Filtering Technology. 2018 International Conference on Industrial Engineering, Applications and Manufacturing (ICIEAM). :1—5.

In this paper we solve the problem of neural network technology development for e-mail messages classification. We analyze basic methods of spam filtering such as a sender IP-address analysis, spam messages repeats detection and the Bayesian filtering according to words. We offer the neural network technology for solving this problem because the neural networks are universal approximators and effective in addressing the problems of classification. Also, we offer the scheme of this technology for e-mail messages “spam”/“not spam” classification. The creation of effective neural network model of spam filtering is performed within the databases knowledge discovery technology. For this training set is formed, the neural network model is trained, its value and classifying ability are estimated. The experimental studies have shown that a developed artificial neural network model is adequate and it can be effectively used for the e-mail messages classification. Thus, in this paper we have shown the possibility of the effective neural network model use for the e-mail messages filtration and have shown a scheme of artificial neural network model use as a part of the e-mail spam filtering intellectual system.

2020-04-17

Alim, Adil, Zhao, Xujiang, Cho, Jin-Hee, Chen, Feng. 2019. Uncertainty-Aware Opinion Inference Under Adversarial Attacks. 2019 IEEE International Conference on Big Data (Big Data). :6—15.

Inference of unknown opinions with uncertain, adversarial (e.g., incorrect or conflicting) evidence in large datasets is not a trivial task. Without proper handling, it can easily mislead decision making in data mining tasks. In this work, we propose a highly scalable opinion inference probabilistic model, namely Adversarial Collective Opinion Inference (Adv-COI), which provides a solution to infer unknown opinions with high scalability and robustness under the presence of uncertain, adversarial evidence by enhancing Collective Subjective Logic (CSL) which is developed by combining SL and Probabilistic Soft Logic (PSL). The key idea behind the Adv-COI is to learn a model of robust ways against uncertain, adversarial evidence which is formulated as a min-max problem. We validate the out-performance of the Adv-COI compared to baseline models and its competitive counterparts under possible adversarial attacks on the logic-rule based structured data and white and black box adversarial attacks under both clean and perturbed semi-synthetic and real-world datasets in three real world applications. The results show that the Adv-COI generates the lowest mean absolute error in the expected truth probability while producing the lowest running time among all.

Burgess, Jonah, Carlin, Domhnall, O'Kane, Philip, Sezer, Sakir. 2019. MANiC: Multi-step Assessment for Crypto-miners. 2019 International Conference on Cyber Security and Protection of Digital Services (Cyber Security). :1—8.

Modern Browsers have become sophisticated applications, providing a portal to the web. Browsers host a complex mix of interpreters such as HTML and JavaScript, allowing not only useful functionality but also malicious activities, known as browser-hijacking. These attacks can be particularly difficult to detect, as they usually operate within the scope of normal browser behaviour. CryptoJacking is a form of browser-hijacking that has emerged as a result of the increased popularity and profitability of cryptocurrencies, and the introduction of new cryptocurrencies that promote CPU-based mining. This paper proposes MANiC (Multi-step AssessmeNt for Crypto-miners), a system to detect CryptoJacking websites. It uses regular expressions that are compiled in accordance with the API structure of different miner families. This allows the detection of crypto-mining scripts and the extraction of parameters that could be used to detect suspicious behaviour associated with CryptoJacking. When MANiC was used to analyse the Alexa top 1m websites, it detected 887 malicious URLs containing miners from 11 different families and demonstrated favourable results when compared to related CryptoJacking research. We demonstrate that MANiC can be used to provide insights into this new threat, to identify new potential features of interest and to establish a ground-truth dataset, assisting future research.

2020-04-13

Chowdhury, Nahida Sultana, Raje, Rajeev R.. 2019. SERS: A Security-Related and Evidence-Based Ranking Scheme for Mobile Apps. 2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). :130–139.

In recent years, the number of smart mobile devices has rapidly increased worldwide. This explosion of continuously connected mobile devices has resulted in an exponential growth in the number of publically available mobile Apps. To facilitate the selection of mobile Apps, from various available choices, the App distribution platforms typically rank/recommend Apps based on average star ratings, the number of downloads, and associated reviews - the external aspect of an App. However, these ranking schemes typically tend to ignore critical internal aspects (e.g., security vulnerabilities) of the Apps. Such an omission of internal aspects is certainly not desirable, especially when many of the users do not possess the necessary skills to evaluate the internal aspects and choose an App based on the default ranking scheme which uses the external aspect. In this paper, we build upon our earlier efforts by focusing specifically on the security-related internal aspect of an App and its combination with the external aspect computed from the user reviews by identifying security-related comments.We use this combination to rank-order similar Apps. We evaluate our approach on publicly available Apps from the Google PlayStore and compare our ranking with prevalent ranking techniques such as the average star ratings. The experimental results indicate the effectiveness of our proposed approach.

2020-04-10

Chapla, Happy, Kotak, Riddhi, Joiser, Mittal. 2019. A Machine Learning Approach for URL Based Web Phishing Using Fuzzy Logic as Classifier. 2019 International Conference on Communication and Electronics Systems (ICCES). :383—388.

Phishing is the major problem of the internet era. In this era of internet the security of our data in web is gaining an increasing importance. Phishing is one of the most harmful ways to unknowingly access the credential information like username, password or account number from the users. Users are not aware of this type of attack and later they will also become a part of the phishing attacks. It may be the losses of financial found, personal information, reputation of brand name or trust of brand. So the detection of phishing site is necessary. In this paper we design a framework of phishing detection using URL.