Visible to the public Biblio

Filters: Keyword is malware datasets  [Clear All Filters]
2020-10-29
Tran, Trung Kien, Sato, Hiroshi, Kubo, Masao.  2019.  Image-Based Unknown Malware Classification with Few-Shot Learning Models. 2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW). :401—407.

Knowing malware types in every malware attacks is very helpful to the administrators to have proper defense policies for their system. It must be a massive benefit for the organization as well as the social if the automatic protection systems could themselves detect, classify an existence of new malware types in the whole network system with a few malware samples. This feature helps to prevent the spreading of malware as soon as any damage is caused to the networks. An approach introduced in this paper takes advantage of One-shot/few-shot learning algorithms in solving the malware classification problems by using some well-known models such as Matching Networks, Prototypical Networks. To demonstrate an efficiency of the approach, we run the experiments on the two malware datasets (namely, MalImg and Microsoft Malware Classification Challenge), and both experiments all give us very high accuracies. We confirm that if applying models correctly from the machine learning area could bring excellent performance compared to the other traditional methods, open a new area of malware research.

2018-05-01
Korczynski, M., Tajalizadehkhoob, S., Noroozian, A., Wullink, M., Hesselman, C., v Eeten, M..  2017.  Reputation Metrics Design to Improve Intermediary Incentives for Security of TLDs. 2017 IEEE European Symposium on Security and Privacy (EuroS P). :579–594.

Over the years cybercriminals have misused the Domain Name System (DNS) - a critical component of the Internet - to gain profit. Despite this persisting trend, little empirical information about the security of Top-Level Domains (TLDs) and of the overall 'health' of the DNS ecosystem exists. In this paper, we present security metrics for this ecosystem and measure the operational values of such metrics using three representative phishing and malware datasets. We benchmark entire TLDs against the rest of the market. We explicitly distinguish these metrics from the idea of measuring security performance, because the measured values are driven by multiple factors, not just by the performance of the particular market player. We consider two types of security metrics: occurrence of abuse and persistence of abuse. In conjunction, they provide a good understanding of the overall health of a TLD. We demonstrate that attackers abuse a variety of free services with good reputation, affecting not only the reputation of those services, but of entire TLDs. We find that, when normalized by size, old TLDs like .com host more bad content than new generic TLDs. We propose a statistical regression model to analyze how the different properties of TLD intermediaries relate to abuse counts. We find that next to TLD size, abuse is positively associated with domain pricing (i.e. registries who provide free domain registrations witness more abuse). Last but not least, we observe a negative relation between the DNSSEC deployment rate and the count of phishing domains.

2018-04-30
Korczynski, M., Tajalizadehkhoob, S., Noroozian, A., Wullink, M., Hesselman, C., v Eeten, M..  2017.  Reputation Metrics Design to Improve Intermediary Incentives for Security of TLDs. 2017 IEEE European Symposium on Security and Privacy (EuroS P). :579–594.

Over the years cybercriminals have misused the Domain Name System (DNS) - a critical component of the Internet - to gain profit. Despite this persisting trend, little empirical information about the security of Top-Level Domains (TLDs) and of the overall 'health' of the DNS ecosystem exists. In this paper, we present security metrics for this ecosystem and measure the operational values of such metrics using three representative phishing and malware datasets. We benchmark entire TLDs against the rest of the market. We explicitly distinguish these metrics from the idea of measuring security performance, because the measured values are driven by multiple factors, not just by the performance of the particular market player. We consider two types of security metrics: occurrence of abuse and persistence of abuse. In conjunction, they provide a good understanding of the overall health of a TLD. We demonstrate that attackers abuse a variety of free services with good reputation, affecting not only the reputation of those services, but of entire TLDs. We find that, when normalized by size, old TLDs like .com host more bad content than new generic TLDs. We propose a statistical regression model to analyze how the different properties of TLD intermediaries relate to abuse counts. We find that next to TLD size, abuse is positively associated with domain pricing (i.e. registries who provide free domain registrations witness more abuse). Last but not least, we observe a negative relation between the DNSSEC deployment rate and the count of phishing domains.

2017-12-20
Paranthaman, R., Thuraisingham, B..  2017.  Malware Collection and Analysis. 2017 IEEE International Conference on Information Reuse and Integration (IRI). :26–31.

This paper describes the various malware datasets that we have obtained permissions to host at the University of Arizona as part of a National Science Foundation funded project. It also describes some other malware datasets that we are in the process of obtaining permissions to host at the University of Arizona. We have also discussed some preliminary work we have carried out on malware analysis using big data platforms.