Visible to the public Biblio

Filters: Keyword is Payloads  [Clear All Filters]
2023-09-18
Cao, Michael, Ahmed, Khaled, Rubin, Julia.  2022.  Rotten Apples Spoil the Bunch: An Anatomy of Google Play Malware. 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE). :1919—1931.
This paper provides an in-depth analysis of Android malware that bypassed the strictest defenses of the Google Play application store and penetrated the official Android market between January 2016 and July 2021. We systematically identified 1,238 such malicious applications, grouped them into 134 families, and manually analyzed one application from 105 distinct families. During our manual analysis, we identified malicious payloads the applications execute, conditions guarding execution of the payloads, hiding techniques applications employ to evade detection by the user, and other implementation-level properties relevant for automated malware detection. As most applications in our dataset contain multiple payloads, each triggered via its own complex activation logic, we also contribute a graph-based representation showing activation paths for all application payloads in form of a control- and data-flow graph. Furthermore, we discuss the capabilities of existing malware detection tools, put them in context of the properties observed in the analyzed malware, and identify gaps and future research directions. We believe that our detailed analysis of the recent, evasive malware will be of interest to researchers and practitioners and will help further improve malware detection tools.
2023-08-03
Liu, Zhichao, Jiang, Yi.  2022.  Cross-Layer Design for UAV-Based Streaming Media Transmission. IEEE Transactions on Circuits and Systems for Video Technology. 32:4710–4723.
Unmanned Aerial Vehicle (UAV)-based streaming media transmission may become unstable when the bit rate generated by the source load exceeds the channel capacity owing to the UAV location and speed change. The change of the location can affect the network connection, leading to reduced transmission rate; the change of the flying speed can increase the video payload due to more I-frames. To improve the transmission reliability, in this paper we design a Client-Server-Ground&User (C-S-G&U) framework, and propose an algorithm of splitting-merging stream (SMS) for multi-link concurrent transmission. We also establish multiple transport links and configure the routing rules for the cross-layer design. The multi-link transmission can achieve higher throughput and significantly smaller end-to-end delay than a single-link especially in a heavy load situation. The audio and video data are packaged into the payload by the Real-time Transport Protocol (RTP) before being transmitted over the User Datagram Protocol (UDP). The forward error correction (FEC) algorithm is implemented to promote the reliability of the UDP transmission, and an encryption algorithm to enhance security. In addition, we propose a Quality of Service (QoS) strategy so that the server and the user can control the UAV to adapt its transmission mode dynamically, according to the load, delay, and packet loss. Our design has been implemented on an engineering platform, whose efficacy has been verified through comprehensive experiments.
Conference Name: IEEE Transactions on Circuits and Systems for Video Technology
2023-06-23
Özdel, Süleyman, Damla Ateş, Pelin, Ateş, Çağatay, Koca, Mutlu, Anarım, Emin.  2022.  Network Anomaly Detection with Payload-based Analysis. 2022 30th Signal Processing and Communications Applications Conference (SIU). :1–4.
Network attacks become more complicated with the improvement of technology. Traditional statistical methods may be insufficient in detecting constantly evolving network attack. For this reason, the usage of payload-based deep packet inspection methods is very significant in detecting attack flows before they damage the system. In the proposed method, features are extracted from the byte distributions in the payload and these features are provided to characterize the flows more deeply by using N-Gram analysis methods. The proposed procedure has been tested on IDS 2012 and 2017 datasets, which are widely used in the literature.
ISSN: 2165-0608
2022-10-20
Butora, Jan, Fridrich, Jessica.  2020.  Steganography and its Detection in JPEG Images Obtained with the "TRUNC" Quantizer. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :2762—2766.
Many portable imaging devices use the operation of "trunc" (rounding towards zero) instead of rounding as the final quantizer for computing DCT coefficients during JPEG compression. We show that this has rather profound consequences for steganography and its detection. In particular, side-informed steganography needs to be redesigned due to the different nature of the rounding error. The steganographic algorithm J-UNIWARD becomes vulnerable to steganalysis with the JPEG rich model and needs to be adjusted for this source. Steganalysis detectors need to be retrained since a steganalyst unaware of the existence of the trunc quantizer will experience 100% false alarm.
Jan, Aiman, Parah, Shabir A., Malik, Bilal A..  2020.  A Novel Laplacian of Gaussian (LoG) and Chaotic Encryption Based Image Steganography Technique. 2020 International Conference for Emerging Technology (INCET). :1—4.
Information sharing through internet has becoming challenge due to high-risk factor of attacks to the information being transferred. In this paper, a novel image-encryption edge based Image steganography technique is proposed. The proposed algorithm uses logistic map for encrypting the information prior to transmission. Laplacian of Gaussian (LoG) edge operator is used to find edge areas of the colored-cover-image. Simulation analysis demonstrates that the proposed algorithm has a good amount of payload along with better results of security analysis. The proposed scheme is compared with the existing-methods.
2022-04-19
Johnson, Andrew, Haddad, Rami J..  2021.  Evading Signature-Based Antivirus Software Using Custom Reverse Shell Exploit. SoutheastCon 2021. :1–6.
Antivirus software is considered to be the primary line of defense against malicious software in modern computing systems. The purpose of this paper is to expose exploitation that can evade Antivirus software that uses signature-based detection algorithms. In this paper, a novel approach was proposed to change the source code of a common Metasploit-Framework used to compile the reverse shell payload without altering its functionality but changing its signature. The proposed method introduced an additional stage to the shellcode program. Instead of the shellcode being generated and stored within the program, it was generated separately and stored on a remote server and then only accessed when the program is executed. This approach was able to reduce its detectability by the Antivirus software by 97% compared to a typical reverse shell program.
A, Meharaj Begum, Arock, Michael.  2021.  Efficient Detection Of SQL Injection Attack(SQLIA) Using Pattern-based Neural Network Model. 2021 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS). :343–347.
Web application vulnerability is one of the major causes of cyber attacks. Cyber criminals exploit these vulnerabilities to inject malicious commands to the unsanitized user input in order to bypass authentication of the database through some cyber-attack techniques like cross site scripting (XSS), phishing, Structured Query Language Injection Attack (SQLIA), malware etc., Although many research works have been conducted to resolve the above mentioned attacks, only few challenges with respect to SQLIA could be resolved. Ensuring security against complete set of malicious payloads are extremely complicated and demanding. It requires appropriate classification of legitimate and injected SQL commands. The existing approaches dealt with limited set of signatures, keywords and symbols of SQL queries to identify the injected queries. This work focuses on extracting SQL injection patterns with the help of existing parsing and tagging techniques. Pattern-based tags are trained and modeled using Multi-layer Perceptron which significantly performs well in classification of queries with accuracy of 94.4% which is better than the existing approaches.
2022-04-01
Kumar Gupta, Lalit, Singh, Aniket, Kushwaha, Abhishek, Vishwakarma, Ashish.  2021.  Analysis of Image Steganography Techniques for Different Image Format. 2021 International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT). :1—6.
Steganography is the method of hiding one type of information into other type of information, hiding a secret a message in a cover so that others can't know the presence of the secret information. It provides an extra layer of security in communication and information sharing. Security is an important aspect of the communication process; everyone want security in communication. The main purpose of this paper is to introduce security of information that people share among them. In this paper we are presenting different methods of substitution techniques of image steganography and their comparison. Least significant bit and most significant bit substitution techniques are used. Information is hidden in an image file and then decoded back for the secret message. Hiding the presence of any hidden information makes this more secure. This implementation can be used by secret service agencies and also common people for secure communication.
2022-03-01
Chaves, Cesar G., Sepulveda, Johanna, Hollstein, Thomas.  2021.  Lightweight Monitoring Scheme for Flooding DoS Attack Detection in Multi-Tenant MPSoCs. 2021 IEEE International Symposium on Circuits and Systems (ISCAS). :1–5.
The increasing use of Multiprocessor Systems-on-Chip (MPSoCs) within scalable multi-tenant systems, such as fog/cloud computing, faces the challenge of potential attacks originated by the execution of malicious tasks. Flooding Denial- of-Service (FDoS) attacks are one of the most common and powerful threats for Network-on-Chip (NoC)-based MPSoCs. Since, by overwhelming the NoC, the system is unable to forward legitimate traffic. However, the effectiveness of FDoS attacks depend on the NoC configuration. Moreover, designing a secure MPSoC capable of detecting such attacks while avoiding excessive power/energy and area costs is challenging. To this end, we present two contributions. First, we demonstrate two types of FDoS attacks: based on the packet injection rate (PIR-based FDoS) and based on the packet's payload length (PPL-based FDoS). We show that fair round-robin NoCs are intrinsically protected against PIR-based FDoS. Instead, PPL-based FDoS attacks represent a real threat to MPSoCs. Second, we propose a novel lightweight monitoring method for detecting communication disruptions. Simulation and synthesis results show the feasibility and efficiency of the presented approach.
2022-02-24
Gondron, Sébastien, Mödersheim, Sebastian.  2021.  Vertical Composition and Sound Payload Abstraction for Stateful Protocols. 2021 IEEE 34th Computer Security Foundations Symposium (CSF). :1–16.
This paper deals with a problem that arises in vertical composition of protocols, i.e., when a channel protocol is used to encrypt and transport arbitrary data from an application protocol that uses the channel. Our work proves that we can verify that the channel protocol ensures its security goals independent of a particular application. More in detail, we build a general paradigm to express vertical composition of an application protocol and a channel protocol, and we give a transformation of the channel protocol where the application payload messages are replaced by abstract constants in a particular way that is feasible for standard automated verification tools. We prove that this transformation is sound for a large class of channel and application protocols. The requirements that channel and application have to satisfy for the vertical composition are all of an easy-to-check syntactic nature.
2022-02-07
Çelık, Abdullah Emre, Dogru, Ibrahim Alper, Uçtu, Göksel.  2021.  Automatic Generation of Different Malware. 2021 29th Signal Processing and Communications Applications Conference (SIU). :1–4.
The use of mobile devices has increased dramatically in recent years. These smart devices allow us to easily perform many functions such as e-mail, internet, Bluetooth, SMS and MMS without restriction of time and place. Thus, these devices have become an indispensable part of our lives today. Due to this high usage, malware developers have turned to this platform and many mobile malware has emerged in recent years. Many security companies and experts have developed methods to protect our mobile devices. In this study, in order to contribute to mobile malware detection and analysis, an application has been implemented that automatically injects payload into normal apk. With this application, it is aimed to create a data set that can be used by security companies and experts.
2021-10-04
Reshikeshan, Sree Subiksha M., Illindala, Mahesh S..  2020.  Systematically Encoded Polynomial Codes to Detect and Mitigate High-Status-Number Attacks in Inter-Substation GOOSE Communications. 2020 IEEE Industry Applications Society Annual Meeting. :1–7.
Inter-substation Generic Object Oriented Substation Events (GOOSE) communications that are used for critical protection functions have several cyber-security vulnerabilities. GOOSE messages are directly mapped to the Layer 2 Ethernet without network and transport layer headers that provide data encapsulation. The high-status-number attack is a malicious attack on GOOSE messages that allows hackers to completely take over intelligent electronic devices (IEDs) subscribing to GOOSE communications. The status-number parameter of GOOSE messages, stNum is tampered with in these attacks. Given the strict delivery time requirement of 3 ms for GOOSE messaging, it is infeasible to encrypt the GOOSE payload. This work proposes to secure the sensitive stNum parameter of the GOOSE payload using systematically encoded polynomial codes. Exploiting linear codes allows for the security features to be encoded in linear time, in contrast to complex hashing algorithms. At the subscribing IED, the security feature is used to verify that the stNum parameter has not been tampered with during transmission in the insecure medium. The decoding and verification using syndrome computation at the subscriber IED is also accomplished in linear time.
2021-09-30
Meraj Ahmed, M, Dhavlle, Abhijitt, Mansoor, Naseef, Sutradhar, Purab, Pudukotai Dinakarrao, Sai Manoj, Basu, Kanad, Ganguly, Amlan.  2020.  Defense Against on-Chip Trojans Enabling Traffic Analysis Attacks. 2020 Asian Hardware Oriented Security and Trust Symposium (AsianHOST). :1–6.
Interconnection networks for multi/many-core processors or server systems are the backbone of the system as they enable data communication among the processing cores, caches, memory and other peripherals. Given the criticality of the interconnects, the system can be severely subverted if the interconnection is compromised. The threat of Hardware Trojans (HTs) penetrating complex hardware systems such as multi/many-core processors are increasing due to the increasing presence of third party players in a System-on-chip (SoC) design. Even by deploying naïve HTs, an adversary can exploit the Network-on-Chip (NoC) backbone of the processor and get access to communication patterns in the system. This information, if leaked to an attacker, can reveal important insights regarding the application suites running on the system; thereby compromising the user privacy and paving the way for more severe attacks on the entire system. In this paper, we demonstrate that one or more HTs embedded in the NoC of a multi/many-core processor is capable of leaking sensitive information regarding traffic patterns to an external malicious attacker; who, in turn, can analyze the HT payload data with machine learning techniques to infer the applications running on the processor. Furthermore, to protect against such attacks, we propose a Simulated Annealing-based randomized routing algorithm in the system. The proposed defense is capable of obfuscating the attacker's data processing capabilities to infer the user profiles successfully. Our experimental results demonstrate that the proposed randomized routing algorithm could reduce the accuracy of identifying user profiles by the attacker from \textbackslashtextgreater98% to \textbackslashtextless; 15% in multi/many-core systems.
2021-09-16
Sah, Love Kumar, Polnati, Srivarsha, Islam, Sheikh Ariful, Katkoori, Srinivas.  2020.  Basic Block Encoding Based Run-Time CFI Check for Embedded Software. 2020 IFIP/IEEE 28th International Conference on Very Large Scale Integration (VLSI-SOC). :135–140.
Modern control flow attacks circumvent existing defense mechanisms to transfer the program control to attacker chosen malicious code in the program, leaving application vulnerable to attack. Advanced attacks such as Return-Oriented Programming (ROP) attack and its variants, transfer program execution to gadgets (code-snippet that ends with return instruction). The code space to generate gadgets is large and attacks using these gadgets are Turing-complete. One big challenge to harden the program against ROP attack is to confine gadget selection to a limited locations, thus leaving the attacker to search entire code space according to payload criteria. In this paper, we present a novel approach to label the nodes of the Control-Flow Graph (CFG) of a program such that labels of the nodes on a valid control flow edge satisfy a Hamming distance property. The newly encoded CFG enables detection of illegal control flow transitions during the runtime in the processor pipeline. Experimentally, we have demonstrated that the proposed Control Flow Integrity (CFI) implementation is effective against control-flow hijacking and the technique can reduce the search space of the ROP gadgets upto 99.28%. We have also validated our technique on seven applications from MiBench and the proposed labeling mechanism incurs no instruction count overhead while, on average, it increases instruction width to a maximum of 12.13%.
2021-06-24
Abirami, R., Wise, D. C. Joy Winnie, Jeeva, R., Sanjay, S..  2020.  Detecting Security Vulnerabilities in Website using Python. 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC). :844–846.
On the current website, there are many undeniable conditions and there is the existence of new plot holes. If data link is normally extracted on each of the websites, it becomes difficult to evaluate each vulnerability, with tolls such as XS S, SQLI, and other such existing tools for vulnerability assessment. Integrated testing criteria for vulnerabilities are met. In addition, the response should be automated and systematic. The primary value of vulnerability Buffer will be made of predefined and self-formatted code written in python, and the software is automated to send reports to their respective users. The vulnerabilities are tried to be classified as accessible. OWASP is the main resource for developing and validating web security processes.
2021-05-05
Kishore, Pushkar, Barisal, Swadhin Kumar, Prasad Mohapatra, Durga.  2020.  JavaScript malware behaviour analysis and detection using sandbox assisted ensemble model. 2020 IEEE REGION 10 CONFERENCE (TENCON). :864—869.

Whenever any internet user visits a website, a scripting language runs in the background known as JavaScript. The embedding of malicious activities within the script poses a great threat to the cyberworld. Attackers take advantage of the dynamic nature of the JavaScript and embed malicious code within the website to download malware and damage the host. JavaScript developers obfuscate the script to keep it shielded from getting detected by the malware detectors. In this paper, we propose a novel technique for analysing and detecting JavaScript using sandbox assisted ensemble model. We extract the payload using malware-jail sandbox to get the real script. Upon getting the extracted script, we analyse it to define the features that are needed for creating the dataset. We compute Pearson's r between every feature for feature extraction. An ensemble model consisting of Sequential Minimal Optimization (SMO), Voted Perceptron and AdaBoost algorithm is used with voting technique to detect malicious JavaScript. Experimental results show that our proposed model can detect obfuscated and de-obfuscated malicious JavaScript with an accuracy of 99.6% and 0.03s detection time. Our model performs better than other state-of-the-art models in terms of accuracy and least training and detection time.

Herrera, Adrian.  2020.  Optimizing Away JavaScript Obfuscation. 2020 IEEE 20th International Working Conference on Source Code Analysis and Manipulation (SCAM). :215—220.

JavaScript is a popular attack vector for releasing malicious payloads on unsuspecting Internet users. Authors of this malicious JavaScript often employ numerous obfuscation techniques in order to prevent the automatic detection by antivirus and hinder manual analysis by professional malware analysts. Consequently, this paper presents SAFE-DEOBS, a JavaScript deobfuscation tool that we have built. The aim of SAFE-DEOBS is to automatically deobfuscate JavaScript malware such that an analyst can more rapidly determine the malicious script's intent. This is achieved through a number of static analyses, inspired by techniques from compiler theory. We demonstrate the utility of SAFE-DEOBS through a case study on real-world JavaScript malware, and show that it is a useful addition to a malware analyst's toolset.

2021-05-03
Lehniger, Kai, Aftowicz, Marcin J., Langendorfer, Peter, Dyka, Zoya.  2020.  Challenges of Return-Oriented-Programming on the Xtensa Hardware Architecture. 2020 23rd Euromicro Conference on Digital System Design (DSD). :154–158.
This paper shows how the Xtensa architecture can be attacked with Return-Oriented-Programming (ROP). The presented techniques include possibilities for both supported Application Binary Interfaces (ABIs). Especially for the windowed ABI a powerful mechanism is presented that not only allows to jump to gadgets but also to manipulate registers without relying on specific gadgets. This paper purely focuses on how the properties of the architecture itself can be exploited to chain gadgets and not on specific attacks or a gadget catalog.
Luo, Lan, Zhang, Yue, Zou, Cliff, Shao, Xinhui, Ling, Zhen, Fu, Xinwen.  2020.  On Runtime Software Security of TrustZone-M Based IoT Devices. GLOBECOM 2020 - 2020 IEEE Global Communications Conference. :1–7.
Internet of Things (IoT) devices have been increasingly integrated into our daily life. However, such smart devices suffer a broad attack surface. Particularly, attacks targeting the device software at runtime are challenging to defend against if IoT devices use resource-constrained microcontrollers (MCUs). TrustZone-M, a TrustZone extension for MCUs, is an emerging security technique fortifying MCU based IoT devices. This paper presents the first security analysis of potential software security issues in TrustZone-M enabled MCUs. We explore the stack-based buffer overflow (BOF) attack for code injection, return-oriented programming (ROP) attack, heap-based BOF attack, format string attack, and attacks against Non-secure Callable (NSC) functions in the context of TrustZone-M. We validate these attacks using the Microchip SAM L11 MCU, which uses the ARM Cortex-M23 processor with the TrustZone-M technology. Strategies to mitigate these software attacks are also discussed.
2021-04-09
Noiprasong, P., Khurat, A..  2020.  An IDS Rule Redundancy Verification. 2020 17th International Joint Conference on Computer Science and Software Engineering (JCSSE). :110—115.
Intrusion Detection System (IDS) is a network security software and hardware widely used to detect anomaly network traffics by comparing the traffics against rules specified beforehand. Snort is one of the most famous open-source IDS system. To write a rule, Snort specifies structure and values in Snort manual. This specification is expressive enough to write in different way with the same meaning. If there are rule redundancy, it could distract performance. We, thus, propose a proof of semantical issues for Snort rule and found four pairs of Snort rule combinations that can cause redundancy. In addition, we create a tool to verify such redundancy between two rules on the public rulesets from Snort community and Emerging threat. As a result of our test, we found several redundancy issues in public rulesets if the user enables commented rules.
2021-04-08
Westland, T., Niu, N., Jha, R., Kapp, D., Kebede, T..  2020.  Relating the Empirical Foundations of Attack Generation and Vulnerability Discovery. 2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science (IRI). :37–44.
Automatically generating exploits for attacks receives much attention in security testing and auditing. However, little is known about the continuous effect of automatic attack generation and detection. In this paper, we develop an analytic model to understand the cost-benefit tradeoffs in light of the process of vulnerability discovery. We develop a three-phased model, suggesting that the cumulative malware detection has a productive period before the rate of gain flattens. As the detection mechanisms co-evolve, the gain will likely increase. We evaluate our analytic model by using an anti-virus tool to detect the thousands of Trojans automatically created. The anti-virus scanning results over five months show the validity of the model and point out future research directions.
2021-03-09
Lee, T., Chang, L., Syu, C..  2020.  Deep Learning Enabled Intrusion Detection and Prevention System over SDN Networks. 2020 IEEE International Conference on Communications Workshops (ICC Workshops). :1—6.

The Software Defined Network (SDN) provides higher programmable functionality for network configuration and management dynamically. Moreover, SDN introduces a centralized management approach by dividing the network into control and data planes. In this paper, we introduce a deep learning enabled intrusion detection and prevention system (DL-IDPS) to prevent secure shell (SSH) brute-force attacks and distributed denial-of-service (DDoS) attacks in SDN. The packet length in SDN switch has been collected as a sequence for deep learning models to identify anomalous and malicious packets. Four deep learning models, including Multilayer Perceptron (MLP), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) and Stacked Auto-encoder (SAE), are implemented and compared for the proposed DL-IDPS. The experimental results show that the proposed MLP based DL-IDPS has the highest accuracy which can achieve nearly 99% and 100% accuracy to prevent SSH Brute-force and DDoS attacks, respectively.

Zhou, B., He, J., Tan, M..  2020.  A Two-stage P2P Botnet Detection Method Based on Statistical Features. 2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS). :497—502.

P2P botnet has become one of the most serious threats to today's network security. It can be used to launch kinds of malicious activities, ranging from spamming to distributed denial of service attack. However, the detection of P2P botnet is always challenging because of its decentralized architecture. In this paper, we propose a two-stage P2P botnet detection method which only relies on several traffic statistical features. This method first detects P2P hosts based on three statistical features, and then distinguishes P2P bots from benign P2P hosts by means of another two statistical features. Experimental evaluations on real-world traffic datasets shows that our method is able to detect hidden P2P bots with a detection accuracy of 99.7% and a false positive rate of only 0.3% within 5 minutes.

2021-03-04
Afreen, A., Aslam, M., Ahmed, S..  2020.  Analysis of Fileless Malware and its Evasive Behavior. 2020 International Conference on Cyber Warfare and Security (ICCWS). :1—8.

Malware is any software that causes harm to the user information, computer systems or network. Modern computing and internet systems are facing increase in malware threats from the internet. It is observed that different malware follows the same patterns in their structure with minimal alterations. The type of threats has evolved, from file-based malware to fileless malware, such kind of threats are also known as Advance Volatile Threat (AVT). Fileless malware is complex and evasive, exploiting pre-installed trusted programs to infiltrate information with its malicious intent. Fileless malware is designed to run in system memory with a very small footprint, leaving no artifacts on physical hard drives. Traditional antivirus signatures and heuristic analysis are unable to detect this kind of malware due to its sophisticated and evasive nature. This paper provides information relating to detection, mitigation and analysis for such kind of threat.

Matin, I. Muhamad Malik, Rahardjo, B..  2020.  A Framework for Collecting and Analysis PE Malware Using Modern Honey Network (MHN). 2020 8th International Conference on Cyber and IT Service Management (CITSM). :1—5.

Nowadays, Windows is an operating system that is very popular among people, especially users who have limited knowledge of computers. But unconsciously, the security threat to the windows operating system is very high. Security threats can be in the form of illegal exploitation of the system. The most common attack is using malware. To determine the characteristics of malware using dynamic analysis techniques and static analysis is very dependent on the availability of malware samples. Honeypot is the most effective malware collection technique. But honeypot cannot determine the type of file format contained in malware. File format information is needed for the purpose of handling malware analysis that is focused on windows-based malware. For this reason, we propose a framework that can collect malware information as well as identify malware PE file type formats. In this study, we collected malware samples using a modern honey network. Next, we performed a feature extraction to determine the PE file format. Then, we classify types of malware using VirusTotal scanning. As the results of this study, we managed to get 1.222 malware samples. Out of 1.222 malware samples, we successfully extracted 945 PE malware. This study can help researchers in other research fields, such as machine learning and deep learning, for malware detection.