Abstract
Modern malware profoundly impacts our digitized planet; its victims range from private individuals to governments and international corporations. It is therefore essential to collect as much malware-related data as possible, since such data advances our understanding of malware and equips us to mitigate its harmful effects. In this work, we present a web-scraping tool, dubbed ScrapeIOC, written in Python 3, which captures hashed malware samples and tags each with either a malware type or a malware family. Additional information about each hashed sample is also recorded. Scraping was conducted via API calls to various websites and ultimately produced a database containing more than 20,000 hashed samples, comprising MD5, SHA1, and SHA256 hashes. This work details each part of the tool’s code, including auxiliary features such as information display and database backups. Toward the end of this work, a second database was also generated, using the ten most common malware families of spring 2020 as tags; it comprises over 4,000 hashed samples.
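As a rough illustration of the kind of record the abstract describes (a hash, its hash type, and a type or family tag), the following minimal Python sketch classifies a hex digest as MD5, SHA1, or SHA256 by its length and pairs it with a tag. The function names `classify_hash` and `make_record` and the record fields are hypothetical and are not taken from ScrapeIOC itself; only the digest lengths (32, 40, and 64 hexadecimal characters) are standard facts.

```python
import re
from typing import Dict, Optional

# Hex digest lengths of the three hash types named in the abstract.
_HASH_LENGTHS = {32: "MD5", 40: "SHA1", 64: "SHA256"}
_HEX_RE = re.compile(r"[0-9a-fA-F]+")


def classify_hash(value: str) -> Optional[str]:
    """Return 'MD5', 'SHA1', or 'SHA256' based on digest length, else None."""
    candidate = value.strip()
    if not _HEX_RE.fullmatch(candidate):
        return None  # empty or contains non-hexadecimal characters
    return _HASH_LENGTHS.get(len(candidate))


def make_record(value: str, tag: str) -> Dict[str, str]:
    """Build a hypothetical database record for one tagged hash.

    The field names are assumptions made for this sketch only.
    """
    hash_type = classify_hash(value)
    if hash_type is None:
        raise ValueError(f"not an MD5, SHA1, or SHA256 hex digest: {value!r}")
    return {"hash": value.strip().lower(), "hash_type": hash_type, "tag": tag}
```

Length alone cannot distinguish hash functions that share a digest size, so a check like this is only a plausibility filter for incoming values, not proof of how a digest was produced.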
Original language | English |
---|---|
Title of host publication | Proceedings of the 2023 IEEE International Conference on Metaverse Computing, Networking and Applications |
Publisher | IEEE |
Publication date | 2023 |
Pages | 124-128 |
ISBN (Print) | 979-8-3503-3334-3 |
ISBN (Electronic) | 979-8-3503-3333-6 |
Publication status | Published - 2023 |
Event | 2023 IEEE International Conference on Metaverse Computing, Networking and Applications - Kyoto, Japan. Duration: 26 Jun 2023 → 28 Jun 2023 |
Conference
Conference | 2023 IEEE International Conference on Metaverse Computing, Networking and Applications |
---|---|
Country/Territory | Japan |
City | Kyoto |
Period | 26/06/2023 → 28/06/2023 |
Keywords
- Malware Detection
- Web Scraping
- Indicators of Compromise
- API
- Search Tag