APT Malware Dataset

This dataset contains over 3,500 malware samples related to 12 APT groups that are allegedly sponsored by 5 different nation-states. It was used to benchmark different machine learning approaches to authorship attribution, and it can be used for future benchmarks or other malware research.

Data Characteristics

The samples in the dataset are distributed as follows:

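| Country     | APT Group      | Family         | Requested | Downloaded |
|-------------|----------------|----------------|-----------|------------|
| China       | APT 1          |                | 1007      | 405        |
| China       | APT 10         | i.a. PlugX     | 300       | 244        |
| China       | APT 19         | Derusbi        | 33        | 32         |
| China       | APT 21         | TravNet        | 118       | 106        |
| Russia      | APT 28         | Bears          | 230       | 214        |
| Russia      | APT 29         | Dukes          | 281       | 281        |
| China       | APT 30         |                | 164       | 164        |
| North-Korea | DarkHotel      | DarkHotel      | 298       | 273        |
| Russia      | Energetic Bear | Havex          | 132       | 132        |
| USA         | Equation Group | Fannyworm      | 395       | 395        |
| Pakistan    | Gorgon Group   | Different RATs | 1085      | 961        |
| China       | Winnti         |                | 406       | 387        |
| Total       |                |                | 4449      | 3594       |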
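
The gap between the two numeric columns can be summarised as a coverage ratio. The sketch below copies the figures from the distribution above and computes coverage overall and per country. It assumes that "Requested" is the number of hashes looked up and "Downloaded" is the number of samples actually obtained; that reading is an interpretation of the column names, and the function names are illustrative only.

```python
# (country, group, requested, downloaded), copied from the distribution above.
SAMPLES = [
    ("China", "APT 1", 1007, 405),
    ("China", "APT 10", 300, 244),
    ("China", "APT 19", 33, 32),
    ("China", "APT 21", 118, 106),
    ("Russia", "APT 28", 230, 214),
    ("Russia", "APT 29", 281, 281),
    ("China", "APT 30", 164, 164),
    ("North-Korea", "DarkHotel", 298, 273),
    ("Russia", "Energetic Bear", 132, 132),
    ("USA", "Equation Group", 395, 395),
    ("Pakistan", "Gorgon Group", 1085, 961),
    ("China", "Winnti", 406, 387),
]


def coverage(requested, downloaded):
    """Fraction of requested samples that were downloaded."""
    return downloaded / requested


def totals_by_country(rows):
    """Sum requested and downloaded counts per country."""
    totals = {}
    for country, _group, requested, downloaded in rows:
        req, got = totals.get(country, (0, 0))
        totals[country] = (req + requested, got + downloaded)
    return totals


if __name__ == "__main__":
    total_req = sum(row[2] for row in SAMPLES)
    total_got = sum(row[3] for row in SAMPLES)
    print(f"overall: {coverage(total_req, total_got):.1%}")
    for country, (req, got) in sorted(totals_by_country(SAMPLES).items()):
        print(f"{country}: {got}/{req} ({coverage(req, got):.1%})")
```

The class sizes are uneven (APT 19 has 32 samples, Gorgon Group has 961), which is worth keeping in mind when splitting the data for a benchmark.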

Remarks

All samples are named according to their SHA-256 hash and grouped by APT group. Samples are stored in separate password-protected ZIP archives (.zip). The password for all archives is "infected".
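
Because each sample is named after its SHA-256 hash, an archive can be checked for integrity by re-hashing every member and comparing the digest with the file name. The following is a minimal sketch, not part of the dataset's tooling. It assumes the archives use the legacy ZipCrypto scheme, which is the only encryption Python's standard `zipfile` module can decrypt; AES-encrypted archives would need a different tool. It also assumes that a member's file name is the bare hash, optionally followed by an extension. The function names are hypothetical. Samples are read into memory only and never written to disk.

```python
import hashlib
import zipfile
from pathlib import PurePosixPath

PASSWORD = b"infected"  # password stated in the remarks above


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def name_matches_hash(member_name: str, data: bytes) -> bool:
    """Compare a member's file name (without directory or extension) to its digest."""
    stem = PurePosixPath(member_name).name.split(".")[0].lower()
    return stem == sha256_hex(data)


def check_archive(zip_path, password=PASSWORD):
    """Yield (member_name, matches) for every file in the archive.

    zip_path may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            data = archive.read(info, pwd=password)  # in memory only
            yield info.filename, name_matches_hash(info.filename, data)
```

Since the files are live malware, a check like this is best run inside an isolated analysis environment.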

Source

The malware samples were collected using open source threat intelligence reports from multiple vendors. From these reports, a list of all file hashes used as indicators of compromise (IoCs) was compiled. These hashes were then used to obtain the malware samples from VirusTotal.

The file overview.csv lists all malware samples and the reports in which their hash values were found.
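
A quick way to explore overview.csv is to list its columns and count rows per value of one of them. The column layout of the file is not described here, so the sketch below takes the column name as a parameter instead of hard-coding one; the helper names are hypothetical, and the sketch assumes the file has a header row.

```python
import csv
from collections import Counter


def list_columns(csv_file):
    """Return the header row of an open CSV file as a list of column names."""
    reader = csv.reader(csv_file)
    return next(reader, [])


def count_by_column(csv_file, column):
    """Count rows per distinct value of the given column in an open CSV file."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found; available: {reader.fieldnames}")
    return Counter(row[column] for row in reader)
```

To use it, open the file with `open("overview.csv", newline="")`, call `list_columns` to see which columns exist, then reopen the file and pass the relevant column name to `count_by_column`. The resulting counts per group can be compared against the "Downloaded" figures in the distribution.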

Code Used for Authorship Attribution

The source code of the experiments used to benchmark authorship attribution performance can be found on GitHub: APT Attribution Code.

License

Open Database License

This APT Malware Dataset is made available under the Open Database License, whose full text can be found at http://opendatacommons.org/licenses/odbl/. Any rights in individual contents of the database are licensed under the Database Contents License, whose text can be found at http://opendatacommons.org/licenses/dbcl/.

Contributors

cyber-research
