Degas

DGA-generated domain detection using deep learning models

Edgar Degas, "Four Dancers"

Running

I'm currently using Conda (Anaconda/Miniconda) for development, but Pipenv or virtualenv should work as well with the included requirements.txt.

conda:

conda env create -f environment.yml
conda activate degas

Pipenv:

pipenv install -r requirements.txt
pipenv shell

Virtualenv is similar, but there's really no reason to use virtualenv instead of Pipenv anymore.

Retraining the model

There is a trained model checked into the models directory. If you'd like to train your own, you'll first need to download the training data from S3:

python degas/runner download-data

Process the data into the simple CSV form that the model builder expects:

python degas/runner process-data data/raw data/processed

Those steps only need to be run once, unless you change the training data.

To then retrain the model using the generated dataset, first install tensorflow-gpu using your package manager of choice (conda install tensorflow-gpu or pip install tensorflow-gpu) so that training is GPU-accelerated.

Then, run:

python degas/runner train-model data/processed

Run python degas/runner train-model --help for the available tuning options. Training takes about an hour and a half on a GTX 1070. It only runs about 9 epochs before it stops early; you could potentially run it for, say, 5 epochs and still get good accuracy in about half the training time:

python degas/runner train-model --epochs 5 data/processed

Making predictions

Since this project uses TensorFlow as the underlying deep learning library, the recommended way to run inference is with TensorFlow Serving.

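You should be able to serve it from the repository root using the command below (Docker bind mounts require an absolute source path, hence $(pwd)):

docker run -p 8501:8501 \
  --mount type=bind,source="$(pwd)"/models/degas,target=/models/degas \
  -e MODEL_NAME=degas -t tensorflow/serving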

See the TensorFlow Serving docs for more information about available options.
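As a rough sketch of what a client call could look like, the snippet below builds a request for TensorFlow Serving's REST predict endpoint. It assumes the container from the docker command above is running locally on port 8501 with MODEL_NAME=degas. The helper names (build_predict_request, predict) are hypothetical and not part of this project. Note that, per the issue report further down this page, the served model expects integer-encoded rows rather than raw domain strings.

```python
import json
import urllib.request

# Assumption: TensorFlow Serving is reachable locally on its REST port (8501)
# and the model was loaded under the name "degas".
PREDICT_URL = "http://localhost:8501/v1/models/degas:predict"


def build_predict_request(instances, url=PREDICT_URL):
    """Build (but do not send) a POST request in TF Serving's row format."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(instances, url=PREDICT_URL, timeout=10):
    """Send the request and return the parsed "predictions" list."""
    request = build_predict_request(instances, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["predictions"]
```

Keeping request construction separate from the network call makes the payload easy to inspect before anything is sent.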

About Degas

Why deep learning for this task? Because it works well, and it isn't hard to implement. From Byu et al, 2018:

"Deep neural networks have recently appeared in the literature on DGA detection Woodbridge et al. (2016); Saxe & Berlin (2017); Yu et al. (2017). They significantly outperform traditional machine learning methods in accuracy, at the price of increasing the complexity of training the model and requiring larger datasets."

Since there's plenty of data available to train with, creating a deep learning model is as easy as, or easier than, the alternatives.

References

https://openreview.net/forum?id=BJLmN8xRW&noteId=BJLmN8xRW

http://faculty.washington.edu/mdecock/papers/byu2018a.pdf

Why "Degas"?

  • Because it's more fun working on a project with a name, rather than "DGA-detector" or something.
  • Perhaps naming the project after an impressionist painter will make it sound more impressive?
  • It was the first result from the classic "Samba naming algorithm" ( egrep -i '^d.*g.*a.*' /usr/share/dict/words )

For the record, I'm pronouncing it "de-gah", as in Edgar Degas, not "de-gas", as in "to remove all the gas."

degas's People

Contributors

matthoffman


degas's Issues

Running in Docker with TensorFlow Serving returns errors

Hi there,
Thanks for the degas project. I'm a newbie with TensorFlow; could you please help me run degas in Docker with TensorFlow Serving? Any reply would be a big help.
I run it in Docker using:

docker run -p 8501:8501 \
  --mount type=bind,source=/Users/myUserName/PycharmProjects/degas/models/degas,target=/models/degas \
  -e MODEL_NAME=degas -t tensorflow/serving:1.12.0

The container starts fine. Then I call the REST API by POSTing:

{
 "instances": [
   "www.google.com","234kd3ds9fkj3.com"
 ]
}

to:
http://localhost:8501/v1/models/degas:predict
I get this error response:

{
    "error": "Failed to process element: 0 of \'instances\' list. Error: Invalid argument: JSON Value: \"www.google.com\" Type: String is not of expected type: int32"
}

Even when I POST this request instead, I get an error response too:

{
 "inputs": [
   "www.google.com","234kd3ds9fkj3.com"
 ]
}

Edit:
After some more searching, I changed my input to:

{
    "inputs":[
        {
            "input_image": [0.2]
        }
    ]
}

And I got the error message below:

{
    "error": "JSON Value: {\n    \"input_image\": [\n        0.2\n    ]\n} Type: Object is not of expected type: int32"
}

Edit:
After more searching, I changed my input again, to:

{
    "inputs":[
        {
            "input_image": [1,2]
        }
    ]
}

And I got the error message below:

{
    "error": "JSON Value: {\n    \"input_image\": [\n        1,\n        2\n    ]\n} Type: Object is not of expected type: int32"
}

Could you please point out what I'm doing wrong?
Thanks a lot.

/**************************************************************************/

EDIT:

It's working now.

  1. Pin tensorflow and the related TensorFlow packages (e.g. TensorFlow Serving) to 1.11.0.
  2. POST this JSON (predictions for "www.google.com", "a2x43v89es01.com", "www.twitter.com"):
{
 "instances": [[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,37,37,37,12,21,29,29,21,26,19,12
,17,29,27]
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,15,2,38,4,3,36,8,9,19,33,0,1,12
,17,29,27]
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,37,37,37,12,34,37,23,34,34,19,32,12
,17,29,27]]
}
  3. The response from TensorFlow Serving will be:
{
    "predictions": [
        [
            4.54876e-11
        ],
        [
            0.723077
        ],
        [
            2.9277e-18
        ]
    ]
}
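
For anyone hitting the same error: the integer rows in the request above look like left-padded character codes, 75 per domain. The sketch below reproduces the three rows from the example. The character table is inferred only from that example (digits map to their own value, "." to 12, "a" through "z" to 15 through 40) and is not taken from the project's source, so treat it as an assumption. Codes for other characters such as "-" cannot be determined from the example and are deliberately left out, and the 0.5 threshold for reading the scores is also just an assumption. The function names are hypothetical.

```python
import string

MAX_LEN = 75  # row length used in the example request above
PAD = 0       # padding value used on the left of each row

# Assumption: table inferred from the single example in this issue.
CHAR_CODES = {digit: int(digit) for digit in string.digits}
CHAR_CODES["."] = 12
CHAR_CODES.update({ch: 15 + i for i, ch in enumerate(string.ascii_lowercase)})


def encode_domain(domain, max_len=MAX_LEN):
    """Turn a domain name into a left-padded list of integer codes."""
    # Raises KeyError for characters the example does not cover (e.g. "-").
    codes = [CHAR_CODES[ch] for ch in domain.lower()]
    if len(codes) > max_len:
        # How the model handles over-long names is unknown, so refuse them.
        raise ValueError("domain longer than %d characters" % max_len)
    return [PAD] * (max_len - len(codes)) + codes


def label_predictions(predictions, threshold=0.5):
    """Map each [score] row to True when it looks DGA-generated."""
    return [row[0] >= threshold for row in predictions]


if __name__ == "__main__":
    domains = ["www.google.com", "a2x43v89es01.com", "www.twitter.com"]
    print({"instances": [encode_domain(d) for d in domains]})
```

One thing worth noting about the inferred table: the digit "0" gets the same code as the padding value, so a leading "0" in a name would be indistinguishable from padding.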

/**************************************************************************/
