DeepIDS

An intrusion detection system built on deep learning.

Purpose

As it stands, this project is intended as a proof of concept for an intrusion detection system (IDS), not an intrusion prevention system (IPS). Further work could extend it to an IPS or a near-real-time IDS using a high-performance, distributed computing environment.

Data

We've used the KDDCup99 dataset from OpenML.org. There are several known issues surrounding this dataset; please see the data/README.md file for more information.

Approach

We trained three neural networks, plus a logistic regression classifier used for stacking. All models were built with Keras on a TensorFlow backend. The architectures can be found in each network's development notebook in develop/. We trained each model with the Adam optimizer, using a batch size of 32 and the Keras defaults for the beta hyperparameters. We also used early stopping on validation accuracy with a patience of three epochs. The remaining hyperparameters differed by model and are shown below:

| Hyperparameter | ReLU | SELU | ResNet | Stacker |
|---|---|---|---|---|
| Optimized Parameters | 11,030,551 | 8,632,983 | 19,846,551 | 1,587 |
| Learning Rate | 0.001 | 0.00002 | 0.001 | 0.00002 |

Dropout: 0.2
BatchNorm: Yes, Yes
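
The training setup described above might look roughly like the following in Keras. This is a minimal sketch, not the project's actual code: it assumes a recent Keras release bundled with TensorFlow (where the metric is logged as `val_accuracy` and Adam takes `learning_rate`), and the names `TRAINING_CONFIG`, `build_training_objects`, and the checkpoint path `best_model.keras` are hypothetical. The architectures themselves live in the notebooks under develop/ and are not reproduced here.

```python
# Hyperparameters stated in the text; learning rates come from the table above.
TRAINING_CONFIG = {
    "batch_size": 32,
    "monitor": "val_accuracy",  # assumed metric name (older Keras used "val_acc")
    "patience": 3,
    "learning_rate": {"relu": 1e-3, "selu": 2e-5, "resnet": 1e-3, "stacker": 2e-5},
}


def build_training_objects(model_name, checkpoint_path="best_model.keras"):
    """Return (optimizer, callbacks) for one model; all names are illustrative."""
    # Imported lazily so the config above can be inspected without Keras installed.
    from tensorflow import keras

    # beta_1 and beta_2 are left at the Keras defaults, as described in the text.
    optimizer = keras.optimizers.Adam(
        learning_rate=TRAINING_CONFIG["learning_rate"][model_name]
    )
    callbacks = [
        # Stop when validation accuracy has not improved for three epochs.
        keras.callbacks.EarlyStopping(
            monitor=TRAINING_CONFIG["monitor"],
            patience=TRAINING_CONFIG["patience"],
        ),
        # Keep only the checkpoint with the best validation accuracy.
        keras.callbacks.ModelCheckpoint(
            checkpoint_path,
            monitor=TRAINING_CONFIG["monitor"],
            save_best_only=True,
        ),
    ]
    return optimizer, callbacks
```

A model would then be compiled with this optimizer and `metrics=["accuracy"]` (so that `val_accuracy` is logged), and fit with `batch_size=TRAINING_CONFIG["batch_size"]` and `callbacks=callbacks`.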

Preliminary Results

We monitored validation accuracy to checkpoint the best model from each training run, then chose the best model with respect to validation accuracy for each architecture from all of our training runs. Note that in general, it is not advisable to use accuracy when working with a highly imbalanced dataset like KDDCup99. However, since we are using an ensemble, we want to encourage the base models to develop their own strengths and weaknesses. Using accuracy as a model selection criterion allows for this by ignoring the class imbalance in its selection process.
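
The caveat about accuracy on imbalanced data can be illustrated with a toy example. The label counts below are synthetic and chosen only for illustration (they are not KDDCup99 statistics), and `accuracy` and `per_class_recall` are hypothetical helper names. A classifier that never predicts the rare class still scores high accuracy, which is why accuracy alone says little about minority classes.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall per class: correct predictions / true samples of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Synthetic, highly imbalanced labels: 990 of a common class, 10 of a rare one.
y_true = ["common"] * 990 + ["rare"] * 10
# A degenerate model that always predicts the majority class.
y_pred = ["common"] * 1000

print(accuracy(y_true, y_pred))          # 0.99
print(per_class_recall(y_true, y_pred))  # {'common': 1.0, 'rare': 0.0}
```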

| Accuracy | ReLU | SELU | ResNet | Ensemble |
|---|---|---|---|---|
| Training | 0.99858 | 0.99891 | 0.99195 | 0.99945 |
| Validation | 0.99857 | 0.99889 | 0.99195 | 0.99943 |
| Test | 0.99206 | 0.99780 | 0.99319 | 0.99819 |
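
The Ensemble column combines the three base models through the logistic regression stacker. One common way to stack is to concatenate each base model's class-probability vector and feed the result to the stacker. The sketch below assumes that layout (the source does not spell it out) and uses hypothetical names such as `stack_features`. Under that assumption, three 23-class base models give 69 input features, and a bias-free 69 × 23 weight matrix has 1,587 parameters, which matches the parameter count reported for the Stacker under Approach. That agreement is suggestive, not confirmation of the authors' design.

```python
NUM_CLASSES = 23     # from the "23-class Accuracy" row in the test results
NUM_BASE_MODELS = 3  # ReLU, SELU, ResNet


def stack_features(per_model_probs):
    """Concatenate one sample's probability vectors from each base model."""
    features = []
    for probs in per_model_probs:
        if len(probs) != NUM_CLASSES:
            raise ValueError("each base model must emit NUM_CLASSES probabilities")
        features.extend(probs)
    return features


def softmax_regression_param_count(num_inputs, num_outputs, use_bias):
    """Weights (plus optional biases) of a single dense softmax layer."""
    return num_inputs * num_outputs + (num_outputs if use_bias else 0)


# Toy input: uniform probability vectors stand in for real model outputs.
uniform = [1.0 / NUM_CLASSES] * NUM_CLASSES
meta_features = stack_features([uniform] * NUM_BASE_MODELS)

print(len(meta_features))                                      # 69
print(softmax_regression_param_count(69, NUM_CLASSES, False))  # 1587
print(softmax_regression_param_count(69, NUM_CLASSES, True))   # 1610
```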

Below are our results on the test set. Binary results are computed by collapsing predictions and labels into a two-class problem: normal vs. malicious connection attempts. Note that the binary recall figure is not a realistic estimate of real-world performance, whereas the precision estimates are more realistic. This is due to deficiencies of the KDDCup99 dataset (as noted here). We've included a standard softmax classifier as a baseline.

| Metric | Baseline | DeepIDS |
|---|---|---|
| 23-class Accuracy | 0.99551 | 0.99819 |
| Binary Precision | 1.0 | 0.99857 |
| Binary Recall | 0.5 | 0.99974 |
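
The binary conversion described above can be sketched as follows. This is a minimal illustration, assuming the benign class is labelled `normal` (that label string, the attack names in the toy data, and the helper names are all assumptions). Every other class is treated as malicious, and malicious is taken as the positive class.

```python
def to_binary(labels, normal_label="normal"):
    """Map multi-class labels to True (malicious) / False (normal)."""
    return [label != normal_label for label in labels]


def binary_precision_recall(y_true, y_pred):
    """Precision and recall with 'malicious' as the positive class."""
    true_bin = to_binary(y_true)
    pred_bin = to_binary(y_pred)
    tp = sum(t and p for t, p in zip(true_bin, pred_bin))
    fp = sum((not t) and p for t, p in zip(true_bin, pred_bin))
    fn = sum(t and (not p) for t, p in zip(true_bin, pred_bin))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels for illustration only.
y_true = ["normal", "smurf", "neptune", "normal", "smurf", "neptune"]
y_pred = ["normal", "smurf", "smurf", "smurf", "normal", "neptune"]

precision, recall = binary_precision_recall(y_true, y_pred)
print(precision, recall)  # 0.75 0.75
```

Note that the third sample is wrong in the multi-class sense (a `neptune` connection predicted as `smurf`) but still counts as a true positive after the binary conversion, so binary metrics can look better than per-class ones.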

Here are the binary results by class. See our blog post for an interactive version.

[Figure: Binary results by class]

Next Steps

We were unaware that an updated intrusion detection benchmark dataset existed until the end of this project. Now that we know about NSL-KDD, we will rerun our method on it. We will then perform a more complete literature search and compare our method with state-of-the-art results. Stay tuned!
