spam-email-classification's Introduction


SpamClassifier

Model Deployed Using Flask on the Heroku Platform

In this project I build a model that classifies an SMS or email message as spam or ham from its text, using standard classifiers.

What It Does:


Live Demo:


How It Works:

Extract the text and the target class from the dataset. Extract features from the text with a TF-IDF vectorizer to form the input features. Split the skewed data into shuffled sets using the stratified shuffle split in the sklearn library. Use standard classifiers to classify the data as spam or ham.
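The feature extraction and splitting steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the twelve messages are made up, and the parameter values (`test_size=0.25`, `random_state=0`, English stop words) are assumptions chosen only to make the example small and repeatable.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import StratifiedShuffleSplit

# Made-up messages standing in for the real dataset (8 ham, 4 spam).
messages = np.array([
    "Are we still meeting for lunch today",
    "Call me when you get home",
    "I will be late for the meeting",
    "Can you send me the report tonight",
    "Happy birthday, see you at the party",
    "Thanks for the help with the project",
    "Dinner at my place on Friday",
    "The train is delayed again this morning",
    "WIN a free prize now, reply YES to claim",
    "Congratulations, you won a free cash prize",
    "Urgent: claim your free ringtone now",
    "Free entry to win cash, text WIN now",
])
labels = np.array(["ham"] * 8 + ["spam"] * 4)

# One shuffled split that keeps the ham/spam ratio in both parts.
splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(messages, labels))
y_train, y_test = labels[train_idx], labels[test_idx]

# Learn the TF-IDF vocabulary on the training part, then reuse it for the test part.
vectorizer = TfidfVectorizer(stop_words="english")
X_train = vectorizer.fit_transform(messages[train_idx])
X_test = vectorizer.transform(messages[test_idx])

print("train matrix:", X_train.shape, "test matrix:", X_test.shape)
print("spam share in train:", np.mean(y_train == "spam"))
print("spam share in test:", np.mean(y_test == "spam"))
```

The sketch fits the vectorizer on the training part only, which is a common precaution against leaking test vocabulary into training. Vectorizing the whole dataset first and splitting afterwards, in the order described above, uses the same two classes.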


Prerequisites:

I highly recommend having a toolchain and development environment installed and ready before the hack night. If you have no idea where to start, try a combination like:

  • Python
  • scikit-learn / sklearn
  • Pandas
  • nltk
  • NumPy
  • matplotlib
  • An environment to work in, such as Jupyter or Spyder

For Linux users, your package manager should be able to handle all of this. If it can't, see if you can at least install Python and pip, and then use pip to install the packages above.
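One quick way to check whether the packages are in place is a short script like the one below. It is only a sketch: the list holds the usual import names for the packages above (`sklearn` is the import name of scikit-learn), and `missing_packages` is a helper name invented for this example.

```python
import importlib.util

# Import names for the packages listed above.
REQUIRED = ["sklearn", "pandas", "nltk", "numpy", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All listed packages are available.")
```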

Dataset:

The SMS/Email Spam Collection is a set of tagged SMS messages collected for SMS/Email spam research. It contains one set of 5,567 SMS messages in English, each tagged as ham (legitimate) or spam.

You can get the raw dataset from here. The files contain one message per line. Each line consists of two columns:

  • Class - contains the label (ham or spam)
  • Message - contains the raw text.
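Loading a file in this two-column layout might look like the sketch below. The tab separator and the missing header row are assumptions about the raw file, and the three sample lines are invented so the example runs without the real download. With the actual file, the `io.StringIO(raw)` argument would be replaced by its path.

```python
import csv
import io

import pandas as pd

# Invented sample in the assumed layout: label, a tab, then the raw message text.
raw = (
    "ham\tSee you at lunch?\n"
    "spam\tWIN a free prize now! Reply YES\n"
    "ham\tRunning late, sorry\n"
)

df = pd.read_csv(
    io.StringIO(raw),
    sep="\t",
    header=None,                  # assumed: the file has no header row
    names=["Class", "Message"],   # column names taken from the description above
    quoting=csv.QUOTE_NONE,       # treat quote characters in messages as plain text
)

# Counting the labels shows how skewed the classes are.
print(df["Class"].value_counts())
```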

Model Pipeline:


Components:

  • Use TF-IDF for feature extraction from the message text.
  • Split with the class imbalance in mind: the data is skewed, since ham messages far outnumber spam messages.
  • Use a stratified shuffle split so that each set keeps the ham/spam proportions.
  • Use several standard classifiers to classify the messages.
  • Compare the accuracy of the classifiers using standard classification metrics.
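The comparison step could be sketched along these lines. The README does not name the classifiers used, so the three below (multinomial naive Bayes, logistic regression, linear SVM) are assumptions picked as common choices for text. The messages are made up, and scores on such a tiny sample say nothing about real performance.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import StratifiedShuffleSplit
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up messages standing in for the real dataset (8 ham, 4 spam).
messages = np.array([
    "Are we still meeting for lunch today",
    "Call me when you get home",
    "I will be late for the meeting",
    "Can you send me the report tonight",
    "Happy birthday, see you at the party",
    "Thanks for the help with the project",
    "Dinner at my place on Friday",
    "The train is delayed again this morning",
    "WIN a free prize now, reply YES to claim",
    "Congratulations, you won a free cash prize",
    "Urgent: claim your free ringtone now",
    "Free entry to win cash, text WIN now",
])
labels = np.array(["ham"] * 8 + ["spam"] * 4)

# Stratified shuffle split keeps the ham/spam ratio in train and test.
splitter = StratifiedShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(messages, labels))

# Assumed candidate classifiers; the original project may use others.
candidates = {
    "MultinomialNB": MultinomialNB(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "LinearSVC": LinearSVC(),
}

results = {}
for name, clf in candidates.items():
    # Each pipeline learns its TF-IDF vocabulary from the training messages only.
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(messages[train_idx], labels[train_idx])
    pred = model.predict(messages[test_idx])
    truth = labels[test_idx]
    results[name] = {
        "accuracy": accuracy_score(truth, pred),
        "precision": precision_score(truth, pred, pos_label="spam", zero_division=0),
        "recall": recall_score(truth, pred, pos_label="spam", zero_division=0),
        "f1": f1_score(truth, pred, pos_label="spam", zero_division=0),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 2) for metric, value in scores.items()})
```

Because ham dominates the data, a classifier that labels everything as ham can still reach high accuracy, which is why the sketch also reports precision, recall, and F1 for the spam class.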

Accuracy Result:


Future Scope:

  • Add this feature to a dynamic website that supports a contact-us type feature.
  • Show live user inputs for ham and spam.
