LeadQualifier

This repo is a collection of scripts we use at Xeneta to qualify sales leads with machine learning. Read more about this project in the Medium article Boosting Sales With Machine Learning.

You can use this repo for two things:

  1. Try to beat our predictions using our data and your own algorithm
  2. Create a lead qualifier for your company, using your own data

Setup

Start off by running the following command:

pip install -r requirements.txt

You'll also need to download the stopwords corpus from the nltk package. Run the Python interpreter and type the following:

import nltk
nltk.download('stopwords')

1. Experiment with your own algorithms

We'd love to see more algorithms on the leaderboard, so send us a pull request once you've implemented one.

We've provided you with our vectorized and transformed data here. Unfortunately, we can't share the raw text data, because it contains sensitive company information (who our customers are).

To test out your own algorithm, simply add it to the run.py file and run the script:

python run.py

Thanks to lampts for implementing the best-performing algorithm so far, the SGDClassifier.

Leaderboard:

Algorithm        Precision   Recall   F1 Score
SGD Classifier   0.872       0.940    0.905
Random Forest    0.845       0.915    0.878
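
If you want to compare your own algorithm against the leaderboard, the three metrics can be computed by hand. The following is a minimal sketch in plain Python, not the code in run.py. The function names and the sample labels are made up for illustration, and we assume labels are encoded as 1 (qualified) and 0 (disqualified).

```python
def precision_recall_f1(true_labels, predicted_labels):
    """Return (precision, recall, f1) for binary labels, treating 1 as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # qualified, predicted qualified
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # disqualified, predicted qualified
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # qualified, predicted disqualified
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up labels, purely for illustration (not Xeneta data).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")

# Recomputing F1 from the SGD Classifier row of the leaderboard.
sgd_f1 = f1_from(0.872, 0.940)
print(f"SGD Classifier F1 from the table: {sgd_f1:.3f}")
```

The precision and recall in the table are rounded to three decimals, so an F1 score recomputed from them can differ from the published one in the last digit.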

PS: We're also experimenting with a neural net (in TensorFlow) in the nn.py file.

2. Create your own lead qualifier

To create your own lead qualifier, you'll need to get hold of company descriptions (to create your dataset). We currently use FullContact for this.

Note: We've added dummy data so that you can run both scripts without getting errors, and to give you examples of what the sheets should look like.

This script trains an algorithm on your own input data. It expects two Excel sheets named qualified and disqualified in the input folder. These sheets need to contain two columns:

  • URL
  • Description
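
To make the expected input concrete, here is a rough sketch of how rows from the two sheets could be combined into one labeled dataset. It is not the training script itself. We assume each row has already been read into a dict keyed by column name, and that qualified rows get label 1 and disqualified rows get label 0. The function name and the example.com rows are hypothetical.

```python
REQUIRED_COLUMNS = {"URL", "Description"}


def build_labeled_dataset(qualified_rows, disqualified_rows):
    """Combine rows from both sheets into parallel lists of descriptions and labels."""
    descriptions, labels = [], []
    for label, rows in ((1, qualified_rows), (0, disqualified_rows)):
        for row in rows:
            # Fail early if a sheet does not have both expected columns.
            missing = REQUIRED_COLUMNS - set(row)
            if missing:
                raise ValueError(f"row is missing columns: {sorted(missing)}")
            text = str(row["Description"]).strip()
            if not text:
                continue  # a row without a description carries no signal
            descriptions.append(text)
            labels.append(label)
    return descriptions, labels


# Hypothetical rows standing in for the contents of the two sheets.
qualified = [
    {"URL": "https://example.com/a", "Description": "Global freight forwarder"},
    {"URL": "https://example.com/b", "Description": "Container shipping line"},
]
disqualified = [
    {"URL": "https://example.com/c", "Description": "Local bakery"},
    {"URL": "https://example.com/d", "Description": "   "},
]

descriptions, labels = build_labeled_dataset(qualified, disqualified)
print(list(zip(descriptions, labels)))
```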

Run the script:

python run.py

It'll dump three files into the qualify_leads project:

  • algorithm
  • vectorizer
  • tfidf_vectorizer

You're now ready to start classifying your sales leads!

This is the script that actually predicts the quality of your leads. Add an Excel sheet named data in the input folder. Use the same format as the example file that's already there.

Run the script:

python run.py

It'll output an Excel sheet with a column named Prediction, where 1 equals qualified and 0 equals disqualified.
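
Once you have the output, you'll probably want to separate the leads worth contacting from the rest. A small sketch of that step, assuming the output rows have been read into dicts and that the sheet keeps a URL column next to Prediction (the README only guarantees the Prediction column). The function name and rows are hypothetical.

```python
def split_by_prediction(rows):
    """Split output rows into URLs predicted qualified (1) and disqualified (0)."""
    qualified, disqualified = [], []
    for row in rows:
        # Spreadsheet readers may return 1.0 instead of 1, so normalize to int first.
        if int(row["Prediction"]) == 1:
            qualified.append(row["URL"])
        else:
            disqualified.append(row["URL"])
    return qualified, disqualified


# Hypothetical output rows.
output_rows = [
    {"URL": "https://example.com/a", "Prediction": 1},
    {"URL": "https://example.com/b", "Prediction": 0},
    {"URL": "https://example.com/c", "Prediction": 1.0},
]

to_contact, to_skip = split_by_prediction(output_rows)
print("qualified:", to_contact)
print("disqualified:", to_skip)
```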

Got questions? Email me at [email protected].

