
wajdibensaad / kaggle_customer_transation_prediction


My final submission for the 'Santander Customer Transaction Prediction' competition. I participated in this very tough and interesting Kaggle competition a while ago, and I finally found the time to put all the work together in this repo.

Jupyter Notebook 100.00%
kaggle jupyter-notebook predictive-modeling lightgbm exploratory-data-analysis python analytics data-science banking customer-analytics


Kaggle Competition: 'Santander Customer Transaction Prediction'


My Final Submission for the 'Santander Customer Transaction Prediction':

In this repo, I assemble some of the work I did during an interesting (and very tough) Kaggle competition.

Here is the official link to the competition on Kaggle.

Participating in the challenge was a true learning experience for me. What made it a special competition was the number of talented and smart participants from all over the world. I would particularly mention the Kaggle Masters and Grandmasters who drove the challenge to higher levels and provided ideas and hints all along the way.

Here is a part of the description provided by the competition hosts:

"... In this challenge, we invite Kagglers to help us identify which customers will make a specific transaction in the future, irrespective of the amount of money transacted. The data provided for this competition has the same structure as the real data we have available to solve this problem. "

1-Exploratory Data Analysis Notebook

In this notebook, I went through the data to see whether I could spot a pattern or an interesting trend. It was one of the most interesting phases of this competition because:

  • The data was 'clean': no heavy work was required to put the variables into shape.
  • The data was synthetic: the data set was not real-world production data; it was generated by an algorithm to simulate the behavior of customers and to be as close as possible to the actual 'Santander' customer data.
  • Almost every participant was stuck at a certain performance threshold: it was very hard to improve the model beyond that point.
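A first pass over the data, such as confirming that it is 'clean', could look like the sketch below. It is not taken from the notebook: the helper name `quick_overview`, the `var_*` feature names, the `target` column and the random demo frame are all assumptions made for illustration, since the competition data cannot be shared here.

```python
import numpy as np
import pandas as pd


def quick_overview(df, target_col="target"):
    """Return a few summary figures for a tabular binary-classification set.

    Hypothetical helper: the column name `target` is an assumption.
    """
    features = df.drop(columns=[target_col])
    return {
        "n_rows": len(df),
        "n_features": features.shape[1],
        # Total count of missing cells; 0 means no imputation is needed.
        "missing_values": int(df.isna().sum().sum()),
        # Share of rows labelled 1, to see how imbalanced the classes are.
        "positive_rate": float(df[target_col].mean()),
    }


# Random stand-in data, NOT the competition data.
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    rng.normal(size=(1000, 5)),
    columns=[f"var_{i}" for i in range(5)],
)
demo["target"] = (rng.random(1000) < 0.1).astype(int)

overview = quick_overview(demo)
print(overview)
```

With the real training file, the same function would be called on the loaded data frame instead of `demo`.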

2-LightGBM model with Data Augmentation

I experimented with various models and techniques, but the model with the highest performance was LightGBM. Stacking and blending were also a huge part of the top 1% winning solutions. However, I tried to keep it simple, go through it step by step, and understand how the data behaves after passing through each model.

One of the 'magic' ideas discussed in the competition forum was feature engineering, and especially data augmentation. Other feature engineering ideas were also applied, such as creating hundreds of new variables as blends of existing variables and trying every imaginable combination.
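The text does not say which augmentation was used. One variant shared by participants, shown here only as an assumption, builds extra rows by shuffling each feature column independently among the rows of one class, which is reasonable only if the features are roughly independent of each other given the class. The function name `augment_by_column_shuffle` and its parameters are hypothetical.

```python
import numpy as np


def augment_by_column_shuffle(X, y, n_pos_copies=2, n_neg_copies=1, seed=0):
    """Append class-wise column-shuffled copies of the rows to (X, y).

    Assumes binary labels (0/1) and features that are roughly independent
    given the class; otherwise shuffling destroys real structure.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)
    parts_X, parts_y = [X], [y]
    for label, copies in ((1, n_pos_copies), (0, n_neg_copies)):
        block = X[y == label]
        for _ in range(copies):
            shuffled = block.copy()
            # Permute every column on its own, so each synthetic row mixes
            # feature values taken from different rows of the same class.
            for col in range(shuffled.shape[1]):
                shuffled[:, col] = rng.permutation(shuffled[:, col])
            parts_X.append(shuffled)
            parts_y.append(np.full(len(shuffled), label, dtype=y.dtype))
    return np.concatenate(parts_X), np.concatenate(parts_y)


# Tiny demo: 7 negatives and 3 positives with 4 features.
X_demo = np.arange(40, dtype=float).reshape(10, 4)
y_demo = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
X_aug, y_aug = augment_by_column_shuffle(X_demo, y_demo)
print(X_aug.shape, int(y_aug.sum()))
```

In a cross-validation setup, augmentation of this kind should be applied to the training fold only, so the validation rows stay untouched.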

3-Other ideas that did not work (Work in progress)

Here, I will try to assemble all (or most) of the ideas that I tried but that did not work.

These were mostly different models (XGBoost, regressions, basic neural network models, etc.).

Info:

I cannot share the competition data because of the competition rules. The competition host requires users to explicitly accept the rules before they can access the data set. To get the competition data, you need a Kaggle account; then go to the competition page and agree to the competition rules.
