Debuggers

Code for the Bancolombia Dataton, a Kaggle competition. The competition page lists the rules and provides the data that was given to solve the problem.

You can also see the IDE requirements for the Python and R environments used.

What is the monthly family expense of each Bancolombia customer?

To tackle this problem we explored three ways to prepare the data and build the final dataset for training the models.

Data Engineering

Working on this problem was possible thanks to the capacity of this machine.

We first had to clean the numerical and categorical data, and build some categorical ranks to label the data with respect to the target variable. Here you can see a kind of EDA.
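One possible reading of "categorical ranks focused on the objective variable" is an ordinal encoding where each category is replaced by its rank according to the mean of the target. The sketch below illustrates that idea with pandas. The column names (`occupation`, `monthly_expense`) and the helper `rank_encode` are hypothetical and are not taken from the competition data or from the repository code.

```python
import pandas as pd


def rank_encode(df, column, target):
    """Map each category to its rank by mean target (1 = lowest mean)."""
    means = df.groupby(column)[target].mean()
    ranks = means.rank(method="dense").astype(int)
    return df[column].map(ranks)


# Hypothetical toy data; the real dataset has different columns.
customers = pd.DataFrame({
    "occupation": ["student", "employee", "retired", "employee", "student"],
    "monthly_expense": [300.0, 1200.0, 800.0, 1000.0, 500.0],
})

customers["occupation_rank"] = rank_encode(
    customers, "occupation", "monthly_expense"
)
print(customers)
```

Because the ranks are derived from the target, they would normally be computed on the training rows only and then applied to validation and test rows, to avoid leaking target information.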

  1. Aggregate all data per user id

Here is the code used to reach this objective. At the very beginning of this process, the idea was to express the monetary variables in December 2020 (dic-20) values, see here, based on the IPC (the Colombian consumer price index).

  2. Aggregate all data per month

Here is the code used to reach this objective

  3. Take historical data along the timeline

Here is the code to reach this objective
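The three strategies above can be sketched with pandas on a toy transaction log. This is only an illustration under assumptions: the column names (`user_id`, `month`, `amount`) are hypothetical, "per month" is read here as one row per user and month, and "along the timeline" is read as one row per user with one column per month.

```python
import pandas as pd

# Hypothetical transaction log; the real data has different columns.
tx = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "month": ["2020-01", "2020-01", "2020-02", "2020-01", "2020-02"],
    "amount": [100.0, 50.0, 200.0, 30.0, 70.0],
})

# 1. Aggregate per user id: one row per customer.
per_user = tx.groupby("user_id")["amount"].agg(["sum", "mean", "count"])

# 2. Aggregate per month: one row per (customer, month) pair.
per_month = tx.groupby(["user_id", "month"], as_index=False)["amount"].sum()

# 3. Timeline: one row per customer, one column per month,
#    so the model sees the whole history at once.
timeline = per_month.pivot(index="user_id", columns="month", values="amount")

print(per_user)
print(per_month)
print(timeline)
```

The first layout loses the time dimension, the second keeps it as extra rows, and the third keeps it as extra columns.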

In each case we had to impute all the missing values with the PPCA method in R (here) and verify the results (here).
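The project used a PPCA implementation in R for this step. As a rough Python illustration of the same family of ideas, the sketch below uses iterative low-rank (PCA) imputation with NumPy, which is a simplified stand-in and not PPCA itself. It then checks the result by hiding entries whose true values are known. The function name `pca_impute`, its parameters, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def pca_impute(X, n_components=1, n_iter=200, tol=1e-8):
    """Fill NaNs by repeatedly projecting onto the top principal components."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column means, then refine only the missing cells.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        low_rank = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mu
        updated = np.where(missing, low_rank, X)
        change = np.max(np.abs(updated - filled))
        filled = updated
        if change < tol:
            break
    return filled


# Synthetic rank-1 data so the hidden values are known exactly.
rng = np.random.default_rng(0)
scores = rng.normal(size=(50, 1))
loadings = np.array([[1.0, 2.0, -1.0]])
complete = scores @ loadings + 5.0

# Hide a few known entries, impute them, and measure the error.
observed = complete.copy()
hidden = [(0, 1), (3, 2), (10, 0)]
for i, j in hidden:
    observed[i, j] = np.nan

imputed = pca_impute(observed)
rmse = np.sqrt(np.mean([(imputed[i, j] - complete[i, j]) ** 2 for i, j in hidden]))
print("RMSE on hidden entries:", rmse)
```

Masking known values and comparing them with the imputed ones is one simple way to verify an imputation. On real data the error will not be near zero, because the data is not exactly low rank.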

Data Science

To model the problem we followed these steps

  1. Correlation analysis; this one was done on the data aggregated per user id

  2. Linear regression

  3. PCA

     Method         Scale
     user Id        z-norm
     month          z-norm
     all timeline   z-norm
     all timeline   Centred

  4. Tree algorithms

  5. Neural networks (1st, 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th, 13th, 14th)
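The PCA step distinguishes between z-norm scaling and plain centring. The sketch below shows the difference between the two with NumPy on synthetic data. The function `pca` and its `scale` parameter are hypothetical names for this example and do not come from the repository.

```python
import numpy as np


def pca(X, n_components, scale="z-norm"):
    """Return PCA scores and explained variance ratios via SVD."""
    X = np.asarray(X, dtype=float)
    prepared = X - X.mean(axis=0)  # "centred": subtract the column means
    if scale == "z-norm":
        # Also divide by the standard deviation (assumes no constant columns).
        prepared = prepared / X.std(axis=0, ddof=1)
    elif scale != "centred":
        raise ValueError("scale must be 'z-norm' or 'centred'")
    U, s, Vt = np.linalg.svd(prepared, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = prepared @ Vt[:n_components].T
    return scores, explained[:n_components]


# Two independent synthetic features on very different scales.
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(scale=1.0, size=200),
    rng.normal(scale=100.0, size=200),
])

_, explained_centred = pca(X, n_components=1, scale="centred")
_, explained_znorm = pca(X, n_components=1, scale="z-norm")
print("centred:", explained_centred)
print("z-norm: ", explained_znorm)
```

With centring only, the first component is dominated by the feature with the largest variance. With z-norm scaling, every feature contributes on a comparable scale, which usually matters when monetary amounts and counts are mixed.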

In the end, the best results were reached with the all-timeline strategy, because in that case it was not necessary to change anything about the target variable. Time ran out and we didn't win, but we learned a lot!

Variables

Contributors

dlesmes, oxiboy, camilahurtadoa15
