icub-magic-trick's Introduction

DATA ANALYSIS AND FEATURE PREDICTION - Trial and Error

DATASETS

Features

Early Short Response (the first 1.5 s after the subject starts looking at the card)

  • data/features_none_sre.csv: No baseline
  • data/features_sub_sre.csv: Baseline subtracted
  • data/features_div_sre.csv: Divided by the baseline

Late Short Response (the last 1.5 s before the subject stops looking at the card)

  • data/features_none_srl.csv: No baseline
  • data/features_sub_srl.csv: Baseline subtracted
  • data/features_div_srl.csv: Divided by the baseline
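
As a rough illustration of the three variants, the sketch below applies each baseline mode to a list of values. It assumes the baseline is a single number per trial, which the file names alone do not confirm; `apply_baseline` and the sample values are hypothetical.

```python
def apply_baseline(values, baseline, mode):
    """Return values corrected by a baseline (hypothetical helper).

    mode: "none" keeps the values, "sub" subtracts the baseline,
    "div" divides by the baseline.
    """
    if mode == "none":
        return list(values)
    if mode == "sub":
        return [v - baseline for v in values]
    if mode == "div":
        if baseline == 0:
            raise ValueError("cannot divide by a zero baseline")
        return [v / baseline for v in values]
    raise ValueError(f"unknown mode: {mode}")


# Made-up numbers, only to show the three modes side by side.
raw = [4.0, 6.0]
corrected = {mode: apply_baseline(raw, 2.0, mode) for mode in ("none", "sub", "div")}
print(corrected)
```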

For each one, there is a class imbalance problem. In particular, we have:

  • 1/6 subject's old card (class 1)
  • 5/6 new card (class 0)

This imbalance can cause a problem like this: if I have to detect class 1, a classifier that always answers 0 will still get a very high accuracy, even though it is useless.

So the idea is to:

  • Rebalance the dataset
  • Use an evaluation metric other than accuracy (like the F1 score)
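
To make the accuracy problem concrete, here is a minimal sketch that scores a classifier which always answers 0. It assumes roughly one class 1 sample for every five class 0 samples; the labels are synthetic and the two metric helpers are written out by hand rather than taken from any library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1 of the positive class; defined as 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Synthetic labels: one class 1 sample for every five class 0 samples.
y_true = [1, 0, 0, 0, 0, 0] * 10
# A classifier that ignores its input and always answers 0.
y_pred = [0] * len(y_true)

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")  # accuracy = 0.833
print(f"F1       = {f1_score(y_true, y_pred):.3f}")  # F1       = 0.000
```

The accuracy looks respectable while the F1 score of class 1 is zero, which is why accuracy alone is misleading here.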

Dataset Rebalancing

I tried two alternatives:

  • Oversample class 1 (with the SMOTE algorithm)
  • Aggregate class 0 by subject in order to balance the dataset
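
The aggregation alternative could look something like the sketch below. It assumes that "aggregate" means averaging all class 0 rows of a subject into a single row, which is only one possible reading; the column names (`subject`, `label`, `feature_a`) and the values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_class0_by_subject(rows, feature_names):
    """Keep class 1 rows as they are and average the class 0 rows per subject."""
    result = [row for row in rows if row["label"] == 1]
    class0_by_subject = defaultdict(list)
    for row in rows:
        if row["label"] == 0:
            class0_by_subject[row["subject"]].append(row)
    for subject, group in class0_by_subject.items():
        merged = {"subject": subject, "label": 0}
        for name in feature_names:
            merged[name] = mean(row[name] for row in group)
        result.append(merged)
    return result


# Made-up data: two subjects, each with one class 1 row and five class 0 rows.
rows = []
for subject in ("s01", "s02"):
    rows.append({"subject": subject, "label": 1, "feature_a": 10.0})
    for value in (1.0, 2.0, 3.0, 4.0, 5.0):
        rows.append({"subject": subject, "label": 0, "feature_a": value})

balanced = aggregate_class0_by_subject(rows, ["feature_a"])
print(len(rows), "rows before,", len(balanced), "rows after")
```

After aggregation each subject contributes one row per class, so the two classes have the same size, at the cost of a much smaller dataset.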

Oversampling also causes a problem when it is combined with cross-validation. If I oversample the dataset before the cross-validation split, there is a risk of having the same data in multiple folds (the problem is bigger with the copying method, but still present with SMOTE). With SMOTE, I would oversample, then split the data and pretend that the synthetic values are not fabricated.
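
A common way to avoid this leak is to split first and oversample only the training part of each fold. The sketch below shows that ordering with the copying method rather than SMOTE, to keep it free of dependencies; it is not necessarily what this project ended up doing, and the helper names and synthetic labels are hypothetical.

```python
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def oversample_by_copying(train_idx, labels, minority=1, seed=0):
    """Duplicate random minority indices until both classes have the same size."""
    rng = random.Random(seed)
    minority_idx = [i for i in train_idx if labels[i] == minority]
    majority_idx = [i for i in train_idx if labels[i] != minority]
    if not minority_idx:
        return list(train_idx)
    n_missing = max(0, len(majority_idx) - len(minority_idx))
    extra = [rng.choice(minority_idx) for _ in range(n_missing)]
    return list(train_idx) + extra


# Synthetic labels: one class 1 sample for every five class 0 samples.
labels = [1, 0, 0, 0, 0, 0] * 10
folds = k_fold_indices(len(labels), n_folds=5)

splits = []
for k, val_idx in enumerate(folds):
    # Training indices are all folds except the current validation fold.
    train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
    # Oversampling happens after the split and only touches the training part.
    balanced_train = oversample_by_copying(train_idx, labels, seed=k)
    splits.append((balanced_train, val_idx))
    print(f"fold {k}: train={len(balanced_train)} val={len(val_idx)}")
```

Because the copies are drawn only from training indices, no validation sample (or duplicate of one) can appear in the training data, and the validation fold keeps its original class proportions.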

The right way
