Big5-Project

Data

First Impressions Data (2017):

https://chalearnlap.cvc.uab.cat/dataset/24/description/

  • train-1~6.zip
  • val-1~2.zip
  • test-1~2.zip
  • gt (ground truth): 'extraversion', 'neuroticism', 'agreeableness', 'conscientiousness', 'openness'

Each zip folder contains about 960 clips; each clip is 15 seconds long.


Google Drive Data Description

  • Preprocessed data
    • data_v1: 6 random frames per video, frame size 128
      • trian_set.dat (train-1.zip)(734)
      • valid_set.dat (val-1.zip)(764)
      • test_set.dat (test-1~2.zip)(1845)
    • data_v2: 6 random frames per video, frame size 128
      • train_set.dat(train-2~6.zip)(4300)
    • data_v3: 6 random frames per video, frame size 128, tries not to drop any videos
      • train_set.dat(train-2~6.zip)(5034)
      • valid_set.dat (val-1.zip)(960)
      • valid_set_2.dat (val-2.zip)(960)
      • test_set.dat (test-1~2.zip)(1998)
    • latest_data
      • 30_256px: train and valid data are saved individually by video name. 30 frames (sampled at equal intervals), frame size 256.
  • Train
    • train: all videos in train-1.zip (960)
    • train13: all videos in train-2~6.zip (5040)
    • train_6000: all videos in trainset (6000)
  • Valid
    • valid: all videos in val-1.zip (960)
    • valid_2: all videos in val-2.zip (960)
    • valid_small: a few videos from val-1.zip (960)
  • Test
    • test: all videos in testset (2000)

Main

  • big5.ipynb
    • regression
    • criterion: accuracy = 1 - MAE (see the metric sketch after this list)
  • big5_classification.ipynb
    • classification
    • criterion: accuracy, precision, recall
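
A minimal sketch of the two evaluation criteria above (function and variable names are illustrative, not taken from the notebooks):

    import numpy as np

    def regression_accuracy(pred, target):
        # big5.ipynb criterion: accuracy = 1 - mean absolute error
        # (trait values are in [0, 1], averaged over the 5 traits)
        return 1.0 - np.mean(np.abs(pred - target))

    def classification_metrics(pred, target):
        # big5_classification.ipynb criteria for binary (0/1) labels
        tp = np.sum((pred == 1) & (target == 1))
        fp = np.sum((pred == 1) & (target == 0))
        fn = np.sum((pred == 0) & (target == 1))
        accuracy = np.mean(pred == target)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        return accuracy, precision, recall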

Code Cell Description

  • extract folder
    • used only to extract the zipped data folders in Drive
  • Packages*
    • remember to rerun these if the Colab runtime shuts down
  • Configuration*
    • settings for the model, training, and data preprocessing
    • define root directory
    • define checkpoint & logs name: date, note
  • Helper
    • functions to detect face, extract audio and frames
    • import dlib
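    • a minimal sketch of what these helpers might look like (dlib + OpenCV; the function name and the frame-sampling strategy are assumptions, not the exact notebook code):

        import cv2
        import dlib

        detector = dlib.get_frontal_face_detector()

        def extract_face_frames(video_path, n_frames=6, size=128):
            # read frames from the clip, crop the first detected face,
            # convert to gray scale and resize to size x size
            cap = cv2.VideoCapture(video_path)
            total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
            faces = []
            for idx in range(0, total, max(total // n_frames, 1)):
                cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
                ok, frame = cap.read()
                if not ok:
                    continue
                gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                rects = detector(gray, 1)
                if rects:
                    r = rects[0]
                    crop = gray[max(r.top(), 0):r.bottom(), max(r.left(), 0):r.right()]
                    if crop.size:
                        faces.append(cv2.resize(crop, (size, size)))
            cap.release()
            return faces[:n_frames]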
  • Build New Dataset
    • use the functions in Helper to build the train, valid, and test datasets
    • a list of tuples is saved as .dat (see the sketch below)
    • never run this if you are using existing .dat files
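    • a minimal sketch of how such a .dat could be written (assuming it is a plain pickle of (frames, ground-truth) tuples; the real serialization may differ):

        import pickle
        import numpy as np

        def build_dataset(video_paths, big5_labels, out_file):
            # video_paths: list of clip paths; big5_labels: matching lists of 5 trait values
            samples = []
            for path, big5 in zip(video_paths, big5_labels):
                frames = extract_face_frames(path)      # helper sketched above
                if frames:
                    samples.append((np.stack(frames), np.asarray(big5, dtype=np.float32)))
            with open(out_file, 'wb') as f:             # list of tuples saved as .dat
                pickle.dump(samples, f)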
  • Dataloader*
    • PyTorch Dataset for later use
      • dim: input channel, default = 1
    • return
        sample = {'images': images,             # size: N frames, H, W
                  'label': float(class_label),  # 0. or 1.  
                  'audio': audio,               # None now
                  'uid': uid,                   # video name
                  'value': org_value}           # original annotation value
                                                # in big5.ipynb, 'label' is the original 5 values
    • augmentation (horizontal flip)
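    • a minimal sketch of such a Dataset (the sample layout and flip probability are assumptions; the real class has more options):

        import random
        import torch
        from torch.utils.data import Dataset

        class Big5Dataset(Dataset):
            def __init__(self, samples, train=True):
                # samples: list of (frames, class_label, uid, org_value) tuples
                self.samples = samples
                self.train = train

            def __len__(self):
                return len(self.samples)

            def __getitem__(self, idx):
                frames, class_label, uid, org_value = self.samples[idx]
                images = torch.as_tensor(frames, dtype=torch.float32)   # N frames, H, W
                if self.train and random.random() < 0.5:                # horizontal flip
                    images = torch.flip(images, dims=[-1])
                return {'images': images,
                        'label': float(class_label),
                        'audio': None,                                  # None now
                        'uid': uid,
                        'value': org_value}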
  • Load data
    • read .dat
    • data structure in .dat
    # update 11/1
    # new structure: each .dat records only one video's preprocessed frames
    sample1 tuple(array[(30*256*256)])
    
    ##########
    # old structure: each video and true labels in a list
    list[
      # np arrays: preprocessed images, ground truth big 5
      sample1 tuple(array[(6*128*128)], array[(5)]),
      sample2 tuple(array[(6*128*128)], array[(5)]),
      sample3 ...
      ...
    ]  
    • in big5_classification.ipynb, find Q40 & Q60 (the 40th and 60th percentiles) and filter out the 20% of data in the center
    • no such filtering is needed for regression
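    • a minimal sketch of reading a .dat and applying the Q40/Q60 filter (assumes the old list-of-tuples pickle structure; the trait index is illustrative):

        import pickle
        import numpy as np

        with open('train_set.dat', 'rb') as f:
            data = pickle.load(f)                               # [(frames, big5), ...]

        trait = 0                                               # e.g. extraversion
        values = np.array([big5[trait] for _, big5 in data])
        q40, q60 = np.quantile(values, [0.4, 0.6])

        # classification: drop the middle 20% of the data and binarize the rest
        filtered = [(frames, float(big5[trait] >= q60))
                    for frames, big5 in data
                    if big5[trait] <= q40 or big5[trait] >= q60]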
  • Model*
    • model classes and helper functions
    • CNN block
    • ResNet
    • it is OK to run all of these cells
    • model input: Batch size, Channel (1 for gray scale), Depth (N frames), H, W
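    • a minimal sketch of the expected input shape, using a single Conv3d layer as a stand-in for the real blocks:

        import torch
        import torch.nn as nn

        x = torch.randn(4, 1, 6, 128, 128)    # Batch, Channel, Depth (N frames), H, W
        conv = nn.Conv3d(in_channels=1, out_channels=16, kernel_size=3, padding=1)
        print(conv(x).shape)                   # torch.Size([4, 16, 6, 128, 128])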
  • Build model
    • choose only one of the models to build
      • CNN Block
      • Resnet
      • Pretrain
    • Then run Optimizer & Checkpoint
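    • a minimal sketch of the Optimizer & Checkpoint step (Adam, the learning rate, and the checkpoint keys are assumptions):

        import torch
        import torch.nn as nn

        model = nn.Conv3d(1, 16, 3, padding=1)          # stand-in for the model built above
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

        # save / restore under the checkpoint name defined in Configuration
        torch.save({'model': model.state_dict(),
                    'optimizer': optimizer.state_dict()}, 'checkpoint.pt')
        state = torch.load('checkpoint.pt')
        model.load_state_dict(state['model'])
        optimizer.load_state_dict(state['optimizer'])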
  • Train Helper*
    • loss function
    • train & valid modules: feed batches into the model
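    • a minimal sketch of the train/valid module (the loss, e.g. nn.MSELoss for regression, and the batch keys are assumptions based on the Dataloader sample above):

        import torch

        def run_epoch(model, loader, criterion, optimizer=None):
            # optimizer given -> training step, otherwise validation only
            training = optimizer is not None
            model.train(training)
            total = 0.0
            for batch in loader:
                x = batch['images'].unsqueeze(1).float()   # add channel dim: B, 1, N, H, W
                y = batch['value'].float()                  # ground-truth big 5 values
                with torch.set_grad_enabled(training):
                    loss = criterion(model(x), y)
                    if training:
                        optimizer.zero_grad()
                        loss.backward()
                        optimizer.step()
                total += loss.item()
            return total / max(len(loader), 1)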
  • Train
    • TensorBoard for monitoring the training process; it may take a moment to show up
    • main: runs several rounds of training & validation (epochs)
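    • a minimal sketch of the main loop with TensorBoard logging (model, loaders, and the number of epochs come from earlier cells; tag names are illustrative):

        from torch.utils.tensorboard import SummaryWriter

        writer = SummaryWriter('logs/experiment_name')    # logs name defined in Configuration
        for epoch in range(30):
            train_loss = run_epoch(model, train_loader, criterion, optimizer)
            valid_loss = run_epoch(model, valid_loader, criterion)
            writer.add_scalar('loss/train', train_loss, epoch)
            writer.add_scalar('loss/valid', valid_loss, epoch)
        writer.close()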
  • Test
    • Run cells with * before testing: Packages, Config, Dataloader, Model, Train Helper
    • Then build test dataloader and model to inference
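    • a minimal sketch of the inference step (test_loader, model, and the checkpoint name come from the * cells; the output layout is illustrative):

        import torch

        state = torch.load('checkpoint.pt', map_location='cpu')
        model.load_state_dict(state['model'])
        model.eval()

        predictions = {}
        with torch.no_grad():
            for batch in test_loader:
                x = batch['images'].unsqueeze(1).float()    # B, 1, N, H, W
                out = model(x)
                for uid, pred in zip(batch['uid'], out):
                    predictions[uid] = pred.numpy()          # predicted trait values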

Resource

Code Source: https://github.com/grimmdaniel/personality-trait-prediction

Competition: ChaLearn 2017 Looking at People CVPR/IJCNN Competition

Leaderboard (results) and evaluation criterion: see the competition platform.
