Assignment 4: GoEmotions Pytorch

NOTE: Our assignment is split across two repositories. This repository contains the transfer learning experiments on the GoEmotions data. The repository for semi-supervised learning with the GAN-BERT architecture is at https://github.com/SamarthMM/ganbert-pytorch

Abstract

Data has become ubiquitous, but acquiring accurately annotated, high-quality data remains a challenge. In this paper, we examine the problem of limited labeled training data for NLP sentiment analysis tasks. We discuss the sentiment analysis task and its broader usage in different fields and on varied datasets, and we perform an extensive literature survey of the model architectures used for emotion classification. We perform transfer learning by training a BERT base cased model on the GoEmotions dataset and applying it to a Twitter dataset with zero-shot evaluation and one-shot fine-tuning. We then investigate the viability of using a GAN model as a semi-supervised technique to leverage unlabeled data.

This code has been adapted from monologg's implementation of GoEmotions. The original README has been copied over to README_1.md.

PyTorch Implementation of GoEmotions with Huggingface Transformers

Requirements

  • torch==1.4.0
  • transformers==2.11.0
  • attrdict==2.0.1

Hyperparameters

You can change the parameters in the JSON files in the config directory.

Parameter           Value
Learning rate       5e-5
Warmup proportion   0.1
Epochs              10
Max Seq Length      50
Batch size          16
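
To make the warmup proportion concrete, the sketch below converts it into a number of optimizer steps. It assumes the common convention that the warmup length is the proportion multiplied by the total number of training steps, with one optimizer step per batch, and it uses the 8000-example training file mentioned under Experiment 2 as an illustrative dataset size. The function name `warmup_steps` is hypothetical and is not part of this repository.

```python
import math


def warmup_steps(num_examples, batch_size, epochs, warmup_proportion):
    """Return (total_steps, warmup_steps) for a warmup schedule.

    Assumption: one optimizer step per batch (no gradient accumulation).
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    return total_steps, int(total_steps * warmup_proportion)


# Values from the table above, with 8000 training examples as an example size.
total, warmup = warmup_steps(
    num_examples=8000, batch_size=16, epochs=10, warmup_proportion=0.1
)
print(total, warmup)  # 5000 500
```

Under these assumptions, the learning rate would ramp up over the first 500 of 5000 steps before the schedule takes over.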

How to Run

Baseline (Twitter):

python3 run_goemotions.py --taxonomy twitter

Experiment 1:

For zero shot learning:

# First train the model on the GoEmotions group taxonomy
python3 run_goemotions.py --taxonomy group
# Then evaluate on the Twitter data
python3 Zero_Shot_Prediction.py --taxonomy twitter_zeroshot

Experiment 2:

For one shot learning:

# Set "train_file" to "labeled_200.tsv" or "labeled_8000.tsv" in config/twitter_frozenbert.json to train with 200 or 8000 examples respectively
python3 Retrain_Goemotions_classifier_layer.py --taxonomy twitter_frozenbert
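
If you switch between the two training files often, the config edit described above can be scripted. The following is a minimal sketch, assuming the config is a flat JSON object with a top-level "train_file" key as the comment above suggests. The helper name `set_train_file` is hypothetical and is not part of this repository.

```python
import json

# The two training files named in the comment above.
ALLOWED_TRAIN_FILES = {"labeled_200.tsv", "labeled_8000.tsv"}


def set_train_file(config_path, train_file):
    """Rewrite the "train_file" entry of a JSON config in place.

    Assumption: the config is a JSON object with a top-level "train_file" key.
    """
    if train_file not in ALLOWED_TRAIN_FILES:
        raise ValueError(
            f"expected one of {sorted(ALLOWED_TRAIN_FILES)}, got {train_file!r}"
        )
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    config["train_file"] = train_file
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return config
```

Call it with the path to the config file for this experiment, for example `set_train_file(path_to_config, "labeled_200.tsv")`, and then run the retraining command as usual.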

Plotting Graphs:

python3 Results.py --out_dir ckpt/<taxonomy>/<checkpoint directory> --taxonomy <name_of_plots>

# For example, the command below creates accuracy and macro F1 plots from the runs saved in the checkpoint directory 'ckpt/twitter/bert-base-cased-goemotions-twitter'. The plot names will start with 'twitter_unfrozen'. Here the --taxonomy option only sets the plot name prefix; it does not refer to the taxonomy used to generate the checkpoint results.
python3 Results.py --out_dir ckpt/twitter/bert-base-cased-goemotions-twitter --taxonomy twitter_unfrozen
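
For readers unfamiliar with the two plotted metrics, here is a small illustration of how accuracy and macro F1 are computed. It assumes single-label predictions and uses made-up labels; the repository's own metric code may differ (for example, it may rely on a library such as scikit-learn). The function name `accuracy_and_macro_f1` is hypothetical.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Compute accuracy and macro-averaged F1 for single-label predictions."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # Per-class F1 = 2TP / (2TP + FP + FN); defined as 0 if the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    # Macro averaging weights every class equally, regardless of its frequency.
    return accuracy, sum(f1_scores) / len(f1_scores)


# Illustrative labels only, not taken from the datasets used in this repository.
y_true = ["joy", "joy", "anger", "sadness"]
y_pred = ["joy", "anger", "anger", "sadness"]
acc, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
print(acc, round(macro_f1, 4))  # 0.75 0.7778
```

Because macro F1 averages per-class scores with equal weight, it penalizes poor performance on rare classes more than accuracy does, which is why the two plots can diverge.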
