Distilling ChatGPT for Explainable Automated Student Answer Assessment

Jiazheng Li, Lin Gui, Yuxiang Zhou, David West, Cesare Aloisi and Yulan He

Abstract

Providing explainable and faithful feedback is crucial for automated student answer assessment. In this paper, we introduce a novel framework that explores using ChatGPT, a cutting-edge large language model, for the concurrent tasks of student answer scoring and rationale generation. We identify the appropriate instructions by prompting ChatGPT with different templates to collect the rationales, where inconsistent rationales are refined to align with marking standards. The refined ChatGPT outputs enable us to fine-tune a smaller language model that simultaneously assesses student answers and provides rationales. Extensive experiments on the benchmark dataset show that the proposed method improves the overall QWK score by 11% compared to ChatGPT. Furthermore, our thorough analysis and human evaluation demonstrate that the rationales generated by our proposed method are comparable to those of ChatGPT. Our approach provides a viable solution to achieve explainable automated assessment in education.

Getting Started

Project structure:

CUE
├── README.md
├── environment.yml     # Conda environment     
├── train.py            # Train rationale generation model
├── train_cls.py        # Train classification baseline
├── dataset
│   ├── ...
│   └── README.md       # Please read this for dataset details
└── aera
    └── ...

Creating an environment from an environment.yml file

conda env create -f environment.yml

Define the paths where models are saved

In constants.py, two path constants are defined at the top:

DATAFOLDER = "/path/to/the/folder/"
CACHEFOLDER = "/path/to/the/folder/transfomers_cache/"

DATAFOLDER is the path where all trained models are saved, and CACHEFOLDER is the path where the transformers package stores its cached models.
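As a minimal sketch of how these two folders might be prepared before training, the snippet below creates both if they do not yet exist. The helper name `ensure_folders` and the temporary paths are illustrative assumptions and are not part of the repository. Replace them with the values you set in constants.py.

```python
import os
import tempfile
from typing import Tuple


def ensure_folders(data_folder: str, cache_folder: str) -> Tuple[str, str]:
    """Create both folders if they are missing and return their absolute paths.

    Hypothetical helper for illustration only; it is not defined in the repository.
    """
    resolved = []
    for folder in (data_folder, cache_folder):
        # exist_ok=True makes the call safe to repeat on every run.
        os.makedirs(folder, exist_ok=True)
        resolved.append(os.path.abspath(folder))
    return resolved[0], resolved[1]


# Example with throwaway paths; in practice pass DATAFOLDER and CACHEFOLDER.
_base = tempfile.mkdtemp()
data_dir, cache_dir = ensure_folders(
    os.path.join(_base, "models"),
    os.path.join(_base, "transformers_cache"),
)
```

Creating the folders up front avoids a failure late in a long training run when the first checkpoint is written.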

Example usage of AERA

Finetune classification baseline

python train_cls.py -d $dataset_name -b $batch_size -e $num_of_epochs -m $base_model -p $output_path -r $rounds_to_train

Example:

python train_cls.py -d asap-1 -b 16 -e 30 -m bert-base-uncased -p bert-base-uncased-asap-1 -r 5

Finetune rationale generation model

python train.py -d $dataset_name -b $batch_size -e $num_of_epochs -p $output_path -r $rounds_to_train

Example:

python train.py -d asap-1 -b 8 -e 30 -p longt5_large-asap-1 -r 5
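When launching several runs, the command templates above can be assembled programmatically. The following is a sketch only: `build_command` is a hypothetical helper, not part of the repository, and it uses just the flags shown in the templates (`-d`, `-b`, `-e`, `-m`, `-p`, `-r`). The `-m` flag is emitted only when a base model is given, mirroring the classification template.

```python
import shlex
from typing import List, Optional


def build_command(
    script: str,
    dataset: str,
    batch_size: int,
    epochs: int,
    output_path: str,
    rounds: int,
    base_model: Optional[str] = None,
) -> List[str]:
    """Return the argument list for one training run (hypothetical helper)."""
    cmd = ["python", script, "-d", dataset, "-b", str(batch_size), "-e", str(epochs)]
    if base_model is not None:
        # Only the classification template takes a base model via -m.
        cmd += ["-m", base_model]
    cmd += ["-p", output_path, "-r", str(rounds)]
    return cmd


# Reproduce the classification example as a printable shell command.
cls_cmd = build_command(
    "train_cls.py", "asap-1", 16, 30, "bert-base-uncased-asap-1", 5,
    base_model="bert-base-uncased",
)
print(shlex.join(cls_cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems.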

Cite our work

@misc{li2023distilling,
      title={Distilling ChatGPT for Explainable Automated Student Answer Assessment}, 
      author={Jiazheng Li and Lin Gui and Yuxiang Zhou and David West and Cesare Aloisi and Yulan He},
      year={2023},
      eprint={2305.12962},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

aera's Issues

Label refinement issue with gpt-4-turbo-preview

When using gpt-4-turbo-preview for rationale generation, the model occasionally disregards the provided ground-truth label hint and generates its own justification. E.g., "0 points; 2.5 points. The student's answer begins to touch on several key elements but does not fully or clearly articulate them according to the criteria."
Interestingly, with the latest gpt-3.5-turbo on the same dataset, no such deviations from the provided hint have been encountered.
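One way to flag such deviations automatically is to extract every "N points" mention from a generated rationale and compare it with the hinted label. The sketch below assumes scores appear in that textual form, as in the quoted output. The function names are hypothetical and not part of the repository, and a rationale that mentions point values for other reasons would also be flagged.

```python
import re
from typing import List

# Matches integers or decimals followed by "point" or "points".
_SCORE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*points?")


def extract_scores(rationale: str) -> List[float]:
    """Return every score mentioned as 'N point(s)' in the rationale text."""
    return [float(value) for value in _SCORE_PATTERN.findall(rationale)]


def follows_hint(rationale: str, hinted_score: float) -> bool:
    """True only if at least one score is found and all scores equal the hint."""
    scores = extract_scores(rationale)
    return bool(scores) and all(score == hinted_score for score in scores)


example = (
    "0 points; 2.5 points. The student's answer begins to touch on "
    "several key elements but does not fully or clearly articulate them "
    "according to the criteria."
)
print(extract_scores(example))
print(follows_hint(example, 0.0))
```

Rationales that fail this check could be regenerated or set aside for manual review.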
