Code for GenRead: Generate rather than Retrieve!

Introduction & Setup

  • This is the official implementation of our paper "Generate rather than Retrieve: Large Language Models are Strong Context Generators", published at ICLR 2023 [OpenReview] [arXiv].

  • Create an environment and install the openai package with pip install openai.

  • Set your OpenAI API key by assigning it to openai.api_key (line 12) in inference.py.
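
To avoid committing a key to version control, one option is to read it from an environment variable and assign the result to openai.api_key. The sketch below is only an illustration: the helper name load_api_key and the choice of the OPENAI_API_KEY variable are assumptions, not part of this repository.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    Both the helper and the default variable name are hypothetical.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# In inference.py one could then write (assumption):
# openai.api_key = load_api_key()
```

Failing early with a clear error is a design choice here; it surfaces a missing key before any API request is attempted.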

Download the Datasets

  • From their official websites: [NQ/TriviaQA/WebQ] / [FM2] / [FEVER/Wizard]

  • From Google Drive, where we unified the formats of the above datasets: [link]

  • Put them in the indataset folder, which currently contains webq and fm2.

Zero-shot Setting

Step1: generate background document.

python mainfunc.py \
  --dataset {dataset} \
  --task step1 \
  --split test

  • Note: our experiments use text-davinci-002. In the zero-shot setting we use greedy decoding so that the results are reproducible.

  • Note: if you have limited access to the OpenAI API, you can use our outputs directly instead of paying to reproduce our experiments. [zero-shot: step1]

Step2: infer answer from document.

python mainfunc.py \
  --dataset {dataset} \
  --task step2 \
  --split test

  • Trick: we remove the newline characters (\n) from the generated documents.

  • Note: if you have limited access to the OpenAI API, you can use our outputs directly instead of paying to reproduce our experiments. [zero-shot: step2]
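
The newline trick mentioned above can be sketched in a few lines. This is a minimal illustration, not the repository's code: the helper name flatten_document is hypothetical, and joining the lines with a single space (rather than deleting the newline outright) is an assumption.

```python
def flatten_document(text: str) -> str:
    """Join the non-empty lines of a generated document with single spaces."""
    # splitlines() handles \n as well as \r\n; blank lines are dropped.
    lines = (line.strip() for line in text.splitlines())
    return " ".join(line for line in lines if line)


doc = "Paris is the capital of France.\n\nIt lies on the Seine.\n"
flat = flatten_document(doc)
```

Keeping each document on one line also makes it straightforward to store one document per line in an output file.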

Supervised Setting

Method1: use sampling to generate multiple documents.

python mainfunc.py \
  --dataset {dataset} \
  --task step1 \
  --split test \
  --num_sequence 10 \
  --temperature 0.95

  • Sampling-based decoding is non-deterministic, so your outputs may not exactly match the ones we provide. [supervised: sampling]
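
The two decoding regimes in this README (greedy decoding in the zero-shot setting, sampling with --num_sequence 10 and --temperature 0.95 here) can be contrasted with a small sketch. It only builds request parameters and does not call the API. The helper name decoding_params, the max_tokens value, and the mapping of the command-line flags onto the temperature and n parameters are assumptions for illustration, not the repository's actual code.

```python
def decoding_params(num_sequence: int = 1, temperature: float = 0.0,
                    max_tokens: int = 300) -> dict:
    """Build hypothetical completion parameters for one prompt.

    A temperature of 0 approximates greedy decoding; a positive temperature
    with n > 1 requests several sampled documents for the same question.
    """
    if num_sequence < 1:
        raise ValueError("num_sequence must be at least 1")
    if temperature == 0.0 and num_sequence > 1:
        # Greedy decoding would return near-identical copies.
        raise ValueError("use temperature > 0 when requesting several documents")
    return {
        "model": "text-davinci-002",  # model named in this README
        "temperature": temperature,
        "n": num_sequence,
        "max_tokens": max_tokens,     # assumed value
    }


greedy = decoding_params()
sampled = decoding_params(num_sequence=10, temperature=0.95)
```

Because sampled outputs vary between runs, storing the generated documents alongside the parameters used to produce them makes later comparison easier.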

Method2: use clustering to generate diverse documents.

python clusterfunc.py \
  --dataset {dataset} \
  --task step1 \
  --split {split} \
  --num_sequence 1 \
  --temperature 0.95 \
  --clustering

  • Because the in-context demonstrations differ between runs, your outputs may not exactly match the ones we provide. [supervised: clustering]
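
This README does not describe the internals of clusterfunc.py. One plausible reading of clustering-based generation is that question-document pairs are grouped into clusters and a few pairs are drawn from each cluster as in-context demonstrations, so that each prompt steers the model toward a different kind of document. The sketch below assumes the cluster assignments already exist; the function name, its arguments, and the number of demonstrations are all hypothetical.

```python
import random
from collections import defaultdict


def sample_demos_per_cluster(pairs, cluster_ids, demos_per_cluster=5, seed=0):
    """Pick up to `demos_per_cluster` (question, document) pairs per cluster.

    `pairs` and `cluster_ids` are parallel lists; the cluster ids are assumed
    to come from an earlier clustering step that is not shown here.
    """
    if len(pairs) != len(cluster_ids):
        raise ValueError("pairs and cluster_ids must have the same length")
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    by_cluster = defaultdict(list)
    for pair, cid in zip(pairs, cluster_ids):
        by_cluster[cid].append(pair)
    return {
        cid: rng.sample(members, min(demos_per_cluster, len(members)))
        for cid, members in sorted(by_cluster.items())
    }


pairs = [(f"question {i}", f"document {i}") for i in range(6)]
cluster_ids = [0, 0, 0, 1, 1, 1]
demos = sample_demos_per_cluster(pairs, cluster_ids, demos_per_cluster=2)
```

Without a fixed seed the selected demonstrations change from run to run, which is consistent with the non-determinism noted above.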

Fusion-in-Decoder: train a reader model to infer answers from documents

  • We use the FiD code from its official GitHub repository [link].

  • Download our trained FiD checkpoints from the Hugging Face Hub.

    git lfs install
    git clone https://huggingface.co/wyu1/GenRead-3B-NQ
    
    git lfs install
    git clone https://huggingface.co/wyu1/GenRead-3B-TQA
    
  • If you need checkpoints on other settings, please email [email protected]
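
A FiD reader consumes JSON examples that pair a question with a list of context passages. The sketch below packages generated documents into that shape. The field names (id, question, target, answers, and ctxs with title and text) follow the data format of the FiD repository as recalled here and should be checked against it; the helper name to_fid_example and the empty title are assumptions.

```python
import json


def to_fid_example(question, answers, documents, example_id="0"):
    """Wrap generated documents as the context passages of one FiD example."""
    if not answers:
        raise ValueError("at least one reference answer is required")
    return {
        "id": example_id,
        "question": question,
        "target": answers[0],      # assumed: first answer used as target
        "answers": list(answers),
        # Generated documents have no title, so an empty string is used.
        "ctxs": [{"title": "", "text": doc} for doc in documents],
    }


example = to_fid_example(
    question="who wrote hamlet",
    answers=["William Shakespeare"],
    documents=["Hamlet is a tragedy written by William Shakespeare."],
)
serialized = json.dumps([example], indent=2)
```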

Citation

@inproceedings{yu2023generate,
  title={Generate rather than retrieve: Large language models are strong context generators},
  author={Yu, Wenhao and Iter, Dan and Wang, Shuohang and Xu, Yichong and Ju, Mingxuan and Sanyal, Soumya and Zhu, Chenguang and Zeng, Michael and Jiang, Meng},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2023}
}

Please cite our paper if you find the paper and code helpful.
