biglittledecoder's Introduction

Speculative Decoding with Big Little Decoder (BiLD)

This repo implements Speculative Decoding with Big Little Decoder (BiLD) on top of the HuggingFace framework.

Check out the paper for more details.

What is Big Little Decoder?

Big Little Decoder is a simple framework for faster generative inference. It can speed up text generation by roughly 2x without compromising generation quality across a variety of text generation scenarios. Furthermore, it is a plug-and-play solution that requires no training or architecture redesign.

Here's the key underlying idea:

  1. BiLD offloads the majority of simple word decisions to a smaller model and hands control back to the larger model only when needed.
  2. The small model "falls back" to the large model when it runs into a hard-to-predict word.
  3. If the small model makes a misstep, the large model can "roll back" the predictions to correct the error.
  4. This collaborative text generation combines the small model's fast autoregressive execution with the large model's accurate and efficient non-autoregressive execution.
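
The fallback and rollback policy described above can be sketched with toy stand-in models. This is a minimal illustration, not the repository's implementation: the function bild_generate, the dictionary-based toy models, and the specific criteria are assumptions made for this example (fall back when the small model's top probability drops below a threshold; roll back when the negative log-probability that the large model assigns to a drafted token exceeds a threshold). Please refer to the paper for the exact definitions.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in for a language model: maps the tokens generated so far
# to a probability distribution over the next token.
Model = Callable[[Sequence[str]], Dict[str, float]]
EOS = "</s>"


def bild_generate(small: Model, large: Model, fallback_threshold: float,
                  rollback_threshold: float, max_new_tokens: int = 20
                  ) -> Tuple[List[str], Dict[str, int]]:
    """Toy small/large collaborative decoding loop (assumed criteria, see lead-in)."""
    tokens: List[str] = []
    checked = 0  # tokens[:checked] were already reviewed or produced by the large model
    stats = {"small_steps": 0, "large_calls": 0, "rollbacks": 0}
    while len(tokens) < max_new_tokens and (not tokens or tokens[-1] != EOS):
        dist = small(tokens)
        stats["small_steps"] += 1
        token = max(dist, key=dist.get)
        if dist[token] >= fallback_threshold:
            tokens.append(token)  # confident: keep drafting with the small model
            continue
        # Fallback: the small model is unsure, so the large model takes over.
        # It first reviews the draft tokens it has not seen yet. A real model
        # would score all of them in one forward pass; the toy calls it per position.
        stats["large_calls"] += 1
        for i in range(checked, len(tokens)):
            large_dist = large(tokens[:i])
            nll = -math.log(max(large_dist.get(tokens[i], 0.0), 1e-12))
            if nll > rollback_threshold:
                # Rollback: discard the draft from position i onward.
                tokens = tokens[:i]
                stats["rollbacks"] += 1
                break
        # The large model picks the next token after the accepted (or truncated) draft.
        large_next = large(tokens)
        tokens.append(max(large_next, key=large_next.get))
        checked = len(tokens)
    return tokens, stats


def peaked(token: str, p: float, n_other: int = 4) -> Dict[str, float]:
    """Distribution with probability p on `token`, the rest spread over fillers."""
    rest = (1.0 - p) / n_other
    dist = {f"<other{k}>": rest for k in range(n_other)}
    dist[token] = p
    return dist


# Toy data: the small model is confidently wrong at position 1 ("dog")
# and unsure at position 2 (confidence 0.3).
TARGET = ["the", "cat", "sat", "on", "the", "mat", EOS]
DRAFT = ["the", "dog", "sat", "on", "the", "mat", EOS]
CONFIDENCE = [0.9, 0.8, 0.3, 0.9, 0.9, 0.9, 0.9]


def toy_large(context: Sequence[str]) -> Dict[str, float]:
    return peaked(TARGET[min(len(context), len(TARGET) - 1)], 0.9)


def toy_small(context: Sequence[str]) -> Dict[str, float]:
    i = min(len(context), len(DRAFT) - 1)
    return peaked(DRAFT[i], CONFIDENCE[i])


output, stats = bild_generate(toy_small, toy_large,
                              fallback_threshold=0.5, rollback_threshold=2.0)
print(" ".join(output))
print(stats)
```

In this toy run the small model drafts "dog" with high confidence, the large model rejects it at the next fallback, and the final sequence matches the large model's target even though the large model is consulted only twice. Tokens drafted after the last fallback are never reviewed in this sketch, which is one of its simplifications.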

Running BiLD for Machine Translation

Prerequisite

You need to prepare your own large and small models. You can either use HuggingFace's pretrained models as they are or finetune them on your target tasks. Please refer to HuggingFace's official instructions for more details on loading and finetuning pretrained models.

Evaluation

We provide a script that evaluates BiLD on machine translation tasks: examples/pytorch/run_bild_translation.py.

BiLD evaluation command:

CUDA_VISIBLE_DEVICES=0 python run_bild_translation.py --model bild --small [small_model_path] --large [large_model_path] \
    --dataset_name iwslt2017 --dataset_config iwslt2017-de-en --source_lang de --target_lang en --bild_rollback [RB] --bild_fallback [FB]
  • This command runs BiLD on the IWSLT 2017 De-En translation task.
  • [small_model_path] and [large_model_path] are the paths to the small and the large model, respectively (prepared as described in the prerequisite).
  • [RB] is the rollback threshold (values between 2 and 5 usually work well). [FB] is the fallback threshold, which takes a value between 0 and 1. For more details on these two hyperparameters, please refer to our paper.
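
Since [RB] and [FB] have to be chosen per task, it can help to generate one evaluation command per threshold pair. The following is a minimal sketch that only builds the command lines from the flags shown above and prints them without running anything. The helper name build_bild_command and the particular grid values are assumptions for illustration; the model paths are left as placeholders.

```python
import itertools
import shlex
from typing import List


def build_bild_command(small: str, large: str, rollback: float, fallback: float) -> List[str]:
    """Assemble the BiLD evaluation command for one (rollback, fallback) pair."""
    return [
        "python", "run_bild_translation.py",
        "--model", "bild",
        "--small", small,
        "--large", large,
        "--dataset_name", "iwslt2017",
        "--dataset_config", "iwslt2017-de-en",
        "--source_lang", "de",
        "--target_lang", "en",
        "--bild_rollback", str(rollback),
        "--bild_fallback", str(fallback),
    ]


# Illustrative grid only: rollback in the 2 to 5 range, fallback between 0 and 1.
rollback_values = [2, 3, 4, 5]
fallback_values = [0.3, 0.5, 0.7]

commands = [
    build_bild_command("[small_model_path]", "[large_model_path]", rb, fb)
    for rb, fb in itertools.product(rollback_values, fallback_values)
]

for command in commands:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print("CUDA_VISIBLE_DEVICES=0 " + shlex.join(command))
```

Each printed line can be run from the directory that contains run_bild_translation.py once the placeholder paths are replaced with real ones.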

We also provide a command for running the baseline model:

CUDA_VISIBLE_DEVICES=0 python run_bild_translation.py --model [model_path] \
    --dataset_name iwslt2017 --dataset_config iwslt2017-de-en --source_lang de --target_lang en 
  • [model_path] is the path to the baseline model (e.g. [small_model_path] or [large_model_path])

Pretrained Checkpoints

We provide finetuned checkpoints that were used for the evaluations in our paper.

Dataset            Model                 Link
IWSLT-2017-De-En   mT5-small             link
IWSLT-2017-De-En   mT5-small (aligned)   link
IWSLT-2017-De-En   mT5-large             link
WMT-2014-De-En     mT5-small             link
WMT-2014-De-En     mT5-small (aligned)   link
WMT-2014-De-En     mT5-large             link
XSUM               T5-small              link
XSUM               T5-small (aligned)    link
XSUM               T5-large              link
CNNDM              T5-small              link
CNNDM              T5-small (aligned)    link
CNNDM              T5-large              link

biglittledecoder's Issues

How to import T5_BiLD model in run_translation task.

System Info

I tried from transformers.models.t5.modeling_t5 import T5_BiLDModel, but the import doesn't work. I built the library from the transformers repo.

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Just run the translation task:

CUDA_VISIBLE_DEVICES=0 python run_bild_translation.py --model bild --small /nobackup/haozhang/BigLittleDecoder/models/smallmodel --large /nobackup/haozhang/BigLittleDecoder/models/bigmodel \
    --dataset_name iwslt2017 --dataset_config iwslt2017-de-en --source_lang de --target_lang en --bild_rollback 3 --bild_fallback 3

Expected behavior

The T5BiLDModel class cannot be imported.
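
One way to narrow down an import problem like this is to check which installed copy of a module Python actually loads and which names that copy exposes. The sketch below is a generic diagnostic, not something taken from the repository: the helper find_names is hypothetical, and it is demonstrated on the standard-library json module so that it runs anywhere.

```python
import importlib
from typing import List


def find_names(module_name: str, fragment: str) -> List[str]:
    """Return the public names in a module that contain `fragment`."""
    module = importlib.import_module(module_name)
    # The file path shows which installation is being imported.
    print(f"{module_name} loaded from {getattr(module, '__file__', 'unknown')}")
    return sorted(
        name for name in dir(module)
        if fragment in name and not name.startswith("_")
    )


# Demonstration on a standard-library module.
matches = find_names("json", "Decode")
print(matches)

# In the environment from the issue, the analogous check would be:
#   find_names("transformers.models.t5.modeling_t5", "BiLD")
```

If the printed path points to a different transformers installation than the one built from this repository, or if no matching names are listed, that would suggest the BiLD-enabled code is not the copy being imported.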
