Comments (6)

Muennighoff avatar Muennighoff commented on May 18, 2024

Hey, there are some more details on mT0 fine-tuning here: #12
The config is here: #6 (comment)

from xmtf.

sh0tcall3r avatar sh0tcall3r commented on May 18, 2024

Thanks for the reply! I'll try the mentioned config.

sh0tcall3r avatar sh0tcall3r commented on May 18, 2024

Hey @Muennighoff, it seems there are still a couple of things I don't get. I would really appreciate it if you could give me a hand here.
I need to fine-tune your model mT0-xxl (not the initial T5X-xxl), so according to the manual https://github.com/google-research/t5x/blob/main/docs/usage/finetune.md I need 3 components (excluding the SeqIO Task, which is clear for now) to proceed:

  1. Checkpoint -- Could you please share the mT0-xxl checkpoint? In the manual, all the checkpoints used are TensorFlow weights, but on HuggingFace there are only PyTorch weights. So I need either an mT0-xxl checkpoint in TensorFlow, or a way to fine-tune the model in PyTorch (is that even possible?)
  2. Gin file for the model to fine-tune (mT0-xxl in this case) -- Could I use the default one, like https://github.com/google-research/t5x/blob/main/t5x/examples/t5/mt5/xxl.gin?
  3. Gin file configuring the fine-tuning process -- I write it on my own, based on https://github.com/google-research/t5x/blob/main/t5x/configs/runs/finetune.gin with some overrides, right?
    Please correct me if I'm wrong on any of these points.

Muennighoff avatar Muennighoff commented on May 18, 2024

There's a t5x ckpt here: https://huggingface.co/bigscience/mt0-t5x
I don't remember which size that model is though; I don't have the other ones anymore, maybe @adarob does

For 2. & 3., yes I think so
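
For anyone reading along, the three pieces discussed here (a checkpoint, the model gin file, and a run gin file with overrides) could be combined roughly as in the sketch below. This is only an illustration, not the config actually used for mT0. The helper `build_finetune_gin`, the module name `my_tasks`, the task name `my_task`, the checkpoint path, and all numeric values are hypothetical placeholders. The macro names follow the t5x fine-tuning manual linked earlier in the thread and should be checked against the t5x version in use.

```python
# Sketch: assemble the text of a fine-tuning gin file from the three components.
# All concrete values passed in below are placeholders, not the real mT0 setup.

def build_finetune_gin(task_module, task_name, checkpoint_path,
                       input_len, target_len, train_steps, dropout_rate,
                       model_gin="t5x/examples/t5/mt5/xxl.gin",
                       run_gin="t5x/configs/runs/finetune.gin"):
    """Return gin config text that includes the model and run configs
    and overrides the macros a fine-tuning run is expected to set."""
    lines = [
        "from __gin__ import dynamic_registration",
        # Importing the module is what registers the SeqIO Task.
        f"import {task_module}",
        "",
        f'include "{model_gin}"',   # component 2: model definition
        f'include "{run_gin}"',     # component 3: fine-tuning run defaults
        "",
        f'MIXTURE_OR_TASK_NAME = "{task_name}"',
        f'TASK_FEATURE_LENGTHS = {{"inputs": {input_len}, "targets": {target_len}}}',
        # In t5x this is the total step count, i.e. steps already in the
        # checkpoint plus the additional fine-tuning steps.
        f"TRAIN_STEPS = {train_steps}",
        f"DROPOUT_RATE = {dropout_rate}",
        f'INITIAL_CHECKPOINT_PATH = "{checkpoint_path}"',  # component 1
    ]
    return "\n".join(lines) + "\n"


config_text = build_finetune_gin(
    task_module="my_tasks",                      # hypothetical module
    task_name="my_task",                         # hypothetical SeqIO Task
    checkpoint_path="/path/to/t5x/checkpoint",   # placeholder path
    input_len=512,
    target_len=256,
    train_steps=1_000_000,
    dropout_rate=0.1,
)
print(config_text)
```

The generated text could be saved to a `.gin` file and passed to the t5x training script with `--gin_file`, with `MODEL_DIR` supplied separately. Whether `xxl.gin` matches the checkpoint linked above depends on that checkpoint's size, which is not confirmed in this thread.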

adarob avatar adarob commented on May 18, 2024

sh0tcall3r avatar sh0tcall3r commented on May 18, 2024

Thanks a lot, guys!
