
Comments (3)

atiorh commented on May 11, 2024

Note that the Core ML decoder currently cannot process multiple tokens in a single forward pass, so we need to "decode the prompt" one token at a time. In the short term, this will likely slow down decoding for long prompts. Once we bring up the MLX backend, this should no longer be a problem.
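To illustrate the cost described above, here is a minimal Python sketch of feeding a prompt through a decoder that accepts only one token per forward pass. `ToyDecoder` and `prefill_prompt` are hypothetical names made up for this illustration, and the "cache" is a plain list standing in for real key/value tensors; none of this is WhisperKit or Core ML code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyDecoder:
    """Hypothetical stand-in for a decoder limited to one token per forward pass."""
    forward_passes: int = 0
    kv_cache: List[int] = field(default_factory=list)

    def step(self, token: int) -> int:
        # One forward pass: record this token's (fake) key/value entry.
        self.forward_passes += 1
        self.kv_cache.append(token)
        # Dummy "prediction"; a real model would return logits here.
        return (token + 1) % 1000


def prefill_prompt(decoder: ToyDecoder, prompt_tokens: List[int]) -> int:
    """Feed the prompt one token at a time and return the last step's output."""
    last = -1
    for token in prompt_tokens:
        last = decoder.step(token)
    return last


decoder = ToyDecoder()
prefill_prompt(decoder, [11, 22, 33, 44])
print(decoder.forward_passes)  # one pass per prompt token
```

The number of forward passes grows linearly with prompt length, which is where the expected slowdown for long prompts comes from. A backend that accepts the whole prompt at once would need a single pass for the same cache.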

from whisperkit.

atiorh commented on May 11, 2024

This is the only remaining feature before we are feature-complete with respect to the OpenAI API. We will implement this before 1.0. Thank you for bringing this up!


ZachNagengast commented on May 11, 2024

Yep, good callout @ldenoue, this is definitely needed for parity, and we have been tracking todos for when it is available.

We built a lookup table to address this for the common task and language combinations, which is the usePrefillCache option. Arbitrary text prompts, however, will require either generating the cache one token at a time, as @atiorh mentioned, or a new model that can generate prompt caches in a single forward pass, which will likely come from integrating MLX (#33). See this thread for a similar discussion of the issue: huggingface/transformers#23845 (comment)
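As a rough illustration of the trade-off between the lookup table and token-by-token cache generation, here is a toy Python sketch. Everything in it is an assumption made for illustration: `PREFILL_TABLE`, `forward_one_token`, and `prefill` are hypothetical names, strings stand in for key/value tensors, and the token ordering (prompt tokens placed before the start-of-transcript token) follows the usual Whisper prompt convention rather than anything confirmed in this thread. It is not how usePrefillCache is actually implemented.

```python
from typing import Dict, List, Optional, Tuple

# Toy precomputed caches keyed by (task, language). All values are illustrative.
PREFILL_TABLE: Dict[Tuple[str, str], List[str]] = {
    ("transcribe", "en"): ["kv:sot", "kv:en", "kv:transcribe"],
}


def forward_one_token(token: str, cache: List[str]) -> None:
    # Stand-in for a single-token forward pass that extends the cache.
    cache.append("kv:" + token)


def prefill(task: str, language: str,
            prompt_tokens: Optional[List[str]] = None) -> Tuple[List[str], int]:
    """Return (cache, number_of_forward_passes)."""
    # The table only covers fixed prefixes, so it is usable only without a prompt.
    if not prompt_tokens:
        cached = PREFILL_TABLE.get((task, language))
        if cached is not None:
            return list(cached), 0

    # Fallback: build the cache with one forward pass per token.
    sequence: List[str] = []
    if prompt_tokens:
        sequence += ["startofprev", *prompt_tokens]
    sequence += ["sot", language, task]

    cache: List[str] = []
    for token in sequence:
        forward_one_token(token, cache)
    return cache, len(sequence)
```

In this sketch, a table hit costs no forward passes, while a custom prompt changes the tokens that precede the task and language tokens, so the precomputed entry no longer applies and the whole sequence is processed step by step.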

In the meantime, the simplest way to go about this would be to allow the prefill prompt tokens to be set directly via DecodingOptions. That would enable arbitrary prompts, including startofprev and custom vocabulary words as you requested, but would require a forward pass for each token. What do you think?
