Comments (4)

atiorh commented on June 20, 2024

Thanks for the note, @coder543! The medium models were hitting an edge case with the Neural Engine that we triaged away for now. Technically, you can still use https://github.com/argmaxinc/whisperkittools to prepare the medium and medium.en model assets and use them with cpuAndGPU compute units without issues. We decided not to fork the availability of models across different compute units, so that we preserve a non-leaky abstraction for seamless switching.

from whisperkit.

coder543 commented on June 20, 2024

Gotcha, that is unfortunate, since in my extensive testing of other Whisper apps on iPhone, the Medium model is the best one that can realistically run in real time over long durations. But, small is pretty good too, I guess!

atiorh commented on June 20, 2024

@coder543 Have you noticed large models being too slow? Would be great to get an example audio/video where it falls back in streaming mode on iPhone 12+. We are always looking to improve based on feedback and we can follow up when we improve performance.

coder543 commented on June 20, 2024

On the 15 Pro Max that I have, the large models run at an RTF of slightly greater than 1, and they’re just slow in general. The medium models are half the size, so they are just about perfect. When I’ve tested things more in Hello Transcribe over the months, the large models are tolerable and seem to barely keep up with real time… but I prefer the balance that medium provides, if I’m running a model on my phone. (On a powerful desktop, the large models are great.)

I didn’t spend too much time trying the distil models, but I’ve had mixed feelings about the accuracy of those models in past testing.
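
For readers less familiar with the RTF figure mentioned above, here is a minimal sketch of how a real-time factor is often computed. It assumes the common convention RTF = processing time / audio duration, under which values above 1 mean transcription falls behind the audio; some tools report the inverse (a speed factor), so check which one an app is showing. The function names and the sample timings are hypothetical and are not taken from WhisperKit or Hello Transcribe.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration.

    Assumed convention: RTF < 1 is faster than real time, RTF > 1 is slower.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def keeps_up_with_real_time(rtf: float) -> bool:
    """Under the convention above, streaming keeps up only when RTF <= 1."""
    return rtf <= 1.0


# Hypothetical timings, for illustration only (not measurements).
slow_rtf = real_time_factor(processing_seconds=66.0, audio_seconds=60.0)
fast_rtf = real_time_factor(processing_seconds=30.0, audio_seconds=60.0)
print(f"slow: {slow_rtf:.2f}, keeps up: {keeps_up_with_real_time(slow_rtf)}")
print(f"fast: {fast_rtf:.2f}, keeps up: {keeps_up_with_real_time(fast_rtf)}")
```

Under this convention, a model that sits slightly above 1 accumulates lag over a long session, which is one reason a smaller model with more headroom can feel better for long-running, on-device transcription.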
