
Comments (4)

pdevine commented on June 30, 2024

@UmutAlihan we've actually been building out a test farm to better catch these issues before we release, but there are a lot of different permutations to test. Stability is incredibly important to us.

That said, in 0.1.33 we were trying to improve our memory calculation so we could pack models in more efficiently. Sometimes we weren't calculating enough space, and some layers were being allocated to the GPU when they should have been allocated to the CPU. The problem is that if we're too conservative, performance suffers because more layers get sent to the CPU, and then there will be a dozen issues with people complaining about slow performance.

Unfortunately I don't have a 4070 Ti Super to test on. I think what's happening is that the model is close to the size of your VRAM and we're not calculating the memory graph correctly with gemma. I'll double-check with some other people on the team.
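The tradeoff described above (reserve too little VRAM and layers spill onto the GPU incorrectly; reserve too much and layers fall back to the slower CPU path) can be illustrated with a small sketch. This is a hypothetical simplification, not ollama's actual allocator; the function name, the fixed per-layer size, and the reserve figure are all made up for illustration.

```python
# Hypothetical sketch of a GPU/CPU layer-split estimate: given total VRAM,
# a per-layer weight size, and a reserve for the compute graph / KV cache,
# decide how many transformer layers go on the GPU and how many spill to CPU.

def gpu_layer_split(vram_bytes, n_layers, layer_bytes, graph_reserve_bytes):
    """Return (gpu_layers, cpu_layers) that fit within VRAM.

    A graph_reserve_bytes that is too small reproduces the failure mode
    discussed above: layers land on the GPU that should have gone to the CPU.
    A reserve that is too large is the "too conservative" case, where
    performance suffers from unnecessary CPU offload.
    """
    usable = vram_bytes - graph_reserve_bytes
    if usable <= 0:
        return 0, n_layers
    gpu_layers = min(n_layers, usable // layer_bytes)
    return gpu_layers, n_layers - gpu_layers

GiB = 1024 ** 3
# Illustrative numbers only: a 12 GiB card, 33 layers of ~420 MiB each,
# with 1.5 GiB held back for the compute graph.
gpu, cpu = gpu_layer_split(12 * GiB, 33, 420 * 1024 ** 2, int(1.5 * GiB))
print(f"GPU layers: {gpu}, CPU layers: {cpu}")
```

With these made-up numbers, a handful of layers spill to the CPU; shrink the reserve and they all "fit", which is exactly the miscalculation being described.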


UmutAlihan commented on June 30, 2024

Yes, after the 0.1.33 release many things have broken.

Unfortunately, I think contributors are trying to move so fast that they are unable to test with adequate coverage or write clean, quality code.

I was very hopeful for ollama and its community, but if this FOMO release cycle keeps breaking things, I might need to switch back to LiteLLM or other alternatives :'/


UmutAlihan commented on June 30, 2024

Well, thank you for the detailed response 🫡

I am using 2x 3060s, and yes, llama3 8b is loading into the 24 GB of VRAM at around 80% utilization. So I can assume your root-cause analysis is correct, and I hope more users would prefer stability over performance 🙏


oldmanjk commented on June 30, 2024

Yes, after the 0.1.33 release many things have broken.

Unfortunately, I think contributors are trying to move so fast that they are unable to test with adequate coverage or write clean, quality code.

I was very hopeful for ollama and its community, but if this FOMO release cycle keeps breaking things, I might need to switch back to LiteLLM or other alternatives :'/

100%

