Comments (2)
Not much of an issue anymore. The minute differences are due to precision differences in floating-point operations. Downcasting from FP32 to BF16/FP16 introduces a slight amount of numerical instability, which causes the engine to produce slightly different responses in otherwise deterministic settings.
This has not been an issue so far for downstream users, but I'll investigate a fix as soon as I'm able.
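To illustrate the kind of precision effect described above, here is a minimal sketch that emulates FP16 rounding with Python's standard `struct` module. It is not the engine's code: the helper names `to_fp16` and `fp16_sum` are hypothetical, and the assumption is only that reduced-precision accumulation can round differently depending on the order in which values are added.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fp16_sum(values):
    """Sum values while rounding to FP16 after every addition."""
    total = 0.0
    for v in values:
        total = to_fp16(total + to_fp16(v))
    return total


# Near 2048 the spacing between adjacent FP16 values is 2,
# so adding 1.0 to 2048.0 can be lost to rounding.
values = [2048.0, 1.0, 1.0]

forward = fp16_sum(values)             # large value first
backward = fp16_sum(reversed(values))  # small values first

print(forward, backward)
```

The same three numbers give two different totals depending on the order of addition. If the accumulation order varies between runs, small differences like this could end up changing which token is picked, even with otherwise deterministic sampling settings.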
from aphrodite-engine.
Yep, this is not a big issue in most use cases.