Comments (3)
Same issue for me.
from sppo.
Same here, using my adapted version of the algorithm, which trains with the model loaded in 4-bit quantization (https://github.com/kaykyr/SPPO):
{'loss': 139721.6528, 'learning_rate': 4.1666666666666667e-07, 'rewards/chosen': 0.0, 'rewards/rejected': 0.0, 'rewards/accuracies': 0.0, 'rewards/margins': 0.0, 'logps/rejected': -5881.09033203125, 'logps/chosen': -5823.2236328125, 'logits/rejected': -0.2584860622882843, 'logits/chosen': -0.2584828734397888, 'epoch': 0.25}
25%|█████████████████████████████████▎ | 10/40 [07:28<22:20, 44.68s/it]
Yes, the model does start with a large loss. In the SPPO loss from the paper, eta = 1/beta, and beta = 1e-3, so eta = 1000.
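To see why eta = 1000 gives a loss of this magnitude, here is a rough sketch. It assumes the squared-error form of the SPPO objective, (log-ratio - eta * (P - 1/2))^2, and hard win probabilities of 1 for the chosen response and 0 for the rejected one. The function name `sppo_term` and the hard probabilities are illustrative assumptions, not code from the repository.

```python
def sppo_term(log_ratio: float, win_prob: float, beta: float = 1e-3) -> float:
    """One squared-error term of the SPPO objective (illustrative sketch).

    log_ratio: log pi_theta(y|x) - log pi_ref(y|x) for a single response.
    win_prob:  estimated probability that the response beats the current policy.
    beta:      eta = 1 / beta, as stated in the comment above.
    """
    eta = 1.0 / beta
    return (log_ratio - eta * (win_prob - 0.5)) ** 2


# At the first step the policy equals the reference model, so the log ratio is 0.
# The zero rewards in the log above are consistent with that.
initial_chosen = sppo_term(log_ratio=0.0, win_prob=1.0)    # assumed hard label
initial_rejected = sppo_term(log_ratio=0.0, win_prob=0.0)  # assumed hard label

print(initial_chosen, initial_rejected)
```

Under these assumptions each term starts at 500^2 = 250000, so a logged loss on the order of 1e5 early in training is the expected scale rather than a sign of divergence.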