Comments (7)
We have not explored DPO with LLaVA models. Could you share your results and example outputs before/after DPO so we can dig into it?
from vlfeedback.
The following are the results on the MME benchmark.
Model | MME perception | MME cognition | MME OCR |
---|---|---|---|
LLaVA-v1.5-7B + DPO | 1342 | 313 | 125 |
LLaVA-v1.5-13B + DPO | 1425 | 312 | 130 |
How many epochs have you trained with DPO?
The above results are from 1 epoch of training for the 7B model and 3 epochs of training for the 13B model.
I'm sorry for not getting back to you sooner. We also recently explored performing DPO training on the LLaVA backbone and observed degraded MME performance. However, the scores on other benchmarks have consistently improved.
Model | MM-Vet | MMHal | MMBench |
---|---|---|---|
LLaVA-v1.5-7B | 30.5 | 2.42 | 63.0 |
LLaVA-v1.5-7B + DPO | 31.7 | 2.62 | 63.9 |
We attribute the MME drop to the model no longer following the simple answer format that MME requires after DPO training, and we would like to investigate this later.
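One way to probe this hypothesis is to re-score the MME outputs with a lenient parser that tolerates verbose answers. The following is only a minimal sketch, not the official MME evaluation script: `normalize_yes_no` and `lenient_accuracy` are hypothetical helpers, and the matching rule (take the first standalone "yes" or "no" in the response) is an assumption.

```python
import re
from typing import Optional, Sequence

# Assumption: the first standalone "yes"/"no" word carries the model's answer.
_YES_NO = re.compile(r"\b(yes|no)\b", re.IGNORECASE)


def normalize_yes_no(response: str) -> Optional[str]:
    """Return 'yes' or 'no' for the first standalone match, else None."""
    match = _YES_NO.search(response)
    return match.group(1).lower() if match else None


def lenient_accuracy(responses: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of responses whose normalized answer equals the label."""
    if not responses:
        return 0.0
    correct = sum(
        normalize_yes_no(resp) == label.lower()
        for resp, label in zip(responses, labels)
    )
    return correct / len(responses)
```

If the lenient accuracy is clearly higher than the strict one, that would suggest the drop comes from answer formatting rather than from weaker perception or reasoning.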
Maybe you can add a prompt like this: query = f'<img>{img_path}</img>\n{question} you can only use "Yes" or "No" as your responses without adding any extra text or explanation.'
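As a rough sketch of that suggestion, the query can be built by a small function. `build_query` and `YES_NO_SUFFIX` are hypothetical names, and the `<img>...</img>` wrapper is copied from the prompt above; whether your model expects that image placeholder is an assumption, so adapt it to the model's own template.

```python
# Instruction suffix quoted from the suggested prompt.
YES_NO_SUFFIX = (
    'you can only use "Yes" or "No" as your responses '
    "without adding any extra text or explanation."
)


def build_query(img_path: str, question: str) -> str:
    """Build a query: image tag, then the question plus the format constraint."""
    # Assumption: the model accepts an <img>path</img> image reference.
    return f"<img>{img_path}</img>\n{question} {YES_NO_SUFFIX}"


if __name__ == "__main__":
    print(build_query("example.jpg", "Is there a dog in the image?"))
```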
Hi all, we found a great repo with support and results for many other models: https://github.com/TideDra/VL-RLHF
There, performance is boosted almost consistently for the LLaVA-Next series models. So my guess is that the current LLaVA-v1.5 series is too weak to serve as a starting point for DPO (possibly due to its lower input resolution, 336, vs. Qwen-VL). The LLaVA-Next series is more powerful thanks to its image tiling mechanism.
Check it out if you want to further explore the DPO/RLHF with VLFeedback!