Comments (1)
Do you mean that your step time is 3x slower, or that it takes 3x longer to reach the target accuracy? The first would be unexpected, and I couldn't say why it happens without examining the training run with a profiler.
In general, you should not expect the same performance when going from float32 to bfloat16. With ViTs we found that the first Adam moment (the momentum) can safely be kept in bfloat16 (example config), but the second moment and the model weights need to be kept in float32.
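To make the dtype split concrete, here is a minimal sketch of an Adam step in plain JAX that stores the first moment in bfloat16 while keeping the second moment and the weights in float32. This is not the big_vision implementation; the function names `adam_init` and `adam_update` and the hyperparameter values are assumptions made for illustration. In optax, the `mu_dtype` argument of `optax.adam` serves the same purpose.

```python
import jax
import jax.numpy as jnp

tree_map = jax.tree_util.tree_map


def adam_init(params):
    """Hypothetical helper: first moment in bfloat16, second moment in float32."""
    mu = tree_map(lambda p: jnp.zeros_like(p, dtype=jnp.bfloat16), params)
    nu = tree_map(lambda p: jnp.zeros_like(p, dtype=jnp.float32), params)
    return {"mu": mu, "nu": nu, "count": jnp.zeros((), jnp.int32)}


def adam_update(params, grads, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam step; arithmetic runs in float32, only mu is stored in bfloat16."""
    count = state["count"] + 1
    f32 = jnp.float32

    # First moment: computed in float32, then cast down to bfloat16 for storage.
    mu = tree_map(
        lambda m, g: (b1 * m.astype(f32) + (1.0 - b1) * g.astype(f32)).astype(jnp.bfloat16),
        state["mu"], grads)
    # Second moment: kept in float32 throughout.
    nu = tree_map(
        lambda v, g: b2 * v + (1.0 - b2) * jnp.square(g.astype(f32)),
        state["nu"], grads)

    # Standard Adam bias correction.
    c = count.astype(f32)
    bc1 = 1.0 - b1 ** c
    bc2 = 1.0 - b2 ** c

    def step(p, m, v):
        m_hat = m.astype(f32) / bc1
        v_hat = v / bc2
        return p - lr * m_hat / (jnp.sqrt(v_hat) + eps)

    new_params = tree_map(step, params, mu, nu)  # weights stay float32
    return new_params, {"mu": mu, "nu": nu, "count": count}


# Toy usage with a single float32 weight matrix.
params = {"w": jnp.ones((4, 4), jnp.float32)}
grads = {"w": jnp.full((4, 4), 0.5, jnp.float32)}
state = adam_init(params)
params, state = adam_update(params, grads, state)
```

Only the stored momentum buffer shrinks to half size here; the second moment tracks squared gradients, which can be very small, so it stays in float32 together with the weights.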
from big_vision.