Comments (5)
Thank you for this interesting work. I have trained this model from scratch on medical images. At evaluation time, all eight output masks consistently show the same structure, differing only in intensity. Have you seen this issue with natural images? Any idea what could cause it? Thank you.
from groupvit.
Hi @aryaabdi,
Did you manage to solve the problem you mentioned? I'm seeing similar behavior.
I get the same issue. @xvjiarui Can you help please?
Hi all,
Sorry for the late reply. If you are training on domain-specific images, I would suggest pre-training on large-scale natural images first. Also, the contrastive loss needs a large batch size and a large dataset to work.
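To make the batch-size point concrete, here is a minimal sketch of a generic symmetric InfoNCE-style image-text loss in plain Python. It is an illustration under assumptions, not the repository's implementation: the function name `info_nce`, the default `temperature`, and the list-of-lists similarity matrix are all hypothetical choices made for this example.

```python
import math


def info_nce(sim, temperature=0.07):
    """Symmetric InfoNCE-style loss for a square similarity matrix.

    sim[i][j] is the similarity between image i and text j. The diagonal
    holds the matched pairs; every other entry in a row or column is
    treated as a negative. (Hypothetical helper, for illustration only.)
    """
    n = len(sim)

    def cross_entropy_rows(m):
        total = 0.0
        for i in range(n):
            logits = [m[i][j] / temperature for j in range(n)]
            peak = max(logits)  # subtract the max for numerical stability
            log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
            total += log_sum - logits[i]  # -log softmax of the matched pair
        return total / n

    # Image-to-text uses the rows, text-to-image uses the columns.
    transposed = [[sim[j][i] for j in range(n)] for i in range(n)]
    return 0.5 * (cross_entropy_rows(sim) + cross_entropy_rows(transposed))
```

Each sample is contrasted against only the other `n - 1` items in the batch, so a small batch gives the loss few negatives to learn from. With uninformative (all-equal) similarities the loss equals `log(n)`, which is one way to sanity-check an implementation.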
I tried training both from scratch and from the model pre-trained on natural images. The latter performed better. However, I realized that the contrastive loss will not be effective if the number of distinct entities within a batch is limited. I believe @xvjiarui can use a very large batch size because the training dataset contains many different entities. For example, gcc3m contains ~16k different entities. That was not the case for my training dataset, and I think that is why I was not getting the desired behavior. Hope this helps.
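A rough way to see the effect of having few entities is to estimate how often two different samples in a batch share the same entity, since the contrastive loss would still treat such a pair as a negative. The sketch below is a simplification under loud assumptions: each sample carries exactly one entity, entities are drawn uniformly at random, and the function name `false_negative_rate` and its parameters are hypothetical.

```python
import random


def false_negative_rate(num_entities, batch_size, trials=200, seed=0):
    """Estimate the fraction of in-batch negative pairs that share an entity.

    Assumes one entity per sample, drawn uniformly at random; real
    datasets are far less uniform, so treat the result as a rough guide.
    """
    rng = random.Random(seed)
    collisions = 0
    pairs = 0
    for _ in range(trials):
        batch = [rng.randrange(num_entities) for _ in range(batch_size)]
        for i in range(batch_size):
            for j in range(batch_size):
                if i != j:
                    pairs += 1
                    # A "negative" pair that actually shows the same entity.
                    collisions += batch[i] == batch[j]
    return collisions / pairs


few = false_negative_rate(num_entities=8, batch_size=16)
many = false_negative_rate(num_entities=16000, batch_size=16)
print(f"8 entities: {few:.4f}, 16000 entities: {many:.6f}")
```

Under the uniform assumption the expected rate is `1 / num_entities`, so with only a handful of entities a sizeable share of the "negatives" are really positives, whereas with roughly 16k entities such collisions are rare.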