woctezuma / finetune-detr
Fine-tune Facebook's DETR (DEtection TRansformer) on Colaboratory.
License: MIT License
It is an 8-class dataset (9 with the non-object class) (https://bdd-data.berkeley.edu/).
First I converted it to the COCO format and then followed the Colab notebook.
I fine-tuned for 10 epochs with the standard learning rates and get no convergence at all.
Would that be an implementation error on my part, or do you think it is normal? Have you fine-tuned other datasets with more than 2 classes?
Any suggestions? I find it really bizarre that it doesn't converge, since most of the classes overlap with the original model's.
Hello @woctezuma
Thanks a lot for your valuable contributions. I am working on Deformable DETR, but there isn't much help with the fine-tuning part. I came across your fine-tuning DETR notebook, and it's amazing.
I wanted to ask what changes you made in your DETR fork to accommodate fine-tuning. If I delete the class_embed.weight and class_embed.bias layers in Deformable DETR, I get a size mismatch error.
I was hoping you could briefly explain the changes you made to the original DETR model, so that I can apply them to Deformable DETR.
It would be a huge help. Looking forward to your response!
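For reference, the usual recipe with plain DETR (and, as far as one can tell from the notebook, what its fork relies on) is to strip the classification head from the pretrained checkpoint and then load it with strict=False. A minimal sketch, assuming the official DETR-R50 checkpoint:

import torch

# Pretrained DETR-R50 checkpoint from the official DETR release.
checkpoint = torch.hub.load_state_dict_from_url(
    url="https://dl.fbaipublicfiles.com/detr/detr-r50-e632da11.pth",
    map_location="cpu",
    check_hash=True,
)

# Remove the classification head, whose shape depends on the number of
# classes (91 for COCO); everything else is kept as-is.
del checkpoint["model"]["class_embed.weight"]
del checkpoint["model"]["class_embed.bias"]

# Save the truncated checkpoint; main.py can --resume from it and load it
# with strict=False, leaving the new head randomly initialised.
torch.save(checkpoint, "detr-r50_no-class-head.pth")

Note that strict=False only tolerates missing or unexpected keys, not shape mismatches, which is why a tensor whose shape changed must be deleted from the checkpoint rather than left in place. In Deformable DETR, if I remember correctly, the head is a per-decoder-layer list (keys like class_embed.0.weight, class_embed.1.weight, ...), so the key names to delete differ from plain DETR's.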
How can we fine-tune on a dataset that contains quadrilateral (8-point) bounding boxes?
Hey there!
Thank you so much for sharing quality tutorials and codes.
I'd like to know the exact parts that are being updated during the fine-tuning stage.
Is the classification head (class_embed) the only part that is updated, while the backbone and the rest of the network are not?
I wonder whether the revised code is entirely contained in your gist (https://gist.github.com/woctezuma/e9f8f9fe1737987351582e9441c46b5d), or whether there are other parts that you changed for fine-tuning.
If that's the case, I wonder how you froze the whole network except for the last class_embed fc layer.
Thanks!
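Whether the notebook actually freezes anything is exactly what this question asks the author, but for anyone wondering how such a freeze could look in practice, here is a minimal sketch (one possible recipe, not a description of the gist) using the official hub model:

import torch

# Official DETR-R50 hub entry point, pretrained on COCO.
model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True)

# Freeze every parameter except the classification head.
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("class_embed")

# Build the optimizer only over the parameters that remain trainable.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)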
Hi, I'm sorry, I am confused by this statement:
Fine-tuning is recommended if your dataset has less than 10k images. Otherwise, training from scratch would be an option.
I know you're just repeating the advice of the DETR team here, but I hope you can help me clarify my understanding.
Wouldn't the model always benefit from transfer learning, conceptually? That is, wouldn't pre-training on COCO and then fine-tuning on a custom dataset always be better than training on the custom dataset from scratch, regardless of the size of the custom dataset?
I thought that was the whole point of transfer learning... Or is it that, because we are using a backbone pretrained on ImageNet (which is huge), fine-tuning on COCO, on another custom dataset, or on both does not make a big difference?
Many thanks in advance for your thoughts.
Hi, I am getting the following import error:
Traceback (most recent call last):
  File "main.py", line 13, in <module>
    import datasets
  File "/content/detr/datasets/__init__.py", line 5, in <module>
    from .coco import build as build_coco
  File "/content/detr/datasets/coco.py", line 14, in <module>
    import datasets.transforms as T
  File "/content/detr/datasets/transforms.py", line 13, in <module>
    from util.misc import interpolate
  File "/content/detr/util/misc.py", line 22, in <module>
    from torchvision.ops import _new_empty_tensor
ImportError: cannot import name '_new_empty_tensor' from 'torchvision.ops' (/usr/local/lib/python3.7/dist-packages/torchvision/ops/__init__.py)
I also structured the data folder as you suggest:
path_detr = '/content/drive/MyDrive/detr_final'
detr_final/
├ annotations/    # JSON annotations
│ ├ train.json
│ └ val.json
├ train_img/      # training images
└ val_img/        # validation images
And here is my call to main.py:
!python main.py \
--dataset_file "detr_final" \
--coco_path "/content/drive/MyDrive/detr_final" \
--output_dir "outputs" \
--resume "detr-r50_no-class-head.pth" \
--num_classes $num_classes \
--epochs 10
The only code from your notebook that I skipped is the following, but since I already had COCO-format annotations, I thought I could.
import convert as via2coco

data_path = '/content/VIA2COCO/'

for keyword in ['train', 'val']:
    input_dir = data_path + 'balloon/' + keyword + '/'
    input_json = input_dir + 'via_region_data.json'
    categories = ['balloon']
    super_categories = ['N/A']
    output_json = input_dir + 'custom_' + keyword + '.json'

    print('Converting {} from VIA format to COCO format'.format(input_json))

    coco_dict = via2coco.convert(
        imgdir=input_dir,
        annpath=input_json,
        categories=categories,
        super_categories=super_categories,
        output_file_name=output_json,
        first_class_index=first_class_index,  # defined earlier in the notebook
    )
Any ideas? Thanks a lot
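For what it's worth, this ImportError usually traces back to the version guard near the top of detr/util/misc.py, which does float(torchvision.__version__[:3]) < 0.7; with torchvision 0.10 the slice yields "0.1", so the legacy branch is taken and the long-removed _new_empty_tensor import fails. A sketch of a more robust guard (treat the exact line as an assumption about your checkout):

import torchvision
from packaging import version  # usually already available on Colab

# The original guard, float(torchvision.__version__[:3]) < 0.7, mis-parses
# two-digit minor versions: "0.10.0"[:3] == "0.1", so torchvision 0.10 is
# wrongly treated as < 0.7 and the legacy import below runs and fails.
if version.parse(torchvision.__version__.split("+")[0]) < version.parse("0.7"):
    from torchvision.ops import _new_empty_tensor
    from torchvision.ops.misc import _output_size

Alternatively, pinning a torchvision old enough to satisfy the original check sidesteps the patch entirely.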
First, I tried with batch size 2. Somewhere near the end of the 1st epoch, the process got killed due to memory usage.
Then I changed to batch size 1 and 1 epoch. I was able to finish training, but I observed that memory usage kept increasing during the process; around 25 GB of RAM was in use at the end.
Is this normal behavior for DETR, or is it an issue?
Hey, I have a set of images and I want to put them into the COCO dataset format. Can you tell me how to do that, i.e. how to build a proper annotation file like COCO's?
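For a rough idea of what such a file contains: a COCO-style detection annotation file is a single JSON object with three lists, "images", "annotations", and "categories". The file names, ids, and the single category below are made-up placeholders:

import json

# Minimal COCO-style detection annotations; adapt the entries to your
# own images and classes (all values here are illustrative).
coco = {
    "images": [
        {"id": 0, "file_name": "img_0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {
            "id": 0,
            "image_id": 0,       # refers to "images"
            "category_id": 0,    # refers to "categories"
            "bbox": [100, 120, 50, 80],  # [x, y, width, height] in pixels
            "area": 50 * 80,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 0, "name": "balloon", "supercategory": "N/A"},
    ],
}

with open("custom_train.json", "w") as f:
    json.dump(coco, f)

Converters such as the VIA2COCO script used in the notebook generate exactly this structure from other annotation formats.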
I made an attempt to change num_queries to 500, as my images contain approximately 450 objects, and received the following error:
Traceback (most recent call last):
  File "main.py", line 248, in <module>
    main(args)
  File "main.py", line 178, in main
    model_without_ddp.load_state_dict(checkpoint['model'], strict=False)
  File "/home/rsharma/git/detr/.venv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 847, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DETR:
	size mismatch for query_embed.weight: copying a param with shape torch.Size([100, 256]) from checkpoint, the shape in current model is torch.Size([500, 256]).
I followed all the steps and was able to get decent results, but as soon as I change num_queries, I am lost.
Any help is appreciated.
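For context: strict=False only ignores missing or unexpected keys; it does not allow copying a [100, 256] tensor into a [500, 256] slot, which is exactly the error above. One possible workaround, in the same spirit as stripping the class head in the notebook (the file names below are assumptions), is to delete the query embeddings from the checkpoint as well, so they are re-initialised at the new size:

import torch

# Checkpoint that already had its class head removed (name taken from
# the notebook's convention); apply the same trick to the object queries.
checkpoint = torch.load("detr-r50_no-class-head.pth", map_location="cpu")

# The pretrained model used 100 object queries, so this tensor is
# [100, 256] and cannot be loaded into a 500-query model.
del checkpoint["model"]["query_embed.weight"]

torch.save(checkpoint, "detr-r50_no-class-head_no-queries.pth")

The 500 query embeddings then start from random initialisation, so presumably training will take longer before the queries specialise.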
The bot created this issue to inform you that pyup.io has been set up on this repo.
Once you have closed it, the bot will open pull requests for updates as soon as they are available.