Comments (8)
Is it possible to finetune only a certain number of higher layers and leave the first x layers untouched? Somehow switch off the backpropagation for these x layers (to prevent overfitting)
It is possible but cannot be controlled from the API. See http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html and set the corresponding lr_mult values accordingly. FYI this is not in the API mostly because in my own experience finetuning the whole weights always yields better results, including when the risk of overfit is very high.
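For readers who want to try this by hand, here is a minimal sketch of setting lr_mult to 0 for chosen layers in a Caffe prototxt, which stops those layers from being updated. It is plain text processing, not part of the deepdetect API: the helper name freeze_layers and the sample layer names are hypothetical, and it assumes the usual layout where each layer's name line comes before its param blocks.

```python
import re

def freeze_layers(prototxt, frozen):
    """Return prototxt text with lr_mult set to 0 in the layers named in `frozen`."""
    out = []
    depth = 0       # current brace nesting level
    current = None  # name of the top-level layer being scanned
    for line in prototxt.splitlines():
        if depth == 1:
            m = re.match(r'\s*name:\s*"([^"]+)"', line)
            if m:
                current = m.group(1)
        if current in frozen:
            # Zero the learning-rate multiplier, leave decay_mult untouched.
            line = re.sub(r'(lr_mult:\s*)[0-9.eE+-]+', r'\g<1>0', line)
        depth += line.count("{") - line.count("}")
        if depth == 0:
            current = None  # left the layer block
        out.append(line)
    return "\n".join(out)

# Hypothetical two-layer network used only for illustration.
sample = '''layer {
  name: "conv1"
  type: "Convolution"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
}
layer {
  name: "fc8"
  type: "InnerProduct"
  param {
    lr_mult: 10
  }
}'''

result = freeze_layers(sample, {"conv1"})
print(result)
```

Only the layers listed in the set are changed, so the higher layers keep their original multipliers and continue to learn.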
Supposing I have a target dataset of 100k images where classes are similar to the imagenet dataset, does finetuning make more sense, or training a linear classifier (SVM) using the output from the second-to-last layer? I will be trying both out but any insight will be interesting and maybe helpful for other readers as well.
See the very useful information on this page: http://cs231n.github.io/transfer-learning/
Also is there a deepdetect mailing list for such discussions?
Gitter is the way we go, there's no in between for now. One reason is that China appears to be cut off from google groups.
from deepdetect.
@revilokeb there's no dedicated way yet through the API because my understanding is that it should not be so hard to do. This has been on my todolist though, so this issue is to be considered.
My understanding of how to do it is as follows:
- copy and change the .prototxt files (including deploy) from the original model in order to prepare the new model
- copy the weights from the original model repository into the new one
- train via API (using new repository as target) and with modified learning rate (etc...) as needed
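The first two steps above can be sketched as a small file-copy helper. This is only an illustration under assumptions: the function name prepare_finetune_repo is hypothetical, and it assumes the model definition files end in .prototxt and the weights in .caffemodel. Editing the copied prototxt files and the training call itself are left out.

```python
import shutil
from pathlib import Path

def prepare_finetune_repo(src, dst):
    """Copy network definitions and weights from `src` into a new repository `dst`."""
    src_dir, dst_dir = Path(src), Path(dst)
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # Assumed file extensions: *.prototxt for definitions, *.caffemodel for weights.
    for pattern in ("*.prototxt", "*.caffemodel"):
        for f in sorted(src_dir.glob(pattern)):
            shutil.copy2(f, dst_dir / f.name)
            copied.append(f.name)
    return copied
```

The copied .prototxt files still need to be changed by hand before training against the new repository.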
Now, still my understanding, when using the mlp template (which I assume is not a common case when learning from images, where finetuning has the most proven track record), simply copying the weights and using the template parameter and options via API should recreate a model with the new number of classes.
Actually, I forgot to document that it is possible to pass the weights parameter to the mllib object at service creation.
EDIT: fix location
Fixed the documentation so that the weights parameter appears among the options for the Caffe parameters_mllib, see http://www.deepdetect.com/api/?shell#create-a-service
@revilokeb I've successfully fine-tuned a Googlenet, so if this use-case is still useful for you, let me know if you have issues. Note that the prototxt unfortunately needs to be edited by hand at this point, since the name of the last fully connected layer of the net needs to be changed in order for Caffe to initialize it randomly (and re-learn it), see http://caffe.berkeleyvision.org/gathered/examples/finetune_flickr_style.html
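As a rough sketch of that manual edit, the snippet below renames one layer in a prototxt and sets its num_output to the new class count. Caffe copies weights by layer name, so a renamed layer finds no match in the original weights and starts from a fresh initialization. The helper name reset_last_layer, the layer names and the class count are hypothetical, and the code assumes the name line is the first field of the layer block.

```python
import re

def reset_last_layer(prototxt, old_name, new_name, nclasses):
    """Rename layer `old_name` and set its num_output to `nclasses`."""
    out, depth, in_target = [], 0, False
    name_re = r'\s*name:\s*"%s"\s*$' % re.escape(old_name)
    for line in prototxt.splitlines():
        if depth == 1 and re.match(name_re, line):
            in_target = True
            line = line.replace(old_name, new_name)
        elif in_target:
            # Only touch num_output inside the renamed layer.
            line = re.sub(r'(num_output:\s*)\d+', r'\g<1>%d' % nclasses, line)
        depth += line.count("{") - line.count("}")
        if depth == 0:
            in_target = False  # left the layer block
        out.append(line)
    return "\n".join(out)

# Hypothetical final layer used only for illustration.
sample = '''layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 1000
  }
}'''

result = reset_last_layer(sample, "fc8", "fc8_finetune", 20)
print(result)
```

The top blob name is left as-is on purpose, so any later layer that reads from it does not need to be edited.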
@beniz Thanks for the update, I will have a look and report back if I am running into any issues
Added the finetuning boolean API keyword for Caffe at service creation (see API documentation), alongside the weights parameter. It allows the automatic preparation of a model template for finetuning.
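For illustration, a service creation body using both keywords might look like the sketch below. Only finetuning and weights come from this thread; every other field name and value (template, nclasses, connector, the repository paths, the weights file name) is an assumption made for the example and should be checked against the API documentation before use.

```python
import json

# Hypothetical request body for service creation.
# Only "finetuning" and "weights" are taken from the discussion above.
payload = {
    "mllib": "caffe",
    "description": "finetuned image classifier",   # assumed
    "type": "supervised",                          # assumed
    "parameters": {
        "input": {"connector": "image"},           # assumed
        "mllib": {
            "template": "googlenet",               # assumed template name
            "nclasses": 20,                        # assumed class count
            "finetuning": True,
            "weights": "bvlc_googlenet.caffemodel" # assumed file name
        },
    },
    "model": {
        "repository": "/path/to/new/model",        # assumed path
        "templates": "../templates/caffe/",        # assumed path
    },
}

body = json.dumps(payload, indent=2)
print(body)
```

The sketch only builds and prints the JSON, it does not send it to a server.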
I am closing this issue for now as finetuning is working very well in my tests on a variety of image classification tasks.
Don't mean to reopen the issue but I have a few questions about finetuning through deepdetect and in general.
- Is it possible to finetune only a certain number of higher layers and leave the first x layers untouched? Somehow switch off the backpropagation for these x layers (to prevent overfitting)
- Supposing I have a target dataset of 100k images where classes are similar to the imagenet dataset, does finetuning make more sense, or training a linear classifier (SVM) using the output from the second-to-last layer? I will be trying both out but any insight will be interesting and maybe helpful for other readers as well.
Also is there a deepdetect mailing list for such discussions?
Thanks
FYI this is not in the API mostly because in my own experience finetuning the whole weights always yields better results, including when the risk of overfit is very high.
I think you are referring to the top red line which is discussed here
thanks again @beniz