
Comments (7)

bazilas commented on July 17, 2024

You could also check the following references for more tips:

http://research.microsoft.com/pubs/192769/tricks-2012.pdf (gradient optimization tips)

http://www.springer.com/computer/theoretical+computer+science/book/978-3-642-35288-1 (if you don't have access to the book, you can find the papers individually)

from matconvnet.

lenck commented on July 17, 2024

Hi,

A common practice is to tune the hyperparameters on the validation set, which in the case of CNNs includes the structure of the network itself. Because you are effectively optimising these parameters 'by hand', you can end up overfitting to the validation set.

That is why most challenges have a separate test set whose ground-truth labels are not released; evaluation on it is usually done on a remote server run by the organiser. This way, the organiser can limit the number of evaluations on the test set, which prevents any further overfitting (in computer vision this is the case for the Pascal VOC and ILSVRC datasets).

And if you want to prevent overfitting on the validation dataset, you can of course do any sort of cross-validation. It's just that you usually do not have access to the test labels...
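As a rough illustration of the cross-validation idea, here is a minimal Python sketch (not MatConvNet code). The names `k_fold_indices`, `cross_validate` and `train_and_score` are hypothetical; `train_and_score` stands in for whatever routine trains a model on the training indices and returns a score on the held-out fold.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the folds are reproducible.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        # Every fold except the i-th one is used for training.
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, val_idx


def cross_validate(n_samples, k, train_and_score):
    """Average the held-out score of `train_and_score` over the k folds.

    `train_and_score(train_idx, val_idx)` is a hypothetical callback that
    trains on `train_idx` and returns a score measured on `val_idx`.
    """
    scores = [train_and_score(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

Each sample is held out exactly once, so the averaged score depends less on one particular validation split. The test set is never touched here.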

There are also a lot of other resources on the internet describing this, e.g. this question...


johnny5550822 commented on July 17, 2024

Great. Sorry for being dumb; I should have avoided asking this. Thanks.

But one more question: which parameters can I tune on the validation set? Things like the learning rate, the number of hidden layers, etc.?


lenck commented on July 17, 2024

Why sorry? I had no clue about this either a year ago, so I'm quite happy to help :) But what I found is that Google usually knows the most in the end...
Yeah, exactly, what you list are the hyperparameters which people usually tune... Hyperparameters are the parameters that influence the training and the model architecture (number of hidden layers, number of hidden units, size of the convolution kernels and their stride, max pooling sizes, ... there's a gazillion of them in the end...).
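To make the tuning loop concrete, here is a minimal Python sketch of a grid search that picks hyperparameters by validation score. It is only an illustration, not MatConvNet code: `grid_search`, the search space, and `toy_validation_score` are all hypothetical, and the toy score is a placeholder for actually training a network and measuring its accuracy on the validation set.

```python
from itertools import product


def grid_search(grid, validation_score):
    """Try every combination in `grid` and keep the best-scoring one."""
    names = list(grid)
    best_config, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = validation_score(config)  # higher is better
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical search space; the values are for illustration only.
grid = {
    "learning_rate": [0.1, 0.01, 0.001],
    "num_hidden_layers": [1, 2, 3],
    "kernel_size": [3, 5],
}


def toy_validation_score(config):
    # Placeholder: a real version would train a model with `config`
    # and return its score on the validation set.
    return -(
        abs(config["learning_rate"] - 0.01)
        + abs(config["num_hidden_layers"] - 2)
        + abs(config["kernel_size"] - 5)
    )


best_config, best_score = grid_search(grid, toy_validation_score)
```

The number of combinations grows quickly with each added hyperparameter, which is one reason people often tune only a few at a time. The test set stays out of this loop entirely.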


johnny5550822 commented on July 17, 2024

Great! Thanks a lot. I am refreshing my memory and learning a lot too. :)


johnny5550822 commented on July 17, 2024

thanks!


lenck commented on July 17, 2024

Thanks for the tips too! Haven't heard about the "Neural Networks: Tricks of the Trade" book before...
