Comments (9)
A large batch needs more RAM, so make sure enough memory is available while your process runs.
Also, if you want to train the network, make sure your hyperparameters are ones that allow it to train fully.
We currently recommend a batch size in the range 32 to 128. Those are the configurations we validate; others are experimental, and we plan to extend our support in the future.
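To illustrate why a larger batch needs more memory, here is a minimal sketch that computes the size of the input blob alone for a few batch sizes. The 3x224x224 image shape, the 4-byte float size, and the function names are assumptions for illustration, not values from this thread. Real memory use is much higher because every layer stores its own activations, but it grows with batch size in the same way.

```python
# Rough sketch: how the input blob alone grows with batch size.
# The 3x224x224 shape and 4-byte values are illustrative assumptions.

RECOMMENDED_BATCH_RANGE = (32, 128)  # range recommended in the comment above


def input_blob_bytes(batch_size, channels=3, height=224, width=224, bytes_per_value=4):
    """Return the bytes needed to hold one input batch (input blob only)."""
    return batch_size * channels * height * width * bytes_per_value


def in_recommended_range(batch_size):
    """Return True if batch_size falls inside the recommended range."""
    low, high = RECOMMENDED_BATCH_RANGE
    return low <= batch_size <= high


if __name__ == "__main__":
    for batch_size in (32, 128, 512):
        mib = input_blob_bytes(batch_size) / 2**20
        status = "recommended" if in_recommended_range(batch_size) else "experimental"
        print(f"batch {batch_size}: input blob ~{mib:.1f} MiB ({status})")
```

Doubling the batch size doubles this figure, so a job that fits at batch 128 may be killed for memory at much larger sizes.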
from caffe.
Oh, okay. I see that one of the main differences in using the Intel Phi over a GPU is that you have more memory and thus can achieve a larger batch size. I'll keep experimenting, but I think you are right: the job was being killed because it was consuming too much memory.
While I have you here: should hyperthreading speed up training? I have found that with hyperthreading, going from OMP_NUM_CORES=64 to OMP_NUM_CORES=256 actually slows Caffe down.
from caffe.
Try using HT, and don't use OMP_NUM_CORES.
The HT solution is being optimized (mostly for multinode). We are working to provide the best performance out of the box (without any additional commands or core restrictions). Those changes should be released in one month.
from caffe.
Hmmm. Okay. If I don't use OMP_NUM_THREADS then my job runs serially. Is HT another environment variable? There is a flag to enable hyperthreading when submitting the job, but it doesn't let you specify the number of threads to use. I was using OMP_NUM_THREADS because I was following the guidelines here: https://github.com/intel/caffe/wiki/Recommendations-to-achieve-best-performance
Is this the place to get help with this sort of thing, or is there a forum somewhere else I should be using?
P.S. I was accidentally logged in as my friend earlier.
from caffe.
HT is an environment variable, like the MCDRAM configuration.
We are focusing on researching those variables, providing a best-known configuration, and developing new features that use other configurations and give better performance, so our recommendations might change over time.
There is no forum. We use this GitHub repository for user support.
from caffe.
Is there still anything to be discussed here? Can this issue be closed?
from caffe.
Can I just confirm something before you do? What do you think may be causing the serial execution when I don't specify OMP_NUM_THREADS?
from caffe.
Could you describe this issue? We are moving from one topic to another here. It would be best to use a separate thread for each issue.
from caffe.
Of course. Above you suggested "don't use OMP_NUM_CORES." So I tried running without setting OMP_NUM_THREADS and, as I said above, when I do so Caffe seems to run serially, even when I have set the HT option on.
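When comparing runs with and without OMP_NUM_THREADS, it may help to log what the job actually sees at launch. The sketch below only records the variable and the number of logical CPUs visible to Python. The function name is hypothetical, and it does not explain why Caffe runs serially; it just makes the two configurations easy to compare.

```python
import os


def describe_threading(env=None):
    """Report OMP_NUM_THREADS (None if unset or not a plain integer)
    and the number of logical CPUs the operating system reports."""
    env = os.environ if env is None else env
    raw = env.get("OMP_NUM_THREADS")
    omp_threads = int(raw) if raw and raw.isdigit() else None
    return {"omp_num_threads": omp_threads, "logical_cpus": os.cpu_count()}


if __name__ == "__main__":
    info = describe_threading()
    print(f"OMP_NUM_THREADS = {info['omp_num_threads']}")
    print(f"logical CPUs    = {info['logical_cpus']}")
```

Running this inside the same job script, once with the variable set and once without, shows whether the environment the job receives matches what was intended.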
from caffe.