Comments (14)

eldar commented on July 23, 2024

Hi, can you try changing this line https://github.com/eldar/deepcut/blob/master/lib/pose/cnn_cache_features.m#L47 to caffe.set_mode_cpu(); ? I always use a GPU, and it never occurred to me that people might not have GPUs with enough memory, sorry!

from deepcut.

eldar commented on July 23, 2024

It's actually very difficult to tell from this log what the error is. I've never seen anything like that.
So how exactly did you build caffe? "After applying the solution from issue 1799" - what was this fix?

farshidfarhat commented on July 23, 2024

Here, https://github.com/eldar/deepcut-cnn/blob/9b5de9cb70a0a440311178f26fbd6984d81e5c54/models/finetune_flickr_style/solver.prototxt#L17, I uncommented the last line to resolve the "Cannot use GPU in CPU-only Caffe" error.

Actually, I installed Caffe locally (without sudo/root access) on a Red Hat-based cluster. I changed Makefile.config as follows, based on my system config:
CXXFLAGS += -std=c++11
CPU_ONLY := 1
BLAS := mkl

I commented out the following part myself, https://github.com/eldar/deepcut-cnn/blob/9b5de9cb70a0a440311178f26fbd6984d81e5c54/src/caffe/layers/softmax_loss_vec_layer.cpp#L236-L251, similar to softmax_loss_layer.cpp.

I couldn't run "make solver-callback" from your instructions, as there was no "solver-callback:" target in the Makefile!

Also I made your change "caffe.set_mode_cpu();" in https://github.com/eldar/deepcut/blob/master/lib/pose/cnn_cache_features.m#L47

eldar commented on July 23, 2024

"make solver-callback" - this has to be executed in the solver's directory, not in the caffe directory.

Can you run the CNN-only demo as described here: https://github.com/eldar/deepcut-cnn/#installation-instructions
adding the use_cpu flag like so:

python ./pose_demo.py image.png --out_name=prediction

This will confirm that you have the CNN running, at the very least.

farshidfarhat commented on July 23, 2024

After debugging, I could run "python ./pose_demo.py image.png --out_name=prediction".
But "make solver-callback" gives the following log:
[ 50%] Building CXX object CMakeFiles/solver-callback.dir/src/pose/research/solver-callback.cxx.o
cc1plus: error: unrecognized command line option "-std=c++11"
make[3]: *** [CMakeFiles/solver-callback.dir/src/pose/research/solver-callback.cxx.o] Error 1
make[2]: *** [CMakeFiles/solver-callback.dir/all] Error 2
make[1]: *** [CMakeFiles/solver-callback.dir/rule] Error 2
make: *** [solver-callback] Error 2
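
For context on the cc1plus message: GCC accepts the spelling -std=c++11 from version 4.7 onward, while 4.3 through 4.6 only accept -std=c++0x, so this error usually means an older compiler was picked up. A minimal sketch, assuming the version string comes from something like `g++ -dumpversion`; the helper name cxx11_flag_for is hypothetical and not part of deepcut:

```python
import re


def cxx11_flag_for(gcc_version):
    """Return the C++11 flag spelling a given GCC version accepts, or None.

    Hypothetical helper for illustration only.
    """
    match = re.match(r"(\d+)(?:\.(\d+))?", gcc_version.strip())
    if match is None:
        raise ValueError("unrecognized version string: %r" % gcc_version)
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    if (major, minor) >= (4, 7):
        return "-std=c++11"   # spelling introduced in GCC 4.7
    if (major, minor) >= (4, 3):
        return "-std=c++0x"   # experimental spelling used by GCC 4.3 to 4.6
    return None               # older releases have no C++11 mode


if __name__ == "__main__":
    for version in ("4.4.7", "4.8.5"):
        print(version, cxx11_flag_for(version))
```

If the result is not -std=c++11, pointing CMake at a newer compiler is likely the simpler fix compared with editing the flags.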

farshidfarhat commented on July 23, 2024

I used this command to solve the above error:

cmake . -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=c++ -DGUROBI_ROOT_DIR=/usr/global/gurobi/gurobi651/linux64 -DGUROBI_VERSION=65

GCC and Gurobi need to be compatible in this case.
I finally got it to build on my system.

make.err.txt

farshidfarhat commented on July 23, 2024

Segmentation fault after running the demo:

...
I1020 11:20:43.944026 15336 net.cpp:228] conv1 does not need backward computation.
I1020 11:20:43.944032 15336 net.cpp:270] This network produces output loc_pred
I1020 11:20:43.944036 15336 net.cpp:270] This network produces output next_pred
I1020 11:20:43.944042 15336 net.cpp:270] This network produces output prob
I1020 11:20:43.944288 15336 net.cpp:283] Network initialization done.
I1020 11:20:44.850095 15336 net.cpp:816] Ignoring source layer data
I1020 11:20:44.850126 15336 net.cpp:816] Ignoring source layer label_data_1_split
I1020 11:20:44.902542 15336 net.cpp:816] Ignoring source layer res4b4_up_pose
I1020 11:20:44.902570 15336 net.cpp:816] Ignoring source layer crop_res4b4
I1020 11:20:44.902576 15336 net.cpp:816] Ignoring source layer loss_part_res4b4
I1020 11:20:44.902582 15336 net.cpp:816] Ignoring source layer res4b12_up_pose
I1020 11:20:44.902587 15336 net.cpp:816] Ignoring source layer crop_res4b12
I1020 11:20:44.902593 15336 net.cpp:816] Ignoring source layer loss_part_res4b12
I1020 11:20:44.902909 15336 net.cpp:816] Ignoring source layer loss_part_res5c
I1020 11:20:44.903682 15336 net.cpp:816] Ignoring source layer loss_loc
I1020 11:20:44.912511 15336 net.cpp:816] Ignoring source layer loss_next
save dir /gpfs/work/f/fuf111/deepcut/data/mpii-multiperson/scoremaps/test
testing from net file /gpfs/work/f/fuf111/deepcut/data/caffe-models/ResNet-101-mpii-multiperson.caffemodel
deepcut: test (MPII multiperson test) 2/1758
/usr/global/matlab/R2015a/bin/matlab: line 1: 15216 Segmentation fault pbs_taskset matlab-bin $@

eldar commented on July 23, 2024

Hey, I can't see from the log exactly what the problem is, but it could be that you didn't set the Gurobi license file correctly. The location is set in the code here: https://github.com/eldar/deepcut/blob/master/lib/pose/exp_params.m#L18, and you can modify it. You can obtain an academic license for free from the Gurobi website.

P.S. In the next couple of days we will update the repository with a completely new solver that runs fast and doesn't require any license.

farshidfarhat commented on July 23, 2024

Hi Eldar,

Thanks for your reply.
Actually, I followed all the instructions you posted in README.md, including the Gurobi license.
I don't know whether the Matlab version matters. But there is an error when I run ./start_matlab.sh:

                                                           < M A T L A B (R) >
                                                 Copyright 1984-2015 The MathWorks, Inc.
                                                 R2015a (8.5.0.197613) 64-bit (glnxa64)
                                                            February 12, 2015

To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.

Pose startup done

Academic License

Error using dbstop
Not enough input arguments.

eldar commented on July 23, 2024

Can you modify the start_matlab.sh script, or just start Matlab with this command instead?

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6 matlab

farshidfarhat commented on July 23, 2024

Yes. I ran "dbstop if error" later inside Matlab, and the error is as follows:

...
I1021 11:12:10.756536 2446 net.cpp:270] This network produces output next_pred
I1021 11:12:10.756551 2446 net.cpp:270] This network produces output prob
I1021 11:12:10.757047 2446 net.cpp:283] Network initialization done.
Unexpected Standard exception from MEX file.
What() is:basic_string::append
..

Error in caffe.Net/copy_from (line 123)
caffe_('net_copy_from', self.hNet_self, weights_file);

Error in caffe.get_net (line 34)
net.copy_from(weights_file);

Error in caffe.Net (line 31)
self = caffe.get_net(varargin{:});

Error in cnn_cache_features (line 52)
net = caffe.Net(net_def_file, net_bin_file, 'test');

Error in demo_multiperson (line 9)
cnn_cache_features( experiment_index, 'test', image_index, 1);

123 caffe_('net_copy_from', self.hNet_self, weights_file);

eldar commented on July 23, 2024

Can you stop the debugger on this line:

Error in cnn_cache_features (line 52)
net = caffe.Net(net_def_file, net_bin_file, 'test');

and check whether net_def_file points to an existing model definition file (somewhere in /models) and net_bin_file points to the correct caffe binary weights file (something.caffemodel)?
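
The same check can also be done outside the Matlab debugger. A minimal sketch, assuming the two paths are copied by hand from the debugger output; check_model_files is a hypothetical helper and not part of deepcut or caffe:

```python
import os


def check_model_files(net_def_file, net_bin_file):
    """Return a list of problems; an empty list means both files look usable.

    Hypothetical helper: it only checks that each path is a regular,
    non-empty file, not that the contents are a valid model.
    """
    problems = []
    for label, path in (("net_def_file", net_def_file),
                        ("net_bin_file", net_bin_file)):
        if not os.path.isfile(path):
            problems.append("%s: no such file: %s" % (label, path))
        elif os.path.getsize(path) == 0:
            problems.append("%s: file is empty: %s" % (label, path))
    return problems


if __name__ == "__main__":
    # Placeholder paths; substitute the values printed by the debugger.
    for problem in check_model_files("model.prototxt", "weights.caffemodel"):
        print(problem)
```

This only rules out a missing or empty file; a truncated download would still pass.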

farshidfarhat commented on July 23, 2024

It seems fine! Could it be related to copying a huge model file?

...

Cleared 0 solvers and 0 stand-alone nets
52 net = caffe.Net(net_def_file, net_bin_file, 'test');

K>> net_def_file
net_def_file =
/gpfs/work/f/fuf111/deepcut/models/ResNet-101-FCN_out_14_sigmoid_locreg_allpairs_test.prototxt

K>> net_bin_file
net_bin_file =
/gpfs/work/f/fuf111/deepcut/data/caffe-models/ResNet-101-mpii-multiperson.caffemodel

eldar commented on July 23, 2024

Sorry, it's quite difficult to say what's wrong without a proper error log. The model definitely fits on a 12 GB GPU. Maybe the file was corrupted during download? Here's the hash for mine:

deepercut-models$ md5sum ResNet-101-mpii-multiperson.caffemodel
a1aa7fb45c4f1a0e90087d6ddac24cf1  ResNet-101-mpii-multiperson.caffemodel
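
If md5sum is not available on the cluster, the comparison can be sketched in Python with the standard hashlib module. The function names and the chunk size are assumptions made for illustration; the expected hash is the one quoted above:

```python
import hashlib

# Hash quoted in this thread for ResNet-101-mpii-multiperson.caffemodel.
EXPECTED_MD5 = "a1aa7fb45c4f1a0e90087d6ddac24cf1"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that a large model file is never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_MD5):
    """Return True if the file's MD5 equals the expected hex digest."""
    return md5_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    import sys
    # Usage: python check_md5.py /path/to/ResNet-101-mpii-multiperson.caffemodel
    print("OK" if matches_expected(sys.argv[1]) else "MISMATCH")
```

A mismatch would point to a corrupted or incomplete download rather than a problem in caffe itself.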
