mitmul / deeppose

DeepPose implementation in Chainer

Home Page: http://static.googleusercontent.com/media/research.google.com/ja//pubs/archive/42237.pdf

License: GNU General Public License v2.0

Python 95.37% Shell 4.63%

chainer

deeppose's Introduction

DeepPose

NOTE: This is not an official implementation. The original paper is DeepPose: Human Pose Estimation via Deep Neural Networks.

Requirements

- Python 3.5.1+
- Chainer 1.13.0+
- numpy 1.9+
- scikit-image 0.11.3+
- OpenCV 3.1.0+

I strongly recommend using an Anaconda environment. This repo may also work in a Python 2.7 environment, but I haven't tested it.

Installation of dependencies

pip install chainer
pip install numpy
pip install scikit-image
# for python3
conda install -c https://conda.binstar.org/menpo opencv3
# for python2
conda install opencv

Dataset preparation

bash datasets/download.sh
python datasets/flic_dataset.py
python datasets/lsp_dataset.py
python datasets/mpii_dataset.py

MPII Dataset

MPII Human Pose Dataset
- training images: 18079, test images: 6908
- the test images don't have any annotations
- so we split the training images into training/test joint sets
- the resulting joint sets contain: training joint set: 17928, test joint set: 1991
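The two joint-set counts add up to 19919 annotated samples, and a 10% hold-out of that total gives exactly 1991 test samples. The following is only a sketch of such a split; whether datasets/mpii_dataset.py splits this way is an assumption, and the function name, the ratio, and the seed are hypothetical.

```python
import random


def split_joint_sets(n_samples, test_ratio=0.1, seed=1701):
    """Return (train_indices, test_indices) after a seeded shuffle.

    Hypothetical helper: the real conversion script may split differently.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(n_samples * test_ratio)  # fractional part is dropped
    return indices[n_test:], indices[:n_test]


# 17928 + 1991 = 19919 annotated samples in total.
train_idx, test_idx = split_joint_sets(17928 + 1991)
print(len(train_idx), len(test_idx))
```

A fixed seed keeps the split reproducible between runs, so the test joint set does not leak into training after a restart.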

Start training

Starting with the prepared shell scripts is the easiest way. If you want to run train.py with your own settings, first check the options with python scripts/train.py --help, then modify one of the following shell scripts to customize the training settings.

For FLIC Dataset

bash shells/train_flic.sh

For LSP Dataset

bash shells/train_lsp.sh

For MPII Dataset

bash shells/train_mpii.sh

GPU memory requirement

AlexNet
- batchsize: 128 -> about 2870 MiB
- batchsize: 64 -> about 1890 MiB
- batchsize: 32 (default) -> 1374 MiB
ResNet50
- batchsize: 32 -> 6877 MiB
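For batch sizes between the measured AlexNet points, a rough figure can be obtained by linear interpolation. This is only a sketch: the function name is hypothetical, the numbers are copied from the list above, and real usage depends on the Chainer version, cuDNN workspace, and GPU, so treat the result as a ballpark.

```python
# Measured (batchsize, MiB) pairs for AlexNet, copied from the README list.
MEASURED = [(32, 1374), (64, 1890), (128, 2870)]


def estimate_mib(batchsize, measured=MEASURED):
    """Piecewise-linear interpolation between the measured points.

    Hypothetical helper; it refuses to extrapolate outside the measurements.
    """
    points = sorted(measured)
    if not points[0][0] <= batchsize <= points[-1][0]:
        raise ValueError("batchsize outside the measured range")
    for (b0, m0), (b1, m1) in zip(points, points[1:]):
        if b0 <= batchsize <= b1:
            return m0 + (m1 - m0) * (batchsize - b0) / (b1 - b0)


print(estimate_mib(96))  # halfway between the 64 and 128 measurements
```

The measurements rise by roughly 15 to 16 MiB per extra sample, which suggests that a large share of the 1374 MiB at batchsize 32 is a fixed cost (parameters and optimizer state) rather than per-sample activations.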

Prediction

Will add some tools soon


deeppose's Issues

cupy.cuda.runtime.CUDARuntimeError: cudaErrorMemoryAllocation: out of memory

I want to run the training on the GPU with ID 1, so I added the argument 1 to the function call [ model.to_gpu(1) ] in train.py. Although I have ~2 GB available on the GPU, when I run the network with a batchsize of 32 I get the following error:

cupy.cuda.runtime.CUDARuntimeError: cudaErrorMemoryAllocation: out of memory

Here are the parameters of training model

--model models/AlexNet_flic.py
--gpu 0
--epoch 1000
--batchsize 32
--snapshot 10
--datadir data/FLIC-full
--channel 3
--flip 1
--size 220
--crop_pad_inf 1.5
--crop_pad_sup 2.0
--shift 5
--lcn 1
--joint_num 7 \

Am I doing something wrong in the way I am changing the GPU ID or is there some other problem?

Here's the full stack trace

Traceback (most recent call last):
File "scripts/train.py", line 211, in <module>
model, optimizer = get_model_optimizer(args)
File "scripts/train.py", line 92, in get_model_optimizer
optimizer.setup(model)
File "/usr/local/lib/python2.7/dist-packages/chainer/optimizer.py", line 72, in setup
self.prepare()
File "/usr/local/lib/python2.7/dist-packages/chainer/optimizer.py", line 87, in prepare
self.init_state(param, state)
File "/usr/local/lib/python2.7/dist-packages/chainer/optimizers/ada_grad.py", line 20, in init_state
state['h'] = xp.zeros_like(param.data)
File "/usr/local/lib/python2.7/dist-packages/cupy/creation/basic.py", line 167, in zeros_like
return zeros(a.shape, dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/cupy/creation/basic.py", line 144, in zeros
a = empty(shape, dtype)
File "/usr/local/lib/python2.7/dist-packages/cupy/creation/basic.py", line 20, in empty
return cupy.ndarray(shape, dtype=dtype)
File "cupy/core/core.pyx", line 87, in cupy.core.core.ndarray.__init__ (cupy/core/core.cpp:4790)
File "cupy/cuda/memory.pyx", line 252, in cupy.cuda.memory.alloc (cupy/cuda/memory.cpp:5116)
File "cupy/cuda/memory.pyx", line 385, in cupy.cuda.memory.MemoryPool.malloc (cupy/cuda/memory.cpp:7701)
File "cupy/cuda/memory.pyx", line 401, in cupy.cuda.memory.MemoryPool.malloc (cupy/cuda/memory.cpp:7621)
File "cupy/cuda/memory.pyx", line 313, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc (cupy/cuda/memory.cpp:6556)
File "cupy/cuda/memory.pyx", line 328, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc (cupy/cuda/memory.cpp:6380)
File "cupy/cuda/memory.pyx", line 232, in cupy.cuda.memory._malloc (cupy/cuda/memory.cpp:5055)
File "cupy/cuda/memory.pyx", line 233, in cupy.cuda.memory._malloc (cupy/cuda/memory.cpp:4970)
File "cupy/cuda/memory.pyx", line 31, in cupy.cuda.memory.Memory.__init__ (cupy/cuda/memory.cpp:1388)
File "cupy/cuda/runtime.pyx", line 164, in cupy.cuda.runtime.malloc (cupy/cuda/runtime.cpp:2700)
File "cupy/cuda/runtime.pyx", line 106, in cupy.cuda.runtime.check_status (cupy/cuda/runtime.cpp:1815)
cupy.cuda.runtime.CUDARuntimeError: cudaErrorMemoryAllocation: out of memory
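One commonly used way to pin a process to a particular card without editing the script is the CUDA_VISIBLE_DEVICES environment variable, which the CUDA runtime reads at start-up. With it set to 1, physical GPU 1 is the only device the process sees and it appears as device 0, so an unmodified --gpu 0 would address it. The sketch below only builds such an environment; the helper name env_for_gpu is hypothetical, and whether this resolves the error reported here is not confirmed by the thread.

```python
import os


def env_for_gpu(physical_gpu_id, base_env=None):
    """Return a copy of the environment that exposes a single physical GPU.

    Hypothetical helper: inside the child process the selected card is
    renumbered to device 0.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(physical_gpu_id)
    return env


env = env_for_gpu(1, base_env={"PATH": "/usr/bin"})
print(env["CUDA_VISIBLE_DEVICES"])
```

The resulting dictionary can be passed as the env argument of subprocess.run when launching the training script. The variable must be set before any CUDA library is initialized, which is why it is applied to a child process here rather than changed midway through a running one.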

can this code run without GPU or cuda support

invalid device ordinal

An error occurred when I tried to run the train_lsp.sh file. It said something like "cuda error or invalid device: invalid device ordinal".

I use Cygwin on Windows 10. The GPU is a GTX 1070, and both CUDA and cuDNN work well.

Do I need 2 GPUs to run AlexNet?

Thank you very much.

@mitmul

Memory problem

Does anyone have an idea how much memory is needed to run this program? I have 16 GB and it seems that is not nearly enough. Thanks.

Model and Parameters For LSP Dataset

I trained on the LSP dataset with the default parameters.
The test error is large.
Do you have an appropriate model or parameters for the LSP dataset?

Thanks!

CUDARuntimeError: cudaErrorMemoryAllocation: out of memory error

Download failure with datasets/download.sh

The URL for the "LSP Extended Training Dataset" seems to be no longer valid.

deeppose/datasets/download.sh

Line 18 in fcf53af

wget http://www.comp.leeds.ac.uk/mat4saj/lspet_dataset.zip

How to run

Hey, I have downloaded the whole code and dataset, but I don't know how to train or run it because there is no README file or proper instructions on how to execute it.
Kindly help me.

how to predict

Please tell me how to predict using static images.
($ python ***. py )

CHAINER_SEED type error

running:
python scripts/train.py
gives:
willi@Graphi1k14:~/deeppose$ python scripts/train.py
Traceback (most recent call last):
File "scripts/train.py", line 206, in <module>
os.environ['CHAINER_SEED'] = args.seed
File "/usr/lib/python2.7/os.py", line 471, in __setitem__
putenv(key, item)
TypeError: must be string, not int

if I run:
python scripts/train.py --seed foo
it gives me:
train.py: error: argument --seed: invalid int value: 'foo'

if I run:
python scripts/train.py --seed 42
it gives the same error as the first try (without the --seed argument).

It seems to work if I comment out the line:
os.environ['CHAINER_SEED'] = args.seed
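Environment variables must be strings, while argparse delivers --seed as an int, which matches the TypeError above. A minimal sketch of a fix that keeps the line instead of commenting it out is to convert the value first. The parser below is a reduced stand-in for the one in scripts/train.py, and the default seed is an arbitrary placeholder.

```python
import argparse
import os

# Reduced stand-in for the parser in scripts/train.py (assumption).
parser = argparse.ArgumentParser()
parser.add_argument("--seed", type=int, default=1701)
args = parser.parse_args(["--seed", "42"])

# os.environ only accepts strings, so convert the int explicitly.
os.environ["CHAINER_SEED"] = str(args.seed)
print(os.environ["CHAINER_SEED"])
```

Keeping type=int on the argument preserves the validation that rejects values such as foo, and str() is applied only at the point where the value enters the environment.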

data_reader.cpp:98] Check failed: new_queue_pairs_.size() == 0 (1 vs. 0)

BVLC/caffe#3394

I am getting this error while trying to train the AlexNet network. Has anyone else faced this? Is there a fix? I was unable to work out a fix from the above-mentioned link.

Training time and model/state sample

How many epochs are suggested for the run? 1000?

Also, does anyone already have a trained model and state that I can use?

Thanks!

Unresolved references in evaluate_flic.py

I successfully managed to train on the FLIC dataset for a few epochs, but I was not able to evaluate the results due to unresolved references in evaluate_flic.py. Here are the problematic lines:

L73: img, joints = resize(img, joints, size)
NameError: name 'size' is not defined

L270: log_fn = grep.grep('{}/log.txt'.format(result_dir))[0]
NameError: name 'grep' is not defined

L167: input_data, labels = load_data(trans, args, lines)
NameError: global name 'trans' is not defined

Prediction for Real Time Video

Is it possible to use it for predicting an action from a real-time video feed (something like checking if a person is running)?
I am very new to this field, so even an idea of how to approach it would be great.
Thanks

TypeError: rotate() got an unexpected keyword argument 'center'

Hello, I am trying to train with your code, but I keep getting the error below.
I have Ubuntu 14.04, CUDA 8.0, Python 2.7, Caffe, and a GTX 1080.
Is there any way to solve this error?

bash shells/train_flic.sh
2016-12-13 14:02:51,960 [INFO] sys.version_info(major=2, minor=7, micro=6, releaselevel='final', serial=0)
2016-12-13 14:02:51,960 [INFO] chainer version: 1.18.0
2016-12-13 14:02:51,971 [INFO] cuda: True, cudnn: True
2016-12-13 14:02:51,971 [INFO] Namespace(adam_alpha=0.001, adam_beta1=0.9, adam_beta2=0.999, adam_eps=1e-08, base_zoom=1.5, batchsize=128, channel=3, coord_normalize=True, epoch=101, fliplr=True, fname_index=0, gcn=True, gpus='0', ignore_label=-1, im_size=220, img_dir='data/FLIC-full/images', joint_index=1, lr=0.01, lr_decay_freq=10, lr_decay_ratio=0.1, min_dim=0, model='models/AlexNet.py', n_joints=7, opt='Adam', resume_model=None, resume_opt=None, resume_param=None, rotate=True, rotate_range=10, seed=1701, show_log_iter=10, snapshot=10, symmetric_joints='[[2, 4], [1, 5], [0, 6]]', test_csv_fn='data/FLIC-full/test_joints.csv', test_freq=10, train_csv_fn='data/FLIC-full/train_joints.csv', translate=True, translate_range=5, valid_freq=5, weight_decay=0.0005, zoom=True, zoom_range=0.2)
2016-12-13 14:06:51,617 [INFO] data/FLIC-full/train_joints.csv is ready
2016-12-13 14:08:53,526 [INFO] data/FLIC-full/test_joints.csv is ready
Traceback (most recent call last):
File "scripts/train.py", line 229, in <module>
trainer.run()
File "/usr/local/lib/python2.7/dist-packages/chainer/training/trainer.py", line 289, in run
update()
File "/usr/local/lib/python2.7/dist-packages/chainer/training/updater.py", line 170, in update
self.update_core()
File "/usr/local/lib/python2.7/dist-packages/chainer/training/updater.py", line 281, in update_core
batch = self.get_iterator('main').next()
File "/usr/local/lib/python2.7/dist-packages/chainer/iterators/multiprocess_iterator.py", line 77, in next
self._invoke_prefetch()
File "/usr/local/lib/python2.7/dist-packages/chainer/iterators/multiprocess_iterator.py", line 173, in _invoke_prefetch
data = self.dataset[index]
File "/usr/local/lib/python2.7/dist-packages/chainer/dataset/dataset_mixin.py", line 31, in __getitem__
return self.get_example(index)
File "/home/seungheelee/workspace/deeppose-master/scripts/dataset.py", line 169, in get_example
image, joints = self.apply_rotate(image, joints, ignore_joints)
File "/home/seungheelee/workspace/deeppose-master/scripts/dataset.py", line 128, in apply_rotate
image = transform.rotate(image, angle, center=joint_center)
TypeError: rotate() got an unexpected keyword argument 'center'

TypeError: Cannot cast array data from dtype('float64') to dtype('<U32') according to the rule 'safe'

Hello!
I am testing your code, but I got a dtype TypeError when training with the shell scripts.
Is there any way to fix this?

File "/home/seungheelee/download/caffe/scripts/dataset.py", line 169, in get_example
image, joints = self.apply_rotate(image, joints, ignore_joints)
File "/home/seungheelee/download/caffe/scripts/dataset.py", line 133, in apply_rotate
joints = ((rot_mat.dot((joints - joint_center).T)).T) + joint_center
TypeError: Cannot cast array data from dtype('float64') to dtype('<U32') according to the rule 'safe'
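The dtype '<U32' is a NumPy Unicode string type, so the message suggests that string values reached the arithmetic in apply_rotate. One possible cause, offered here as an assumption and not as a confirmed diagnosis, is that the joint coordinates were read from the CSV and never converted to floats. The sketch below shows such a conversion with the standard library only. The helper name and the row layout (file name first, then x,y pairs) are hypothetical, and the default indices mirror the fname_index=0 and joint_index=1 values printed in a training log elsewhere on this page.

```python
import csv
import io


def parse_joint_row(row, fname_index=0, joint_index=1):
    """Convert one CSV row into (file name, list of (x, y) float pairs).

    Hypothetical helper; the real CSV layout may differ.
    """
    fname = row[fname_index]
    coords = [float(v) for v in row[joint_index:]]  # strings -> floats
    if len(coords) % 2:
        raise ValueError("expected an even number of coordinates")
    return fname, list(zip(coords[0::2], coords[1::2]))


sample = io.StringIO("img_0001.jpg,10.5,20,30,40.25\n")
fname, joints = parse_joint_row(next(csv.reader(sample)))
print(fname, joints)
```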

CPU Mode Predict_flic.py

When running the visualization code I got this error:

Traceback (most recent call last):
File "scripts/predict_flic.py", line 211, in <module>
test(args)
File "scripts/predict_flic.py", line 118, in test
pred = preds[n]
TypeError: 'Variable' object does not support indexing

It worked when I changed pred = preds[n] to pred = preds.data[n]. It is assigned correctly for GPU mode with

    if args.gpu >= 0:
        preds = cuda.to_cpu(preds.data)
        input_data = cuda.to_cpu(input_data)
        labels = cuda.to_cpu(labels)

But the same should be done for CPU mode.
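A device-independent way to express this is to unwrap the variable once, before any indexing, so both modes index a plain array. The sketch below uses a stand-in class so that it runs without Chainer; FakeVariable and unwrap are hypothetical names, and in the real script the type passed in would be the framework's variable class.

```python
class FakeVariable(object):
    """Hypothetical stand-in for a variable that keeps its array in .data."""

    def __init__(self, data):
        self.data = data


def unwrap(value, variable_type):
    """Return the underlying array whether or not value is wrapped.

    An explicit isinstance check is used instead of hasattr(value, "data")
    because NumPy arrays also expose a .data attribute (a raw buffer).
    """
    return value.data if isinstance(value, variable_type) else value


preds = FakeVariable([[0.1, 0.2], [0.3, 0.4]])
print(unwrap(preds, FakeVariable)[1])      # wrapped prediction
print(unwrap([[1, 2]], FakeVariable)[0])   # already a plain sequence
```

In GPU mode the unwrapped array would still need the existing cuda.to_cpu call. The helper only removes the difference in how the two modes reach the array.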

Label data is used to Predict?

In predict_flic.py, when I run it in test mode, it seems to load the label data for a particular file and then do the prediction (that's my understanding).

How will this work if I give it a new image that is not in the existing labelled dataset? Is there a simple example of how I can take just one image and get a prediction of its joints?

strange contrast normalization

Hi there,
On https://github.com/mitmul/deeppose/blob/master/scripts/evaluate_flic.py#L69, I find it strange that you subtract the standard deviation instead of normalizing by it.
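For comparison, the conventional form of global contrast normalization subtracts the mean and then divides by the standard deviation. The sketch below illustrates that form on a flat list of pixel values using only the standard library. The function name and the eps guard are assumptions, and it is not a copy of the code in evaluate_flic.py.

```python
import statistics


def global_contrast_normalize(pixels, eps=1e-8):
    """Shift pixel values to zero mean and scale them to unit standard deviation.

    Hypothetical helper; eps guards against division by zero on flat images.
    """
    mean = statistics.mean(pixels)
    std = statistics.pstdev(pixels)
    return [(p - mean) / (std + eps) for p in pixels]


normalized = global_contrast_normalize([0, 50, 100, 150, 200])
print(statistics.mean(normalized), statistics.pstdev(normalized))
```

Dividing by the standard deviation rescales the contrast, whereas subtracting it only shifts every pixel by a constant and leaves the spread of values unchanged.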

path problem

Hello,

The AlexNet train_test.prototxt path to the lmdb is slightly wrong:
../../data/FLIC-full/image_train.lmdb
but should be:
../../data/image_train.lmdb

Same issue with the labels of course.

Pretrained Models

Are pretrained models available?
I would really appreciate it if you shared the weights of the models.

Thank You

test_sample - different predictions for the same input

net.set_phase_test() should be added to the load_net() function in test_sample.py. Otherwise the net is in the train phase and dropout is used, so predictions are different for the same input data.
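The effect can be reproduced with a toy dropout function: in train mode a random subset of values is zeroed on every call, while in test mode the input passes through unchanged. This is an illustrative sketch of inverted dropout with hypothetical names, not Caffe's implementation.

```python
import random


def dropout(values, ratio=0.5, train=True, rng=random):
    """Toy inverted dropout over a list of floats.

    In train mode each value is zeroed with probability `ratio` and the
    survivors are rescaled; in test mode the input is returned unchanged.
    """
    if not train:
        return list(values)
    scale = 1.0 / (1.0 - ratio)
    return [v * scale if rng.random() >= ratio else 0.0 for v in values]


x = [1.0, 2.0, 3.0, 4.0]
rng = random.Random(0)
print(dropout(x, train=True, rng=rng))   # varies from call to call
print(dropout(x, train=True, rng=rng))
print(dropout(x, train=False))           # always equal to x
```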

"evaluate_Flic.py" parameter 'trans'

I want to ask what trans is. In my test, I can't run it because of
"
Traceback (most recent call last):
File "evaluatDeepPose-Shao/evaluateFlic.py", line 286, in <module>
test(args)
File "evaluatDeepPose-Shao/evaluateFlic.py", line 169, in test
input_data, labels = load_data(trans, args, lines)
NameError: global name 'trans' is not defined
"
I am really confused about this and would appreciate any help. Thanks!

Error while visualizing filters

When I try to visualize the first conv layer, this is the error I get when reading epoch-1.model:

File "../../scripts/draw_filters.py", line 28, in <module>
model = pickle.load(open(model_file,'rb'))
cPickle.UnpicklingError: invalid load key, 'H'.

Any idea why this is happening?

Number of stages for cascaded pose regression

When using the provided 'train_test.prototxt' file for training, how many stages of cascaded pose regression will be trained? Is there any parameter to vary the number of stages, or does it only train the initial stage?

test_sample - the mean value is not subtracted from the testing sample

I'm not sure, but if the mean is subtracted in the training phase, then it must be subtracted in the test phase as well.

I suggest replacing this code:

    net = caffe.Net(MODEL_FILE_PATH, PRETRAINED_MODEL_PATH)
    net.set_mode_gpu()

with this code:

    net = caffe.Classifier(MODEL_FILE_PATH, PRETRAINED_MODEL_PATH)

    net.set_mean('data', np.load(MEAN_SHAPE_PTY_PATH))  # ImageNet mean
    net.set_raw_scale('data', 255)  # the reference model operates on images in [0,255] range instead of [0,1]
    net.set_channel_swap('data', (2,1,0))  # the reference model has channels in BGR order instead of RGB
    net.set_mode_gpu()
    net.set_phase_test()

Add minimum memory requirements to readme

I tried running this on a basic card with 2 GB of GPU RAM, but I get an out-of-memory error as the training script is starting up. What is the minimum requirement to run this on the FLIC data?

kyle:deeppose kyle$ python scripts/train.py --model models/AlexNet_flic.py --gpu 0 --epoch 1000 --batchsize 128 --prefix AlexNet_LCN_AdaGrad_lr-0.0005 --snapshot 10 --datadir data/FLIC-full --channel 3 --flip True --size 220 --crop_pad_inf 1.5 --crop_pad_sup 2.0 --shift 5 --lcn True --joint_num 7
Traceback (most recent call last):
  File "scripts/train.py", line 246, in <module>
    trans, args, input_q, data_q)
  File "scripts/train.py", line 131, in train
    loss, pred = model.forward(input_data, label, train=True)
  File "models/AlexNet_flic.py", line 34, in forward
    h = F.local_response_normalization(h)
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/functions/local_response_normalization.py", line 123, in local_response_normalization
    return LocalResponseNormalization(n, k, alpha, beta)(x)
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/function.py", line 163, in __call__
    outputs = self.forward(in_data)
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/function.py", line 199, in forward
    return self.forward_gpu(inputs)
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/functions/local_response_normalization.py", line 68, in forward_gpu
    self.y = x[0] * x[0]  # temporary
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/cuda.py", line 718, in new_op
    return raw_op(self, other)
  File "/usr/local/lib/python2.7/site-packages/pycuda/gpuarray.py", line 468, in __mul__
    result = self._new_like_me(_get_common_dtype(self, other))
  File "/usr/local/lib/python2.7/site-packages/pycuda/gpuarray.py", line 401, in _new_like_me
    allocator=self.allocator, strides=strides)
  File "/usr/local/lib/python2.7/site-packages/pycuda/gpuarray.py", line 204, in __init__
    self.gpudata = self.allocator(self.size * self.dtype.itemsize)
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/cuda.py", line 352, in mem_alloc
    allocation = pool.allocate(nbytes)
pycuda._driver.MemoryError: memory_pool::allocate failed: out of memory - failed to free memory for allocation

I couldn't run the deeppose program.

Evaluating FLIC model

Has anyone tried testing the models that were generated? I tried running the evaluation offline (outside of the training script) and got the following error. It seems like the model and input parameter sizes do not match.

Namespace(batchsize=1, channel=3, crop_pad_inf=1, crop_pad_sup=1, cropping=1, datadir='data/FLIC-sample', draw_limb=True, flip=1, fname_index=0, gcn=1, gpu=0, joint_index=1, joint_num=7, min_dim=100, mode='test', model='results/AlexNet_2016-08-02_10-37-31/AlexNet.py', n_imgs=9, param='results/AlexNet_2016-08-02_10-37-31/epoch-10.model', resize=-1, seed=9, shift=5, size=128, symmetric_joints='[[2, 4], [1, 5], [0, 6]]', text_scale=1.0)
Traceback (most recent call last):
File "scripts/evaluate_flic.py", line 304, in <module>
test(args)
File "scripts/evaluate_flic.py", line 180, in test
model(x, t)
File "results/AlexNet_2016-08-02_10-37-31/AlexNet.py", line 46, in __call__
h = F.dropout(F.relu(self.fc6(h)), train=self.train, ratio=0.6)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/chainer/links/connection/linear.py", line 72, in __call__
return linear.linear(x, self.W, self.b)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/chainer/functions/connection/linear.py", line 79, in linear
return LinearFunction()(x, W, b)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/chainer/function.py", line 122, in __call__
self._check_data_type_forward(in_data)
File "/home/ubuntu/anaconda2/lib/python2.7/site-packages/chainer/function.py", line 197, in _check_data_type_forward
raise type_check.InvalidType(e.expect, e.actual, msg=msg)
chainer.utils.type_check.InvalidType:
Invalid operation is performed in: LinearFunction (Forward)

Expect: prod(in_types[0].shape[1:]) == in_types[1].shape[1]
Actual: 2304 != 9216
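The two numbers in the message can be read as flattened feature-map sizes. Assuming the last convolutional block outputs 256 channels, as in the standard AlexNet (an assumption about this model file), 9216 corresponds to a 6x6 map and 2304 to a 3x3 map. That would be consistent with the Namespace above using size=128 while the training commands shown elsewhere on this page use a size of 220. The helper name below is hypothetical.

```python
import math


def spatial_side(flat_features, channels=256):
    """Side length of a square feature map with the given flattened size.

    Hypothetical helper; channels=256 is an assumption about the model.
    """
    area, remainder = divmod(flat_features, channels)
    side = int(round(math.sqrt(area)))
    if remainder or side * side != area:
        raise ValueError("not a square feature map for this channel count")
    return side


print(spatial_side(9216), spatial_side(2304))  # expected by fc6 vs. received
```

If this reading is right, evaluating with the same input size that was used for training would make the flattened size match what fc6 expects.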

cupy.cuda.runtime.CUDARuntimeError: cudaErrorMemoryAllocation: out of memory

I want to run the training on the GPU with ID 1, so I added the argument 1 to the function call [ model.to_gpu(1) ] in train.py. Although I have ~4 GB available on the GPU, when I run the network with a batchsize of 32 I get the following error:

cupy.cuda.runtime.CUDARuntimeError: cudaErrorMemoryAllocation: out of memory

Here are the parameters of training model

Am I doing something wrong in the way I am changing the GPU ID or is there some other problem?

how to use the model to predict on some sample image - other than dataset?

@mitmul

At this point, it's not clear how to use the model to predict the joints on a sample test image.

Somewhere in the code, it seems to load the existing joint locations and pass them to the model's forward function when it is called for predictions.

Some help and a simple Python script for using the model on another sample test image would be useful. Thanks.

In the eval process, test pictures are also cropped by joints in the transform function.

In the eval process, we also use the load_data(trans, args, input_q, data_q) function to load the test data, and use the same trans.transform function to load and transform it. But in the transform function, the test pictures are also cropped based on the true joint data.

So I think it would be better to change the transform function a little, so that pictures are not cropped during the eval process.

caffe

How can I use the trained AlexNet to test other pictures?

Thank you very much

where is transform.py

transform.py is needed by evaluate_flic.py, but it is missing from scripts/. Where did it go?

Train or test with custom dataset

Is there any script to test a custom image on the pretrained model, or to train on a custom dataset to predict the joints? This is already mentioned in #10.

Include "act_id" of MPII annotations to mpii_dataset.py

As I only want to train my model on one specific activity category of the MPII dataset, I am trying to include the categories in mpii_dataset.py. Unfortunately I could not get it to work. Can you help me bring the "act_id" of the MPII dataset into your conversion code?

tr_plus_indices.mat not found

Running python datasets/flic_dataset.py gives me the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'data/FLIC-full/tr_plus_indices.mat'

I have manually downloaded FLIC-full to the dataset directory, but I can't seem to find tr_plus_indices.mat, even in the source zip.

In Python 2.7 env, stuck after "test_joints.csv is ready"

I agree, this is a Python 3 project.

Does anyone know how to move forward with the training with Chainer?

If you help me, I may become a Chainer fan (it seems very readable).
