
mmskeleton's Issues

Thoughts

I came across your article on arXiv today and the timing is incredible. I've been working on capturing a large dataset for analyzing human activity based on estimated skeleton position in video frames. My goal is to perform activity recognition and re-identification. This is a project for fun and a way for me to learn more about DNNs. Below is a summary of the dataset I've gathered so far.

So far I have 14 days' worth of security camera video (2 cameras) recorded at 3264×1836, 30 fps, with every frame undistorted, during daylight hours (and I have the ability to record any number of additional days). I'm in the process of running those frames through OpenPose using its maximum-accuracy configuration. For now I'm only going to process 3 days' worth of video from each camera, which equates to 7,052,703 frames and ~10 TB stored as JPEG images. I have about 7-8 days of OpenPose inference left.

In the past I've sampled the same video at 2 fps and a reduced resolution, and manually labeled the data by grouping poses into identities/tracks that identify the same agent across temporal spans of 5 minutes, also spanning the two cameras. Presently I have 10,256 identities (I estimate that at least 25-35% of those are duplicates, as people loiter or enter/exit multiple times per day).

I'm not sure what the breakdown of activities is, but they consist of skateboarding, rollerblading, running, walking, and biking. The video covers two spans of 7 consecutive days (2017-10-30 through 2017-11-05 and 2017-11-13 through 2017-11-19), which included sunny, cloudy, and rainy weather. There are no physical occlusions in the scene, but people are frequently occluded by other people throughout their tracks. The vast majority of identities consist of 150-600 poses; at the high end, roughly 50 loitering identities have 3,000-9,000 poses each.

I am interested in running your network on the resulting pose data and would be happy to share the results. Any thoughts or suggestions from your own experience analyzing pose data?

I can't download NTU RGB+D

Once I registered with my email, they replied asking me to register with an organisation.
I replied that I just want to test st-gcn,
and they haven't replied since.

Improvement / Modification of the model

Hi there,

I am looking to improve the model's structure, perhaps by adding one or more layers or making other changes. How would you suggest I go about this?

Also, the link to your paper returns a 403 Forbidden error. Please advise.

Hope to hear back.

HELP! OSError: [Errno 12] Cannot allocate memory

Code version (Git Hash) and PyTorch version

Python 3.6.4 | Anaconda custom (64-bit)
Dual-boot system: Windows & Ubuntu
NVIDIA GeForce GTX 950M

Dataset used

NTU RGB+D

Expected behavior

Run the cross-view evaluation on NTU RGB+D:
python main.py --config config/st_gcn/nturgbd-cross-view/test.yaml

Actual behavior

[ Fri Apr 20 17:44:22 2018 ] Load weights from ./model/ntuxview-st_gcn.pt.
[ Fri Apr 20 17:44:22 2018 ] Model: st_gcn.net.ST_GCN.
[ Fri Apr 20 17:44:22 2018 ] Weights: ./model/ntuxview-st_gcn.pt.
[ Fri Apr 20 17:44:22 2018 ] Eval epoch: 1
Traceback (most recent call last):
File "main.py", line 426, in <module>
processor.start()
File "main.py", line 388, in start
epoch=0, save_score=self.arg.save_score, loader_name=['test'])
File "main.py", line 335, in eval
for batch_idx, (data, label) in enumerate(self.data_loader[ln]):
File "/home/rui/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 310, in __iter__
return DataLoaderIter(self)
File "/home/rui/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 167, in __init__
w.start()
File "/home/rui/anaconda3/lib/python3.6/multiprocessing/process.py", line 105, in start
self._popen = self._Popen(self)
File "/home/rui/anaconda3/lib/python3.6/multiprocessing/context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/home/rui/anaconda3/lib/python3.6/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "/home/rui/anaconda3/lib/python3.6/multiprocessing/popen_fork.py", line 26, in __init__
self._launch(process_obj)
File "/home/rui/anaconda3/lib/python3.6/multiprocessing/popen_fork.py", line 73, in _launch
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
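A common cause of this ENOMEM on machines with limited RAM is that every DataLoader worker is created with os.fork(), which is exactly where the traceback ends. A hedged sketch of the usual mitigation (generic PyTorch advice, not a fix shipped with this repo; `make_loader` is an illustrative name):

```python
from torch.utils.data import DataLoader

def make_loader(dataset, batch_size=64):
    # num_workers=0 keeps loading in the main process: no os.fork() is
    # performed, so the ENOMEM from the traceback cannot occur here.
    # Intermediate values (1 or 2) trade memory for loading speed.
    return DataLoader(dataset, batch_size=batch_size, num_workers=0)
```

If lowering the worker count is not enough, increasing swap space is the other common workaround.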

Training loss and Testing accuracy

Hello. When I try to reproduce the results in your paper, or train a modified model with a different learning rate schedule (for example, starting from base_lr=0.1 and training until the training loss no longer decreases), I find that while the training loss decreases as expected, the testing loss blows up and the testing accuracy drops. I don't think the model is overfitting, because the training loss is still around 1. An example log file is below: epochs 5 and 10 look normal, but the testing accuracy and loss at epoch 15 are far outside the normal range.

[ Sun Apr  8 21:42:15 2018 ] Training epoch: 1
[ Sun Apr  8 21:42:29 2018 ] 	Batch(0/589) done. Loss: 5.3065  lr:0.100000
[ Sun Apr  8 21:44:10 2018 ] 	Batch(100/589) done. Loss: 3.5391  lr:0.100000
[ Sun Apr  8 21:45:52 2018 ] 	Batch(200/589) done. Loss: 3.4271  lr:0.100000
[ Sun Apr  8 21:47:34 2018 ] 	Batch(300/589) done. Loss: 3.1172  lr:0.100000
[ Sun Apr  8 21:49:16 2018 ] 	Batch(400/589) done. Loss: 2.8495  lr:0.100000
[ Sun Apr  8 21:50:58 2018 ] 	Batch(500/589) done. Loss: 2.9964  lr:0.100000
[ Sun Apr  8 21:52:27 2018 ] 	Mean training loss: 3.3161.
[ Sun Apr  8 21:52:27 2018 ] 	Time consumption: [Data]01%, [Network]99%
[ Sun Apr  8 21:52:27 2018 ] Training epoch: 2
[ Sun Apr  8 21:52:41 2018 ] 	Batch(0/589) done. Loss: 3.0124  lr:0.100000
[ Sun Apr  8 21:54:22 2018 ] 	Batch(100/589) done. Loss: 2.9106  lr:0.100000
[ Sun Apr  8 21:56:03 2018 ] 	Batch(200/589) done. Loss: 2.4281  lr:0.100000
[ Sun Apr  8 21:57:44 2018 ] 	Batch(300/589) done. Loss: 2.3935  lr:0.100000
[ Sun Apr  8 21:59:26 2018 ] 	Batch(400/589) done. Loss: 2.3242  lr:0.100000
[ Sun Apr  8 22:01:08 2018 ] 	Batch(500/589) done. Loss: 2.2797  lr:0.100000
[ Sun Apr  8 22:02:36 2018 ] 	Mean training loss: 2.4595.
[ Sun Apr  8 22:02:36 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 22:02:36 2018 ] Training epoch: 3
[ Sun Apr  8 22:02:50 2018 ] 	Batch(0/589) done. Loss: 2.0113  lr:0.100000
[ Sun Apr  8 22:04:31 2018 ] 	Batch(100/589) done. Loss: 1.9469  lr:0.100000
[ Sun Apr  8 22:06:13 2018 ] 	Batch(200/589) done. Loss: 2.0902  lr:0.100000
[ Sun Apr  8 22:07:56 2018 ] 	Batch(300/589) done. Loss: 1.9241  lr:0.100000
[ Sun Apr  8 22:09:38 2018 ] 	Batch(400/589) done. Loss: 1.6968  lr:0.100000
[ Sun Apr  8 22:11:20 2018 ] 	Batch(500/589) done. Loss: 1.6265  lr:0.100000
[ Sun Apr  8 22:12:49 2018 ] 	Mean training loss: 1.8767.
[ Sun Apr  8 22:12:49 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 22:12:49 2018 ] Training epoch: 4
[ Sun Apr  8 22:13:03 2018 ] 	Batch(0/589) done. Loss: 1.5664  lr:0.100000
[ Sun Apr  8 22:14:45 2018 ] 	Batch(100/589) done. Loss: 1.2361  lr:0.100000
[ Sun Apr  8 22:16:27 2018 ] 	Batch(200/589) done. Loss: 1.9590  lr:0.100000
[ Sun Apr  8 22:18:08 2018 ] 	Batch(300/589) done. Loss: 1.4472  lr:0.100000
[ Sun Apr  8 22:19:50 2018 ] 	Batch(400/589) done. Loss: 1.7926  lr:0.100000
[ Sun Apr  8 22:21:32 2018 ] 	Batch(500/589) done. Loss: 1.6678  lr:0.100000
[ Sun Apr  8 22:23:00 2018 ] 	Mean training loss: 1.5810.
[ Sun Apr  8 22:23:00 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 22:23:00 2018 ] Training epoch: 5
[ Sun Apr  8 22:23:14 2018 ] 	Batch(0/589) done. Loss: 1.3737  lr:0.100000
[ Sun Apr  8 22:24:55 2018 ] 	Batch(100/589) done. Loss: 1.6322  lr:0.100000
[ Sun Apr  8 22:26:37 2018 ] 	Batch(200/589) done. Loss: 1.2826  lr:0.100000
[ Sun Apr  8 22:28:20 2018 ] 	Batch(300/589) done. Loss: 1.7919  lr:0.100000
[ Sun Apr  8 22:30:02 2018 ] 	Batch(400/589) done. Loss: 1.5371  lr:0.100000
[ Sun Apr  8 22:31:42 2018 ] 	Batch(500/589) done. Loss: 1.3910  lr:0.100000
[ Sun Apr  8 22:33:11 2018 ] 	Mean training loss: 1.4091.
[ Sun Apr  8 22:33:11 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 22:33:11 2018 ] Eval epoch: 5
[ Sun Apr  8 22:35:05 2018 ] 	Mean test loss of 296 batches: 1.38761536334012.
[ Sun Apr  8 22:35:06 2018 ] 	Top1: 57.94%
[ Sun Apr  8 22:35:06 2018 ] 	Top5: 90.33%
[ Sun Apr  8 22:35:06 2018 ] Training epoch: 6
[ Sun Apr  8 22:35:19 2018 ] 	Batch(0/589) done. Loss: 1.4409  lr:0.100000
[ Sun Apr  8 22:37:00 2018 ] 	Batch(100/589) done. Loss: 1.3341  lr:0.100000
[ Sun Apr  8 22:38:42 2018 ] 	Batch(200/589) done. Loss: 1.0841  lr:0.100000
[ Sun Apr  8 22:40:23 2018 ] 	Batch(300/589) done. Loss: 1.2607  lr:0.100000
[ Sun Apr  8 22:42:05 2018 ] 	Batch(400/589) done. Loss: 1.3300  lr:0.100000
[ Sun Apr  8 22:43:46 2018 ] 	Batch(500/589) done. Loss: 1.1257  lr:0.100000
[ Sun Apr  8 22:45:15 2018 ] 	Mean training loss: 1.2766.
[ Sun Apr  8 22:45:15 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 22:45:15 2018 ] Training epoch: 7
[ Sun Apr  8 22:45:29 2018 ] 	Batch(0/589) done. Loss: 1.4653  lr:0.100000
[ Sun Apr  8 22:47:10 2018 ] 	Batch(100/589) done. Loss: 1.2261  lr:0.100000
[ Sun Apr  8 22:48:51 2018 ] 	Batch(200/589) done. Loss: 1.1842  lr:0.100000
[ Sun Apr  8 22:50:33 2018 ] 	Batch(300/589) done. Loss: 1.2471  lr:0.100000
[ Sun Apr  8 22:52:15 2018 ] 	Batch(400/589) done. Loss: 1.1583  lr:0.100000
[ Sun Apr  8 22:53:56 2018 ] 	Batch(500/589) done. Loss: 0.9828  lr:0.100000
[ Sun Apr  8 22:55:24 2018 ] 	Mean training loss: 1.1803.
[ Sun Apr  8 22:55:24 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 22:55:24 2018 ] Training epoch: 8
[ Sun Apr  8 22:55:39 2018 ] 	Batch(0/589) done. Loss: 1.0015  lr:0.100000
[ Sun Apr  8 22:57:20 2018 ] 	Batch(100/589) done. Loss: 1.0679  lr:0.100000
[ Sun Apr  8 22:59:02 2018 ] 	Batch(200/589) done. Loss: 1.2700  lr:0.100000
[ Sun Apr  8 23:00:43 2018 ] 	Batch(300/589) done. Loss: 1.0391  lr:0.100000
[ Sun Apr  8 23:02:24 2018 ] 	Batch(400/589) done. Loss: 0.8358  lr:0.100000
[ Sun Apr  8 23:04:06 2018 ] 	Batch(500/589) done. Loss: 0.7021  lr:0.100000
[ Sun Apr  8 23:05:34 2018 ] 	Mean training loss: 1.1058.
[ Sun Apr  8 23:05:34 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 23:05:34 2018 ] Training epoch: 9
[ Sun Apr  8 23:05:48 2018 ] 	Batch(0/589) done. Loss: 1.4356  lr:0.100000
[ Sun Apr  8 23:07:28 2018 ] 	Batch(100/589) done. Loss: 0.9781  lr:0.100000
[ Sun Apr  8 23:09:10 2018 ] 	Batch(200/589) done. Loss: 1.1352  lr:0.100000
[ Sun Apr  8 23:10:51 2018 ] 	Batch(300/589) done. Loss: 0.8561  lr:0.100000
[ Sun Apr  8 23:12:33 2018 ] 	Batch(400/589) done. Loss: 1.0276  lr:0.100000
[ Sun Apr  8 23:14:15 2018 ] 	Batch(500/589) done. Loss: 1.3473  lr:0.100000
[ Sun Apr  8 23:15:43 2018 ] 	Mean training loss: 1.0431.
[ Sun Apr  8 23:15:43 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 23:15:43 2018 ] Training epoch: 10
[ Sun Apr  8 23:15:57 2018 ] 	Batch(0/589) done. Loss: 1.2543  lr:0.100000
[ Sun Apr  8 23:17:39 2018 ] 	Batch(100/589) done. Loss: 0.8085  lr:0.100000
[ Sun Apr  8 23:19:20 2018 ] 	Batch(200/589) done. Loss: 1.0412  lr:0.100000
[ Sun Apr  8 23:21:02 2018 ] 	Batch(300/589) done. Loss: 0.9332  lr:0.100000
[ Sun Apr  8 23:22:43 2018 ] 	Batch(400/589) done. Loss: 1.0560  lr:0.100000
[ Sun Apr  8 23:24:25 2018 ] 	Batch(500/589) done. Loss: 0.9087  lr:0.100000
[ Sun Apr  8 23:25:53 2018 ] 	Mean training loss: 0.9881.
[ Sun Apr  8 23:25:53 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 23:25:54 2018 ] Eval epoch: 10
[ Sun Apr  8 23:27:47 2018 ] 	Mean test loss of 296 batches: 1.0697445980197675.
[ Sun Apr  8 23:27:48 2018 ] 	Top1: 68.29%
[ Sun Apr  8 23:27:48 2018 ] 	Top5: 94.53%
[ Sun Apr  8 23:27:48 2018 ] Training epoch: 11
[ Sun Apr  8 23:28:01 2018 ] 	Batch(0/589) done. Loss: 0.6880  lr:0.100000
[ Sun Apr  8 23:29:42 2018 ] 	Batch(100/589) done. Loss: 1.1329  lr:0.100000
[ Sun Apr  8 23:31:23 2018 ] 	Batch(200/589) done. Loss: 0.9698  lr:0.100000
[ Sun Apr  8 23:33:05 2018 ] 	Batch(300/589) done. Loss: 0.6172  lr:0.100000
[ Sun Apr  8 23:34:47 2018 ] 	Batch(400/589) done. Loss: 0.9810  lr:0.100000
[ Sun Apr  8 23:36:31 2018 ] 	Batch(500/589) done. Loss: 0.8487  lr:0.100000
[ Sun Apr  8 23:38:01 2018 ] 	Mean training loss: 0.9404.
[ Sun Apr  8 23:38:01 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 23:38:01 2018 ] Training epoch: 12
[ Sun Apr  8 23:38:15 2018 ] 	Batch(0/589) done. Loss: 0.8225  lr:0.100000
[ Sun Apr  8 23:39:57 2018 ] 	Batch(100/589) done. Loss: 0.9550  lr:0.100000
[ Sun Apr  8 23:41:40 2018 ] 	Batch(200/589) done. Loss: 0.9237  lr:0.100000
[ Sun Apr  8 23:43:23 2018 ] 	Batch(300/589) done. Loss: 0.7804  lr:0.100000
[ Sun Apr  8 23:45:06 2018 ] 	Batch(400/589) done. Loss: 0.7944  lr:0.100000
[ Sun Apr  8 23:46:51 2018 ] 	Batch(500/589) done. Loss: 0.6681  lr:0.100000
[ Sun Apr  8 23:48:20 2018 ] 	Mean training loss: 0.9031.
[ Sun Apr  8 23:48:20 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 23:48:20 2018 ] Training epoch: 13
[ Sun Apr  8 23:48:34 2018 ] 	Batch(0/589) done. Loss: 1.0019  lr:0.100000
[ Sun Apr  8 23:50:15 2018 ] 	Batch(100/589) done. Loss: 1.1436  lr:0.100000
[ Sun Apr  8 23:51:57 2018 ] 	Batch(200/589) done. Loss: 0.9631  lr:0.100000
[ Sun Apr  8 23:53:39 2018 ] 	Batch(300/589) done. Loss: 0.8120  lr:0.100000
[ Sun Apr  8 23:55:21 2018 ] 	Batch(400/589) done. Loss: 1.2053  lr:0.100000
[ Sun Apr  8 23:57:02 2018 ] 	Batch(500/589) done. Loss: 0.6185  lr:0.100000
[ Sun Apr  8 23:58:30 2018 ] 	Mean training loss: 0.8703.
[ Sun Apr  8 23:58:30 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Sun Apr  8 23:58:30 2018 ] Training epoch: 14
[ Sun Apr  8 23:58:44 2018 ] 	Batch(0/589) done. Loss: 0.7425  lr:0.100000
[ Mon Apr  9 00:00:25 2018 ] 	Batch(100/589) done. Loss: 0.8590  lr:0.100000
[ Mon Apr  9 00:02:07 2018 ] 	Batch(200/589) done. Loss: 0.7516  lr:0.100000
[ Mon Apr  9 00:03:49 2018 ] 	Batch(300/589) done. Loss: 0.8640  lr:0.100000
[ Mon Apr  9 00:05:30 2018 ] 	Batch(400/589) done. Loss: 0.6930  lr:0.100000
[ Mon Apr  9 00:07:11 2018 ] 	Batch(500/589) done. Loss: 0.9798  lr:0.100000
[ Mon Apr  9 00:08:40 2018 ] 	Mean training loss: 0.8336.
[ Mon Apr  9 00:08:40 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Mon Apr  9 00:08:40 2018 ] Training epoch: 15
[ Mon Apr  9 00:08:54 2018 ] 	Batch(0/589) done. Loss: 0.9048  lr:0.100000
[ Mon Apr  9 00:10:34 2018 ] 	Batch(100/589) done. Loss: 0.7716  lr:0.100000
[ Mon Apr  9 00:12:16 2018 ] 	Batch(200/589) done. Loss: 0.4784  lr:0.100000
[ Mon Apr  9 00:13:57 2018 ] 	Batch(300/589) done. Loss: 0.6179  lr:0.100000
[ Mon Apr  9 00:15:39 2018 ] 	Batch(400/589) done. Loss: 0.9232  lr:0.100000
[ Mon Apr  9 00:17:20 2018 ] 	Batch(500/589) done. Loss: 0.7198  lr:0.100000
[ Mon Apr  9 00:18:49 2018 ] 	Mean training loss: 0.7999.
[ Mon Apr  9 00:18:49 2018 ] 	Time consumption: [Data]03%, [Network]97%
[ Mon Apr  9 00:18:49 2018 ] Eval epoch: 15
[ Mon Apr  9 00:20:43 2018 ] 	Mean test loss of 296 batches: 6.906595945358276.
[ Mon Apr  9 00:20:44 2018 ] 	Top1: 22.58%
[ Mon Apr  9 00:20:44 2018 ] 	Top5: 46.53%
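One common remedy when training diverges at a fixed lr of 0.1 like this is a stepped decay rather than a constant rate. A minimal pure-Python sketch of such a schedule (the milestone epochs and gamma here are illustrative, not the repo's defaults):

```python
def stepped_lr(base_lr, epoch, milestones=(10, 50), gamma=0.1):
    """Decay base_lr by gamma at each milestone epoch.

    Equivalent in spirit to torch.optim.lr_scheduler.MultiStepLR: the
    learning rate stays at base_lr until the first milestone, then is
    multiplied by gamma at every milestone passed.
    """
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr
```

With this schedule the late-epoch updates are much smaller, which often stops the kind of test-loss blowup shown in the log above.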

RuntimeError

Thanks for your great work~ I tried to run the code, but ran into a problem.

Code version (Git Hash) and PyTorch version

Python 3.6.3 | Anaconda 3 (64-bit) | PyTorch 0.3.0
System: Ubuntu 16.04 LTS

Dataset used

Kinetics-skeleton

Expected behavior

Evaluation

Actual behavior

python main.py --config config/st_gcn/kinetics-skeleton/test.yaml

[ Thu May 3 20:01:10 2018 ] Load weights from ./model/kinetics-st_gcn.pt.
[ Thu May 3 20:01:10 2018 ] Model: st_gcn.net.ST_GCN.
[ Thu May 3 20:01:10 2018 ] Weights: ./model/kinetics-st_gcn.pt.
[ Thu May 3 20:01:10 2018 ] Eval epoch: 1
Traceback (most recent call last):
File "main.py", line 426, in <module>
processor.start()
File "main.py", line 388, in start
epoch=0, save_score=self.arg.save_score, loader_name=['test'])
File "main.py", line 344, in eval
output = self.model(data)
File "/home/andrew/anaconda3/envs/pytorch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
result = self.forward(*input, **kwargs)
File "/home/extended/st-gcn-master/st_gcn/net/st_gcn.py", line 150, in forward
x = self.gcn0(x)
File "/home/andrew/anaconda3/envs/pytorch-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
result = self.forward(*input, **kwargs)
File "/home/extended/st-gcn-master/st_gcn/net/unit_gcn.py", line 70, in forward
self.A = self.A.cuda(x.get_device())
RuntimeError: get_device is not implemented for type torch.FloatTensor

Steps to reproduce the behavior

After running python main.py --config config/st_gcn/kinetics-skeleton/test.yaml, I got this RuntimeError
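The error says the input tensor is a CPU torch.FloatTensor, and get_device() is only defined for CUDA tensors. A hedged sketch of a guard for the unit_gcn line in the traceback (`adjacency_on_device_of` is an illustrative name, not a function in the repo):

```python
import torch

def adjacency_on_device_of(A, x):
    """Only call .cuda() on the adjacency matrix when the input actually
    lives on a GPU; x.get_device() raises on CPU tensors, which is the
    RuntimeError in the traceback. On CPU, A is returned unchanged."""
    if x.is_cuda:
        return A.cuda(x.get_device())
    return A
```

The usual root cause is that the data (or the model) was never moved to the GPU before the forward pass, so checking where `data` is placed in `eval` is also worth doing.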

Other comments

[Kinetics Skeleton] Sample corruption? (sample has completely zero values only)

Hi!
Thank you for providing such nice code.

I downloaded Kinetics-skeleton from GoogleDrive and followed the rebuild process as follows:
python tools/kinetics_gendata.py --data_path <path to kinetics-skeleton>

Based on your code, I switched the model to a stacked GRU, and it worked on the NTU RGB+D dataset.
However, it does not work well on Kinetics-skeleton.
I found that the problem is caused by some bad samples, as follows:

  • For example, in "kinetics_val_label.json", "_2siVCR5EBs.json" has "has_skeleton" set to true, but the file actually contains completely empty skeleton values across all time steps. Hence, ignore_empty_sample in "feeder_kinetics.py" does not work properly.
  • Also, for "_2hHyXxsyFo.json" in "kinetics_val", the sample mixes frames with and without skeletons over time, e.g. [0,0,1,1,0,0,0,0,0,1,1,1,1] (0 = empty skeleton, 1 = skeleton present). I think the gaps between skeleton frames may introduce discontinuity.

Are these samples intended, or are they mistakes?
I would really appreciate an answer.
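A quick way to confirm which generated samples are fully empty, so they can be filtered before training (a hedged sketch; `(C, T, V, M)` is the joint-array layout kinetics_gendata produces, and `has_valid_skeleton` is an illustrative name):

```python
import numpy as np

def has_valid_skeleton(sample):
    """sample: joint array of shape (C, T, V, M).

    Returns False for the kind of all-zero sample described above, which
    slips past ignore_empty_sample despite has_skeleton being true in the
    label file."""
    return bool(np.any(sample != 0))
```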

Run it on CPU

Code version (Git Hash) and PyTorch version

Python 3.6.4 | Anaconda 3 (64-bit) | PyTorch 0.3.1.post2, CPU
Dual-boot system: Windows & Ubuntu

Dataset used

Kinetics-skeleton

Expected behavior

Run the code in a CPU-only environment

Actual behavior

python main.py --config config/st_gcn/kinetics-skeleton/test.yaml
Traceback (most recent call last):
File "main.py", line 425, in <module>
processor = Processor(arg)
File "main.py", line 155, in __init__
self.load_model()
File "main.py", line 178, in load_model
self.model = Model(**self.arg.model_args).cuda(output_device)
File "/home/extend/PycharmProjects/st-gcn-master/st_gcn/net/st_gcn.py", line 71, in __init__
self.A = torch.from_numpy(self.graph.A).float().cuda(0)
File "/etc/Anaconda3/envs/pytorch-env/lib/python3.6/site-packages/torch/_utils.py", line 61, in _cuda
with torch.cuda.device(device):
File "/etc/Anaconda3/envs/pytorch-env/lib/python3.6/site-packages/torch/cuda/__init__.py", line 207, in __enter__
self.prev_idx = torch._C._cuda_getDevice()
AttributeError: module 'torch._C' has no attribute '_cuda_getDevice'

Steps to reproduce the behavior

Data prepared and model downloaded; after running python main.py --config config/st_gcn/kinetics-skeleton/test.yaml, I got this AttributeError.
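The traceback shows the direct cause: st_gcn.py calls .cuda(0) unconditionally, which fails on a CPU-only PyTorch build. A hedged sketch of a guard (this is generic advice, not a CPU mode the repo officially supports; `to_best_device` is an illustrative name — pretrained checkpoints would additionally need torch.load(..., map_location='cpu')):

```python
import torch

def to_best_device(t):
    # Guarded version of the unconditional `.cuda(0)` from st_gcn.py
    # line 71: only move to the GPU when CUDA is actually available, so
    # CPU-only installs never reach torch._C._cuda_getDevice.
    return t.cuda(0) if torch.cuda.is_available() else t
```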

Other comments

Cannot test pretrained models

Hi @yysijie @yjxiong!
Thanks for your great work!
I'm studying your work, but I ran into the problems below. Could you please give me some hints?

Code version (Git Hash) and PyTorch version

PyTorch 0.4.

Dataset used

Kinetics-skeleton
RuntimeError: Error(s) in loading state_dict for Model:
Unexpected key(s) in state_dict: "backbone.0.gcn1.mask", "backbone.0.gcn1.conv_list.0.weight", "backbone.0.gcn1.conv_list.0.bias", "backbone.0.gcn1.conv_list.1.weight", "backbone.0.gcn1.conv_list.1.bias", "backbone.0.gcn1.conv_list.2.weight", "backbone.0.gcn1.conv_list.2.bias", "backbone.0.gcn1.bn.weight", "backbone.0.gcn1.bn.bias", "backbone.0.gcn1.bn.running_mean", "backbone.0.gcn1.bn.running_var", "backbone.0.tcn1.conv.weight", "backbone.0.tcn1.conv.bias", "backbone.0.tcn1.bn.weight", "backbone.0.tcn1.bn.bias", "backbone.0.tcn1.bn.running_mean", "backbone.0.tcn1.bn.running_var", "backbone.1.gcn1.mask", "backbone.1.gcn1.conv_list.0.weight", "backbone.1.gcn1.conv_list.0.bias", "backbone.1.gcn1.conv_list.1.weight", "backbone.1.gcn1.conv_list.1.bias", "backbone.1.gcn1.conv_list.2.weight", "backbone.1.gcn1.conv_list.2.bias", "backbone.1.gcn1.bn.weight", "backbone.1.gcn1.bn.bias", "backbone.1.gcn1.bn.running_mean", "backbone.1.gcn1.bn.running_var", "backbone.1.tcn1.conv.weight", "backbone.1.tcn1.conv.bias", "backbone.1.tcn1.bn.weight", "backbone.1.tcn1.bn.bias", "backbone.1.tcn1.bn.running_mean", "backbone.1.tcn1.bn.running_var", "backbone.2.gcn1.mask", "backbone.2.gcn1.conv_list.0.weight", "backbone.2.gcn1.conv_list.0.bias", "backbone.2.gcn1.conv_list.1.weight", "backbone.2.gcn1.conv_list.1.bias", "backbone.2.gcn1.conv_list.2.weight", "backbone.2.gcn1.conv_list.2.bias", "backbone.2.gcn1.bn.weight", "backbone.2.gcn1.bn.bias", "backbone.2.gcn1.bn.running_mean", "backbone.2.gcn1.bn.running_var", "backbone.2.tcn1.conv.weight", "backbone.2.tcn1.conv.bias", "backbone.2.tcn1.bn.weight", "backbone.2.tcn1.bn.bias", "backbone.2.tcn1.bn.running_mean", "backbone.2.tcn1.bn.running_var", "backbone.3.gcn1.mask", "backbone.3.gcn1.conv_list.0.weight", "backbone.3.gcn1.conv_list.0.bias", "backbone.3.gcn1.conv_list.1.weight", "backbone.3.gcn1.conv_list.1.bias", "backbone.3.gcn1.conv_list.2.weight", "backbone.3.gcn1.conv_list.2.bias", "backbone.3.gcn1.bn.weight", 
"backbone.3.gcn1.bn.bias", "backbone.3.gcn1.bn.running_mean", "backbone.3.gcn1.bn.running_var", "backbone.3.tcn1.conv.weight", "backbone.3.tcn1.conv.bias", "backbone.3.tcn1.bn.weight", "backbone.3.tcn1.bn.bias", "backbone.3.tcn1.bn.running_mean", "backbone.3.tcn1.bn.running_var", "backbone.3.down1.conv.weight", "backbone.3.down1.conv.bias", "backbone.3.down1.bn.weight", "backbone.3.down1.bn.bias", "backbone.3.down1.bn.running_mean", "backbone.3.down1.bn.running_var", "backbone.4.gcn1.mask", "backbone.4.gcn1.conv_list.0.weight", "backbone.4.gcn1.conv_list.0.bias", "backbone.4.gcn1.conv_list.1.weight", "backbone.4.gcn1.conv_list.1.bias", "backbone.4.gcn1.conv_list.2.weight", "backbone.4.gcn1.conv_list.2.bias", "backbone.4.gcn1.bn.weight", "backbone.4.gcn1.bn.bias", "backbone.4.gcn1.bn.running_mean", "backbone.4.gcn1.bn.running_var", "backbone.4.tcn1.conv.weight", "backbone.4.tcn1.conv.bias", "backbone.4.tcn1.bn.weight", "backbone.4.tcn1.bn.bias", "backbone.4.tcn1.bn.running_mean", "backbone.4.tcn1.bn.running_var", "backbone.5.gcn1.mask", "backbone.5.gcn1.conv_list.0.weight", "backbone.5.gcn1.conv_list.0.bias", "backbone.5.gcn1.conv_list.1.weight", "backbone.5.gcn1.conv_list.1.bias", "backbone.5.gcn1.conv_list.2.weight", "backbone.5.gcn1.conv_list.2.bias", "backbone.5.gcn1.bn.weight", "backbone.5.gcn1.bn.bias", "backbone.5.gcn1.bn.running_mean", "backbone.5.gcn1.bn.running_var", "backbone.5.tcn1.conv.weight", "backbone.5.tcn1.conv.bias", "backbone.5.tcn1.bn.weight", "backbone.5.tcn1.bn.bias", "backbone.5.tcn1.bn.running_mean", "backbone.5.tcn1.bn.running_var", "backbone.6.gcn1.mask", "backbone.6.gcn1.conv_list.0.weight", "backbone.6.gcn1.conv_list.0.bias", "backbone.6.gcn1.conv_list.1.weight", "backbone.6.gcn1.conv_list.1.bias", "backbone.6.gcn1.conv_list.2.weight", "backbone.6.gcn1.conv_list.2.bias", "backbone.6.gcn1.bn.weight", "backbone.6.gcn1.bn.bias", "backbone.6.gcn1.bn.running_mean", "backbone.6.gcn1.bn.running_var", "backbone.6.tcn1.conv.weight", 
"backbone.6.tcn1.conv.bias", "backbone.6.tcn1.bn.weight", "backbone.6.tcn1.bn.bias", "backbone.6.tcn1.bn.running_mean", "backbone.6.tcn1.bn.running_var", "backbone.6.down1.conv.weight", "backbone.6.down1.conv.bias", "backbone.6.down1.bn.weight", "backbone.6.down1.bn.bias", "backbone.6.down1.bn.running_mean", "backbone.6.down1.bn.running_var", "backbone.7.gcn1.mask", "backbone.7.gcn1.conv_list.0.weight", "backbone.7.gcn1.conv_list.0.bias", "backbone.7.gcn1.conv_list.1.weight", "backbone.7.gcn1.conv_list.1.bias", "backbone.7.gcn1.conv_list.2.weight", "backbone.7.gcn1.conv_list.2.bias", "backbone.7.gcn1.bn.weight", "backbone.7.gcn1.bn.bias", "backbone.7.gcn1.bn.running_mean", "backbone.7.gcn1.bn.running_var", "backbone.7.tcn1.conv.weight", "backbone.7.tcn1.conv.bias", "backbone.7.tcn1.bn.weight", "backbone.7.tcn1.bn.bias", "backbone.7.tcn1.bn.running_mean", "backbone.7.tcn1.bn.running_var", "backbone.8.gcn1.mask", "backbone.8.gcn1.conv_list.0.weight", "backbone.8.gcn1.conv_list.0.bias", "backbone.8.gcn1.conv_list.1.weight", "backbone.8.gcn1.conv_list.1.bias", "backbone.8.gcn1.conv_list.2.weight", "backbone.8.gcn1.conv_list.2.bias", "backbone.8.gcn1.bn.weight", "backbone.8.gcn1.bn.bias", "backbone.8.gcn1.bn.running_mean", "backbone.8.gcn1.bn.running_var", "backbone.8.tcn1.conv.weight", "backbone.8.tcn1.conv.bias", "backbone.8.tcn1.bn.weight", "backbone.8.tcn1.bn.bias", "backbone.8.tcn1.bn.running_mean", "backbone.8.tcn1.bn.running_var", "gcn0.mask", "gcn0.conv_list.0.weight", "gcn0.conv_list.0.bias", "gcn0.conv_list.1.weight", "gcn0.conv_list.1.bias", "gcn0.conv_list.2.weight", "gcn0.conv_list.2.bias", "gcn0.bn.weight", "gcn0.bn.bias", "gcn0.bn.running_mean", "gcn0.bn.running_var", "tcn0.conv.weight", "tcn0.conv.bias", "tcn0.bn.weight", "tcn0.bn.bias", "tcn0.bn.running_mean", "tcn0.bn.running_var", "person_bn.weight", "person_bn.bias", "person_bn.running_mean", "person_bn.running_var".
While copying the parameter named "data_bn.weight", whose dimensions in the model are torch.Size([54]) and whose dimensions in the checkpoint are torch.Size([108]).
While copying the parameter named "data_bn.bias", whose dimensions in the model are torch.Size([54]) and whose dimensions in the checkpoint are torch.Size([108]).
While copying the parameter named "data_bn.running_mean", whose dimensions in the model are torch.Size([54]) and whose dimensions in the checkpoint are torch.Size([108]).
While copying the parameter named "data_bn.running_var", whose dimensions in the model are torch.Size([54]) and whose dimensions in the checkpoint are torch.Size([108]).
While copying the parameter named "fcn.weight", whose dimensions in the model are torch.Size([400, 256, 1, 1]) and whose dimensions in the checkpoint are torch.Size([400, 256, 1]).


Crashes with what appears to be an input buffer error

The training starts off fine but crashes with the following error after some iterations:

[ Thu Jan 25 14:23:24 2018 ] Training epoch: 1
[ Thu Jan 25 14:27:41 2018 ]    Batch(0/940) done. Loss: 6.3487  lr:0.100000
[ Thu Jan 25 14:36:03 2018 ]    Batch(100/940) done. Loss: 5.6673  lr:0.100000
[ Thu Jan 25 14:44:26 2018 ]    Batch(200/940) done. Loss: 5.4984  lr:0.100000
[ Thu Jan 25 14:52:48 2018 ]    Batch(300/940) done. Loss: 5.3150  lr:0.100000
Traceback (most recent call last):
  File "main.py", line 426, in <module>
    processor.start()
  File "main.py", line 371, in start
    self.train(epoch, save_model=save_model)
  File "main.py", line 298, in train
    loss.backward()
  File "/home/anaconda2/lib/python2.7/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/anaconda2/lib/python2.7/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
RuntimeError: torch/csrc/autograd/input_buffer.cpp:14: add: Assertion `pos >= 0 && pos < buffer.size()` failed.

Wondering if anyone has seen this before? I'm training on the Kinetics-skeleton data downloaded from the provided link.

Cannot download pretrained model

Hi Sijie,
I am trying to run your program, but I cannot download the pretrained models due to the GFW in mainland China. Could you please also provide them via Baidu Yun or another mirror?

Many thanks.

--save-score

Code version (Git Hash) and PyTorch version

Git hash 6542289, PyTorch 0.3.1, Python 3.6.4

Dataset used

NTU-RGB-D

Expected behavior

save the scores

Actual behavior

Traceback (most recent call last):
File "/home/ss/eclipse-workspace/CNN_Skeleton_NTURGB/main.py", line 428, in <module>
processor.start()
File "/home/ss/eclipse-workspace/CNN_Skeleton_NTURGB/main.py", line 390, in start
epoch=0, save_score=self.arg.save_score, loader_name=['test'])
File "/home/ss/eclipse-workspace/CNN_Skeleton_NTURGB/main.py", line 362, in eval
pickle.dump(score_dict, f)
TypeError: write() argument must be str, not bytes

Steps to reproduce the behavior

I want to save the scores, so in main.py I changed the default to True, as below:
'--save-score',
type=str2bool,
default=True,
help='if true, the classification score will be stored')
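The TypeError itself points at the file mode rather than the flag: pickle writes bytes, so the score file must be opened in binary mode. A minimal sketch of the fix (`save_scores` and the toy dict are illustrative, not the repo's code):

```python
import pickle

def save_scores(score_dict, path):
    # 'wb', not 'w': pickle.dump emits bytes, and writing bytes to a
    # text-mode file raises exactly
    # "write() argument must be str, not bytes".
    with open(path, 'wb') as f:
        pickle.dump(score_dict, f)
```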

Other comments

Skeleton Input Format and Preprocessing

Hi and thank you very much for making your work publicly available.

I have some questions regarding the input format of the skeletons on a custom dataset. I can extract skeletons with 18 2D joints, where the x-coordinate is in the range [0, image_width] and the y-coordinate is in the range [0, image_height]. In your Kinetics model, however, the joints are in the range [0, 1]. So, should I simply divide by the image dimensions in the natural way to get values in [0, 1]?

In some skeleton-based approaches, the skeletons are normalized with tricks like recalculating the coordinates relative to the skeleton's center, or dividing by the skeleton's height and width, etc. Would such techniques be necessary with your approach?
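For the first question, a plain per-axis division is the natural way to reach [0, 1]. A hedged sketch (the `(T, 18, 2)` array layout and the function name are assumptions for illustration, not the repo's feeder format):

```python
import numpy as np

def normalize_joints(joints, image_w, image_h):
    """joints: (T, 18, 2) pixel coordinates with (x, y) in the last axis.

    Scale x by the image width and y by the image height so both land in
    [0, 1], matching the Kinetics-skeleton convention. Centering on a
    root joint would be a separate, optional preprocessing step."""
    out = joints.astype(np.float64).copy()
    out[..., 0] /= image_w
    out[..., 1] /= image_h
    return out
```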

val_label.pkl not found

When I run the command python main.py recognition -c config/st_gcn/kinetics-skeleton/test.yaml
val_label.pkl is not found:
Traceback (most recent call last):
File "main.py", line 30, in
p = Processor(sys.argv[2:])
File "/home/lc/Downloads/st-gcn/processor/processor.py", line 33, in init
self.load_data()
File "/home/lc/Downloads/st-gcn/processor/processor.py", line 62, in load_data
dataset=Feeder(**self.arg.test_feeder_args),
File "/home/lc/Downloads/st-gcn/feeder/feeder.py", line 48, in init
self.load_data(mmap)
File "/home/lc/Downloads/st-gcn/feeder/feeder.py", line 54, in load_data
with open(self.label_path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: './data/Kinetics/kinetics-skeleton/val_label.pkl'

where is the label of estimation result? SoftMax classifier?

Thanks for your great work!
I have read your paper and found that you feed a 256-dimensional feature vector to a SoftMax classifier, but I didn't find the "softmax" in your code. So I don't know how to get the classification result of the estimation.
Do I misunderstand something?
Actually, I'm trying to make a demo of your work.
How can I input one ".skeleton" file from the NTU RGB+D dataset and get its estimated label?
Do you have any such script, or could you give me some tips?
Thank you very much!
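One plausible explanation for the missing "softmax": PyTorch's CrossEntropyLoss applies log-softmax internally, so the network needs no explicit Softmax layer, and since softmax is monotonic, the predicted label is simply the argmax of the raw logits. A quick numeric check (values made up):

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([1.2, 0.3, 2.5])  # hypothetical 3-class output of the last layer
probs = softmax(logits)

# softmax preserves ordering, so the class prediction can be read
# directly from the raw logits -- no explicit Softmax layer needed
assert np.argmax(probs) == np.argmax(logits)
```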

can you help me?

I ran this command: python3.6 main.py demo --openpose /home/lc/Downloads/openpose-master/cmake-build-debug --video /home/lc/Downloads/st-gcn/resource/media/clean_and_jerk.mp4 --device 0
and then got an error:
ERROR: unknown command line flag 'display'
ERROR: unknown command line flag 'write_json'
Can not find pose estimation results.

window_size parameter

Dataset used

Kinetics dataset

Other comments

In the paper we read that:

In practice, we represent the clips with tensors of (3, T, 18, 2) dimensions. For simplicity, we pad every clip by replaying the sequence from the start to have T = 300.

If I understand correctly, this T is related to the window_size parameter in the code. However, I am a bit confused as this parameter is used in both the model and the feeder. What exactly is the difference between these two?
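For reference, the T = 300 padding described in the quoted passage can be sketched like this (function name is mine):

```python
import numpy as np

def pad_by_replay(clip, T=300):
    # clip: (C, t, V, M); replay the sequence from the start until length T
    reps = -(-T // clip.shape[1])  # ceiling division
    return np.concatenate([clip] * reps, axis=1)[:, :T]

# a 120-frame clip: 3 channels, 18 joints, 2 bodies, frame index encoded in values
clip = np.arange(120).reshape(1, 120, 1, 1) * np.ones((3, 1, 18, 2))
padded = pad_by_replay(clip)
# frame 120 of the padded clip is a replay of frame 0 of the original
```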

Hi, I had some problems. Can you help me?

I want to train the model myself,
but when I run python main.py --config config/st_gcn/nturgbd-cross-subject/train.yaml

I get this error message
RuntimeError: invalid argument 1: must be strictly positive at c:\anaconda2\conda-bld\pytorch_1519496000060\work\torch\lib\th\generic/THTensorMath.c:2247

Can you tell me what happened?
And how can I fix it?
Thank you.

Hi~Could someone help me? (Kinetics-skeleton)

Hi. I met some problems when I tried to test in real time. Looking forward to your reply. Thanks in advance!

Code version (Git Hash) and PyTorch version

pytorch version: 0.3.0

Dataset used

Kinetics-skeleton

Expected behavior

Want to test in real time

Actual behavior

I want to recognize the action in my video. I have used the script tools/convert.py to convert Openpose-format json to st-gcn-format json. Then how do I feed the converted st-gcn json file to the model? What is fake_label.json for? Could you please give me some advice?

Steps to reproduce the behavior

After using Openpose to process the original video, I got the output with pose estimation and Openpose-format json files. Then I used the script tools/convert.py to convert the Openpose-format json files into an st-gcn-format json file and fake_label.json.

Other comments

Visualization of ST-GCN in Action

The README file contains some visualizations showing the neural response magnitude of each node in the last layer of ST-GCN.

Would you share the code for these?

Underfitting on Kinetics

I've trained st-gcn on your provided Kinetics dataset and found that it converges much more slowly than on NTU RGB-D. The loss is still about 3.0 at epoch 60.

[ Mon Jun 11 03:58:19 2018 ] Training epoch: 58
[ Mon Jun 11 03:59:10 2018 ] 	Batch(0/940) done. Loss: 3.2975  lr:0.000010
[ Mon Jun 11 04:01:25 2018 ] 	Batch(100/940) done. Loss: 3.1727  lr:0.000010
[ Mon Jun 11 04:03:28 2018 ] 	Batch(200/940) done. Loss: 3.1806  lr:0.000010
[ Mon Jun 11 04:05:30 2018 ] 	Batch(300/940) done. Loss: 3.0176  lr:0.000010
[ Mon Jun 11 04:07:33 2018 ] 	Batch(400/940) done. Loss: 3.1771  lr:0.000010
[ Mon Jun 11 04:09:35 2018 ] 	Batch(500/940) done. Loss: 3.0130  lr:0.000010
[ Mon Jun 11 04:11:38 2018 ] 	Batch(600/940) done. Loss: 3.1398  lr:0.000010
[ Mon Jun 11 04:13:39 2018 ] 	Batch(700/940) done. Loss: 3.1270  lr:0.000010
[ Mon Jun 11 04:15:43 2018 ] 	Batch(800/940) done. Loss: 3.1568  lr:0.000010
[ Mon Jun 11 04:17:44 2018 ] 	Batch(900/940) done. Loss: 3.2155  lr:0.000010
[ Mon Jun 11 04:18:30 2018 ] 	Mean training loss: 3.0334.
[ Mon Jun 11 04:18:30 2018 ] 	Time consumption: [Data]07%, [Network]93%
[ Mon Jun 11 04:18:30 2018 ] Training epoch: 59
[ Mon Jun 11 04:19:20 2018 ] 	Batch(0/940) done. Loss: 3.0132  lr:0.000010
[ Mon Jun 11 04:21:36 2018 ] 	Batch(100/940) done. Loss: 3.0151  lr:0.000010
[ Mon Jun 11 04:23:40 2018 ] 	Batch(200/940) done. Loss: 2.9103  lr:0.000010
[ Mon Jun 11 04:25:43 2018 ] 	Batch(300/940) done. Loss: 3.1019  lr:0.000010
[ Mon Jun 11 04:27:47 2018 ] 	Batch(400/940) done. Loss: 3.0053  lr:0.000010
[ Mon Jun 11 04:29:49 2018 ] 	Batch(500/940) done. Loss: 3.0840  lr:0.000010
[ Mon Jun 11 04:31:52 2018 ] 	Batch(600/940) done. Loss: 3.0048  lr:0.000010
[ Mon Jun 11 04:33:54 2018 ] 	Batch(700/940) done. Loss: 2.8002  lr:0.000010
[ Mon Jun 11 04:35:57 2018 ] 	Batch(800/940) done. Loss: 2.8962  lr:0.000010
[ Mon Jun 11 04:37:58 2018 ] 	Batch(900/940) done. Loss: 2.8665  lr:0.000010
[ Mon Jun 11 04:38:44 2018 ] 	Mean training loss: 3.0354.
[ Mon Jun 11 04:38:44 2018 ] 	Time consumption: [Data]07%, [Network]93%
[ Mon Jun 11 04:38:44 2018 ] Training epoch: 60
[ Mon Jun 11 04:39:34 2018 ] 	Batch(0/940) done. Loss: 2.9069  lr:0.000010
[ Mon Jun 11 04:41:49 2018 ] 	Batch(100/940) done. Loss: 3.2320  lr:0.000010
[ Mon Jun 11 04:43:51 2018 ] 	Batch(200/940) done. Loss: 3.0250  lr:0.000010
[ Mon Jun 11 04:45:54 2018 ] 	Batch(300/940) done. Loss: 3.0408  lr:0.000010
[ Mon Jun 11 04:47:56 2018 ] 	Batch(400/940) done. Loss: 2.9131  lr:0.000010
[ Mon Jun 11 04:49:59 2018 ] 	Batch(500/940) done. Loss: 3.0381  lr:0.000010
[ Mon Jun 11 04:52:00 2018 ] 	Batch(600/940) done. Loss: 2.9975  lr:0.000010
[ Mon Jun 11 04:54:02 2018 ] 	Batch(700/940) done. Loss: 3.0352  lr:0.000010
[ Mon Jun 11 04:56:04 2018 ] 	Batch(800/940) done. Loss: 3.0978  lr:0.000010
[ Mon Jun 11 04:58:04 2018 ] 	Batch(900/940) done. Loss: 3.0627  lr:0.000010
[ Mon Jun 11 04:58:51 2018 ] 	Mean training loss: 3.0331.
[ Mon Jun 11 04:58:51 2018 ] 	Time consumption: [Data]07%, [Network]93%
[ Mon Jun 11 04:58:51 2018 ] Eval epoch: 60
[ Mon Jun 11 05:00:30 2018 ] 	Mean test loss of 78 batches: 3.2698305753561168.
[ Mon Jun 11 05:00:31 2018 ] 	Top1: 29.97%
[ Mon Jun 11 05:00:32 2018 ] 	Top5: 52.80%

I presume this is because the Kinetics dataset is much larger than NTU, so the same network underfits it.

Strange train log of Kinetics

I tried to train on Kinetics-skeleton myself, but I am confused by the training log.
Here is the output of my training process: the loss seems to converge slowly, and the first validation accuracy is only 2.07% (2 GPUs used, batch size = 128).
Is this normal or not? Thanks for your help!

[ Wed Jun 6 16:01:01 2018 ] Training epoch: 1
[ Wed Jun 6 16:01:16 2018 ] Batch(0/1879) done. Loss: 6.3596 lr:0.100000
[ Wed Jun 6 16:02:39 2018 ] Batch(100/1879) done. Loss: 5.9635 lr:0.100000
[ Wed Jun 6 16:04:06 2018 ] Batch(200/1879) done. Loss: 5.8815 lr:0.100000
[ Wed Jun 6 16:05:30 2018 ] Batch(300/1879) done. Loss: 5.8872 lr:0.100000
[ Wed Jun 6 16:06:55 2018 ] Batch(400/1879) done. Loss: 5.9088 lr:0.100000
[ Wed Jun 6 16:08:20 2018 ] Batch(500/1879) done. Loss: 5.8656 lr:0.100000
[ Wed Jun 6 16:09:44 2018 ] Batch(600/1879) done. Loss: 5.8867 lr:0.100000
[ Wed Jun 6 16:11:09 2018 ] Batch(700/1879) done. Loss: 5.8330 lr:0.100000
[ Wed Jun 6 16:12:33 2018 ] Batch(800/1879) done. Loss: 5.9027 lr:0.100000
[ Wed Jun 6 16:14:00 2018 ] Batch(900/1879) done. Loss: 5.8755 lr:0.100000
[ Wed Jun 6 16:15:25 2018 ] Batch(1000/1879) done. Loss: 5.8760 lr:0.100000
[ Wed Jun 6 16:16:50 2018 ] Batch(1100/1879) done. Loss: 5.8521 lr:0.100000
[ Wed Jun 6 16:18:14 2018 ] Batch(1200/1879) done. Loss: 5.8933 lr:0.100000
[ Wed Jun 6 16:19:38 2018 ] Batch(1300/1879) done. Loss: 5.9311 lr:0.100000
[ Wed Jun 6 16:21:03 2018 ] Batch(1400/1879) done. Loss: 5.8277 lr:0.100000
[ Wed Jun 6 16:22:28 2018 ] Batch(1500/1879) done. Loss: 5.8691 lr:0.100000
[ Wed Jun 6 16:23:53 2018 ] Batch(1600/1879) done. Loss: 5.8348 lr:0.100000
[ Wed Jun 6 16:25:18 2018 ] Batch(1700/1879) done. Loss: 5.9588 lr:0.100000
[ Wed Jun 6 16:26:43 2018 ] Batch(1800/1879) done. Loss: 5.9448 lr:0.100000
[ Wed Jun 6 16:27:48 2018 ] Mean training loss: 5.9060.
[ Wed Jun 6 16:27:48 2018 ] Time consumption: [Data]01%, [Network]99%
[ Wed Jun 6 16:27:48 2018 ] Training epoch: 2
[ Wed Jun 6 16:28:01 2018 ] Batch(0/1879) done. Loss: 5.8867 lr:0.100000
[ Wed Jun 6 16:29:25 2018 ] Batch(100/1879) done. Loss: 5.9181 lr:0.100000
[ Wed Jun 6 16:30:49 2018 ] Batch(200/1879) done. Loss: 5.8454 lr:0.100000
[ Wed Jun 6 16:32:13 2018 ] Batch(300/1879) done. Loss: 5.8444 lr:0.100000
[ Wed Jun 6 16:33:36 2018 ] Batch(400/1879) done. Loss: 5.9084 lr:0.100000
[ Wed Jun 6 16:34:59 2018 ] Batch(500/1879) done. Loss: 5.8634 lr:0.100000
[ Wed Jun 6 16:36:22 2018 ] Batch(600/1879) done. Loss: 5.8313 lr:0.100000
[ Wed Jun 6 16:37:45 2018 ] Batch(700/1879) done. Loss: 5.9054 lr:0.100000
[ Wed Jun 6 16:39:09 2018 ] Batch(800/1879) done. Loss: 5.8981 lr:0.100000
[ Wed Jun 6 16:40:32 2018 ] Batch(900/1879) done. Loss: 5.9091 lr:0.100000
[ Wed Jun 6 16:41:56 2018 ] Batch(1000/1879) done. Loss: 5.8667 lr:0.100000
[ Wed Jun 6 16:43:19 2018 ] Batch(1100/1879) done. Loss: 5.9404 lr:0.100000
[ Wed Jun 6 16:44:42 2018 ] Batch(1200/1879) done. Loss: 5.8945 lr:0.100000
[ Wed Jun 6 16:46:05 2018 ] Batch(1300/1879) done. Loss: 5.9334 lr:0.100000
[ Wed Jun 6 16:47:28 2018 ] Batch(1400/1879) done. Loss: 5.8050 lr:0.100000
[ Wed Jun 6 16:48:50 2018 ] Batch(1500/1879) done. Loss: 5.9695 lr:0.100000
[ Wed Jun 6 16:50:13 2018 ] Batch(1600/1879) done. Loss: 5.8498 lr:0.100000
[ Wed Jun 6 16:51:36 2018 ] Batch(1700/1879) done. Loss: 5.8443 lr:0.100000
[ Wed Jun 6 16:52:59 2018 ] Batch(1800/1879) done. Loss: 5.8773 lr:0.100000
[ Wed Jun 6 16:54:03 2018 ] Mean training loss: 5.8806.
[ Wed Jun 6 16:54:03 2018 ] Time consumption: [Data]01%, [Network]99%
[ Wed Jun 6 16:54:03 2018 ] Training epoch: 3
[ Wed Jun 6 16:54:15 2018 ] Batch(0/1879) done. Loss: 5.8311 lr:0.100000
[ Wed Jun 6 16:55:38 2018 ] Batch(100/1879) done. Loss: 5.8930 lr:0.100000
[ Wed Jun 6 16:57:00 2018 ] Batch(200/1879) done. Loss: 5.8610 lr:0.100000
[ Wed Jun 6 16:58:24 2018 ] Batch(300/1879) done. Loss: 5.8864 lr:0.100000
[ Wed Jun 6 16:59:47 2018 ] Batch(400/1879) done. Loss: 5.8617 lr:0.100000
[ Wed Jun 6 17:01:10 2018 ] Batch(500/1879) done. Loss: 5.8742 lr:0.100000
[ Wed Jun 6 17:02:33 2018 ] Batch(600/1879) done. Loss: 5.8479 lr:0.100000
[ Wed Jun 6 17:03:56 2018 ] Batch(700/1879) done. Loss: 5.8783 lr:0.100000
[ Wed Jun 6 17:05:18 2018 ] Batch(800/1879) done. Loss: 5.8687 lr:0.100000
[ Wed Jun 6 17:06:41 2018 ] Batch(900/1879) done. Loss: 5.8906 lr:0.100000
[ Wed Jun 6 17:08:04 2018 ] Batch(1000/1879) done. Loss: 5.7582 lr:0.100000
[ Wed Jun 6 17:09:26 2018 ] Batch(1100/1879) done. Loss: 5.8391 lr:0.100000
[ Wed Jun 6 17:10:50 2018 ] Batch(1200/1879) done. Loss: 5.8959 lr:0.100000
[ Wed Jun 6 17:12:13 2018 ] Batch(1300/1879) done. Loss: 5.8328 lr:0.100000
[ Wed Jun 6 17:13:37 2018 ] Batch(1400/1879) done. Loss: 5.8172 lr:0.100000
[ Wed Jun 6 17:15:01 2018 ] Batch(1500/1879) done. Loss: 5.9253 lr:0.100000
[ Wed Jun 6 17:16:22 2018 ] Batch(1600/1879) done. Loss: 5.8176 lr:0.100000
[ Wed Jun 6 17:17:45 2018 ] Batch(1700/1879) done. Loss: 5.8348 lr:0.100000
[ Wed Jun 6 17:19:08 2018 ] Batch(1800/1879) done. Loss: 5.8490 lr:0.100000
[ Wed Jun 6 17:20:12 2018 ] Mean training loss: 5.8631.
[ Wed Jun 6 17:20:12 2018 ] Time consumption: [Data]01%, [Network]99%
[ Wed Jun 6 17:20:12 2018 ] Training epoch: 4
[ Wed Jun 6 17:20:24 2018 ] Batch(0/1879) done. Loss: 5.8593 lr:0.100000
[ Wed Jun 6 17:21:46 2018 ] Batch(100/1879) done. Loss: 5.8140 lr:0.100000
[ Wed Jun 6 17:23:09 2018 ] Batch(200/1879) done. Loss: 5.8602 lr:0.100000
[ Wed Jun 6 17:24:32 2018 ] Batch(300/1879) done. Loss: 5.7933 lr:0.100000
[ Wed Jun 6 17:25:55 2018 ] Batch(400/1879) done. Loss: 5.8226 lr:0.100000
[ Wed Jun 6 17:27:18 2018 ] Batch(500/1879) done. Loss: 5.8541 lr:0.100000
[ Wed Jun 6 17:28:41 2018 ] Batch(600/1879) done. Loss: 5.8988 lr:0.100000
[ Wed Jun 6 17:30:03 2018 ] Batch(700/1879) done. Loss: 5.8665 lr:0.100000
[ Wed Jun 6 17:31:26 2018 ] Batch(800/1879) done. Loss: 5.8884 lr:0.100000
[ Wed Jun 6 17:32:49 2018 ] Batch(900/1879) done. Loss: 5.8810 lr:0.100000
[ Wed Jun 6 17:34:12 2018 ] Batch(1000/1879) done. Loss: 5.9417 lr:0.100000
[ Wed Jun 6 17:35:35 2018 ] Batch(1100/1879) done. Loss: 5.8325 lr:0.100000
[ Wed Jun 6 17:36:58 2018 ] Batch(1200/1879) done. Loss: 5.8373 lr:0.100000
[ Wed Jun 6 17:38:21 2018 ] Batch(1300/1879) done. Loss: 5.9107 lr:0.100000
[ Wed Jun 6 17:39:43 2018 ] Batch(1400/1879) done. Loss: 5.8672 lr:0.100000
[ Wed Jun 6 17:41:07 2018 ] Batch(1500/1879) done. Loss: 5.8756 lr:0.100000
[ Wed Jun 6 17:42:31 2018 ] Batch(1600/1879) done. Loss: 5.8735 lr:0.100000
[ Wed Jun 6 17:43:54 2018 ] Batch(1700/1879) done. Loss: 5.8137 lr:0.100000
[ Wed Jun 6 17:45:16 2018 ] Batch(1800/1879) done. Loss: 5.8421 lr:0.100000
[ Wed Jun 6 17:46:20 2018 ] Mean training loss: 5.8499.
[ Wed Jun 6 17:46:20 2018 ] Time consumption: [Data]01%, [Network]99%
[ Wed Jun 6 17:46:20 2018 ] Training epoch: 5
[ Wed Jun 6 17:46:32 2018 ] Batch(0/1879) done. Loss: 5.8172 lr:0.100000
[ Wed Jun 6 17:47:54 2018 ] Batch(100/1879) done. Loss: 5.7880 lr:0.100000
[ Wed Jun 6 17:49:18 2018 ] Batch(200/1879) done. Loss: 5.8781 lr:0.100000
[ Wed Jun 6 17:50:40 2018 ] Batch(300/1879) done. Loss: 5.8889 lr:0.100000
[ Wed Jun 6 17:52:02 2018 ] Batch(400/1879) done. Loss: 5.8591 lr:0.100000
[ Wed Jun 6 17:53:26 2018 ] Batch(500/1879) done. Loss: 5.8163 lr:0.100000
[ Wed Jun 6 17:54:49 2018 ] Batch(600/1879) done. Loss: 5.7984 lr:0.100000
[ Wed Jun 6 17:56:12 2018 ] Batch(700/1879) done. Loss: 5.8237 lr:0.100000
[ Wed Jun 6 17:57:35 2018 ] Batch(800/1879) done. Loss: 5.7660 lr:0.100000
[ Wed Jun 6 17:58:58 2018 ] Batch(900/1879) done. Loss: 5.7854 lr:0.100000
[ Wed Jun 6 18:00:21 2018 ] Batch(1000/1879) done. Loss: 5.8939 lr:0.100000
[ Wed Jun 6 18:01:44 2018 ] Batch(1100/1879) done. Loss: 5.8935 lr:0.100000
[ Wed Jun 6 18:03:06 2018 ] Batch(1200/1879) done. Loss: 5.7438 lr:0.100000
[ Wed Jun 6 18:04:29 2018 ] Batch(1300/1879) done. Loss: 5.7744 lr:0.100000
[ Wed Jun 6 18:05:52 2018 ] Batch(1400/1879) done. Loss: 5.8634 lr:0.100000
[ Wed Jun 6 18:07:15 2018 ] Batch(1500/1879) done. Loss: 5.9241 lr:0.100000
[ Wed Jun 6 18:08:38 2018 ] Batch(1600/1879) done. Loss: 5.8004 lr:0.100000
[ Wed Jun 6 18:10:00 2018 ] Batch(1700/1879) done. Loss: 5.8693 lr:0.100000
[ Wed Jun 6 18:11:22 2018 ] Batch(1800/1879) done. Loss: 5.7603 lr:0.100000
[ Wed Jun 6 18:12:27 2018 ] Mean training loss: 5.8416.
[ Wed Jun 6 18:12:27 2018 ] Time consumption: [Data]01%, [Network]99%
[ Wed Jun 6 18:12:27 2018 ] Eval epoch: 5
[ Wed Jun 6 18:13:59 2018 ] Mean test loss of 155 batches: 5.710531379330543.
[ Wed Jun 6 18:14:00 2018 ] Top1: 2.07%
[ Wed Jun 6 18:14:00 2018 ] Top5: 8.24%

out of memory

Thanks for your great work!
When I run this project, I get the following errors:
System information (version)

  • centos7

  • python=3.6.5

(pytorch) [root@localhost st-gcn]# python main.py --config config/st_gcn/kinetics-skeleton/test.yaml
[ Wed Jun  6 10:30:43 2018 ] Load weights from ./model/kinetics-st_gcn.pt.
[ Wed Jun  6 10:30:43 2018 ] Model:   st_gcn.net.ST_GCN.
[ Wed Jun  6 10:30:43 2018 ] Weights: ./model/kinetics-st_gcn.pt.
[ Wed Jun  6 10:30:43 2018 ] Eval epoch: 1
0/310
main.py:349: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  volatile=True)
main.py:353: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  volatile=True)
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1525909934016/work/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory

EOFError during Testing Pretrained Models

HI!@yysijie @yjxiong
Thanks for your great work!
I'm studying your work, but I ran into the following problems. Could you please give me some hints?

Code version (Git Hash) and PyTorch version

I USE Docker which contains:
===================== module list ===================
python 3.6 (apt), torch latest (git), chainer latest (pip), cntk 2.3 (pip), jupyter latest (pip), mxnet latest (pip), pytorch 0.3.0 (pip), tensorflow latest (pip), theano latest (git), keras latest (pip), lasagne latest (git), opencv latest (git), sonnet latest (pip), caffe latest (git)
==================================================

Dataset used

NTU RGB+D

Expected behavior

Testing Pretrained Models
run the code python main.py --config config/st_gcn/kinetics-skeleton/test.yaml

Actual behavior

root@eb6f3da57874:/data1/st-gcn/st-gcn-master# python main.py --config config/st_gcn/kinetics-skeleton/test.yaml
[ Mon Apr 23 10:08:30 2018 ] Load weights from ./model/kinetics-st_gcn.pt.
[ Mon Apr 23 10:08:31 2018 ] Model: st_gcn.net.ST_GCN.
[ Mon Apr 23 10:08:31 2018 ] Weights: ./model/kinetics-st_gcn.pt.
[ Mon Apr 23 10:08:31 2018 ] Eval epoch: 1
Process Process-8:
Process Process-23:
Process Process-22:
Process Process-25:
Process Process-21:
Process Process-24:
Process Process-2:
Process Process-5:
Process Process-13:
Process Process-7:
Process Process-14:
Process Process-11:
Process Process-12:
Process Process-35:
Process Process-32:
Process Process-31:
Process Process-19:
Process Process-28:
Process Process-30:
Process Process-33:
Process Process-16:
Process Process-26:
Process Process-18:
Process Process-27:
Process Process-29:
Process Process-15:
Process Process-50:
Process Process-51:
Process Process-45:
Process Process-44:
Process Process-46:
Traceback (most recent call last):
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 46, in _worker_loop
data_queue.put((idx, samples))
File "/usr/lib/python3.6/multiprocessing/queues.py", line 341, in put
obj = _ForkingPickler.dumps(obj)
File "/usr/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
File "/usr/local/lib/python3.6/dist-packages/torch/multiprocessing/reductions.py", line 113, in reduce_storage
fd, size = storage.share_fd()
RuntimeError: unable to write to file </torch_140_3440579072> at /pytorch/torch/lib/TH/THAllocator.c:319
.....
with _resource_sharer.get_connection(self._id) as conn:
File "/usr/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
c = Client(address, authkey=process.current_process().authkey)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 493, in Client
answer_challenge(c, authkey)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 732, in answer_challenge
message = connection.recv_bytes(256) # reject large message
File "/usr/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError

Steps to reproduce the behavior

I downloaded the pretrained models, and both datasets are ready.

MemoryError and BrokenPipeError

Code version (Git Hash) and PyTorch version

Python 3.5 pytorch latest

Dataset used

NTU-RGB-D

Expected behavior

run python main.py --config config/st_gcn/nturgbd-cross-subject/test.yaml

Actual behavior

[ Wed Apr 25 14:36:10 2018 ] Load weights from ./model/ntuxsub-st_gcn.pt.
[ Wed Apr 25 14:36:10 2018 ] Model: st_gcn.net.ST_GCN.
[ Wed Apr 25 14:36:10 2018 ] Weights: ./model/ntuxsub-st_gcn.pt.
[ Wed Apr 25 14:36:10 2018 ] Eval epoch: 1
Traceback (most recent call last):
File "", line 1, in
File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 106, in spawn_main
exitcode = _main(fd)
File "C:\Program Files\Anaconda3\lib\multiprocessing\spawn.py", line 116, in _main
self = pickle.load(from_parent)
MemoryError
Traceback (most recent call last):
File "main.py", line 426, in
processor.start()
File "main.py", line 388, in start
epoch=0, save_score=self.arg.save_score, loader_name=['test'])
File "main.py", line 335, in eval
for batch_idx, (data, label) in enumerate(self.data_loader[ln]):
File "C:\Program Files\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 310, in iter
return DataLoaderIter(self)
File "C:\Program Files\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 167, in init
w.start()
File "C:\Program Files\Anaconda3\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Program Files\Anaconda3\lib\multiprocessing\context.py", line 212, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Anaconda3\lib\multiprocessing\context.py", line 313, in _Popen
return Popen(process_obj)
File "C:\Program Files\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 66, in init
reduction.dump(process_obj, to_child)
File "C:\Program Files\Anaconda3\lib\multiprocessing\reduction.py", line 59, in dump
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

Even batch_size = 1 does not work on my GTX 1050 (4 GB). What should I do?
Thank you!

Steps to reproduce the behavior

Other comments

Help! Memory Error!

Hi, I ran into a problem. Looking forward to your reply. Thanks!

Dataset used :

Kinetics-skeleton

After rebuilding the database by this command:
python tools/kinetics_gendata.py --data_path
I got a train_data.npy file which is very large (about 29 GB). Now I want to convert it to a *.mat file for my work. However, when I use np.load() to load this file, it raises a memory error, probably because the file is too big.
What should I do to load the .npy file? Can you help me? Thanks very much!
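One workaround, if useful: np.load can memory-map the file so only the slices you index are read from disk instead of loading the whole array into RAM. Sketch below uses a small stand-in file in place of the real 29 GB train_data.npy:

```python
import numpy as np

# create a small stand-in file for illustration
np.save("demo_data.npy", np.arange(12, dtype=np.float32).reshape(3, 4))

# memory-map instead of loading everything into RAM:
# only the slices you index are actually read from disk
data = np.load("demo_data.npy", mmap_mode="r")
sample = np.array(data[0])  # materialize one sample as a regular in-memory array
```

From here you could convert the data to .mat one chunk at a time rather than all at once.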

Question about feeder_kinetics.py

Code version (Git Hash) and PyTorch version

Dataset used

Expected behavior

Actual behavior

Steps to reproduce the behavior

Other comments

Hi,

Thanks for the work. I'm using the master branch and trying to understand how you deal with Kinetics dataset. And there's something I'm confused in the script https://github.com/yysijie/st-gcn/blob/master/st_gcn/feeder/feeder_kinetics.py#L72

label_path = self.label_path
with open(label_path) as f:
    label_info = json.load(f)

sample_id = [name.split('.')[0] for name in self.sample_name]
self.label = np.array(
    [label_info[id]['label_index'] for id in sample_id])
has_skeleton = np.array(
    [label_info[id]['has_skeleton'] for id in sample_id])

If I'm not mistaken, the label file should be kinetics_train.json from this link. However, there is no key named label_index or has_skeleton in that json file; in fact, the per-video json files have label_index but not has_skeleton either.

Does that matter? And if I want to evaluate on a new dataset, do I need to create these keys?

Thanks.
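For anyone evaluating on a new dataset: one option is to generate a label file yourself that carries the keys feeder_kinetics.py reads (label, label_index, has_skeleton). A hypothetical sketch (sample ids, labels, and file name are all made up):

```python
import json

# one entry per sample id, matching the sample_name stems the feeder builds
samples = {
    "video_0001": {"label": "clean_and_jerk", "label_index": 0, "has_skeleton": True},
    "video_0002": {"label": "jogging", "label_index": 1, "has_skeleton": False},
}

with open("my_val_label.json", "w") as f:
    json.dump(samples, f)

# the feeder-style read-back: index labels and skeleton flags by sample id
with open("my_val_label.json") as f:
    label_info = json.load(f)
```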

Openpose

Hi Sir,

I am very much interested in your project and want to use it for my research purpose.

Ubuntu: 14.04
CUDA: 8
Python: 3.6

I wanted to see the demo, but I got this error. Please help me understand the issue.

screenshot from 2018-06-26 19 13 19

How to efficiently load kinetics dataset

Code version (Git Hash) and PyTorch version

pytorch4.0

Dataset used

kinetics

Expected behavior

1. (use class Feeder): After executing your script kinetics_gendata.py, I got train_data.npy, etc. Then I use the Feeder class to load it:

params.train_feeder_args["data_path"] = params.dataset_dir+'kinetics-skeleton'+'/processed/train_data.npy'
dataloader = torch.utils.data.DataLoader(
    dataset=Feeder(**params.train_feeder_args),
    batch_size=params.batch_size,
    shuffle=True,
    num_workers=params.num_workers, pin_memory=False)

I get out of memory, because train_data.npy is about 43 GB; it is too large to load without prefetching.
2. (use class Feeder_kinetics): this class loads data from the original XXX.json files.

params.train_feeder_args["data_path"] = params.dataset_dir+'/kinetics-skeleton'+'/kinetics_train'
dataloader = torch.utils.data.DataLoader(
                dataset=Feeder_kinetics(**params.train_feeder_args),
                batch_size=params.batch_size,
                shuffle=True,
                num_workers=params.num_workers, pin_memory=False)

But this way the dataloader is slow, so the model on the GPU is always waiting for data; GPU utilization is about 50% (too low).

Actual behavior

So, do you have any ideas for loading the Kinetics dataset efficiently? Thank you!

Steps to reproduce the behavior

Other comments

FileNotFoundError

Code version (Git Hash) and PyTorch version

6542289, 0.3.1, running Python 3.6.4

Dataset used

NTU-RGB-D

Expected behavior

Generate the data

Actual behavior

FileNotFoundError: [Errno 2] No such file or directory: 'data/NTU-RGB-D/samples_with_missing_skeletons.txt'

Steps to reproduce the behavior

I downloaded the NTU RGB+D skeleton files, unzipped them, placed them into the data folder under NTU-RGB-D, and ran the code. It gives me that error.

Other comments

Training on new data set

Hi, I'm trying to retrain the model on a new dataset, the "TV Human Interaction Dataset" from Oxford. After skeleton extraction with OpenPose and reformatting with your convert_poses.py, I found my data is quite different from your kinetics-skeleton dataset. It seems you normalized the "pose" section in your dataset to the [0, 1] range; should I do that as well?

Thanks for your great work!

libcaffe.so.1.0.0: cannot open shared object file

Hi,
I tried to run "python3.6 main.py demo --openpose /home/action-recognition/openpose/build" on my computer, but it failed:
2018-06-14 22-21-53
Can you help me? How can I fix it?
By the way, OpenPose itself runs successfully:
2018-06-14 22-23-35

FileNotFoundError: [Errno 2] No such file or directory

Code version (Git Hash) and PyTorch version

6542289, 0.3.1, running Python 3.6.4

Dataset used

NTU-RGB-D

Expected behavior

run the code

Actual behavior

Traceback (most recent call last):
File "/home/rss/eclipse-workspace/CNN_Skeleton_NTURGB/main.py", line 415, in
with open(p.config, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: './config/NTU-RGB-D/xview/ST_GCN.yaml'

Steps to reproduce the behavior

I did all the steps, such as data preparation, testing, and evaluation, and saw the results on the command line. Now I want to run the code (main.py) in Eclipse, but the above error appeared.

Other comments

Help with error when running demo

Code version (Git Hash) and PyTorch version:

Branch: master
Pytorch: 0.4.0

Dataset used

--

Expected behavior

Save the video in data/demo_result/

Actual behavior

/home/ximenes/.virtualenvs/gcn/bin/python3.6 /home/ximenes/PycharmProjects/graph/st-gcn/main.py demo --openpose /home/ximenes/openpose/build/ --video /home/ximenes/Desktop/openpose/videos/dataset/S001C001P001R001A001_rgb.avi
Traceback (most recent call last):
File "/home/ximenes/PycharmProjects/graph/st-gcn/main.py", line 29, in
p = Processor(sys.argv[2:])
File "/home/ximenes/PycharmProjects/graph/st-gcn/processor/io.py", line 26, in init
self.init_environment()
File "/home/ximenes/PycharmProjects/graph/st-gcn/processor/io.py", line 57, in init_environment
self.io.save_arg(self.arg)
File "/home/ximenes/.virtualenvs/gcn/lib/python3.6/site-packages/torchlight-1.0-py3.6.egg/torchlight/io.py", line 118, in save_arg
FileNotFoundError: [Errno 2] No such file or directory: '{self.work_dir}/config.yaml'

Steps to reproduce the behavior

I followed the instructions to install the libraries.

My environment
screenshot from 2018-06-15 18-49-48

Command:
/home/ximenes/.virtualenvs/gcn/bin/python3.6 /home/ximenes/PycharmProjects/graph/st-gcn/main.py demo --openpose /home/ximenes/openpose/build/ --video /home/ximenes/Desktop/openpose/videos/dataset/S001C001P001R001A001_rgb.avi

Other comments

OpenPose is working correctly on my machine.
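One observation that may help: the path in the traceback still contains the literal braces '{self.work_dir}/config.yaml', which suggests the format string in torchlight/io.py was never interpolated, e.g. a plain string where a Python 3.6 f-string was intended. The difference, illustrated (variable name and value are mine):

```python
work_dir = "./work_dir/demo"  # hypothetical value of self.work_dir

broken = '{work_dir}/config.yaml'   # plain string: braces stay literal
fixed = f'{work_dir}/config.yaml'   # f-string: braces are interpolated
```

With the literal string, open() then looks for a directory literally named '{self.work_dir}', which does not exist, hence the FileNotFoundError.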

cuda RuntimeError

I ran this command: python main.py --config config/st_gcn/nturgbd-cross-subject/test.yaml

but I received this error:

RuntimeError: cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at torch/csrc/cuda/Module.cpp:107

What should I do if I want to evaluate the pretrained model on the CPU only?
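One common approach, sketched on a tiny stand-in module (not the actual ST-GCN model or checkpoint), is to remap the checkpoint to the CPU at load time:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                     # stand-in for the ST-GCN model
torch.save(model.state_dict(), "demo.pt")   # stand-in for the pretrained weights

# map_location="cpu" remaps any CUDA tensors in the checkpoint,
# so GPU-trained weights load on machines without a CUDA driver
state = torch.load("demo.pt", map_location="cpu")
model.load_state_dict(state)
model.eval()
```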

【Kinetics Dataset】Data Corrupted or incomplete from Baiduyun or Google Drive

Hi! Thank you for your great work!

But when I download the Kinetics dataset from the links you provided, the unarchiver reports that the data is corrupted (from Baiduyun) or incomplete (from Google Drive).

If I continue to unarchive the zip file and run the command "python tools/kinetics_gendata.py --data_path "

The process fails during the check:

Traceback (most recent call last):----------------            ]
  File "tools/kinetics_gendata.py", line 84, in <module>
    gendata(data_path, label_path, data_out_path, label_out_path)
  File "tools/kinetics_gendata.py", line 57, in gendata
    data, label = feeder[i]
  File "/home/qiang/Software/st-gcn/st_gcn/feeder/feeder_kinetics.py", line 131, in __getitem__
    assert (self.label[index] == label)
AssertionError

I guess this is because the file is incomplete. Can you give me some advice?
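As a quick sanity check before re-downloading, Python's zipfile module can verify archive integrity: testzip() reads every member and returns the name of the first corrupt one, or None if the archive is intact. Shown here on a small demo archive created on the spot (the real dataset zip would go in its place):

```python
import zipfile

# create a small demo archive as a stand-in for the downloaded dataset zip
with zipfile.ZipFile("demo.zip", "w") as z:
    z.writestr("a.txt", "hello")

# testzip() CRC-checks every member; None means the archive is intact
with zipfile.ZipFile("demo.zip") as z:
    bad = z.testzip()
```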

cuda runtime error

Hi,
I tried to evaluate the ST-GCN model pretrained on the NTU RGB+D dataset, but I received this error:

RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/THC/generic/THCStorage.cu:58

this is my complete output:
python main.py --config config/st_gcn/nturgbd-cross-view/test.yaml
[ Thu Mar 29 12:40:17 2018 ] Load weights from ./model/ntuxview-st_gcn.pt.
[ Thu Mar 29 12:40:17 2018 ] Model: st_gcn.net.ST_GCN.
[ Thu Mar 29 12:40:17 2018 ] Weights: ./model/ntuxview-st_gcn.pt.
[ Thu Mar 29 12:40:17 2018 ] Eval epoch: 1
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "main.py", line 426, in
processor.start()
File "main.py", line 388, in start
epoch=0, save_score=self.arg.save_score, loader_name=['test'])
File "main.py", line 344, in eval
output = self.model(data)
File "/home/razieh/miniconda3/envs/tensorflow141/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/razieh/eclipse-workspace/CNN_Skeleton_NTURGB/st_gcn/net/st_gcn.py", line 153, in forward
x = m(x)
File "/home/razieh/miniconda3/envs/tensorflow141/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/razieh/eclipse-workspace/CNN_Skeleton_NTURGB/st_gcn/net/st_gcn.py", line 210, in forward
x = self.tcn1(self.gcn1(x)) + (x if
File "/home/razieh/miniconda3/envs/tensorflow141/lib/python3.6/site-packages/torch/nn/modules/module.py", line 357, in call
result = self.forward(*input, **kwargs)
File "/home/razieh/eclipse-workspace/CNN_Skeleton_NTURGB/st_gcn/net/unit_gcn.py", line 84, in forward
y = y + self.conv_listi
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1518243271935/work/torch/lib/THC/generic/THCStorage.cu:58

Could you please guide me on how to solve this problem?
Thank you.
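A common cause here is that the whole test batch does not fit in GPU memory. One workaround, independent of this repo's exact config keys, is to lower the test batch size (e.g. a `test_batch_size` entry in the YAML config, if present) or to split each batch into smaller slices before the forward pass. A minimal, framework-agnostic sketch of the chunking idea (`forward` is a stand-in for the model's forward pass, not this repo's API):

```python
def eval_in_chunks(forward, batch, chunk_size):
    """Run `forward` over `batch` in smaller slices to lower peak memory.

    With PyTorch you would also disable gradient tracking during evaluation
    (volatile=True on 0.3.x, torch.no_grad() on 0.4+), which by itself often
    fixes eval-time out-of-memory errors.
    """
    outputs = []
    for start in range(0, len(batch), chunk_size):
        # each slice is at most chunk_size items, so peak memory is bounded
        outputs.extend(forward(batch[start:start + chunk_size]))
    return outputs

# toy usage: a "model" that doubles its inputs
print(eval_in_chunks(lambda xs: [2 * x for x in xs], list(range(10)), 4))
# → [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

Concatenating the per-chunk outputs gives the same result as one large batch as long as the model has no cross-sample interaction at eval time (true for per-clip classification).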

Potential Use with a new dataset?

Code version (Git Hash) and PyTorch version

6542289, 0.3.1

I am wondering: if I manage to extract skeletal poses from another set of videos via OpenPose, could I train your network on that dataset?

Temporal Bias for Short Videos

In my custom dataset, one class has clips that are usually two to three times shorter than the average length of the other classes. Since the model pads short clips by repetition to reach the fixed length of 300 frames, does this introduce a bias in how shorter clips are treated? Is there a way to address this issue?

Thanks in advance.
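For context, the repetition padding described above can be sketched like this (numpy-only; the frame axis is assumed first and `target_len=300` matches the model's fixed window, but this is my own illustration, not the repo's code):

```python
import numpy as np

def pad_by_repetition(clip, target_len=300):
    """Tile a (T, ...) clip along time until it reaches target_len frames."""
    t = clip.shape[0]
    reps = -(-target_len // t)                  # ceiling division
    tiled = np.concatenate([clip] * reps, axis=0)
    return tiled[:target_len]                   # crop the overshoot

short = np.zeros((100, 18, 3))                  # a 100-frame, 18-joint clip
print(pad_by_repetition(short).shape)           # → (300, 18, 3)
```

Note the asymmetry this creates: a 100-frame clip is repeated exactly three times, while a 290-frame clip contributes its first 10 frames twice and the rest once, which is one concrete source of the bias you describe.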

FileNotFoundError

Thanks for your great work!
While studying it, I ran into the following problem:

Code version (Git Hash) and PyTorch version

Python 3.6.4 |Anaconda custom (64-bit)
dual system: windows & ubuntu

Dataset used

kinetics-skeleton

Expected behavior

Data preparation: build the Kinetics-Skeleton dataset.

Actual behavior

rui@rui-lenovo-xiaoxin-700-15isk:~/Downloads/st-gcn-master$ python tools/kinetics_gendata.py --data_path /media/rui/Study/kinetic-skeleton/kinetics-skeleton
Traceback (most recent call last):
File "tools/kinetics_gendata.py", line 84, in
gendata(data_path, label_path, data_out_path, label_out_path)
File "tools/kinetics_gendata.py", line 54, in gendata
shape=(len(sample_name), 3, max_frame, 18, num_person_out))
File "/home/rui/anaconda3/lib/python3.6/site-packages/numpy/lib/format.py", line 763, in open_memmap
fp = open(filename, mode+'b')
FileNotFoundError: [Errno 2] No such file or directory: 'data/Kinetics/kinetics-skeleton/val_data.npy'

Steps to reproduce the behavior

I downloaded Kinetics-Skeleton from Baidu Yun and unzipped the two zip segments on a Windows disk that is mounted on Ubuntu. After running the code, I got the FileNotFoundError above.
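One likely cause (an assumption based on the traceback, not verified against your setup): `open_memmap` opens `val_data.npy` for writing, and the underlying `open()` fails because the output directory `data/Kinetics/kinetics-skeleton` does not exist relative to where the script was run; numpy does not create missing parent directories. Creating it first should help:

```python
import os

# gendata writes its .npy outputs here, relative to the working directory;
# np.lib.format.open_memmap will not create missing parent directories itself
out_dir = 'data/Kinetics/kinetics-skeleton'
os.makedirs(out_dir, exist_ok=True)
print(os.path.isdir(out_dir))  # → True
```

Equivalently, `mkdir -p data/Kinetics/kinetics-skeleton` from the repo root before running `tools/kinetics_gendata.py`.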

Other

The other dataset, NTU RGB+D, which is in the same circumstances as Kinetics-Skeleton, was built successfully.

Supporting Python3

The code can now be run with Python 3. Python 3 users are advised to update to the latest version.

Inference on personal data

Hi! First of all thanks for this great project.
I am currently working on a project where I want to detect specific actions performed by pedestrians, and I thought your code could provide a great starting point.
Currently I am using OpenPose to extract two skeleton estimations from 10-second video clips at 30 fps.
From what I have seen, the JSON output format produced by OpenPose is a bit different from the input format of your network. I am planning to write a script to convert the results to the right format; I just wanted to be sure that I was not missing anything.
Similarly, I believe I need to modify your code slightly to add a method that generates predictions when no labels are provided (labels seem to be required in both the eval and train methods). Am I right about this?
Thanks for your thoughts!
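For what it's worth, a conversion along these lines is mostly reshaping. The sketch below is a rough starting point, not the repo's converter: the OpenPose field name `pose_keypoints_2d` (older versions use `pose_keypoints`) and the target per-frame layout `{"frame_index": ..., "skeleton": [{"pose": [...], "score": [...]}]}` are assumptions you should verify against your OpenPose version and the kinetics-skeleton sample files.

```python
def openpose_frame_to_stgcn(people, frame_index):
    """Convert one OpenPose frame (list of person dicts) into an
    st-gcn-style frame entry.

    Assumes OpenPose's flat [x0, y0, c0, x1, y1, c1, ...] keypoint layout.
    """
    skeletons = []
    for person in people:
        kp = person['pose_keypoints_2d']
        xs, ys, cs = kp[0::3], kp[1::3], kp[2::3]
        pose = [v for xy in zip(xs, ys) for v in xy]   # interleaved x, y
        skeletons.append({'pose': pose, 'score': cs})
    return {'frame_index': frame_index, 'skeleton': skeletons}

# toy usage: one person with two keypoints
frame = openpose_frame_to_stgcn(
    [{'pose_keypoints_2d': [10.0, 20.0, 0.9, 30.0, 40.0, 0.8]}],
    frame_index=1)
print(frame)
# → {'frame_index': 1, 'skeleton': [{'pose': [10.0, 20.0, 30.0, 40.0],
#    'score': [0.9, 0.8]}]}
```

Depending on how the training data was produced, you may also need to rescale the coordinates to match its range (the kinetics-skeleton coordinates do not appear to be raw pixels), so compare a few converted frames against the released samples.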

how to calc matrix A

I am a little confused about how to calculate A. Could you please describe it using a demo matrix?
I saw that A has shape (3, 18, 18) when testing on Kinetics, where dim 0 is an identity matrix and dim 1 is the adjacency matrix.
At the end of the second paragraph of Sect. 3.6 in the paper:
$ \Lambda^{-\frac{1}{2}}(A+I)\Lambda^{-\frac{1}{2}} $ on the second dimension.

If we now have the adjacency matrix of a 4-node graph, say [[0,1,1,1],[1,0,1,0],[1,1,0,0],[1,0,0,0]],
could you please give the resulting $\Lambda$ for both the uni and multi partitioning strategies?

Thanks
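For the uni-labeling case, the normalization on this 4-node example can be computed directly (my own worked example following the formula above, not code from the repo):

```python
import numpy as np

A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [1, 1, 0, 0],
              [1, 0, 0, 0]], dtype=float)

A_tilde = A + np.eye(4)                 # add self-loops: A + I
deg = A_tilde.sum(axis=1)               # Lambda diagonal: degrees of A + I
Lam_inv_sqrt = np.diag(deg ** -0.5)
A_norm = Lam_inv_sqrt @ A_tilde @ Lam_inv_sqrt

print(deg)                              # → [4. 3. 3. 2.]
print(np.round(A_norm, 3))
```

So $\Lambda = \mathrm{diag}(4, 3, 3, 2)$ here, and e.g. the (0,0) entry of the normalized matrix is $1/4 = 0.25$. As I understand the paper, under the multi-subset partitioning strategies $A + I$ is first split into per-subset matrices and each subset is normalized separately, which is what produces the (3, V, V) tensor you observed; the authors can confirm the exact per-subset $\Lambda$.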

What should I do if I want to evaluate ST-GCN on my own dataset?

Hi!
I want to evaluate this model on my own multi-view skeleton dataset without changing the original code too much. I have skimmed through the code and I think I should modify the Feeder class, but I really don't know where to begin. Could you give me some suggestions? Thanks.
BTW, the shape of the data should be (N, C, T, V, M), where T is the length of the sequence. However, the lengths of the video clips are not equal, so do you do some sampling?
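As a starting point, a minimal Feeder-style loader that yields (C, T, V, M) samples and handles unequal clip lengths by repetition padding or cropping might look like the sketch below. The class and argument names are illustrative, not the repo's exact API; the real Feeder subclasses `torch.utils.data.Dataset` and loads its data from files, but the indexing logic is the part to adapt.

```python
import numpy as np

class CustomFeeder:
    """Minimal Feeder-style loader sketch for (N, C, T, V, M) skeleton data."""

    def __init__(self, data, labels, window_size=300):
        self.data = data              # array of shape (N, C, T, V, M)
        self.labels = labels
        self.window_size = window_size

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        x = self.data[index]          # (C, T, V, M)
        t = x.shape[1]
        if t < self.window_size:
            # pad short sequences by repeating them along time
            reps = -(-self.window_size // t)        # ceiling division
            x = np.tile(x, (1, reps, 1, 1))[:, :self.window_size]
        else:
            # crop long sequences (random cropping would also be reasonable)
            x = x[:, :self.window_size]
        return x, self.labels[index]
```

With this layout, a multi-view dataset can either treat each view as a separate sample or fold views into the M (person) axis; either way, `__getitem__` is the single place where your raw skeletons are mapped into the (C, T, V, M) tensor the model expects.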
