
alphaction's Introduction

AlphAction

AlphAction aims to detect the actions of multiple persons in videos. It is the first open-source project that achieves 30+ mAP (32.4 mAP) with a single model on the AVA dataset.

This project is the official implementation of paper Asynchronous Interaction Aggregation for Action Detection (ECCV 2020), authored by Jiajun Tang*, Jin Xia* (equal contribution), Xinzhi Mu, Bo Pang, Cewu Lu (corresponding author).



Demo Video

AlphAction demo video [YouTube] [BiliBili]

Installation

You first need to install this project; please check INSTALL.md for instructions.

Data Preparation

To do training or inference on the AVA dataset, please check DATA.md for data preparation instructions. If you have difficulty accessing Google Drive, you can instead find most files (including models) on Baidu NetDisk ([link], code: smti).

Model Zoo

Please see MODEL_ZOO.md for downloading models.

Training and Inference

To do training or inference with AlphAction, please refer to GETTING_STARTED.md.

Demo Program

To run the demo program on a video or webcam, please check the demo folder. We select 15 common categories from the 80 action categories of AVA and provide a practical model which achieves high accuracy (about 70 mAP) on these categories.
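
For example, a typical invocation (the input/output paths here are placeholders; the flags mirror the commands users post in the issues further down this page) looks like:

python demo.py --video-path input.mp4 --output-path output.mp4 --cfg-path ../config_files/resnet101_8x8f_denseserial.yaml --weight-path ../data/models/aia_models/resnet101_8x8f_denseserial.pth

Add --common-cate to use the 15-category model, or replace --video-path with --webcam to read from a camera instead of a video file.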

Acknowledgement

We thankfully acknowledge the computing resource support of Huawei Corporation for this project.

Citation

If this project helps you in your research or project, please cite this paper:

@inproceedings{tang2020asynchronous,
  title={Asynchronous Interaction Aggregation for Action Detection},
  author={Tang, Jiajun and Xia, Jin and Mu, Xinzhi and Pang, Bo and Lu, Cewu},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2020}
}

alphaction's People

Contributors

yelantf


alphaction's Issues

ImportError: libtorch_cpu.so

Thanks for sharing!
I installed as prompted by the installation script.
But when I run demo.py, I got the following error:

import AlphAction.custom_ext as _C
ImportError: libtorch_cpu.so: cannot open shared object file: No such file or directory

Can you help me?

Training code

Great work! May I ask when the authors plan to open-source the training code?

customize dataset

Hello, I have a few questions.

  1. I'd like to know why we don't use 'person_bbox' during training.

  2. I don't know what "# disable box_file when train, use only gt to train" means. Isn't the box the ground truth?

  3. I want to customize the dataset. My action instances usually last about 3 seconds with varying lengths, but the AVA clips are one second long. This is also the first time I have come across the concept of keyframes: why should I take the first frame as the keyframe, and can it represent the action classification of the whole clip? I am very confused. Can you give me the general idea?

Looking forward to your reply.

Fine-Tuning

Hi @Fang-Haoshu ,

Thank you for sharing your amazing work. I wanted to ask whether you will be making fine-tuning code available soon? It would be really helpful for my project as well! Thanks

Question about the config files

Hi, thanks again for sharing the great work. I'm running the training code and found that the schedule in the config files is slightly different from the one on the GETTING_STARTED page (https://github.com/MVIG-SJTU/AlphAction/blob/master/GETTING_STARTED.md#training). The base learning rate and training iterations are not consistent (even after adjusting the schedule according to the linear scaling rule). I'm just wondering which version is correct for reproducing the results with the current codebase? Thanks!
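
For readers unfamiliar with the linear scaling rule mentioned above: the usual convention is that when the total batch size is divided by a factor k, the base learning rate is also divided by k and the step/iteration counts are multiplied by k. The sketch below is only a generic illustration (the reference schedule of batch size 16 / LR 0.001 is hypothetical, not the project's official numbers); the rescaled values happen to match the example training command quoted in a later issue on this page.

# Generic illustration of the linear scaling rule; the reference numbers are
# hypothetical, not the official AlphAction schedule.
def rescale_schedule(base_lr, steps, max_iter, ref_batch, new_batch):
    """Scale the LR linearly with batch size and stretch the schedule inversely."""
    k = new_batch / ref_batch
    return base_lr * k, tuple(int(s / k) for s in steps), int(max_iter / k)

lr, steps, max_iter = rescale_schedule(
    base_lr=0.001, steps=(70000, 90000), max_iter=110000,
    ref_batch=16, new_batch=2)
print(lr, steps, max_iter)  # -> 0.000125 (560000, 720000) 880000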

install setup.py ERROR

Hi! Thank you for the project. When I install with setup.py I get an error, shown below. I hope you can help me.

g++: error: /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/home/zhangzhenbo/AlphAction/alphaction/csrc/cuda/ROIAlign3d_cuda.o: No such file or directory
error: command 'g++' failed with exit status 1
(alphaction) zhangzhenbo@zhangzhenbo-TUF-Gaming-FX505GM-FX86FM:~/AlphAction$ g++ --version
g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

(alphaction) zhangzhenbo@zhangzhenbo-TUF-Gaming-FX505GM-FX86FM:~/AlphAction$ python setup.py install
running install
running bdist_egg
running egg_info
writing alphaction.egg-info/PKG-INFO
writing dependency_links to alphaction.egg-info/dependency_links.txt
writing requirements to alphaction.egg-info/requires.txt
writing top-level names to alphaction.egg-info/top_level.txt
reading manifest file 'alphaction.egg-info/SOURCES.txt'
writing manifest file 'alphaction.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'alphaction._custom_cuda_ext' extension
Emitting ninja build file /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
1.10.1
g++ -pthread -shared -B /home/zhangzhenbo/anaconda3/envs/alphaction/compiler_compat -L/home/zhangzhenbo/anaconda3/envs/alphaction/lib -Wl,-rpath=/home/zhangzhenbo/anaconda3/envs/alphaction/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/home/zhangzhenbo/AlphAction/alphaction/csrc/vision.o /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/home/zhangzhenbo/AlphAction/alphaction/csrc/cuda/SoftmaxFocalLoss_cuda.o /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/home/zhangzhenbo/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.o /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/home/zhangzhenbo/AlphAction/alphaction/csrc/cuda/ROIPool3d_cuda.o /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/home/zhangzhenbo/AlphAction/alphaction/csrc/cuda/ROIAlign3d_cuda.o -L/home/zhangzhenbo/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/lib -L/usr/local/cuda-10.1/lib64 -lc10 -ltorch -ltorch_cpu -ltorch_python -lcudart -lc10_cuda -ltorch_cuda -o build/lib.linux-x86_64-3.7/alphaction/_custom_cuda_ext.cpython-37m-x86_64-linux-gnu.so
g++: error: /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/home/zhangzhenbo/AlphAction/alphaction/csrc/cuda/SoftmaxFocalLoss_cuda.o: No such file or directory
g++: error: /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/home/zhangzhenbo/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.o: No such file or directory
g++: error: /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/home/zhangzhenbo/AlphAction/alphaction/csrc/cuda/ROIPool3d_cuda.o: No such file or directory
g++: error: /home/zhangzhenbo/AlphAction/build/temp.linux-x86_64-3.7/home/zhangzhenbo/AlphAction/alphaction/csrc/cuda/ROIAlign3d_cuda.o: No such file or directory
error: command 'g++' failed with exit status 1

How to track target people?

Hi,
Thank you for providing this very nice method.

I have one question about AIA: how do you associate a person in one frame with the same person in the next frame?
This method seems to learn temporal information about each person in the video. If you can, please tell me how people are tracked and where this is implemented.

/SigmoidFocalLoss_cuda.cu error

When I run pip install -e ., I get an error. How can I solve it?

/home/action_detection/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(132): error: a pointer to a bound function may only be used to call the function

/home/action_detection/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(132): error: type name is not allowed

/home/action_detection/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(132): error: expected an expression

/home/action_detection/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(178): error: a pointer to a bound function may only be used to call the function

/home/action_detection/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(178): error: type name is not allowed

/home/action_detection/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(178): error: expected an expression

42 errors detected in the compilation of "/tmp/tmpxft_000017b7_00000000-6_SigmoidFocalLoss_cuda.cpp1.ii".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1

Problem in training with custom dataset

I'm trying to train the AlphAction model with a custom dataset.
To train, I'm running the following code:

python train_net.py --config-file "path/to/config/file.yaml" \
    --transfer --no-head --use-tfboard \
    SOLVER.BASE_LR 0.000125 \
    SOLVER.STEPS '(560000, 720000)' \
    SOLVER.MAX_ITER 880000 \
    SOLVER.VIDEOS_PER_BATCH 2 \
    TEST.VIDEOS_PER_BATCH 2

I'm getting the error:

loading annotations into memory...
Done (t=0.00s)
Loading box file into memory...
Done (t=0.00s)
loading annotations into memory...
Done (t=0.00s)
Loading box file into memory...
Done (t=0.00s)
Loading box file into memory...
Done (t=0.00s)
2020-11-04 19:54:09,398 alphaction.trainer INFO: Start training
Traceback (most recent call last):
File "./AlphAction/train_net.py", line 245, in
main()
File "./AlphAction/train_net.py", line 234, in main
model = train(cfg, args.local_rank, args.distributed, tblogger, args.transfer_weight, args.adjust_lr, args.skip_val,
File "./AlphAction/train_net.py", line 84, in train
do_train(
File "./AlphAction/alphaction/engine/trainer.py", line 40, in do_train
for iteration, (slow_video, fast_video, boxes, objects, extras, _) in enumerate(data_loader, start_iter):
File "./AlphAction/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in next
data = self._next_data()
File "./AlphAction/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "./AlphAction/venv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "./AlphAction/venv/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
File "av/utils.pyx", line 27, in av.utils.AVError.init
TypeError: init() takes at least 3 positional arguments (2 given)

I've made ~130 action records in my dataset (I know that's not many); I have annotated custom videos and gone through all the steps described in DATA.md.

I've got the following dataset directory structure, which is pretty much the same as AVA, so that I don't have to change the code written for AVA.

data/AVA
├── annotations
│   ├── ava_action_list_v2.2_for_activitynet_2019.pbtxt
│   ├── ava_action_list_v2.2.pbtxt
│   ├── ava_file_names_trainval_v2.1.txt
│   ├── ava_include_timestamps_v2.2.txt
│   ├── ava_train_excluded_timestamps_v2.2.csv
│   ├── ava_train_v2.2.csv
│   ├── ava_train_v2.2.json
│   ├── ava_train_v2.2_min.json
│   ├── ava_val_excluded_timestamps_v2.2.csv
│   ├── ava_val_v2.2.csv
│   ├── ava_val_v2.2.json
│   └── ava_val_v2.2_min.json
├── boxes
│   ├── ava_train_det_object_bbox.json
│   ├── ava_val_det_object_bbox.json
│   └── ava_val_det_person_bbox.json
├── clips
│   └── trainval_old
│   ├── conv_1-1-56-576 [46 entries exceeds filelimit, not opening dir]
│   ├── conv_cam1_2020-10-24_12-23-17 [108 entries exceeds filelimit, not opening dir]
│   ├── conv_cam1_2020-10-25_15-34-08 [92 entries exceeds filelimit, not opening dir]
│   ├── conv_cam1_2020-10-25_15-37-09 [88 entries exceeds filelimit, not opening dir]
│   ├── conv_cam1_2020-10-25_15-39-56 [91 entries exceeds filelimit, not opening dir]
│   └── conv_cam1_2020-10-25_15-41-48 [124 entries exceeds filelimit, not opening dir]
├── keyframes
│   └── trainval
│   ├── conv_1-1-56-576 [46 entries exceeds filelimit, not opening dir]
│   ├── conv_cam1_2020-10-24_12-23-17 [108 entries exceeds filelimit, not opening dir]
│   ├── conv_cam1_2020-10-25_15-34-08 [92 entries exceeds filelimit, not opening dir]
│   ├── conv_cam1_2020-10-25_15-37-09 [88 entries exceeds filelimit, not opening dir]
│   ├── conv_cam1_2020-10-25_15-39-56 [91 entries exceeds filelimit, not opening dir]
│   └── conv_cam1_2020-10-25_15-41-48 [124 entries exceeds filelimit, not opening dir]
└── movies
└── trainval
├── conv_1-1-56-576.mp4
├── conv_cam1_2020-10-24_12-23-17.mp4
├── conv_cam1_2020-10-25_15-34-08.mp4
├── conv_cam1_2020-10-25_15-37-09.mp4
├── conv_cam1_2020-10-25_15-39-56.mp4
└── conv_cam1_2020-10-25_15-41-48.mp4

Thanks in advance for your reply.
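
The AVError raised inside the data loader above suggests that PyAV failed to decode one of the clips. A quick sanity check (a rough sketch; the clip directory below is a placeholder for your own layout) is to try opening every clip with PyAV directly and list the files that fail:

# Hypothetical sanity check: try to decode the first frame of every clip with PyAV
# (the same library the data loader uses) and report the files that fail.
import glob
import av

bad_files = []
for path in sorted(glob.glob("data/AVA/clips/trainval/*/*")):
    try:
        container = av.open(path)
        next(container.decode(video=0))  # decode the first video frame
        container.close()
    except Exception as exc:
        bad_files.append((path, repr(exc)))

for path, err in bad_files:
    print(path, err)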

Question about the mem_active flag in the memory pool

Thank you very much for open-sourcing this work. I have two questions about the code and hope to hear back from you.
Problem description: in trainer.py I found that if mem_active is False, the memory update is never performed. When I set the IA_STRUCTURE parameter to True in the cfg file, I get the following error:
Traceback (most recent call last):
File "train_net.py", line 246, in
main()
File "train_net.py", line 235, in main
model = train(cfg, args.local_rank, args.distributed, tblogger, args.transfer_weight, args.adjust_lr, args.skip_val,args.no_head)
File "train_net.py", line 58, in train
extra_checkpoint_data = checkpointer.load(cfg.MODEL.WEIGHT, model_weight_only=transfer_weight,adjust_scheduler=adjust_lr, no_head=no_head)
File "/home/mmsys8/disk/CH/Alphaction/AlphAction-master/alphaction/utils/checkpoint.py", line 61, in load
self._load_model(checkpoint, no_head)
File "/home/mmsys8/disk/CH/Alphaction/AlphAction-master/alphaction/utils/checkpoint.py", line 110, in _load_model
load_state_dict(self.model, checkpoint.pop("model"), no_head)
File "/home/mmsys8/disk/CH/Alphaction/AlphAction-master/alphaction/utils/model_serialization.py", line 83, in load_state_dict
model.load_state_dict(model_state_dict)
File "/home/mmsys8/anaconda3/envs/alphaction/lib/python3.5/site-packages/torch/nn/modules/module.py", line 830, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ActionDetector:
size mismatch for roi_heads.action.feature_extractor.fc1.weight: copying a param with shape torch.Size([1024, 2304]) from checkpoint, the shape in current model is torch.Size([1024, 2816]).
Questions:
1. If the mem_active flag is False, does that mean the AMU algorithm from the paper cannot be used?
2. How can I make sure the AMU algorithm is actually used? How should the parameters be set so that the above error is fixed?

Thanks again for your work; I look forward to your reply.
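
The size mismatch above presumably happens because enabling IA_STRUCTURE changes the input size of roi_heads.action.feature_extractor.fc1 (2816 vs. 2304), so the released checkpoint no longer matches the model. I am not sure how the official checkpointer is meant to handle this, but a generic PyTorch workaround, sketched below with a hypothetical helper, is to drop the mismatched entries and load the rest with strict=False:

# Generic PyTorch sketch (not the project's own checkpoint loader): load a
# checkpoint while skipping parameters whose shapes no longer match the model.
import torch

def load_partial_state_dict(model, ckpt_path):
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    state_dict = checkpoint.get("model", checkpoint)
    model_state = model.state_dict()
    filtered = {k: v for k, v in state_dict.items()
                if k in model_state and v.shape == model_state[k].shape}
    missing, unexpected = model.load_state_dict(filtered, strict=False)
    print("skipped (shape mismatch or unknown):", sorted(set(state_dict) - set(filtered)))
    print("left at random init:", missing)

Layers skipped this way (such as fc1 here) would start from random initialization, so some fine-tuning would still be needed.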

Installation error

python: 3.7.1
ninja: 1.7.2
pytorch: 1.5.0
cudatoolkit: 10.2.89
torchvision: 0.6.0

When running 'pip install -e .', I get an error.

Requirement already satisfied: PyYAML in c:\programdata\anaconda3\envs\py37action\lib\site-packages (from yacs->alphaction==0.0.0) (5.3.1)
Installing collected packages: alphaction
Running setup.py develop for alphaction
ERROR: Command errored out with exit status 1:
command: 'C:\ProgramData\Anaconda3\envs\py37action\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'D:\IMAGEprocess\program\AlphAction-master\setup.py'"'"'; file='"'"'D:\IMAGEprocess\program\AlphAction-master\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' develop --no-deps
cwd: D:\IMAGEprocess\program\AlphAction-master\

39 errors detected in the compilation of "C:/Users/LINGJU~1/AppData/Local/Temp/tmpxft_0000395c_00000000-10_SigmoidFocalLoss_cuda.cpp1.ii".
SigmoidFocalLoss_cuda.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\torch\utils\cpp_extension.py", line 1400, in _run_ninja_build
check=True)
File "C:\ProgramData\Anaconda3\envs\py37action\lib\subprocess.py", line 481, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "D:\IMAGEprocess\program\AlphAction-master\setup.py", line 121, in <module>
    cmdclass={"build_ext": torch.utils.cpp_extension.BuildExtension},
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\setuptools\__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\distutils\core.py", line 148, in setup
    dist.run_commands()
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\distutils\dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\setuptools\command\develop.py", line 34, in run
    self.install_for_development()
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\setuptools\command\develop.py", line 136, in install_for_development
    self.run_command('build_ext')
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\setuptools\command\build_ext.py", line 79, in run
    _build_ext.run(self)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\Cython\Distutils\old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\distutils\command\build_ext.py", line 339, in run
    self.build_extensions()
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\torch\utils\cpp_extension.py", line 580, in build_extensions
    build_ext.build_extensions(self)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\Cython\Distutils\old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\distutils\command\build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\distutils\command\build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\setuptools\command\build_ext.py", line 196, in build_extension
    _build_ext.build_extension(self, ext)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\distutils\command\build_ext.py", line 533, in build_extension
    depends=ext.depends)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\torch\utils\cpp_extension.py", line 562, in win_wrap_ninja_compile
    with_cuda=with_cuda)
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\torch\utils\cpp_extension.py", line 1140, in _write_ninja_file_and_compile_objects
    error_prefix='Error compiling objects for extension')
  File "C:\ProgramData\Anaconda3\envs\py37action\lib\site-packages\torch\utils\cpp_extension.py", line 1413, in _run_ninja_build
    raise RuntimeError(message)
RuntimeError: Error compiling objects for extension

failed to Install

The following is the error message when I type python setup.py build develop.
detector/nms/src/nms_cuda.cpp:4:80: error: ‘AT_CHECK’ was not declared in this scope
#define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x, " must be a CUDAtensor ")
^
detector/nms/src/nms_cuda.cpp:9:3: note: in expansion of macro ‘CHECK_CUDA’
CHECK_CUDA(dets);
^
error: command 'gcc' failed with exit status 1

Problem: Excessive graphics card memory usage

My operating environment: RTX 3090, CUDA 11.0, PyTorch 1.7.1.
When I run AlphAction, my graphics card uses up to 8 GB of memory.
Is it possible to modify resnet101_8x8f_denseserial.yaml to reduce this?
If so, which configuration items would have an effect?

Good morning, author! When I use the webcam it runs for a moment and then throws an error

(alphaction) zhangzhenbo@zhangzhenbo-TUF-Gaming-FX505GM-FX86FM:~/AlphAction/demo$ python demo.py --webcam --output-path ../output/Joe_Biden.avi --cfg-path ../config_files/resnet101_8x8f_denseserial.yaml --weight-path ../data/models/aia_models/resnet101_8x8f_denseserial.pth
Starting webcam demo, press Ctrl + C to terminate...
Loading action model weight from ../data/models/aia_models/resnet101_8x8f_denseserial.pth.
Action model weight successfully loaded.
Loading YOLO model..
Network successfully loaded
Loading tracking model..
Network successfully loaded
Showing tracking progress bar (in fps). Other processes are running in the background.
Tracker Progress: 148 frame [00:27, 5.16 frame/s]Process Process-2:
Tracker Progress: 149 frame [00:27, 5.55 frame/s]Traceback (most recent call last):
File "/home/zhangzhenbo/anaconda3/envs/alphaction/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/zhangzhenbo/anaconda3/envs/alphaction/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/zhangzhenbo/AlphAction/demo/action_predictor.py", line 427, in _compute_prediction
transform_randoms)
File "/home/zhangzhenbo/AlphAction/demo/action_predictor.py", line 118, in update_feature
assert timestamp > self.mem_timestamps[-1], "features are expected to be updated in order."
AssertionError: features are expected to be updated in order.
Tracker Progress: 636 frame [01:35, 7.16 frame/s]

Does this mean I can't run with the webcam for a long time?

Re-train with set of different labels

Thanks so much for sharing.

Would it be possible, using the training procedure along with custom videos and custom annotations, to re-train the model to detect different actions like shooting, running, and tackling?

Cheers

How to install AlphAction without cuda?

I want to use AlphAction on my MacBook Pro, and I get errors when running the command 'pip install -e .'.
The error is as follows:

Obtaining file:///Users/u/project/AlphAction
    ERROR: Complete output from command python setup.py egg_info:
    ERROR: Compiling detector/nms/src/soft_nms_cpu.pyx because it changed.
    [1/1] Cythonizing detector/nms/src/soft_nms_cpu.pyx
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/Users/u/project/AlphAction/setup.py", line 107, in <module>
        ext_modules=get_extensions(),
      File "/Users/u/project/AlphAction/setup.py", line 92, in get_extensions
        sources=['src/nms_cpu.cpp']),
      File "/Users/u/project/AlphAction/setup.py", line 42, in make_cuda_ext
        '-D__CUDA_NO_HALF2_OPERATORS__',
      File "/Users/u/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 779, in CUDAExtension
        library_dirs += library_paths(cuda=True)
      File "/Users/u/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 869, in library_paths
        if (not os.path.exists(_join_cuda_home(lib_dir)) and
      File "/Users/u/miniconda3/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1783, in _join_cuda_home
        raise EnvironmentError('CUDA_HOME environment variable is not set. '
    OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
    ----------------------------------------
ERROR: Command "python setup.py egg_info" failed with error code 1 in /Users/u/project/AlphAction/

Too slow on test Stage 2

Thank you for sharing!

I tried to run testing under the 'serial' configuration, but it seems quite slow at stage 2: over ten minutes for each batch. I'm confused about this.

I use 2 GPUs and set VIDEOS_PER_BATCH to 16.

I wonder whether the VIDEOS_PER_BATCH value is too big for my setup? Does anything else influence the inference time?

Video inference is not satisfactory

Thanks, this is a very good project. I used demo.py on a single GPU to run inference on a video (output at output.mp4). However, the person detection boxes are very jittery and inaccurate in many places. How can I modify the code to get results like your publicity video?

No GPU?

Can this project run on low-compute devices with no GPU? I want to run this on a Raspberry Pi 4B; is it possible? What are the minimum system requirements and expected performance?

A question about 'ava_train_det_person_bbox.json'

Hello author, I'm trying to train with a custom dataset.
I built my dataset with the same directory structure as the AVA dataset. But as a rookie I am confused about whether I need to replace 'ava_train_det_person_bbox.json'.
Thanks in advance for your reply.

What are the selected common categories?

Great job. For the demo, may I know what the selected common categories are? Thanks a lot.
" We select 15 common categories from the 80 action categories of AVA, and provide a practical model which achieves high accuracy (about 70 mAP) on these categories."

When I test the model, I got this error.

Traceback (most recent call last):
File "test_net.py", line 8, in
from alphaction.modeling.detector import build_detection_model
File "/home/leo/AlphAction/alphaction/modeling/detector/init.py", line 1, in
from .action_detector import build_detection_model
File "/home/leo/AlphAction/alphaction/modeling/detector/action_detector.py", line 3, in
from ..backbone import build_backbone
File "/home/leo/AlphAction/alphaction/modeling/backbone/init.py", line 1, in
from .backbone import build_backbone
File "/home/leo/AlphAction/alphaction/modeling/backbone/backbone.py", line 2, in
from . import slowfast, i3d
File "/home/leo/AlphAction/alphaction/modeling/backbone/slowfast.py", line 6, in
from alphaction.modeling.common_blocks import ResNLBlock
File "/home/leo/AlphAction/alphaction/modeling/common_blocks.py", line 2, in
from alphaction.modeling.nonlocal_block import NLBlock
File "/home/leo/AlphAction/alphaction/modeling/nonlocal_block.py", line 6, in
from alphaction.layers import FrozenBatchNorm3d
File "/home/leo/AlphAction/alphaction/layers/init.py", line 3, in
from .roi_align_3d import ROIAlign3d
File "/home/leo/AlphAction/alphaction/layers/roi_align_3d.py", line 7, in
import alphaction._custom_cuda_ext as _C
ModuleNotFoundError: No module named 'alphaction._custom_cuda_ext'
Traceback (most recent call last):
File "/home/leo/anaconda3/envs/mmaction2/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"main", mod_spec)
File "/home/leo/anaconda3/envs/mmaction2/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/leo/anaconda3/envs/mmaction2/lib/python3.6/site-packages/torch/distributed/launch.py", line 263, in
main()
File "/home/leo/anaconda3/envs/mmaction2/lib/python3.6/site-packages/torch/distributed/launch.py", line 259, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/leo/anaconda3/envs/mmaction2/bin/python', '-u', 'test_net.py', '--local_rank=0', '--config-file', 'config_files/resnet50_4x16f_baseline.yaml', 'MODEL.WEIGHT', 'Models/resnet50_4x16f_baseline.pth']' returned non-zero exit status 1.

real time test on video

Hi, the video processing time depends not only on the "detect_rate" parameter but also on the video size and the number of persons detected in the video. Would you please share the video you tested with me? Thanks a lot.

Failed to run demo.py after loading the tracking model network

Thank you for your great work! I succeeded in downloading and installing it; however, after starting the command to run demo.py:
python demo.py --video-path test_F.mp4 --output-path ../data/output/output.mp4 --cfg-path ../data/config/resnet101_8x8f_denseserial.yaml --weight-path ../data/models/common_15cat_res101.pth --common-cate

The program fails to run further and the terminal output is shown below:
Starting video demo, video path: test_F.mp4
Loading YOLO model..
Network successfully loaded
Loading tracking model..
0it [00:00, ?it/s]Network successfully loaded
2331it [09:27, 3.20it/s]Traceback (most recent call last):
File "demo.py", line 173, in
main()
File "demo.py", line 136, in main
(orig_img, boxes, scores, ids) = ava_predictor_worker.read_track()
File "/home/user/AlphAction/demo/action_predictor.py", line 260, in read_track
return self.track_queue.get()
File "/home/user/anaconda3/envs/action_det/lib/python3.7/multiprocessing/queues.py", line 113, in get
return _ForkingPickler.loads(res)
File "/home/user/anaconda3/envs/action_det/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 294, in rebuild_storage_fd
fd = df.detach()
File "/home/user/anaconda3/envs/action_det/lib/python3.7/multiprocessing/resource_sharer.py", line 58, in detach
return reduction.recv_handle(conn)
File "/home/user/anaconda3/envs/action_det/lib/python3.7/multiprocessing/reduction.py", line 185, in recv_handle
return recvfds(s, 1)[0]
File "/home/user/anaconda3/envs/action_det/lib/python3.7/multiprocessing/reduction.py", line 161, in recvfds
len(ancdata))
RuntimeError: received 0 items of ancdata
2331it [09:28, 4.10it/s]

Can you give me some direction on how to solve this? Many thanks!
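
'RuntimeError: received 0 items of ancdata' is a generic PyTorch multiprocessing symptom, usually hit when the process runs out of open file descriptors for tensors shared between workers. Two standard mitigations (general PyTorch advice, not something specific to this repository) are raising the open-file limit (ulimit -n) or switching the tensor sharing strategy, e.g.:

# General PyTorch workaround (not specific to AlphAction): share tensors through
# the file system instead of file descriptors, avoiding fd exhaustion.
import torch.multiprocessing as mp

if __name__ == "__main__":
    mp.set_sharing_strategy("file_system")
    # ... start the demo / worker processes as usual ...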

A question about person_box and action_predictor

Wonderful job; as a researcher in the same field, I would like to express my appreciation to the authors.
I have one question about the "compute_prediction" part of "action_predictor.py". The inputs to this calculation are the most recent frames (something like self.frame_stack = self.frame_stack[-self.frame_buffer_numbers:]) and only the box of the center frame; is it assumed that the pedestrian has little displacement across the input frames? I wonder whether adding the exact box of each frame (which can be extracted from the tracking results) would make the final result better (for motions with a large range of movement, like hitting or fighting), or have I misunderstood the process?

how is the inference speed?

Hi, I'm running the demo and found that the speed is about 5 fps; is that correct? I'm running the demo on one RTX 2070 GPU. Thanks.

113it [00:19, 4.20it/s]

about feature extractor

Thank you for sharing! I want to get the person and object features after RoIAlign (and save them to disk), but I don't know how to do this. Should I set part_forward to 0? Can you give me some hints? Thanks!
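
I'm not certain what part_forward controls in this codebase, but a generic way to dump the output of any intermediate module (for example the RoIAlign / feature-extractor layer) to disk is a forward hook. A sketch with a hypothetical module path and forward signature:

# Generic PyTorch sketch: save a named submodule's output to disk via a forward hook.
import torch

def dump_module_output(model, module_name, inputs, out_path):
    """Run one forward pass and save the named submodule's output (assumed to be a tensor)."""
    captured = []
    module = dict(model.named_modules())[module_name]
    handle = module.register_forward_hook(
        lambda mod, inp, out: captured.append(out.detach().cpu()))
    with torch.no_grad():
        model(*inputs)
    handle.remove()
    torch.save(captured, out_path)

# usage (hypothetical module path and inputs):
# dump_module_output(model, "roi_heads.action.feature_extractor",
#                    (slow_video, fast_video, boxes, objects, extras), "roi_features.pt")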

training code

Thank you for sharing your amazing work. I would like to know if there is a specific release date for the training code? I've been looking forward to it for a long time.

result of AVA and visualizer seems wrong

Hi, I followed the debug steps in issue #8 and found that the results of the prediction and the visualizer do not seem correct:


Loading tracking model..
0it [00:00, ?it/s]Network successfully loaded
644it [00:58, 11.79it/s]Wait for feature preprocess
The input queue is empty. Start working on prediction
0%| | 0/20 [00:00<?, ?it/s]>> predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)

predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
0%| | 2/21673 [00:00<18:05, 19.96it/s]>> predictions:
BoxList(num_boxes=2, image_width=960, image_height=448, mode=xyxy)
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
25%|██████████████████████████████████████▊ | 5/20 [00:00<00:00, 40.95it/s]>> predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21673/21673 [00:00<00:00, 142009.99it/s]
End of video loader
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
predictions:
BoxList(num_boxes=2, image_width=960, image_height=448, mode=xyxy)
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
visualizer
predictions:
BoxList(num_boxes=2, image_width=960, image_height=448, mode=xyxy)
50%|█████████████████████████████████████████████████████████████████████████████ | 10/20 [00:00<00:00, 42.29it/s]{} None tensor([[1.],
[2.]])
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
visualizer
{} None tensor([[1.],
[2.]])
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
visualizer
{} None tensor([[1.],
[2.]])
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
75%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████▌ | 15/20 [00:00<00:00, 43.73it/s]visualizer
{} None tensor([[1.],
[2.]])
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
visualizer
{} None tensor([[1.],
[2.]])
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
visualizer
{} None tensor([[1.],
[2.]])
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
visualizer
{} None tensor([[1.],
[2.]])
predictions:
BoxList(num_boxes=1, image_width=960, image_height=448, mode=xyxy)
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 44.40it/s]
Prediction is done.
visualizer
Wait for writer process to finish...
{} None tensor([[1.],
[2.]])
visualizer | 11/645 [00:00<?, ?it/s]
{} None tensor([[1.],
[2.]])
visualizer
{} None tensor([[1.],
[2.]])
visualizer
{} None tensor([[1.],
[2.]])
visualizer | 15/645 [00:00<00:18, 33.53it/s]
{} None tensor([[1.],
[2.]])
visualizer
{} None tensor([[1.],
[2.]])
visualizer
{} None tensor([[1.],
[2.]])
visualizer
{} None tensor([[1.],
[2.]])
visualizer | 19/645 [00:00<00:18, 33.00it/s]
{} None tensor([[1.],
[2.]])
visualizer
{} None tensor([[1.],
[2.]])
visualizer
{} None tensor([[1.],
[2.]])
visualizer
{} None tensor([[1.],
[2.]])
visualizer▍ | 23/645 [00:00<00:18, 32.92it/s]
{} None tensor([[1.]])
visualizer
{} None tensor([[1.]])
visualizer
{} None tensor([[1.]])
visualizer
{} None tensor([[1.]])
visualizer█▍ | 27/645 [00:00<00:18, 32.63it/s]
{} None tensor([[1.]])
visualizer
{} None tensor([[1.]])
visualizer██▎ | 31/645 [00:00<00:19, 31.82it/s]
{} None tensor([[1.]])
visualizer
{1: {'captions': [], 'bg_colors': []}} tensor([[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan]]) tensor([1.])
visualizer███████▊ | 54/645 [00:01<00:16, 36.27it/s]
{1: {'captions': [], 'bg_colors': []}} tensor([[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan]]) tensor([1.])
visualizer
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer████████▊ | 58/645 [00:01<00:16, 35.29it/s]
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer██████████▉ | 67/645 [00:01<00:15, 37.14it/s]
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer███████████▊ | 71/645 [00:01<00:15, 37.45it/s]
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer████████████████ | 89/645 [00:02<00:13, 40.48it/s]
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer█████████████████▎ | 94/645 [00:02<00:13, 39.95it/s]
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer███████████████████████ | 119/645 [00:02<00:13, 39.85it/s]
{1: {'captions': [], 'bg_colors': []}} tensor([[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan]]) tensor([1.])
visualizer
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer██████████████████████████████ | 149/645 [00:03<00:11, 44.51it/s]
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer███████████████████████████████▎ | 154/645 [00:03<00:11, 41.80it/s]
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer████████████████████████████████▍ | 159/645 [00:03<00:11, 41.57it/s]
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer█████████████████████████████████▋ | 164/645 [00:03<00:11, 41.41it/s]
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.],
[3.]])
visualizer
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer
{1: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer
{1: {'captions': [], 'bg_colors': []}, 4: {'captions': [], 'bg_colors': []}} tensor([[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan],
[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan]]) tensor([1., 4.])
visualizer██████████████████████████████████▊ | 169/645 [00:04<00:11, 39.88it/s]
{1: {'captions': [], 'bg_colors': []}, 4: {'captions': [], 'bg_colors': []}} None tensor([[1.],
[4.]])
visualizer
{1: {'captions': [], 'bg_colors': []}, 4: {'captions': [], 'bg_colors': []}} None tensor([[1.],
[4.]])
visualizer
{1: {'captions': [], 'bg_colors': []}, 4: {'captions': [], 'bg_colors': []}} None tensor([[4.]])
visualizer
{1: {'captions': [], 'bg_colors': []}, 4: {'captions': [], 'bg_colors': []}} None tensor([[4.]])
visualizer████████████████████████████████████ | 174/645 [00:04<00:12, 39.15it/s]
{1: {'captions': [], 'bg_colors': []}, 4: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer
{1: {'captions': [], 'bg_colors': []}, 4: {'captions': [], 'bg_colors': []}} None tensor([[1.]])
visualizer█████████████████████████████████████████▉ | 199/645 [00:04<00:10, 44.47it/s]
{1: {'captions': [], 'bg_colors': []}, 4: {'captions': [], 'bg_colors': []}} tensor([[nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan, nan, nan, nan, nan, nan, nan, nan]]) tensor([1.])
...
...
...
visualizer███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉ | 615/645 [00:16<00:00, 36.55it/s]
{1: {'captions': [], 'bg_colors': []}, 4: {'captions': [], 'bg_colors': []}, 6: {'captions': [], 'bg_colors': []}, 8: {'captions': [], 'bg_colors': []}, 11: {'captions': [], 'bg_colors': []}} None tensor([[11.]])
visualizer
{1: {'captions': [], 'bg_colors': []}, 4: {'captions': [], 'bg_colors': []}, 6: {'captions': [], 'bg_colors': []}, 8: {'captions': [], 'bg_colors': []}, 11: {'captions': [], 'bg_colors': []}} None tensor([[11.]])
load frame closed████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 645/645 [00:16<00:00, 38.23it/s]
write frame closed
Avaworker stopped

You can download the source video I used from this link:

https://pan.baidu.com/s/17stN1qdQA4g36RhyG-_H1w code: p5wp


When I run it on a video it's not so good!!

Hi author, I ran this demo but it doesn't look as good as your demo!
1. I used resnet101_8x8f_denseserial.yaml and resnet101_8x8f_denseserial.pth.
2. When I used common_15cat_res101.pth it was even worse.

So is the reason that I have to pre-train it myself? I would also like to know the picture size used in your demo.

Segmentation fault (core dumped)

Hi, when I run the training command on a single GPU, it shows Segmentation fault (core dumped). I have modified the multiprocess setting to 0, but it still remains the same. Could you please help to address the problem?
And when I run the training command on multiple GPUs, it shows another fault:
subprocess.CalledProcessError: Command '***(the command)' died with <Signals.SIGSEGV: 11>.
My environment is:
Python 3.7.6
PyTorch 1.3.1 built for Cuda 10.0
Cuda runtime version 10.0.
Thanks.

Besides, when I run demo.py, it stays in the state 'Tracker Progress: 1004 frame [02:42, 6.14 frame/s]' for a long time; is that normal?

A problem when running demo.py

Hi,
Thanks a lot for your open-source code. When I run demo.py, I encounter a problem.

(alphaction) [zhuxt@localhost demo]$ python demo.py --video-path "/home/zhuxt/workspace/actlyzer-app/exec/videos/10_persons_v2.mp4" --output-path output.mp4 --cfg-path /home/zhuxt/workspace/AlphAction/config_files/resnet101_8x8f_denseserial.yaml --weight-path "/home/zhuxt/workspace/AlphAction-master/author_models/common_15cat_res101.pth" --common-cate
Starting video demo, video path: /home/zhuxt/workspace/actlyzer-app/exec/videos/10_persons_v2.mp4
Loading action model weight from /home/zhuxt/workspace/AlphAction-master/author_models/common_15cat_res101.pth.
Action model weight successfully loaded.
Loading YOLO model..
Network successfully loaded
Loading tracking model..
Network successfully loaded
Showing tracking progress bar (in fps). Other processes are running in the background.
Tracker Progress: 1624 frame [03:05, 7.75 frame/s]Process Process-4:
Traceback (most recent call last):
File "/home/zhuxt/anaconda3/envs/alphaction/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/home/zhuxt/anaconda3/envs/alphaction/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/home/zhuxt/workspace/AlphAction/demo/action_predictor.py", line 434, in _compute_prediction
center_timestamp, video_size, ids = self.timestamps[timestamp_idx]
IndexError: list index out of range

As you can see from the log, it runs fine on the earlier frames, but at the 1624th frame the index goes out of range.
I have tested with several videos, and the same issue arises.

Installation error

I initialized everything according to INSTALL.md. When running 'pip install -e .', there is an error.
Installing collected packages: alphaction
Attempting uninstall: alphaction
Found existing installation: alphaction 0.0.0
Can't uninstall 'alphaction'. No files were found to uninstall.
Running setup.py develop for alphaction
ERROR: Command errored out with exit status 1:
command: /home/xianjin/anaconda3/envs/alphaction/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/data/action_recognition/AlphAction/setup.py'"'"'; file='"'"'/data/action_recognition/AlphAction/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' develop --no-deps
cwd: /data/action_recognition/AlphAction/
Complete output (107 lines):
running develop
running egg_info
writing alphaction.egg-info/PKG-INFO
writing dependency_links to alphaction.egg-info/dependency_links.txt
writing requirements to alphaction.egg-info/requires.txt
writing top-level names to alphaction.egg-info/top_level.txt
reading manifest file 'alphaction.egg-info/SOURCES.txt'
writing manifest file 'alphaction.egg-info/SOURCES.txt'
running build_ext
building 'alphaction.custom_cuda_ext' extension
gcc -pthread -B /home/xianjin/anaconda3/envs/alphaction/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/data/action_recognition/AlphAction/alphaction/csrc -I/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include -I/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include/TH -I/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/xianjin/anaconda3/envs/alphaction/include/python3.7m -c /data/action_recognition/AlphAction/alphaction/csrc/vision.cpp -o build/temp.linux-x86_64-3.7/data/action_recognition/AlphAction/alphaction/csrc/vision.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=custom_cuda_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda/bin/nvcc -DWITH_CUDA -I/data/action_recognition/AlphAction/alphaction/csrc -I/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include -I/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include/TH -I/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/xianjin/anaconda3/envs/alphaction/include/python3.7m -c /data/action_recognition/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu -o build/temp.linux-x86_64-3.7/data/action_recognition/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.o -D__CUDA_NO_HALF_OPERATORS
-D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -O3 -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_custom_cuda_ext -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(83): warning: calling a constexpr __host__ function("from_bits") from a __host__ __device__ function("lowest") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(84): warning: calling a constexpr __host__ function("from_bits") from a __host__ __device__ function("max") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(85): warning: calling a constexpr __host__ function("from_bits") from a __host__ __device__ function("lower_bound") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/xianjin/anaconda3/envs/alphaction/lib/python3.7/site-packages/torch/include/ATen/cuda/NumericLimits.cuh(86): warning: calling a constexpr __host__ function("from_bits") from a __host__ __device__ function("upper_bound") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/data/action_recognition/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(128): error: a pointer to a bound function may only be used to call the function

/data/action_recognition/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(128): error: type name is not allowed

/data/action_recognition/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(128): error: expected an expression

/data/action_recognition/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(173): error: a pointer to a bound function may only be used to call the function

/data/action_recognition/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(173): error: type name is not allowed

/data/action_recognition/AlphAction/alphaction/csrc/cuda/SigmoidFocalLoss_cuda.cu(173): error: expected an expression

42 errors detected in the compilation of "/tmp/tmpxft_00004211_00000000-6_SigmoidFocalLoss_cuda.cpp1.ii".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1
----------------------------------------

ERROR: Can't roll back alphaction; was not uninstalled
ERROR: Command errored out with exit status 1: /home/xianjin/anaconda3/envs/alphaction/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/data/action_recognition/AlphAction/setup.py'"'"'; file='"'"'/data/action_recognition/AlphAction/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' develop --no-deps Check the logs for full command output.
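
Not a definitive diagnosis, but nvcc failures like the ones above often trace back to a mismatch between the CUDA toolkit that nvcc uses and the CUDA version PyTorch was built against (or an unsupported host compiler). A quick sanity check of the environment, assuming nothing about the specific cause here:

import torch

# Compare the CUDA version PyTorch was built with against the toolkit under
# /usr/local/cuda used by nvcc; a mismatch is a common cause of extension
# build failures.
print("torch version:      ", torch.__version__)
print("built against CUDA: ", torch.version.cuda)
print("cuDNN version:      ", torch.backends.cudnn.version())
print("GPU available:      ", torch.cuda.is_available())

Run nvcc --version separately and make sure the two CUDA versions agree before rebuilding the extension.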

cuda error?

"cuda runtime error(48):no kernel image is available for execution on the device at ./alphaction/csrc/cuda/ROIAlign3d_cuda.cu:300"
Please explain how to solve this error, thanks.
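
A hedged note rather than an official answer: "no kernel image is available for execution on the device" usually means the extension was compiled for GPU architectures that do not include the card it runs on. Checking the device's compute capability makes it easy to rebuild with a matching TORCH_CUDA_ARCH_LIST:

import torch

# Print the compute capability of every visible GPU.  The CUDA extension can
# then be rebuilt with a matching TORCH_CUDA_ARCH_LIST (for example "6.1" or
# "7.5") before running setup.py again.
for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    name = torch.cuda.get_device_name(i)
    print(f"GPU {i}: {name} -> compute capability {major}.{minor} (sm_{major}{minor})")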

How to get the pretrained model of ResNet-34

Thanks for your interesting work!
I want a lighter AlphAction model, so I would like to change the backbone from ResNet-50 to ResNet-34.
However, I found that training the AlphAction model from scratch is hard to converge.
So I think I need a pretrained ResNet-34 model; I have already downloaded the Kinetics-700 dataset.
Could you share some information on how to obtain the pretrained model on Kinetics-700?
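
Not the authors' procedure, but one common bootstrap when no 3D ResNet-34 checkpoint is at hand is to inflate 2D ImageNet ResNet-34 weights along the temporal axis (the I3D trick) and then pre-train on Kinetics-700 before AVA fine-tuning. A minimal sketch, assuming torchvision is installed; the resulting keys still use torchvision naming and would have to be remapped to this project's backbone parameters:

import torch
import torchvision

def inflate_resnet34(time_dim: int = 3) -> dict:
    """Inflate 2D ImageNet ResNet-34 weights into a 3D-conv state dict.

    Each (out, in, h, w) conv kernel is repeated `time_dim` times along a new
    temporal dimension and divided by `time_dim` so activations keep roughly
    the same scale.  Keys still follow torchvision naming and must be remapped
    to the 3D backbone's parameter names before loading.
    """
    # Newer torchvision uses the `weights=` argument instead of `pretrained=True`.
    state_2d = torchvision.models.resnet34(pretrained=True).state_dict()
    state_3d = {}
    for name, w in state_2d.items():
        if w.dim() == 4:  # 2D conv weight
            w = w.unsqueeze(2).repeat(1, 1, time_dim, 1, 1) / time_dim
        state_3d[name] = w
    return state_3d

if __name__ == "__main__":
    torch.save(inflate_resnet34(), "resnet34_inflated_imagenet.pth")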

Questions about the training schedule

Thanks for sharing the great work! I'm trying to reproduce the baseline result (ResNet-50) in PyTorch following the schedule you provided. However, I only get 21.4% mAP, much lower than the 26.5% reported in the paper. I have a few questions as follows.

  1. As the training schedule is reported as "iterations" in paper and the codebase, do you have any idea how many "epochs" it is roughly equivalent to? I used 10 epochs in my experiments.

  2. The learning rate used in this paper (0.004 for clip_size 64) is quite small compared with other papers (e.g., in LFB, 0.04 for clip_size 16). It seems the model is not sufficiently trained with this small learning rate after 10 epochs. I'm wondering whether I've misunderstood something here. I tried using base_lr=0.008 and got 23.2% mAP.

Again, thanks for your work and it'll be great if you could help me with this problem. My training schedule is summarized here: (Max Epochs: 10, Base_lr: 0.004, Batch_size: 64, Lr_decay: at 6 / 8 epochs)
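
For the first question, a back-of-the-envelope conversion between iterations and epochs only needs the training-set size and the batch size. The clip count below is an assumption to be replaced with the actual number of keyframes in the split, and the iteration totals are placeholders to be read from the config actually used:

# Rough iterations <-> epochs conversion for question 1.
train_clips = 211_000   # assumption: number of training keyframe clips; replace with your split size
batch_size = 64         # clips per iteration, as in the schedule above
iters_per_epoch = train_clips / batch_size

for total_iters in (90_000, 120_000):  # placeholder schedules; use the MAX_ITER value from your config
    print(f"{total_iters:>7d} iterations ~= {total_iters / iters_per_epoch:.1f} epochs")

# For question 2, the usual linear-scaling heuristic applies: if the batch
# size changes, scale the base learning rate proportionally,
# e.g. lr = 0.004 * (new_batch_size / 64).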

A Problem when running demo

Hi, I met a problem when trying to run demo.py; it seems that something is missing. Could you please provide a solution? Thanks.

import AlphAction.custom_ext as _C
ModuleNotFoundError: No module named 'AlphAction.custom_ext'
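
A hedged check rather than a fix: this error usually means the compiled extension was never built, or was built in a different environment. The module name below is copied from the traceback; if the import fails, re-running the build step from INSTALL.md (python setup.py build develop) in the active environment is the usual remedy:

import importlib

module_name = "AlphAction.custom_ext"  # name taken from the traceback above
try:
    ext = importlib.import_module(module_name)
    print(f"{module_name} found at {ext.__file__}")
except ModuleNotFoundError as exc:
    # The extension is missing: rebuild it with `python setup.py build develop`
    # inside the same conda environment used to run demo.py.
    print(f"Extension not installed: {exc}")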

pytorch 1.5+ support

This line of code caused an error:

error: ‘AT_CHECK’ was not declared in this scope

It should be updated to TORCH_CHECK.
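
A small sketch of that update, assuming the extension sources live under alphaction/csrc as in the build logs above; it rewrites the removed AT_CHECK macro to TORCH_CHECK across the C++/CUDA files:

from pathlib import Path

# Replace the AT_CHECK macro (removed in PyTorch 1.5) with TORCH_CHECK in all
# C++/CUDA sources of the extension.
csrc_dir = Path("alphaction/csrc")  # adjust if your checkout is laid out differently
for pattern in ("*.cu", "*.cuh", "*.cpp", "*.h"):
    for src in csrc_dir.rglob(pattern):
        text = src.read_text()
        if "AT_CHECK" in text:
            src.write_text(text.replace("AT_CHECK", "TORCH_CHECK"))
            print(f"patched {src}")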

multi-thread problem

Hello, I encountered a problem when I executed demo.py.

The error log is as follows:

Starting video demo, video path: ../kf001.mp4
after Initialise Visualizer @@
multiprocessing.set_start_method @@
torch.multiprocessing.set_sharing_strategy @@
count() @@ count(0)
Loading YOLO model..
yolo self.model_cfg ../detector/yolo/cfg/yolov3-spp.cfg
yolo self.model_weights ../data/models/detector_models/yolov3-spp.weights
self.model.net_info-height, 608
Network successfully loaded
args.gpus [0] <class 'list'>
args.device cuda <class 'torch.device'>

model_weight_url @@ ../data/models/aia_models/resnet101_8x8f_denseserial.pth
Loading tracking model..
after AVAPredictorWorker @@
0it [00:00, ?it/s]Network successfully loaded
644it [00:56, 11.33it/s]Wait for feature preprocess
The input queue is empty. Start working on prediction
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 21673/21673 [00:00<00:00, 145254.70it/s]
End of video loader
50%|█████████████████████████████████████████████████████████████████████████████ | 10/20 [00:00<00:00, 43.17it/s]/home/xa/miniconda3/lib/python3.6/multiprocessing/semaphore_tracker.py:143: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
len(cache))
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:00<00:00, 43.97it/s]
Prediction is done.
Wait for writer process to finish...

Exception in thread Thread-1: | 92/645 [00:02<00:16, 32.77it/s]
Traceback (most recent call last):
File "/home/xa/miniconda3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/home/xa/.local/lib/python3.6/site-packages/tqdm/_monitor.py", line 62, in run
for instance in self.tqdm_cls._instances:
File "/home/xa/miniconda3/lib/python3.6/_weakrefset.py", line 60, in iter
for itemref in self.data:
RuntimeError: Set changed size during iteration

100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 645/645 [00:18<00:00, 34.42it/s]
write frame closed
load frame closed
Avaworker stopped


Would you please tell me how to fix it? Thank you!
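
A hedged observation: the traceback points at tqdm's background monitor thread ("Set changed size during iteration"), not at the demo pipeline itself, and the output video is usually written despite it. Upgrading tqdm, or disabling the monitor thread before any progress bar is created, is the commonly suggested workaround:

from tqdm import tqdm

# Disable tqdm's monitor thread, which is what iterates over tqdm._instances
# and races with progress bars created in other processes/threads.
tqdm.monitor_interval = 0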
