
aten's Introduction

Adaptive Temporal Encoding Network for Video Instance-level Human Parsing

By Qixian Zhou, Xiaodan Liang, Ke Gong, and Liang Lin (ACM MM 2018). A video demo is available.

Requirements

Python 3, TensorFlow 1.3+, Keras 2.0.8+

Dataset

The model is trained and evaluated on our proposed VIP dataset for video instance-level human parsing; please see the dataset page for more details. The VIP dataset contains 404 video sequences: 304 for training, 50 for validation, and 50 for testing. For every 25 consecutive frames in each video, one frame is densely annotated with pixel-wise semantic part categories and instance-level identification. We release the source videos, the frames, and the fine annotations for the training and validation sets. You can evaluate your model on the validation set with our released evaluation code.
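As a rough illustration of that one-in-25 annotation rate, the sketch below picks one frame out of every 25 from an ordered frame directory. This is illustrative only: frame naming and ordering are assumptions, and the released list files should be treated as authoritative.

# Illustrative sketch: pick one candidate frame per 25 consecutive frames of a sequence.
# Assumes lexicographic file order matches temporal order; the released lists are authoritative.
import os

def annotated_candidates(sequence_dir, step=25):
    frames = sorted(os.listdir(sequence_dir))
    return frames[::step]

# Hypothetical usage:
# print(annotated_candidates('VIP/Images/videos1'))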

The VIP dataset is available on both OneDrive and Baidu drive.

The Baidu drive share link is:

link: https://pan.baidu.com/s/1rt9wmRf6o8HoBzj7EscyeQ

pwd: cpbt

Models

Models are released on OneDrive and Baidu drive:

  • Parsing-RCNN (frame-level) weights (parsing_rcnn.h5).

  • ATEN (p=2, l=3) weights (aten_p2l3.h5).

Installation

  1. Clone this repository
  2. Install the bundled Keras with ConvGRU2D support:
cd keras_convGRU
python setup.py install
  3. Compile the flow_warp ops (optional). A prebuilt flow_warp.so is included (generated on Ubuntu 14.04, gcc 4.8.4, Python 3.6, TF 1.4). To compile the flow_warp ops yourself, execute:
cd ops
make
  4. Dataset setup. Download the VIP dataset (both VIP_Fine and VIP_Sequence) and decompress it. The directory structure of VIP should be as follows:

VIP
----Images
--------videos1
--------...
--------videos404
----adjacent_frames
--------videos1
--------...
--------videos404
----front_frame_list
----Category_ids
----Human_ids
----Instance_ids
----lists
........

  5. Model setup. Download the released weights and place them in the models folder. (A small sanity-check sketch for steps 3-5 follows this list.)
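After completing the steps above, a quick sanity check can catch missing files early. This is a sketch only; the VIP root, the models folder, and the ops/flow_warp.so path are assumptions based on the layout described in this README.

# Sanity-check sketch (paths are assumptions based on the layout described above).
import os
import tensorflow as tf

VIP_ROOT = 'VIP'       # adjust to where the dataset was decompressed
MODEL_DIR = 'models'   # folder holding the released weights

for d in ['Images', 'adjacent_frames', 'front_frame_list',
          'Category_ids', 'Human_ids', 'Instance_ids', 'lists']:
    print(d, 'found' if os.path.isdir(os.path.join(VIP_ROOT, d)) else 'MISSING')

for w in ['parsing_rcnn.h5', 'aten_p2l3.h5']:
    print(w, 'found' if os.path.isfile(os.path.join(MODEL_DIR, w)) else 'MISSING')

# Optional: check that the compiled flow_warp op loads (path assumed from `cd ops && make`).
try:
    tf.load_op_library(os.path.join('ops', 'flow_warp.so'))
    print('flow_warp.so loaded')
except Exception as e:
    print('flow_warp.so not loaded:', e)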

Training

# ATEN training on VIP
python scripts/vip/train_aten.py

# Parsing-RCNN(frame-level) training on VIP
python scripts/vip/train_parsingrcnn.py

Inference

# ATEN inference on VIP
python scripts/vip/test_aten.py

# Parsing-RCNN(frame-level) inference on VIP
python scripts/vip/test_parsingrcnn.py

The results are stored in ./vis.
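A minimal way to inspect what was written (a sketch; it only assumes results land under ./vis as stated above):

# List inference outputs under ./vis (sketch only).
import os

for root, _, files in os.walk('./vis'):
    for name in sorted(files):
        print(os.path.join(root, name))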

Evaluate

  1. Modify the paths in evaluate/*.py.
  2. Run the commands below to evaluate the results generated by visualize.py (a minimal metric sketch follows after these commands).
# for human parsing
python evaluate/test_parsing.py

# for instance segmentation
python evaluate/test_ap.py

# for instance-level human parsing
python evaluate/test_inst_part_ap.py
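For reference, per-frame human parsing quality is typically reported as mean IoU over the part categories. The snippet below is a minimal, self-contained sketch of that metric, not the repository's evaluation code; use the scripts above for official numbers.

# Minimal mean-IoU sketch over predicted/ground-truth label maps (illustrative only).
import numpy as np

def mean_iou(pred, gt, num_classes):
    """pred, gt: integer label maps of identical shape; labels in [0, num_classes)."""
    ious = []
    for c in range(num_classes):
        p, g = (pred == c), (gt == c)
        union = np.logical_or(p, g).sum()
        if union == 0:          # category absent from both maps; skip it
            continue
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious)) if ious else 0.0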

Reference

@inproceedings{zhou2018,
    Author = {Qixian Zhou and Xiaodan Liang and Ke Gong and Liang Lin},
    Title = {Adaptive Temporal Encoding Network for Video Instance-level Human Parsing},
    Booktitle = {Proc. of ACM International Conference on Multimedia (ACM MM)},
    Year = {2018}
} 

Acknowledgements

This code builds on the following open-source projects on GitHub:

  1. matterport/Mask_RCNN(https://github.com/matterport/Mask_RCNN), an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow.
  2. KingMV/ConvGRU(https://github.com/KingMV/ConvGRU), an implementation of ConvGRU2D on Keras.


aten's Issues

VIP Dataset on Google Drive

Any chance you can upload the VIP dataset to Google Drive? I'm unable to download it from Baidu drive here in the states.

Thanks!

Dataset download link does not work

Dear Authors,
Thank you very much for your contribution. Your OneDrive link is not working, and I cannot download the dataset from the Baidu link either. Could you please share a working dataset link with me? Thanks in advance.

About FPS and dataset annotation toolboxes

Hi, thanks for sharing your work. I want to know whether this model can run prediction online, and also which annotation toolboxes you used for the human parsing dataset.

flow_warp.so make error

ERROR:

/ATEN/ops> make
Makefile:5: /usr/local/lib/python3.6/site-packages/tensorflow/include
nvcc -std=c++11 -c --expt-relaxed-constexpr --gpu-architecture=sm_61
-o ./build/flow_warp_gpu.o flow_warp/flow_warp.cu.cc
-I /usr/local/lib/python3.6/site-packages/tensorflow/include -I/usr/local/lib/python3.6/site-packages/tensorflow/include/external/nsync/public -I/usr/local/cuda/include -I/usr/local -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC
/usr/local/lib/python3.6/site-packages/tensorflow/include/absl/strings/string_view.h(501): error: constexpr function return is non-constant

/usr/local/lib/python3.6/site-packages/tensorflow/include/google/protobuf/arena_impl.h(55): warning: integer conversion resulted in a change of sign

/usr/local/lib/python3.6/site-packages/tensorflow/include/google/protobuf/arena_impl.h(309): warning: integer conversion resulted in a change of sign

/usr/local/lib/python3.6/site-packages/tensorflow/include/google/protobuf/arena_impl.h(310): warning: integer conversion resulted in a change of sign

/usr/local/lib/python3.6/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(651): warning: missing return statement at end of non-void function "Eigen::internal::igammac_cf_impl<Scalar, mode>::run [with Scalar=float, mode=Eigen::internal::VALUE]"
detected during:
instantiation of "Scalar Eigen::internal::igammac_cf_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=float, mode=Eigen::internal::VALUE]"
(855): here
instantiation of "Scalar Eigen::internal::igamma_generic_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=float, mode=Eigen::internal::VALUE]"
(2096): here
instantiation of "Eigen::internal::igamma_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::igamma(const Scalar &, const Scalar &) [with Scalar=float]"
/usr/local/lib/python3.6/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsHalf.h(34): here

/usr/local/lib/python3.6/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(712): warning: missing return statement at end of non-void function "Eigen::internal::igamma_series_impl<Scalar, mode>::run [with Scalar=float, mode=Eigen::internal::VALUE]"
detected during:
instantiation of "Scalar Eigen::internal::igamma_series_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=float, mode=Eigen::internal::VALUE]"
(863): here
instantiation of "Scalar Eigen::internal::igamma_generic_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=float, mode=Eigen::internal::VALUE]"
(2096): here
instantiation of "Eigen::internal::igamma_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::igamma(const Scalar &, const Scalar &) [with Scalar=float]"
/usr/local/lib/python3.6/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsHalf.h(34): here
......
instantiation of "Eigen::internal::gamma_sample_der_alpha_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::gamma_sample_der_alpha(const Scalar &, const Scalar &) [with Scalar=double]"
/usr/local/lib/python3.6/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/arch/CUDA/CudaSpecialFunctions.h(154): here

1 error detected in the compilation of "/tmp/tmpxft_00004c83_00000000-6_flow_warp.cu.cpp1.ii".
make: *** [flow_warp_gpu.o] Error 1


System information:

Python 3.6
TensorFlow 1.13
Keras 2.2.4
CUDA 9.2
cuDNN 9.2
nvcc -V:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148

Attempts to solve the problem:

1. In the Makefile, changed --gpu-architecture=sm_52 to --gpu-architecture=sm_61; this did not solve it.
2. Following tensorflow/tensorflow#22766, added -DNDEBUG; after that, the error output became:
/ATEN/ops> make
Makefile:5: /usr/local/lib/python3.6/site-packages/tensorflow/include
nvcc -std=c++11 -c --expt-relaxed-constexpr --gpu-architecture=sm_61
-o ./build/flow_warp_gpu.o flow_warp/flow_warp.cu.cc
-I /usr/local/lib/python3.6/site-packages/tensorflow/include -I/usr/local/lib/python3.6/site-packages/tensorflow/include/external/nsync/public -I/usr/local/cuda/include -I/usr/local -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -DNDEBUG
/usr/local/lib/python3.6/site-packages/tensorflow/include/absl/strings/string_view.h(501): warning: expression has no effect

/usr/local/lib/python3.6/site-packages/tensorflow/include/google/protobuf/arena_impl.h(55): warning: integer conversion resulted in a change of sign

/usr/local/lib/python3.6/site-packages/tensorflow/include/google/protobuf/arena_impl.h(309): warning: integer conversion resulted in a change of sign

/usr/local/lib/python3.6/site-packages/tensorflow/include/google/protobuf/arena_impl.h(310): warning: integer conversion resulted in a change of sign

/usr/local/lib/python3.6/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsImpl.h(651): warning: missing return statement at end of non-void function "Eigen::internal::igammac_cf_impl<Scalar, mode>::run [with Scalar=float, mode=Eigen::internal::VALUE]"
detected during:
instantiation of "Scalar Eigen::internal::igammac_cf_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=float, mode=Eigen::internal::VALUE]"
(855): here
instantiation of "Scalar Eigen::internal::igamma_generic_impl<Scalar, mode>::run(Scalar, Scalar) [with Scalar=float, mode=Eigen::internal::VALUE]"
(2096): here
instantiation of "Eigen::internal::igamma_retval<Eigen::internal::global_math_functions_filtering_base<Scalar, void>::type>::type Eigen::numext::igamma(const Scalar &, const Scalar &) [with Scalar=float]"
/usr/local/lib/python3.6/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../src/SpecialFunctions/SpecialFunctionsHalf.h(34): here
......
Assembler messages:
Fatal error: can't create ./build/flow_warp_gpu.o: No such file or directory
make: *** [flow_warp_gpu.o] Error 1
How can I solve this problem? Thanks.

module 'keras.layers' has no attribute 'ConvGRU2D'

Hi, I have already installed the requirements as specified in the README and the compilation was successful, but I get the following error when running python scripts/vip/train_aten.py:

/data/APL/ATEN/aten_model.py(676)conv_gru_unit() x = KL.ConvGRU2D(filters=256, kernel_size=(3, 3), name="gru_recurrent_unit",
AttributeError: module 'keras.layers' has no attribute 'ConvGRU2D'
The output of dir(keras.layers) is:
['Activation', 'ActivityRegularization', 'Add', 'AlphaDropout', 'AtrousConv1D', 'AtrousConv2D', 'AtrousConvolution1D', 'AtrousConvolution2D', 'Average', 'AveragePooling1D', 'AveragePooling2D', 'AveragePooling3D', 'AvgPool1D', 'AvgPool2D', 'AvgPool3D', 'BatchNormalization', 'Bidirectional', 'Concatenate', 'Conv1D', 'Conv2D', 'Conv2DTranspose', 'Conv3D', 'Conv3DTranspose', 'ConvLSTM2D', 'ConvLSTM2DCell', 'ConvRNN2D', 'ConvRecurrent2D', 'Convolution1D', 'Convolution2D', 'Convolution2DTranspose', 'Convolution3D', 'Cropping1D', 'Cropping2D', 'Cropping3D', 'CuDNNGRU', 'CuDNNLSTM', 'Deconv2D', 'Deconv3D', 'Deconvolution2D', 'Deconvolution3D', 'Dense', 'DepthwiseConv2D', 'Dot', 'Dropout', 'ELU', 'Embedding', 'Flatten', 'GRU', 'GRUCell', 'GaussianDropout', 'GaussianNoise', 'GlobalAveragePooling1D', 'GlobalAveragePooling2D', 'GlobalAveragePooling3D', 'GlobalAvgPool1D', 'GlobalAvgPool2D', 'GlobalAvgPool3D', 'GlobalMaxPool1D', 'GlobalMaxPool2D', 'GlobalMaxPool3D', 'GlobalMaxPooling1D', 'GlobalMaxPooling2D', 'GlobalMaxPooling3D', 'Highway', 'Input', 'InputLayer', 'InputSpec', 'K', 'LSTM', 'LSTMCell', 'Lambda', 'Layer', 'LeakyReLU', 'LocallyConnected1D', 'LocallyConnected2D', 'Masking', 'MaxPool1D', 'MaxPool2D', 'MaxPool3D', 'MaxPooling1D', 'MaxPooling2D', 'MaxPooling3D', 'Maximum', 'MaxoutDense', 'Minimum', 'Multiply', 'PReLU', 'Permute', 'RNN', 'ReLU', 'Recurrent', 'RepeatVector', 'Reshape', 'SeparableConv1D', 'SeparableConv2D', 'SeparableConvolution1D', 'SeparableConvolution2D', 'SimpleRNN', 'SimpleRNNCell', 'Softmax', 'SpatialDropout1D', 'SpatialDropout2D', 'SpatialDropout3D', 'StackedRNNCells', 'Subtract', 'ThresholdedReLU', 'TimeDistributed', 'UpSampling1D', 'UpSampling2D', 'UpSampling3D', 'Wrapper', 'ZeroPadding1D', 'ZeroPadding2D', 'ZeroPadding3D', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'absolute_import', 'activations', 'add', 'advanced_activations', 'average', 'concatenate', 'constraints', 'conv_utils', 'convolutional', 'convolutional_recurrent', 'copy', 'core', 'cudnn_recurrent', 'deserialize', 'deserialize_keras_object', 'division', 'dot', 'embeddings', 'func_dump', 'func_load', 'has_arg', 'initializers', 'interfaces', 'local', 'maximum', 'merge', 'minimum', 'multiply', 'namedtuple', 'noise', 'normalization', 'np', 'object_list_uid', 'pooling', 'print_function', 'python_types', 'recurrent', 'regularizers', 'serialize', 'subtract', 'to_list', 'transpose_shape', 'warnings', 'wrappers']
I'm using the TF 1.10 Docker image.
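A quick diagnostic (not part of the repository) to confirm that the Keras build being imported is the custom keras_convGRU one that registers ConvGRU2D:

# Diagnostic sketch: verify the imported Keras is the custom build that registers ConvGRU2D.
import keras
import keras.layers as KL

print(keras.__version__)         # version of the Keras actually on sys.path
print(keras.__file__)            # should point at the keras_convGRU installation, not a stock Keras
print(hasattr(KL, 'ConvGRU2D'))  # must be True for the KL.ConvGRU2D call in aten_model.py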

How does this work compare to CIHP_PGN?

Hi, I'm looking at both your work and CIHP_PGN. The conclusions from both your papers are very similar.

CIHP_PGN Paper

In this paper, we presented a novel detection-free Part Grouping Network to investigate instance-level human parsing, which is a more pioneering and challenging work in analyzing human in the wild. To push the research boundary of human parsing to match real-world scenarios much better, we further introduce a new large-scale (...)
Experimental results on PASCAL-Person-Part [6] and our CIHP dataset demonstrate the superiority of our proposed approach, which surpasses previous methods for both semantic part segmentation and edge detection tasks, and achieves state-of-the-art performance for instance-level human parsing.

Your Paper

In this work, we investigate video instance-level human parsing that is a more pioneering and realistic task in analyzing human in the wild. To fill the blank of video human parsing data resources, we further introduce a large-scale (...)
Experimental results on DAVIS [36] and our VIP dataset demonstrate the superiority of our proposed approach, which achieves state-of-the-art performance on both video instance-level human parsing and video segmentation tasks.

I'm wondering - which produces better accuracy, this work or CIHP_PGN? Considering that both claim "more pioneering", "demonstrate the superiority of our proposed approach", and "achieve state-of-the-art", can you help explain the differences? I'm not clear which I should use. Thanks!
