Comments (6)

QuantuMope commented on July 19, 2024

I just wanted to echo that I too suffered from a quite noticeable delay using ros_openpose on an RTX 2070 Super.
I modified the code following @Smilels's approach, which led to better performance.
There is still a slight delay, but it is much improved. This leads me to suspect that the bottleneck is in the image transfer.

from ros_openpose.

Alex-Beh commented on July 19, 2024

Hello, may I know your CUDA version, cuDNN version, and the output of nvidia-smi?

I tried to run the launch file and got the following error:
F0901 23:54:35.099694 14837 cudnn_relu_layer.cpp:13] Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0) CUDNN_STATUS_INTERNAL_ERROR

ravijo commented on July 19, 2024

Hi @Smilels

Thanks for the acknowledgment.

I am surprised to hear that you are seeing a 1s delay. 1s is too much.

Do you see a 1s delay in the displayed video? And what is the frequency of the topic /frame?

Let me give you an idea about what's going on in ros_openpose. The main file rosOpenpose.cpp uses two workers, one for input and another for output. The input worker provides color images to the OpenPose wrapper. The output worker receives the detected keypoints in pixel coordinates, converts them to 3D coordinates, and publishes them to the topic /frame. The output worker receives keypoints only when the input worker has given an image to OpenPose. There is an explicit sleep of 10 ms, i.e., SLEEP_MS = 10 defined here. You can reduce or remove this delay by editing rosOpenpose.cpp.
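
As a rough illustration of how much delay such a sleep can add, here is a minimal sketch. It assumes the worker simply polls for a new frame and sleeps SLEEP_MS between polls; that loop shape and the function name pollingWaitMs are assumptions for illustration, not code taken from rosOpenpose.cpp. Under that assumption, a frame never waits a full sleep period before being picked up.

```cpp
#include <cstdint>

// Hypothetical model of a polling worker (not ros_openpose code):
// it checks for a new frame at t = 0, sleepMs, 2*sleepMs, ...
// and a frame arrives at arrivalMs (assumed non-negative).
// Returns how long the frame waits until the next poll picks it up.
std::int64_t pollingWaitMs(std::int64_t arrivalMs, std::int64_t sleepMs) {
    if (sleepMs <= 0) return 0;  // no sleep: the frame is seen immediately
    const std::int64_t remainder = arrivalMs % sleepMs;
    return remainder == 0 ? 0 : sleepMs - remainder;
}
```

For example, with a 10 ms sleep, a frame arriving at t = 33 ms is picked up at the poll at t = 40 ms, so it waits 7 ms. In this model the wait is always shorter than the sleep period.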

> I can get around 27fps outputs using pure openpose.

Can you please explain a bit more? What are the inputs to OpenPose? What was the model? I am asking because I have not seen this high an FPS in OpenPose on my workstation so far. Here is the link to OpenPose benchmarking report.

Another possible reason for this delay is the legacy version of OpenCV. OpenPose is known to run slower with a webcam because it uses OpenCV to capture the images. As ros_openpose is developed for ROS, it is bound to use the legacy version of OpenCV. ros_openpose uses cv_bridge and serializes each image (cv::Mat) into a ROS image (sensor_msgs::Image).

Last but not least, if you have time, I suggest you benchmark the complete system. It will be helpful in finding the real culprit. More details on OpenPose benchmarking can be found here.

Cheers!

ravijo commented on July 19, 2024

Hi @Alex-Beh

> Hello, may I know your CUDA version, cuDNN version, and the output of nvidia-smi?

Please read the documentation here.

> I tried to run the launch file and got the following error:
> F0901 23:54:35.099694 14837 cudnn_relu_layer.cpp:13] Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0) CUDNN_STATUS_INTERNAL_ERROR

This error is unrelated to ros_openpose. Are you sure that OpenPose is working fine? As I suggested here, in order to run ros_openpose, first you need to have a working OpenPose.

Smilels commented on July 19, 2024

Dear @ravijo, thanks for your dedicated reply.

> Hi @Smilels
>
> Thanks for the acknowledgment.
>
> I am surprised to hear that you are seeing a 1s delay. 1s is too much.
>
> Do you see a 1s delay in the displayed video? And what is the frequency of the topic /frame?
>
> Let me give you an idea about what's going on in ros_openpose. The main file rosOpenpose.cpp uses two workers, one for input and another for output. The input worker provides color images to the OpenPose wrapper. The output worker receives the detected keypoints in pixel coordinates, converts them to 3D coordinates, and publishes them to the topic /frame. The output worker receives keypoints only when the input worker has given an image to OpenPose. There is an explicit sleep of 10 ms, i.e., SLEEP_MS = 10 defined here. You can reduce or remove this delay by editing rosOpenpose.cpp.

I clearly see that the images in the display window lag behind my real motion.
I tried setting SLEEP_MS = 0, but nothing changed.
I even wrote a test publisher and published a test message in the image callback function (in cameraReader.cpp). The frequency of this test topic is still around 30 Hz.
So I suppose that the latency comes from the WUserInput class.
The color images fed into OpenPose are somehow delayed.
The frequency of /frame, however, is around 20 Hz.

> I can get around 27fps outputs using pure openpose.
>
> Can you please explain a bit more? What are the inputs to OpenPose? What was the model? I am asking because I have not seen this high an FPS in OpenPose on my workstation so far. Here is the link to OpenPose benchmarking report.

This is my environment configuration:

| Name | Value |
| --- | --- |
| OS | Ubuntu 16.04 LTS (64-bit) |
| RAM | 125.6 GB |
| Processor | Intel® Core i9-7900X × 20 |
| Kernel Version | 4.15.0-107-generic |
| ROS | Kinetic |
| GCC Version | 5.4.0 |
| OpenCV Version | 3.3.1 |
| OpenPose Version | 1.6.1 |
| GPU | 2 × GeForce GTX 1080 |
| CUDA Version | 9.0.176 |
| cuDNN Version | 7.4.2 |

I use two GeForce GTX 1080 GPUs; maybe that's why I can get 27fps with the body-only model.

> Another possible reason for this delay is the legacy version of OpenCV. OpenPose is known to run slower with a webcam because it uses OpenCV to capture the images. As ros_openpose is developed for ROS, it is bound to use the legacy version of OpenCV. ros_openpose uses cv_bridge and serializes each image (cv::Mat) into a ROS image (sensor_msgs::Image).

Finally, I solved this problem by getting rid of the input worker and output worker.
I use opWrapper.waitAndEmplace to process the sPtrVecSPtrDatum type input directly.
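
The control flow of this approach can be sketched roughly as follows. FakeDatum and FakeWrapper are hypothetical stand-ins for op::Datum and op::Wrapper so that the example compiles without OpenPose, and the sketch assumes the result is fetched with waitAndPop, which this comment does not state. The actual change to ros_openpose is not shown in the thread.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for op::Datum: a frame id plus a "processed" flag.
struct FakeDatum {
    int frameId = 0;
    bool processed = false;
};

// Same shape as the "shared pointer to a vector of shared pointers" input type.
using FakeDatums = std::shared_ptr<std::vector<std::shared_ptr<FakeDatum>>>;

// Hypothetical stand-in for op::Wrapper. It processes the input immediately;
// the real wrapper runs its own threads and queues.
class FakeWrapper {
public:
    bool waitAndEmplace(FakeDatums& datums) {
        if (!datums || datums->empty()) return false;
        for (auto& datum : *datums) datum->processed = true;
        pending_ = datums;
        return true;
    }
    bool waitAndPop(FakeDatums& datums) {
        if (!pending_) return false;
        datums = pending_;
        pending_.reset();
        return true;
    }
private:
    FakeDatums pending_;
};

// Push one frame and wait for its result, without input or output worker classes.
template <typename Wrapper>
FakeDatums processOneFrame(Wrapper& wrapper, int frameId) {
    auto datums = std::make_shared<std::vector<std::shared_ptr<FakeDatum>>>();
    datums->push_back(std::make_shared<FakeDatum>());
    datums->back()->frameId = frameId;

    if (!wrapper.waitAndEmplace(datums)) return nullptr;
    FakeDatums result;
    if (!wrapper.waitAndPop(result)) return nullptr;
    return result;
}
```

In a ROS node, a function like processOneFrame would typically be called once per received image, so each result corresponds to the frame that was just pushed.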

Anyway, if the input worker works well on your side, I'm still confused about why this issue happens on my computer.

> Last but not least, if you have time, I suggest you benchmark the complete system. It will be helpful in finding the real culprit. More details on OpenPose benchmarking can be found here.

I haven't read this document before.
Thanks for your support.

> Cheers!

Cheers.

ravijo commented on July 19, 2024

Hi @Smilels

Thanks for sharing the information.

Indeed, your environment is very powerful.

> I use opWrapper.waitAndEmplace to process the sPtrVecSPtrDatum type input directly.

The opWrapper.waitAndEmplace(datumToProcess) call is said to be ideal for fast prototyping when performance is not an issue. Nevertheless, I am glad that you made it work.

> Anyway, if the input worker works well on your side, I'm still confused about why this issue happens on my computer.

To be precise, I don't know. It could be due to anything. For example, recently, I started using CUDA 10. However, I have not done any benchmarking of the complete system (including ros_openpose). I am also confused, just like you right now.

If I get time, I hope to figure it out. Even if I can't, I hope to make the 1s delay much shorter. Meanwhile, if you find any clue, please feel free to post it here. I am closing this issue for now.

Cheers!
