mpatacchiola / deepgaze Goto Github PK

Computer Vision library for human-computer interaction. It implements Head Pose and Gaze Direction Estimation Using Convolutional Neural Networks, Skin Detection through Backprojection, Motion Detection and Tracking, Saliency Map.

License: MIT License

Python 100.00%

convolutional-neural-networks motion-tracking color-detection face-detection skin-detection motion-detection head-pose-estimation human-computer-interaction histogram-comparison histogram-intersection

deepgaze's Introduction

Updates

Update 22/01/2020 You may be interested in following my new Youtube channel for weekly videos about Computer Vision, Machine Learning, Deep Learning, and Robotics.

Update 16/07/2019 Stable version of Deepgaze 2.0 is available on branch 2.0.

Update 20/03/2019 Started the porting on Python/OpenCV 3.0, check the branch 2.0 for a preliminary version.

Update 10/06/2017 The PDF of the article "Head pose estimation in the wild using Convolutional Neural Networks and adaptive gradient methods" is available for free download in the next 50 days using this special link

Update 04/06/2017 Article "Head pose estimation in the wild using Convolutional Neural Networks and adaptive gradient methods" have been accepted for publication in Pattern Recogntion (Elsevier). The Deepgaze CNN head pose estimator module is based on this work.

Update 31/05/2017 Implementation of the new package saliency_map.py. The package contains an implementation of the FASA algorithm for saliency detection [example] [wiki]

Update 22/03/2017 Fixed a bug in mask_analysis.py and almost completed a more robust version of the CNN head pose estimator.

What is Deepgaze?

Deepgaze is a library for human-computer interaction, people detection and tracking which uses Convolutional Neural Networks (CNNs) for face detection, head pose estimation and classification. The focus of attention of a person can be approximately estimated finding the head orientation. This is particularly useful when the eyes are covered, or when the user is too far from the camera to grab the eye region with a good resolution. When the eye region is visible it is possible to estimate the gaze direction, which is much more informative and can give a good indication of the FOA. Deepgaze contains useful packages for:

Head pose estimation (Perspective-n-Point, Convolutional Neural Networks)
Face detection (Haar Cascade)
Skin and color detection (Range detection, Backprojection)
Histogram-based classification (Histogram Intersection)
Motion detection (Frame differencing, MOG, MOG2)
Motion tracking (Particle filter)
Saliency map (FASA)

Deepgaze is based on OpenCV and Tensorflow, some of the best libraries in computer vision and machine learning. Deepgaze is an open source project and any contribution is appreciated, feel free to fork the repository and propose integrations.

This library is the result of a recent work, if you use the library in academic work please cite the following paper:

Patacchiola, M., & Cangelosi, A. (2017). Head pose estimation in the wild using Convolutional Neural Networks and adaptive gradient methods. Pattern Recognition, http://dx.doi.org/10.1016/j.patcog.2017.06.009.

Why should I use Deepgaze?

Because Deepgaze makes your life easier! The implementation of many algorithms such as face detectors, pose estimators and object classificators can be painful. Deepgaze has been designed to implement those algorithms in a few lines of code. Deepgaze is helpful for both beginners and advanced users who want to save time. All the code contained in Deepgaze is optimised and it is based on state-of-the-art algorithms.

What is a Convolutional Neural Network?

A convolutional neural network (CNN, or ConvNet) is a type of feed-forward artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex, whose individual neurons are arranged in such a way that they respond to overlapping regions tiling the visual field. Convolutional networks were inspired by biological processes and are variations of multilayer perceptrons designed to use minimal amounts of preprocessing. They have wide applications in image and video recognition, recommender systems and natural language processing [wiki]

Main contributors

This is an updated list of the main contributors of the project. We are looking for contributors! If you want to contribute adding a new module or improving an existing one, send an email to our team!

Massimiliano Patacchiola: project leader and main contributor
Joel Gooch: head pose estimation
Ishit Mehta: CNN-cascade face detection
Luca Surace: Haar-cascade multi-face detection
Hrishikesh Kamath: version 2.0 porting, notebooks, test scripts

Prerequisites

The current version of Deepgaze is based on Python 2.7, a porting for Python 3.0 has been scheduled for the next year.

To use the libray you have to install:

Numpy [link]

sudo pip install numpy

OpenCV 2.x (not compatible with OpenCV >= 3.x) [link]

sudo apt-get install libopencv-dev python-opencv

Tensorflow [link]

sudo pip install tensorflow

Some examples may require additional libraries:

dlib [link]

Installation

ATTENTION: this version is obsolete, please check the branch 2.0 on this repository

Download the repository from [here] or clone it using git:

git clone https://github.com/mpatacchiola/deepgaze.git

To install the package you have to enter in the Deepgaze folder and run the setup.py script (it may require root privileges):

cd deepgaze
sudo python setup.py install

If you want to track all the installed files you can record the installation process in a text file using the --record flag:

sudo python setup.py install --record record.txt

Done! Now give a look to the examples below.

Examples

Head Pose Estimation using the Perspective-n-Point algorithm in OpenCV [code] [video]
Head Pose Estimation in-the-wild using Perspective-n-Point and dlib face detector [code] [video]
Head Pose Estimation in images using Convolutional Neural Networks [code]

Color detection using the Histogram Backprojection algorithm [blog] [code]

Skin detection using the HSV range color detector [code]

Face detection using the HSV range color detector [code]

Corner detection comparison of four algorithms on a video streaming [code] [video]

Motion detection and tracking using frame differencing on a video streaming [code]

Motion detection and tracking comparison of three algorithms on a video streaming [code] [video]

Motion tracking with unstable measurements using Particle Filter [code] [video]

Motion tracking with multiple backprojection for playing chrome's dinosaur game [blog] [code] [video]

Classify object using their colour fingerprint (histogram intersection) [blog] [code]

Implementation of the FASA (Fast, Accurate, and Size-Aware Salient Object Detection) algorithm [code] [wiki] [link]

Acknowledgements

The example "head pose estimation using Perspective-n-Point" is partially based on the C++ version you can find here, and on the workshop "Developing an attention system for a social robot" which was part of the 2nd International Summer School on Social Human-Robot Interaction.
To implement the Bayes and Particle Filters I followed the great repository of rlabbe which you can find here

deepgaze's People

Contributors

Stargazers

Watchers

Forkers

g10dras caomw mahmoudheshambackup vyraun sebasstrogg stevenlol wanjinchang benjamesbabala lyk125 v-italy bikmaeff ahn19 leezqcst brightown zymcool walkoncross hexiangquan fenix0817 yochju undercontroller caozhengquan mailmahee ml-lab huleg johnjsb soledad89 tspannhw jotathebest minvex gofullthrottle boyihu sarmadm amal-vincent vcramach li-js yang-fei boosting aiwebbot allenmao remyyang liuhuiwisdom coocoky jnulab analystmonkey pengkiki wangqiang1588 johndpope wcy0319 gp1313 nazifberat alialerwi jajohe ishit lukeoverride nagyist thuanvh tangxinkevin tharuniitk thetesla facemachine rossanag ashwinrajendraprasad repo-fork walterwang xanthoko suzanasvm wookay aminghafari ewancai zwb195 whizzzkid dsmadcow jadedfire huanleo francispan sathishruw arasharchor jinwook-shim wqw547243068 avarf baucheng gsoosg zorrock elecun po-hsuan-huang lei-wang-github zhangqizky jxx315 grplyler dongrit ratnadeep22 veenlee gvc0461082002 voletiv howieeeeeeee borisnadion navdevl tengerye peterzhousz lloves

deepgaze's Issues

Pushing tipo fixes ?

Hi Massimiliano!

just tried the very first example and it worked out of the box :)
Only correction done is as follows:

index 18a299c..b56a84d 100644
--- a/examples/ex_pnp_head_pose_estimation_webcam.py
+++ b/examples/ex_pnp_head_pose_estimation_webcam.py
@@ -121,7 +121,8 @@ def main():
                                   P3D_STOMION])

     #Declaring the two classifiers
-    my_cascade = haarCascade("./etc/haarcascade_frontalface_alt.xml", "./etc/haarcascade_profileface.xml")
+    my_cascade = haarCascade("./etc/xml/haarcascade_frontalface_alt.xml", "./etc/xml/haarcascade_profileface.xml")
+    #TODO If missing, example file can be retrieved from http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
     my_detector = faceLandmarkDetection('./etc/shape_predictor_68_face_landmarks.dat')

     #Error counter definition

I will try further but let me know I you would like to see changes pushed on a branch meanwhile.

Thanks,
RC

Cnn Pose Estimation Sensitivity to Face Size in Image

Hi there,

What's the guideline to crop face images before running the CNN roll, pitch, yaw models? Depending on my cropping, I'm getting up to 20 degree swings in angle estimation. Is this normal? Even with forward facing poses, I'm getting large swings in pitch.

Thanks,
JM

About head-pose-estimation by Python implementaiton

Hello I follow your code(ex_dlib_pnp_head_pose_estimation_video.py) to implementation the pose estimation,
I want to get the face pose, range from +90 to -90, like the following picture

I use the six landmark and their world coordinate to get pose

image_points = np.array([
                            (landmarks[4], landmarks[5]),     # Nose tip
                            (landmarks[10], landmarks[11]),   # Chin
                            (landmarks[0], landmarks[1]),     # Left eye left corner
                            (landmarks[2], landmarks[3]),     # Right eye right corne
                            (landmarks[6], landmarks[7]),     # Left Mouth corner
                            (landmarks[8], landmarks[9])      # Right mouth corner
                        ], dtype="double")

    # 3D model points.
    model_points = np.array([
                            (0.0, 0.0, 0.0),             # Nose tip
                            (0.0, -330.0, -65.0),        # Chin
                            (-165.0, 170.0, -135.0),     # Left eye left corner
                            (165.0, 170.0, -135.0),      # Right eye right corne
                            (-150.0, -150.0, -125.0),    # Left Mouth corner
                            (150.0, -150.0, -125.0)      # Right mouth corner                         
                        ])

(success, rotation_vector, translation_vector) = cv2.solvePnP(model_points, image_points, camera_matrix, dist_coeffs, flags=cv2.CV_ITERATIVE)
rvec_matrix = cv2.Rodrigues(rotation_vector)[0]
proj_matrix = np.hstack((rvec_matrix, translation_vector))
eulerAngles = -cv2.decomposeProjectionMatrix(proj_matrix)[6] 
yaw   = eulerAngles[1]
pitch = eulerAngles[0]
roll  = eulerAngles[2]
if pitch > 0:
  pitch = 180 - pitch
elif pitch < 0:
  pitch = -180 - pitch
yaw = -yaw

But I have some problem, each case(yaw, pitch, row) is correct, for example, In face roll case, roll is true, but, pitch is false. Or, in complex situation, one of three will false. Could u give me some advise? Thanks.

training script for CNN headpose estimation

Hi @mpatacchiola , thanks for the great work! I was wondering that, would you in future share the complete training script for the model currently used in CNN headpose estimation? Thanks!

Can i use my own new images to train the model?

Can i use my own new images to train the model? I am working on cnn head pose estimation. your work is excellent. I was stuck with a question that "Can i use my own new images to train the model? " whether this deepgaze will help me so in doing? Thank you.

Difference in original C++ implementation and Python implementation in this repo

Hi,
I used the same horse image that the original authors used for their implementation and compared the results for them.
Turns out, there seems to be quite a large difference between the resulting saliency maps.
Can you please tell me why?
Does opencv inherently have some differences between the internal function implementations?
I've attached the file that shows the difference. I ran both without any threshold filtering and gaussian deblurring. The saliency map on the left is the output from C++. The one on the right is the output from Python.

Any help is appreciated!

about the 3D_Coord of the landmarks

Hi
Thank you for sharing the code.
Could you tell me where did you get these data? Actually, I want to find out 3D coordinates of 64 landmarks detected by dlib.
Thank you.

python2 compile errors

python2 -m compileall .

gives:

Listing . ...
Listing ./.git ...
Listing ./.git/branches ...
Listing ./.git/hooks ...
Listing ./.git/info ...
Listing ./.git/logs ...
Listing ./.git/logs/refs ...
Listing ./.git/logs/refs/heads ...
Listing ./.git/logs/refs/remotes ...
Listing ./.git/logs/refs/remotes/origin ...
Listing ./.git/objects ...
Listing ./.git/objects/0b ...
Listing ./.git/objects/1a ...
Listing ./.git/objects/27 ...
Listing ./.git/objects/81 ...
Listing ./.git/objects/99 ...
Listing ./.git/objects/9c ...
Listing ./.git/objects/ed ...
Listing ./.git/objects/info ...
Listing ./.git/objects/pack ...
Listing ./.git/refs ...
Listing ./.git/refs/heads ...
Listing ./.git/refs/remotes ...
Listing ./.git/refs/remotes/origin ...
Listing ./.git/refs/tags ...
Listing ./__pycache__ ...
Listing ./build ...
Listing ./build/lib.linux-x86_64-2.7 ...
Listing ./build/lib.linux-x86_64-2.7/deepgaze ...
Listing ./build/lib.linux-x86_64-2.7/deepgaze/__pycache__ ...
Listing ./deepgaze ...
Listing ./deepgaze/__pycache__ ...
Listing ./doc ...
Listing ./doc/images ...
Listing ./etc ...
Listing ./etc/tensorflow ...
Listing ./etc/tensorflow/head_pose ...
Listing ./etc/tensorflow/head_pose/pitch ...
Listing ./etc/tensorflow/head_pose/roll ...
Listing ./etc/tensorflow/head_pose/yaw ...
Listing ./etc/xml ...
Listing ./examples ...
Listing ./examples/__pycache__ ...
Listing ./examples/ex_cnn_cascade_training_face_detection ...
Listing ./examples/ex_cnn_cascade_training_face_detection/__pycache__ ...
Listing ./examples/ex_cnn_head_pose_axes ...
Listing ./examples/ex_cnn_head_pose_estimation_images ...
Listing ./examples/ex_cnn_head_pose_estimation_images/__pycache__ ...
Listing ./examples/ex_color_classification_images ...
Listing ./examples/ex_color_classification_images/__pycache__ ...
Listing ./examples/ex_color_detection_image ...
Listing ./examples/ex_color_detection_image/__pycache__ ...
Listing ./examples/ex_diff_motion_detection_video ...
Listing ./examples/ex_diff_motion_detection_video/__pycache__ ...
Listing ./examples/ex_dnn_head_pose_estimation_training ...
Listing ./examples/ex_dnn_head_pose_estimation_training/__pycache__ ...
Compiling ./examples/ex_dnn_head_pose_estimation_training/ex_dnn_head_pose_estimation_training.py ...
  File "./examples/ex_dnn_head_pose_estimation_training/ex_dnn_head_pose_estimation_training.py", line 136
    weights_output3 = tf.Variable(tf.truncated_normal([num_hidden_units_3, num_labels], 0.0, 1.0))
                  ^
SyntaxError: invalid syntax

Compiling ./examples/ex_dnn_head_pose_estimation_training/ex_prima_parser.py ...
Sorry: IndentationError: expected an indented block (ex_prima_parser.py, line 393)
Listing ./examples/ex_face_center_color_detection ...
Listing ./examples/ex_face_center_color_detection/__pycache__ ...
Listing ./examples/ex_fasa_saliency_map ...
Listing ./examples/ex_fasa_saliency_map/__pycache__ ...
Listing ./examples/ex_haar_face_detection ...
Listing ./examples/ex_haar_face_detection/__pycache__ ...
Listing ./examples/ex_motion_detectors_comparison_video ...
Listing ./examples/ex_motion_detectors_comparison_video/__pycache__ ...
Listing ./examples/ex_multi_backprojection_hand_tracking_gaming ...
Listing ./examples/ex_multi_backprojection_hand_tracking_gaming/__pycache__ ...
Listing ./examples/ex_particle_filter_mouse_tracking ...
Listing ./examples/ex_particle_filter_mouse_tracking/__pycache__ ...
Listing ./examples/ex_particle_filter_object_tracking_video ...
Listing ./examples/ex_particle_filter_object_tracking_video/__pycache__ ...
Listing ./examples/ex_skin_detection_images ...
Listing ./examples/ex_skin_detection_images/__pycache__ ...

python3 compatibility

python3 -m compileall .

gives:

Listing '.'...
Listing './.git'...
Listing './.git/branches'...
Listing './.git/hooks'...
Listing './.git/info'...
Listing './.git/logs'...
Listing './.git/logs/refs'...
Listing './.git/logs/refs/heads'...
Listing './.git/logs/refs/remotes'...
Listing './.git/logs/refs/remotes/origin'...
Listing './.git/objects'...
Listing './.git/objects/0b'...
Listing './.git/objects/1a'...
Listing './.git/objects/27'...
Listing './.git/objects/81'...
Listing './.git/objects/99'...
Listing './.git/objects/9c'...
Listing './.git/objects/ed'...
Listing './.git/objects/info'...
Listing './.git/objects/pack'...
Listing './.git/refs'...
Listing './.git/refs/heads'...
Listing './.git/refs/remotes'...
Listing './.git/refs/remotes/origin'...
Listing './.git/refs/tags'...
Listing './build'...
Listing './build/lib.linux-x86_64-2.7'...
Listing './build/lib.linux-x86_64-2.7/deepgaze'...
Listing './deepgaze'...
Listing './doc'...
Listing './doc/images'...
Listing './etc'...
Listing './etc/tensorflow'...
Listing './etc/tensorflow/head_pose'...
Listing './etc/tensorflow/head_pose/pitch'...
Listing './etc/tensorflow/head_pose/roll'...
Listing './etc/tensorflow/head_pose/yaw'...
Listing './etc/xml'...
Listing './examples'...
Listing './examples/ex_cnn_cascade_training_face_detection'...
Listing './examples/ex_cnn_head_pose_axes'...
Compiling './examples/ex_cnn_head_pose_axes/ex_cnn_head_pose_estimation_axes.py'...
***   File "./examples/ex_cnn_head_pose_axes/ex_cnn_head_pose_estimation_axes.py", line 90
    print rvec
             ^
SyntaxError: Missing parentheses in call to 'print'

Listing './examples/ex_cnn_head_pose_estimation_images'...
Listing './examples/ex_color_classification_images'...
Listing './examples/ex_color_detection_image'...
Listing './examples/ex_diff_motion_detection_video'...
Listing './examples/ex_dnn_head_pose_estimation_training'...
Compiling './examples/ex_dnn_head_pose_estimation_training/ex_dnn_head_pose_estimation_training.py'...
***   File "./examples/ex_dnn_head_pose_estimation_training/ex_dnn_head_pose_estimation_training.py", line 136
    weights_output3 = tf.Variable(tf.truncated_normal([num_hidden_units_3, num_labels], 0.0, 1.0))
                  ^
SyntaxError: invalid syntax

Compiling './examples/ex_dnn_head_pose_estimation_training/ex_prima_parser.py'...
*** Sorry: IndentationError: expected an indented block (ex_prima_parser.py, line 393)
Listing './examples/ex_face_center_color_detection'...
Listing './examples/ex_fasa_saliency_map'...
Compiling './examples/ex_fasa_saliency_map/ex_fasa_saliency_map_webcam.py'...
***   File "./examples/ex_fasa_saliency_map/ex_fasa_saliency_map_webcam.py", line 34
    print video_capture.get(cv2.cv.CV_CAP_PROP_FRAME_WIDTH)
                      ^
SyntaxError: invalid syntax

Listing './examples/ex_haar_face_detection'...
Listing './examples/ex_motion_detectors_comparison_video'...
Listing './examples/ex_multi_backprojection_hand_tracking_gaming'...
Listing './examples/ex_particle_filter_mouse_tracking'...
Listing './examples/ex_particle_filter_object_tracking_video'...
Listing './examples/ex_skin_detection_images'...

Please make it more python3 compatible.

input image to the cnn_head_pose_estimation

Hi, thank you for your works!
I am not clear that what the images are exactly processed before we input to the nets? Is it just detected square face box or some translation and scaling have been down?
I test the same face image with different size, one just square the face, another is bigger and with more background, the result yaw, pitch and roll are different. So it's important to know the pictures input to model.
wish for your replies.

how to get Roll, Yaw, Pitch Angles?

Dear mpatacchiola,

After solvePnP, how to get Roll, Yaw, Pitch Angles?

CNN/Dlib/OpenCv which method works better ?

Hi, I am new in TF and I would like to have you comments/advises/remarks about the CNN methods for face orientation. I already tried Opencv and Dlib, and I obtain better results with Dlib. I added a Kalman Filter to make the pose even smoother. I haven't tried the CNN method yet and I would like to get your opinion about this method, can it give even better results than Dlib for the face orientation?

Why did you use haar cascades AND dlib for a face bounding box?

head pose estimation interpretation

Hi Massimiliano,

looking at prima dataset image https://ibb.co/ia5pB5 these are its labels (tilt:pitch:-90 , pan:yaw:0), when trying the same image on deepgaze (https://github.com/mpatacchiola/deepgaze/blob/master/examples/ex_cnn_headp_pose_estimation_images/ex_cnn_head_pose_estimation_images.py) the results are Estimated [roll, pitch, yaw] ..... [-1.32919,-5.41702,-9.20919] could you kindly mention the reason this representation is quite different from prima dataset.

Thanks in advance :)

inconsistent file extension

In etc/tensorflow/head_pose there are subfolders containing the model weights. There are two named:
cnn_cccdd_30k.tf and one cnn_cccdd_30k (without the ".tf"). Please decide which consistent name you want to use.

Saliency Map doesn't work for black and white image?

Hi,
I've been using the FASA implementation in this repository and it doesn't work for grayscale images.
Any workaround?

Thanks in advance!

Which opencv version would deepgaze using.

I'm using opencv 3.3.1 opencv_python‑3.3.1‑cp36‑cp36m‑win_amd64.whl
And run deepgaze example with it. There are many compatible issues. Like this: fourcc = cv2.cv.CV_FOURCC(*'XVID')
are changed like this.
fourcc = cv2.VideoWriter.fourcc(*'XVID')
And many more issues..
So.. Which version should I use?

Gaze estimation prediction

Hi, first of all thank you for deepgaze. It has really been helpful to me for the head pose estimation. I wanted to know how can i use the gaze prediction if the pupils are visible in the image.

Thanks. Looking forward to your reply.

measuring 3d coorfdination for each part of face.

regarding to this code
P3D_RIGHT_SIDE = numpy.float32([-100.0, -77.5, -5.0]) #0
P3D_GONION_RIGHT = numpy.float32([-110.0, -77.5, -85.0]) #4
P3D_MENTON = numpy.float32([0.0, 0.0, -122.7]) #8
P3D_GONION_LEFT = numpy.float32([-110.0, 77.5, -85.0]) #12
P3D_LEFT_SIDE = numpy.float32([-100.0, 77.5, -5.0]) #16
P3D_FRONTAL_BREADTH_RIGHT = numpy.float32([-20.0, -56.1, 10.0]) #17
P3D_FRONTAL_BREADTH_LEFT = numpy.float32([-20.0, 56.1, 10.0]) #26
P3D_SELLION = numpy.float32([0.0, 0.0, 0.0]) #27
P3D_NOSE = numpy.float32([21.1, 0.0, -48.0]) #30
P3D_SUB_NOSE = numpy.float32([5.0, 0.0, -52.0]) #33
P3D_RIGHT_EYE = numpy.float32([-20.0, -65.5,-5.0]) #36
P3D_RIGHT_TEAR = numpy.float32([-10.0, -40.5,-5.0]) #39
P3D_LEFT_TEAR = numpy.float32([-10.0, 40.5,-5.0]) #42
P3D_LEFT_EYE = numpy.float32([-20.0, 65.5,-5.0]) #45
#P3D_LIP_RIGHT = numpy.float32([-20.0, 65.5,-5.0]) #48
#P3D_LIP_LEFT = numpy.float32([-20.0, 65.5,-5.0]) #54
P3D_STOMION = numpy.float32([10.0, 0.0, -75.0]) #62

#The points to track
#These points are the ones used by PnP

to estimate the 3D pose of the face

TRACKED_POINTS = (0, 4, 8, 12, 16, 17, 26, 27, 30, 33, 36, 39, 42, 45, 62)
ALL_POINTS = list(range(0,68)) #Used for debug only

that code explain that, there are 68 points of the face. every point is represent by 3d coord.
the thing that i confused is how can i determine that ? (let say how can i measure that stomion is 10, 0, -75

i need to measure the gap between eyelid to determine whether the aye is closed or not by measuring both axis, and i have no idea to find 3d coord, could you please help me?

Gaze Tracking

Even though this frameworks is called "deepgaze" I am still not sure, if you can do Gaze Tracking and Gaze Mapping. Would appreciate a clarification.

cnn_head_pose_estimation: problems with restoring checkpoint

Dear mpatacchiola & anyone reading this,

I am using your program in order to determine the head pose estimation. However I get an error which reads as follows:

NotFoundError (see above for traceback): Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /home/micha/Desktop/BEP/NEW/deepgaze/etc/tensorflow/head_pose/pitch/./cnn_cccdd_30k
[[Node: save_1/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save_1/Const_0_0, save_1/RestoreV2/tensor_names, save_1/RestoreV2/shape_and_slices)]]

So the checkpoint file is recognized, however it seems that it can't find cnn_cccdd_30k from the contents of 'checkpoint'

model_checkpoint_path: "./cnn_cccdd_30k"
all_model_checkpoint_paths: "./cnn_cccdd_30k"

I tried changing the paths by removing the dot in front of it (same error), by adding .meta or .tf, but that didn't work and I still get errors, albeit different one:

NotFoundError (see above for traceback): Tensor name "conv1p_b" not found in checkpoint files /home/micha/Desktop/BEP/NEW/deepgaze/etc/tensorflow/head_pose/pitch/./cnn_cccdd_30k.tf
[[Node: save_1/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save_1/Const_0_0, save_1/RestoreV2/tensor_names, save_1/RestoreV2/shape_and_slices)]]

I also tried shuffling around with forward slashes to see whether that could be the problem, but to no avail. I also tried putting the absolute path to cnn_cccdd_30k.tf in the original file, but this didn't work in combination with checkpoint_dir in tf.train.get_checkpoint_state.

This is my code to call CnnHeadPoseEstimator:

yawFP = "/home/micha/Desktop/BEP/NEW/deepgaze/etc/tensorflow/head_pose/yaw"
pitchFP = "/home/micha/Desktop/BEP/NEW/deepgaze/etc/tensorflow/head_pose/pitch"
HPE = CnnHeadPoseEstimator(yawFP, pitchFP)

This is the relevant code I changed the code in cnn_head_pose_estimator.py a little bit so I could work with a different checkpoint file, keeping the normal one as backup: (ps, this doesn't change the error when I work with the original code in cnn_head_pose_estimator.py)

y_ckpt = tf.train.get_checkpoint_state(checkpoint_dir=YawFilePath, latest_filename='checkpoint_2')
p_ckpt = tf.train.get_checkpoint_state(checkpoint_dir=PitchFilePath, latest_filename='checkpoint_2')

When I print these variables y_ckpt and p_ckpt, I get this, which I believe is correct.

model_checkpoint_path: "/home/micha/Desktop/BEP/NEW/deepgaze/etc/tensorflow/head_pose/yaw/./cnn_cccdd_30k"
all_model_checkpoint_paths: "/home/micha/Desktop/BEP/NEW/deepgaze/etc/tensorflow/head_pose/yaw/./cnn_cccdd_30k"

model_checkpoint_path: "/home/micha/Desktop/BEP/NEW/deepgaze/etc/tensorflow/head_pose/pitch/cnn_cccdd_30k"
all_model_checkpoint_paths: "/home/micha/Desktop/BEP/NEW/deepgaze/etc/tensorflow/head_pose/pitch/cnn_cccdd_30k"

Hopefully somebody would be able to help me!

Thanks in advance,

Micha

Head pose from video

Can I use video file for head pose estimation ?

pitch, yaw and roll

i think that all people try to asking you this..
is it possible if ex_pnp_head_pose_estimation_webcam.py , measure how much degree the head is moving?
let say that your head tilting 45 degree, nodding 30 degree, etc

about cnn head pose and update

Hello, on the cnn head pose estimation in deepgze, I found no face detection module during the test. Is the network model trained on face detection? At the same time, this project will continue to be updated for gaze estimation, thanks
and If I want to train cnn head pose estimation, the data set is made into a .pickle format like dnn?

is gaze direction estimation actually supported?

Based on the title of the repo, and the docs intro I got an impression that it's possible to do gaze direction estimation. But it seems only head position is supported, that is if I keep my head still and only move my eyes - no changes will be noticed.

deepgaze/deepgaze/color_detection.py line 68

line 68 "kernel_siz" is kernel_siz"e" ?
Miss a word "e" ?

gaze estimation

I read main.py and don't see gaze direction estimation in the code.
Maybe I miss something in your code, can you tell me which statement has this gaze direction estimate?

Thanks,

can´t find detector and Haar

Hello, I installed everything in the proper way and could import all libraries, and my cara starts tu run. but i get this error:
`The video source has been opened correctly...
Estimated camera matrix:
[[ 554.25628662 0. 320. ]
[ 0. 554.25628662 240. ]
[ 0. 0. 1. ]]

ValueError Traceback (most recent call last)
in ()
310
311 if name == "main":
--> 312 main()

in main()
115 my_cascade = haarCascade("./etc/xml/haarcascade_frontalface_alt.xml", "./etc/xml/haarcascade_profileface.xml")
116 #TODO If missing, example file can be retrieved from http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2
--> 117 my_detector = faceLandmarkDetection('./etc/shape_predictor_68_face_landmarks.dat')
118
119 #Error counter definition

C:\Users\usuario\deepgaze\deepgaze\face_landmark_detection.pyc in init(self, landmarkPath)
33 #Check if the file provided exist
34 if(os.path.isfile(landmarkPath)==False):
---> 35 raise ValueError('haarCascade: the files specified do not exist.')
36
37 self._predictor = dlib.shape_predictor(landmarkPath)

ValueError: haarCascade: the files specified do not exist.
`

windows上不可以跑

resize frame size

i am trying to resize the frame by changing this code

video_capture = cv2.VideoCapture(0)
video_capture.set(3,340) #pixel
video_capture.set(5,320) #width
video_capture.set(4,240) #height
video_capture.set(6,15) #fps

but, nothing change, the frame size still 640x480,
could you please tell me how, to change it?

Prediction Code for weights created by ex_dnn_training.py throws exception

Hi I'm new to TF, while trying to plug-in the weights created by ex_dnn_training.py in ex_cnn_head_pose_estimation_images_pitch.py i get following error
Tensor name "out_pitch_w" not found in checkpoint files ../dnn_1600i_ben_4h_3o-50000

Is the Model / Hyperparameter for given weight cnn_cccdd_30k.tf and that defined in ex_dnn_training same?

flags=cv2.cv.CV_HAAR_SCALE_IMAGE

Hi,

I have below errors:


File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/deepgaze/haar_cascade.py", line 149, in _findFrontalFace
    flags=cv2.CV_HAAR_SCALE_IMAGE
AttributeError: 'module' object has no attribute 'CV_HAAR_SCALE_IMAGE'

if I edit :


File "/Users/Projects/deepgaze/deepgaze/haar_cascade.py", line 149, in _findFrontalFace
    flags=cv2.CV_HAAR_SCALE_IMAGE
AttributeError: 'module' object has no attribute 'CV_HAAR_SCALE_IMAGE'

opencv 3.4 and python 2.7

any idea ?

thx

headpose roll returns AttributeError

Hi Massimiliano,

trying the example for headpose/roll example always return with Error:

AttributeError: CnnHeadPoseEstimator instance has no attribute 'load_roll_variables'

thanks in advance

Is there any example for this library?

Hello，I used to do some job about head pose estimation . Your job are amazing , and I want to have a try ,but I don't find any example about how to use it , could you give me some advice ?
Many thanks!

missing weight file load

in examples/ex_cnn_head_pose_axes/ex_cnn_head_pose_estimation_axes.py there are the two line missing:

 my_head_pose_estimator.load_roll_variables(os.path.realpath("../../etc/tensorflow/head_pose/roll/cnn_cccdd_30k.tf"))
 my_head_pose_estimator.load_pitch_variables(os.path.realpath("../../etc/tensorflow/head_pose/pitch/cnn_cccdd_30k.tf"))

Is this intended?

Special link isn't active

Hello, can I please have an updated link to you paper?

The special link is expired.

X Error: BadDrawable (invalid Pixmap or Window parameter) 9

Hi there

I'm trying to run the file ex_fasa_saliency_map_images.py in a docker container and I'm experiencing the following errors

X Error: BadDrawable (invalid Pixmap or Window parameter) 9
  Major opcode: 62 (X_CopyArea)
  Resource id:  0x80000d

It could well be because of my docker setup, whoever xeyes works without issue from the docker container to host X env.

Thanks

there is no gaze estimation example using CNNs

Hi,

I have been trying a face detection algorithm with face landmark detection. Now I have been trying to estimation head pose and gaze pose. There is already an example using cnns to estimate head position, but there is no gaze estimation example using cnns as well as traditional methods to compare their performance.

just yaw?

Estimate the head pose by cnn, just get the yaw?what about roll and pitch?

how can i train for the cnn_head_pose_estimation

Hello, I was recently studying head attitude estimation, I am very interested in your project, I have a question, how do I train my own model for cnn_head_pose_estimation.thank you.

head pose estimation, aflw dataset

Thanks for your work. I have a question about the CNN_Head pose estimation. Before I use the AFLW dataset to train the network, do I need to resize the face picture to 64*64? How to complete it?

Visualize roll, pitch, yaw in image

Hi @mpatacchiola, great work, it has been fun trying your examples!

Regarding the CNN implementation, we get the roll, pitch and yaw values, and they seem OK, but I tried to visualize (as you do projecting the axis in the pnp examples) with no success so far. I tried different ways of getting the rotation vector (or matrix and then cv2.Rodrigues to get the vector) from the values, but none seems to work. I don't know if it's a difference in the coordinate system you use or other detail I'm missing.

Any help with this would be most appreciated :)

Best regards,

Fine Tuning

How to fine tune your model?
I don't have sufficient data to retrain your model from scratch.
I want to fine tune your model on my data which has only two classes ?

extending the Deepgaze class for roll estimation

Hi Massimiliano

As communicated on gmail, I have made by changes and opened this issue just to remember what went on.

deep Gaze without gaze estimation?

Hi,

I wondering whether there is a gaze estimation implementation available or it is not, :)

Generate Antropometric constant values

Can you give me some hints on how to generate the Antropometric constant values based on the paper "Head-and-Face Anthropometric Survey of U.S. Respirator Users" since I cannot find the data about 3D model points.

I dont understand the 3D_Coord of the selected 15 3D_Points

Dear Massimiliano Patacchiola,

Thank you for your code!
I dont understand the 3D_Coord of the selected 15 3D_Points.

for example,

P3D_MENTON = numpy.float32([0.0, 0.0, -122.7]) #8
P3D_SELLION = numpy.float32([0.0, 0.0, 0.0]) #27

the first and second coords is all 0.

P3D_NOSE = numpy.float32([21.1, 0.0, -48.0]) #30
P3D_SUB_NOSE = numpy.float32([5.0, 0.0, -52.0]) #33
the nose_point is on top of the sub_nose point in the plane, but the first coords of

nose is 21.1, the sub_nose is 5.

P3D_RIGHT_SIDE = numpy.float32([-100.0, -77.5, -5.0]) #0
P3D_RIGHT_EYE = numpy.float32([-20.0, -65.5,-5.0]) #36
the first coords of the two points vary widely.

P3D_RIGHT_SIDE = numpy.float32([-100.0, -77.5, -5.0]) #0
P3D_GONION_RIGHT = numpy.float32([-110.0, -77.5, -85.0]) #4
P3D_MENTON = numpy.float32([0.0, 0.0, -122.7]) #8
P3D_GONION_LEFT = numpy.float32([-110.0, 77.5, -85.0]) #12
P3D_LEFT_SIDE = numpy.float32([-100.0, 77.5, -5.0]) #16
P3D_FRONTAL_BREADTH_RIGHT = numpy.float32([-20.0, -56.1, 10.0]) #17
P3D_FRONTAL_BREADTH_LEFT = numpy.float32([-20.0, 56.1, 10.0]) #26
P3D_SELLION = numpy.float32([0.0, 0.0, 0.0]) #27
P3D_NOSE = numpy.float32([21.1, 0.0, -48.0]) #30
P3D_SUB_NOSE = numpy.float32([5.0, 0.0, -52.0]) #33
P3D_RIGHT_EYE = numpy.float32([-20.0, -65.5,-5.0]) #36
P3D_RIGHT_TEAR = numpy.float32([-10.0, -40.5,-5.0]) #39
P3D_LEFT_TEAR = numpy.float32([-10.0, 40.5,-5.0]) #42
P3D_LEFT_EYE = numpy.float32([-20.0, 65.5,-5.0]) #45

P3D_LIP_RIGHT = numpy.float32([-20.0, 65.5,-5.0]) #48

P3D_LIP_LEFT = numpy.float32([-20.0, 65.5,-5.0]) #54

P3D_STOMION = numpy.float32([10.0, 0.0, -75.0]) #62

On the number of training angles?

Hello, I learned in the paper you are using the prima data set, I browsed the data set, only given two angles, how do you get to the third angle,Did you mark a few angles at the time of training? Thank you

How to handle the installation of deepgaze errors?

run：
cd deepgaze
sudo python setup.py install
error：
qys@qys-GL552VW:~/deepgaze$ sudo python setup.py install
/usr/lib/python2.7/distutils/dist.py:267: UserWarning: Unknown distribution option: 'include_package_data'
warnings.warn(msg)
running install
running build
running build_py
running install_lib
running install_egg_info
Removing /usr/local/lib/python2.7/dist-packages/deepgaze-0.1.egg-info
Writing /usr/local/lib/python2.7/dist-packages/deepgaze-0.1.egg-info

file deepgaze.py (for module deepgaze) not found

hello,
I met a problem,When I perform setup.py appear this problem, please help me thank you