
l2cs-net's Introduction

[animated demo GIF]


L2CS-Net

The official PyTorch implementation of L2CS-Net for gaze estimation and tracking.

Installation

Install the package with the following:

pip install git+https://github.com/edavalosanaya/L2CS-Net.git@main

Or, git clone the repo and install it with the following (the optional -e flag installs in editable mode):

pip install [-e] .

Now you should be able to import the package with the following command:

$ python
>>> import l2cs

Usage

Detect faces and predict gaze from a webcam

import pathlib

import cv2
import torch

from l2cs import Pipeline, render

CWD = pathlib.Path.cwd()

gaze_pipeline = Pipeline(
    weights=CWD / 'models' / 'L2CSNet_gaze360.pkl',
    arch='ResNet50',
    device=torch.device('cpu')  # or torch.device('cuda') for GPU
)

# Grab a single frame from the webcam
cap = cv2.VideoCapture(0)
_, frame = cap.read()

# Process frame and visualize
results = gaze_pipeline.step(frame)
frame = render(frame, results)
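
For continuous tracking, the same two calls can run in a loop. Below is a minimal sketch, assuming the gaze_pipeline defined above and standard OpenCV display calls; note that frames with no detected faces may raise an error (see the Issues section below):

# Read webcam frames, estimate gaze, and display until 'q' is pressed
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = gaze_pipeline.step(frame)   # detect faces and predict gaze
    frame = render(frame, results)        # draw gaze vectors onto the frame
    cv2.imshow('L2CS-Net', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()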

Demo

  • Download the pre-trained models from here and store them in models/.
  • Run:
 python demo.py \
 --snapshot models/L2CSNet_gaze360.pkl \
 --gpu 0 \
 --cam 0

This runs the demo with the L2CSNet_gaze360.pkl pretrained model.

Community Contributions

MPIIGaze

We provide code for training and testing on the MPIIGaze dataset with leave-one-person-out evaluation.

Prepare datasets

  • Download the MPIIFaceGaze dataset from here.
  • Apply data preprocessing from here.
  • Store the dataset in datasets/MPIIFaceGaze.

Train

 python train.py \
 --dataset mpiigaze \
 --snapshot output/snapshots \
 --gpu 0 \
 --num_epochs 50 \
 --batch_size 16 \
 --lr 0.00001 \
 --alpha 1

This performs leave-one-person-out training automatically and stores the models in output/snapshots.

Test

 python test.py \
 --dataset mpiigaze \
 --snapshot output/snapshots/snapshot_folder \
 --evalpath evaluation/L2CS-mpiigaze  \
 --gpu 0

This performs leave-one-person-out testing automatically and stores the results in evaluation/L2CS-mpiigaze.

To get the average leave-one-person-out accuracy, use:

 python leave_one_out_eval.py \
 --evalpath evaluation/L2CS-mpiigaze  \
 --respath evaluation/L2CS-mpiigaze

This reads the evaluation path and writes the average leave-one-out gaze accuracy to evaluation/L2CS-mpiigaze.

Gaze360

We provide code for training and testing on the Gaze360 dataset with a train-val-test split.

Prepare datasets

  • Download the Gaze360 dataset from here.

  • Apply data preprocessing from here.

  • Store the dataset in datasets/Gaze360.

Train

 python train.py \
 --dataset gaze360 \
 --snapshot output/snapshots \
 --gpu 0 \
 --num_epochs 50 \
 --batch_size 16 \
 --lr 0.00001 \
 --alpha 1

This performs training and stores the models in output/snapshots.

Test

 python test.py \
 --dataset gaze360 \
 --snapshot output/snapshots/snapshot_folder \
 --evalpath evaluation/L2CS-gaze360  \
 --gpu 0

This performs testing on snapshot_folder and stores the results in evaluation/L2CS-gaze360.

l2cs-net's People

Contributors

ahmednull, capjamesg, edavalosanaya, thohemp


l2cs-net's Issues

video or image demo

Hi Ahmed, and thanks for sharing this great work!
Are you planning to share code for a video demo?

Thanks,
Carmi

Confusion about the gazeto3d function

Hi there,

Thanks for sharing your amazing work. I'm a bit confused about the output format of the model and how the gazeto3d function works. Could you please give some intuition about what the input 'gaze' and the output 'gaze_gt' refer to? Much appreciated!
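
For context, a minimal sketch of the usual spherical-to-Cartesian conversion that such a gazeto3d helper performs; the argument order (yaw, pitch) and the sign convention here are assumptions for illustration, not necessarily the repo's exact definition:

import numpy as np

def gaze_to_3d(gaze):
    # gaze = (yaw, pitch) in radians -> 3D unit gaze direction
    yaw, pitch = gaze[0], gaze[1]
    vec = np.empty(3)
    vec[0] = -np.cos(pitch) * np.sin(yaw)   # horizontal component
    vec[1] = -np.sin(pitch)                 # vertical component
    vec[2] = -np.cos(pitch) * np.cos(yaw)   # depth component (toward the camera)
    return vec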

ResNet18 Pretrained weights ?

Thank you so much for your awesome work!
By any chance, do you have the weights for a pretrained model with the ResNet18 architecture?

data preprocessing

Thank you very much for your excellent work, but I found that data_processing_gaze360.pdf on your laboratory homepage could not be found. Could you please upload a copy when you have time?

Pose of the faces?

Thanks for the code~

I am trying to find the point where the gaze vector intersects a vertical 2D plane in camera space, so I need the coordinates of the faces in camera space. Does the network already predict the pose of the faces? If so, is there a way to extract it instead of running my own head pose estimation?

about model.py

Very interesting research, thanks for publishing the code.
I am still a student and inexperienced, so please forgive me if my question is due to my lack of knowledge.
According to the definition of L2CS in model.py, it seems that the fc_fineture layer is defined but never used. What is the role of the fc_fineture layer?

Hi. Urgent

Hi Ahmednull, the "Apply data preprocessing from here" link is not working. Could you please provide a different link to the related data preprocessing method? Thank you.

question for expectation

Thanks for your great work.
I have a question: could you explain the process of calculating the expectation of the probability distribution?

What does this code mean?

 pitch_predicted = torch.sum(pitch_predicted * idx_tensor, 1) * 3 - 42
 yaw_predicted = torch.sum(yaw_predicted * idx_tensor, 1) * 3 - 42
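
For context, this is the standard soft-argmax step: a softmax over the angle bins gives a probability distribution, its expectation gives a continuous bin index, and the index is then mapped to degrees. A minimal sketch, assuming 28 bins of width 3° covering -42° to +42° as the quoted constants suggest:

import torch
import torch.nn.functional as F

n_bins = 28
idx_tensor = torch.arange(n_bins, dtype=torch.float32)  # bin indices 0..27

logits = torch.randn(1, n_bins)                  # raw per-bin scores from the network
probs = F.softmax(logits, dim=1)                 # probability distribution over bins
expected_bin = torch.sum(probs * idx_tensor, 1)  # expectation of the bin index
angle_deg = expected_bin * 3 - 42                # map bin index to degrees in [-42, 42]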

problems occurred when training

Hello author, first of all, thank you very much for open-sourcing your work. My English and coding skills are not very good, so I will briefly describe the problem. When training with the MPIIGaze dataset I get the following error: get_ignored_params() takes 1 positional argument but 2 were given.
When using the Adam optimizer, get_ignored_params, get_non_ignored_params, and get_fc_params all raise the same error. I don't know how to fix it; could you give me some advice? Thanks a lot.

Wrong data preprocessing in demo.py

The preprocessing in your demo code is not aligned with training. In the training data preprocessing, the face images are first cropped to squares, but demo.py seems to ignore this and resizes the image without keeping the aspect ratio.
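
For illustration, a minimal sketch of a square-crop-then-resize step of the kind the issue describes; the exact crop logic used in training is an assumption here:

import cv2

def crop_square_then_resize(image, x_min, y_min, x_max, y_max, size=448):
    # Expand the face box to a square before resizing, so the aspect ratio is preserved
    w, h = x_max - x_min, y_max - y_min
    side = max(w, h)
    cx, cy = x_min + w // 2, y_min + h // 2
    x0, y0 = max(cx - side // 2, 0), max(cy - side // 2, 0)
    face = image[y0:y0 + side, x0:x0 + side]
    return cv2.resize(face, (size, size))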

the best result

Hello,
Thanks for your great work. I have a question. When I replicated the work, I got a result better than 3.92 on MPIIFaceGaze. However, when I tested your pre-trained weights, I got 4.17. I want to know: is 3.92 the best result? Are all results obtained on 448x448 images?

Question regarding model inference consistency

Hi. Looking at the inference predictions on the provided image, the predictions sometimes vary a lot even when the face/eye direction barely changes. I was wondering if I can improve this a bit.

I considered a Kalman filter, but I would lose some FPS with it. Are there changes I can make at the architectural level, or something else?

Thanks !

More bins MPII Face Gaze

Hey,
first of all, thank you for the great work and for sharing the results.

I have a question about the number of bins. Was this number tuned by trial and error? Can I increase the number of bins to improve performance, or is 28 the optimal number?

If I change the number of bins, how should I change the equation used to map bins to degrees? #3

Best,
Jan

ValueError Attempting to Process Frames Without Faces

Running into a numpy ValueError: need at least one array to stack when running the pipeline on a video frame with no detected faces. It seems to point back to line 90 in the Pipeline file: pitch, yaw = self.predict_gaze(np.stack(face_imgs)). Using NumPy 1.24.3.

My guess is that this is due to face_imgs being an empty list when there are no detected faces. Working on a fix in my fork and will open a PR when complete.
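
A minimal sketch of the kind of guard such a fix might add around that line (hypothetical; the actual fix in the PR may differ):

# Skip gaze prediction when no faces were detected in the frame
if len(face_imgs) > 0:
    pitch, yaw = self.predict_gaze(np.stack(face_imgs))
else:
    pitch, yaw = np.empty(0), np.empty(0)   # no detections for this frame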

demo.py yaw and pitch

Hello, I need to convert yaw and pitch to a target point on the screen. Is this possible with this demo? Any suggestion is welcome.

Find error from readme

The command to run demo.py should be 'python demo.py --snapshot models/L2CSNet_gaze360.pkl --device 0 --cam 0'.
Also, the data_modified function referenced on line 71 of the utils module does not exist.

After fixing these two errors, demo.py runs.
Did I take the right approach?

date_modified

Hello, thank you very much for sharing such a great project. However, I seem to be encountering a problem: it tells me that the "date_modified" module cannot be found when training the model. It seems to be a custom module, but there is no definition in the source code. Can you help me with this issue? Thank you.

Why there are 15 MPIIGaze trained models?

First of all, thanks Ahmed for the work. Here is the question:

Why are there 15 MPIIGaze trained models and not just one, as with the Gaze360 dataset? And in this case, how should inference be performed?

face_detection not found

Hi, I am trying to run the demo but am not able to find the face_detection repo. Any idea where to import it from?

Is the pitch and yaw measured by your L2CS model in radians for gaze images?

I tried to use your L2CS model to detect the gaze angle of offline images, and I found that the pitch and yaw results were both 0.xxx. I guess they are radian values, but I am not sure whether they are correct. Can you add more details on the result values at your leisure? Thanks a lot!
