
Comments (26)

luminohope avatar luminohope commented on July 17, 2024 29

Hi all, I just pushed additional scripts that can crop and extract poses from in-the-wild portrait images in a way that is compatible with the FFHQ checkpoints. Hope that is useful. https://github.com/NVlabs/eg3d/blob/main/dataset_preprocessing/ffhq/preprocess_in_the_wild.py
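
For reference, a minimal way to run that script on a folder of portraits might look like the sketch below. The --indir argument and the working directory are assumptions based on the repository layout at the time of writing, so please check the script's argparse block before running.

# Hypothetical invocation of the in-the-wild preprocessing script.
# ASSUMPTION: the script accepts an --indir argument pointing at a folder of
# portrait images and is meant to be run from its own directory so that its
# relative paths (e.g. Deep3DFaceRecon_pytorch) resolve.
import subprocess

subprocess.run(
    ["python", "preprocess_in_the_wild.py", "--indir", "/path/to/my_portraits"],
    cwd="eg3d/dataset_preprocessing/ffhq",
    check=True,
)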


X-niper avatar X-niper commented on July 17, 2024 15

Hi Eric,

I also have questions about the pose extraction part. It seems the origin of deep3dfacerecon_pytorch's coordinate system is different from that of eg3d. Can you provide the translation vector between them? I am not sure whether they differ only in the origin; they may also differ in axis orientation. If so, the whole 4x4 transformation matrix is needed.

Thank you!

Hi, I get similar results using the following code. I am not sure it's correct, but the 4x4 extrinsics look similar to the extrinsics in the provided dataset.json file. It seems that the camera distance is assumed to be fixed at 2.7. You can give it a try; I hope the code helps.

import numpy as np 

def compute_rotation(angles):
    x, y, z = angles
    rot_x = np.array([
        [1, 0, 0],
        [0, np.cos(x), -np.sin(x)],
        [0, np.sin(x), np.cos(x)]
    ])
    rot_y = np.array([
        [np.cos(y), 0, np.sin(y)],
        [0, 1, 0],
        [-np.sin(y), 0, np.cos(y)]
    ])
    rot_z = np.array([
        [np.cos(z), -np.sin(z), 0],
        [np.sin(z), np.cos(z), 0],
        [0, 0, 1]
    ])
    return np.matmul(rot_z, np.matmul(rot_y, rot_x))


'''
euler angles and translation are estimated from deep3dfacerecon_pytorch
'''
def get_extrinsics_from_euler_and_translation(euler:np.ndarray, trans:np.ndarray):
    theta_x, theta_y, theta_z = euler[0], euler[1], euler[2]
    theta_x = np.pi - theta_x
    theta_y = -theta_y
    theta_z = theta_z 
    rot_mat = compute_rotation([theta_x, theta_y, theta_z])
    trans_x = -trans[0]
    trans_y = trans[1]
    trans_z = np.sqrt(2.7 ** 2 - trans_x ** 2 - trans_y ** 2)
    trans_new = np.matmul(rot_mat, np.array([trans_x, trans_y, trans_z]))
    mat_4x4 = np.eye(4)
    mat_4x4[0:3, 0:3] = rot_mat
    mat_4x4[0:3, 3] = -trans_new
    return mat_4x4


if __name__ == "__main__":
    test_fname = '00039.png'
    euler_estimated = np.array([ 0.15566148, -0.18466546,-0.01471091])
    trans_estimated = np.array([0.03106074, 0.02740563, 0.07686124])
    mat_4x4 = get_extrinsics_from_euler_and_translation(euler_estimated, trans_estimated)
    print(mat_4x4)
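
As a quick sanity check, assuming the returned matrix is meant to be a cam2world matrix like the entries in dataset.json, the camera position in the last column should sit about 2.7 units from the origin and the third column of the rotation should point roughly from the camera toward the origin (that is what the dataset.json examples in this thread look like). A minimal sketch:

# Sanity checks on mat_4x4, assuming it is a cam2world matrix in the EG3D convention.
cam_pos = mat_4x4[:3, 3]                         # camera center in world coordinates
print("camera distance from origin:", np.linalg.norm(cam_pos))          # expect ~2.7

look_dir = mat_4x4[:3, 2]                        # third column of the rotation
to_origin = -cam_pos / np.linalg.norm(cam_pos)   # unit vector from camera toward origin
print("alignment with look-at-origin direction:", np.dot(look_dir, to_origin))  # expect ~1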


X-niper avatar X-niper commented on July 17, 2024 9

Hi, @blandocs @41xu @mlnyang

I am sorry to say that my guess in the get_extrinsics_from_euler_and_translation function is not exactly the same as what Eric uses in this work. Please email Eric to get the code for this step. What I want to say is that we cannot derive the exact same process without the code shared by Eric. The code is in Eric's private repo, so I can't copy it here.

Besides, please follow the runme.py script in this repo to crop your face data before feeding it into pose estimation (you can see that one function is imported from Deep3DFaceRecon_pytorch); then the poses can be extracted correctly.

Best of luck


jiaxinxie97 avatar jiaxinxie97 commented on July 17, 2024 8

Hi Eric,

I also have questions about the pose extraction part. It seems the origin of deep3dfacerecon_pytorch's coordinate system is different from that of eg3d. Can you provide the translation vector between them? I am not sure whether they differ only in the origin; they may also differ in axis orientation. If so, the whole 4x4 transformation matrix is needed.

Thank you!


X-niper avatar X-niper commented on July 17, 2024 3

Hi l4rz,

We're not planning on releasing the pose-extraction code as part of this repo but you can use Deep3DFaceRecon_pytorch to crop the images and extract camera parameters. If you email me directly, I can give you the modifications you'll need if you want to export EG3D dataset files.

Eric

Hi, Eric,

I followed this link and obtained the angles and translation using deep3dfacerecon_pytorch, and I compute the rotation matrix using the following function:

def compute_rotation(angles):
    x, y, z = angles 
    rot_x = np.array([
        [1, 0, 0],
        [0, np.cos(x), -np.sin(x)],
        [0, np.sin(x), np.cos(x)]
    ])
    rot_y = np.array([
        [np.cos(y), 0, np.sin(y)],
        [0, 1, 0],
        [-np.sin(y), 0, np.cos(y)]
    ])
    rot_z = np.array([
        [np.cos(z), -np.sin(z), 0],
        [np.sin(z), np.cos(z), 0],
        [0, 0, 1]
    ])
    return np.matmul(rot_z, np.matmul(rot_y, rot_x))

The rotation matrices I get are different from the rotation matrices you provide in dataset.json. May I know the angle-to-rotation-matrix function you use when processing the data? It seems that I need to multiply the second and third rows by -1 to get the same results.
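
Concretely, the flip I am describing is just a fixed sign matrix applied on the left of the rotation from compute_rotation (equivalent to composing with a 180-degree rotation about the x-axis). This is only the empirical adjustment I observed, not a derivation of the convention; here angles stands for the (x, y, z) coefficients estimated by deep3dfacerecon_pytorch:

# Empirical observation only: negating the 2nd and 3rd rows of the rotation
# brings it close to the dataset.json rotations. diag(1, -1, -1) is the same
# as a 180-degree rotation about the x-axis.
flip = np.diag([1.0, -1.0, -1.0])
R_deep3d = compute_rotation(angles)   # 'angles' = Euler coefficients from deep3dfacerecon_pytorch
R_flipped = flip @ R_deep3d           # rows 2 and 3 negated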

In summary, I wonder how to get the 4x4 extrinsics from the angles and translation estimated with deep3dfacerecon_pytorch.


e4s2022 avatar e4s2022 commented on July 17, 2024 2

Hi, @X-niper

I have also noticed this issue: the rotation matrix obtained directly from compute_rotation is different from the provided extrinsics in dataset.json. Could you please explain the convention differences between Deep3DFaceRecon_pytorch and EG3D, and elaborate a bit on how your code bridges this gap?

I am a newbie just starting out in 3D vision. Thanks in advance.


blandocs avatar blandocs commented on July 17, 2024 2

Hi @ericryanchan and @X-niper, thank you for the great work and comments :)
I have read all the comments in this thread and run various experiments, but I failed to find a solution.

My goal is to generate dataset.json (which contains the extrinsic matrices) for arbitrary images.
The steps below are what I've done so far.

Step 1. Get coeffs from Deep3DRecon (also used mtcnn to detect the face).

# By using facerecon_model.py we can get the coeff
output_coeff = self.net_recon(self.input_img)
result = self.facemodel.split_coeff(output_coeff)

# get euler angle: result['angle']
# get translation info: result['trans']

Step 2. Get the extrinsic matrix using @X-niper's code.

mat_4x4 = get_extrinsics_from_euler_and_translation(result['angle'], result['trans'])

Step 3. Compare my mat_4x4 result with the ground-truth extrinsic matrix in EG3D's dataset.json. I followed the EG3D FFHQ preprocessing and used exactly the same image that EG3D used.

Unfortunately, the extrinsic matrices are similar but not identical, and accordingly the image rendered using my extrinsics is also different from the original one. I used the 00001.png image.

# my estimated 4x4 extrinsic matrix: 
[[ 0.98808561 -0.03921765  0.14882481 -0.40602439]
 [-0.00870278 -0.97967942 -0.20038081  0.550637  ]
 [ 0.15365908  0.19669821 -0.96834847  2.61188504]
 [ 0.          0.          0.          1.        ]]
# euler angle: (2.941191690164157, -0.1542702688887573, -0.008807490044743849)

# ground-truth 4x4 extrinsic matrix from dataset.json(EG3D): 
[[ 0.98884588  0.01061465  0.14856379 -0.37894215]
 [ 0.04037552 -0.97921258 -0.19877771  0.5142959 ]
 [ 0.14336558  0.20255886 -0.96871883  2.62333806]
 [ 0.          0.          0.          1.        ]]
# euler angle: (2.9354628454139498, -0.1438612908670918, 0.04080828469625245)
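
To quantify "similar but different", I compared the two matrices directly (values copied from above); a rough sketch:

import numpy as np

# Gap between my estimated extrinsics and the dataset.json ones for 00001.png.
est = np.array([[ 0.98808561, -0.03921765,  0.14882481, -0.40602439],
                [-0.00870278, -0.97967942, -0.20038081,  0.550637  ],
                [ 0.15365908,  0.19669821, -0.96834847,  2.61188504],
                [ 0.        ,  0.        ,  0.        ,  1.        ]])
gt  = np.array([[ 0.98884588,  0.01061465,  0.14856379, -0.37894215],
                [ 0.04037552, -0.97921258, -0.19877771,  0.5142959 ],
                [ 0.14336558,  0.20255886, -0.96871883,  2.62333806],
                [ 0.        ,  0.        ,  0.        ,  1.        ]])

R_rel = est[:3, :3].T @ gt[:3, :3]                  # relative rotation between the two poses
angle = np.degrees(np.arccos(np.clip((np.trace(R_rel) - 1) / 2, -1.0, 1.0)))
print("rotation gap (degrees):", angle)
print("camera-position gap:", np.linalg.norm(est[:3, 3] - gt[:3, 3]))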

Left: inversion result rendered using the ground-truth extrinsic parameters from dataset.json (EG3D).
Right: inversion result rendered using my estimated extrinsic parameters.
[screenshot comparing the two renders]

Could you give me any advice, or share the camera matrix extraction code?

Thank you in advance and have a great day!


blandocs avatar blandocs commented on July 17, 2024 1

@bernakabadayi

Sorry for the confusion, I used the 00002.png camera parameters.
But I'm not sure this mix-up is the problem.

After using the code from the author, I was able to resolve it completely :)


ericryanchan avatar ericryanchan commented on July 17, 2024

Hi l4rz,

We're not planning on releasing the pose-extraction code as part of this repo but you can use Deep3DFaceRecon_pytorch to crop the images and extract camera parameters. If you email me directly, I can give you the modifications you'll need if you want to export EG3D dataset files.

Eric


X-niper avatar X-niper commented on July 17, 2024

Hi @bd20222,

We can get the Euler angles from the extrinsics in dataset.json. I compared the derived Euler angles with the ones estimated by Deep3DFaceRecon_pytorch and found the differences. When reading the generate_sample.py code, I found that the camera distance is fixed at 2.7, and this guess is validated by the translation obtained from the extrinsics in dataset.json.

I attach the code below, with which you can get the Euler angles from a rotation matrix.

def compute_angle_from_matrix(matrix3x3):
    M = matrix3x3
    theta_y = np.arcsin(-M[2,0])
    theta_z = np.arctan2(M[1,0], M[0,0])
    theta_x = np.arctan2(M[2,1], M[2,2])
    return (theta_x, theta_y, theta_z)
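
For example, applying it to the 00001.png cam2world matrix from dataset.json that was posted above recovers Euler angles matching the ones quoted in this thread, and the norm of the translation column (the camera center) comes out at about 2.7:

import numpy as np

gt = np.array([[ 0.98884588,  0.01061465,  0.14856379, -0.37894215],
               [ 0.04037552, -0.97921258, -0.19877771,  0.5142959 ],
               [ 0.14336558,  0.20255886, -0.96871883,  2.62333806],
               [ 0.        ,  0.        ,  0.        ,  1.        ]])

print(compute_angle_from_matrix(gt[:3, :3]))   # roughly (2.935, -0.144, 0.041)
print(np.linalg.norm(gt[:3, 3]))               # camera distance, roughly 2.7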


e4s2022 avatar e4s2022 commented on July 17, 2024

@X-niper

Thanks, I understand the general steps. Then can you explain the meaning of

theta_x = np.pi - theta_x
theta_y = -theta_y
theta_z = theta_z 

Maybe through an intuitive example? I guess it has something to do with the order and orientation of the axes.

I also found that the implementation of the LookAtPoseSampler class in camera_utils.py is similar to your angle transformation.

class LookAtPoseSampler:
    """
    Same as GaussianCameraPoseSampler, except the
    camera is specified as looking at 'lookat_position', a 3-vector.

    Example:
    For a camera pose looking at the origin with the camera at position [0, 0, 1]:
    cam2world = LookAtPoseSampler.sample(math.pi/2, math.pi/2, torch.tensor([0, 0, 0]), radius=1)
    """

    @staticmethod
    def sample(horizontal_mean, vertical_mean, lookat_position, horizontal_stddev=0, vertical_stddev=0, radius=1, batch_size=1, device='cpu'):
        h = torch.randn((batch_size, 1), device=device) * horizontal_stddev + horizontal_mean
        v = torch.randn((batch_size, 1), device=device) * vertical_stddev + vertical_mean
        v = torch.clamp(v, 1e-5, math.pi - 1e-5)

        theta = h 
        v = v / math.pi
        phi = torch.arccos(1 - 2*v)  

        camera_origins = torch.zeros((batch_size, 3), device=device)

        camera_origins[:, 0:1] = radius*torch.sin(phi) * torch.cos(math.pi-theta)
        camera_origins[:, 2:3] = radius*torch.sin(phi) * torch.sin(math.pi-theta)
        camera_origins[:, 1:2] = radius*torch.cos(phi)

        # forward_vectors = math_utils.normalize_vecs(-camera_origins)
        forward_vectors = math_utils.normalize_vecs(lookat_position - camera_origins) 
        return create_cam2world_matrix(forward_vectors, camera_origins)
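
For reference, following the docstring above, the sampled cam2world matrix can be flattened together with the FFHQ intrinsics seen in dataset.json (focal 4.2647, principal point 0.5 in normalized image coordinates) into the 25-value conditioning label. This is just a sketch assuming camera_utils is importable and a radius of 2.7; the actual FFHQ conditioning may use a different look-at point.

import math
import torch
from camera_utils import LookAtPoseSampler   # import path may differ in your checkout

# Frontal-ish camera at radius 2.7 looking at the origin, per the docstring example.
cam2world = LookAtPoseSampler.sample(math.pi / 2, math.pi / 2,
                                     torch.tensor([0.0, 0.0, 0.0]), radius=2.7)

# Intrinsics as they appear in the FFHQ dataset.json entries.
intrinsics = torch.tensor([[4.2647, 0.0, 0.5],
                           [0.0, 4.2647, 0.5],
                           [0.0, 0.0, 1.0]])

# 25-value label: flattened 4x4 cam2world followed by flattened 3x3 intrinsics.
label = torch.cat([cam2world.reshape(-1, 16), intrinsics.reshape(-1, 9)], dim=1)
print(label.shape)   # torch.Size([1, 25])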


41xu avatar 41xu commented on July 17, 2024

Hi Eric,
I also have questions about the pose extraction part. It seems the origin of deep3dfacerecon_pytorch's coordinate system is different from that of eg3d. Can you provide the translation vector between them? I am not sure whether they differ only in the origin; they may also differ in axis orientation. If so, the whole 4x4 transformation matrix is needed.
Thank you!

Hi, I get similar results using the following code. I am not sure it's correct, but the 4x4 extrinsics look similar to the extrinsics in the provided dataset.json file. It seems that the camera distance is assumed to be fixed at 2.7. You can give it a try; I hope the code helps.

import numpy as np 

def compute_rotation(angles):
    x, y, z = angles
    rot_x = np.array([
        [1, 0, 0],
        [0, np.cos(x), -np.sin(x)],
        [0, np.sin(x), np.cos(x)]
    ])
    rot_y = np.array([
        [np.cos(y), 0, np.sin(y)],
        [0, 1, 0],
        [-np.sin(y), 0, np.cos(y)]
    ])
    rot_z = np.array([
        [np.cos(z), -np.sin(z), 0],
        [np.sin(z), np.cos(z), 0],
        [0, 0, 1]
    ])
    return np.matmul(rot_z, np.matmul(rot_y, rot_x))


'''
euler angles and translation are estimated from deep3dfacerecon_pytorch
'''
def get_extrinsics_from_euler_and_translation(euler:np.ndarray, trans:np.ndarray):
    theta_x, theta_y, theta_z = euler[0], euler[1], euler[2]
    theta_x = np.pi - theta_x
    theta_y = -theta_y
    theta_z = theta_z 
    rot_mat = compute_rotation([theta_x, theta_y, theta_z])
    trans_x = -trans[0]
    trans_y = trans[1]
    trans_z = np.sqrt(2.7 ** 2 - trans_x ** 2 - trans_y ** 2)
    trans_new = np.matmul(rot_mat, np.array([trans_x, trans_y, trans_z]))
    mat_4x4 = np.eye(4)
    mat_4x4[0:3, 0:3] = rot_mat
    mat_4x4[0:3, 3] = -trans_new
    return mat_4x4


if __name__ == "__main__":
    test_fname = '00039.png'
    euler_estimated = np.array([ 0.15566148, -0.18466546,-0.01471091])
    trans_estimated = np.array([0.03106074, 0.02740563, 0.07686124])
    mat_4x4 = get_extrinsics_from_euler_and_translation(euler_estimated, trans_estimated)
    print(mat_4x4)

Hi @X-niper,

I followed Deep3DFace and extracted the euler and trans coefficients, but the result I got for 00039 is different from yours in the code above. I wonder how you got the euler and trans values?
I use a facial landmark detector to extract the landmarks and follow the instructions in https://github.com/sicxu/Deep3DFaceRecon_pytorch#test-with-custom-images.
The facerecon network returns the coefficients via output_coeff = self.net_recon(self.input_img), so we can get euler and trans from there.
Is there anything wrong with my process?


X-niper avatar X-niper commented on July 17, 2024

@41xu Hi, I use the original FFHQ cropped images, and I use MTCNN to detect the facial landmarks.


lyx0208 avatar lyx0208 commented on July 17, 2024

Hi, @ericryanchan
Thanks for your great work! I have recently been doing inversion with your framework and getting rather good results on the FFHQ dataset using the cropping method and camera pose data from runme.py. Would it be possible to share the code for cropping and computing camera poses on in-the-wild images with me via email?


e4s2022 avatar e4s2022 commented on July 17, 2024

I also want to know how to extract the camera pose of in-the-wild images. Please let me know if you have any ideas, thank you guys. Using the exact output coefficients of Deep3DRecon directly does not seem right.


41xu avatar 41xu commented on July 17, 2024
# euler angle: (2.9354628454139498, -0.1438612908670918, 0.04080828469625245)

same


mlnyang avatar mlnyang commented on July 17, 2024

Same here. While doing inversion on CelebA-HQ, I used @X-niper's code to get the extrinsics and fed them to the generator.

[image comparison]
Left: GT.
Middle: using @X-niper's code.
Right: using a frontal-face GT extrinsic from dataset.json (picked randomly).
I used a look-at extrinsic for the visualization. It is strange that the GT extrinsics from dataset.json work well (the same holds for other samples).

@ericryanchan, could you please share the code for computing the 4x4 extrinsics from the Euler angles and translation?


bernakabadayi avatar bernakabadayi commented on July 17, 2024

Hi @ericryanchan and @X-niper, thank you for the great work and comments :) I have read all the comments in this thread and run various experiments, but I failed to find a solution.

My goal is to generate dataset.json (which contains the extrinsic matrices) for arbitrary images. The steps below are what I've done so far.

Step 1. Get coeffs from Deep3DRecon (also used mtcnn to detect the face).

# By using facerecon_model.py we can get the coeff
output_coeff = self.net_recon(self.input_img)
result = self.facemodel.split_coeff(output_coeff)

# get euler angle: result['angle']
# get translation info: result['trans']

Step 2. Get the extrinsic matrix using @X-niper's code.

mat_4x4 = get_extrinsics_from_euler_and_translation(result['angle'], result['trans'])

Step 3. Compare my mat_4x4 result with the ground-truth extrinsic matrix in EG3D's dataset.json. I followed the EG3D FFHQ preprocessing and used exactly the same image that EG3D used.

Unfortunately, the extrinsic matrices are similar but not identical, and accordingly the image rendered using my extrinsics is also different from the original one. I used the 00001.png image.

# my estimated 4x4 extrinsic matrix: 
[[ 0.98808561 -0.03921765  0.14882481 -0.40602439]
 [-0.00870278 -0.97967942 -0.20038081  0.550637  ]
 [ 0.15365908  0.19669821 -0.96834847  2.61188504]
 [ 0.          0.          0.          1.        ]]
# euler angle: (2.941191690164157, -0.1542702688887573, -0.008807490044743849)

# ground-truth 4x4 extrinsic matrix from dataset.json(EG3D): 
[[ 0.98884588  0.01061465  0.14856379 -0.37894215]
 [ 0.04037552 -0.97921258 -0.19877771  0.5142959 ]
 [ 0.14336558  0.20255886 -0.96871883  2.62333806]
 [ 0.          0.          0.          1.        ]]
# euler angle: (2.9354628454139498, -0.1438612908670918, 0.04080828469625245)

Left: inversion result rendered using the ground-truth extrinsic parameters from dataset.json (EG3D). Right: inversion result rendered using my estimated extrinsic parameters. [screenshot comparing the two renders]

Could you give me any advice, or share the camera matrix extraction code?

Thank you in advance and have a great day!

Hi @blandocs,
The matrix I got from dataset.json for this image is as follows, and it is different from your GT. Where did you get your file? Mine is from here:

gdown.download('https://drive.google.com/uc?id=14mzYD1DxUjh7BGgeWKgXtLHWwvr-he1Z', 'final_crops/dataset.json', quiet=False)

['00001.png', [0.9950790405273438, 0.016893357038497925, -0.09763375669717789, 0.24722558007395418, 0.00013333745300769806, -0.9855860471725464, -0.16917484998703003, 0.45264816608040714, -0.09908440709114075, 0.16832932829856873, -0.9807382822036743, 2.6502809568612045, 0.0, 0.0, 0.0, 1.0, 4.2647, 0.0, 0.5, 0.0, 4.2647, 0.5, 0.0, 0.0, 1.0]]
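
In case it helps, this is how I read these entries: each label has 25 values, the flattened 4x4 cam2world extrinsics followed by the flattened 3x3 intrinsics. A small sketch, assuming the usual {'labels': [[filename, values], ...]} layout of dataset.json:

import json
import numpy as np

with open('final_crops/dataset.json') as f:
    labels = dict(json.load(f)['labels'])

label = np.array(labels['00001.png'])
extrinsics = label[:16].reshape(4, 4)   # cam2world matrix
intrinsics = label[16:].reshape(3, 3)   # focal 4.2647, principal point 0.5
print(extrinsics)
print(intrinsics)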


BenjiKCF avatar BenjiKCF commented on July 17, 2024

So if I am using objects other than faces, can I still use Deep3DFaceRecon_pytorch to generate the camera matrix? How would I get a camera matrix in that case? Thanks.


usmancheema89 avatar usmancheema89 commented on July 17, 2024

Hi all, I just pushed additional scripts that can crop and extract poses from in-the-wild portrait images in a way that is compatible with the FFHQ checkpoints. Hope that is useful. https://github.com/NVlabs/eg3d/blob/main/dataset_preprocessing/ffhq/preprocess_in_the_wild.py

Thank you for the code. I tried preprocess_in_the_wild, but I'm getting weird random crops from all over the images:
Screenshot from 2022-08-11 17-48-50

Deep3DFaceRecon itself works fine, as its 3D face prediction results are close to those reported in the paper.
Screenshot from 2022-08-11 17-51-10


luminohope avatar luminohope commented on July 17, 2024

@usmancheema89 can you send me a few of these original pictures?


usmancheema89 avatar usmancheema89 commented on July 17, 2024

Sure :)
please find attached the zip file.

sample Images.zip


luminohope avatar luminohope commented on July 17, 2024

@usmancheema89 it seems most likely that the landmarks from MTCNN are failing for the incorrectly aligned examples.
Below are some debugging images.
The ones where the landmarks look correct appear to be aligned correctly.
[debugging images]
The landmark detector doesn't have to be MTCNN if you find something else that is more robust. Any 68- or 5-point-based detector should work.


usmancheema89 avatar usmancheema89 commented on July 17, 2024

@luminohope Thanks for figuring out the issue. I assumed the landmark points detected by Deep3DFaceRecon were being used downstream; they seemed to be working fine, which confused me.
I have tested insightface's landmark detection, and that seems to be working. I will test it out and update :)
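
In case it helps anyone, here is roughly how I'm swapping in insightface for the 5-point landmarks and writing them out per image. The detections/<name>.txt layout (one "x y" pair per line) is my assumption about what Deep3DFaceRecon_pytorch expects for custom images, so double-check against its README, and the folder names are just my setup:

import os
import cv2
import numpy as np
from insightface.app import FaceAnalysis

# Sketch: replace MTCNN with insightface's 5-point landmarks and dump them in a
# detections/<name>.txt file per image (ASSUMPTION: one "x y" landmark per line,
# five lines per image, as used by Deep3DFaceRecon_pytorch for custom images).
app = FaceAnalysis()
app.prepare(ctx_id=0, det_size=(640, 640))

img_dir = 'my_portraits'                       # hypothetical input folder
out_dir = os.path.join(img_dir, 'detections')
os.makedirs(out_dir, exist_ok=True)

for name in os.listdir(img_dir):
    if not name.lower().endswith(('.jpg', '.png')):
        continue
    img = cv2.imread(os.path.join(img_dir, name))
    faces = app.get(img)
    if not faces:
        print('no face found in', name)
        continue
    kps = faces[0].kps                         # 5 x 2 array: eyes, nose, mouth corners
    lines = ['%f %f' % (x, y) for x, y in kps]
    with open(os.path.join(out_dir, os.path.splitext(name)[0] + '.txt'), 'w') as f:
        f.write('\n'.join(lines))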


 avatar commented on July 17, 2024

@usmancheema89 it seems most likely that the landmarks from MTCNN are failing for the incorrectly aligned examples. Below are some debugging images. The ones where the landmarks look correct appear to be aligned correctly. [debugging images] The landmark detector doesn't have to be MTCNN if you find something else that is more robust. Any 68- or 5-point-based detector should work.

Hi @usmancheema89,
I have the same problem as you: lots of images were incorrectly aligned. Did you find a solution for this?
Thanks a lot!!


usmancheema89 avatar usmancheema89 commented on July 17, 2024

@Belgacemi I didn't explore the issue further as I got pulled into some other work.
But as @luminohope suggested, you may change the face landmark detection network for improved results.

