
Comments (23)

DerKleineLi commented on August 16, 2024

@mli0603 @hugoycj Thank you for your help. I think I found the issue:
The transforms.json generated by convert_tnt_to_json.py points the file paths to the uncalibrated images. Instead of `"file_path": "images/xxx.jpg"`, it should be `"file_path": "dense/images/xxx.jpg"`.
With this fixed, I'm getting good results at iteration 15k with batch size 16, grad_accum_iter=1 on 1 GPU:
[images: results at iteration 15k]
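For anyone hitting the same issue, redirecting the paths can be scripted; a minimal sketch (the scene path and the transforms.json frame layout are assumptions based on this thread, not part of the official tooling):

```python
import json
from pathlib import Path

def redirect_to_undistorted(meta: dict) -> dict:
    """Rewrite each frame's file_path from images/... to dense/images/...
    so the dataloader reads the undistorted images."""
    for frame in meta.get("frames", []):
        fp = frame.get("file_path", "")
        if fp.startswith("images/"):
            frame["file_path"] = "dense/" + fp
    return meta

# Example: patch a scene's transforms.json in place (hypothetical path).
meta_file = Path("data/TNT/Meetingroom") / "transforms.json"
if meta_file.exists():
    meta = redirect_to_undistorted(json.loads(meta_file.read_text()))
    meta_file.write_text(json.dumps(meta, indent=2))
```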

from neuralangelo.

hugoycj commented on August 16, 2024

@DerKleineLi @mli0603 Some updates on the Meeting Room training results at 60K steps:
[image: rgb_render_60000]
[image: val_vis_normal_60000]

I used BlenderNeuralangelo to manually regenerate the bounding box and sphere. I also changed ray_num from 512 to 2048 to accelerate training.


MulinYu commented on August 16, 2024

Here are the results:
[images: normal, barn]


mli0603 commented on August 16, 2024

Hi @MulinYu

Thank you for your interest in the project!

We recently found a bug in the T&T preprocessing. It was fixed in 62483fe. Can you check if this fixes your issue?


DerKleineLi commented on August 16, 2024

Hi @mli0603

I don't think the fix works, as bounding_box only affects the aabb_range in transforms.json, which is not used during training.

With the latest commit I tried to train Meetingroom and got a similar result to the one above:
[image: Untitled-1]
From left to right: GT, render, normal, depth. This is the evaluation result at iteration 80k.

I followed #14 to set up the hyperparameters using grad_accum_iter, which should match the experiment in the paper:

```yaml
_parent_: projects/neuralangelo/configs/base.yaml

max_iter: 4000000

wandb_scalar_iter: 800
wandb_image_iter: 80000
validation_iter: 40000

model:
    object:
        sdf:
            mlp:
                inside_out: True   # True for Meetingroom.
            encoding:
                coarse2fine:
                    init_active_level: 8
                    step: 40000
    appear_embed:
        enabled: True
        dim: 8

data:
    type: projects.neuralangelo.data
    root: .../data/TNT/Meetingroom
    num_images: 371  # The number of training images.
    train:
        image_size: [835,1500]
        batch_size: 16
        subset:
    val:
        image_size: [300,540]
        batch_size: 16
        subset: 16
        max_viz_samples: 16

trainer:
    grad_accum_iter: 8

optim:
    sched:
        warm_up_end: 40000
        two_steps: [2400000,3200000]
```

Do you have any intuition on how the result should look at this iteration, and what could be wrong with my experiment?
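For reference, if grad_accum_iter simply multiplies the per-step batch (my reading of #14, not verified against the code), the effective batch size with the settings above would be:

```python
batch_size = 16        # data.train.batch_size in the config above
grad_accum_iter = 8    # trainer.grad_accum_iter in the config above

# Gradients are accumulated over grad_accum_iter mini-batches before each
# optimizer step, so the effective batch size per step is the product.
effective_batch_size = batch_size * grad_accum_iter
print(effective_batch_size)  # 128
```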


DerKleineLi commented on August 16, 2024

It seems that the bounding sphere has the wrong size in my case:
[image: bounding sphere visualization]
I will try to fix that and update with the result. Thanks for the hint in the latest README :)


mli0603 commented on August 16, 2024

Hi @DerKleineLi

Good news! Glad you have found the problem.

However, I wonder whether you used preprocess_tnt.sh to generate the bounding regions. I would expect the bounding region to work fairly well for T&T; otherwise it is a bug and I need to fix it.


DerKleineLi commented on August 16, 2024

Thanks for the comment @mli0603!

For Meetingroom I manually downloaded the images and the COLMAP reconstruction from the official TNT release, placed the files as described in DATA_PROCESSING.md, and ran

```shell
python projects/neuralangelo/scripts/convert_tnt_to_json.py --tnt_path .../data/TNT
```

For the visualization I ran projects/neuralangelo/scripts/visualize_colmap.ipynb with

```python
colmap_path = ".../data/TNT/Meetingroom"
json_fname = f"{colmap_path}/transforms.json"
```

Hope it helps!


mli0603 commented on August 16, 2024

Hi @DerKleineLi

If you look at the notebook, there is a `read_scale=0.25` that changes the visualization. If you set `read_scale=1.0`, it will use the default bounding radius.

There is no bug in the preprocessing, so I think your issue lies somewhere else.


Willyzw commented on August 16, 2024

Hi, I can confirm the observed degraded results.

So far I have trained the Neuralangelo model on the Barn and Courthouse scenes. Training logs on Wandb can be accessed here: https://api.wandb.ai/links/willyzw/whf9tz8o

I mostly followed the default configuration, with a minor change to the batch size (1 for Courthouse and 2 for Barn), as I thought this might have some impact. Any insights would be helpful!


MulinYu commented on August 16, 2024

> Hi @MulinYu
>
> Thank you for your interest in the project!
>
> We recently found a bug in T&T preprocessing. It was fixed in 62483fe. Can you check if this fixes your issue?

Hello,

I've used the updated script to produce a new JSON file for Barn and retrained with a batch size of 16. However, the outcome is still unsatisfactory after 70,000 iterations. Could you please share your JSON files for TNT? If I still get bad results with your JSON files, it would indicate that the preprocessing stage is not the source of the issue.
[images: new-barn-normal, new-barn]

Best,
Mulin


mli0603 commented on August 16, 2024

Thanks for this info. We are looking into this issue. We will update once we pin down the error.


DerKleineLi commented on August 16, 2024

@hugoycj
Amazing result! Could you share more details about your pipeline?
Did you use the official colmap reconstruction of TNT? How large is the bounding box? Have you modified other hyperparameters?
Thank you!


hugoycj commented on August 16, 2024

Yep. I followed the same steps as the provided convert_tnt_to_json.py script, then calculated the bounding box using Blender. One interesting thing I found for the Meetingroom scene is that the sparse point cloud is very noisy, which makes the auto-generated bounding box around 2-3 times larger than the real scene; I guess this may make it difficult to converge.
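One generic way to keep a few noisy sparse points from inflating an auto-generated bounding box is to clip each axis to percentiles instead of the raw min/max. A rough sketch of the idea (this is not what convert_tnt_to_json.py actually does):

```python
import numpy as np

def robust_aabb(points: np.ndarray, lo: float = 1.0, hi: float = 99.0) -> np.ndarray:
    """Per-axis bounding box from the lo/hi percentiles of the point cloud,
    so a handful of noisy outlier points cannot blow up the box."""
    mins = np.percentile(points, lo, axis=0)
    maxs = np.percentile(points, hi, axis=0)
    return np.stack([mins, maxs], axis=1)  # shape (3, 2): per-axis [min, max]

# Demo: a single far-away outlier barely affects the box.
pts = np.concatenate([np.random.default_rng(0).normal(size=(1000, 3)),
                      [[100.0, 100.0, 100.0]]])
print(robust_aabb(pts))
```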

Here are the updated params in transforms.json under the datasets/tanks_and_temples/Meetingroom directory. I think you could replace these params directly in your JSON file and visualize it again.

```json
"aabb_scale": 16.0,
"aabb_range": [
  [
    -7.6214215713955715,
    6.522051862701991
  ],
  [
    -3.6237776073426597,
    4.891977370864412
  ],
  [
    -11.059166537834281,
    9.311011724588386
  ]
],
"sphere_center": [
  -0.5496848543467903,
  0.6340998817608761,
  -0.8740774066229475
],
"sphere_radius": 11.449168401589803
```
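As a quick sanity check when tweaking these values by hand, one can compute how far the box corners are from the sphere center (a generic helper, not part of the repo):

```python
import itertools
import math

def max_corner_distance(aabb_range, center):
    """Distance from `center` to the farthest corner of the axis-aligned box,
    where aabb_range is a list of per-axis [min, max] pairs."""
    corners = itertools.product(*aabb_range)
    return max(math.dist(corner, center) for corner in corners)

# Demo on a unit cube centered at the origin: the farthest corner is sqrt(3) away.
cube = [[-1.0, 1.0], [-1.0, 1.0], [-1.0, 1.0]]
print(max_corner_distance(cube, [0.0, 0.0, 0.0]))
```

If the returned distance exceeds sphere_radius, part of the box lies outside the bounding sphere; whether that matters depends on how the sphere is used downstream.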


hugoycj commented on August 16, 2024

I have also tried the COLMAP scripts on my own data. The results are still far from good, maybe due to the training time, but we can observe that some finely detailed regions are reconstructed:
[image: reconstruction of custom data]


hugoycj commented on August 16, 2024

One simple suggestion for training on custom data: try to capture video in ultrawide mode; this could make the pose estimation more accurate and may be beneficial to Neuralangelo training.


hugoycj commented on August 16, 2024

@mli0603
[image: PSNR curve on MeetingRoom]
I am curious about the dramatic drop in PSNR between 20k and 30k steps when training on MeetingRoom. Do you have any idea why this happens?


chenhsuanlin commented on August 16, 2024

@DerKleineLi thanks for this! It's indeed a bug on our side. We will fix it in the coming days.


MulinYu commented on August 16, 2024

> @mli0603 @hugoycj Thank you for your help. I think I found the issue: the transforms.json generated by convert_tnt_to_json.py points the file paths to the uncalibrated images. Instead of `"file_path": "images/xxx.jpg"`, it should be `"file_path": "dense/images/xxx.jpg"`. With this fixed I'm getting good results at iteration 15k with batch size 16, grad_accum_iter=1 on 1 GPU. [images]

Dear @DerKleineLi

You are right: the generated JSON file makes the dataloader read the images from before undistortion. I also get better results on Barn after changing the image path to 'dense/images'.

Thanks a lot.
Best,
Mulin


lxxue commented on August 16, 2024

> I have also tried the colmap scripts using my own data. The results are stills far from good maybe due to the training time, but we could observe some fine detailed region is generated. [image]

Hi @hugoycj,

May I ask whether you got good results on your own data? I also tried with one sequence randomly captured with an iPhone, but the results seem to be worse than what I can get from MonoSDF.


chenhsuanlin commented on August 16, 2024

Update: we have updated the data structure of the COLMAP artifacts in the latest main branch to avoid confusion. We are now storing the raw images in images_raw and the undistorted images in images (the final images for Neuralangelo). The dense folder has also been removed. Please see the new data preparation for details on the new structure.
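A small helper to check whether a scene already follows the updated structure (directory names taken from the comment above; the demo uses a throwaway temp dir):

```python
import tempfile
from pathlib import Path

def uses_new_layout(scene: Path) -> bool:
    """True if raw images are in images_raw/, undistorted images in images/,
    and the old dense/ folder is gone."""
    return ((scene / "images").is_dir()
            and (scene / "images_raw").is_dir()
            and not (scene / "dense").exists())

# Demo on a throwaway directory:
with tempfile.TemporaryDirectory() as d:
    scene = Path(d)
    (scene / "images").mkdir()
    (scene / "images_raw").mkdir()
    print(uses_new_layout(scene))  # True
```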


mli0603 commented on August 16, 2024

The Tanks and Temples preprocessing script has been updated (see 54172de) to fix the above bug and reflect the latest data preprocessing changes.


DerKleineLi commented on August 16, 2024

Hi all, I still need some help with the TNT dataset. Here's the result I get using the latest commit:
[image: result with the latest commit]
It's much blurrier than the paper's result. If anyone has achieved a better result, I would appreciate it if you could share your settings and results. Thanks in advance!

@hugoycj I wonder if you have trained Meetingroom to the end; how was the result? I also followed your pipeline, and here's what I got at iteration 60k:
[image: result at iteration 60k]
Here's the config:

```yaml
_parent_: projects/neuralangelo/configs/base.yaml

model:
    object:
        sdf:
            mlp:
                inside_out: True   # True for Meetingroom.
            encoding:
                coarse2fine:
                    init_active_level: 8
    appear_embed:
        enabled: True
        dim: 8
    render:
        rand_rays: 2048

data:
    type: projects.neuralangelo.data
    root: .../data/TNT/Meetingroom
    num_images: 371  # The number of training images.
    train:
        image_size: [835,1500]
        batch_size: 1
        subset:
    val:
        image_size: [300,540]
        batch_size: 1
        subset: 1
        max_viz_samples: 16
```

The result is very different from yours. Could you check whether this config matches yours? If so, the problem could lie in the data preprocessing; could you then share the complete transforms.json file? The global parameters alone are not enough, because as you change the bbox, the transform of each frame also changes.

Another question regarding the Blender plugin: there are limits on how small the bbox can be made. The image shows the smallest possible bbox in the plugin. I wonder if this matches your case, and did you edit the script to allow smaller bboxes?
[image: smallest possible bbox in the Blender plugin]

