Comments (3)
Hi,
- I'm not quite sure which part is wrong, but I suggest you accumulate the depth point clouds (or run TSDF fusion) and visualise the result to check that the depth scale, camera intrinsics, and camera poses are correct. For our evaluation on TUM RGB-D, we didn't use a dataloader to read frames. Instead, we play the rosbag and use the ORB-SLAM ROS wrapper to process the frames online; vMAP subscribes to the rostopic from ORB-SLAM to obtain the posed frames and the updated keyframe poses (usually after ORB-SLAM's local BA and global BA). So the mapping on TUM RGB-D data runs in real time and finishes right after the rosbag playback ends.
- For integrating Detic, I think you can follow this demo; the "instance" info is returned in the results Python dictionary. After that, keep the semantic and instance info and build the data association (based on semantic consistency and 3D overlap) by following this function. Alternatively, I recently tried video segmentation, e.g., XMem, as the data association tracker, and that also works well.
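To sanity-check the depth scale, intrinsics, and poses as the first bullet suggests, you can back-project each depth frame into the world frame and accumulate the points; if everything is consistent, the per-frame clouds line up. This is a minimal NumPy sketch (the function name and argument layout are my own, not vMAP's code):

```python
import numpy as np

def backproject(depth_m, fx, fy, cx, cy, T_wc):
    """Back-project a metric depth map into world-frame 3D points.

    depth_m : (H, W) depth in metres (raw depth / scale already applied)
    fx, fy, cx, cy : pinhole intrinsics
    T_wc    : (4, 4) camera-to-world pose
    """
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts_c = np.stack([x, y, z, np.ones_like(z)], axis=-1).reshape(-1, 4)
    valid = pts_c[:, 2] > 0            # drop invalid (zero-depth) pixels
    return (pts_c[valid] @ T_wc.T)[:, :3]
```

Accumulating `backproject(...)` over several posed frames and viewing the merged cloud (e.g. in Open3D or MeshLab) makes a wrong depth scale or pose convention obvious: the surfaces will appear duplicated or drift apart.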
Hope it helps, and please let me know if it still doesn't work well!
from vmap.
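The data association described in the second bullet (match detections by semantic consistency plus overlap) can be sketched in a few lines. This toy version uses 2D mask IoU rather than vMAP's actual 3D overlap, and the names `mask_iou`/`associate` and the 0.5 threshold are illustrative assumptions:

```python
import numpy as np

def mask_iou(mask_a, mask_b):
    """IoU between two boolean instance masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union > 0 else 0.0

def associate(new_masks, new_labels, tracked_masks, tracked_labels, iou_thresh=0.5):
    """Greedily match each new detection to a tracked object with the
    same semantic label and the highest overlap above the threshold."""
    matches = {}
    for i, (m, lbl) in enumerate(zip(new_masks, new_labels)):
        best_j, best_iou = -1, iou_thresh
        for j, (tm, tlbl) in enumerate(zip(tracked_masks, tracked_labels)):
            if tlbl != lbl:
                continue  # require semantic consistency
            iou = mask_iou(m, tm)
            if iou > best_iou:
                best_j, best_iou = j, iou
        if best_j >= 0:
            matches[i] = best_j    # new detection i -> tracked object j
    return matches
```

Unmatched detections would then spawn new object models, which is the behaviour the linked function implements with 3D overlap instead of 2D masks.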
Thanks for your comments. I changed the depth scale for the TUM RGB-D dataset from 1000 to 5000 and got good results in the subsequent reconstruction, as shown below. I'm not sure whether this reconstruction quality matches the results shown in your paper.
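The fix above matches the TUM RGB-D convention: depth images are 16-bit PNGs where a pixel value of 5000 corresponds to 1 metre, and 0 marks missing depth. A minimal conversion helper (the function name is my own):

```python
import numpy as np

TUM_DEPTH_SCALE = 5000.0  # TUM RGB-D: raw value 5000 == 1 metre

def depth_to_metres(depth_raw, scale=TUM_DEPTH_SCALE):
    """Convert a raw 16-bit TUM depth image to metres.

    Pixels with raw value 0 carry no measurement and stay 0.
    """
    depth_m = depth_raw.astype(np.float32) / scale
    depth_m[depth_raw == 0] = 0.0
    return depth_m
```

Loaders written for other datasets often assume a factor of 1000 (millimetres), which is exactly the 5x scale mismatch that breaks the reconstruction.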
In addition, I still have some doubts. I tried to capture my own dataset with an RGB-D camera, used COLMAP to estimate the camera extrinsics, and used Track-Anything to prepare the object masks, but both the iMAP and vMAP reconstructions fell short of the expected quality. What I want to ask is:
- For the scenes mentioned in the paper, how many RGB images generally need to be prepared for reconstruction?
- Does the pose need to be particularly accurate? Can the poses estimated by COLMAP meet vMAP's requirements?
from vmap.
Hi @bilibilijin, thanks for your updates.
Yeah, I think the results you showed match.
- For Replica and ScanNet we process all the frames sequentially. For TUM RGB-D and our own Kinect recordings, we play the rosbag in real time and report the reconstruction results when the sequence ends. The real-time processing runs at around 5 Hz, i.e. a frame is taken roughly every 0.2 s, so the total number of frames will be around {sequence_time/0.2s}. For example, our Kinect recordings are around 170 s long, so the total number of processed frames will be around 700.
- I don't think the pose needs to be super accurate, as we adopt continually updated poses from ORB-SLAM3. COLMAP is definitely accurate enough.
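The frame-budget arithmetic above can be checked directly; note that {sequence_time/0.2s} is an upper bound at an exact 5 Hz rate, and the count actually kept online is somewhat lower (hence ~700 rather than the ideal figure for a ~170 s sequence):

```python
def expected_frames(sequence_time_s, processing_hz=5.0):
    """Upper bound on frames processed when mapping online at a fixed rate."""
    return int(sequence_time_s * processing_hz)

# A 170 s recording at an exact 5 Hz gives 850 frames at most;
# frames dropped during online processing reduce the real count.
```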
from vmap.