
x-stereolab's Introduction

✨ meteorshowers ✨

  • 🔭 I am working on NN-Planning at XPENG Motors for the next-generation Autonomous Driving Platform.
  • 🔭 I used to work with Dr. Xiaozhi Chen on autonomous-driving 3D perception at DJI (I led the bevdet team from 0 to 1).
  • 🌱 I used to work on multi-modal AIGC (NLP-CV) with Shenjian Zhao in ByteDance's search-NLP group.


Blogs
1. HITNet: Google's first high-accuracy, real-time, end-to-end stereo matching network
2. ActiveStereoNet: Google's first deep-learning-based active (structured-light) stereo matching system
3. StereoNet: Google researchers' real-time, end-to-end deep stereo network with inference dozens of times faster

x-stereolab's People

Contributors

meteorshowers


x-stereolab's Issues

Hello, author!

"1.32 EPE_all with 8X single model 1.48EPE_all with 8X multi model on sceneflow dataset by end-to-end training."
Does "8X single model" refer to the 1/8-resolution model without hierarchical refinement?
Does "8X multi model" refer to the 1/8-resolution model refined through multiple (3) levels?
Shouldn't the refined model's error (EPE) be lower than the unrefined one's, or am I, as a beginner, misunderstanding something? Any guidance would be appreciated.
Many thanks!

Trained models

Is it possible you could share some of the trained models you have created?

Source code release?

Is there a plan to release the source code on this project? I'd like to test-drive it myself.

What is the bug?

Hello, thanks for your good implementation.
You deleted the StereoNet code in this commit (e2cd703).
What kind of bug did you find?

"Implement doubt in EdgeAwareRefinement module"

你好,
仔细对照了你的代码和论文的原文,发现有个地方好像和论文不太对应。具体问题是:
在基于边缘的视差修正的模块里里面,我看到你的代先将小分辨的视差图进行了bilinear的上采样,然后又根据原始图像的宽度和小分辨视差图的宽度计算了比值,如果大于1.5就乘以系数8.这样做是有什么考虑么?我好像没在原文看到有这个操作。

谢谢啦
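
For readers of this thread, a minimal sketch of the operation in question (not the repository's code; the 1.5 threshold and the fixed factor of 8 mirror what the issue describes): the low-resolution disparity is bilinearly upsampled, and because disparity is a pixel offset, its values must also be scaled up when the spatial resolution grows.

import torch
import torch.nn.functional as F

def upsample_disparity(low_res_disp, target_h, target_w):
    # Hypothetical sketch of the step discussed above. low_res_disp: [B, 1, h, w].
    up = F.interpolate(low_res_disp, size=(target_h, target_w),
                       mode='bilinear', align_corners=False)
    # Disparity values are pixel offsets, so they grow with the image width.
    scale = target_w / low_res_disp.shape[-1]
    if scale > 1.5:      # threshold and fixed factor taken from the issue's description
        up = up * 8      # assumes the network predicts at 1/8 resolution
    return up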

Hello, author! A question about FPS

Hello, I ran your model on a laptop with a GTX 1060 6G and measured FPS = 20, and on a Tesla K80 12G server I measured FPS = 8. Both are far below the 25-30 FPS you mention and below the figures in the paper. Besides the hardware itself, what other factors should I consider? Also, fine-tuning the SceneFlow-pretrained model on the KITTI dataset gives very poor results; could you release the fine-tuning code for reference? Thank you very much!

Thanks again!

When is the HITNet code going to be released?

Could you at least release the first version of HITNet? I just want the same accuracy as mentioned in the paper; even some offset would be fine, and the code need not be optimized.

sceneflow epe

I have tested the StereoNet model pretrained on the SceneFlow dataset. However, I got an EPE of 2.93, which is much larger than 1.38. Should the pretrained model be fine-tuned on SceneFlow to reach 1.38?

Image resolution question

I would like to ask: the images used for network training are 960x540, but when I test the network, the photos taken by my camera are much larger, e.g. 4608x3456. Feeding such an image directly gives a very poor disparity map, while downscaling the original image to 960x540 gives a good one. However, I don't know how to recover the disparity values at the original resolution. I hope the author can give some guidance, thank you very much!
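
Not an official answer, but the usual recipe for this situation is sketched below (shapes and the function name are illustrative): run the network on the 960x540 pair, then resize the predicted disparity back to the camera resolution and multiply its values by the width ratio, since disparity is a horizontal pixel offset and therefore scales with image width.

import torch
import torch.nn.functional as F

def disparity_to_full_resolution(disp_small, full_h=3456, full_w=4608):
    # disp_small: [B, 1, 540, 960] disparity predicted on the downscaled pair.
    # Resize the map spatially, then rescale its values by the width ratio
    # (e.g. 4608 / 960 = 4.8) so they are expressed in original-image pixels.
    scale_w = full_w / disp_small.shape[-1]
    disp_full = F.interpolate(disp_small, size=(full_h, full_w),
                              mode='bilinear', align_corners=False)
    return disp_full * scale_w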

Pretrained Model

Hi, the link to the pretrained model returns a 404. Could you provide the pretrained model for StereoNet? Thanks.

TypeError

cannot convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first
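
This is PyTorch's standard message when .numpy() is called on a GPU tensor; the usual fix is to move the tensor to host memory first, for example:

import torch

# Minimal reproduction and fix: .numpy() only works on CPU tensors.
t = torch.randn(2, 3, device='cuda' if torch.cuda.is_available() else 'cpu')
# t.numpy()                      # raises TypeError when t lives on the GPU
arr = t.detach().cpu().numpy()   # detach (if it requires grad) and copy to host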

Model Speed

Hi,
Thanks for sharing your code. I am confused about the speed of the model.

As described in the paper (Table 2), StereoNet with a downsampling factor of 8 and 3 refinement levels (i.e., StereoNet-8x-multi) can reach a runtime of 0.015 s on an NVIDIA Titan X, i.e., about 60 FPS.

That suggests the model you released, which contains only 1 refinement level (i.e., StereoNet-8x-single), should be even faster than 60 FPS.
However, the speed is only about 20 FPS when I run your code on my NVIDIA Titan X.

Is there something I have missed? Please help me figure it out. Thanks.
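
Not an answer from the author, but one common source of such gaps is how the runtime is measured: CUDA kernels launch asynchronously, so timings taken without a warm-up phase and without synchronising the GPU can be far off. A hedged measurement sketch (model, left and right are placeholders for your network and input tensors, assumed to already be on a CUDA device):

import time
import torch

@torch.no_grad()
def measure_fps(model, left, right, iters=100, warmup=10):
    model.eval()
    for _ in range(warmup):        # warm up cuDNN and CUDA caches
        model(left, right)
    torch.cuda.synchronize()       # wait for pending kernels before timing
    start = time.time()
    for _ in range(iters):
        model(left, right)
    torch.cuda.synchronize()       # wait for the timed kernels to finish
    return iters / (time.time() - start)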

maxdisp

How should the network input parameter maxdisp be set?

outputs dimension

I used the pretrained model. The test dataset is the default FlyingThings3D TEST set.
After 'outputs = model(imgL, imgR)' I got len(outputs)=2, which made the line 'for x in range(stages): ... output = torch.squeeze(outputs[x], 1)' crash. How can I solve this, please? Thanks.
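
A hedged workaround (names follow the issue's own snippet, so this is a guess at the surrounding test script): iterate over however many outputs the model actually returns instead of assuming it returns one tensor per stage.

import torch

for x in range(min(stages, len(outputs))):    # guard against a 2-element output list
    output = torch.squeeze(outputs[x], 1)
    # ... evaluate this stage's disparity as before ...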

Error in code

I am trying to train the network and I am getting the following errors

Traceback (most recent call last):
  File "./main8Xmulti.py", line 22, in <module>
    from models.StereoNet8Xmulti import StereoNet
ModuleNotFoundError: No module named 'models.StereoNet8Xmulti'

Changing from models.StereoNet8Xmulti import StereoNet to from models.StereoNet_single import StereoNet fixes that error, but then I get this error

Traceback (most recent call last):
  File "./main8Xmulti.py", line 365, in <module>
    main()
  File "./main8Xmulti.py", line 170, in main
    train(TrainImgLoader, model, optimizer, log, epoch)
  File "./main8Xmulti.py", line 223, in train
    losses[idx].update(loss[idx].item() / args.loss_weights[idx])
IndexError: list index out of range

loss only has 2 elements in the list, but losses has 4. How can I fix this?
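
A hedged local workaround (variable names are those in the traceback above) is to update only as many meters as the model returned losses for; the likelier root cause, though, is that StereoNet_single returns 2 outputs while main8Xmulti.py and its args.loss_weights expect 4 stages, so supplying a matching 2-element loss_weights (or training the multi-stage model) would be the cleaner fix.

for idx in range(min(len(loss), len(args.loss_weights))):
    losses[idx].update(loss[idx].item() / args.loss_weights[idx])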

ModuleNotFoundError: No module named 'dsgn'. Is there a solution?

(xstereolab) root@c792cba59ad3:/mnt/X-StereoLab-master# python3 tools/train_net_disp.py --cfg ./configs/config_disp.py --savemodel ./outputs/MODEL_NAME -btrain 4 -d 0-3 --multiprocessing-distributed
Traceback (most recent call last):
  File "tools/train_net_disp.py", line 25, in <module>
    from disparity.models import *
  File "/mnt/X-StereoLab-master/disparity/models/__init__.py", line 1, in <module>
    from .stereonet import StereoNet
  File "/mnt/X-StereoLab-master/disparity/models/stereonet.py", line 11, in <module>
    from dsgn.utils.bounding_box import compute_corners, quan_to_angle, \

Confusion about the running time

In the original paper, we can roughly see that with the "8x multi" model the EPE is 1.1 (from Table 1) and the running time is 0.015 s (from Table 2, i.e. roughly 60 FPS).

In your implementation, you get 30 FPS using the "8x single" model while getting an EPE of 1.38.
Can you explain this further? Or maybe the 0.015 s running time doesn't correspond to "8x multi"?

warp function

import torch
import torch.nn as nn

def warp(x, disp):
    """
    Warp an image/tensor (im2) back to im1 according to the horizontal disparity.
    x:    [B, C, H, W] (im2)
    disp: [B, 1, H, W] disparity along the width axis
    """
    B, C, H, W = x.size()

    # Build a pixel-coordinate mesh grid of shape [B, 2, H, W].
    xx = torch.arange(0, W).view(1, -1).repeat(H, 1)
    yy = torch.arange(0, H).view(-1, 1).repeat(1, W)
    xx = xx.view(1, 1, H, W).repeat(B, 1, 1, 1)
    yy = yy.view(1, 1, H, W).repeat(B, 1, 1, 1)
    grid = torch.cat((xx, yy), 1).float()

    if x.is_cuda:
        grid = grid.cuda()

    # Shift the x-coordinates left by the disparity to sample from im2.
    vgrid = grid
    vgrid[:, :1, :, :] = vgrid[:, :1, :, :] - disp

    # Scale grid coordinates to [-1, 1] as expected by grid_sample.
    vgrid[:, 0, :, :] = 2.0 * vgrid[:, 0, :, :] / max(W - 1, 1) - 1.0
    vgrid[:, 1, :, :] = 2.0 * vgrid[:, 1, :, :] / max(H - 1, 1) - 1.0

    vgrid = vgrid.permute(0, 2, 3, 1)
    output = nn.functional.grid_sample(x, vgrid)
    return output
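
For reference, a minimal usage sketch of the function above with illustrative shapes (a random disparity map, purely to show the expected tensor layout):

import torch

right = torch.rand(1, 3, 64, 128)       # [B, C, H, W] image to be warped
disp = torch.rand(1, 1, 64, 128) * 10   # horizontal disparity in pixels
warped = warp(right, disp)
print(warped.shape)                     # torch.Size([1, 3, 64, 128])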

Missing .py files

Hello, loss.py cannot be found under disparity.models. Was it left out of the upload?
Also, stereonet.py imports the dsgn package, which cannot be found either.

Question about building the cost volume

Hello, and thank you for open-sourcing your implementation.

I am a bit confused by the cost volume construction.
Why are the features shifted and subtracted along only one dimension (the width direction, as in the code), with nothing computed along the height direction?

for i in range(disp):
    if i > 0:
        cost[:, :, i, :, i:] = refimg_feature[:, :, :, i:] - targetimg_feature[:, :, :, :-i]
    else:
        cost[:, :, i, :, :] = refimg_feature - targetimg_feature
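
For context, a self-contained sketch of the difference-based cost volume the quoted loop builds (tensor names follow the snippet; maxdisp is assumed to be the number of candidate disparities). For rectified stereo pairs, corresponding pixels lie on the same image row, i.e. the epipolar lines are horizontal, which is why the features are shifted only along the width dimension and never along the height.

import torch

def build_cost_volume(refimg_feature, targetimg_feature, maxdisp):
    # refimg_feature, targetimg_feature: [B, C, H, W] left/right feature maps.
    B, C, H, W = refimg_feature.shape
    cost = refimg_feature.new_zeros(B, C, maxdisp, H, W)
    for i in range(maxdisp):
        if i > 0:
            # Shift the right features i pixels along the width and take the difference.
            cost[:, :, i, :, i:] = refimg_feature[:, :, :, i:] - targetimg_feature[:, :, :, :-i]
        else:
            cost[:, :, i, :, :] = refimg_feature - targetimg_feature
    return cost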

About image pyramid

Hello, I would like to ask whether the image pyramid in the hierarchical refinement network is your own idea or came out of communicating with the authors of the paper? Looking forward to your reply.

speed

After running StereoNet8Xmulti.py on a 1080 Ti, one pair of images takes about 0.27 s, which is much slower than the result you report. Could you give some advice about this? Thanks a lot!

HITNet code, please!

It's already been a month since you said the code would be released soon, but it hasn't been. Can you please share the first version of the code?

Network output

Hello, is the final output of this network a list? What do the two tensors in the list represent?

activestereonet

How can I train ActiveStereoNet? I cannot find the corresponding code.

Hello, author! AttributeError when running the code

Hello, when I run your code I get the following error:
AttributeError: module 'torch.nn.functional' has no attribute 'interpolate'
Is a file missing, or is my environment or version configured incorrectly?
Please advise; I look forward to your reply!

Many thanks!
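
One likely explanation, offered as a guess: torch.nn.functional.interpolate only exists from PyTorch 0.4.1 onwards, and older installs expose the equivalent (now deprecated) F.upsample instead. Upgrading PyTorch is the cleaner fix; for an old environment, a local shim along these lines may work:

import torch.nn.functional as F

# Hypothetical compatibility shim for pre-0.4.1 PyTorch: fall back to F.upsample,
# which accepts the same size / scale_factor / mode arguments.
if not hasattr(F, 'interpolate'):
    F.interpolate = F.upsample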
