yuefanhao / superpoint-superglue-tensorrt
SuperPoint and SuperGlue with TensorRT. Deploy with C++.
License: Apache License 2.0
root@1a88e71715e9:/workspace/SuperPoint-SuperGlue-TensorRT/build# ./superpointglue_image ../config/config.yaml ../weights/ ${PWD}/../image/image0.png ${PWD}/../image/image1.png
Config file is ../config/config.yaml
model_dir: ../weights/superpoint_v1_sim_int32.onnx
First image size: 320x240
Second image size: 320x240
Building inference engine......
False at plan
Error in SuperPoint building engine. Please check your onnx model path.
I am using the provided Docker environment and ran into the problem above. I am certain that the model path is correct. To locate the failure, I modified superpoint.cpp as follows:
TensorRTUniquePtr<IHostMemory> plan{builder->buildSerializedNetwork(*network, *config)};
if (!plan) {
+ std::cout << "False at plan\n";
return false;
}
So what should I do to solve this problem?
Hi, I see that when you export the ONNX files for SuperPoint and SuperGlue, some of the post-processing is commented out. Is there a reason for running this post-processing on the CPU? Is it because the exported post-processing would use unsupported operators, or because it is faster on the CPU side?
Hi, thanks for the great repository. I desperately needed it.
While converting the SuperPoint model to TensorRT, your code compares the PyTorch output with the ONNX output using very small rtol and atol values, which is good because it ensures the ONNX output matches the PyTorch output.
However, your SuperGlue conversion script has no such check. I investigated and found that the results are substantially different. Can you explain why?
I also tried changing the ONNX opset to 12, but it did not help.
Could you advise how I can improve this?
Regards,
@yuefanhao Hi, thank you for your great work!!!
I can get the SuperPoint result, but there is no matcher result. Another issue suggests that this happens when the TensorRT version is too high, but other applications of mine depend on the newer TensorRT version, so I cannot downgrade it.
Can the program be modified to work with a newer version of TensorRT?
I ran the code on a Jetson AGX Orin and noticed that the first image pair always takes much longer to infer (about 5 seconds!). Do you know the reason? Thanks a lot!
Nice work!
But I have a question; could you help me?
If I want to use a new SuperPoint model (for example, one with added batch normalization layers), do I also need to train a new SuperGlue model?
I tried your SuperGlue ONNX conversion script and it generated an ONNX file, but "Error in SuperGlue building engine." occurred in the subsequent step. I also tried NVIDIA's trtexec tool to build the TensorRT engine, and it reported the following error:
[6] Invalid Node - /kenc/encoder/encoder.0_1/Conv
The bias tensor is required to be an initializer for the Conv operator. Try applying constant folding on the model using Polygraphy: https://github.com/NVIDIA/TensorRT/tree/master/tools/Polygraphy/examples/cli/surgeon/02_folding_constants
However, the ONNX model you provided works fine with trtexec. I noticed a small size difference between your ONNX model (48.2 MB) and my self-converted ONNX model (48.4 MB). Should I do something more after the ONNX export?
Failed when extracting features from first image.
Could you help me with this? Thanks.
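Hello, I do not see any camera intrinsics in the config file. Does the extraction and matching process not need camera intrinsics, for example for operations such as undistortion?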
crash problem in SuperPoint::build()
Hello!
I am trying to configure this project and have run into some errors; I would appreciate any suggestions.
I am a PhD student at Shanghai Jiao Tong University.
I have not set up CUDA 11.6, but I updated TensorRT to TensorRT-8.4.1.5,
and the project builds successfully.
I modified CMakeLists.txt as follows:
cmake_minimum_required(VERSION 3.5)
project(superpointglue)
set(CMAKE_CXX_STANDARD 14)
set(CMAKE_BUILD_TYPE "release")
add_definitions(-w)
set(ENABLE_BACKWARD true)
if(ENABLE_BACKWARD)
add_definitions(-D USE_BACKWARD)
endif()
add_subdirectory(${PROJECT_SOURCE_DIR}/3rdparty/tensorrtbuffer)
set(TENSORRT_ROOT $ENV{HOME}/3rdParty/TensorRT-8.4.1.5)
SET(CUDA_TOOLKIT_ROOT_DIR "/usr/local/cuda")
set(Torch_DIR "$ENV{HOME}/3rdParty/libtorch/share/cmake/Torch") # zph desktop
# find_package(OpenCV 4.2 REQUIRED) # origin
find_package(OpenCV 3.4.10 REQUIRED) # wzy
find_package(Eigen3 REQUIRED)
find_package(Torch REQUIRED) # wzy
find_package(CUDA REQUIRED)
find_package(yaml-cpp REQUIRED)
include_directories(
${PROJECT_SOURCE_DIR}
${PROJECT_SOURCE_DIR}/include
${OpenCV_INCLUDE_DIRS}
${EIGEN3_INCLUDE_DIRS}
${CUDA_INCLUDE_DIRS}
${YAML_CPP_INCLUDE_DIR}
${TORCH_INCLUDE_DIRS} # wzy
${TENSORRT_ROOT}/include # wzy
)
add_library(${PROJECT_NAME}_lib SHARED
src/super_point.cpp
src/super_glue.cpp
)
target_link_libraries(${PROJECT_NAME}_lib
nvinfer
nvonnxparser
${OpenCV_LIBRARIES}
${CUDA_LIBRARIES}
yaml-cpp
tensorrtbuffer
# ${TENSORRT_ROOT}/lib # wzy
# ${TORCH_LIBRARIES} # wzy
)
add_executable(${PROJECT_NAME}_image inference_image.cpp)
add_executable(${PROJECT_NAME}_sequence inference_sequence.cpp)
target_link_libraries(${PROJECT_NAME}_image ${PROJECT_NAME}_lib)
target_link_libraries(${PROJECT_NAME}_sequence ${PROJECT_NAME}_lib)
if (ENABLE_BACKWARD)
target_link_libraries(${PROJECT_NAME}_image dw)
endif()
Configuring with CMake gives the following output:
-- Found CUDA: /usr/local/cuda-10.2 (found version "10.2")
-- Found CUDA: /usr/local/cuda (found suitable exact version "10.2")
-- Found CUDA: /usr/local/cuda (found version "10.2")
-- Caffe2: CUDA detected: 10.2
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 10.2
-- Found cuDNN: v8.2.0 (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- /usr/local/cuda/lib64/libnvrtc.so shorthash is 08c4863f
-- Autodetected CUDA architecture(s): 7.5
-- Added CUDA NVCC flags for: -gencode;arch=compute_75,code=sm_75
-- Configuring done
-- Generating done
Running ./superpointglue_image ../config/config.yaml ../weights/ ${PWD}/../image/image0.png ${PWD}/../image/image1.png
produces the following error:
Config file is ../config/config.yaml
First image size: 320x240
Second image size: 320x240
Building inference engine......
SuperPoint and SuperGlue inference engine build success.
---------------------------------------------------------
[SuperPoint::infer]
Stack trace (most recent call last):
#4 Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in
#3 Object "./superpointglue_image", at 0x5631db9fecd9, in
#2 Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7fd36e921c86, in __libc_start_main
#1 Object "./superpointglue_image", at 0x5631db9fe8b8, in
#0 Object "/home/zph/projects/SuperPoint-SuperGlue-TensorRT/build/libsuperpointglue_lib.so", at 0x7fd372a505ce, in SuperPoint::infer(cv::Mat const&, Eigen::Matrix<double, 259, -1, 0, 259, -1>&)
Segmentation fault (Signal sent by the kernel [(nil)])
[1] 26575 segmentation fault (core dumped) ./superpointglue_image ../config/config.yaml ../weights/
When I delete the .engine file,
set the image size to 640*480 in the config, and rerun
./superpointglue_image ../config/config.yaml ../weights/ ${PWD}/../image/image0.png ${PWD}/../image/image1.png
I get the following error:
First image size: 640x480
Second image size: 640x480
Building inference engine......
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Stack trace (most recent call last):
#12 Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in
#11 Object "./superpointglue_image", at 0x56016a9fecd9, in
#10 Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7f85ea947c86, in __libc_start_main
#9 Object "./superpointglue_image", at 0x56016a9fe763, in
#8 Object "/home/zph/projects/SuperPoint-SuperGlue-TensorRT/build/libsuperpointglue_lib.so", at 0x7f85eea7393b, in SuperPoint::build()
#7 Object "/home/zph/projects/SuperPoint-SuperGlue-TensorRT/build/libsuperpointglue_lib.so", at 0x7f85eea73421, in SuperPoint::deserialize_engine()
#6 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7f85eafc22db, in operator new(unsigned long)
#5 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7f85eafc1d53, in __cxa_throw
#4 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7f85eafc1b20, in std::terminate()
#3 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7f85eafc1ae5, in
#2 Object "/usr/lib/x86_64-linux-gnu/libstdc++.so.6", at 0x7f85eafbb956, in
#1 Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7f85ea9667f0, in abort
#0 Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7f85ea964e87, in gsignal
Aborted (Signal sent by tkill() 27576 1000)
[1] 27576 abort (core dumped) ./superpointglue_image ../config/config_new.yaml ../weights/
I also tried the docker command and got:
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.6, please update your driver to a newer version, or use an earlier cuda container: unknown.
ERRO[0000] error waiting for container:
Hi,
I get the SuperPoint keypoints, but SuperGlue returns nothing: indices0 and indices1 are all -1.
I compiled and ran on Windows 11 with CUDA 12.3 and TensorRT 8.6.1.6, using the same images and models.
Thanks
-Scott
Hi, I rewrote the SuperGlue serialization part in Python, referring to your code. Although the engine is generated in the end, the inference result is wrong, and I have no idea how to fix it. Could you please give me a hand? Thanks.
My environment:
CUDA: 11.1
TensorRT: 8.4.3.1
Below is my code:
import tensorrt as trt
import cuda.ccudart as cudart
import cuda.cuda as cuda
onnx_path = 'superglue_indoor_folded.onnx'
logger = trt.Logger(trt.Logger.WARNING) # create a trt logger
builder = trt.Builder(logger) # create a builder
network = builder.create_network(1 <<
int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)) # create a network
config = builder.create_builder_config() # build a config
# set up configs
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)
# config.set_flag(trt.BuilderFlag.FP16)
config.profiling_verbosity=trt.ProfilingVerbosity.DETAILED
print(config.profiling_verbosity)
# create optimization profile()
profile = builder.create_optimization_profile()
profile.set_shape("keypoints_0",(1,1,2),(1,512,2),(1,1024,2))
profile.set_shape("keypoints_1",(1,1,2),(1,512,2),(1,1024,2))
profile.set_shape("scores_0",(1,1),(1,512),(1,1024))
profile.set_shape("scores_1",(1,1),(1,512),(1,1024))
profile.set_shape("descriptors_0",(1,256,1),(1,256,512),(1,256,1024))
profile.set_shape("descriptors_1",(1,256,1),(1,256,512),(1,256,1024))
config.add_optimization_profile(profile) # add profile to config
# parse onnx to trt
parser = trt.OnnxParser(network, logger)
success = parser.parse_from_file(onnx_path)
if not success:
    # report every parser error and stop instead of building from a broken network
    for idx in range(parser.num_errors):
        print(parser.get_error(idx))
    raise SystemExit("[error] parse onnx failed!")
print("Parse onnx to engine success")
err, stream = cuda.cuStreamCreate(0)
assert(err == cuda.CUresult.CUDA_SUCCESS)
config.profile_stream = stream
# serialize the network
plan = builder.build_serialized_network(network, config)
if plan is None:
    raise SystemExit("[error] serialize engine failed!")
# the serialized plan can be written as-is; a deserialize/serialize round trip is not needed
with open("superglue_indoor_folded.engine", 'wb') as f:
    f.write(plan)
print("output serialized engine.")
Hello, I would like to understand whether the code runs on the DLA cores. Even after changing the dla_cores setting from -1 to 0 or 1 on a Jetson device, I see no improvement in model performance. The clock frequencies of the two DLA cores show them as inactive.
It took me almost an hour to generate the SuperPoint and SuperGlue engines. Why is that, and is there any way to speed it up?
Hey! I am new to TensorRT and am working on a Jetson AGX Xavier. Can you provide code for applying these two algorithms to video (offline or via webcam)? Thanks.
Would it be possible to use a classical matching pipeline? I have played with the output from SuperPoint, but I am not sure about the data structure in which the features are stored.
Hello, when I use your sample for SuperPoint feature extraction, why is the number of features obtained 0? All the data I used is what the repository provides.
I noticed the author says that if the image size is not 320*240, the engine must be regenerated and the config file changed. My question is: can TensorRT's Dynamic Shapes feature be used to handle variable-size inputs? Does this feature also work with SuperGlue? Thanks for your answer!
./superpointglue_image ../config/config.yaml ../weights/ ${PWD}/../image/image0.png ${PWD}/../image/image1.png
Config file is ../config/config.yaml
First image size: 320x240
Second image size: 320x240
Building inference engine......
KTM assertion failure: /root/gpgpu/MachineLearning/myelin/src/compiler/kernel_timing_model/src/timingModel.cpp:399 dramBandwidth > 0
[1] 102964 abort (core dumped) ./superpointglue_image ../config/config.yaml ../weights/
How can I fix this? Thanks.
Would it be convenient to add you on WeChat?
Hi, thank you for this code.
I am trying to use HFNet to detect features and SuperGlue from this repo to match them, but I get very few matches, even on two identical images. I am doing:
(*mpExtractorLeft)(image, vKeyPoints, localDescriptors, globalDescriptors); //to extract features to cv::Keypoint / cv::Mat
Then:
//matching
std::string config_path = "config.yaml";
std::string model_dir = "weights";
Configs configs(config_path, model_dir);
std::cout << "Building inference engine......" << std::endl;
Eigen::Matrix<double, 259, Eigen::Dynamic> feature_points0, feature_points1;
std::vector<cv::DMatch> superglue_matches;
//fill eigen from desc and keys
//image 1
feature_points0.resize(259, vKeyPoints.size());
for (int i = 0; i < vKeyPoints.size(); i++) {
feature_points0(0, i) = 1;
}
for (int j = 0; j < vKeyPoints.size(); ++j) {
feature_points0(1, j) = vKeyPoints[j].pt.x;
feature_points0(2, j) = vKeyPoints[j].pt.y;
}
for (int m = 3; m < 259; ++m) {
for (int n = 0; n < localDescriptors.rows; ++n) //rows is kpnts size, cols is 256
{
feature_points0(n, m) = localDescriptors.at<double>(n,m);
}
}
feature_points1 = feature_points0;
auto superglue = std::make_shared<SuperGlue>(configs.superglue_config);
if (!superglue->build()) {
std::cerr << "Error in SuperGlue building engine. Please check your onnx model path." << std::endl;
return 0;
}
auto start = std::chrono::high_resolution_clock::now();
superglue->matching_points(feature_points0, feature_points1, superglue_matches,true);
auto end = std::chrono::high_resolution_clock::now();
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
std::cout << "matching took: " << duration.count() << std::endl;
From 712 features (the exact same image) I get 35 matches.
Can you see what I am doing wrong here?
Thank you!
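One thing that may be worth checking in the snippet above is the descriptor loop: the matrix is resized to 259 rows by N columns, yet the loop writes `feature_points0(n, m)` with the keypoint index as the row, and it reads descriptor column `m` (3 to 258) rather than `m - 3`. The sketch below shows the index mapping implied by the first two loops (row 0 score, rows 1 and 2 the x and y coordinates, rows 3 to 258 the descriptor). `FeatureMatrix`, `Keypoint` and `pack_features` are hypothetical stand-ins written with the standard library only, not types from the repository, and the float descriptor type is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the 259 x N matrix, stored column-major.
struct FeatureMatrix {
    static constexpr std::size_t kRows = 259;  // 1 score + 2 coordinates + 256 descriptor values
    std::size_t cols;
    std::vector<double> data;

    explicit FeatureMatrix(std::size_t n) : cols(n), data(kRows * n, 0.0) {}

    double& operator()(std::size_t row, std::size_t col) {
        assert(row < kRows && col < cols);
        return data[col * kRows + row];
    }
};

// Hypothetical stand-in for the x/y position of a detected keypoint.
struct Keypoint {
    double x;
    double y;
};

// descriptors[n] holds the 256 descriptor values of keypoint n (assumed to be float).
FeatureMatrix pack_features(const std::vector<Keypoint>& keypoints,
                            const std::vector<std::vector<float>>& descriptors) {
    assert(keypoints.size() == descriptors.size());
    FeatureMatrix features(keypoints.size());
    for (std::size_t n = 0; n < keypoints.size(); ++n) {
        assert(descriptors[n].size() == 256);
        features(0, n) = 1.0;             // score row (constant 1, as in the post)
        features(1, n) = keypoints[n].x;  // x coordinate row
        features(2, n) = keypoints[n].y;  // y coordinate row
        for (std::size_t d = 0; d < 256; ++d) {
            // Descriptor value d of keypoint n goes to row d + 3, column n.
            features(d + 3, n) = descriptors[n][d];
        }
    }
    return features;
}
```

With this layout each column describes one keypoint, so the row index never depends on the number of keypoints. Whether the descriptors also need normalizing or converting before matching is not covered here.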
Hello, it runs normally for me in VS2015, but in VS2019 the result of auto builder = TensorRTUniquePtr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(blogger.getTRLogger())); is empty. Is this because the model is incompatible with VS2019?
Hi,
In super_point.cpp line 280, std::bind1st is used, but it was removed from the standard library in C++17, so it does not compile with C++20.
How can I make it work with C++20?
Thanks,
-Scott
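The post does not show the expression on that line, so the following is only a generic sketch of the usual migration: a `std::bind1st(f, x)` call can typically be replaced by a lambda that captures `x` and passes it as the first argument. The function `scale_all` and its arguments are hypothetical and not taken from super_point.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Older form, which relies on std::bind1st:
//   std::transform(values.begin(), values.end(), values.begin(),
//                  std::bind1st(std::multiplies<float>(), factor));
// Lambda form with the same effect: factor is bound as the first operand.
std::vector<float> scale_all(std::vector<float> values, float factor) {
    std::transform(values.begin(), values.end(), values.begin(),
                   [factor](float v) { return factor * v; });
    return values;
}

// Small self-check that can be called from main().
void scale_all_example() {
    const std::vector<float> out = scale_all({1.0f, 2.0f}, 3.0f);
    assert(out[0] == 3.0f && out[1] == 6.0f);
}
```

A lambda avoids both the removed binder and the extra `<functional>` machinery; `std::bind` with placeholders would be another option.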
After building libsuperpointglue_lib.so and superpointglue_image, I ran:
./superpointglue_image ../config/config.yaml ../weights/ ../image/12.jpg ../image/34.jpg
but got the error "Error in SuperGlue building engine. Please check your onnx model path".
I am sure the ONNX model path is correct. After adding logging to the C++ file, I found that parser->parseFromFile returns false.
Why does SuperPoint work while SuperGlue does not?
After:
superglue->matching_points(feature_points0, feature_points1, superglue_matches);
There is no matching result.
Then I checked the condition:
if (indices0(i) < indices1.size() && indices0(i) >= 0 && indices1(indices0(i)) == i)
but indices0(i) is always equal to -1.
Could you please give some advice about this? Thanks.
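For reference, the condition quoted above is a mutual-consistency check: keypoint i in the first image is kept only if its match j is a valid index and keypoint j in the second image points back to i. A minimal sketch of that logic with plain `std::vector<int>` follows; `mutual_matches` and `count_unmatched` are hypothetical helper names, and the convention that -1 means "no match" is taken from the post.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// indices0[i] is the index in image 1 matched to keypoint i of image 0, or -1 for no match.
// indices1 is the same mapping in the other direction.
std::vector<std::pair<int, int>> mutual_matches(const std::vector<int>& indices0,
                                                const std::vector<int>& indices1) {
    std::vector<std::pair<int, int>> matches;
    for (std::size_t i = 0; i < indices0.size(); ++i) {
        const int j = indices0[i];
        if (j < 0 || static_cast<std::size_t>(j) >= indices1.size()) {
            continue;  // unmatched (-1) or out of range
        }
        if (indices1[static_cast<std::size_t>(j)] == static_cast<int>(i)) {
            matches.emplace_back(static_cast<int>(i), j);  // both directions agree
        }
    }
    return matches;
}

// Counts the -1 entries, which helps tell "nothing matched" from "matches were filtered out".
std::size_t count_unmatched(const std::vector<int>& indices) {
    return static_cast<std::size_t>(std::count(indices.begin(), indices.end(), -1));
}

// Small self-check that can be called from main().
void mutual_matches_example() {
    const std::vector<int> all_unmatched = {-1, -1, -1};
    assert(mutual_matches(all_unmatched, all_unmatched).empty());
    assert(count_unmatched(all_unmatched) == 3);
}
```

If `count_unmatched` equals the number of keypoints, as described in the post, the filter is not the cause and the problem lies upstream in the matcher output itself.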
Hi, I am using the Docker image to run your code.
When I configure the build, the output is as follows:
root@30ba2b16fff9:/workspace/SuperPoint-SuperGlue-TensorRT/build# cmake ..
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /usr/local/cuda (found version "11.6")
-- Found OpenCV: /usr (found suitable version "4.2.0", minimum required is "4.2")
-- Configuring done
-- Generating done
-- Build files have been written to: /workspace/SuperPoint-SuperGlue-TensorRT/build
root@30ba2b16fff9:/workspace/SuperPoint-SuperGlue-TensorRT/build#
After running make, I ran "./superpointglue_image ../config/config.yaml ../weights/ ${PWD}/../image/image0.png ${PWD}/../image/image1.png"
However, the process hangs at building the inference engine, as shown below.
"Config file is ../config/config.yaml
First image size: 320x240
Second image size: 320x240
Building inference engine......"
Do you have any suggestions?
Thank you very much.
I have the same issue described in #17 . That being said, I'm on an NVIDIA 3050 GPU -- any assistance would be appreciated.
When I use your engine file, the following step returns false:
context_ = TensorRTUniquePtr<nvinfer1::IExecutionContext>(engine_->createExecutionContext());
if (!context_) {
return false;
}
Do I need to generate the engine file myself?
When running SuperPoint::infer, execution reaches BufferManager buffers(engine_, 0, context_.get());
In buffers.h, line 226: vol *= tensorrt_common::volume(dims);
Because dims is all zeros (nbDims = -1, {0,0,0,0,0,0,0}),
an exception is thrown.
Thanks,
-Scott
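A value of nbDims = -1 usually indicates that the dimensions could not be resolved, so one defensive option is to validate them before computing a volume. The sketch below illustrates that idea only; `SimpleDims` is a hypothetical stand-in with the same shape as a count plus a fixed-size array, and `checked_volume` is not a function from the repository or from TensorRT.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a dimensions struct: a count plus a fixed-size array.
struct SimpleDims {
    std::int32_t nbDims;
    std::int32_t d[8];
};

// Returns the number of elements, throwing instead of computing on invalid dimensions.
std::int64_t checked_volume(const SimpleDims& dims) {
    if (dims.nbDims < 0 || dims.nbDims > 8) {
        throw std::runtime_error("invalid dimensions: nbDims out of range");
    }
    std::int64_t vol = 1;
    for (std::int32_t i = 0; i < dims.nbDims; ++i) {
        if (dims.d[i] < 0) {
            throw std::runtime_error("unresolved dimension (negative extent)");
        }
        vol *= dims.d[i];
    }
    return vol;
}

// Small self-check that can be called from main().
void checked_volume_example() {
    const SimpleDims invalid{-1, {0, 0, 0, 0, 0, 0, 0, 0}};
    bool threw = false;
    try {
        checked_volume(invalid);
    } catch (const std::runtime_error&) {
        threw = true;  // the case from the post is rejected with a clear message
    }
    assert(threw);
}
```

Failing early with a descriptive error makes it easier to see that the engine or binding setup is the real problem, rather than the buffer allocation that happens afterwards.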