seqnet's Introduction

SeqNet: Learning Descriptors for Sequence-Based Hierarchical Place Recognition

[ArXiv+Supplementary] [IEEE Xplore RA-L 2021] [ICRA 2021 YouTube Video]

and

SeqNetVLAD vs PointNetVLAD: Image Sequence vs 3D Point Clouds for Day-Night Place Recognition

[ArXiv] [CVPR 2021 Workshop 3DVR]


Sequence-Based Hierarchical Visual Place Recognition.

News:

Jan 27, 2024 : Download all pretrained models from here, the Nordland dataset from here, and the precomputed descriptors from here.

Jan 18, 2022 : MSLS training setup included.

Jan 07, 2022 : Single Image Vanilla NetVLAD feature extraction enabled.

Oct 13, 2021 : Added a download link for the Oxford & Brisbane Day-Night pretrained models. (Use the latest link provided above.)

Aug 03, 2021 : Added Oxford dataset files and a direct link to download the Nordland dataset. (use the latest link provided above)

Jun 23, 2021: CVPR 2021 Workshop 3DVR paper, "SeqNetVLAD vs PointNetVLAD", now available on arXiv.

Setup

Conda

conda create -n seqnet numpy pytorch=1.8.0 torchvision tqdm scikit-learn faiss tensorboardx h5py -c pytorch -c conda-forge
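A quick import check (a minimal sketch, not part of the repository) can confirm the environment was created correctly before running anything:

# Minimal environment sanity check (illustrative; run with the seqnet conda env active)
import torch, faiss, h5py, tensorboardX, sklearn
print(torch.__version__)           # expect 1.8.0
print(torch.cuda.is_available())   # True if a compatible GPU and CUDA runtime are visible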

Download

Run bash download.sh to download single image NetVLAD descriptors (3.4 GB) for the Nordland-clean dataset [a] and the Oxford dataset (0.3 GB), as well as the Nordland-trained model files (1.5 GB) [b]. Other pretrained models for Oxford and Brisbane Day-Night can be downloaded from here. [Please see the download links in the Jan 27, 2024 news item at the top.]
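To sanity-check the downloaded data, note that the precomputed descriptor files are plain NumPy arrays; a sketch like the following (the filename is a placeholder, substitute any file under ./data/descData/netvlad-pytorch/) should show one 4096-D NetVLAD descriptor per image:

import numpy as np
descs = np.load("./data/descData/netvlad-pytorch/<descriptorFile>.npy")  # placeholder filename
print(descs.shape)   # expected: (numImages, 4096)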

Run

Train

To train sequential descriptors through SeqNet on the Nordland dataset:

python main.py --mode train --pooling seqnet --dataset nordland-sw --seqL 10 --w 5 --outDims 4096 --expName "w5"

or the Oxford dataset (set --dataset oxford-pnv for a PointNetVLAD-like data split as described in the CVPR 2021 Workshop paper):

python main.py --mode train --pooling seqnet --dataset oxford-v1.0 --seqL 5 --w 3 --outDims 4096 --expName "w3"

or the MSLS dataset (--msls_trainCity and --msls_valCity are shown here with their default values):

python main.py --mode train --pooling seqnet --dataset msls --msls_trainCity melbourne --msls_valCity austin --seqL 5 --w 3 --outDims 4096 --expName "msls_w3"

To train transformed single descriptors through SeqNet:

python main.py --mode train --pooling seqnet --dataset nordland-sw --seqL 1 --w 1 --outDims 4096 --expName "w1"
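For intuition, the sequential descriptor learned by SeqNet is a temporal 1-D convolution over the single-image descriptors of a sequence, followed by sequence average pooling (SAP) and L2 normalization; --seqL sets the sequence length, --w the temporal convolution width, and --outDims the output descriptor size. The following is a minimal illustrative PyTorch sketch of that pooling, not the repository's exact implementation:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SeqNetSketch(nn.Module):
    # Illustrative only: temporal conv + sequence average pooling (SAP) + L2-norm
    def __init__(self, inDims=4096, outDims=4096, w=5):
        super().__init__()
        self.conv = nn.Conv1d(inDims, outDims, kernel_size=w)

    def forward(self, x):                  # x: [batch, seqL, inDims] single-image descriptors
        x = self.conv(x.permute(0, 2, 1))  # temporal 1-D convolution over the sequence
        x = x.mean(dim=-1)                 # sequence average pooling (SAP)
        return F.normalize(x, p=2, dim=1)  # L2-normalized sequential descriptor

seqDesc = SeqNetSketch()(torch.randn(2, 10, 4096))  # e.g. seqL=10, w=5 as in the Nordland command
print(seqDesc.shape)                                 # torch.Size([2, 4096])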

Test

On the Nordland dataset:

python main.py --mode test --pooling seqnet --dataset nordland-sf --seqL 5 --split test --resume ./data/runs/Jun03_15-22-44_l10_w5/ 

On the MSLS dataset (can change --msls_valCity to melbourne or austin too):

python main.py --mode test --pooling seqnet --dataset msls --msls_valCity amman --seqL 5 --split test --resume ./data/runs/<modelName>/

The above will reproduce results for SeqNet (S5) as per Supp. Table III on Page 10.

To obtain other results from the same table in the paper, use the commands below.
# Raw Single (NetVLAD) Descriptor
python main.py --mode test --pooling single --dataset nordland-sf --seqL 1 --split test

# SeqNet (S1)
python main.py --mode test --pooling seqnet --dataset nordland-sf --seqL 1 --split test --resume ./data/runs/Jun03_15-07-46_l1_w1/

# Raw + Smoothing
python main.py --mode test --pooling smooth --dataset nordland-sf --seqL 5 --split test

# Raw + Delta
python main.py --mode test --pooling delta --dataset nordland-sf --seqL 5 --split test

# Raw + SeqMatch
python main.py --mode test --pooling single+seqmatch --dataset nordland-sf --seqL 5 --split test

# SeqNet (S1) + SeqMatch
python main.py --mode test --pooling s1+seqmatch --dataset nordland-sf --seqL 5 --split test --resume ./data/runs/Jun03_15-07-46_l1_w1/

# HVPR (S5 to S1)
# Run S5 first and save its predictions by specifying `resultsPath`
python main.py --mode test --pooling seqnet --dataset nordland-sf --seqL 5 --split test --resume ./data/runs/Jun03_15-22-44_l10_w5/ --resultsPath ./data/results/
# Now run S1 + SeqMatch using results from above (the timestamp of `predictionsFile` would be different in your case)
python main.py --mode test --pooling s1+seqmatch --dataset nordland-sf --seqL 5 --split test --resume ./data/runs/Jun03_15-07-46_l1_w1/ --predictionsFile ./data/results/Jun03_16-07-36_l5_0.npz
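For reference, HVPR is a two-stage pipeline: the sequential descriptor (S5) retrieves a shortlist of database candidates, which S1 + SeqMatch then re-ranks by aggregating single-image descriptor distances over aligned sub-sequences. Below is a toy NumPy/FAISS sketch of that idea, assuming precomputed S5 and S1 descriptors and ignoring the sequence-boundary and localization-radius handling done by the actual code:

import numpy as np
import faiss

def hvpr_sketch(qSeq, dbSeq, qS1, dbS1, seqL=5, topK=20):
    # Stage 1: shortlist candidates using sequential (S5) descriptors
    index = faiss.IndexFlatL2(dbSeq.shape[1])
    index.add(dbSeq.astype(np.float32))
    _, shortlist = index.search(qSeq.astype(np.float32), topK)

    # Stage 2: re-rank each shortlist by sequence matching over single-image (S1) descriptors
    half, matches = seqL // 2, []
    for qIdx, cands in enumerate(shortlist):
        qs = qS1[max(0, qIdx - half): qIdx + half + 1]
        scores = []
        for c in cands:
            ds = dbS1[max(0, c - half): c + half + 1]
            L = min(len(qs), len(ds))
            scores.append(np.linalg.norm(qs[:L] - ds[:L], axis=1).sum())
        matches.append(cands[int(np.argmin(scores))])
    return matches   # best database index per query after re-ranking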

Single Image Vanilla NetVLAD Extraction

To compute the single image vanilla NetVLAD descriptors yourself (i.e. the provided precomputed .npy descriptors), use the commands below.
# Setup Patch-NetVLAD submodule from the seqNet repo:
cd seqNet 
git submodule update --init

# Download NetVLAD+PCA model
cd thirdparty/Patch-NetVLAD/patchnetvlad/pretrained_models
wget -O pitts_orig_WPCA4096.pth.tar https://cloudstor.aarnet.edu.au/plus/s/gJZvogRj4FUUQMy/download

# Compute global descriptors
cd ../../../Patch-NetVLAD/
python feature_extract.py --config_path patchnetvlad/configs/seqnet.ini --dataset_file_path ../../structFiles/imageNamesFiles/oxford_2014-12-16-18-44-24_imagenames_subsampled-2m.txt --dataset_root_dir <PATH_TO_OXFORD_IMAGE_DIR> --output_features_fullpath ../../data/descData/netvlad-pytorch/oxford_2014-12-16-18-44-24_stereo_left.npy

# example for MSLS (replace 'database' with 'query' and use different city names to compute all)
python feature_extract.py --config_path patchnetvlad/configs/seqnet.ini --dataset_file_path ../../structFiles/imageNamesFiles/msls_melbourne_database_imageNames.txt --dataset_root_dir <PATH_TO_Mapillary_Street_Level_Sequences> --output_features_fullpath ../../data/descData/netvlad-pytorch/msls_melbourne_database.npy
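As a quick check after extraction (a hypothetical snippet, not part of Patch-NetVLAD), the resulting .npy file should presumably contain one 4096-D row per line of the imageNames file passed via --dataset_file_path:

import numpy as np
feats = np.load("../../data/descData/netvlad-pytorch/oxford_2014-12-16-18-44-24_stereo_left.npy")
names = open("../../structFiles/imageNamesFiles/oxford_2014-12-16-18-44-24_imagenames_subsampled-2m.txt").read().splitlines()
print(feats.shape, len(names))   # expect feats.shape[0] == len(names) and feats.shape[1] == 4096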

Acknowledgement

The code in this repository is based on Nanne/pytorch-NetVlad. Thanks to Tobias Fischer for his contributions to this code during the development of our project QVPR/Patch-NetVLAD.

Citation

@article{garg2021seqnet,
  title={SeqNet: Learning Descriptors for Sequence-based Hierarchical Place Recognition},
  author={Garg, Sourav and Milford, Michael},
  journal={IEEE Robotics and Automation Letters},
  volume={6},
  number={3},
  pages={4305-4312},
  year={2021},
  publisher={IEEE},
  doi={10.1109/LRA.2021.3067633}
}

@misc{garg2021seqnetvlad,
  title={SeqNetVLAD vs PointNetVLAD: Image Sequence vs 3D Point Clouds for Day-Night Place Recognition},
  author={Garg, Sourav and Milford, Michael},
  howpublished={CVPR 2021 Workshop on 3D Vision and Robotics (3DVR)},
  month={Jun},
  year={2021},
}

Other Related Projects

SeqMatchNet (2021); Patch-NetVLAD (2021); Delta Descriptors (2020); CoarseHash (2020); seq2single (2019); LoST (2018)

[a] This is the clean version of the dataset that excludes images from the tunnels and red lights and can be downloaded from here.

[b] These will automatically be saved to ./data/; you can modify this path in download.sh and get_datasets.py to point to your working directory.


seqnet's Issues

Oxford dataset preprocessing

Thank you very much for your great work! Can you provide a version of the Oxford dataset prepared in the same way as the clean Nordland dataset?
I'm sorry to bother you.

use SeqNet without SeqMatch

Hello @oravus ,

Thanks for your fantastic work providing a nice way to fuse information from multiple frames. I am currently using only SeqNet (Conv + SAP + L2-Norm) without SeqMatch, trained on my own dataset for LiDAR-based place recognition. However, I found that SeqNet does not work well when seq_len is small (< 20) and is hard to train into a strong model. My guess is that whether SeqNet without SeqMatch works well depends largely on the distribution of the training and test sets, or on the output of the raw descriptor generation algorithm. Is this right? Have you encountered datasets where SeqNet does not work well?

Best wishes!


Questions about training and testing SeqNet on other datasets

Hi! I am currently working with your SeqNet project and have run into some problems.
Currently I can run the code and get recall results on the Nordland and Oxford datasets. Now I am trying to use other datasets such as the Hilti challenge dataset: https://hilti-challenge.com/dataset.html . I wonder how I can fit this dataset into your program.
Should I also generate .db and .npy files? If my original data is in .jpg format, what should my process be? The .npy files under the /netvlad-pytorch folder are descriptors generated by NetVLAD, am I right?

Thank you for your patience; looking forward to your kind help!

MSLS results

Hi, I would like to replicate the results on Mapillary (MSLS). Thank you for adding the code to generate the .npy files containing the descriptors for any dataset. Could you please suggest how to generate the .db files for MSLS and then replicate the results in your paper?
Thanks for your help!

dataset split issues

Hi @oravus , thanks for your help. I have generated the needed .db files using the pitts250k dataset. In general, I use the whole dataset to generate descriptors and save both the query and reference .npy files.
That is:
===> Loading dataset(s)
All Db descs: (254064, 4096)
All Qry descs: (24000, 4096)

Next, I use the first 10000 images as the training set, followed by 3000 images for validation and 3000 images for testing. I also generated three .db files as train_mat_file, test_mat_file, and val_mat_file.

Thereafter, I wrote the specification in get_datasets.py. The indices are defined following the Nordland dataset format:
trainInds, testInds, valInds = np.arange(10000), np.arange(10000,13000), np.arange(13000,16000)

I expected to be able to test this dataset with your pretrained model, but an index-related error still occurs.

//////////////////////////////////////////
Restored flags: ['--optim', 'SGD', '--lr', '0.0001', '--lrStep', '50', '--lrGamma', '0.5', '--weightDecay', '0.001', '--momentum', '0.9', '--seed', '123', '--runsPath', './data/runs', '--savePath', './data/runs/Jun03_15-22-44_l10_l10_w5_seqnetEnv/checkpoints', '--patience', '0', '--pooling', 'seqnet', '--w', '5', '--outDims', '4096', '--margin', '0.1']
Namespace(batchSize=16, cacheBatchSize=24, cachePath='./data/cache', cacheRefreshRate=0, ckpt='latest', dataset='pitts250k', descType='netvlad-pytorch', evalEvery=1, expName='0', extractOnly=False, lr=0.0001, lrGamma=0.5, lrStep=50.0, margin=0.1, mode='test', momentum=0.9, msls_trainCity='melbourne', msls_valCity='austin', nEpochs=200, nGPU=1, nocuda=False, numSamples2Project=-1, optim='SGD', outDims=4096, patience=0, pooling='seqnet', predictionsFile=None, resultsPath=None, resume='./data/runs/Jun03_15-22-44_l10_w5/', runsPath='./data/runs', savePath='./data/runs/Jun03_15-22-44_l10_l10_w5_seqnetEnv/checkpoints', seed=123, seqL=5, seqL_filterData=None, split='test', start_epoch=0, threads=8, w=5, weightDecay=0.001)
===> Loading dataset(s)
All Db descs: (254064, 4096)
All Qry descs: (24000, 4096)
===> Evaluating on test set
====> Query count: 800
===> Building model
=> loading checkpoint './data/runs/Jun03_15-22-44_l10_w5/checkpoints/checkpoint.pth.tar'
=> loaded checkpoint './data/runs/Jun03_15-22-44_l10_w5/checkpoints/checkpoint.pth.tar' (epoch 200)
===> Running evaluation step
====> Extracting Features
==> Batch (50/250) ... ==> Batch (250/250)  [tqdm progress bars trimmed]
Average batch time: 0.006786982536315918 0.009104941585046229
torch.Size([3000, 4096]) torch.Size([3000, 4096])
====> Building faiss index
====> Calculating recall @ N
Using Localization Radius: 25
Traceback (most recent call last):
File "main.py", line 133, in
recallsOrDesc, dbEmb, qEmb, rAtL, preds = test(opt, model, encoder_dim, device, whole_test_set, writer, epoch, extract_noEval=opt.extractOnly)
File "/home/lx/lx/Seqnet_new/test.py", line 138, in test
rAtL.append(getRecallAtN(n_values, predictions, gtAtL))
File "/home/lx/lx/Seqnet_new/test.py", line 37, in getRecallAtN
if len(gt[qIx]) == 0:
IndexError: list index out of range

/////////////////////////////////////////
I am writing to ask:

  1. How should the indices be defined? Could you use the Oxford or Nordland dataset to explain the details for me?
  2. Should the .npy files contain the same number of descriptors as the three .db files I generated for train, test, and val?

Thanks for your help!

"seqL_filterData" argument

Hi! Thanks for your great work!
The argument "seqL_filterData" has always defaulted to None. What does it do, and under what conditions is it used?
Kind regards!

nGPU

I'm sorry to bother you.
I've noticed that the 'nGPU' parameter appears only once, in the main function, and nowhere else. I would like to know how this parameter is used.
Best wishes to you!

The evaluation metric about Precision@K

Hi! Thanks for your great work on SeqNet.
The only evaluation metric in SeqNet is Recall@K; false positive instances are not considered.
Would you mind providing some implementation ideas for evaluating Precision@K in the test step?
Best wishes!

Train sequential descriptors through SeqNet on the MSLS dataset

Hi! Thanks for your great work on SeqNet.
I met a problem when training the sequential descriptors on the MSLS dataset. I just ran the command: python main.py --mode train --pooling seqnet --dataset msls --msls_trainCity melbourne --msls_valCity austin --seqL 5 --w 3 --outDims 4096 --expName "msls_w3".
But there is a problem: No such file or directory: './data/descData/netvlad-pytorch/msls_melbourne_database.npy'.
I could not find any corresponding .npy files for the MSLS dataset. Would you mind providing the corresponding download link for the MSLS descriptors?
Kind regards!

pre-train for MCD dataset

Hi Dr. Garg,
We recently released a new, challenging dataset: https://mcdviral.github.io/

We'd like to invite you to try SeqNet on these three campuses located at different latitudes.

If you have another IROS paper for me to review, I'm perfectly happy to do so.

Regards
Shenghai Yuan


Related files for msls experiments

@oravus Hi! Thanks for your kind reply in issue #17! That helped me a lot!

In case my additional comments in issue #17 go unnoticed, I am submitting this new issue and copying my question here:

In the experiments on the MSLS dataset (Fig. 4 in the paper), Amman, Boston, San Francisco, and Copenhagen are used for testing, but only the files related to Amman are provided (imageNames .txt and seqBounds .txt). For the remaining three cities, should I use subsets as in your descriptions above? In that case, could you please provide the related files so that I can run experiments and obtain the same results? The procedure used to generate those subsets is not mentioned in the paper. :)

Or is it sufficient to generate these files myself from the MSLS dataset, that is, to use the whole set of original Boston, San Francisco, and Copenhagen traverses?

Thank you again!

The descriptor data for the Brisbane dataset

Hi! Thanks for your great work on SeqNet.
With your kind help in issue #12, which I opened, I have obtained the corresponding descriptor data for the Nordland, Oxford, and MSLS datasets, and achieved the same results as the paper. I then found that the Brisbane dataset is not open source, so I cannot directly compute the corresponding descriptors from the original dataset images. I found your reply in issue #1, "For the Brisbane dataset, we can only release the descriptor data.", indicating that the descriptor data can be released.
Would you mind providing the descriptor data of the Brisbane dataset to me? I just want to reproduce all the results in the paper and build on this remarkable work. I promise it will be used only for research and never commercially.
Kind regards!

discussion on sequential descriptor

Hi, may I ask what the sequential descriptor really encodes?

As far as I understand, I can use an LSTM to estimate both self-motion and motion-stereo depth.

Does the LSTM here only encode the changes in locomotion, or does it also encode an overall 3D structure prior?

If it encodes either motion or the map, could I use short-duration odometry as an additional input, or something like a LiDAR-projected depth map, to speed up SeqNet and make it a multi-modality SeqNet?

MSLS austin/melbourne image-count difference between the original seq_info.csv and the seqBounds.txt provided here

I find that the number of images in the original seq_info.csv and the number of images in the seqBounds.txt in this code are not the same.

For example, in melbourne/query/seq_info.csv (original MSLS dataset) there are 88119 images, but here in structFiles/seqBoundsFiles/msls_melbourne_query_seqBounds.txt there are only 4474! The same is true for Austin, but for Amman the counts match.

I wonder whether anyone knows the reason for this difference...

Thanks a lot!!!

Doubt regarding MSLS sequences

Hi, thanks for sharing your work! I have two doubts regarding the MSLS sequences.

  1. In the paper, it is mentioned that for MSLS the sequences were used as-is. I checked the seqBounds txt files and the sequence length seems to span hundreds of frames and is not fixed. However, it is said that the focus is on shorter sequences for VPR. Could you please explain this?
  2. It is also mentioned in the paper that, for all sequences, only those images are considered where a valid sequence can be 'centered'. However, looking at the seqBounds txt files, it appears that the approach of taking the neighboring frames with the current frame as the center is not used, and the same frames are used for all frames in a sequence (e.g. 1 to 31 for all 31 frames). Could you please explain this as well?

Thanks in advance!
