

Open Place Recognition library

Place Recognition overview

An overview of a typical place recognition pipeline. First, the input data is encoded into a query descriptor. Then, a K-nearest neighbors search is performed between the query descriptor and the database descriptors. Finally, the position of the closest database descriptor is returned as the answer.
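The retrieval step described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the library's implementation: the function names `l2_distance` and `recognize_place`, the toy descriptors, and the toy positions are all hypothetical, and a real system would typically replace the brute-force search with an index such as faiss.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize_place(query, db_descriptors, db_positions, k=1):
    """Return (index, position, distance) for the k nearest database entries, nearest first."""
    ranked = sorted(
        range(len(db_descriptors)),
        key=lambda i: l2_distance(query, db_descriptors[i]),
    )
    return [
        (i, db_positions[i], l2_distance(query, db_descriptors[i]))
        for i in ranked[:k]
    ]


# Toy database: three 2-D descriptors with made-up positions (hypothetical data).
db_descriptors = [[0.0, 1.0], [1.0, 0.0], [0.7, 0.7]]
db_positions = [(10.0, 5.0), (42.0, 17.0), (25.0, 30.0)]

# In the real pipeline the query descriptor would come from the encoder model.
query = [0.9, 0.1]

top = recognize_place(query, db_descriptors, db_positions, k=2)
best_idx, best_pos, best_dist = top[0]
print(f"closest database entry: {best_idx}, position: {best_pos}")
```

The position attached to the top-ranked entry is what the pipeline reports as the answer; the remaining neighbors are useful for recall-at-K style evaluation.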

Featured modules

A detailed description of the featured library modules can be found in the docs/modules.md document.

  1. PlaceRecognitionPipeline
  2. SequencePointcloudRegistrationPipeline
  3. PlaceRecognitionPipeline with semantics
  4. ArucoLocalizationPipeline
  5. LocalizationPipeline without dynamic objects
  6. Localization by specific scene elements (Semantic Object Context (SOC) module)
  7. Module for generating global vector representations of multimodal outdoor data
  8. MultimodalPlaceRecognitionTrainer
  9. TextLabelsPlaceRecognitionPipeline
  10. DepthReconstruction
  11. ITLPCampus

Installation

Pre-requisites

  • The library requires PyTorch, MinkowskiEngine and (optionally) faiss to be installed manually.

  • Another option is to use the docker image. A detailed description is available in docker/README.md. Quick-start commands to build, start and enter the container:

    # from repo root dir
    bash docker/build_devel.sh
    bash docker/start.sh [DATASETS_DIR]
    bash docker/into.sh
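Before installing the library outside of docker, it can help to confirm that the manually installed prerequisites are importable. The sketch below uses only the standard library; the helper `check_prerequisites` is hypothetical and not part of the library, and it assumes the usual import names `torch`, `MinkowskiEngine` and `faiss`.

```python
from importlib.util import find_spec

REQUIRED = ("torch", "MinkowskiEngine")
OPTIONAL = ("faiss",)


def check_prerequisites(required=REQUIRED, optional=OPTIONAL):
    """Return (status, missing): importability per package and the missing required ones."""
    status = {name: find_spec(name) is not None for name in (*required, *optional)}
    missing = [name for name in required if not status[name]]
    return status, missing


status, missing = check_prerequisites()
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'not found'}")
if missing:
    print("Install these before running `pip install -e .`:", ", ".join(missing))
```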

Library installation

  • After the pre-requisites are met, install the Open Place Recognition library with the following command:

    pip install -e .

Third-party packages

  • If you want to use the GeoTransformer model for pointcloud registration, you should install the package located in the third_party directory:

    # load submodules from git
    git submodule update --init
    
    # change dir
    cd third_party/GeoTransformer/
    
    # install the package
    bash setup.sh

How to load the weights

You can download the weights from the public Google Drive folder.

Developers only

We use DVC to manage the weights storage. To download the weights, run the following command (assuming that dvc is already installed):

    dvc pull

You will be asked to authorize the Google Drive access. After that, the weights will be downloaded to the weights directory. For more details, see the DVC documentation.

ITLP-Campus dataset

We introduce the ITLP-Campus dataset. The dataset was recorded on the Husky wheeled robotic platform on the university campus and consists of tracks recorded at different times of day (day/dusk/night) and in different seasons (winter/spring). You can find more details in the VitalyyBezuglyj/ITLP-Campus repository.

Package Structure

opr.datasets

Subpackage containing dataset classes and functions.

Usage example:

    from opr.datasets import OxfordDataset

    train_dataset = OxfordDataset(
        dataset_root="/home/docker_opr/Datasets/pnvlad_oxford_robotcar_full/",
        subset="train",
        data_to_load=["image_stereo_centre", "pointcloud_lidar"]
    )

The iterator will return a dictionary with the following keys:

  • "idx": index of the sample in the dataset, single number Tensor
  • "utm": UTM coordinates of the sample, Tensor of shape (2)
  • (optional) "image_stereo_centre": image Tensor of shape (C, H, W)
  • (optional) "pointcloud_lidar_feats": point cloud features Tensor of shape (N, 1)
  • (optional) "pointcloud_lidar_coords": point cloud coordinates Tensor of shape (N, 3)

More details can be found in the demo_datasets.ipynb notebook.

opr.losses

The opr.losses subpackage contains ready-to-use loss functions implemented in PyTorch, featuring a common interface.

Usage example:

    from opr.losses import BatchHardTripletMarginLoss

    loss_fn = BatchHardTripletMarginLoss(margin=0.2)

    # sample_batch, dataset and output are expected to come from the training loop
    idxs = sample_batch["idxs"]
    # restrict the dataset-wide masks to the samples present in the batch
    positives_mask = dataset.positives_mask[idxs][:, idxs]
    negatives_mask = dataset.negatives_mask[idxs][:, idxs]

    loss, stats = loss_fn(output["final_descriptor"], positives_mask, negatives_mask)

The loss functions introduce a unified interface:

  • Input:
    • embeddings: descriptor Tensor of shape (B, D)
    • positives_mask: boolean mask Tensor of shape (B, B)
    • negatives_mask: boolean mask Tensor of shape (B, B)
  • Output:
    • loss: loss value Tensor
    • stats: dictionary with additional statistics
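To make the interface above concrete, here is a minimal plain-Python sketch of a batch-hard triplet margin loss that takes the same three inputs and returns a loss value with a stats dictionary. It is an illustration of the general technique, not the code of `BatchHardTripletMarginLoss`: the function names, the use of nested lists instead of Tensors, and the stats keys `num_triplets` and `num_non_zero_triplets` are assumptions made for this example.

```python
import math


def pairwise_distances(embeddings):
    """Full (B, B) matrix of Euclidean distances between descriptors."""
    n = len(embeddings)
    return [[math.dist(embeddings[i], embeddings[j]) for j in range(n)] for i in range(n)]


def batch_hard_triplet_loss(embeddings, positives_mask, negatives_mask, margin=0.2):
    """For each anchor, pair the farthest positive with the closest negative."""
    dist = pairwise_distances(embeddings)
    losses = []
    for a in range(len(embeddings)):
        pos = [dist[a][j] for j, is_pos in enumerate(positives_mask[a]) if is_pos]
        neg = [dist[a][j] for j, is_neg in enumerate(negatives_mask[a]) if is_neg]
        if not pos or not neg:
            continue  # this anchor cannot form a valid triplet
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    stats = {
        "num_triplets": len(losses),
        "num_non_zero_triplets": sum(1 for value in losses if value > 0),
    }
    loss = sum(losses) / len(losses) if losses else 0.0
    return loss, stats


# Toy batch: samples 0 and 1 show the same place, as do samples 2 and 3.
embeddings = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
positives_mask = [
    [False, True, False, False],
    [True, False, False, False],
    [False, False, False, True],
    [False, False, True, False],
]
negatives_mask = [
    [False, False, True, True],
    [False, False, True, True],
    [True, True, False, False],
    [True, True, False, False],
]

loss, stats = batch_hard_triplet_loss(embeddings, positives_mask, negatives_mask)
print(loss, stats)
```

Passing the masks explicitly keeps the loss independent of how positives and negatives are defined, so the dataset can decide this (for example by distance between UTM coordinates) without changes to the loss code.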

More details can be found in the demo_losses.ipynb notebook.

opr.models

The opr.models subpackage contains ready-to-use neural networks implemented in PyTorch, featuring a common interface.

Usage example:

    from opr.models.place_recognition import MinkLoc3D

    model = MinkLoc3D()

    # forward pass
    output = model(batch)

The models introduce unified input and output formats:

  • Input: a batch dictionary with the following keys (all keys are optional, depending on the model and dataset):
    • "images_<camera_name>": images Tensor of shape (B, 3, H, W)
    • "masks_<camera_name>": semantic segmentation masks Tensor of shape (B, 1, H, W)
    • "pointclouds_lidar_coords": point cloud coordinates Tensor of shape (B * N_points, 4)
    • "pointclouds_lidar_feats": point cloud features Tensor of shape (B * N_points, C)
  • Output: a dictionary with the required key "final_descriptor" and optional keys for intermediate descriptors:
    • "final_descriptor": final descriptor Tensor of shape (B, D)

More details can be found in the demo_models.ipynb notebook.
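The (B * N_points, 4) shape of the batched point cloud coordinates listed above can be read as follows, assuming the four columns are a batch index followed by x, y, z, which is the convention MinkowskiEngine uses for batched sparse tensors. The column order and the helper `collate_pointclouds` are assumptions made for this sketch; it uses plain lists to show the layout and is not the library's collate function.

```python
def collate_pointclouds(clouds):
    """Stack per-sample (N, 3) coordinate lists into one list of (batch_idx, x, y, z) rows."""
    batched = []
    for batch_idx, cloud in enumerate(clouds):
        for x, y, z in cloud:
            batched.append((batch_idx, x, y, z))
    return batched


# Two toy point clouds (hypothetical quantized coordinates).
cloud_a = [(0, 0, 0), (1, 2, 3)]
cloud_b = [(4, 5, 6), (7, 8, 9)]

coords = collate_pointclouds([cloud_a, cloud_b])
for row in coords:
    print(row)
```

The batch index column lets the network tell which rows belong to which sample after all point clouds have been flattened into a single array.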

opr.trainers

The opr.trainers subpackage contains ready-to-use training algorithms.

Usage example:

    from opr.trainers.place_recognition import UnimodalPlaceRecognitionTrainer

    trainer = UnimodalPlaceRecognitionTrainer(
        checkpoints_dir=checkpoints_dir,
        model=model,
        loss_fn=loss_fn,
        optimizer=optimizer,
        scheduler=scheduler,
        batch_expansion_threshold=cfg.batch_expansion_threshold,
        wandb_log=(not cfg.debug and not cfg.wandb.disabled),
        device=cfg.device,
    )

    trainer.train(
        epochs=cfg.epochs,
        train_dataloader=dataloaders["train"],
        val_dataloader=dataloaders["val"],
        test_dataloader=dataloaders["test"],
    )
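The `batch_expansion_threshold` argument is not explained here. One plausible reading, assuming it follows the dynamic batch sizing scheme from the MinkLoc3D training code, is that the batch grows whenever the share of triplets with non-zero loss in an epoch drops below the threshold. The sketch below illustrates that rule only; the function `maybe_expand_batch` and the parameters `expansion_rate` and `max_batch_size` with their default values are hypothetical and are not taken from this library.

```python
def maybe_expand_batch(
    batch_size,
    num_triplets,
    num_non_zero_triplets,
    threshold=0.7,
    expansion_rate=1.4,
    max_batch_size=256,
):
    """Return the batch size for the next epoch under the assumed expansion rule."""
    if num_triplets == 0:
        return batch_size  # nothing to measure, keep the current size
    non_zero_ratio = num_non_zero_triplets / num_triplets
    if non_zero_ratio < threshold:
        # Too many triplets are already satisfied: enlarge the batch so that
        # harder positives and negatives are more likely to appear in it.
        return min(int(batch_size * expansion_rate), max_batch_size)
    return batch_size


print(maybe_expand_batch(16, num_triplets=100, num_non_zero_triplets=50))
print(maybe_expand_batch(16, num_triplets=100, num_non_zero_triplets=90))
```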

opr.pipelines

The opr.pipelines subpackage contains ready-to-use pipelines for model inference.

Usage example:

    from opr.models.place_recognition import MinkLoc3Dv2
    from opr.pipelines.place_recognition import PlaceRecognitionPipeline

    pipe = PlaceRecognitionPipeline(
        database_dir="/home/docker_opr/Datasets/ITLP_Campus/ITLP_Campus_outdoor/databases/00",
        model=MinkLoc3Dv2(),
        model_weights_path=None,
        device="cuda",
    )

    out = pipe.infer(sample)

The pipeline introduces a unified interface for model inference:

  • Input: a dictionary with the following keys (all keys are optional, depending on the model and dataset):
    • "image_<camera_name>": image Tensor of shape (3, H, W)
    • "mask_<camera_name>": semantic segmentation mask Tensor of shape (1, H, W)
    • "pointcloud_lidar_coords": point cloud coordinates Tensor of shape (N_points, 4)
    • "pointcloud_lidar_feats": point cloud features Tensor of shape (N_points, C)
  • Output: a dictionary with keys:
    • "idx" for predicted index in the database,
    • "pose" for predicted pose in the format [tx, ty, tz, qx, qy, qz, qw],
    • "descriptor" for predicted descriptor.

More details can be found in the demo_pipelines.ipynb notebook.
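The "pose" output is a translation followed by a quaternion, [tx, ty, tz, qx, qy, qz, qw]. Downstream code often wants a 4x4 homogeneous transform instead, so a conversion might look like the sketch below. It assumes a Hamilton quaternion with the scalar part last, as the key order suggests; the helper `pose_to_matrix` is hypothetical and not part of the library.

```python
import math


def pose_to_matrix(pose):
    """Convert [tx, ty, tz, qx, qy, qz, qw] to a 4x4 homogeneous transform (nested lists)."""
    tx, ty, tz, qx, qy, qz, qw = pose
    norm = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    if norm == 0.0:
        raise ValueError("quaternion must be non-zero")
    # Normalize so that small numerical drift does not skew the rotation.
    qx, qy, qz, qw = (c / norm for c in (qx, qy, qz, qw))
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw), 2 * (qx * qz + qy * qw), tx],
        [2 * (qx * qy + qz * qw), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw), ty],
        [2 * (qx * qz - qy * qw), 2 * (qy * qz + qx * qw), 1 - 2 * (qx * qx + qy * qy), tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Toy pose: 90 degree rotation about the z axis plus a translation (made-up values).
half = math.sqrt(0.5)
matrix = pose_to_matrix([1.0, 2.0, 3.0, 0.0, 0.0, half, half])
for row in matrix:
    print([round(value, 3) for value in row])
```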

Model Zoo

Place Recognition

| Model | Modality | Train Dataset | Config | Weights |
|---|---|---|---|---|
| MinkLoc3D (paper) | LiDAR | NCLT | minkloc3d.yaml | minkloc3d_nclt.pth |
| Custom | Multi-Image, Multi-Semantic, LiDAR | NCLT | multi-image_multi-semantic_lidar_late-fusion.yaml | multi-image_multi-semantic_lidar_late-fusion_nclt.pth |
| Custom | Multi-Image, LiDAR | NCLT | multi-image_lidar_late-fusion.yaml | multi-image_lidar_late-fusion_nclt.pth |

Featured Projects

License

MIT License (the license is subject to change in future versions)
