Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision

This is the official repository of CMFlow, a cross-modal supervised approach for estimating 4D radar scene flow. For technical details, please refer to our CVPR 2023 paper:

Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision
Fangqiang Ding, Andras Palffy, Dariu M. Gavrila, Chris Xiaoxuan Lu
[arXiv] [demo] [page] [supp]

News

  • [2023-02] Our paper was accepted to CVPR 2023.
  • [2023-03] Our paper is available on arXiv, along with the supplementary materials and a project page.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{ding2023hidden,
  title={Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision},
  author={Ding, Fangqiang and Palffy, Andras and Gavrila, Dariu M. and Lu, Chris Xiaoxuan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  pages={1-10},
  year={2023} 
}

Abstract

This work proposes a novel approach to 4D radar-based scene flow estimation via cross-modal learning. Our approach is motivated by the co-located sensing redundancy in modern autonomous vehicles. Such redundancy implicitly provides various forms of supervision cues to the radar scene flow estimation. Specifically, we introduce a multi-task model architecture for the identified cross-modal learning problem and propose loss functions to opportunistically engage scene flow estimation using multiple cross-modal constraints for effective model training. Extensive experiments show the state-of-the-art performance of our method and demonstrate the effectiveness of cross-modal supervised learning to infer more accurate 4D radar scene flow. We also show its usefulness to two subtasks - motion segmentation and ego-motion estimation.

Method

Figure 1. Cross-modal supervised learning pipeline for 4D radar scene flow estimation. Given two consecutive radar point clouds as the input, the model architecture, which is composed of two stages (blue/orange block colours for stage 1/2), outputs the final scene flow together with the motion segmentation and a rigid ego-motion transformation. Cross-modal supervision signals retrieved from co-located modalities are utilized to constrain outputs with various loss functions. This essentially leads to a multi-task learning problem.
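To make the relationship between the three outputs in Figure 1 more concrete, here is a minimal sketch of one way a motion segmentation mask and a rigid transform could be combined with a per-point flow estimate. It is an illustration under stated assumptions, not the repository's code: the function names (`rigid_flow`, `compose_scene_flow`, `yaw_rotation`) are hypothetical, the rigid transform is assumed to map static points from the first frame to the second, and static points are assumed to take the flow induced by that transform while moving points keep their predicted flow.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Sequence[Sequence[float]]


def rigid_flow(point: Vec3, rotation: Mat3, translation: Vec3) -> Vec3:
    """Flow induced on a static point by a rigid transform: R @ p + t - p."""
    moved = tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    return tuple(moved[i] - point[i] for i in range(3))


def compose_scene_flow(
    points: List[Vec3],
    predicted_flow: List[Vec3],
    moving_mask: List[bool],
    rotation: Mat3,
    translation: Vec3,
) -> List[Vec3]:
    """Assumed combination rule (hypothetical, for illustration only):
    moving points keep their predicted flow, static points use the
    flow implied by the rigid transform."""
    final = []
    for p, f, is_moving in zip(points, predicted_flow, moving_mask):
        final.append(f if is_moving else rigid_flow(p, rotation, translation))
    return final


def yaw_rotation(angle_rad: float) -> List[List[float]]:
    """3x3 rotation matrix about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Toy example: two radar points, the first static, the second moving.
points = [(10.0, 0.0, 0.0), (5.0, 2.0, 0.0)]
predicted = [(-0.9, 0.1, 0.0), (1.5, 0.0, 0.0)]
mask = [False, True]

# Static points shift by -1 m along x between frames (no rotation).
flow = compose_scene_flow(points, predicted, mask, yaw_rotation(0.0), (-1.0, 0.0, 0.0))
print(flow)
```

In this toy setup the noisy prediction for the static point is replaced by the rigidly induced flow, which is the intuition behind estimating scene flow, motion segmentation, and the rigid transform jointly.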

Qualitative results

Here are some GIFs showing our qualitative results on scene flow estimation and its two subtasks, motion segmentation and ego-motion estimation. For more qualitative results, please refer to our demo video or the supplementary materials.

Scene flow

Subtask - Motion Segmentation

Subtask - Ego-motion Estimation

Demo Video

Getting Started

Our code and models will be released by March 15th. Instructions on how to preprocess the data, train our models, and run the test will be provided in GETTING_STARTED.
