
SuperFusion

This is the official PyTorch implementation of "SuperFusion: A Versatile Image Registration and Fusion Network with Semantic Awareness".

Framework

The overall framework of the proposed SuperFusion for cross-modal image registration and fusion.

Network Architecture

Dense Matcher

The architecture of the dense matcher, which consists of a pyramid feature extractor and iterative flow estimators. Flows are estimated at three scales iteratively and summed up.
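
For intuition only, the following is a minimal PyTorch sketch of this coarse-to-fine idea, not the released dense matcher: the TinyFlowEstimator, warp, and coarse_to_fine_flow names, the residual update, and the channel widths are all assumptions. feats_a and feats_b stand for pyramid features of the two inputs, ordered coarsest to finest.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyFlowEstimator(nn.Module):
    # Hypothetical per-scale flow estimator: predicts a 2-channel residual flow.
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 2, 3, padding=1),
        )

    def forward(self, feat_a, feat_b):
        return self.net(torch.cat([feat_a, feat_b], dim=1))

def warp(x, flow):
    # Backward-warp the feature map x by the given flow (grid_sample convention).
    b, _, h, w = x.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().to(x.device)           # (2, H, W)
    coords = grid.unsqueeze(0) + flow                                   # shifted sample positions
    coords_x = 2.0 * coords[:, 0] / max(w - 1, 1) - 1.0                 # normalize to [-1, 1]
    coords_y = 2.0 * coords[:, 1] / max(h - 1, 1) - 1.0
    grid_norm = torch.stack((coords_x, coords_y), dim=-1)               # (B, H, W, 2)
    return F.grid_sample(x, grid_norm, align_corners=True)

def coarse_to_fine_flow(feats_a, feats_b, estimators):
    # Estimate flow at three scales (coarse -> fine), upsampling and summing the residuals.
    flow = None
    for fa, fb, est in zip(feats_a, feats_b, estimators):
        if flow is None:
            flow = torch.zeros(fa.size(0), 2, fa.size(2), fa.size(3), device=fa.device)
        else:
            flow = 2.0 * F.interpolate(flow, size=fa.shape[-2:],
                                       mode="bilinear", align_corners=True)
        fb_warped = warp(fb, flow)       # align the second feature map with the current flow
        flow = flow + est(fa, fb_warped) # add the residual estimated at this scale
    return flow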

Fusion Network

Architecture of the fusion network $\mathcal{N}_F$. Conv($c, k$) denotes a convolutional layer with $c$ output channels and kernel size of $k\times k$; GSAM indicates the global spatial attention module.
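
As a rough illustration of how Conv($c, k$) blocks and an attention module could be wired together (not the released fusion network; the FusionNet and conv names, layer counts, and channel widths are assumptions):

import torch
import torch.nn as nn

def conv(c_in, c_out, k):
    # Conv(c, k) from the figure: c_out output channels, k x k kernel, ReLU activation.
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, padding=k // 2), nn.ReLU(inplace=True))

class FusionNet(nn.Module):
    # Hypothetical fusion network: two encoders, attention-weighted concatenation, decoder.
    def __init__(self, attention=None):
        super().__init__()
        self.enc_ir = nn.Sequential(conv(1, 16, 3), conv(16, 32, 3))
        self.enc_vis = nn.Sequential(conv(1, 16, 3), conv(16, 32, 3))
        self.attention = attention if attention is not None else nn.Identity()
        self.dec = nn.Sequential(conv(64, 32, 3), conv(32, 16, 3),
                                 nn.Conv2d(16, 1, 1), nn.Tanh())

    def forward(self, ir, vis):
        feats = torch.cat([self.enc_ir(ir), self.enc_vis(vis)], dim=1)
        return self.dec(self.attention(feats))   # decode the attention-modulated features

Any spatial attention module can be passed in; nn.Identity() makes it a plain concatenate-and-decode network, and the GSAM sketch in the next section is a drop-in alternative.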

Global Spatial Attention Module (GSAM)

The schematic illustration of the global spatial attention module (GSAM). The global attention is calculated by adapting a spatial RNN to aggregate the spatial context in four directions.
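
To make the four-direction aggregation concrete, here is a minimal sketch that stands in a simple leaky recurrence for the spatial RNN; the GSAM class below, the alpha decay, and the 1x1 projections are illustrative assumptions, not the published module.

import torch
import torch.nn as nn

class GSAM(nn.Module):
    # Sketch of a global spatial attention module: a 1x1 projection, four directional
    # recurrent sweeps (left/right/up/down), and a sigmoid attention map over the input.
    def __init__(self, channels, alpha=0.5):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, 1)
        self.alpha = alpha   # decay of the recurrent sweep
        self.to_attn = nn.Sequential(nn.Conv2d(4 * channels, channels, 1), nn.Sigmoid())

    def _sweep(self, x, dim, reverse):
        # Accumulate context along one spatial direction with a leaky recurrence.
        idx = range(x.size(dim) - 1, -1, -1) if reverse else range(x.size(dim))
        slices, state = [], None
        for i in idx:
            cur = x.select(dim, i)
            state = cur if state is None else cur + self.alpha * state
            slices.append(state)
        if reverse:
            slices = slices[::-1]
        return torch.stack(slices, dim=dim)

    def forward(self, x):
        h = torch.relu(self.proj(x))
        ctx = torch.cat([self._sweep(h, 3, False), self._sweep(h, 3, True),
                         self._sweep(h, 2, False), self._sweep(h, 2, True)], dim=1)
        return x * self.to_attn(ctx)   # attention-weighted features

Combined with the fusion sketch above, it would plug in as FusionNet(attention=GSAM(64)).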

Recommended Environment

  • torch 1.10.1
  • torchvision 0.11.2
  • kornia 0.6.5
  • opencv 4.5.5
  • pillow 9.2.0
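
With pip, the pinned versions above translate into something like the command below; the opencv-python package name and the patch-level wildcard are assumptions, and torch/torchvision should be installed with the CUDA build matching your system.

pip install torch==1.10.1 torchvision==0.11.2 kornia==0.6.5 "opencv-python==4.5.5.*" pillow==9.2.0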

To Test

Registration

MSRS dataset

python test.py --mode=Reg --dataset_name=MSRS 

RoadScene dataset

python test.py --mode=Reg --dataset_name=RoadScene

Fusion

MSRS dataset

python test.py --mode=Fusion --dataset_name=MSRS 

RoadScene dataset

python test.py --mode=Fusion --dataset_name=RoadScene

Registration and Fusion

MSRS dataset

python test.py --mode="Reg&Fusion" --dataset_name=MSRS

RoadScene dataset

python test.py --mode="Reg&Fusion" --dataset_name=RoadScene

To Train

We suggest using our pre-trained model to test SuperFusion.

Training the registration and fusion model

MSRS dataset

python train.py --dataroot=./dataset/train/MSRS --n_ep=1000 --n_ep_decay=800 --resume=./checkpoint/MSRS.pth --stage=RF

RoadScene dataset

python train.py --dataroot=./dataset/train/RoadScene --n_ep=1000 --n_ep_decay=800 --resume=./checkpoint/RoadScene.pth --stage=RF

Fine-tuning the fusion network with the semantic constraint

python train.py --dataroot=./dataset/train/MSRS --n_ep=2000 --n_ep_decay=1600 --resume=./checkpoint/MSRS.pth --stage=FS

Registration Results

Quantitative registration performance on MSRS and RoadScene. Mean reprojection error (RE) and end-point error (EPE) are reported.

Qualitative registration performance of DASC, RIFT, GLU-Net, UMF-CMGR, CrossRAFT, and our SuperFusion. The first four rows of images are from the MSRS dataset, and the last two are from the RoadScene dataset. The purple textures are the gradients of the registered infrared images and the backgrounds are the corresponding ground truths. The discriminative regions that demonstrate the superiority of our method are highlighted in boxes. Note that the gradients in the second column are from the warped images, i.e., the misaligned infrared images.

Fusion Results

Quantitative comparison results of SuperFusion with five state-of-the-art alternatives on $361$ image pairs from the MSRS dataset. A point $(x, y)$ on a curve means that $100x$ percent of the image pairs have metric values no greater than $y$.
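
Curves of this kind can be reproduced by sorting the per-pair metric values; a minimal NumPy sketch (the cumulative_curve helper is hypothetical, not part of this repository):

import numpy as np

def cumulative_curve(metric_values):
    # Sort the per-image-pair metric values; a fraction x of the pairs has a value <= y.
    y = np.sort(np.asarray(metric_values, dtype=float))
    x = np.arange(1, y.size + 1) / y.size
    return x, y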

Quantitative comparison results of SuperFusion with five state-of-the-art alternatives on $25$ image pairs from the RoadScene dataset.

Qualitative comparison results of SuperFusion with five state-of-the-art infrared and visible image fusion methods on the MSRS and RoadScene datasets. All methods employ the built-in registration module (e.g., UMF-CMGR and our SuperFusion) or CrossRAFT to register the source images.

Segmentation Results

Segmentation performance (IoU) of visible, infrared, and fused images on the MSRS dataset. The fused images are the fusion results generated by our SuperFusion, and the pre-trained segmentation model is provided by SeAFusion.

Segmentation results for source images and fused images from the MSRS dataset. The fused image is the fusion result generated by our SuperFusion, and the pre-trained segmentation model is provided by SeAFusion.

If this work is helpful to you, please cite it as:

@article{TANG2022SuperFusion,
  title = {SuperFusion: A Versatile Image Registration and Fusion Network with Semantic Awareness},
  author = {Tang, Linfeng and Deng, Yuxin and Ma, Yong and Huang, Jun and Ma, Jiayi},
  journal = {IEEE/CAA Journal of Automatica Sinica},
  year = {2022}
}
