
Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity

arXiv | Website

MinD-Video

MinD-Video is a framework for high-quality video reconstruction from brain recordings.

Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity.
Zijiao Chen*, Jiaxin Qing*, Juan Helen Zhou
* equal contribution

News

  • Sep. 22, 2023. Accepted by NeurIPS 2023 for oral presentation.
  • May. 20, 2023. Preprint release.

Abstract

Reconstructing human vision from brain activities has been an appealing task that helps to understand our cognitive process. Even though recent research has seen great success in reconstructing static images from non-invasive brain recordings, work on recovering continuous visual experiences in the form of videos is limited. In this work, we propose MinD-Video that learns spatiotemporal information from continuous fMRI data of the cerebral cortex progressively through masked brain modeling, multimodal contrastive learning with spatiotemporal attention, and co-training with an augmented Stable Diffusion model that incorporates network temporal inflation. We show that high-quality videos of arbitrary frame rates can be reconstructed with MinD-Video using adversarial guidance. The recovered videos were evaluated with various semantic and pixel-level metrics. We achieved an average accuracy of 85% in semantic classification tasks and 0.19 in structural similarity index (SSIM), outperforming the previous state-of-the-art by 45%. We also show that our model is biologically plausible and interpretable, reflecting established physiological processes.

Overview

[Figure: MinD-Video pipeline overview flowchart]

Samples

  • Some samples are shown below. Our method can reconstruct various objects, animals, motions, and scenes. The reconstructed videos are of high quality and are consistent with the ground truth. For more samples, please refer to our website or download them from Google Drive.
  • The samples below were generated with a single RTX 3090. Due to GPU memory limitations, they are 2-second clips at 3 FPS and a resolution of 256 x 256. With more GPU memory, our method can process longer brain recordings and reconstruct longer videos at the full frame rate (30 FPS) and higher resolution.
[Sample grid: ground-truth (GT) clips paired with our reconstructions]

Environment setup

Create and activate the conda environment named mind-video from our env.yaml:

conda env create -f env.yaml
conda activate mind-video
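
Before running anything, it helps to confirm that the environment sees your GPU and how much memory is available (the samples above were generated on a single RTX 3090 with 24 GB). A minimal sanity check, assuming the environment installs PyTorch with CUDA support:

import torch

print(f"PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    # Total device memory in GiB; clip length and resolution scale with this.
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GiB")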

Download data and checkpoints

The large-scale pre-training dataset can be downloaded from HCP; please refer to this repo for the large-scale pre-training scripts. Our target dataset, Wen (2018), can be downloaded from here.

Download the pre-trained checkpoints and the preprocessed test data from here, then change the paths in the config file accordingly.
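
If you prefer to script the path changes, the config is a YAML file and can be edited programmatically. A minimal sketch; the keys data_dir and checkpoint_path are hypothetical names, so check the shipped config for the actual ones:

import yaml

cfg_path = "configs/eval_all_sub1.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

# Hypothetical keys -- the actual names live in the shipped config file.
cfg["data_dir"] = "/path/to/preprocessed/test_data"
cfg["checkpoint_path"] = "/path/to/pretrained/checkpoints"

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)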

Replicate our results

Method 1: Run generation with pretrained checkpoints

python scripts/eval_all.py --config configs/eval_all_sub1.yaml

Set half_precision to True and num_inference_steps to 50 for faster inference.
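
For intuition about what these two options trade off: in a Stable Diffusion-style pipeline, half precision loads the denoiser weights in float16 (roughly halving GPU memory), and fewer denoising steps shorten sampling at some cost in detail. A generic illustration with the diffusers library, not MinD-Video's own augmented pipeline; the model id is only an example:

import torch
from diffusers import StableDiffusionPipeline

# float16 weights: what a half_precision flag typically toggles.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Fewer denoising steps run faster but may lose fine detail.
image = pipe("a dog running on grass", num_inference_steps=50).images[0]
image.save("sample.png")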

Method 2: Download the generated videos and run metrics evaluation

Download the generated videos from Google Drive, then run the metrics script on them:

python scripts/run_metrics.py /path/to/generated/videos
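
run_metrics.py reports the paper's semantic and pixel-level metrics. As a rough stand-in for the pixel-level side, frame-averaged SSIM (one of the metrics reported in the abstract) can be computed with scikit-image. A minimal sketch, assuming paired uint8 frame arrays of shape (T, H, W, C):

import numpy as np
from skimage.metrics import structural_similarity as ssim

def video_ssim(gt_frames: np.ndarray, rec_frames: np.ndarray) -> float:
    """Mean SSIM over paired frames; inputs are uint8 arrays of shape (T, H, W, C)."""
    scores = [
        ssim(g, r, channel_axis=-1, data_range=255)
        for g, r in zip(gt_frames, rec_frames)
    ]
    return float(np.mean(scores))

# Example with random stand-in data: 2-second clips at 3 FPS and 256 x 256,
# matching the sample settings above.
gt = np.random.randint(0, 256, (6, 256, 256, 3), dtype=np.uint8)
rec = np.random.randint(0, 256, (6, 256, 256, 3), dtype=np.uint8)
print(f"SSIM: {video_ssim(gt, rec):.3f}")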

Acknowledgement

We thank the authors of Tune-A-Video for open-sourcing their codes. We also thank the Laboratory of Integrated Brain Imaging at Purdue University for making their data publicly available.

BibTeX

@inproceedings{chen2023cinematic,
  title={Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity},
  author={Chen, Zijiao and Qing, Jiaxin and Zhou, Juan Helen},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2023}
}

Contributors

jackjohn, jqin4749, zjc062


Issues

Release of Training Details

Congratulations on your paper's acceptance for an oral presentation at NeurIPS!
I recently ran eval_all.py using your pre-trained checkpoints and achieved results closely aligned with those documented in your paper and the samples you provided.
Given the considerable influence and potential impact of your work on the field, we eagerly await the release of the training details for these checkpoints. Access to them would greatly improve our understanding and our ability to build upon your work, further extending the reach of MinD-Video.

When will the code be open-sourced?

Hi, when will the code be open-sourced? I'd like to try it out.

Has there been any progress on this as of late?

Hello! I've been looking for something like this for a very long time; as a lucid dreamer, I would kill for something that lets you save dreams or any real form of visualization.

Progress on this repo seems rather slow, however. I was just wondering whether it is still being worked on; it seems incredible, and it would be a huge shame if something like this were cancelled.

Inquiry about code open-sourcing timeline

Hello, I have read your 'MinD-Video' paper and am very interested in your method. May I ask when you plan to open-source the code? Looking forward to your reply, thank you!

