FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts 🌊🤔💡

Welcome to the FlowVQA Project Repository

FlowVQA is a benchmark for visual question answering that uses flowcharts to evaluate complex reasoning. It is designed to challenge and improve multimodal language models on spatial reasoning, decision-making, and logical progression tasks.

Abstract

Existing visual question answering benchmarks fall short in visual grounding and complexity, particularly when it comes to evaluating spatial reasoning. FlowVQA addresses this gap with 2,272 human-verified flowchart images and 22,413 question-answer pairs, enabling a thorough evaluation of the visual and logical reasoning capabilities of AI models.

Citing Our Work

Please consider citing our paper if you use FlowVQA in your research:

@article{SinghEtAlXXXX,
  title={Your Paper Title Here},
  author={Singh, Shubhankar and others},
  journal={Journal Name},
  volume={XX},
  number={XX},
  pages={XX--XX},
  year={XXXX},
  publisher={Publisher}
}

Repository Contents

Data Repository: Contains the test and train JSON files, including flowchart images, Mermaid scripts, tags, questions, and question types.
Code Repository: Features code snippets, example prompts, and additional resources.
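
As a rough illustration of how the JSON splits might be consumed, the sketch below groups question-answer records by question type. The field names (`question`, `answer`, `question_type`), the type labels (`type_a`, `type_b`), and the list-of-records layout are all assumptions made for illustration; the actual schema of the released files may differ.

```python
import json
from collections import defaultdict


def load_records(path):
    """Load a split file, assuming it holds a JSON list of record objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def group_by_question_type(records):
    """Group QA records by their (assumed) 'question_type' field."""
    grouped = defaultdict(list)
    for record in records:
        # 'question_type' is a hypothetical key; records without it
        # are collected under 'unknown'.
        grouped[record.get("question_type", "unknown")].append(record)
    return dict(grouped)


# Inline sample in the assumed layout; real files may look different.
sample = json.loads("""
[
  {"question": "Which step follows 'Start'?", "answer": "Check input", "question_type": "type_a"},
  {"question": "Is the input validated?", "answer": "Yes", "question_type": "type_b"}
]
""")
grouped = group_by_question_type(sample)
for question_type, items in grouped.items():
    print(question_type, len(items))
```

With a real split, `load_records` would be called with the path to the train or test JSON file and its result passed to `group_by_question_type`, once the key names are adjusted to the actual schema.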

Contributors

The project was developed by Shubhankar Singh, Purvi Chaurasia, Yerram Varun, Vatsal Gupta, and Pranshu Pandya, under the mentorship of Dr. Vivek Gupta and Dr. Dan Roth.

Future Directions and Use Cases

FlowVQA facilitates research in several key areas, such as:

Flowchart Reasoning: Enhancing visual logic and reasoning capabilities of models.
Graph-Encoder Models: Improving structural reasoning with flowchart-based models.
Adversarial and Counterfactual Probes: Testing models with challenging questions.
Complex Subtasks: Developing additional tasks for comprehensive training and evaluation.
NeuroSymbolic AI: Applying neurosymbolic methods for better performance and understanding.
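
For the graph-oriented directions above, one possible starting point is turning a flowchart's Mermaid script into an explicit graph. The sketch below is a minimal, assumption-laden parser: it handles only simple `A --> B` and `A -->|label| B` edges with optional node shapes on the source node, and the sample script is invented for illustration rather than taken from the dataset.

```python
import re
from collections import defaultdict

# Matches edges such as `A[Start] --> B{Valid?}` or `B -->|Yes| C[Done]`.
# Only this simple subset of Mermaid flowchart syntax is supported.
EDGE = re.compile(
    r"^\s*(\w+)(?:[\[\(\{].*?[\]\)\}])?\s*-->\s*(?:\|([^|]*)\|\s*)?(\w+)"
)


def mermaid_to_adjacency(script):
    """Return {source_id: [(target_id, edge_label_or_None), ...]}."""
    graph = defaultdict(list)
    for line in script.splitlines():
        match = EDGE.match(line)
        if match:
            source, label, target = match.groups()
            graph[source].append((target, label))
    return dict(graph)


# Hypothetical flowchart, not a sample from FlowVQA.
script = """
flowchart TD
    A[Start] --> B{Valid input?}
    B -->|Yes| C[Process]
    B -->|No| D[Reject]
"""
graph = mermaid_to_adjacency(script)
print(graph)
```

An adjacency list like this could then be handed to a graph library or a graph encoder. Chained edges, subgraphs, and other arrow styles would need a fuller parser.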

Repository Status

🚧 Under Construction. Please refrain from using the data or code until further notice.

Stay Informed

For project updates and more details on contributing, please follow this repository.

Thank you for your interest in FlowVQA! 🎉
