Topic: visual-question-answering
Something interesting about visual-question-answering
Creating multimodal multitask models
Organization: ai-forever
Home Page: https://dsworks.ru/champs/fb5778a8-94e9-46de-8bad-aa2c83a755fb
Compact Trilinear Interaction for Visual Question Answering (ICCV 2019)
Organization: aioz-ai
Home Page: https://blog.ai.aioz.io/research/vqa-cti/
AIOZ AI - Overcoming Data Limitation in Medical Visual Question Answering (MICCAI 2019)
Organization: aioz-ai
Home Page: https://blog.ai.aioz.io/research/vqa-mevf/
Official repository for the A-OKVQA dataset
Organization: allenai
Document Visual Question Answering
User: anisha2102
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
User: antoyang
Home Page: https://arxiv.org/abs/2206.08155
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
User: antoyang
Home Page: https://arxiv.org/abs/2012.00451
Code for the COLING 2018 paper "Learning Semantic Sentence Embeddings using Pair-wise Discriminator"
User: badripatro
PyTorch implementation of FiLM: Visual Reasoning with a General Conditioning Layer
User: caffeinism
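The FiLM entry above centers on a very simple core operation: a conditioning network (e.g. a question encoder) predicts a per-channel scale gamma and shift beta, which modulate the visual feature maps. A minimal NumPy sketch of that modulation step, with toy shapes and values chosen for illustration (not code from the repository):

```python
import numpy as np

def film(features, gamma, beta):
    """Feature-wise Linear Modulation: scale and shift each channel
    of `features` with conditioning-derived parameters."""
    # features: (channels, height, width); gamma, beta: (channels,)
    return gamma[:, None, None] * features + beta[:, None, None]

# Toy example: 4 feature maps of size 3x3, modulated per channel.
x = np.ones((4, 3, 3))
gamma = np.array([2.0, 0.5, 1.0, -1.0])
beta = np.array([0.0, 1.0, -1.0, 0.5])
out = film(x, gamma, beta)
```

In the full model, gamma and beta would come from an MLP over the question embedding; broadcasting over `None` axes applies one (scale, shift) pair to an entire feature map.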
[ISWC 2021] Zero-shot Visual Question Answering using Knowledge Graph
Organization: china-uk-zsl
Home Page: https://arxiv.org/abs/2107.05348
12-in-1: Multi-Task Vision and Language Representation Learning Web Demo
Organization: cloud-cv
Home Page: https://vilbert.cloudcv.org/
Strong baseline for visual question answering
User: cyanogenoid
PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"
User: davidmascharka
Home Page: https://arxiv.org/abs/1803.05268
PyTorch VQA implementation that achieved top performance in the VizWiz Grand Challenge (ECCV 2018): Answering Visual Questions from Blind People
User: denisdsh
PyTorch implementation of the paper "Visual Concept-Metaconcept Learner", NeurIPS 2019
User: glaciohound
Home Page: http://vcml.csail.mit.edu
A collection of computer vision projects & tools.
User: hanxinzi-ai
Official code repository for "Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning" (published at ICLR 2023)
User: ivonajdenkoska
Code for the NeurIPS 2019 paper "Self-Critical Reasoning for Robust Visual Question Answering"
User: jialinwu17
Bilinear attention networks for visual question answering
User: jnhwkim
Real-world photo sequence question answering system (MemexQA). CVPR 2018 and TPAMI 2019
User: junweiliang
Home Page: https://memexqa.cs.cmu.edu/
A PyTorch implementation of the Attention on Attention module (both self and guided variants) for Visual Question Answering
User: lucidrains
Implementation of 🦩 Flamingo, DeepMind's state-of-the-art few-shot visual question answering attention network, in PyTorch
User: lucidrains
Co-attending Regions and Detections for VQA
User: lupantech
MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts
User: lupantech
Home Page: https://mathvista.github.io/
PyTorch implementation of the winner of the VQA Challenge Workshop at CVPR 2017
User: markdtw
A PyTorch implementation of "A simple neural network module for relational reasoning", working on the CLEVR dataset
User: mesnico
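The Relation Network listed above is built around one idea: apply a shared function g to every ordered pair of object features (each pair concatenated with the question embedding), sum the results, and feed the sum to a readout f. A toy sketch with stand-in linear g and f (hypothetical placeholders for the small MLPs the paper uses, not the repository's code):

```python
import numpy as np
from itertools import product

def relation_network(objects, question, g, f):
    """Sum a shared pairwise function g over all ordered object pairs,
    each conditioned on the question, then apply the readout f."""
    pair_sum = sum(
        g(np.concatenate([o_i, o_j, question]))
        for o_i, o_j in product(objects, repeat=2)
    )
    return f(pair_sum)

# Stand-ins for the g and f MLPs, chosen for a hand-checkable result.
g = lambda v: v[:2]           # keep first two dims as the "relation" vector
f = lambda v: float(v.sum())  # scalar score

objects = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
question = np.array([1.0])
score = relation_network(objects, question, g, f)
```

The summation makes the module permutation-invariant over objects, which is why a single shared g suffices regardless of how many objects the scene contains.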
Deep Modular Co-Attention Networks for Visual Question Answering
Organization: milvlg
A lightweight, scalable, and general framework for visual question answering research
Organization: milvlg
Implementation of the CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering"
Organization: milvlg
Home Page: https://arxiv.org/abs/2303.01903
Large Language Models are Temporal and Causal Reasoners for Video Question Answering (EMNLP 2023)
Organization: mlvlab
Home Page: https://ikodoh.github.io/flipped_vqa_demo.html
Evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
Organization: mmmu-benchmark
Home Page: https://mmmu-benchmark.github.io/
Evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"
Organization: mmstar-benchmark
Home Page: https://mmstar-benchmark.github.io
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Organization: ofa-sys
User: paarthneekhara
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
User: peteanderson80
Home Page: http://panderson.me/up-down-attention/
Code to reproduce results in the ACL 2018 paper "Did the Model Understand the Question?"
User: pramodkaushik
[AAAI 2024] NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario
User: qiantianwen
[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
User: rentainhe
A collection of resources on applications of multi-modal learning in medical imaging
User: richard-peng-xia
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Organization: salesforce
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers
User: sdc17
Home Page: https://dachuanshi.com/UPop-Project/
CNN+LSTM, attention-based, and MUTAN-based models for Visual Question Answering
User: shivanshu-gupta
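Several entries above, including this CNN+LSTM repository, follow the classic VQA recipe: encode the image and the question separately, fuse the two feature vectors, and classify over a fixed answer vocabulary. A minimal sketch of the fusion-and-classify step using element-wise-product fusion, with toy features and weights that are purely illustrative:

```python
import numpy as np

def vqa_baseline(img_feat, q_feat, W):
    """Classic VQA baseline head: combine image and question features
    with an element-wise (Hadamard) product, then score each candidate
    answer with a linear classifier."""
    fused = img_feat * q_feat      # joint embedding
    logits = W @ fused             # one score per candidate answer
    return int(np.argmax(logits))  # predicted answer index

# Toy example: 2-dim features, 3 candidate answers.
img_feat = np.array([1.0, 2.0])   # stand-in for CNN image features
q_feat = np.array([3.0, 1.0])     # stand-in for LSTM question encoding
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])
pred = vqa_baseline(img_feat, q_feat, W)
```

Attention-based and MUTAN-based models in the same repository differ mainly in this fusion step: attention reweights image regions before pooling, and MUTAN replaces the Hadamard product with a low-rank bilinear interaction.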
Korean Visual Question Answering
Organization: sktbrain
Home Page: https://sktbrain.github.io/KVQA/
Bottom-up features extractor implemented in PyTorch
User: violetteshev
TensorFlow implementation of the CNN-LSTM, Relation Network, and text-only baselines for the paper "FigureQA: An Annotated Figure Dataset for Visual Reasoning"
User: vmichals
The Easy Visual Question Answering dataset
User: vzhou842
Home Page: https://pypi.org/project/easy-vqa/
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval)
User: yehli
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
User: yushi-hu
Home Page: https://tifa-benchmark.github.io/
Research code for the NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER adversarial training part
User: zhegan27
Home Page: https://arxiv.org/pdf/2006.06195.pdf
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey
Organization: zjukg
Home Page: http://arxiv.org/abs/2402.05391