Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with visual-question-answering
A curated list of projects in awesome lists tagged with visual-question-answering.
https://github.com/salesforce/blip
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
image-captioning image-text-retrieval vision-and-language-pre-training vision-language vision-language-transformer visual-question-answering visual-reasoning
Last synced: 30 Sep 2024
https://github.com/salesforce/BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
image-captioning image-text-retrieval vision-and-language-pre-training vision-language vision-language-transformer visual-question-answering visual-reasoning
Last synced: 31 Jul 2024
https://github.com/ofa-sys/ofa
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
chinese image-captioning multimodal pretrained-models pretraining prompt prompt-tuning referring-expression-comprehension text-to-image-synthesis vision-language visual-question-answering
Last synced: 30 Sep 2024
https://github.com/OFA-Sys/OFA
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
chinese image-captioning multimodal pretrained-models pretraining prompt prompt-tuning referring-expression-comprehension text-to-image-synthesis vision-language visual-question-answering
Last synced: 01 Aug 2024
https://github.com/peteanderson80/bottom-up-attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
caffe captioning-images faster-rcnn image-captioning mscoco mscoco-dataset visual-question-answering vqa
Last synced: 30 Sep 2024
https://github.com/lucidrains/flamingo-pytorch
Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch
artificial-intelligence attention-mechanism deep-learning transformers visual-question-answering
Last synced: 03 Oct 2024
https://github.com/yehli/xmodaler
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
cross-modal-retrieval image-captioning pretraining tden video-captioning vision-and-language visual-question-answering
Last synced: 30 Sep 2024
https://github.com/YehLi/xmodaler
X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
cross-modal-retrieval image-captioning pretraining tden video-captioning vision-and-language visual-question-answering
Last synced: 01 Aug 2024
https://github.com/jnhwkim/ban-vqa
Bilinear attention networks for visual question answering
attention bilinear-pooling pytorch-implmention visual-question-answering
Last synced: 01 Aug 2024
https://github.com/davidmascharka/tbd-nets
PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"
deep-learning machine-learning neural-networks pytorch visual-question-answering visualization vqa
Last synced: 07 Aug 2024
https://github.com/lupantech/MathVista
MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts
ai4math large-language-models large-multimadality-models machine-learning mathematics mathqa science visual-question-answering
Last synced: 31 Jul 2024
https://github.com/qiantianwen/NuScenes-QA
[AAAI 2024] NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario.
autonomous-driving vision-language visual-question-answering
Last synced: 31 Jul 2024
https://github.com/zhegan27/VILLA
Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER adversarial training part
adversarial-training neurips-2020 pretraining vision-and-language visual-question-answering
Last synced: 08 Aug 2024
https://github.com/China-UK-ZSL/ZS-F-VQA
[Paper][ISWC 2021] Zero-shot Visual Question Answering using Knowledge Graph
commonsense commonsense-reasoning fvqa knowledge-graph visual-question-answering vqa zero-shot zs-f-vqa zsl
Last synced: 08 Aug 2024
https://github.com/ai-forever/fusion_brain_aij2021
Creating multimodal multitask models
bilingual handwritten-text-recognition java-to-python multimodal-fusion multitask visual-question-answering zero-shot-object-detection
Last synced: 02 Aug 2024
https://github.com/lucidrains/aoa-pytorch
A Pytorch implementation of Attention on Attention module (both self and guided variants), for Visual Question Answering
attention attention-mechanism captioning visual-question-answering vqa
Last synced: 03 Oct 2024
https://github.com/adrianbzg/llama-multimodal-vqa
Multimodal Instruction Tuning for Llama 3
chatbot chatgpt gpt-4 huggingface instruction-tuning language-models llama llama2 llama3 multimodal multimodal-instruction-tuning visual-language-learning visual-question-answering vqa
Last synced: 27 Sep 2024
https://github.com/abachaa/VQA-Med-2021
VQA-Med 2021
medical-imaging radiology visual-question-answering visual-question-generation vqa vqa-dataset vqa-med
Last synced: 01 Aug 2024
https://github.com/simonesartoni/anndl-visual-questioning
Third project of the course "Artificial Neural Networks and Deep Learning", taken during the Master's degree at Polimi, concerning the creation of a neural network for the visual question answering problem using the VQA dataset. Authors: Simone Sartoni, Mattia Surricchio
artificial-neural-networks ipynb-jupyter-notebook visual-question-answering
Last synced: 27 Sep 2024
https://github.com/kritiksoman/relation-network
IPython Notebook showing a PyTorch implementation of the Google DeepMind paper on Relation Networks
ipynb-jupyter-notebook neural-networks nips-2017 pytorch-implementation visual-question-answering
Last synced: 27 Sep 2024