Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with visual-question-answering

A curated list of projects in awesome lists tagged with visual-question-answering.

https://github.com/salesforce/blip

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

image-captioning image-text-retrieval vision-and-language-pre-training vision-language vision-language-transformer visual-question-answering visual-reasoning

Last synced: 30 Sep 2024

https://github.com/salesforce/BLIP

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

image-captioning image-text-retrieval vision-and-language-pre-training vision-language vision-language-transformer visual-question-answering visual-reasoning

Last synced: 31 Jul 2024

https://github.com/ofa-sys/ofa

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

chinese image-captioning multimodal pretrained-models pretraining prompt prompt-tuning referring-expression-comprehension text-to-image-synthesis vision-language visual-question-answering

Last synced: 30 Sep 2024

https://github.com/OFA-Sys/OFA

Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

chinese image-captioning multimodal pretrained-models pretraining prompt prompt-tuning referring-expression-comprehension text-to-image-synthesis vision-language visual-question-answering

Last synced: 01 Aug 2024

https://github.com/peteanderson80/bottom-up-attention

Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome

caffe captioning-images faster-rcnn image-captioning mscoco mscoco-dataset visual-question-answering vqa

Last synced: 30 Sep 2024

https://github.com/lucidrains/flamingo-pytorch

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch

artificial-intelligence attention-mechanism deep-learning transformers visual-question-answering

Last synced: 03 Oct 2024

https://github.com/yehli/xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

cross-modal-retrieval image-captioning pretraining tden video-captioning vision-and-language visual-question-answering

Last synced: 30 Sep 2024

https://github.com/YehLi/xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

cross-modal-retrieval image-captioning pretraining tden video-captioning vision-and-language visual-question-answering

Last synced: 01 Aug 2024

https://github.com/jnhwkim/ban-vqa

Bilinear attention networks for visual question answering

attention bilinear-pooling pytorch-implmention visual-question-answering

Last synced: 01 Aug 2024

https://github.com/davidmascharka/tbd-nets

PyTorch implementation of "Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning"

deep-learning machine-learning neural-networks pytorch visual-question-answering visualization vqa

Last synced: 07 Aug 2024

https://github.com/lupantech/MathVista

MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts

ai4math large-language-models large-multimadality-models machine-learning mathematics mathqa science visual-question-answering

Last synced: 31 Jul 2024

https://github.com/qiantianwen/NuScenes-QA

[AAAI 2024] NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario.

autonomous-driving vision-language visual-question-answering

Last synced: 31 Jul 2024

https://github.com/zhegan27/VILLA

Research Code for NeurIPS 2020 Spotlight paper "Large-Scale Adversarial Training for Vision-and-Language Representation Learning": UNITER adversarial training part

adversarial-training neurips-2020 pretraining vision-and-language visual-question-answering

Last synced: 08 Aug 2024

https://github.com/China-UK-ZSL/ZS-F-VQA

[Paper][ISWC 2021] Zero-shot Visual Question Answering using Knowledge Graph

commonsense commonsense-reasoning fvqa knowledge-graph visual-question-answering vqa zero-shot zs-f-vqa zsl

Last synced: 08 Aug 2024

https://github.com/lucidrains/aoa-pytorch

A PyTorch implementation of the Attention on Attention module (both self and guided variants), for Visual Question Answering

attention attention-mechanism captioning visual-question-answering vqa

Last synced: 03 Oct 2024

https://github.com/simonesartoni/anndl-visual-questioning

Third project of the course "Artificial Neural Networks and Deep Learning", taken during a Master's degree at Polimi: building a neural network for the visual question answering problem using the VQA dataset. Authors: Simone Sartoni, Mattia Surricchio

artificial-neural-networks ipynb-jupyter-notebook visual-question-answering

Last synced: 27 Sep 2024

https://github.com/kritiksoman/relation-network

IPython notebook showing a PyTorch implementation of the Google DeepMind paper on Relation Networks

ipynb-jupyter-notebook neural-networks nips-2017 pytorch-implementation visual-question-answering

Last synced: 27 Sep 2024