Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
https://github.com/Eurus-Holmes/Awesome-Multimodal-Research
News
- 03/2023 - to-date information, run computations, or use third-party services. https://openai.com/blog/chatgpt-plugins
- 03/2023
- 03/2023 - world scenarios, exhibits human-level performance on various professional and academic benchmarks. https://openai.com/research/gpt-4
- 03/2023 - e-embodied-multimodal-language.html
- 03/2023 - chatgpt-and-whisper-apis
- 02/2023 - context learning for not only language tasks but also multimodal tasks. https://github.com/microsoft/unilm#llm--mllm-multimodal-llm
- 11/2022 - following), which is trained to follow an instruction in a prompt and provide a detailed response. https://openai.com/blog/chatgpt
- 04/2022 - e-2/
- 05/2021 - mum/
- 03/2021 - neurons/
- 01/2021 - E](https://openai.com/blog/dall-e/) creates new images from text. A step toward systems with deeper understanding of the world. https://openai.com/multimodal/
- 08/2022 - 3](https://arxiv.org/abs/2208.10442) is a general-purpose multimodal foundation model, which achieves state-of-the-art transfer performance on both vision and vision-language tasks. https://github.com/microsoft/unilm/tree/master/beit
- web browser - plugins#code-interpreter). We’ve also open-sourced the code for a knowledge base [retrieval plugin](https://github.com/openai/chatgpt-retrieval-plugin), to be self-hosted by any developer with information with which they’d like to augment ChatGPT."*
- 01/2023 - research-2022-beyond-language.html
Recent Workshop
- Social Intelligence in Humans and Robots
- LANTERN 2021 - world kNowledge, EACL 2021
- Multimodal Learning and Applications - ai.org/), [Language for 3D Scenes](http://language3dscenes.github.io/), CVPR 2021
- Advances in Language and Vision Research (ALVR)
- Visually Grounded Interaction and Language (ViGIL)
- Wordplay: When Language Meets Games
- NLP Beyond Text
- International Challenge on Compositional and Multimodal Perception
- Multimodal Video Analysis Workshop and Moments in Time Challenge
- Video Turing Test: Toward Human-Level Video Story Understanding
- Grand Challenge and Workshop on Human Multimodal Language
- Workshop on Multimodal Learning
- Language & Vision with applications to Video Understanding
- International Challenge on Activity Recognition (ActivityNet)
- The End-of-End-to-End: A Video Understanding Pentathlon
- Towards Human-Centric Image/Video Synthesis, and the 4th Look Into Person (LIP) Challenge
- Visual Question Answering and Dialog
Recent Tutorial
- Tutorials on Multimodal Machine Learning
- Multi-modal Information Extraction from Text, Semi-structured, and Tabular Data on the Web (Cutting-edge)
- Neuro-Symbolic Visual Reasoning and Program Synthesis
- Large Scale Holistic Video Understanding
- A Comprehensive Tutorial on Video Modeling
- Achieving Common Ground in Multi-modal Dialogue (Cutting-edge)
- Recent Advances in Vision-and-Language Research