https://github.com/hollobit/GenAI_LLM_timeline


# ChatGPT, GenerativeAI and LLMs Timeline

This repository organizes a timeline of key events (products, services, papers, GitHub, blog posts and news) that occurred before and after the ChatGPT announcement.

The timeline curates a variety of information, with a particular focus on LLMs and generative AI.

This may be one of the most eventful periods in recent history, so I thought it was important to preserve these memories and organized them here.

## Statistics

These diagrams were generated by ChatGPT's Code Interpreter.


## Contributing

Issues and pull requests are greatly appreciated. If you've never contributed to an open source project before, I'm more than happy to walk you through how to create a pull request.

You can start by [opening an issue](https://github.com/hollobit/BCAC_timeline/issues/new) describing the problem that you're looking to resolve and we'll go from there.

## Emoji

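arXiv :x:, arXiv browse PDF :book:, PDF :paperclip:, arxiv-vanity :orange_book:, paper page :house:, papers with code :eight_spoked_asterisk:, GitHub :octocat:

Text labels used in the timeline: HTML (arXiv HTML view), SL (arxiv-sanity-lite), SP (summarizepaper), GS (Google Scholar), SS (Semantic Scholar)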
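
The link row under each arXiv entry in the timeline follows a fixed pattern keyed on the arXiv ID. The following is a minimal sketch of how such a row could be generated. The function names `arxiv_links` and `timeline_entry` and the default `v1` version suffix are assumptions made for illustration; this is not the author's actual tooling, and the URL templates are simply copied from the entries in this timeline.

```python
def arxiv_links(arxiv_id: str, version: str = "v1") -> str:
    """Build the row of links shown under an arXiv timeline entry.

    Hypothetical helper: the URL templates mirror the entries in this
    timeline; the default version suffix "v1" is an assumption.
    """
    links = [
        (":x:", f"https://arxiv.org/abs/{arxiv_id}"),
        (":book:", f"https://browse.arxiv.org/pdf/{arxiv_id}.pdf"),
        (":paperclip:", f"https://arxiv.org/pdf/{arxiv_id}.pdf"),
        (":orange_book:", f"https://www.arxiv-vanity.com/papers/{arxiv_id}"),
        (":house:", f"https://huggingface.co/papers/{arxiv_id}"),
        ("HTML", f"https://browse.arxiv.org/html/{arxiv_id}{version}"),
        ("SL", f"https://arxiv-sanity-lite.com/?rank=pid&pid={arxiv_id}"),
        ("SP", f"https://www.summarizepaper.com/en/arxiv-id/{arxiv_id}{version}/"),
        ("GS", f"https://scholar.google.com/scholar_lookup?arxiv_id={arxiv_id}"),
        ("SS", f"https://api.semanticscholar.org/arXiv:{arxiv_id}"),
    ]
    # Each link is wrapped in parentheses and the links are comma-separated.
    return ", ".join(f"([{label}]({url}))" for label, url in links)


def timeline_entry(date: str, title: str, arxiv_id: str) -> str:
    """Format one two-line timeline entry: bullet with date and title, then links."""
    return f"* {date} - **{title}**\n{arxiv_links(arxiv_id)}"


if __name__ == "__main__":
    print(timeline_entry("05/15", "LoRA Learns Less and Forgets Less", "2405.09673"))
```

Running the script prints an entry in the same two-line shape as the arXiv items in the timeline, which can be pasted under the matching year heading.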

## License

This document is licensed under the [MIT license](https://opensource.org/licenses/mit-license.php) © Jonghong Jeon(전종홍)

## Timeline V2

### 2024

* 05/17 - **OpenAI strikes Reddit deal to train its AI on your posts**
([News](https://www.theverge.com/2024/5/16/24158529/reddit-openai-chatgpt-api-access-advertising))
* 05/17 - **OpenAI dissolves team focused on long-term AI risks, less than one year after announcing it**
([News](https://www.cnbc.com/2024/05/17/openai-superalignment-sutskever-leike.html))
* 05/17 - **International Scientific Report on the Safety of Advanced AI**
([Blog](https://www.gov.uk/government/publications/international-scientific-report-on-the-safety-of-advanced-ai))
* 05/16 - **TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction**
([:x:](https://arxiv.org/abs/2405.10315)), ([:book:](https://browse.arxiv.org/pdf/2405.10315.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10315.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10315)), ([:house:](https://huggingface.co/papers/2405.10315)), ([HTML](https://browse.arxiv.org/html/2405.10315v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.10315)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10315v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10315)), ([SS](https://api.semanticscholar.org/arXiv:2405.10315))
* 05/16 - **Toon3D: Seeing Cartoons from a New Perspective**
([:x:](https://arxiv.org/abs/2405.10320)), ([:book:](https://browse.arxiv.org/pdf/2405.10320.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10320.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10320)), ([:house:](https://huggingface.co/papers/2405.10320)), ([HTML](https://browse.arxiv.org/html/2405.10320v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.10320)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10320v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10320)), ([SS](https://api.semanticscholar.org/arXiv:2405.10320))
* 05/16 - **Testing the reliability of an AI-based large language model to extract ecological information from the scientific literature**
([News](https://www.nature.com/articles/s44185-024-00043-9))
* 05/16 - **Many-Shot In-Context Learning in Multimodal Foundation Models**
([:x:](https://arxiv.org/abs/2405.09798)), ([:book:](https://browse.arxiv.org/pdf/2405.09798.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09798.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09798)), ([:house:](https://huggingface.co/papers/2405.09798)), ([HTML](https://browse.arxiv.org/html/2405.09798v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09798)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09798v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09798)), ([SS](https://api.semanticscholar.org/arXiv:2405.09798))
* 05/16 - **How to Hit Pause on AI Before It’s Too Late**
([News](https://time.com/6978790/how-to-pause-artificial-intelligence/))
* 05/16 - **Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection**
([:x:](https://arxiv.org/abs/2405.10300)), ([:book:](https://browse.arxiv.org/pdf/2405.10300.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10300.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10300)), ([:house:](https://huggingface.co/papers/2405.10300)), ([HTML](https://browse.arxiv.org/html/2405.10300v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.10300)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10300v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10300)), ([SS](https://api.semanticscholar.org/arXiv:2405.10300))
* 05/16 - **GPT Store Mining and Analysis**
([:x:](https://arxiv.org/abs/2405.10210)), ([:book:](https://browse.arxiv.org/pdf/2405.10210.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10210.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10210)), ([:house:](https://huggingface.co/papers/2405.10210)), ([HTML](https://browse.arxiv.org/html/2405.10210v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.10210)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10210v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10210)), ([SS](https://api.semanticscholar.org/arXiv:2405.10210))
* 05/16 - **Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion**
([:x:](https://arxiv.org/abs/2405.09874)), ([:book:](https://browse.arxiv.org/pdf/2405.09874.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09874.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09874)), ([:house:](https://huggingface.co/papers/2405.09874)), ([HTML](https://browse.arxiv.org/html/2405.09874v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09874)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09874v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09874)), ([SS](https://api.semanticscholar.org/arXiv:2405.09874))
* 05/16 - **Chameleon: Mixed-Modal Early-Fusion Foundation Models**
([:x:](https://arxiv.org/abs/2405.09818)), ([:book:](https://browse.arxiv.org/pdf/2405.09818.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09818.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09818)), ([:house:](https://huggingface.co/papers/2405.09818)), ([HTML](https://browse.arxiv.org/html/2405.09818v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09818)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09818v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09818)), ([SS](https://api.semanticscholar.org/arXiv:2405.09818))
* 05/16 - **CAT3D: Create Anything in 3D with Multi-View Diffusion Models**
([:x:](https://arxiv.org/abs/2405.10314)), ([:book:](https://browse.arxiv.org/pdf/2405.10314.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10314.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10314)), ([:house:](https://huggingface.co/papers/2405.10314)), ([HTML](https://browse.arxiv.org/html/2405.10314v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.10314)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10314v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10314)), ([SS](https://api.semanticscholar.org/arXiv:2405.10314))
* 05/15 - **Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model**
([:x:](https://arxiv.org/abs/2405.09215)), ([:book:](https://browse.arxiv.org/pdf/2405.09215.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09215.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09215)), ([:house:](https://huggingface.co/papers/2405.09215)), ([HTML](https://browse.arxiv.org/html/2405.09215v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09215)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09215v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09215)), ([SS](https://api.semanticscholar.org/arXiv:2405.09215))
* 05/15 - **LoRA Learns Less and Forgets Less**
([:x:](https://arxiv.org/abs/2405.09673)), ([:book:](https://browse.arxiv.org/pdf/2405.09673.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09673.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09673)), ([:house:](https://huggingface.co/papers/2405.09673)), ([HTML](https://browse.arxiv.org/html/2405.09673v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09673)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09673v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09673)), ([SS](https://api.semanticscholar.org/arXiv:2405.09673))
* 05/15 - **Google’s invisible AI watermark will help identify generative text and video**
([News](https://www.theverge.com/2024/5/14/24155927/google-ai-synthid-watermark-text-video-io))
* 05/15 - **Google I/O 2024: everything announced**
([Blog](https://www.theverge.com/24153841/google-io-2024-ai-gemini-android-chrome-photos))
* 05/15 - **BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation**
([:x:](https://arxiv.org/abs/2405.09546)), ([:book:](https://browse.arxiv.org/pdf/2405.09546.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09546.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09546)), ([:house:](https://huggingface.co/papers/2405.09546)), ([HTML](https://browse.arxiv.org/html/2405.09546v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09546)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09546v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09546)), ([SS](https://api.semanticscholar.org/arXiv:2405.09546))
* 05/15 - **ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models**
([:x:](https://arxiv.org/abs/2405.09220)), ([:book:](https://browse.arxiv.org/pdf/2405.09220.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09220.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09220)), ([:house:](https://huggingface.co/papers/2405.09220)), ([HTML](https://browse.arxiv.org/html/2405.09220v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09220)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09220v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09220)), ([SS](https://api.semanticscholar.org/arXiv:2405.09220))
* 05/14 - **Understanding the performance gap between online and offline alignment algorithms**
([:x:](https://arxiv.org/abs/2405.08448)), ([:book:](https://browse.arxiv.org/pdf/2405.08448.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08448.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08448)), ([:house:](https://huggingface.co/papers/2405.08448)), ([HTML](https://browse.arxiv.org/html/2405.08448v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08448)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08448v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08448)), ([SS](https://api.semanticscholar.org/arXiv:2405.08448))
* 05/14 - **SpeechVerse: A Large-scale Generalizable Audio Language Model**
([:x:](https://arxiv.org/abs/2405.08295)), ([:book:](https://browse.arxiv.org/pdf/2405.08295.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08295.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08295)), ([:house:](https://huggingface.co/papers/2405.08295)), ([HTML](https://browse.arxiv.org/html/2405.08295v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08295)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08295v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08295)), ([SS](https://api.semanticscholar.org/arXiv:2405.08295))
* 05/14 - **SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models**
([:x:](https://arxiv.org/abs/2405.08317)), ([:book:](https://browse.arxiv.org/pdf/2405.08317.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08317.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08317)), ([:house:](https://huggingface.co/papers/2405.08317)), ([HTML](https://browse.arxiv.org/html/2405.08317v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08317)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08317v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08317)), ([SS](https://api.semanticscholar.org/arXiv:2405.08317))
* 05/14 - **No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding**
([:x:](https://arxiv.org/abs/2405.08344)), ([:book:](https://browse.arxiv.org/pdf/2405.08344.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08344.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08344)), ([:house:](https://huggingface.co/papers/2405.08344)), ([HTML](https://browse.arxiv.org/html/2405.08344v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08344)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08344v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08344)), ([SS](https://api.semanticscholar.org/arXiv:2405.08344))
* 05/14 - **Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding**
([:x:](https://arxiv.org/abs/2405.08748)), ([:book:](https://browse.arxiv.org/pdf/2405.08748.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08748.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08748)), ([:house:](https://huggingface.co/papers/2405.08748)), ([HTML](https://browse.arxiv.org/html/2405.08748v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08748)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08748v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08748)), ([SS](https://api.semanticscholar.org/arXiv:2405.08748))
* 05/14 - **Compositional Text-to-Image Generation with Dense Blob Representations**
([:x:](https://arxiv.org/abs/2405.08246)), ([:book:](https://browse.arxiv.org/pdf/2405.08246.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08246.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08246)), ([:house:](https://huggingface.co/papers/2405.08246)), ([HTML](https://browse.arxiv.org/html/2405.08246v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08246)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08246v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08246)), ([SS](https://api.semanticscholar.org/arXiv:2405.08246))
* 05/14 - **Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory**
([:x:](https://arxiv.org/abs/2405.08707)), ([:book:](https://browse.arxiv.org/pdf/2405.08707.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08707.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08707)), ([:house:](https://huggingface.co/papers/2405.08707)), ([HTML](https://browse.arxiv.org/html/2405.08707v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08707)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08707v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08707)), ([SS](https://api.semanticscholar.org/arXiv:2405.08707))
* 05/13 - **SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts**
([:x:](https://arxiv.org/abs/2405.07518)), ([:book:](https://browse.arxiv.org/pdf/2405.07518.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07518.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07518)), ([:house:](https://huggingface.co/papers/2405.07518)), ([HTML](https://browse.arxiv.org/html/2405.07518v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.07518)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07518v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07518)), ([SS](https://api.semanticscholar.org/arXiv:2405.07518))
* 05/13 - **RLHF Workflow: From Reward Modeling to Online RLHF**
([:x:](https://arxiv.org/abs/2405.07863)), ([:book:](https://browse.arxiv.org/pdf/2405.07863.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07863.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07863)), ([:house:](https://huggingface.co/papers/2405.07863)), ([HTML](https://browse.arxiv.org/html/2405.07863v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.07863)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07863v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07863)), ([SS](https://api.semanticscholar.org/arXiv:2405.07863))
* 05/13 - **Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots**
([:x:](https://arxiv.org/abs/2405.07990)), ([:book:](https://browse.arxiv.org/pdf/2405.07990.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07990.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07990)), ([:house:](https://huggingface.co/papers/2405.07990)), ([HTML](https://browse.arxiv.org/html/2405.07990v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.07990)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07990v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07990)), ([SS](https://api.semanticscholar.org/arXiv:2405.07990))
* 05/13 - **OpenAI unveils newest AI model, GPT-4o**
([News](https://edition.cnn.com/2024/05/13/tech/openai-altman-new-ai-model-gpt-4o/index.html))
* 05/13 - **MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels**
([:x:](https://arxiv.org/abs/2405.07526)), ([:book:](https://browse.arxiv.org/pdf/2405.07526.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07526.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07526)), ([:house:](https://huggingface.co/papers/2405.07526)), ([HTML](https://browse.arxiv.org/html/2405.07526v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.07526)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07526v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07526)), ([SS](https://api.semanticscholar.org/arXiv:2405.07526))
* 05/13 - **How Much Research Is Being Written by Large Language Models?**
([Blog](https://hai.stanford.edu/news/how-much-research-being-written-large-language-models))
* 05/13 - **Hello GPT-4o**
([Blog](https://openai.com/index/hello-gpt-4o/))
* 05/13 - **Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning**
([:x:](https://arxiv.org/abs/2405.08054)), ([:book:](https://browse.arxiv.org/pdf/2405.08054.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08054.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08054)), ([:house:](https://huggingface.co/papers/2405.08054)), ([HTML](https://browse.arxiv.org/html/2405.08054v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08054)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08054v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08054)), ([SS](https://api.semanticscholar.org/arXiv:2405.08054))
* 05/11 - **Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training**
([:x:](https://arxiv.org/abs/2405.06932)), ([:book:](https://browse.arxiv.org/pdf/2405.06932.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.06932.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.06932)), ([:house:](https://huggingface.co/papers/2405.06932)), ([HTML](https://browse.arxiv.org/html/2405.06932v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.06932)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.06932v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.06932)), ([SS](https://api.semanticscholar.org/arXiv:2405.06932))
* 05/11 - **LogoMotion: Visually Grounded Code Generation for Content-Aware Animation**
([:x:](https://arxiv.org/abs/2405.07065)), ([:book:](https://browse.arxiv.org/pdf/2405.07065.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07065.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07065)), ([:house:](https://huggingface.co/papers/2405.07065)), ([HTML](https://browse.arxiv.org/html/2405.07065v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.07065)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07065v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07065)), ([SS](https://api.semanticscholar.org/arXiv:2405.07065))
* 05/10 - **INSPECT - An open-source framework for large language model evaluations**
([Blog](https://ukgovernmentbeis.github.io/inspect_ai/))
* 05/10 - **AI Safety Institute releases new AI safety evaluations platform**
([News](https://www.gov.uk/government/news/ai-safety-institute-releases-new-ai-safety-evaluations-platform))
* 05/07 - **SUTRA: Scalable Multilingual Language Model Architecture**
([:x:](https://arxiv.org/abs/2405.06694)), ([:book:](https://browse.arxiv.org/pdf/2405.06694.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.06694.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.06694)), ([:house:](https://huggingface.co/papers/2405.06694)), ([HTML](https://browse.arxiv.org/html/2405.06694v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.06694)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.06694v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.06694)), ([SS](https://api.semanticscholar.org/arXiv:2405.06694))
* 05/07 - **Meta Releases Llama 3 Open-Source LLM**
([News](https://www.infoq.com/news/2024/05/meta-llama-3/))
* 05/03 - **What matters when building vision-language models?**
([:x:](https://arxiv.org/abs/2405.02246)), ([:book:](https://browse.arxiv.org/pdf/2405.02246.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.02246.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.02246)), ([:house:](https://huggingface.co/papers/2405.02246)), ([HTML](https://browse.arxiv.org/html/2405.02246v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.02246)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.02246v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.02246)), ([SS](https://api.semanticscholar.org/arXiv:2405.02246))
* 05/02 - **WildChat: 1M ChatGPT Interaction Logs in the Wild**
([:x:](https://arxiv.org/abs/2405.01470)), ([:book:](https://browse.arxiv.org/pdf/2405.01470.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01470.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01470)), ([:house:](https://huggingface.co/papers/2405.01470)), ([HTML](https://browse.arxiv.org/html/2405.01470v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01470)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01470v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01470)), ([SS](https://api.semanticscholar.org/arXiv:2405.01470))
* 05/02 - **StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation**
([:x:](https://arxiv.org/abs/2405.01434)), ([:book:](https://browse.arxiv.org/pdf/2405.01434.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01434.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01434)), ([:house:](https://huggingface.co/papers/2405.01434)), ([HTML](https://browse.arxiv.org/html/2405.01434v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01434)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01434v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01434)), ([SS](https://api.semanticscholar.org/arXiv:2405.01434))
* 05/02 - **Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models**
([:x:](https://arxiv.org/abs/2405.01535)), ([:book:](https://browse.arxiv.org/pdf/2405.01535.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01535.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01535)), ([:house:](https://huggingface.co/papers/2405.01535)), ([HTML](https://browse.arxiv.org/html/2405.01535v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01535)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01535v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01535)), ([SS](https://api.semanticscholar.org/arXiv:2405.01535))
* 05/02 - **NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment**
([:x:](https://arxiv.org/abs/2405.01481)), ([:book:](https://browse.arxiv.org/pdf/2405.01481.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01481.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01481)), ([:house:](https://huggingface.co/papers/2405.01481)), ([HTML](https://browse.arxiv.org/html/2405.01481v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01481)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01481v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01481)), ([SS](https://api.semanticscholar.org/arXiv:2405.01481))
* 05/02 - **LLM-AD: Large Language Model based Audio Description System**
([:x:](https://arxiv.org/abs/2405.00983)), ([:book:](https://browse.arxiv.org/pdf/2405.00983.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00983.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00983)), ([:house:](https://huggingface.co/papers/2405.00983)), ([HTML](https://browse.arxiv.org/html/2405.00983v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00983)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00983v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00983)), ([SS](https://api.semanticscholar.org/arXiv:2405.00983))
* 05/02 - **FLAME: Factuality-Aware Alignment for Large Language Models**
([:x:](https://arxiv.org/abs/2405.01525)), ([:book:](https://browse.arxiv.org/pdf/2405.01525.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01525.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01525)), ([:house:](https://huggingface.co/papers/2405.01525)), ([HTML](https://browse.arxiv.org/html/2405.01525v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01525)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01525v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01525)), ([SS](https://api.semanticscholar.org/arXiv:2405.01525))
* 05/02 - **Customizing Text-to-Image Models with a Single Image Pair**
([:x:](https://arxiv.org/abs/2405.01536)), ([:book:](https://browse.arxiv.org/pdf/2405.01536.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01536.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01536)), ([:house:](https://huggingface.co/papers/2405.01536)), ([HTML](https://browse.arxiv.org/html/2405.01536v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01536)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01536v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01536)), ([SS](https://api.semanticscholar.org/arXiv:2405.01536))
* 05/01 - **Spectrally Pruned Gaussian Fields with Neural Compensation**
([:x:](https://arxiv.org/abs/2405.00676)), ([:book:](https://browse.arxiv.org/pdf/2405.00676.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00676.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00676)), ([:house:](https://huggingface.co/papers/2405.00676)), ([HTML](https://browse.arxiv.org/html/2405.00676v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00676)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00676v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00676)), ([SS](https://api.semanticscholar.org/arXiv:2405.00676))
* 05/01 - **Self-Play Preference Optimization for Language Model Alignment**
([:x:](https://arxiv.org/abs/2405.00675)), ([:book:](https://browse.arxiv.org/pdf/2405.00675.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00675.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00675)), ([:house:](https://huggingface.co/papers/2405.00675)), ([HTML](https://browse.arxiv.org/html/2405.00675v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00675)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00675v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00675)), ([SS](https://api.semanticscholar.org/arXiv:2405.00675))
* 05/01 - **Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3**
([:x:](https://arxiv.org/abs/2405.00664)), ([:book:](https://browse.arxiv.org/pdf/2405.00664.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00664.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00664)), ([:house:](https://huggingface.co/papers/2405.00664)), ([HTML](https://browse.arxiv.org/html/2405.00664v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00664)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00664v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00664)), ([SS](https://api.semanticscholar.org/arXiv:2405.00664))
* 05/01 - **Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge**
([:x:](https://arxiv.org/abs/2405.00263)), ([:book:](https://browse.arxiv.org/pdf/2405.00263.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00263.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00263)), ([:house:](https://huggingface.co/papers/2405.00263)), ([HTML](https://browse.arxiv.org/html/2405.00263v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00263)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00263v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00263)), ([SS](https://api.semanticscholar.org/arXiv:2405.00263))
* 05/01 - **A Careful Examination of Large Language Model Performance on Grade School Arithmetic**
([:x:](https://arxiv.org/abs/2405.00332)), ([:book:](https://browse.arxiv.org/pdf/2405.00332.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00332.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00332)), ([:house:](https://huggingface.co/papers/2405.00332)), ([HTML](https://browse.arxiv.org/html/2405.00332v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00332)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00332v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00332)), ([SS](https://api.semanticscholar.org/arXiv:2405.00332))
* 04/30 - **Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation**
([:x:](https://arxiv.org/abs/2404.19752)), ([:book:](https://browse.arxiv.org/pdf/2404.19752.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19752.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19752)), ([:house:](https://huggingface.co/papers/2404.19752)), ([HTML](https://browse.arxiv.org/html/2404.19752v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19752)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19752v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19752)), ([SS](https://api.semanticscholar.org/arXiv:2404.19752))
* 04/30 - **STT: Stateful Tracking with Transformers for Autonomous Driving**
([:x:](https://arxiv.org/abs/2405.00236)), ([:book:](https://browse.arxiv.org/pdf/2405.00236.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00236.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00236)), ([:house:](https://huggingface.co/papers/2405.00236)), ([HTML](https://browse.arxiv.org/html/2405.00236v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00236)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00236v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00236)), ([SS](https://api.semanticscholar.org/arXiv:2405.00236))
* 04/30 - **SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound**
([:x:](https://arxiv.org/abs/2405.00233)), ([:book:](https://browse.arxiv.org/pdf/2405.00233.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00233.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00233)), ([:house:](https://huggingface.co/papers/2405.00233)), ([HTML](https://browse.arxiv.org/html/2405.00233v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00233)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00233v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00233)), ([SS](https://api.semanticscholar.org/arXiv:2405.00233))
* 04/30 - **Octopus v4: Graph of language models**
([:x:](https://arxiv.org/abs/2404.19296)), ([:book:](https://browse.arxiv.org/pdf/2404.19296.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19296.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19296)), ([:house:](https://huggingface.co/papers/2404.19296)), ([HTML](https://browse.arxiv.org/html/2404.19296v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19296)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19296v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19296)), ([SS](https://api.semanticscholar.org/arXiv:2404.19296))
* 04/30 - **MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model**
([:x:](https://arxiv.org/abs/2404.19759)), ([:book:](https://browse.arxiv.org/pdf/2404.19759.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19759.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19759)), ([:house:](https://huggingface.co/papers/2404.19759)), ([HTML](https://browse.arxiv.org/html/2404.19759v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19759)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19759v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19759)), ([SS](https://api.semanticscholar.org/arXiv:2404.19759))
* 04/30 - **MicroDreamer: Zero-shot 3D Generation in ~20 Seconds by Score-based Iterative Reconstruction**
([:x:](https://arxiv.org/abs/2404.19525)), ([:book:](https://browse.arxiv.org/pdf/2404.19525.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19525.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19525)), ([:house:](https://huggingface.co/papers/2404.19525)), ([HTML](https://browse.arxiv.org/html/2404.19525v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19525)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19525v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19525)), ([SS](https://api.semanticscholar.org/arXiv:2404.19525))
* 04/30 - **Lightplane: Highly-Scalable Components for Neural 3D Fields**
([:x:](https://arxiv.org/abs/2404.19760)), ([:book:](https://browse.arxiv.org/pdf/2404.19760.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19760.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19760)), ([:house:](https://huggingface.co/papers/2404.19760)), ([HTML](https://browse.arxiv.org/html/2404.19760v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19760)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19760v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19760)), ([SS](https://api.semanticscholar.org/arXiv:2404.19760))
* 04/30 - **KAN: Kolmogorov-Arnold Networks**
([:x:](https://arxiv.org/abs/2404.19756)), ([:book:](https://browse.arxiv.org/pdf/2404.19756.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19756.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19756)), ([:house:](https://huggingface.co/papers/2404.19756)), ([HTML](https://browse.arxiv.org/html/2404.19756v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19756)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19756v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19756)), ([SS](https://api.semanticscholar.org/arXiv:2404.19756))
* 04/30 - **Iterative Reasoning Preference Optimization**
([:x:](https://arxiv.org/abs/2404.19733)), ([:book:](https://browse.arxiv.org/pdf/2404.19733.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19733.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19733)), ([:house:](https://huggingface.co/papers/2404.19733)), ([HTML](https://browse.arxiv.org/html/2404.19733v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19733)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19733v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19733)), ([SS](https://api.semanticscholar.org/arXiv:2404.19733))
* 04/30 - **Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting**
([:x:](https://arxiv.org/abs/2404.19758)), ([:book:](https://browse.arxiv.org/pdf/2404.19758.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19758.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19758)), ([:house:](https://huggingface.co/papers/2404.19758)), ([HTML](https://browse.arxiv.org/html/2404.19758v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19758)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19758v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19758)), ([SS](https://api.semanticscholar.org/arXiv:2404.19758))
* 04/30 - **InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation**
([:x:](https://arxiv.org/abs/2404.19427)), ([:book:](https://browse.arxiv.org/pdf/2404.19427.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19427.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19427)), ([:house:](https://huggingface.co/papers/2404.19427)), ([HTML](https://browse.arxiv.org/html/2404.19427v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19427)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19427v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19427)), ([SS](https://api.semanticscholar.org/arXiv:2404.19427))
* 04/30 - **GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting**
([:x:](https://arxiv.org/abs/2404.19702)), ([:book:](https://browse.arxiv.org/pdf/2404.19702.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19702.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19702)), ([:house:](https://huggingface.co/papers/2404.19702)), ([HTML](https://browse.arxiv.org/html/2404.19702v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19702)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19702v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19702)), ([SS](https://api.semanticscholar.org/arXiv:2404.19702))
* 04/30 - **Extending Llama-3's Context Ten-Fold Overnight**
([:x:](https://arxiv.org/abs/2404.19553)), ([:book:](https://browse.arxiv.org/pdf/2404.19553.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19553.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19553)), ([:house:](https://huggingface.co/papers/2404.19553)), ([HTML](https://browse.arxiv.org/html/2404.19553v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19553)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19553v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19553)), ([SS](https://api.semanticscholar.org/arXiv:2404.19553))
* 04/30 - **DOCCI: Descriptions of Connected and Contrasting Images**
([:x:](https://arxiv.org/abs/2404.19753)), ([:book:](https://browse.arxiv.org/pdf/2404.19753.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19753.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19753)), ([:house:](https://huggingface.co/papers/2404.19753)), ([HTML](https://browse.arxiv.org/html/2404.19753v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19753)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19753v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19753)), ([SS](https://api.semanticscholar.org/arXiv:2404.19753))
* 04/30 - **Better & Faster Large Language Models via Multi-token Prediction**
([:x:](https://arxiv.org/abs/2404.19737)), ([:book:](https://browse.arxiv.org/pdf/2404.19737.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19737.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19737)), ([:house:](https://huggingface.co/papers/2404.19737)), ([HTML](https://browse.arxiv.org/html/2404.19737v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19737)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19737v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19737)), ([SS](https://api.semanticscholar.org/arXiv:2404.19737))
* 04/29 - **Stylus: Automatic Adapter Selection for Diffusion Models**
([:x:](https://arxiv.org/abs/2404.18928)), ([:book:](https://browse.arxiv.org/pdf/2404.18928.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18928.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18928)), ([:house:](https://huggingface.co/papers/2404.18928)), ([HTML](https://browse.arxiv.org/html/2404.18928v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18928)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18928v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18928)), ([SS](https://api.semanticscholar.org/arXiv:2404.18928))
* 04/29 - **SAGS: Structure-Aware 3D Gaussian Splatting**
([:x:](https://arxiv.org/abs/2404.19149)), ([:book:](https://browse.arxiv.org/pdf/2404.19149.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19149.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19149)), ([:house:](https://huggingface.co/papers/2404.19149)), ([HTML](https://browse.arxiv.org/html/2404.19149v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19149)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19149v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19149)), ([SS](https://api.semanticscholar.org/arXiv:2404.19149))
* 04/29 - **Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models**
([:x:](https://arxiv.org/abs/2404.18796)), ([:book:](https://browse.arxiv.org/pdf/2404.18796.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18796.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18796)), ([:house:](https://huggingface.co/papers/2404.18796)), ([HTML](https://browse.arxiv.org/html/2404.18796v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18796)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18796v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18796)), ([SS](https://api.semanticscholar.org/arXiv:2404.18796))
* 04/29 - **NIST AI RMF Generative AI Profile**
([News](https://www.nist.gov/news-events/news/2024/04/department-commerce-announces-new-actions-implement-president-bidens)),
* 04/29 - **LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report**
([:x:](https://arxiv.org/abs/2405.00732)), ([:book:](https://browse.arxiv.org/pdf/2405.00732.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00732.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00732)), ([:house:](https://huggingface.co/papers/2405.00732)), ([HTML](https://browse.arxiv.org/html/2405.00732v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00732)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00732v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00732)), ([SS](https://api.semanticscholar.org/arXiv:2405.00732))
* 04/29 - **Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting**
([:x:](https://arxiv.org/abs/2404.18911)), ([:book:](https://browse.arxiv.org/pdf/2404.18911.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18911.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18911)), ([:house:](https://huggingface.co/papers/2404.18911)), ([HTML](https://browse.arxiv.org/html/2404.18911v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18911)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18911v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18911)), ([SS](https://api.semanticscholar.org/arXiv:2404.18911))
* 04/29 - **Capabilities of Gemini Models in Medicine**
([:x:](https://arxiv.org/abs/2404.18416)), ([:book:](https://browse.arxiv.org/pdf/2404.18416.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18416.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18416)), ([:house:](https://huggingface.co/papers/2404.18416)), ([HTML](https://browse.arxiv.org/html/2404.18416v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18416)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18416v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18416)), ([SS](https://api.semanticscholar.org/arXiv:2404.18416))
* 04/28 - **Paint by Inpaint: Learning to Add Image Objects by Removing Them First**
([:x:](https://arxiv.org/abs/2404.18212)), ([:book:](https://browse.arxiv.org/pdf/2404.18212.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18212.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18212)), ([:house:](https://huggingface.co/papers/2404.18212)), ([HTML](https://browse.arxiv.org/html/2404.18212v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18212)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18212v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18212)), ([SS](https://api.semanticscholar.org/arXiv:2404.18212))
* 04/28 - **LEGENT: Open Platform for Embodied Agents**
([:x:](https://arxiv.org/abs/2404.18243)), ([:book:](https://browse.arxiv.org/pdf/2404.18243.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18243.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18243)), ([:house:](https://huggingface.co/papers/2404.18243)), ([HTML](https://browse.arxiv.org/html/2404.18243v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18243)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18243v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18243)), ([SS](https://api.semanticscholar.org/arXiv:2404.18243))
* 04/27 - **Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations**
([:x:](https://arxiv.org/abs/2404.17521)), ([:book:](https://browse.arxiv.org/pdf/2404.17521.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.17521.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.17521)), ([:house:](https://huggingface.co/papers/2404.17521)), ([HTML](https://browse.arxiv.org/html/2404.17521v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.17521)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.17521v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.17521)), ([SS](https://api.semanticscholar.org/arXiv:2404.17521))
* 04/26 - **MaPa: Text-driven Photorealistic Material Painting for 3D Shapes**
([:x:](https://arxiv.org/abs/2404.17569)), ([:book:](https://browse.arxiv.org/pdf/2404.17569.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.17569.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.17569)), ([:house:](https://huggingface.co/papers/2404.17569)), ([HTML](https://browse.arxiv.org/html/2404.17569v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.17569)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.17569v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.17569)), ([SS](https://api.semanticscholar.org/arXiv:2404.17569))
* 04/26 - **BlenderAlchemy: Editing 3D Graphics with Vision-Language Models**
([:x:](https://arxiv.org/abs/2404.17672)), ([:book:](https://browse.arxiv.org/pdf/2404.17672.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.17672.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.17672)), ([:house:](https://huggingface.co/papers/2404.17672)), ([HTML](https://browse.arxiv.org/html/2404.17672v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.17672)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.17672v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.17672)), ([SS](https://api.semanticscholar.org/arXiv:2404.17672))
* 04/25 - **Tele-FLM Technical Report**
([:x:](https://arxiv.org/abs/2404.16645)), ([:book:](https://browse.arxiv.org/pdf/2404.16645.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16645.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16645)), ([:house:](https://huggingface.co/papers/2404.16645)), ([HTML](https://browse.arxiv.org/html/2404.16645v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16645)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16645v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16645)), ([SS](https://api.semanticscholar.org/arXiv:2404.16645))
* 04/25 - **SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension**
([:x:](https://arxiv.org/abs/2404.16790)), ([:book:](https://browse.arxiv.org/pdf/2404.16790.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16790.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16790)), ([:house:](https://huggingface.co/papers/2404.16790)), ([HTML](https://browse.arxiv.org/html/2404.16790v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16790)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16790v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16790)), ([SS](https://api.semanticscholar.org/arXiv:2404.16790))
* 04/25 - **Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings**
([:x:](https://arxiv.org/abs/2404.16820)), ([:book:](https://browse.arxiv.org/pdf/2404.16820.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16820.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16820)), ([:house:](https://huggingface.co/papers/2404.16820)), ([HTML](https://browse.arxiv.org/html/2404.16820v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16820)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16820v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16820)), ([SS](https://api.semanticscholar.org/arXiv:2404.16820))
* 04/25 - **PLLaVA: Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning**
([:x:](https://arxiv.org/abs/2404.16994)), ([:book:](https://browse.arxiv.org/pdf/2404.16994.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16994.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16994)), ([:house:](https://huggingface.co/papers/2404.16994)), ([HTML](https://browse.arxiv.org/html/2404.16994v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16994)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16994v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16994)), ([SS](https://api.semanticscholar.org/arXiv:2404.16994))
* 04/25 - **Make Your LLM Fully Utilize the Context**
([:x:](https://arxiv.org/abs/2404.16811)), ([:book:](https://browse.arxiv.org/pdf/2404.16811.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16811.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16811)), ([:house:](https://huggingface.co/papers/2404.16811)), ([HTML](https://browse.arxiv.org/html/2404.16811v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16811)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16811v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16811)), ([SS](https://api.semanticscholar.org/arXiv:2404.16811))
* 04/25 - **List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs**
([:x:](https://arxiv.org/abs/2404.16375)), ([:book:](https://browse.arxiv.org/pdf/2404.16375.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16375.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16375)), ([:house:](https://huggingface.co/papers/2404.16375)), ([HTML](https://browse.arxiv.org/html/2404.16375v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16375)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16375v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16375)), ([SS](https://api.semanticscholar.org/arXiv:2404.16375))
* 04/25 - **Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding**
([:x:](https://arxiv.org/abs/2404.16710)), ([:book:](https://browse.arxiv.org/pdf/2404.16710.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16710.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16710)), ([:house:](https://huggingface.co/papers/2404.16710)), ([HTML](https://browse.arxiv.org/html/2404.16710v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16710)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16710v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16710)), ([SS](https://api.semanticscholar.org/arXiv:2404.16710))
* 04/25 - **Interactive3D: Create What You Want by Interactive 3D Generation**
([:x:](https://arxiv.org/abs/2404.16510)), ([:book:](https://browse.arxiv.org/pdf/2404.16510.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16510.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16510)), ([:house:](https://huggingface.co/papers/2404.16510)), ([HTML](https://browse.arxiv.org/html/2404.16510v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16510)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16510v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16510)), ([SS](https://api.semanticscholar.org/arXiv:2404.16510))
* 04/25 - **How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites**
([:x:](https://arxiv.org/abs/2404.16821)), ([:book:](https://browse.arxiv.org/pdf/2404.16821.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16821.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16821)), ([:house:](https://huggingface.co/papers/2404.16821)), ([HTML](https://browse.arxiv.org/html/2404.16821v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16821)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16821v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16821)), ([SS](https://api.semanticscholar.org/arXiv:2404.16821))
* 04/25 - **ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving**
([:x:](https://arxiv.org/abs/2404.16771)), ([:book:](https://browse.arxiv.org/pdf/2404.16771.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16771.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16771)), ([:house:](https://huggingface.co/papers/2404.16771)), ([HTML](https://browse.arxiv.org/html/2404.16771v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16771)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16771v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16771)), ([SS](https://api.semanticscholar.org/arXiv:2404.16771))
* 04/24 - **XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference**
([:x:](https://arxiv.org/abs/2404.15420)), ([:book:](https://browse.arxiv.org/pdf/2404.15420.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15420.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15420)), ([:house:](https://huggingface.co/papers/2404.15420)), ([HTML](https://browse.arxiv.org/html/2404.15420v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15420)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15420v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15420)), ([SS](https://api.semanticscholar.org/arXiv:2404.15420))
* 04/24 - **The Ethics of Advanced AI Assistants**
([:x:](https://arxiv.org/abs/2404.16244)), ([:book:](https://browse.arxiv.org/pdf/2404.16244.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16244.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16244)), ([:house:](https://huggingface.co/papers/2404.16244)), ([HTML](https://browse.arxiv.org/html/2404.16244v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16244)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16244v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16244)), ([SS](https://api.semanticscholar.org/arXiv:2404.16244))
* 04/24 - **PuLID: Pure and Lightning ID Customization via Contrastive Alignment**
([:x:](https://arxiv.org/abs/2404.16022)), ([:book:](https://browse.arxiv.org/pdf/2404.16022.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16022.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16022)), ([:house:](https://huggingface.co/papers/2404.16022)), ([HTML](https://browse.arxiv.org/html/2404.16022v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16022)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16022v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16022)), ([SS](https://api.semanticscholar.org/arXiv:2404.16022))
* 04/24 - **NeRF-XL: Scaling NeRFs with Multiple GPUs**
([:x:](https://arxiv.org/abs/2404.16221)), ([:book:](https://browse.arxiv.org/pdf/2404.16221.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16221.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16221)), ([:house:](https://huggingface.co/papers/2404.16221)), ([HTML](https://browse.arxiv.org/html/2404.16221v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16221)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16221v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16221)), ([SS](https://api.semanticscholar.org/arXiv:2404.16221))
* 04/24 - **MotionMaster: Training-free Camera Motion Transfer For Video Generation**
([:x:](https://arxiv.org/abs/2404.15789)), ([:book:](https://browse.arxiv.org/pdf/2404.15789.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15789.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15789)), ([:house:](https://huggingface.co/papers/2404.15789)), ([HTML](https://browse.arxiv.org/html/2404.15789v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15789)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15789v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15789)), ([SS](https://api.semanticscholar.org/arXiv:2404.15789))
* 04/24 - **MoDE: CLIP Data Experts via Clustering**
([:x:](https://arxiv.org/abs/2404.16030)), ([:book:](https://browse.arxiv.org/pdf/2404.16030.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16030.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16030)), ([:house:](https://huggingface.co/papers/2404.16030)), ([HTML](https://browse.arxiv.org/html/2404.16030v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16030)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16030v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16030)), ([SS](https://api.semanticscholar.org/arXiv:2404.16030))
* 04/24 - **MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI**
([:x:](https://arxiv.org/abs/2404.16006)), ([:book:](https://browse.arxiv.org/pdf/2404.16006.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16006.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16006)), ([:house:](https://huggingface.co/papers/2404.16006)), ([HTML](https://browse.arxiv.org/html/2404.16006v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16006)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16006v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16006)), ([SS](https://api.semanticscholar.org/arXiv:2404.16006))
* 04/24 - **MaGGIe: Masked Guided Gradual Human Instance Matting**
([:x:](https://arxiv.org/abs/2404.16035)), ([:book:](https://browse.arxiv.org/pdf/2404.16035.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16035.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16035)), ([:house:](https://huggingface.co/papers/2404.16035)), ([HTML](https://browse.arxiv.org/html/2404.16035v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16035)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16035v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16035)), ([SS](https://api.semanticscholar.org/arXiv:2404.16035))
* 04/24 - **ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning**
([:x:](https://arxiv.org/abs/2404.15449)), ([:book:](https://browse.arxiv.org/pdf/2404.15449.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15449.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15449)), ([:house:](https://huggingface.co/papers/2404.15449)), ([HTML](https://browse.arxiv.org/html/2404.15449v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15449)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15449v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15449)), ([SS](https://api.semanticscholar.org/arXiv:2404.15449))
* 04/24 - **Editable Image Elements for Controllable Synthesis**
([:x:](https://arxiv.org/abs/2404.16029)), ([:book:](https://browse.arxiv.org/pdf/2404.16029.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16029.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16029)), ([:house:](https://huggingface.co/papers/2404.16029)), ([HTML](https://browse.arxiv.org/html/2404.16029v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16029)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16029v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16029)), ([SS](https://api.semanticscholar.org/arXiv:2404.16029))
* 04/24 - **CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data**
([:x:](https://arxiv.org/abs/2404.15653)), ([:book:](https://browse.arxiv.org/pdf/2404.15653.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15653.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15653)), ([:house:](https://huggingface.co/papers/2404.15653)), ([HTML](https://browse.arxiv.org/html/2404.15653v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15653)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15653v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15653)), ([SS](https://api.semanticscholar.org/arXiv:2404.15653))
* 04/24 - **BASS: Batched Attention-optimized Speculative Sampling**
([:x:](https://arxiv.org/abs/2404.15778)), ([:book:](https://browse.arxiv.org/pdf/2404.15778.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15778.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15778)), ([:house:](https://huggingface.co/papers/2404.15778)), ([HTML](https://browse.arxiv.org/html/2404.15778v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15778)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15778v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15778)), ([SS](https://api.semanticscholar.org/arXiv:2404.15778))
* 04/23 - **Transformers Can Represent n-gram Language Models**
([:x:](https://arxiv.org/abs/2404.14994)), ([:book:](https://browse.arxiv.org/pdf/2404.14994.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14994.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14994)), ([:house:](https://huggingface.co/papers/2404.14994)), ([HTML](https://browse.arxiv.org/html/2404.14994v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14994)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14994v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14994)), ([SS](https://api.semanticscholar.org/arXiv:2404.14994))
* 04/23 - **Pegasus-v1 Technical Report**
([:x:](https://arxiv.org/abs/2404.14687)), ([:book:](https://browse.arxiv.org/pdf/2404.14687.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14687.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14687)), ([:house:](https://huggingface.co/papers/2404.14687)), ([HTML](https://browse.arxiv.org/html/2404.14687v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14687)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14687v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14687)), ([SS](https://api.semanticscholar.org/arXiv:2404.14687))
* 04/23 - **Multi-Head Mixture-of-Experts**
([:x:](https://arxiv.org/abs/2404.15045)), ([:book:](https://browse.arxiv.org/pdf/2404.15045.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15045.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15045)), ([:house:](https://huggingface.co/papers/2404.15045)), ([HTML](https://browse.arxiv.org/html/2404.15045v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15045)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15045v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15045)), ([SS](https://api.semanticscholar.org/arXiv:2404.15045))
* 04/23 - **FlashSpeech: Efficient Zero-Shot Speech Synthesis**
([:x:](https://arxiv.org/abs/2404.14700)), ([:book:](https://browse.arxiv.org/pdf/2404.14700.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14700.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14700)), ([:house:](https://huggingface.co/papers/2404.14700)), ([HTML](https://browse.arxiv.org/html/2404.14700v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14700)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14700v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14700)), ([SS](https://api.semanticscholar.org/arXiv:2404.14700))
* 04/22 - **SnapKV: LLM Knows What You are Looking for Before Generation**
([:x:](https://arxiv.org/abs/2404.14469)), ([:book:](https://browse.arxiv.org/pdf/2404.14469.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14469.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14469)), ([:house:](https://huggingface.co/papers/2404.14469)), ([HTML](https://browse.arxiv.org/html/2404.14469v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14469)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14469v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14469)), ([SS](https://api.semanticscholar.org/arXiv:2404.14469))
* 04/22 - **SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation**
([:x:](https://arxiv.org/abs/2404.14396)), ([:book:](https://browse.arxiv.org/pdf/2404.14396.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14396.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14396)), ([:house:](https://huggingface.co/papers/2404.14396)), ([HTML](https://browse.arxiv.org/html/2404.14396v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14396)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14396v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14396)), ([SS](https://api.semanticscholar.org/arXiv:2404.14396))
* 04/22 - **Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer**
([:x:](https://arxiv.org/abs/2404.14351)), ([:book:](https://browse.arxiv.org/pdf/2404.14351.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14351.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14351)), ([:house:](https://huggingface.co/papers/2404.14351)), ([HTML](https://browse.arxiv.org/html/2404.14351v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14351)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14351v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14351)), ([SS](https://api.semanticscholar.org/arXiv:2404.14351))
* 04/22 - **Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone**
([:x:](https://arxiv.org/abs/2404.14219)), ([:book:](https://browse.arxiv.org/pdf/2404.14219.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14219.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14219)), ([:house:](https://huggingface.co/papers/2404.14219)), ([HTML](https://browse.arxiv.org/html/2404.14219v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14219)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14219v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14219)), ([SS](https://api.semanticscholar.org/arXiv:2404.14219))
* 04/22 - **OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework**
([:x:](https://arxiv.org/abs/2404.14619)), ([:book:](https://browse.arxiv.org/pdf/2404.14619.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14619.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14619)), ([:house:](https://huggingface.co/papers/2404.14619)), ([HTML](https://browse.arxiv.org/html/2404.14619v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14619)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14619v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14619)), ([SS](https://api.semanticscholar.org/arXiv:2404.14619))
* 04/22 - **MultiBooth: Towards Generating All Your Concepts in an Image from Text**
([:x:](https://arxiv.org/abs/2404.14239)), ([:book:](https://browse.arxiv.org/pdf/2404.14239.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14239.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14239)), ([:house:](https://huggingface.co/papers/2404.14239)), ([HTML](https://browse.arxiv.org/html/2404.14239v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14239)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14239v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14239)), ([SS](https://api.semanticscholar.org/arXiv:2404.14239))
* 04/22 - **Learning H-Infinity Locomotion Control**
([:x:](https://arxiv.org/abs/2404.14405)), ([:book:](https://browse.arxiv.org/pdf/2404.14405.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14405.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14405)), ([:house:](https://huggingface.co/papers/2404.14405)), ([HTML](https://browse.arxiv.org/html/2404.14405v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14405)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14405v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14405)), ([SS](https://api.semanticscholar.org/arXiv:2404.14405))
* 04/22 - **How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study**
([:x:](https://arxiv.org/abs/2404.14047)), ([:book:](https://browse.arxiv.org/pdf/2404.14047.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14047.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14047)), ([:house:](https://huggingface.co/papers/2404.14047)), ([HTML](https://browse.arxiv.org/html/2404.14047v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14047)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14047v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14047)), ([SS](https://api.semanticscholar.org/arXiv:2404.14047))
* 04/22 - **Align Your Steps: Optimizing Sampling Schedules in Diffusion Models**
([:x:](https://arxiv.org/abs/2404.14507)), ([:book:](https://browse.arxiv.org/pdf/2404.14507.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14507.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14507)), ([:house:](https://huggingface.co/papers/2404.14507)), ([HTML](https://browse.arxiv.org/html/2404.14507v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14507)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14507v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14507)), ([SS](https://api.semanticscholar.org/arXiv:2404.14507))
* 04/22 - **A Multimodal Automated Interpretability Agent**
([:x:](https://arxiv.org/abs/2404.14394)), ([:book:](https://browse.arxiv.org/pdf/2404.14394.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14394.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14394)), ([:house:](https://huggingface.co/papers/2404.14394)), ([HTML](https://browse.arxiv.org/html/2404.14394v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14394)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14394v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14394)), ([SS](https://api.semanticscholar.org/arXiv:2404.14394))
* 04/21 - **Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis**
([:x:](https://arxiv.org/abs/2404.13686)), ([:book:](https://browse.arxiv.org/pdf/2404.13686.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13686.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13686)), ([:house:](https://huggingface.co/papers/2404.13686)), ([HTML](https://browse.arxiv.org/html/2404.13686v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13686)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13686v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13686)), ([SS](https://api.semanticscholar.org/arXiv:2404.13686))
* 04/21 - **AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs**
([:x:](https://arxiv.org/abs/2404.16873)), ([:book:](https://browse.arxiv.org/pdf/2404.16873.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16873.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16873)), ([:house:](https://huggingface.co/papers/2404.16873)), ([HTML](https://browse.arxiv.org/html/2404.16873v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16873)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16873v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16873)), ([SS](https://api.semanticscholar.org/arXiv:2404.16873))
* 04/20 - **Music Consistency Models**
([:x:](https://arxiv.org/abs/2404.13358)), ([:book:](https://browse.arxiv.org/pdf/2404.13358.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13358.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13358)), ([:house:](https://huggingface.co/papers/2404.13358)), ([HTML](https://browse.arxiv.org/html/2404.13358v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13358)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13358v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13358)), ([SS](https://api.semanticscholar.org/arXiv:2404.13358))
* 04/19 - **The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions**
([:x:](https://arxiv.org/abs/2404.13208)), ([:book:](https://browse.arxiv.org/pdf/2404.13208.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13208.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13208)), ([:house:](https://huggingface.co/papers/2404.13208)), ([HTML](https://browse.arxiv.org/html/2404.13208v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13208)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13208v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13208)), ([SS](https://api.semanticscholar.org/arXiv:2404.13208))
* 04/19 - **TextSquare: Scaling up Text-Centric Visual Instruction Tuning**
([:x:](https://arxiv.org/abs/2404.12803)), ([:book:](https://browse.arxiv.org/pdf/2404.12803.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12803.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12803)), ([:house:](https://huggingface.co/papers/2404.12803)), ([HTML](https://browse.arxiv.org/html/2404.12803v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12803)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12803v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12803)), ([SS](https://api.semanticscholar.org/arXiv:2404.12803))
* 04/19 - **PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation**
([:x:](https://arxiv.org/abs/2404.13026)), ([:book:](https://browse.arxiv.org/pdf/2404.13026.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13026.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13026)), ([:house:](https://huggingface.co/papers/2404.13026)), ([HTML](https://browse.arxiv.org/html/2404.13026v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13026)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13026v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13026)), ([SS](https://api.semanticscholar.org/arXiv:2404.13026))
* 04/19 - **LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency**
([:x:](https://arxiv.org/abs/2404.12872)), ([:book:](https://browse.arxiv.org/pdf/2404.12872.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12872.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12872)), ([:house:](https://huggingface.co/papers/2404.12872)), ([HTML](https://browse.arxiv.org/html/2404.12872v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12872)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12872v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12872)), ([SS](https://api.semanticscholar.org/arXiv:2404.12872))
* 04/19 - **How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples**
([:x:](https://arxiv.org/abs/2404.12653)), ([:book:](https://browse.arxiv.org/pdf/2404.12653.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12653.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12653)), ([:house:](https://huggingface.co/papers/2404.12653)), ([HTML](https://browse.arxiv.org/html/2404.12653v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12653)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12653v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12653)), ([SS](https://api.semanticscholar.org/arXiv:2404.12653))
* 04/19 - **How Far Can We Go with Practical Function-Level Program Repair?**
([:x:](https://arxiv.org/abs/2404.12833)), ([:book:](https://browse.arxiv.org/pdf/2404.12833.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12833.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12833)), ([:house:](https://huggingface.co/papers/2404.12833)), ([HTML](https://browse.arxiv.org/html/2404.12833v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12833)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12833v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12833)), ([SS](https://api.semanticscholar.org/arXiv:2404.12833))
* 04/19 - **Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models**
([:x:](https://arxiv.org/abs/2404.13013)), ([:book:](https://browse.arxiv.org/pdf/2404.13013.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13013.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13013)), ([:house:](https://huggingface.co/papers/2404.13013)), ([HTML](https://browse.arxiv.org/html/2404.13013v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13013)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13013v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13013)), ([SS](https://api.semanticscholar.org/arXiv:2404.13013))
* 04/19 - **Does Gaussian Splatting need SFM Initialization?**
([:x:](https://arxiv.org/abs/2404.12547)), ([:book:](https://browse.arxiv.org/pdf/2404.12547.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12547.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12547)), ([:house:](https://huggingface.co/papers/2404.12547)), ([HTML](https://browse.arxiv.org/html/2404.12547v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12547)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12547v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12547)), ([SS](https://api.semanticscholar.org/arXiv:2404.12547))
* 04/19 - **AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation**
([:x:](https://arxiv.org/abs/2404.12753)), ([:book:](https://browse.arxiv.org/pdf/2404.12753.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12753.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12753)), ([:house:](https://huggingface.co/papers/2404.12753)), ([HTML](https://browse.arxiv.org/html/2404.12753v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12753)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12753v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12753)), ([SS](https://api.semanticscholar.org/arXiv:2404.12753))
* 04/18 - **TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding**
([:x:](https://arxiv.org/abs/2404.11912)), ([:book:](https://browse.arxiv.org/pdf/2404.11912.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11912.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11912)), ([:house:](https://huggingface.co/papers/2404.11912)), ([HTML](https://browse.arxiv.org/html/2404.11912v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.11912)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11912v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11912)), ([SS](https://api.semanticscholar.org/arXiv:2404.11912))
* 04/18 - **Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing**
([:x:](https://arxiv.org/abs/2404.12253)), ([:book:](https://browse.arxiv.org/pdf/2404.12253.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12253.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12253)), ([:house:](https://huggingface.co/papers/2404.12253)), ([HTML](https://browse.arxiv.org/html/2404.12253v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12253)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12253v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12253)), ([SS](https://api.semanticscholar.org/arXiv:2404.12253))
* 04/18 - **Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment**
([:x:](https://arxiv.org/abs/2404.12318)), ([:book:](https://browse.arxiv.org/pdf/2404.12318.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12318.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12318)), ([:house:](https://huggingface.co/papers/2404.12318)), ([HTML](https://browse.arxiv.org/html/2404.12318v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12318)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12318v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12318)), ([SS](https://api.semanticscholar.org/arXiv:2404.12318))
* 04/18 - **Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models**
([:x:](https://arxiv.org/abs/2404.12387)), ([:book:](https://browse.arxiv.org/pdf/2404.12387.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12387.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12387)), ([:house:](https://huggingface.co/papers/2404.12387)), ([HTML](https://browse.arxiv.org/html/2404.12387v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12387)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12387v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12387)), ([SS](https://api.semanticscholar.org/arXiv:2404.12387))
* 04/18 - **OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data**
([:x:](https://arxiv.org/abs/2404.12195)), ([:book:](https://browse.arxiv.org/pdf/2404.12195.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12195.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12195)), ([:house:](https://huggingface.co/papers/2404.12195)), ([HTML](https://browse.arxiv.org/html/2404.12195v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12195)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12195v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12195)), ([SS](https://api.semanticscholar.org/arXiv:2404.12195))
* 04/18 - **MeshLRM: Large Reconstruction Model for High-Quality Mesh**
([:x:](https://arxiv.org/abs/2404.12385)), ([:book:](https://browse.arxiv.org/pdf/2404.12385.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12385.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12385)), ([:house:](https://huggingface.co/papers/2404.12385)), ([HTML](https://browse.arxiv.org/html/2404.12385v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12385)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12385v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12385)), ([SS](https://api.semanticscholar.org/arXiv:2404.12385))
* 04/18 - **Introducing v0.5 of the AI Safety Benchmark from MLCommons**
([:x:](https://arxiv.org/abs/2404.12241)), ([:book:](https://browse.arxiv.org/pdf/2404.12241.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12241.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12241)), ([:house:](https://huggingface.co/papers/2404.12241)), ([HTML](https://browse.arxiv.org/html/2404.12241v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12241)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12241v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12241)), ([SS](https://api.semanticscholar.org/arXiv:2404.12241))
* 04/18 - **Introducing Meta Llama 3: The most capable openly available LLM to date**
([Blog](https://ai.meta.com/blog/meta-llama-3/))
* 04/18 - **EdgeFusion: On-Device Text-to-Image Generation**
([:x:](https://arxiv.org/abs/2404.11925)), ([:book:](https://browse.arxiv.org/pdf/2404.11925.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11925.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11925)), ([:house:](https://huggingface.co/papers/2404.11925)), ([HTML](https://browse.arxiv.org/html/2404.11925v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.11925)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11925v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11925)), ([SS](https://api.semanticscholar.org/arXiv:2404.11925))
* 04/18 - **BLINK: Multimodal Large Language Models Can See but Not Perceive**
([:x:](https://arxiv.org/abs/2404.12390)), ([:book:](https://browse.arxiv.org/pdf/2404.12390.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12390.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12390)), ([:house:](https://huggingface.co/papers/2404.12390)), ([HTML](https://browse.arxiv.org/html/2404.12390v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12390)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12390v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12390)), ([SS](https://api.semanticscholar.org/arXiv:2404.12390))
* 04/18 - **AniClipart: Clipart Animation with Text-to-Video Priors**
([:x:](https://arxiv.org/abs/2404.12347)), ([:book:](https://browse.arxiv.org/pdf/2404.12347.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12347.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12347)), ([:house:](https://huggingface.co/papers/2404.12347)), ([HTML](https://browse.arxiv.org/html/2404.12347v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12347)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12347v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12347)), ([SS](https://api.semanticscholar.org/arXiv:2404.12347))
* 04/17 - **MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation**
([:x:](https://arxiv.org/abs/2404.11565)), ([:book:](https://browse.arxiv.org/pdf/2404.11565.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11565.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11565)), ([:house:](https://huggingface.co/papers/2404.11565)), ([HTML](https://browse.arxiv.org/html/2404.11565v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.11565)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11565v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11565)), ([SS](https://api.semanticscholar.org/arXiv:2404.11565))
* 04/17 - **FlowMind: Automatic Workflow Generation with LLMs**
([:x:](https://arxiv.org/abs/2404.13050)), ([:book:](https://browse.arxiv.org/pdf/2404.13050.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13050.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13050)), ([:house:](https://huggingface.co/papers/2404.13050)), ([HTML](https://browse.arxiv.org/html/2404.13050v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13050)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13050v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13050)), ([SS](https://api.semanticscholar.org/arXiv:2404.13050))
* 04/17 - **Dynamic Typography: Bringing Words to Life**
([:x:](https://arxiv.org/abs/2404.11614)), ([:book:](https://browse.arxiv.org/pdf/2404.11614.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11614.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11614)), ([:house:](https://huggingface.co/papers/2404.11614)), ([HTML](https://browse.arxiv.org/html/2404.11614v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.11614)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11614v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11614)), ([SS](https://api.semanticscholar.org/arXiv:2404.11614))
* 04/17 - **Stable Diffusion 3 API Now Available**
([twitter](https://twitter.com/StabilityAI/status/1780599024707596508)), ([Blog](https://stability.ai/news/stable-diffusion-3-api?utm_source=twitter&utm_medium=website&utm_campaign=blog)), ([Demo](https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post))
* 04/16 - **VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time**
([:x:](https://arxiv.org/abs/2404.10667)), ([:book:](https://browse.arxiv.org/pdf/2404.10667.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.10667.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.10667)), ([:house:](https://huggingface.co/papers/2404.10667)), ([HTML](https://browse.arxiv.org/html/2404.10667v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.10667)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.10667v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.10667)), ([SS](https://api.semanticscholar.org/arXiv:2404.10667)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/vasa-1-lifelike-audio-driven-talking-faces))
* 04/16 - **U.S. Commerce Secretary Gina Raimondo Announces Expansion of U.S. AI Safety Institute Leadership Team**
([News](https://www.commerce.gov/news/press-releases/2024/04/us-commerce-secretary-gina-raimondo-announces-expansion-us-ai-safety))
* 04/16 - **Long-form music generation with latent diffusion**
([:x:](https://arxiv.org/abs/2404.10301)), ([:book:](https://browse.arxiv.org/pdf/2404.10301.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.10301.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.10301)), ([:house:](https://huggingface.co/papers/2404.10301)), ([HTML](https://browse.arxiv.org/html/2404.10301v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.10301)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.10301v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.10301)), ([SS](https://api.semanticscholar.org/arXiv:2404.10301))
* 04/15 - **LLM Evaluators Recognize and Favor Their Own Generations**
([:x:](https://arxiv.org/abs/2404.13076)), ([:book:](https://browse.arxiv.org/pdf/2404.13076.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13076.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13076)), ([:house:](https://huggingface.co/papers/2404.13076)), ([HTML](https://browse.arxiv.org/html/2404.13076v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13076)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13076v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13076)), ([SS](https://api.semanticscholar.org/arXiv:2404.13076))
* 04/15 - **Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video**
([:x:](https://arxiv.org/abs/2404.09833)), ([:book:](https://browse.arxiv.org/pdf/2404.09833.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09833.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09833)), ([:house:](https://huggingface.co/papers/2404.09833)), ([HTML](https://browse.arxiv.org/html/2404.09833v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09833)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09833v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09833)), ([SS](https://api.semanticscholar.org/arXiv:2404.09833)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/video2game-real-time-interactive-realistic))
* 04/15 - **Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization**
([:x:](https://arxiv.org/abs/2404.09956)), ([:book:](https://browse.arxiv.org/pdf/2404.09956.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09956.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09956)), ([:house:](https://huggingface.co/papers/2404.09956)), ([HTML](https://browse.arxiv.org/html/2404.09956v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09956)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09956v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09956)), ([SS](https://api.semanticscholar.org/arXiv:2404.09956)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/tango-2-aligning-diffusion-based-text-to)), ([:octocat:](https://github.com/declare-lab/tango)![GitHub Repo stars](https://img.shields.io/github/stars/declare-lab/tango?style=social))
* 04/15 - **Taming Latent Diffusion Model for Neural Radiance Field Inpainting**
([:x:](https://arxiv.org/abs/2404.09995)), ([:book:](https://browse.arxiv.org/pdf/2404.09995.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09995.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09995)), ([:house:](https://huggingface.co/papers/2404.09995)), ([HTML](https://browse.arxiv.org/html/2404.09995v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09995)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09995v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09995)), ([SS](https://api.semanticscholar.org/arXiv:2404.09995)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/taming-latent-diffusion-model-for-neural))
* 04/15 - **Opus can operate as a Turing machine**
([twitter](https://twitter.com/ctjlewis/status/1779740038852690393))
* 04/15 - **MathGPT: Leveraging Llama 2 to create a platform for highly personalized learning**

* 04/15 - **HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing**
([:x:](https://arxiv.org/abs/2404.09990)), ([:book:](https://browse.arxiv.org/pdf/2404.09990.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09990.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09990)), ([:house:](https://huggingface.co/papers/2404.09990)), ([HTML](https://browse.arxiv.org/html/2404.09990v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09990)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09990v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09990)), ([SS](https://api.semanticscholar.org/arXiv:2404.09990)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/hq-edit-a-high-quality-dataset-for))
* 04/15 - **Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model**
([:x:](https://arxiv.org/abs/2404.09967)), ([:book:](https://browse.arxiv.org/pdf/2404.09967.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09967.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09967)), ([:house:](https://huggingface.co/papers/2404.09967)), ([HTML](https://browse.arxiv.org/html/2404.09967v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09967)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09967v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09967)), ([SS](https://api.semanticscholar.org/arXiv:2404.09967)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/ctrl-adapter-an-efficient-and-versatile))
* 04/15 - **Compression Represents Intelligence Linearly**
([:x:](https://arxiv.org/abs/2404.09937)), ([:book:](https://browse.arxiv.org/pdf/2404.09937.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09937.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09937)), ([:house:](https://huggingface.co/papers/2404.09937)), ([HTML](https://browse.arxiv.org/html/2404.09937v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09937)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09937v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09937)), ([SS](https://api.semanticscholar.org/arXiv:2404.09937)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/compression-represents-intelligence-linearly))
* 04/15 - **CompGS: Efficient 3D Scene Representation via Compressed Gaussian Splatting**
([:x:](https://arxiv.org/abs/2404.09458)), ([:book:](https://browse.arxiv.org/pdf/2404.09458.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09458.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09458)), ([:house:](https://huggingface.co/papers/2404.09458)), ([HTML](https://browse.arxiv.org/html/2404.09458v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09458)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09458v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09458)), ([SS](https://api.semanticscholar.org/arXiv:2404.09458)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/compgs-efficient-3d-scene-representation-via))
* 04/14 - **TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models**
([:x:](https://arxiv.org/abs/2404.09204)), ([:book:](https://browse.arxiv.org/pdf/2404.09204.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09204.pdf)), ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09204)), ([:house:](https://huggingface.co/papers/2404.09204)), ([HTML](https://browse.arxiv.org/html/2404.09204v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09204)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09204v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09204)), ([SS](https://api.semanticscholar.org/arXiv:2404.09204)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/texthawk-exploring-efficient-fine-grained)), ([:octocat:](https://github.com/yuyq96/texthawk)![GitHub Repo stars](https://img.shields.io/github/stars/yuyq96/texthawk?style=social))
* 04/13 - **Cathie Wood Muscles Into ChatGPT Boom With New OpenAI Stake**
([News](https://finance.yahoo.com/news/cathie-wood-ark-investment-management-232619722.html)),
* 04/12 - **Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies**
([:x:](https://arxiv.org/abs/2404.08197)), ([:book:](https://browse.arxiv.org/pdf/2404.08197.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.08197.pdf)), ([:orange_book:](https://www.a