https://github.com/hollobit/GenAI_LLM_timeline

ChatGPT, GenerativeAI and LLMs Timeline
agi chatgpt chatgpt-api claude copilot generative-ai generative-models gpt langchain large-language-models llama llm midjourney openai palm-e stable-diffusion timeline transformer vall-e
Last synced: over 1 year ago
Host: GitHub
URL: https://github.com/hollobit/GenAI_LLM_timeline
Owner: hollobit
Created: 2023-03-25T07:00:16.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2024-05-19T23:57:02.000Z (about 2 years ago)
Last Synced: 2025-04-03T17:12:30.168Z (over 1 year ago)
Topics: agi, chatgpt, chatgpt-api, claude, copilot, generative-ai, generative-models, gpt, langchain, large-language-models, llama, llm, midjourney, openai, palm-e, stable-diffusion, timeline, transformer, vall-e
Homepage:
Size: 3.15 MB
Stars: 953
Watchers: 84
Forks: 58
Open Issues: 4
Metadata Files:
- Readme: README.md

README

# ChatGPT, GenerativeAI and LLMs Timeline

This repository organizes a timeline of key events (products, services, papers, GitHub repositories, blog posts, and news) from before and after the ChatGPT announcement.

The timeline curates a variety of information, with a particular focus on LLMs and generative AI.

This may be one of the most intense periods in the field's history, so I thought it was important to preserve a record of it, and I organized these entries for that purpose.

## Statistics 

These diagrams were generated by ChatGPT's Code Interpreter.

 



## Contributing

Issues and pull requests are greatly appreciated. If you've never contributed to an open source project before, I'm more than happy to walk you through how to create a pull request.

You can start by [opening an issue](https://github.com/hollobit/BCAC_timeline/issues/new) describing the problem that you're looking to resolve, and we'll go from there.

## Emoji 

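arXiv :x:, arXiv PDF (browse) :book:, PDF :paperclip:, arxiv-vanity :orange_book:, paper page :house:, papers with code :eight_spoked_asterisk:, GitHub :octocat:, arxiv-sanity-lite SL, summarizepaper SP, Google Scholar GS, Semantic Scholar SS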
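
The arXiv entries in the timeline all follow the same link pattern, so the full set of links can be derived from the arXiv ID alone. The following is a minimal Python sketch of that pattern, assuming the `v1` version suffix seen in the HTML and SP links. The helper names `paper_links` and `format_entry` are hypothetical and are not part of this repository.

```python
def paper_links(arxiv_id: str, version: str = "v1") -> dict:
    """Map each label used in the timeline to its URL for one arXiv ID.

    The URL templates are copied from the timeline entries; the default
    version suffix "v1" is an assumption.
    """
    return {
        ":x:": f"https://arxiv.org/abs/{arxiv_id}",
        ":book:": f"https://browse.arxiv.org/pdf/{arxiv_id}.pdf",
        ":paperclip:": f"https://arxiv.org/pdf/{arxiv_id}.pdf",
        ":orange_book:": f"https://www.arxiv-vanity.com/papers/{arxiv_id}",
        ":house:": f"https://huggingface.co/papers/{arxiv_id}",
        "HTML": f"https://browse.arxiv.org/html/{arxiv_id}{version}",
        "SL": f"https://arxiv-sanity-lite.com/?rank=pid&pid={arxiv_id}",
        "SP": f"https://www.summarizepaper.com/en/arxiv-id/{arxiv_id}{version}/",
        "GS": f"https://scholar.google.com/scholar_lookup?arxiv_id={arxiv_id}",
        "SS": f"https://api.semanticscholar.org/arXiv:{arxiv_id}",
    }


def format_entry(date: str, title: str, arxiv_id: str) -> str:
    """Render one timeline bullet: date, bold title, then the link row."""
    links = ", ".join(
        f"([{label}]({url}))" for label, url in paper_links(arxiv_id).items()
    )
    return f"  * {date} - **{title}** \n{links}"


if __name__ == "__main__":
    print(format_entry("05/15", "LoRA Learns Less and Forgets Less", "2405.09673"))
```

Keeping the templates in one mapping means a retired or renamed service only needs to be changed in a single place.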

## License

This document is licensed under the [MIT license](https://opensource.org/licenses/mit-license.php) © Jonghong Jeon(전종홍)

## Timeline V2

### 2024

  * 05/17 - **OpenAI strikes Reddit deal to train its AI on your posts** 
  ([News](https://www.theverge.com/2024/5/16/24158529/reddit-openai-chatgpt-api-access-advertising)), 

  * 05/17 - **OpenAI dissolves team focused on long-term AI risks, less than one year after announcing it** 
  ([News](https://www.cnbc.com/2024/05/17/openai-superalignment-sutskever-leike.html)), 

  * 05/17 - **International Scientific Report on the Safety of Advanced AI** 
  ([Blog](https://www.gov.uk/government/publications/international-scientific-report-on-the-safety-of-advanced-ai)), 

  * 05/16 - **TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction** 
([:x:](https://arxiv.org/abs/2405.10315)), ([:book:](https://browse.arxiv.org/pdf/2405.10315.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10315.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10315)), ([:house:](https://huggingface.co/papers/2405.10315)), ([HTML](https://browse.arxiv.org/html/2405.10315v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.10315)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10315v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10315)), ([SS](https://api.semanticscholar.org/arXiv:2405.10315))

  * 05/16 - **Toon3D: Seeing Cartoons from a New Perspective** 
([:x:](https://arxiv.org/abs/2405.10320)), ([:book:](https://browse.arxiv.org/pdf/2405.10320.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10320.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10320)), ([:house:](https://huggingface.co/papers/2405.10320)), ([HTML](https://browse.arxiv.org/html/2405.10320v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.10320)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10320v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10320)), ([SS](https://api.semanticscholar.org/arXiv:2405.10320))

  * 05/16 - **Testing the reliability of an AI-based large language model to extract ecological information from the scientific literature** 
  ([News](https://www.nature.com/articles/s44185-024-00043-9)), 

  * 05/16 - **Many-Shot In-Context Learning in Multimodal Foundation Models** 
([:x:](https://arxiv.org/abs/2405.09798)), ([:book:](https://browse.arxiv.org/pdf/2405.09798.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09798.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09798)), ([:house:](https://huggingface.co/papers/2405.09798)), ([HTML](https://browse.arxiv.org/html/2405.09798v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09798)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09798v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09798)), ([SS](https://api.semanticscholar.org/arXiv:2405.09798))

  * 05/16 - **How to Hit Pause on AI Before It’s Too Late** 
  ([News](https://time.com/6978790/how-to-pause-artificial-intelligence/)), 

  * 05/16 - **Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection** 
([:x:](https://arxiv.org/abs/2405.10300)), ([:book:](https://browse.arxiv.org/pdf/2405.10300.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10300.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10300)), ([:house:](https://huggingface.co/papers/2405.10300)), ([HTML](https://browse.arxiv.org/html/2405.10300v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.10300)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10300v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10300)), ([SS](https://api.semanticscholar.org/arXiv:2405.10300))

  * 05/16 - **GPT Store Mining and Analysis** 
([:x:](https://arxiv.org/abs/2405.10210)), ([:book:](https://browse.arxiv.org/pdf/2405.10210.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10210.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10210)), ([:house:](https://huggingface.co/papers/2405.10210)), ([HTML](https://browse.arxiv.org/html/2405.10210v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.10210)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10210v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10210)), ([SS](https://api.semanticscholar.org/arXiv:2405.10210))

  * 05/16 - **Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion** 
([:x:](https://arxiv.org/abs/2405.09874)), ([:book:](https://browse.arxiv.org/pdf/2405.09874.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09874.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09874)), ([:house:](https://huggingface.co/papers/2405.09874)), ([HTML](https://browse.arxiv.org/html/2405.09874v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09874)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09874v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09874)), ([SS](https://api.semanticscholar.org/arXiv:2405.09874))

  * 05/16 - **Chameleon: Mixed-Modal Early-Fusion Foundation Models** 
([:x:](https://arxiv.org/abs/2405.09818)), ([:book:](https://browse.arxiv.org/pdf/2405.09818.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09818.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09818)), ([:house:](https://huggingface.co/papers/2405.09818)), ([HTML](https://browse.arxiv.org/html/2405.09818v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09818)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09818v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09818)), ([SS](https://api.semanticscholar.org/arXiv:2405.09818))

  * 05/16 - **CAT3D: Create Anything in 3D with Multi-View Diffusion Models** 
([:x:](https://arxiv.org/abs/2405.10314)), ([:book:](https://browse.arxiv.org/pdf/2405.10314.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10314.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10314)), ([:house:](https://huggingface.co/papers/2405.10314)), ([HTML](https://browse.arxiv.org/html/2405.10314v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.10314)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10314v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10314)), ([SS](https://api.semanticscholar.org/arXiv:2405.10314))

  * 05/15 - **Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model** 
([:x:](https://arxiv.org/abs/2405.09215)), ([:book:](https://browse.arxiv.org/pdf/2405.09215.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09215.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09215)), ([:house:](https://huggingface.co/papers/2405.09215)), ([HTML](https://browse.arxiv.org/html/2405.09215v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09215)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09215v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09215)), ([SS](https://api.semanticscholar.org/arXiv:2405.09215))

  * 05/15 - **LoRA Learns Less and Forgets Less** 
([:x:](https://arxiv.org/abs/2405.09673)), ([:book:](https://browse.arxiv.org/pdf/2405.09673.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09673.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09673)), ([:house:](https://huggingface.co/papers/2405.09673)), ([HTML](https://browse.arxiv.org/html/2405.09673v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09673)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09673v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09673)), ([SS](https://api.semanticscholar.org/arXiv:2405.09673))

  * 05/15 - **Google’s invisible AI watermark will help identify generative text and video** 
  ([News](https://www.theverge.com/2024/5/14/24155927/google-ai-synthid-watermark-text-video-io)), 

  * 05/15 - **Google I/O 2024: everything announced** 
  ([Blog](https://www.theverge.com/24153841/google-io-2024-ai-gemini-android-chrome-photos)), 

  * 05/15 - **BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation** 
([:x:](https://arxiv.org/abs/2405.09546)), ([:book:](https://browse.arxiv.org/pdf/2405.09546.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09546.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09546)), ([:house:](https://huggingface.co/papers/2405.09546)), ([HTML](https://browse.arxiv.org/html/2405.09546v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09546)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09546v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09546)), ([SS](https://api.semanticscholar.org/arXiv:2405.09546))

  * 05/15 - **ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models** 
([:x:](https://arxiv.org/abs/2405.09220)), ([:book:](https://browse.arxiv.org/pdf/2405.09220.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09220.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09220)), ([:house:](https://huggingface.co/papers/2405.09220)), ([HTML](https://browse.arxiv.org/html/2405.09220v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.09220)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09220v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09220)), ([SS](https://api.semanticscholar.org/arXiv:2405.09220))

  * 05/14 - **Understanding the performance gap between online and offline alignment algorithms** 
([:x:](https://arxiv.org/abs/2405.08448)), ([:book:](https://browse.arxiv.org/pdf/2405.08448.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08448.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08448)), ([:house:](https://huggingface.co/papers/2405.08448)), ([HTML](https://browse.arxiv.org/html/2405.08448v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08448)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08448v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08448)), ([SS](https://api.semanticscholar.org/arXiv:2405.08448))

  * 05/14 - **SpeechVerse: A Large-scale Generalizable Audio Language Model** 
([:x:](https://arxiv.org/abs/2405.08295)), ([:book:](https://browse.arxiv.org/pdf/2405.08295.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08295.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08295)), ([:house:](https://huggingface.co/papers/2405.08295)), ([HTML](https://browse.arxiv.org/html/2405.08295v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08295)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08295v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08295)), ([SS](https://api.semanticscholar.org/arXiv:2405.08295))

  * 05/14 - **SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models** 
([:x:](https://arxiv.org/abs/2405.08317)), ([:book:](https://browse.arxiv.org/pdf/2405.08317.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08317.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08317)), ([:house:](https://huggingface.co/papers/2405.08317)), ([HTML](https://browse.arxiv.org/html/2405.08317v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08317)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08317v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08317)), ([SS](https://api.semanticscholar.org/arXiv:2405.08317))

  * 05/14 - **No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding** 
([:x:](https://arxiv.org/abs/2405.08344)), ([:book:](https://browse.arxiv.org/pdf/2405.08344.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08344.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08344)), ([:house:](https://huggingface.co/papers/2405.08344)), ([HTML](https://browse.arxiv.org/html/2405.08344v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08344)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08344v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08344)), ([SS](https://api.semanticscholar.org/arXiv:2405.08344))

  * 05/14 - **Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding** 
([:x:](https://arxiv.org/abs/2405.08748)), ([:book:](https://browse.arxiv.org/pdf/2405.08748.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08748.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08748)), ([:house:](https://huggingface.co/papers/2405.08748)), ([HTML](https://browse.arxiv.org/html/2405.08748v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08748)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08748v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08748)), ([SS](https://api.semanticscholar.org/arXiv:2405.08748))

  * 05/14 - **Compositional Text-to-Image Generation with Dense Blob Representations** 
([:x:](https://arxiv.org/abs/2405.08246)), ([:book:](https://browse.arxiv.org/pdf/2405.08246.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08246.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08246)), ([:house:](https://huggingface.co/papers/2405.08246)), ([HTML](https://browse.arxiv.org/html/2405.08246v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08246)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08246v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08246)), ([SS](https://api.semanticscholar.org/arXiv:2405.08246))

  * 05/14 - **Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory** 
([:x:](https://arxiv.org/abs/2405.08707)), ([:book:](https://browse.arxiv.org/pdf/2405.08707.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08707.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08707)), ([:house:](https://huggingface.co/papers/2405.08707)), ([HTML](https://browse.arxiv.org/html/2405.08707v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08707)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08707v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08707)), ([SS](https://api.semanticscholar.org/arXiv:2405.08707))

  * 05/13 - **SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts** 
([:x:](https://arxiv.org/abs/2405.07518)), ([:book:](https://browse.arxiv.org/pdf/2405.07518.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07518.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07518)), ([:house:](https://huggingface.co/papers/2405.07518)), ([HTML](https://browse.arxiv.org/html/2405.07518v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.07518)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07518v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07518)), ([SS](https://api.semanticscholar.org/arXiv:2405.07518))

  * 05/13 - **RLHF Workflow: From Reward Modeling to Online RLHF** 
([:x:](https://arxiv.org/abs/2405.07863)), ([:book:](https://browse.arxiv.org/pdf/2405.07863.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07863.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07863)), ([:house:](https://huggingface.co/papers/2405.07863)), ([HTML](https://browse.arxiv.org/html/2405.07863v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.07863)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07863v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07863)), ([SS](https://api.semanticscholar.org/arXiv:2405.07863))

  * 05/13 - **Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots** 
([:x:](https://arxiv.org/abs/2405.07990)), ([:book:](https://browse.arxiv.org/pdf/2405.07990.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07990.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07990)), ([:house:](https://huggingface.co/papers/2405.07990)), ([HTML](https://browse.arxiv.org/html/2405.07990v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.07990)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07990v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07990)), ([SS](https://api.semanticscholar.org/arXiv:2405.07990))

  * 05/13 - **OpenAI unveils newest AI model, GPT-4o** 
  ([News](https://edition.cnn.com/2024/05/13/tech/openai-altman-new-ai-model-gpt-4o/index.html)), 

  * 05/13 - **MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels** 
([:x:](https://arxiv.org/abs/2405.07526)), ([:book:](https://browse.arxiv.org/pdf/2405.07526.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07526.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07526)), ([:house:](https://huggingface.co/papers/2405.07526)), ([HTML](https://browse.arxiv.org/html/2405.07526v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.07526)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07526v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07526)), ([SS](https://api.semanticscholar.org/arXiv:2405.07526))

  * 05/13 - **How Much Research Is Being Written by Large Language Models?** 
  ([Blog](https://hai.stanford.edu/news/how-much-research-being-written-large-language-models)), 

  * 05/13 - **Hello GPT-4o** 
  ([Blog](https://openai.com/index/hello-gpt-4o/)), 

  * 05/13 - **Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning** 
([:x:](https://arxiv.org/abs/2405.08054)), ([:book:](https://browse.arxiv.org/pdf/2405.08054.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08054.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08054)), ([:house:](https://huggingface.co/papers/2405.08054)), ([HTML](https://browse.arxiv.org/html/2405.08054v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.08054)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08054v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08054)), ([SS](https://api.semanticscholar.org/arXiv:2405.08054))

  * 05/11 - **Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training** 
([:x:](https://arxiv.org/abs/2405.06932)), ([:book:](https://browse.arxiv.org/pdf/2405.06932.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.06932.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.06932)), ([:house:](https://huggingface.co/papers/2405.06932)), ([HTML](https://browse.arxiv.org/html/2405.06932v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.06932)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.06932v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.06932)), ([SS](https://api.semanticscholar.org/arXiv:2405.06932))

  * 05/11 - **LogoMotion: Visually Grounded Code Generation for Content-Aware Animation** 
([:x:](https://arxiv.org/abs/2405.07065)), ([:book:](https://browse.arxiv.org/pdf/2405.07065.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07065.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07065)), ([:house:](https://huggingface.co/papers/2405.07065)), ([HTML](https://browse.arxiv.org/html/2405.07065v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.07065)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07065v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07065)), ([SS](https://api.semanticscholar.org/arXiv:2405.07065))

  * 05/10 - **INSPECT - An open-source framework for large language model evaluations** 
  ([Blog](https://ukgovernmentbeis.github.io/inspect_ai/)), 

  * 05/10 - **AI Safety Institute releases new AI safety evaluations platform** 
  ([News](https://www.gov.uk/government/news/ai-safety-institute-releases-new-ai-safety-evaluations-platform)), 

  * 05/07 - **SUTRA: Scalable Multilingual Language Model Architecture** 
([:x:](https://arxiv.org/abs/2405.06694)), ([:book:](https://browse.arxiv.org/pdf/2405.06694.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.06694.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.06694)), ([:house:](https://huggingface.co/papers/2405.06694)), ([HTML](https://browse.arxiv.org/html/2405.06694v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.06694)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.06694v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.06694)), ([SS](https://api.semanticscholar.org/arXiv:2405.06694))

  * 05/07 - **Meta Releases Llama 3 Open-Source LLM** 
  ([News](https://www.infoq.com/news/2024/05/meta-llama-3/)), 

  * 05/03 - **What matters when building vision-language models?** 
([:x:](https://arxiv.org/abs/2405.02246)), ([:book:](https://browse.arxiv.org/pdf/2405.02246.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.02246.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.02246)), ([:house:](https://huggingface.co/papers/2405.02246)), ([HTML](https://browse.arxiv.org/html/2405.02246v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.02246)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.02246v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.02246)), ([SS](https://api.semanticscholar.org/arXiv:2405.02246))

  * 05/02 - **WildChat: 1M ChatGPT Interaction Logs in the Wild** 
([:x:](https://arxiv.org/abs/2405.01470)), ([:book:](https://browse.arxiv.org/pdf/2405.01470.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01470.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01470)), ([:house:](https://huggingface.co/papers/2405.01470)), ([HTML](https://browse.arxiv.org/html/2405.01470v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01470)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01470v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01470)), ([SS](https://api.semanticscholar.org/arXiv:2405.01470))

  * 05/02 - **StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation** 
([:x:](https://arxiv.org/abs/2405.01434)), ([:book:](https://browse.arxiv.org/pdf/2405.01434.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01434.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01434)), ([:house:](https://huggingface.co/papers/2405.01434)), ([HTML](https://browse.arxiv.org/html/2405.01434v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01434)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01434v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01434)), ([SS](https://api.semanticscholar.org/arXiv:2405.01434))

  * 05/02 - **Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models** 
([:x:](https://arxiv.org/abs/2405.01535)), ([:book:](https://browse.arxiv.org/pdf/2405.01535.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01535.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01535)), ([:house:](https://huggingface.co/papers/2405.01535)), ([HTML](https://browse.arxiv.org/html/2405.01535v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01535)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01535v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01535)), ([SS](https://api.semanticscholar.org/arXiv:2405.01535))

  * 05/02 - **NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment** 
([:x:](https://arxiv.org/abs/2405.01481)), ([:book:](https://browse.arxiv.org/pdf/2405.01481.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01481.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01481)), ([:house:](https://huggingface.co/papers/2405.01481)), ([HTML](https://browse.arxiv.org/html/2405.01481v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01481)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01481v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01481)), ([SS](https://api.semanticscholar.org/arXiv:2405.01481))

  * 05/02 - **LLM-AD: Large Language Model based Audio Description System** 
([:x:](https://arxiv.org/abs/2405.00983)), ([:book:](https://browse.arxiv.org/pdf/2405.00983.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00983.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00983)), ([:house:](https://huggingface.co/papers/2405.00983)), ([HTML](https://browse.arxiv.org/html/2405.00983v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00983)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00983v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00983)), ([SS](https://api.semanticscholar.org/arXiv:2405.00983))

  * 05/02 - **FLAME: Factuality-Aware Alignment for Large Language Models** 
([:x:](https://arxiv.org/abs/2405.01525)), ([:book:](https://browse.arxiv.org/pdf/2405.01525.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01525.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01525)), ([:house:](https://huggingface.co/papers/2405.01525)), ([HTML](https://browse.arxiv.org/html/2405.01525v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01525)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01525v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01525)), ([SS](https://api.semanticscholar.org/arXiv:2405.01525))

  * 05/02 - **Customizing Text-to-Image Models with a Single Image Pair** 
([:x:](https://arxiv.org/abs/2405.01536)), ([:book:](https://browse.arxiv.org/pdf/2405.01536.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01536.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01536)), ([:house:](https://huggingface.co/papers/2405.01536)), ([HTML](https://browse.arxiv.org/html/2405.01536v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.01536)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01536v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01536)), ([SS](https://api.semanticscholar.org/arXiv:2405.01536))

  * 05/01 - **Spectrally Pruned Gaussian Fields with Neural Compensation** 
([:x:](https://arxiv.org/abs/2405.00676)), ([:book:](https://browse.arxiv.org/pdf/2405.00676.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00676.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00676)), ([:house:](https://huggingface.co/papers/2405.00676)), ([HTML](https://browse.arxiv.org/html/2405.00676v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00676)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00676v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00676)), ([SS](https://api.semanticscholar.org/arXiv:2405.00676))

  * 05/01 - **Self-Play Preference Optimization for Language Model Alignment** 
([:x:](https://arxiv.org/abs/2405.00675)), ([:book:](https://browse.arxiv.org/pdf/2405.00675.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00675.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00675)), ([:house:](https://huggingface.co/papers/2405.00675)), ([HTML](https://browse.arxiv.org/html/2405.00675v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00675)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00675v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00675)), ([SS](https://api.semanticscholar.org/arXiv:2405.00675))

  * 05/01 - **Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3** 
([:x:](https://arxiv.org/abs/2405.00664)), ([:book:](https://browse.arxiv.org/pdf/2405.00664.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00664.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00664)), ([:house:](https://huggingface.co/papers/2405.00664)), ([HTML](https://browse.arxiv.org/html/2405.00664v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00664)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00664v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00664)), ([SS](https://api.semanticscholar.org/arXiv:2405.00664))

  * 05/01 - **Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge** 
([:x:](https://arxiv.org/abs/2405.00263)), ([:book:](https://browse.arxiv.org/pdf/2405.00263.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00263.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00263)), ([:house:](https://huggingface.co/papers/2405.00263)), ([HTML](https://browse.arxiv.org/html/2405.00263v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00263)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00263v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00263)), ([SS](https://api.semanticscholar.org/arXiv:2405.00263))

  * 05/01 - **A Careful Examination of Large Language Model Performance on Grade School Arithmetic** 
([:x:](https://arxiv.org/abs/2405.00332)), ([:book:](https://browse.arxiv.org/pdf/2405.00332.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00332.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00332)), ([:house:](https://huggingface.co/papers/2405.00332)), ([HTML](https://browse.arxiv.org/html/2405.00332v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00332)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00332v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00332)), ([SS](https://api.semanticscholar.org/arXiv:2405.00332))

  * 04/30 - **Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation** 
([:x:](https://arxiv.org/abs/2404.19752)), ([:book:](https://browse.arxiv.org/pdf/2404.19752.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19752.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19752)), ([:house:](https://huggingface.co/papers/2404.19752)), ([HTML](https://browse.arxiv.org/html/2404.19752v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19752)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19752v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19752)), ([SS](https://api.semanticscholar.org/arXiv:2404.19752))

  * 04/30 - **STT: Stateful Tracking with Transformers for Autonomous Driving** 
([:x:](https://arxiv.org/abs/2405.00236)), ([:book:](https://browse.arxiv.org/pdf/2405.00236.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00236.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00236)), ([:house:](https://huggingface.co/papers/2405.00236)), ([HTML](https://browse.arxiv.org/html/2405.00236v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00236)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00236v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00236)), ([SS](https://api.semanticscholar.org/arXiv:2405.00236))

  * 04/30 - **SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound** 
([:x:](https://arxiv.org/abs/2405.00233)), ([:book:](https://browse.arxiv.org/pdf/2405.00233.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00233.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00233)), ([:house:](https://huggingface.co/papers/2405.00233)), ([HTML](https://browse.arxiv.org/html/2405.00233v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00233)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00233v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00233)), ([SS](https://api.semanticscholar.org/arXiv:2405.00233))

  * 04/30 - **Octopus v4: Graph of language models** 
([:x:](https://arxiv.org/abs/2404.19296)), ([:book:](https://browse.arxiv.org/pdf/2404.19296.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19296.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19296)), ([:house:](https://huggingface.co/papers/2404.19296)), ([HTML](https://browse.arxiv.org/html/2404.19296v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19296)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19296v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19296)), ([SS](https://api.semanticscholar.org/arXiv:2404.19296))

  * 04/30 - **MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model** 
([:x:](https://arxiv.org/abs/2404.19759)), ([:book:](https://browse.arxiv.org/pdf/2404.19759.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19759.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19759)), ([:house:](https://huggingface.co/papers/2404.19759)), ([HTML](https://browse.arxiv.org/html/2404.19759v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19759)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19759v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19759)), ([SS](https://api.semanticscholar.org/arXiv:2404.19759))

  * 04/30 - **MicroDreamer: Zero-shot 3D Generation in ~20 Seconds by Score-based Iterative Reconstruction** 
([:x:](https://arxiv.org/abs/2404.19525)), ([:book:](https://browse.arxiv.org/pdf/2404.19525.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19525.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19525)), ([:house:](https://huggingface.co/papers/2404.19525)), ([HTML](https://browse.arxiv.org/html/2404.19525v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19525)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19525v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19525)), ([SS](https://api.semanticscholar.org/arXiv:2404.19525))

  * 04/30 - **Lightplane: Highly-Scalable Components for Neural 3D Fields** 
([:x:](https://arxiv.org/abs/2404.19760)), ([:book:](https://browse.arxiv.org/pdf/2404.19760.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19760.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19760)), ([:house:](https://huggingface.co/papers/2404.19760)), ([HTML](https://browse.arxiv.org/html/2404.19760v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19760)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19760v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19760)), ([SS](https://api.semanticscholar.org/arXiv:2404.19760))

  * 04/30 - **KAN: Kolmogorov-Arnold Networks** 
([:x:](https://arxiv.org/abs/2404.19756)), ([:book:](https://browse.arxiv.org/pdf/2404.19756.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19756.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19756)), ([:house:](https://huggingface.co/papers/2404.19756)), ([HTML](https://browse.arxiv.org/html/2404.19756v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19756)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19756v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19756)), ([SS](https://api.semanticscholar.org/arXiv:2404.19756))

  * 04/30 - **Iterative Reasoning Preference Optimization** 
([:x:](https://arxiv.org/abs/2404.19733)), ([:book:](https://browse.arxiv.org/pdf/2404.19733.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19733.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19733)), ([:house:](https://huggingface.co/papers/2404.19733)), ([HTML](https://browse.arxiv.org/html/2404.19733v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19733)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19733v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19733)), ([SS](https://api.semanticscholar.org/arXiv:2404.19733))

  * 04/30 - **Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting** 
([:x:](https://arxiv.org/abs/2404.19758)), ([:book:](https://browse.arxiv.org/pdf/2404.19758.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19758.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19758)), ([:house:](https://huggingface.co/papers/2404.19758)), ([HTML](https://browse.arxiv.org/html/2404.19758v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19758)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19758v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19758)), ([SS](https://api.semanticscholar.org/arXiv:2404.19758))

  * 04/30 - **InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation** 
([:x:](https://arxiv.org/abs/2404.19427)), ([:book:](https://browse.arxiv.org/pdf/2404.19427.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19427.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19427)), ([:house:](https://huggingface.co/papers/2404.19427)), ([HTML](https://browse.arxiv.org/html/2404.19427v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19427)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19427v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19427)), ([SS](https://api.semanticscholar.org/arXiv:2404.19427))

  * 04/30 - **GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting** 
([:x:](https://arxiv.org/abs/2404.19702)), ([:book:](https://browse.arxiv.org/pdf/2404.19702.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19702.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19702)), ([:house:](https://huggingface.co/papers/2404.19702)), ([HTML](https://browse.arxiv.org/html/2404.19702v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19702)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19702v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19702)), ([SS](https://api.semanticscholar.org/arXiv:2404.19702))

  * 04/30 - **Extending Llama-3's Context Ten-Fold Overnight** 
([:x:](https://arxiv.org/abs/2404.19553)), ([:book:](https://browse.arxiv.org/pdf/2404.19553.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19553.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19553)), ([:house:](https://huggingface.co/papers/2404.19553)), ([HTML](https://browse.arxiv.org/html/2404.19553v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19553)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19553v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19553)), ([SS](https://api.semanticscholar.org/arXiv:2404.19553))

  * 04/30 - **DOCCI: Descriptions of Connected and Contrasting Images** 
([:x:](https://arxiv.org/abs/2404.19753)), ([:book:](https://browse.arxiv.org/pdf/2404.19753.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19753.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19753)), ([:house:](https://huggingface.co/papers/2404.19753)), ([HTML](https://browse.arxiv.org/html/2404.19753v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19753)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19753v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19753)), ([SS](https://api.semanticscholar.org/arXiv:2404.19753))

  * 04/30 - **Better & Faster Large Language Models via Multi-token Prediction** 
([:x:](https://arxiv.org/abs/2404.19737)), ([:book:](https://browse.arxiv.org/pdf/2404.19737.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19737.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19737)), ([:house:](https://huggingface.co/papers/2404.19737)), ([HTML](https://browse.arxiv.org/html/2404.19737v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19737)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19737v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19737)), ([SS](https://api.semanticscholar.org/arXiv:2404.19737))

  * 04/29 - **Stylus: Automatic Adapter Selection for Diffusion Models** 
([:x:](https://arxiv.org/abs/2404.18928)), ([:book:](https://browse.arxiv.org/pdf/2404.18928.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18928.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18928)), ([:house:](https://huggingface.co/papers/2404.18928)), ([HTML](https://browse.arxiv.org/html/2404.18928v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18928)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18928v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18928)), ([SS](https://api.semanticscholar.org/arXiv:2404.18928))

  * 04/29 - **SAGS: Structure-Aware 3D Gaussian Splatting** 
([:x:](https://arxiv.org/abs/2404.19149)), ([:book:](https://browse.arxiv.org/pdf/2404.19149.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19149.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19149)), ([:house:](https://huggingface.co/papers/2404.19149)), ([HTML](https://browse.arxiv.org/html/2404.19149v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.19149)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19149v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19149)), ([SS](https://api.semanticscholar.org/arXiv:2404.19149))

  * 04/29 - **Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models** 
([:x:](https://arxiv.org/abs/2404.18796)), ([:book:](https://browse.arxiv.org/pdf/2404.18796.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18796.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18796)), ([:house:](https://huggingface.co/papers/2404.18796)), ([HTML](https://browse.arxiv.org/html/2404.18796v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18796)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18796v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18796)), ([SS](https://api.semanticscholar.org/arXiv:2404.18796))

  * 04/29 - **NIST AI RMF Generative AI Profile** 
  ([News](https://www.nist.gov/news-events/news/2024/04/department-commerce-announces-new-actions-implement-president-bidens))

  * 04/29 - **LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report** 
([:x:](https://arxiv.org/abs/2405.00732)), ([:book:](https://browse.arxiv.org/pdf/2405.00732.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00732.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00732)), ([:house:](https://huggingface.co/papers/2405.00732)), ([HTML](https://browse.arxiv.org/html/2405.00732v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2405.00732)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00732v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00732)), ([SS](https://api.semanticscholar.org/arXiv:2405.00732))

  * 04/29 - **Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting** 
([:x:](https://arxiv.org/abs/2404.18911)), ([:book:](https://browse.arxiv.org/pdf/2404.18911.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18911.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18911)), ([:house:](https://huggingface.co/papers/2404.18911)), ([HTML](https://browse.arxiv.org/html/2404.18911v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18911)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18911v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18911)), ([SS](https://api.semanticscholar.org/arXiv:2404.18911))

  * 04/29 - **Capabilities of Gemini Models in Medicine** 
([:x:](https://arxiv.org/abs/2404.18416)), ([:book:](https://browse.arxiv.org/pdf/2404.18416.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18416.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18416)), ([:house:](https://huggingface.co/papers/2404.18416)), ([HTML](https://browse.arxiv.org/html/2404.18416v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18416)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18416v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18416)), ([SS](https://api.semanticscholar.org/arXiv:2404.18416))

  * 04/28 - **Paint by Inpaint: Learning to Add Image Objects by Removing Them First** 
([:x:](https://arxiv.org/abs/2404.18212)), ([:book:](https://browse.arxiv.org/pdf/2404.18212.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18212.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18212)), ([:house:](https://huggingface.co/papers/2404.18212)), ([HTML](https://browse.arxiv.org/html/2404.18212v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18212)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18212v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18212)), ([SS](https://api.semanticscholar.org/arXiv:2404.18212))

  * 04/28 - **LEGENT: Open Platform for Embodied Agents** 
([:x:](https://arxiv.org/abs/2404.18243)), ([:book:](https://browse.arxiv.org/pdf/2404.18243.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18243.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18243)), ([:house:](https://huggingface.co/papers/2404.18243)), ([HTML](https://browse.arxiv.org/html/2404.18243v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.18243)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18243v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18243)), ([SS](https://api.semanticscholar.org/arXiv:2404.18243))

  * 04/27 - **Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations** 
([:x:](https://arxiv.org/abs/2404.17521)), ([:book:](https://browse.arxiv.org/pdf/2404.17521.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.17521.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.17521)), ([:house:](https://huggingface.co/papers/2404.17521)), ([HTML](https://browse.arxiv.org/html/2404.17521v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.17521)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.17521v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.17521)), ([SS](https://api.semanticscholar.org/arXiv:2404.17521))

  * 04/26 - **MaPa: Text-driven Photorealistic Material Painting for 3D Shapes** 
([:x:](https://arxiv.org/abs/2404.17569)), ([:book:](https://browse.arxiv.org/pdf/2404.17569.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.17569.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.17569)), ([:house:](https://huggingface.co/papers/2404.17569)), ([HTML](https://browse.arxiv.org/html/2404.17569v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.17569)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.17569v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.17569)), ([SS](https://api.semanticscholar.org/arXiv:2404.17569))

  * 04/26 - **BlenderAlchemy: Editing 3D Graphics with Vision-Language Models** 
([:x:](https://arxiv.org/abs/2404.17672)), ([:book:](https://browse.arxiv.org/pdf/2404.17672.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.17672.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.17672)), ([:house:](https://huggingface.co/papers/2404.17672)), ([HTML](https://browse.arxiv.org/html/2404.17672v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.17672)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.17672v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.17672)), ([SS](https://api.semanticscholar.org/arXiv:2404.17672))

  * 04/25 - **Tele-FLM Technical Report** 
([:x:](https://arxiv.org/abs/2404.16645)), ([:book:](https://browse.arxiv.org/pdf/2404.16645.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16645.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16645)), ([:house:](https://huggingface.co/papers/2404.16645)), ([HTML](https://browse.arxiv.org/html/2404.16645v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16645)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16645v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16645)), ([SS](https://api.semanticscholar.org/arXiv:2404.16645))

  * 04/25 - **SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension** 
([:x:](https://arxiv.org/abs/2404.16790)), ([:book:](https://browse.arxiv.org/pdf/2404.16790.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16790.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16790)), ([:house:](https://huggingface.co/papers/2404.16790)), ([HTML](https://browse.arxiv.org/html/2404.16790v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16790)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16790v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16790)), ([SS](https://api.semanticscholar.org/arXiv:2404.16790))

  * 04/25 - **Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings** 
([:x:](https://arxiv.org/abs/2404.16820)), ([:book:](https://browse.arxiv.org/pdf/2404.16820.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16820.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16820)), ([:house:](https://huggingface.co/papers/2404.16820)), ([HTML](https://browse.arxiv.org/html/2404.16820v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16820)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16820v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16820)), ([SS](https://api.semanticscholar.org/arXiv:2404.16820))

  * 04/25 - **PLLaVA: Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning** 
([:x:](https://arxiv.org/abs/2404.16994)), ([:book:](https://browse.arxiv.org/pdf/2404.16994.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16994.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16994)), ([:house:](https://huggingface.co/papers/2404.16994)), ([HTML](https://browse.arxiv.org/html/2404.16994v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16994)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16994v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16994)), ([SS](https://api.semanticscholar.org/arXiv:2404.16994))

  * 04/25 - **Make Your LLM Fully Utilize the Context** 
([:x:](https://arxiv.org/abs/2404.16811)), ([:book:](https://browse.arxiv.org/pdf/2404.16811.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16811.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16811)), ([:house:](https://huggingface.co/papers/2404.16811)), ([HTML](https://browse.arxiv.org/html/2404.16811v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16811)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16811v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16811)), ([SS](https://api.semanticscholar.org/arXiv:2404.16811))

  * 04/25 - **List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs** 
([:x:](https://arxiv.org/abs/2404.16375)), ([:book:](https://browse.arxiv.org/pdf/2404.16375.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16375.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16375)), ([:house:](https://huggingface.co/papers/2404.16375)), ([HTML](https://browse.arxiv.org/html/2404.16375v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16375)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16375v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16375)), ([SS](https://api.semanticscholar.org/arXiv:2404.16375))

  * 04/25 - **Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding** 
([:x:](https://arxiv.org/abs/2404.16710)), ([:book:](https://browse.arxiv.org/pdf/2404.16710.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16710.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16710)), ([:house:](https://huggingface.co/papers/2404.16710)), ([HTML](https://browse.arxiv.org/html/2404.16710v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16710)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16710v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16710)), ([SS](https://api.semanticscholar.org/arXiv:2404.16710))

  * 04/25 - **Interactive3D: Create What You Want by Interactive 3D Generation** 
([:x:](https://arxiv.org/abs/2404.16510)), ([:book:](https://browse.arxiv.org/pdf/2404.16510.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16510.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16510)), ([:house:](https://huggingface.co/papers/2404.16510)), ([HTML](https://browse.arxiv.org/html/2404.16510v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16510)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16510v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16510)), ([SS](https://api.semanticscholar.org/arXiv:2404.16510))

  * 04/25 - **How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites** 
([:x:](https://arxiv.org/abs/2404.16821)), ([:book:](https://browse.arxiv.org/pdf/2404.16821.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16821.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16821)), ([:house:](https://huggingface.co/papers/2404.16821)), ([HTML](https://browse.arxiv.org/html/2404.16821v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16821)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16821v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16821)), ([SS](https://api.semanticscholar.org/arXiv:2404.16821))

  * 04/25 - **ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving** 
([:x:](https://arxiv.org/abs/2404.16771)), ([:book:](https://browse.arxiv.org/pdf/2404.16771.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16771.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16771)), ([:house:](https://huggingface.co/papers/2404.16771)), ([HTML](https://browse.arxiv.org/html/2404.16771v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16771)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16771v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16771)), ([SS](https://api.semanticscholar.org/arXiv:2404.16771))

  * 04/24 - **XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference** 
([:x:](https://arxiv.org/abs/2404.15420)), ([:book:](https://browse.arxiv.org/pdf/2404.15420.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15420.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15420)), ([:house:](https://huggingface.co/papers/2404.15420)), ([HTML](https://browse.arxiv.org/html/2404.15420v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15420)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15420v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15420)), ([SS](https://api.semanticscholar.org/arXiv:2404.15420))

  * 04/24 - **The Ethics of Advanced AI Assistants** 
([:x:](https://arxiv.org/abs/2404.16244)), ([:book:](https://browse.arxiv.org/pdf/2404.16244.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16244.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16244)), ([:house:](https://huggingface.co/papers/2404.16244)), ([HTML](https://browse.arxiv.org/html/2404.16244v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16244)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16244v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16244)), ([SS](https://api.semanticscholar.org/arXiv:2404.16244))

  * 04/24 - **PuLID: Pure and Lightning ID Customization via Contrastive Alignment** 
([:x:](https://arxiv.org/abs/2404.16022)), ([:book:](https://browse.arxiv.org/pdf/2404.16022.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16022.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16022)), ([:house:](https://huggingface.co/papers/2404.16022)), ([HTML](https://browse.arxiv.org/html/2404.16022v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16022)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16022v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16022)), ([SS](https://api.semanticscholar.org/arXiv:2404.16022))

  * 04/24 - **NeRF-XL: Scaling NeRFs with Multiple GPUs** 
([:x:](https://arxiv.org/abs/2404.16221)), ([:book:](https://browse.arxiv.org/pdf/2404.16221.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16221.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16221)), ([:house:](https://huggingface.co/papers/2404.16221)), ([HTML](https://browse.arxiv.org/html/2404.16221v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16221)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16221v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16221)), ([SS](https://api.semanticscholar.org/arXiv:2404.16221))

  * 04/24 - **MotionMaster: Training-free Camera Motion Transfer For Video Generation** 
([:x:](https://arxiv.org/abs/2404.15789)), ([:book:](https://browse.arxiv.org/pdf/2404.15789.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15789.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15789)), ([:house:](https://huggingface.co/papers/2404.15789)), ([HTML](https://browse.arxiv.org/html/2404.15789v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15789)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15789v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15789)), ([SS](https://api.semanticscholar.org/arXiv:2404.15789))

  * 04/24 - **MoDE: CLIP Data Experts via Clustering** 
([:x:](https://arxiv.org/abs/2404.16030)), ([:book:](https://browse.arxiv.org/pdf/2404.16030.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16030.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16030)), ([:house:](https://huggingface.co/papers/2404.16030)), ([HTML](https://browse.arxiv.org/html/2404.16030v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16030)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16030v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16030)), ([SS](https://api.semanticscholar.org/arXiv:2404.16030))

  * 04/24 - **MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI** 
([:x:](https://arxiv.org/abs/2404.16006)), ([:book:](https://browse.arxiv.org/pdf/2404.16006.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16006.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16006)), ([:house:](https://huggingface.co/papers/2404.16006)), ([HTML](https://browse.arxiv.org/html/2404.16006v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16006)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16006v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16006)), ([SS](https://api.semanticscholar.org/arXiv:2404.16006))

  * 04/24 - **MaGGIe: Masked Guided Gradual Human Instance Matting** 
([:x:](https://arxiv.org/abs/2404.16035)), ([:book:](https://browse.arxiv.org/pdf/2404.16035.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16035.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16035)), ([:house:](https://huggingface.co/papers/2404.16035)), ([HTML](https://browse.arxiv.org/html/2404.16035v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16035)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16035v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16035)), ([SS](https://api.semanticscholar.org/arXiv:2404.16035))

  * 04/24 - **ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning** 
([:x:](https://arxiv.org/abs/2404.15449)), ([:book:](https://browse.arxiv.org/pdf/2404.15449.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15449.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15449)), ([:house:](https://huggingface.co/papers/2404.15449)), ([HTML](https://browse.arxiv.org/html/2404.15449v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15449)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15449v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15449)), ([SS](https://api.semanticscholar.org/arXiv:2404.15449))

  * 04/24 - **Editable Image Elements for Controllable Synthesis** 
([:x:](https://arxiv.org/abs/2404.16029)), ([:book:](https://browse.arxiv.org/pdf/2404.16029.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16029.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16029)), ([:house:](https://huggingface.co/papers/2404.16029)), ([HTML](https://browse.arxiv.org/html/2404.16029v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16029)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16029v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16029)), ([SS](https://api.semanticscholar.org/arXiv:2404.16029))

  * 04/24 - **CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data** 
([:x:](https://arxiv.org/abs/2404.15653)), ([:book:](https://browse.arxiv.org/pdf/2404.15653.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15653.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15653)), ([:house:](https://huggingface.co/papers/2404.15653)), ([HTML](https://browse.arxiv.org/html/2404.15653v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15653)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15653v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15653)), ([SS](https://api.semanticscholar.org/arXiv:2404.15653))

  * 04/24 - **BASS: Batched Attention-optimized Speculative Sampling** 
([:x:](https://arxiv.org/abs/2404.15778)), ([:book:](https://browse.arxiv.org/pdf/2404.15778.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15778.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15778)), ([:house:](https://huggingface.co/papers/2404.15778)), ([HTML](https://browse.arxiv.org/html/2404.15778v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15778)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15778v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15778)), ([SS](https://api.semanticscholar.org/arXiv:2404.15778))

  * 04/23 - **Transformers Can Represent n-gram Language Models** 
([:x:](https://arxiv.org/abs/2404.14994)), ([:book:](https://browse.arxiv.org/pdf/2404.14994.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14994.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14994)), ([:house:](https://huggingface.co/papers/2404.14994)), ([HTML](https://browse.arxiv.org/html/2404.14994v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14994)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14994v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14994)), ([SS](https://api.semanticscholar.org/arXiv:2404.14994))

  * 04/23 - **Pegasus-v1 Technical Report** 
([:x:](https://arxiv.org/abs/2404.14687)), ([:book:](https://browse.arxiv.org/pdf/2404.14687.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14687.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14687)), ([:house:](https://huggingface.co/papers/2404.14687)), ([HTML](https://browse.arxiv.org/html/2404.14687v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14687)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14687v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14687)), ([SS](https://api.semanticscholar.org/arXiv:2404.14687))

  * 04/23 - **Multi-Head Mixture-of-Experts** 
([:x:](https://arxiv.org/abs/2404.15045)), ([:book:](https://browse.arxiv.org/pdf/2404.15045.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15045.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15045)), ([:house:](https://huggingface.co/papers/2404.15045)), ([HTML](https://browse.arxiv.org/html/2404.15045v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.15045)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15045v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15045)), ([SS](https://api.semanticscholar.org/arXiv:2404.15045))

  * 04/23 - **FlashSpeech: Efficient Zero-Shot Speech Synthesis** 
([:x:](https://arxiv.org/abs/2404.14700)), ([:book:](https://browse.arxiv.org/pdf/2404.14700.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14700.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14700)), ([:house:](https://huggingface.co/papers/2404.14700)), ([HTML](https://browse.arxiv.org/html/2404.14700v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14700)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14700v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14700)), ([SS](https://api.semanticscholar.org/arXiv:2404.14700))

  * 04/22 - **SnapKV: LLM Knows What You are Looking for Before Generation** 
([:x:](https://arxiv.org/abs/2404.14469)), ([:book:](https://browse.arxiv.org/pdf/2404.14469.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14469.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14469)), ([:house:](https://huggingface.co/papers/2404.14469)), ([HTML](https://browse.arxiv.org/html/2404.14469v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14469)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14469v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14469)), ([SS](https://api.semanticscholar.org/arXiv:2404.14469))

  * 04/22 - **SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation** 
([:x:](https://arxiv.org/abs/2404.14396)), ([:book:](https://browse.arxiv.org/pdf/2404.14396.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14396.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14396)), ([:house:](https://huggingface.co/papers/2404.14396)), ([HTML](https://browse.arxiv.org/html/2404.14396v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14396)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14396v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14396)), ([SS](https://api.semanticscholar.org/arXiv:2404.14396))

  * 04/22 - **Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer** 
([:x:](https://arxiv.org/abs/2404.14351)), ([:book:](https://browse.arxiv.org/pdf/2404.14351.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14351.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14351)), ([:house:](https://huggingface.co/papers/2404.14351)), ([HTML](https://browse.arxiv.org/html/2404.14351v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14351)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14351v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14351)), ([SS](https://api.semanticscholar.org/arXiv:2404.14351))

  * 04/22 - **Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone** 
([:x:](https://arxiv.org/abs/2404.14219)), ([:book:](https://browse.arxiv.org/pdf/2404.14219.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14219.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14219)), ([:house:](https://huggingface.co/papers/2404.14219)), ([HTML](https://browse.arxiv.org/html/2404.14219v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14219)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14219v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14219)), ([SS](https://api.semanticscholar.org/arXiv:2404.14219))

  * 04/22 - **OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework** 
([:x:](https://arxiv.org/abs/2404.14619)), ([:book:](https://browse.arxiv.org/pdf/2404.14619.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14619.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14619)), ([:house:](https://huggingface.co/papers/2404.14619)), ([HTML](https://browse.arxiv.org/html/2404.14619v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14619)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14619v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14619)), ([SS](https://api.semanticscholar.org/arXiv:2404.14619))

  * 04/22 - **MultiBooth: Towards Generating All Your Concepts in an Image from Text** 
([:x:](https://arxiv.org/abs/2404.14239)), ([:book:](https://browse.arxiv.org/pdf/2404.14239.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14239.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14239)), ([:house:](https://huggingface.co/papers/2404.14239)), ([HTML](https://browse.arxiv.org/html/2404.14239v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14239)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14239v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14239)), ([SS](https://api.semanticscholar.org/arXiv:2404.14239))

  * 04/22 - **Learning H-Infinity Locomotion Control** 
([:x:](https://arxiv.org/abs/2404.14405)), ([:book:](https://browse.arxiv.org/pdf/2404.14405.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14405.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14405)), ([:house:](https://huggingface.co/papers/2404.14405)), ([HTML](https://browse.arxiv.org/html/2404.14405v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14405)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14405v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14405)), ([SS](https://api.semanticscholar.org/arXiv:2404.14405))

  * 04/22 - **How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study** 
([:x:](https://arxiv.org/abs/2404.14047)), ([:book:](https://browse.arxiv.org/pdf/2404.14047.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14047.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14047)), ([:house:](https://huggingface.co/papers/2404.14047)), ([HTML](https://browse.arxiv.org/html/2404.14047v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14047)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14047v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14047)), ([SS](https://api.semanticscholar.org/arXiv:2404.14047))

  * 04/22 - **Align Your Steps: Optimizing Sampling Schedules in Diffusion Models** 
([:x:](https://arxiv.org/abs/2404.14507)), ([:book:](https://browse.arxiv.org/pdf/2404.14507.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14507.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14507)), ([:house:](https://huggingface.co/papers/2404.14507)), ([HTML](https://browse.arxiv.org/html/2404.14507v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14507)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14507v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14507)), ([SS](https://api.semanticscholar.org/arXiv:2404.14507))

  * 04/22 - **A Multimodal Automated Interpretability Agent** 
([:x:](https://arxiv.org/abs/2404.14394)), ([:book:](https://browse.arxiv.org/pdf/2404.14394.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14394.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14394)), ([:house:](https://huggingface.co/papers/2404.14394)), ([HTML](https://browse.arxiv.org/html/2404.14394v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.14394)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14394v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14394)), ([SS](https://api.semanticscholar.org/arXiv:2404.14394))

  * 04/21 - **Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis** 
([:x:](https://arxiv.org/abs/2404.13686)), ([:book:](https://browse.arxiv.org/pdf/2404.13686.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13686.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13686)), ([:house:](https://huggingface.co/papers/2404.13686)), ([HTML](https://browse.arxiv.org/html/2404.13686v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13686)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13686v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13686)), ([SS](https://api.semanticscholar.org/arXiv:2404.13686))

  * 04/21 - **AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs** 
([:x:](https://arxiv.org/abs/2404.16873)), ([:book:](https://browse.arxiv.org/pdf/2404.16873.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16873.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16873)), ([:house:](https://huggingface.co/papers/2404.16873)), ([HTML](https://browse.arxiv.org/html/2404.16873v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.16873)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16873v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16873)), ([SS](https://api.semanticscholar.org/arXiv:2404.16873))

  * 04/20 - **Music Consistency Models** 
([:x:](https://arxiv.org/abs/2404.13358)), ([:book:](https://browse.arxiv.org/pdf/2404.13358.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13358.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13358)), ([:house:](https://huggingface.co/papers/2404.13358)), ([HTML](https://browse.arxiv.org/html/2404.13358v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13358)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13358v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13358)), ([SS](https://api.semanticscholar.org/arXiv:2404.13358))

  * 04/19 - **The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions** 
([:x:](https://arxiv.org/abs/2404.13208)), ([:book:](https://browse.arxiv.org/pdf/2404.13208.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13208.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13208)), ([:house:](https://huggingface.co/papers/2404.13208)), ([HTML](https://browse.arxiv.org/html/2404.13208v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13208)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13208v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13208)), ([SS](https://api.semanticscholar.org/arXiv:2404.13208))

  * 04/19 - **TextSquare: Scaling up Text-Centric Visual Instruction Tuning** 
([:x:](https://arxiv.org/abs/2404.12803)), ([:book:](https://browse.arxiv.org/pdf/2404.12803.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12803.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12803)), ([:house:](https://huggingface.co/papers/2404.12803)), ([HTML](https://browse.arxiv.org/html/2404.12803v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12803)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12803v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12803)), ([SS](https://api.semanticscholar.org/arXiv:2404.12803))

  * 04/19 - **PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation** 
([:x:](https://arxiv.org/abs/2404.13026)), ([:book:](https://browse.arxiv.org/pdf/2404.13026.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13026.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13026)), ([:house:](https://huggingface.co/papers/2404.13026)), ([HTML](https://browse.arxiv.org/html/2404.13026v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13026)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13026v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13026)), ([SS](https://api.semanticscholar.org/arXiv:2404.13026))

  * 04/19 - **LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency** 
([:x:](https://arxiv.org/abs/2404.12872)), ([:book:](https://browse.arxiv.org/pdf/2404.12872.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12872.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12872)), ([:house:](https://huggingface.co/papers/2404.12872)), ([HTML](https://browse.arxiv.org/html/2404.12872v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12872)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12872v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12872)), ([SS](https://api.semanticscholar.org/arXiv:2404.12872))

  * 04/19 - **How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples** 
([:x:](https://arxiv.org/abs/2404.12653)), ([:book:](https://browse.arxiv.org/pdf/2404.12653.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12653.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12653)), ([:house:](https://huggingface.co/papers/2404.12653)), ([HTML](https://browse.arxiv.org/html/2404.12653v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12653)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12653v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12653)), ([SS](https://api.semanticscholar.org/arXiv:2404.12653))

  * 04/19 - **How Far Can We Go with Practical Function-Level Program Repair?** 
([:x:](https://arxiv.org/abs/2404.12833)), ([:book:](https://browse.arxiv.org/pdf/2404.12833.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12833.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12833)), ([:house:](https://huggingface.co/papers/2404.12833)), ([HTML](https://browse.arxiv.org/html/2404.12833v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12833)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12833v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12833)), ([SS](https://api.semanticscholar.org/arXiv:2404.12833))

  * 04/19 - **Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models** 
([:x:](https://arxiv.org/abs/2404.13013)), ([:book:](https://browse.arxiv.org/pdf/2404.13013.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13013.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13013)), ([:house:](https://huggingface.co/papers/2404.13013)), ([HTML](https://browse.arxiv.org/html/2404.13013v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13013)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13013v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13013)), ([SS](https://api.semanticscholar.org/arXiv:2404.13013))

  * 04/19 - **Does Gaussian Splatting need SFM Initialization?** 
([:x:](https://arxiv.org/abs/2404.12547)), ([:book:](https://browse.arxiv.org/pdf/2404.12547.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12547.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12547)), ([:house:](https://huggingface.co/papers/2404.12547)), ([HTML](https://browse.arxiv.org/html/2404.12547v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12547)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12547v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12547)), ([SS](https://api.semanticscholar.org/arXiv:2404.12547))

  * 04/19 - **AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation** 
([:x:](https://arxiv.org/abs/2404.12753)), ([:book:](https://browse.arxiv.org/pdf/2404.12753.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12753.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12753)), ([:house:](https://huggingface.co/papers/2404.12753)), ([HTML](https://browse.arxiv.org/html/2404.12753v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12753)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12753v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12753)), ([SS](https://api.semanticscholar.org/arXiv:2404.12753))

  * 04/18 - **TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding** 
([:x:](https://arxiv.org/abs/2404.11912)), ([:book:](https://browse.arxiv.org/pdf/2404.11912.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11912.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11912)), ([:house:](https://huggingface.co/papers/2404.11912)), ([HTML](https://browse.arxiv.org/html/2404.11912v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.11912)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11912v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11912)), ([SS](https://api.semanticscholar.org/arXiv:2404.11912))

  * 04/18 - **Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing** 
([:x:](https://arxiv.org/abs/2404.12253)), ([:book:](https://browse.arxiv.org/pdf/2404.12253.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12253.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12253)), ([:house:](https://huggingface.co/papers/2404.12253)), ([HTML](https://browse.arxiv.org/html/2404.12253v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12253)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12253v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12253)), ([SS](https://api.semanticscholar.org/arXiv:2404.12253))

  * 04/18 - **Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment** 
([:x:](https://arxiv.org/abs/2404.12318)), ([:book:](https://browse.arxiv.org/pdf/2404.12318.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12318.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12318)), ([:house:](https://huggingface.co/papers/2404.12318)), ([HTML](https://browse.arxiv.org/html/2404.12318v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12318)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12318v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12318)), ([SS](https://api.semanticscholar.org/arXiv:2404.12318))

  * 04/18 - **Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models** 
([:x:](https://arxiv.org/abs/2404.12387)), ([:book:](https://browse.arxiv.org/pdf/2404.12387.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12387.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12387)), ([:house:](https://huggingface.co/papers/2404.12387)), ([HTML](https://browse.arxiv.org/html/2404.12387v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12387)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12387v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12387)), ([SS](https://api.semanticscholar.org/arXiv:2404.12387))

  * 04/18 - **OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data** 
([:x:](https://arxiv.org/abs/2404.12195)), ([:book:](https://browse.arxiv.org/pdf/2404.12195.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12195.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12195)), ([:house:](https://huggingface.co/papers/2404.12195)), ([HTML](https://browse.arxiv.org/html/2404.12195v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12195)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12195v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12195)), ([SS](https://api.semanticscholar.org/arXiv:2404.12195))

  * 04/18 - **MeshLRM: Large Reconstruction Model for High-Quality Mesh** 
([:x:](https://arxiv.org/abs/2404.12385)), ([:book:](https://browse.arxiv.org/pdf/2404.12385.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12385.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12385)), ([:house:](https://huggingface.co/papers/2404.12385)), ([HTML](https://browse.arxiv.org/html/2404.12385v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12385)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12385v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12385)), ([SS](https://api.semanticscholar.org/arXiv:2404.12385))

  * 04/18 - **Introducing v0.5 of the AI Safety Benchmark from MLCommons** 
([:x:](https://arxiv.org/abs/2404.12241)), ([:book:](https://browse.arxiv.org/pdf/2404.12241.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12241.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12241)), ([:house:](https://huggingface.co/papers/2404.12241)), ([HTML](https://browse.arxiv.org/html/2404.12241v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12241)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12241v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12241)), ([SS](https://api.semanticscholar.org/arXiv:2404.12241))

  * 04/18 - **Introducing Meta Llama 3: The most capable openly available LLM to date** 
  ([Blog](https://ai.meta.com/blog/meta-llama-3/))

  * 04/18 - **EdgeFusion: On-Device Text-to-Image Generation** 
([:x:](https://arxiv.org/abs/2404.11925)), ([:book:](https://browse.arxiv.org/pdf/2404.11925.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11925.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11925)), ([:house:](https://huggingface.co/papers/2404.11925)), ([HTML](https://browse.arxiv.org/html/2404.11925v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.11925)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11925v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11925)), ([SS](https://api.semanticscholar.org/arXiv:2404.11925))

  * 04/18 - **BLINK: Multimodal Large Language Models Can See but Not Perceive** 
([:x:](https://arxiv.org/abs/2404.12390)), ([:book:](https://browse.arxiv.org/pdf/2404.12390.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12390.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12390)), ([:house:](https://huggingface.co/papers/2404.12390)), ([HTML](https://browse.arxiv.org/html/2404.12390v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12390)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12390v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12390)), ([SS](https://api.semanticscholar.org/arXiv:2404.12390))

  * 04/18 - **AniClipart: Clipart Animation with Text-to-Video Priors** 
([:x:](https://arxiv.org/abs/2404.12347)), ([:book:](https://browse.arxiv.org/pdf/2404.12347.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12347.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12347)), ([:house:](https://huggingface.co/papers/2404.12347)), ([HTML](https://browse.arxiv.org/html/2404.12347v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.12347)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12347v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12347)), ([SS](https://api.semanticscholar.org/arXiv:2404.12347))

  * 04/17 - **MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation** 
([:x:](https://arxiv.org/abs/2404.11565)), ([:book:](https://browse.arxiv.org/pdf/2404.11565.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11565.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11565)), ([:house:](https://huggingface.co/papers/2404.11565)), ([HTML](https://browse.arxiv.org/html/2404.11565v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.11565)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11565v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11565)), ([SS](https://api.semanticscholar.org/arXiv:2404.11565))

  * 04/17 - **FlowMind: Automatic Workflow Generation with LLMs** 
([:x:](https://arxiv.org/abs/2404.13050)), ([:book:](https://browse.arxiv.org/pdf/2404.13050.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13050.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13050)), ([:house:](https://huggingface.co/papers/2404.13050)), ([HTML](https://browse.arxiv.org/html/2404.13050v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13050)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13050v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13050)), ([SS](https://api.semanticscholar.org/arXiv:2404.13050))

  * 04/17 - **Dynamic Typography: Bringing Words to Life** 
([:x:](https://arxiv.org/abs/2404.11614)), ([:book:](https://browse.arxiv.org/pdf/2404.11614.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11614.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11614)), ([:house:](https://huggingface.co/papers/2404.11614)), ([HTML](https://browse.arxiv.org/html/2404.11614v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.11614)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11614v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11614)), ([SS](https://api.semanticscholar.org/arXiv:2404.11614))

  * 04/17 - **Stable Diffusion 3 API Now Available** 
  ([twitter](https://twitter.com/StabilityAI/status/1780599024707596508)), ([Blog](https://stability.ai/news/stable-diffusion-3-api?utm_source=twitter&utm_medium=website&utm_campaign=blog)), ([Demo](https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post))

  * 04/16 - **VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time** 
([:x:](https://arxiv.org/abs/2404.10667)), ([:book:](https://browse.arxiv.org/pdf/2404.10667.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.10667.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.10667)), ([:house:](https://huggingface.co/papers/2404.10667)), ([HTML](https://browse.arxiv.org/html/2404.10667v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.10667)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.10667v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.10667)), ([SS](https://api.semanticscholar.org/arXiv:2404.10667)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/vasa-1-lifelike-audio-driven-talking-faces))

  * 04/16 - **U.S. Commerce Secretary Gina Raimondo Announces Expansion of U.S. AI Safety Institute Leadership Team** 
  ([News](https://www.commerce.gov/news/press-releases/2024/04/us-commerce-secretary-gina-raimondo-announces-expansion-us-ai-safety))

  * 04/16 - **Long-form music generation with latent diffusion** 
([:x:](https://arxiv.org/abs/2404.10301)), ([:book:](https://browse.arxiv.org/pdf/2404.10301.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.10301.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.10301)), ([:house:](https://huggingface.co/papers/2404.10301)), ([HTML](https://browse.arxiv.org/html/2404.10301v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.10301)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.10301v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.10301)), ([SS](https://api.semanticscholar.org/arXiv:2404.10301))

  * 04/15 - **LLM Evaluators Recognize and Favor Their Own Generations** 
([:x:](https://arxiv.org/abs/2404.13076)), ([:book:](https://browse.arxiv.org/pdf/2404.13076.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13076.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13076)), ([:house:](https://huggingface.co/papers/2404.13076)), ([HTML](https://browse.arxiv.org/html/2404.13076v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.13076)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13076v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13076)), ([SS](https://api.semanticscholar.org/arXiv:2404.13076))

  * 04/15 - **Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video** 
([:x:](https://arxiv.org/abs/2404.09833)), ([:book:](https://browse.arxiv.org/pdf/2404.09833.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09833.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09833)), ([:house:](https://huggingface.co/papers/2404.09833)), ([HTML](https://browse.arxiv.org/html/2404.09833v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09833)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09833v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09833)), ([SS](https://api.semanticscholar.org/arXiv:2404.09833)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/video2game-real-time-interactive-realistic))

  * 04/15 - **Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization** 
([:x:](https://arxiv.org/abs/2404.09956)), ([:book:](https://browse.arxiv.org/pdf/2404.09956.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09956.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09956)), ([:house:](https://huggingface.co/papers/2404.09956)), ([HTML](https://browse.arxiv.org/html/2404.09956v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09956)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09956v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09956)), ([SS](https://api.semanticscholar.org/arXiv:2404.09956)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/tango-2-aligning-diffusion-based-text-to)), ([:octocat:](https://github.com/declare-lab/tango)![GitHub Repo stars](https://img.shields.io/github/stars/declare-lab/tango?style=social))

  * 04/15 - **Taming Latent Diffusion Model for Neural Radiance Field Inpainting** 
([:x:](https://arxiv.org/abs/2404.09995)), ([:book:](https://browse.arxiv.org/pdf/2404.09995.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09995.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09995)), ([:house:](https://huggingface.co/papers/2404.09995)), ([HTML](https://browse.arxiv.org/html/2404.09995v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09995)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09995v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09995)), ([SS](https://api.semanticscholar.org/arXiv:2404.09995)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/taming-latent-diffusion-model-for-neural))

  * 04/15 - **Opus can operate as a Turing machine** 
  ([twitter](https://twitter.com/ctjlewis/status/1779740038852690393))

  * 04/15 - **MathGPT: Leveraging Llama 2 to create a platform for highly personalized learning** 

  * 04/15 - **HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing** 
([:x:](https://arxiv.org/abs/2404.09990)), ([:book:](https://browse.arxiv.org/pdf/2404.09990.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09990.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09990)), ([:house:](https://huggingface.co/papers/2404.09990)), ([HTML](https://browse.arxiv.org/html/2404.09990v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09990)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09990v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09990)), ([SS](https://api.semanticscholar.org/arXiv:2404.09990)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/hq-edit-a-high-quality-dataset-for))

  * 04/15 - **Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model** 
([:x:](https://arxiv.org/abs/2404.09967)), ([:book:](https://browse.arxiv.org/pdf/2404.09967.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09967.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09967)), ([:house:](https://huggingface.co/papers/2404.09967)), ([HTML](https://browse.arxiv.org/html/2404.09967v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09967)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09967v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09967)), ([SS](https://api.semanticscholar.org/arXiv:2404.09967)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/ctrl-adapter-an-efficient-and-versatile))

  * 04/15 - **Compression Represents Intelligence Linearly** 
([:x:](https://arxiv.org/abs/2404.09937)), ([:book:](https://browse.arxiv.org/pdf/2404.09937.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09937.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09937)), ([:house:](https://huggingface.co/papers/2404.09937)), ([HTML](https://browse.arxiv.org/html/2404.09937v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09937)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09937v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09937)), ([SS](https://api.semanticscholar.org/arXiv:2404.09937)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/compression-represents-intelligence-linearly))

  * 04/15 - **CompGS: Efficient 3D Scene Representation via Compressed Gaussian Splatting** 
([:x:](https://arxiv.org/abs/2404.09458)), ([:book:](https://browse.arxiv.org/pdf/2404.09458.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09458.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09458)), ([:house:](https://huggingface.co/papers/2404.09458)), ([HTML](https://browse.arxiv.org/html/2404.09458v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09458)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09458v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09458)), ([SS](https://api.semanticscholar.org/arXiv:2404.09458)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/compgs-efficient-3d-scene-representation-via))

  * 04/14 - **TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models** 
([:x:](https://arxiv.org/abs/2404.09204)), ([:book:](https://browse.arxiv.org/pdf/2404.09204.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09204.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09204)), ([:house:](https://huggingface.co/papers/2404.09204)), ([HTML](https://browse.arxiv.org/html/2404.09204v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid&pid=2404.09204)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09204v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09204)), ([SS](https://api.semanticscholar.org/arXiv:2404.09204)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/texthawk-exploring-efficient-fine-grained)), ([:octocat:](https://github.com/yuyq96/texthawk)![GitHub Repo stars](https://img.shields.io/github/stars/yuyq96/texthawk?style=social))

  * 04/13 - **Cathie Wood Muscles Into ChatGPT Boom With New OpenAI Stake** 
  ([News](https://finance.yahoo.com/news/cathie-wood-ark-investment-management-232619722.html)),

  * 04/12 - **Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies** 
([:x:](https://arxiv.org/abs/2404.08197)), ([:book:](https://browse.arxiv.org/pdf/2404.08197.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.08197.pdf)),  ([:orange_book:](https://www.a