{"id":13584207,"url":"https://github.com/hollobit/GenAI_LLM_timeline","last_synced_at":"2025-04-07T01:31:42.958Z","repository":{"id":153136078,"uuid":"618727061","full_name":"hollobit/GenAI_LLM_timeline","owner":"hollobit","description":"ChatGPT, GenerativeAI and LLMs Timeline","archived":false,"fork":false,"pushed_at":"2024-05-19T23:57:02.000Z","size":3302,"stargazers_count":953,"open_issues_count":4,"forks_count":58,"subscribers_count":84,"default_branch":"main","last_synced_at":"2025-04-03T17:12:30.168Z","etag":null,"topics":["agi","chatgpt","chatgpt-api","claude","copilot","generative-ai","generative-models","gpt","langchain","large-language-models","llama","llm","midjourney","openai","palm-e","stable-diffusion","timeline","transformer","vall-e"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hollobit.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-25T07:00:16.000Z","updated_at":"2025-03-26T08:36:39.000Z","dependencies_parsed_at":null,"dependency_job_id":"54f06c68-751c-461c-887d-1e7b711e404b","html_url":"https://github.com/hollobit/GenAI_LLM_timeline","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hollobit%2FGenAI_LLM_timeline","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hollobit%2FGenAI_LLM_timeline/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hollobit%2FGenAI_LLM_timeline/releases"
,"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hollobit%2FGenAI_LLM_timeline/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hollobit","download_url":"https://codeload.github.com/hollobit/GenAI_LLM_timeline/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247577956,"owners_count":20961203,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agi","chatgpt","chatgpt-api","claude","copilot","generative-ai","generative-models","gpt","langchain","large-language-models","llama","llm","midjourney","openai","palm-e","stable-diffusion","timeline","transformer","vall-e"],"created_at":"2024-08-01T15:04:05.117Z","updated_at":"2025-04-07T01:31:37.937Z","avatar_url":"https://github.com/hollobit.png","language":null,"funding_links":[],"categories":["Others","Documentation and examples","chatgpt","stable-diffusion"],"sub_categories":["Lists, Guides and examples"],"readme":"# ChatGPT, GenerativeAI and LLMs Timeline \n\nThis repository organizes a timeline of key events (products, services, papers, GitHub, blog posts and news) that occurred before and after the ChatGPT announcement. \n\nThe timeline curates a variety of information, with a particular focus on LLMs and generative AI. 
\n\nThis may be a scene from one of the most heated moments in history, so I thought it was important to preserve these memories well and organized them here.\n\n## Statistics \n\nThese diagrams were generated by ChatGPT's Code Interpreter.\n\n\u003cimg src=\"statistics-1224-02.png\"\u003e \n\u003cimg src=\"statistics-1224-01.png\"\u003e\n\n## Contributing\n\nIssues and Pull Requests are greatly appreciated. If you've never contributed to an open source project before, I'm more than happy to walk you through how to create a pull request.\n\nYou can start by [opening an issue](https://github.com/hollobit/BCAC_timeline/issues/new) describing the problem that you're looking to resolve and we'll go from there.\n\n## Emoji \n\narXiv :x:, PDF :paperclip:, arxiv-vanity :orange_book:, paper page :house:, papers with code :eight_spoked_asterisk:, GitHub :octocat:\n\n## License\n\nThis document is licensed under the [MIT license](https://opensource.org/licenses/mit-license.php) © Jonghong Jeon(전종홍)\n\n## Timeline V2\n\n### 2024\n\n  * 05/17 - **OpenAI strikes Reddit deal to train its AI on your posts** \u003cbr\u003e  ([News](https://www.theverge.com/2024/5/16/24158529/reddit-openai-chatgpt-api-access-advertising)), \n  * 05/17 - **OpenAI dissolves team focused on long-term AI risks, less than one year after announcing it** \u003cbr\u003e  ([News](https://www.cnbc.com/2024/05/17/openai-superalignment-sutskever-leike.html)), \n  * 05/17 - **International Scientific Report on the Safety of Advanced AI** \u003cbr\u003e  ([Blog](https://www.gov.uk/government/publications/international-scientific-report-on-the-safety-of-advanced-ai)), \n  * 05/16 - **TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.10315)), ([:book:](https://browse.arxiv.org/pdf/2405.10315.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10315.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10315)), 
([:house:](https://huggingface.co/papers/2405.10315)), ([HTML](https://browse.arxiv.org/html/2405.10315v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.10315)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10315v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10315)), ([SS](https://api.semanticscholar.org/arXiv:2405.10315))\n  * 05/16 - **Toon3D: Seeing Cartoons from a New Perspective** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.10320)), ([:book:](https://browse.arxiv.org/pdf/2405.10320.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10320.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10320)), ([:house:](https://huggingface.co/papers/2405.10320)), ([HTML](https://browse.arxiv.org/html/2405.10320v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.10320)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10320v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10320)), ([SS](https://api.semanticscholar.org/arXiv:2405.10320))\n  * 05/16 - **Testing the reliability of an AI-based large language model to extract ecological information from the scientific literature** \u003cbr\u003e  ([News](https://www.nature.com/articles/s44185-024-00043-9)), \n  * 05/16 - **Many-Shot In-Context Learning in Multimodal Foundation Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.09798)), ([:book:](https://browse.arxiv.org/pdf/2405.09798.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09798.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09798)), ([:house:](https://huggingface.co/papers/2405.09798)), ([HTML](https://browse.arxiv.org/html/2405.09798v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.09798)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09798v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09798)), ([SS](https://api.semanticscholar.org/arXiv:2405.09798))\n  * 05/16 - **How to Hit Pause on AI Before 
It’s Too Late** \u003cbr\u003e  ([News](https://time.com/6978790/how-to-pause-artificial-intelligence/)), \n  * 05/16 - **Grounding DINO 1.5: Advance the \"Edge\" of Open-Set Object Detection** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.10300)), ([:book:](https://browse.arxiv.org/pdf/2405.10300.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10300.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10300)), ([:house:](https://huggingface.co/papers/2405.10300)), ([HTML](https://browse.arxiv.org/html/2405.10300v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.10300)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10300v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10300)), ([SS](https://api.semanticscholar.org/arXiv:2405.10300))\n  * 05/16 - **GPT Store Mining and Analysis** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.10210)), ([:book:](https://browse.arxiv.org/pdf/2405.10210.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10210.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10210)), ([:house:](https://huggingface.co/papers/2405.10210)), ([HTML](https://browse.arxiv.org/html/2405.10210v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.10210)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10210v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10210)), ([SS](https://api.semanticscholar.org/arXiv:2405.10210))\n  * 05/16 - **Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.09874)), ([:book:](https://browse.arxiv.org/pdf/2405.09874.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09874.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09874)), ([:house:](https://huggingface.co/papers/2405.09874)), ([HTML](https://browse.arxiv.org/html/2405.09874v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.09874)), 
([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09874v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09874)), ([SS](https://api.semanticscholar.org/arXiv:2405.09874))\n  * 05/16 - **Chameleon: Mixed-Modal Early-Fusion Foundation Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.09818)), ([:book:](https://browse.arxiv.org/pdf/2405.09818.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09818.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09818)), ([:house:](https://huggingface.co/papers/2405.09818)), ([HTML](https://browse.arxiv.org/html/2405.09818v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.09818)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09818v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09818)), ([SS](https://api.semanticscholar.org/arXiv:2405.09818))\n  * 05/16 - **CAT3D: Create Anything in 3D with Multi-View Diffusion Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.10314)), ([:book:](https://browse.arxiv.org/pdf/2405.10314.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.10314.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.10314)), ([:house:](https://huggingface.co/papers/2405.10314)), ([HTML](https://browse.arxiv.org/html/2405.10314v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.10314)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.10314v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.10314)), ([SS](https://api.semanticscholar.org/arXiv:2405.10314))\n  * 05/15 - **Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.09215)), ([:book:](https://browse.arxiv.org/pdf/2405.09215.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09215.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09215)), ([:house:](https://huggingface.co/papers/2405.09215)), ([HTML](https://browse.arxiv.org/html/2405.09215v1)), 
([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.09215)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09215v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09215)), ([SS](https://api.semanticscholar.org/arXiv:2405.09215))\n  * 05/15 - **LoRA Learns Less and Forgets Less** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.09673)), ([:book:](https://browse.arxiv.org/pdf/2405.09673.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09673.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09673)), ([:house:](https://huggingface.co/papers/2405.09673)), ([HTML](https://browse.arxiv.org/html/2405.09673v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.09673)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09673v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09673)), ([SS](https://api.semanticscholar.org/arXiv:2405.09673))\n  * 05/15 - **Google’s invisible AI watermark will help identify generative text and video** \u003cbr\u003e  ([News](https://www.theverge.com/2024/5/14/24155927/google-ai-synthid-watermark-text-video-io)), \n  * 05/15 - **Google I/O 2024: everything announced** \u003cbr\u003e  ([Blog](https://www.theverge.com/24153841/google-io-2024-ai-gemini-android-chrome-photos)), \n  * 05/15 - **BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.09546)), ([:book:](https://browse.arxiv.org/pdf/2405.09546.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09546.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09546)), ([:house:](https://huggingface.co/papers/2405.09546)), ([HTML](https://browse.arxiv.org/html/2405.09546v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.09546)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09546v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09546)), ([SS](https://api.semanticscholar.org/arXiv:2405.09546))\n  * 
05/15 - **ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.09220)), ([:book:](https://browse.arxiv.org/pdf/2405.09220.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.09220.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.09220)), ([:house:](https://huggingface.co/papers/2405.09220)), ([HTML](https://browse.arxiv.org/html/2405.09220v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.09220)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.09220v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.09220)), ([SS](https://api.semanticscholar.org/arXiv:2405.09220))\n  * 05/14 - **Understanding the performance gap between online and offline alignment algorithms** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.08448)), ([:book:](https://browse.arxiv.org/pdf/2405.08448.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08448.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08448)), ([:house:](https://huggingface.co/papers/2405.08448)), ([HTML](https://browse.arxiv.org/html/2405.08448v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.08448)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08448v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08448)), ([SS](https://api.semanticscholar.org/arXiv:2405.08448))\n  * 05/14 - **SpeechVerse: A Large-scale Generalizable Audio Language Model** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.08295)), ([:book:](https://browse.arxiv.org/pdf/2405.08295.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08295.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08295)), ([:house:](https://huggingface.co/papers/2405.08295)), ([HTML](https://browse.arxiv.org/html/2405.08295v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.08295)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08295v1/)), 
([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08295)), ([SS](https://api.semanticscholar.org/arXiv:2405.08295))\n  * 05/14 - **SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.08317)), ([:book:](https://browse.arxiv.org/pdf/2405.08317.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08317.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08317)), ([:house:](https://huggingface.co/papers/2405.08317)), ([HTML](https://browse.arxiv.org/html/2405.08317v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.08317)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08317v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08317)), ([SS](https://api.semanticscholar.org/arXiv:2405.08317))\n  * 05/14 - **No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.08344)), ([:book:](https://browse.arxiv.org/pdf/2405.08344.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08344.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08344)), ([:house:](https://huggingface.co/papers/2405.08344)), ([HTML](https://browse.arxiv.org/html/2405.08344v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.08344)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08344v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08344)), ([SS](https://api.semanticscholar.org/arXiv:2405.08344))\n  * 05/14 - **Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.08748)), ([:book:](https://browse.arxiv.org/pdf/2405.08748.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08748.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08748)), ([:house:](https://huggingface.co/papers/2405.08748)), 
([HTML](https://browse.arxiv.org/html/2405.08748v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.08748)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08748v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08748)), ([SS](https://api.semanticscholar.org/arXiv:2405.08748))\n  * 05/14 - **Compositional Text-to-Image Generation with Dense Blob Representations** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.08246)), ([:book:](https://browse.arxiv.org/pdf/2405.08246.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08246.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08246)), ([:house:](https://huggingface.co/papers/2405.08246)), ([HTML](https://browse.arxiv.org/html/2405.08246v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.08246)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08246v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08246)), ([SS](https://api.semanticscholar.org/arXiv:2405.08246))\n  * 05/14 - **Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.08707)), ([:book:](https://browse.arxiv.org/pdf/2405.08707.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08707.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08707)), ([:house:](https://huggingface.co/papers/2405.08707)), ([HTML](https://browse.arxiv.org/html/2405.08707v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.08707)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08707v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08707)), ([SS](https://api.semanticscholar.org/arXiv:2405.08707))\n  * 05/13 - **SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.07518)), ([:book:](https://browse.arxiv.org/pdf/2405.07518.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07518.pdf)),  
([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07518)), ([:house:](https://huggingface.co/papers/2405.07518)), ([HTML](https://browse.arxiv.org/html/2405.07518v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.07518)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07518v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07518)), ([SS](https://api.semanticscholar.org/arXiv:2405.07518))\n  * 05/13 - **RLHF Workflow: From Reward Modeling to Online RLHF** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.07863)), ([:book:](https://browse.arxiv.org/pdf/2405.07863.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07863.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07863)), ([:house:](https://huggingface.co/papers/2405.07863)), ([HTML](https://browse.arxiv.org/html/2405.07863v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.07863)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07863v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07863)), ([SS](https://api.semanticscholar.org/arXiv:2405.07863))\n  * 05/13 - **Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.07990)), ([:book:](https://browse.arxiv.org/pdf/2405.07990.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07990.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07990)), ([:house:](https://huggingface.co/papers/2405.07990)), ([HTML](https://browse.arxiv.org/html/2405.07990v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.07990)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07990v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07990)), ([SS](https://api.semanticscholar.org/arXiv:2405.07990))\n  * 05/13 - **OpenAI unveils newest AI model, GPT-4o** \u003cbr\u003e  
([News](https://edition.cnn.com/2024/05/13/tech/openai-altman-new-ai-model-gpt-4o/index.html)), \n  * 05/13 - **MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.07526)), ([:book:](https://browse.arxiv.org/pdf/2405.07526.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07526.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07526)), ([:house:](https://huggingface.co/papers/2405.07526)), ([HTML](https://browse.arxiv.org/html/2405.07526v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.07526)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07526v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07526)), ([SS](https://api.semanticscholar.org/arXiv:2405.07526))\n  * 05/13 - **How Much Research Is Being Written by Large Language Models?** \u003cbr\u003e  ([Blog](https://hai.stanford.edu/news/how-much-research-being-written-large-language-models)), \n  * 05/13 - **Hello GPT-4o** \u003cbr\u003e  ([Blog](https://openai.com/index/hello-gpt-4o/)), \n  * 05/13 - **Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.08054)), ([:book:](https://browse.arxiv.org/pdf/2405.08054.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.08054.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.08054)), ([:house:](https://huggingface.co/papers/2405.08054)), ([HTML](https://browse.arxiv.org/html/2405.08054v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.08054)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.08054v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.08054)), ([SS](https://api.semanticscholar.org/arXiv:2405.08054))\n  * 05/11 - **Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.06932)), 
([:book:](https://browse.arxiv.org/pdf/2405.06932.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.06932.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.06932)), ([:house:](https://huggingface.co/papers/2405.06932)), ([HTML](https://browse.arxiv.org/html/2405.06932v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.06932)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.06932v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.06932)), ([SS](https://api.semanticscholar.org/arXiv:2405.06932))\n  * 05/11 - **LogoMotion: Visually Grounded Code Generation for Content-Aware Animation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.07065)), ([:book:](https://browse.arxiv.org/pdf/2405.07065.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.07065.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.07065)), ([:house:](https://huggingface.co/papers/2405.07065)), ([HTML](https://browse.arxiv.org/html/2405.07065v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.07065)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.07065v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.07065)), ([SS](https://api.semanticscholar.org/arXiv:2405.07065))\n  * 05/10 - **INSPECT - An open-source framework for large language model evaluations** \u003cbr\u003e  ([Blog](https://ukgovernmentbeis.github.io/inspect_ai/)), \n  * 05/10 - **AI Safety Institute releases new AI safety evaluations platform** \u003cbr\u003e  ([News](https://www.gov.uk/government/news/ai-safety-institute-releases-new-ai-safety-evaluations-platform)), \n  * 05/07 - **SUTRA: Scalable Multilingual Language Model Architecture** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.06694)), ([:book:](https://browse.arxiv.org/pdf/2405.06694.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.06694.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.06694)), ([:house:](https://huggingface.co/papers/2405.06694)), 
([HTML](https://browse.arxiv.org/html/2405.06694v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.06694)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.06694v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.06694)), ([SS](https://api.semanticscholar.org/arXiv:2405.06694))\n  * 05/07 - **Meta Releases Llama 3 Open-Source LLM** \u003cbr\u003e  ([News](https://www.infoq.com/news/2024/05/meta-llama-3/)), \n  * 05/03 - **What matters when building vision-language models?** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.02246)), ([:book:](https://browse.arxiv.org/pdf/2405.02246.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.02246.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.02246)), ([:house:](https://huggingface.co/papers/2405.02246)), ([HTML](https://browse.arxiv.org/html/2405.02246v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.02246)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.02246v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.02246)), ([SS](https://api.semanticscholar.org/arXiv:2405.02246))\n  * 05/02 - **WildChat: 1M ChatGPT Interaction Logs in the Wild** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.01470)), ([:book:](https://browse.arxiv.org/pdf/2405.01470.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01470.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01470)), ([:house:](https://huggingface.co/papers/2405.01470)), ([HTML](https://browse.arxiv.org/html/2405.01470v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.01470)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01470v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01470)), ([SS](https://api.semanticscholar.org/arXiv:2405.01470))\n  * 05/02 - **StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.01434)), 
([:book:](https://browse.arxiv.org/pdf/2405.01434.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01434.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01434)), ([:house:](https://huggingface.co/papers/2405.01434)), ([HTML](https://browse.arxiv.org/html/2405.01434v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.01434)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01434v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01434)), ([SS](https://api.semanticscholar.org/arXiv:2405.01434))\n  * 05/02 - **Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.01535)), ([:book:](https://browse.arxiv.org/pdf/2405.01535.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01535.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01535)), ([:house:](https://huggingface.co/papers/2405.01535)), ([HTML](https://browse.arxiv.org/html/2405.01535v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.01535)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01535v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01535)), ([SS](https://api.semanticscholar.org/arXiv:2405.01535))\n  * 05/02 - **NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.01481)), ([:book:](https://browse.arxiv.org/pdf/2405.01481.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01481.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01481)), ([:house:](https://huggingface.co/papers/2405.01481)), ([HTML](https://browse.arxiv.org/html/2405.01481v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.01481)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01481v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01481)), ([SS](https://api.semanticscholar.org/arXiv:2405.01481))\n  * 05/02 - **LLM-AD: Large Language 
Model based Audio Description System** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.00983)), ([:book:](https://browse.arxiv.org/pdf/2405.00983.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00983.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00983)), ([:house:](https://huggingface.co/papers/2405.00983)), ([HTML](https://browse.arxiv.org/html/2405.00983v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.00983)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00983v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00983)), ([SS](https://api.semanticscholar.org/arXiv:2405.00983))\n  * 05/02 - **FLAME: Factuality-Aware Alignment for Large Language Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.01525)), ([:book:](https://browse.arxiv.org/pdf/2405.01525.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01525.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01525)), ([:house:](https://huggingface.co/papers/2405.01525)), ([HTML](https://browse.arxiv.org/html/2405.01525v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.01525)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01525v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01525)), ([SS](https://api.semanticscholar.org/arXiv:2405.01525))\n  * 05/02 - **Customizing Text-to-Image Models with a Single Image Pair** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.01536)), ([:book:](https://browse.arxiv.org/pdf/2405.01536.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.01536.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.01536)), ([:house:](https://huggingface.co/papers/2405.01536)), ([HTML](https://browse.arxiv.org/html/2405.01536v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.01536)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.01536v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.01536)), 
([SS](https://api.semanticscholar.org/arXiv:2405.01536))\n  * 05/01 - **Spectrally Pruned Gaussian Fields with Neural Compensation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.00676)), ([:book:](https://browse.arxiv.org/pdf/2405.00676.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00676.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00676)), ([:house:](https://huggingface.co/papers/2405.00676)), ([HTML](https://browse.arxiv.org/html/2405.00676v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.00676)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00676v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00676)), ([SS](https://api.semanticscholar.org/arXiv:2405.00676))\n  * 05/01 - **Self-Play Preference Optimization for Language Model Alignment** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.00675)), ([:book:](https://browse.arxiv.org/pdf/2405.00675.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00675.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00675)), ([:house:](https://huggingface.co/papers/2405.00675)), ([HTML](https://browse.arxiv.org/html/2405.00675v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.00675)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00675v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00675)), ([SS](https://api.semanticscholar.org/arXiv:2405.00675))\n  * 05/01 - **Is Bigger Edit Batch Size Always Better? 
-- An Empirical Study on Model Editing with Llama-3** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.00664)), ([:book:](https://browse.arxiv.org/pdf/2405.00664.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00664.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00664)), ([:house:](https://huggingface.co/papers/2405.00664)), ([HTML](https://browse.arxiv.org/html/2405.00664v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.00664)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00664v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00664)), ([SS](https://api.semanticscholar.org/arXiv:2405.00664))\n  * 05/01 - **Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.00263)), ([:book:](https://browse.arxiv.org/pdf/2405.00263.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00263.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00263)), ([:house:](https://huggingface.co/papers/2405.00263)), ([HTML](https://browse.arxiv.org/html/2405.00263v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.00263)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00263v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00263)), ([SS](https://api.semanticscholar.org/arXiv:2405.00263))\n  * 05/01 - **A Careful Examination of Large Language Model Performance on Grade School Arithmetic** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.00332)), ([:book:](https://browse.arxiv.org/pdf/2405.00332.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00332.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00332)), ([:house:](https://huggingface.co/papers/2405.00332)), ([HTML](https://browse.arxiv.org/html/2405.00332v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.00332)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00332v1/)), 
([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00332)), ([SS](https://api.semanticscholar.org/arXiv:2405.00332))\n  * 04/30 - **Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19752)), ([:book:](https://browse.arxiv.org/pdf/2404.19752.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19752.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19752)), ([:house:](https://huggingface.co/papers/2404.19752)), ([HTML](https://browse.arxiv.org/html/2404.19752v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19752)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19752v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19752)), ([SS](https://api.semanticscholar.org/arXiv:2404.19752))\n  * 04/30 - **STT: Stateful Tracking with Transformers for Autonomous Driving** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.00236)), ([:book:](https://browse.arxiv.org/pdf/2405.00236.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00236.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00236)), ([:house:](https://huggingface.co/papers/2405.00236)), ([HTML](https://browse.arxiv.org/html/2405.00236v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.00236)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00236v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00236)), ([SS](https://api.semanticscholar.org/arXiv:2405.00236))\n  * 04/30 - **SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.00233)), ([:book:](https://browse.arxiv.org/pdf/2405.00233.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00233.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00233)), ([:house:](https://huggingface.co/papers/2405.00233)), ([HTML](https://browse.arxiv.org/html/2405.00233v1)), 
([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.00233)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00233v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00233)), ([SS](https://api.semanticscholar.org/arXiv:2405.00233))\n  * 04/30 - **Octopus v4: Graph of language models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19296)), ([:book:](https://browse.arxiv.org/pdf/2404.19296.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19296.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19296)), ([:house:](https://huggingface.co/papers/2404.19296)), ([HTML](https://browse.arxiv.org/html/2404.19296v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19296)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19296v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19296)), ([SS](https://api.semanticscholar.org/arXiv:2404.19296))\n  * 04/30 - **MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19759)), ([:book:](https://browse.arxiv.org/pdf/2404.19759.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19759.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19759)), ([:house:](https://huggingface.co/papers/2404.19759)), ([HTML](https://browse.arxiv.org/html/2404.19759v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19759)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19759v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19759)), ([SS](https://api.semanticscholar.org/arXiv:2404.19759))\n  * 04/30 - **MicroDreamer: Zero-shot 3D Generation in ~20 Seconds by Score-based Iterative Reconstruction** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19525)), ([:book:](https://browse.arxiv.org/pdf/2404.19525.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19525.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19525)), 
([:house:](https://huggingface.co/papers/2404.19525)), ([HTML](https://browse.arxiv.org/html/2404.19525v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19525)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19525v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19525)), ([SS](https://api.semanticscholar.org/arXiv:2404.19525))\n  * 04/30 - **Lightplane: Highly-Scalable Components for Neural 3D Fields** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19760)), ([:book:](https://browse.arxiv.org/pdf/2404.19760.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19760.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19760)), ([:house:](https://huggingface.co/papers/2404.19760)), ([HTML](https://browse.arxiv.org/html/2404.19760v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19760)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19760v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19760)), ([SS](https://api.semanticscholar.org/arXiv:2404.19760))\n  * 04/30 - **KAN: Kolmogorov-Arnold Networks** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19756)), ([:book:](https://browse.arxiv.org/pdf/2404.19756.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19756.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19756)), ([:house:](https://huggingface.co/papers/2404.19756)), ([HTML](https://browse.arxiv.org/html/2404.19756v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19756)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19756v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19756)), ([SS](https://api.semanticscholar.org/arXiv:2404.19756))\n  * 04/30 - **Iterative Reasoning Preference Optimization** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19733)), ([:book:](https://browse.arxiv.org/pdf/2404.19733.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19733.pdf)),  
([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19733)), ([:house:](https://huggingface.co/papers/2404.19733)), ([HTML](https://browse.arxiv.org/html/2404.19733v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19733)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19733v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19733)), ([SS](https://api.semanticscholar.org/arXiv:2404.19733))\n  * 04/30 - **Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19758)), ([:book:](https://browse.arxiv.org/pdf/2404.19758.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19758.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19758)), ([:house:](https://huggingface.co/papers/2404.19758)), ([HTML](https://browse.arxiv.org/html/2404.19758v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19758)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19758v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19758)), ([SS](https://api.semanticscholar.org/arXiv:2404.19758))\n  * 04/30 - **InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19427)), ([:book:](https://browse.arxiv.org/pdf/2404.19427.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19427.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19427)), ([:house:](https://huggingface.co/papers/2404.19427)), ([HTML](https://browse.arxiv.org/html/2404.19427v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19427)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19427v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19427)), ([SS](https://api.semanticscholar.org/arXiv:2404.19427))\n  * 04/30 - **GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19702)), 
([:book:](https://browse.arxiv.org/pdf/2404.19702.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19702.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19702)), ([:house:](https://huggingface.co/papers/2404.19702)), ([HTML](https://browse.arxiv.org/html/2404.19702v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19702)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19702v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19702)), ([SS](https://api.semanticscholar.org/arXiv:2404.19702))\n  * 04/30 - **Extending Llama-3's Context Ten-Fold Overnight** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19553)), ([:book:](https://browse.arxiv.org/pdf/2404.19553.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19553.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19553)), ([:house:](https://huggingface.co/papers/2404.19553)), ([HTML](https://browse.arxiv.org/html/2404.19553v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19553)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19553v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19553)), ([SS](https://api.semanticscholar.org/arXiv:2404.19553))\n  * 04/30 - **DOCCI: Descriptions of Connected and Contrasting Images** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19753)), ([:book:](https://browse.arxiv.org/pdf/2404.19753.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19753.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19753)), ([:house:](https://huggingface.co/papers/2404.19753)), ([HTML](https://browse.arxiv.org/html/2404.19753v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19753)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19753v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19753)), ([SS](https://api.semanticscholar.org/arXiv:2404.19753))\n  * 04/30 - **Better \u0026 Faster Large Language Models via Multi-token Prediction** 
\u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19737)), ([:book:](https://browse.arxiv.org/pdf/2404.19737.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19737.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19737)), ([:house:](https://huggingface.co/papers/2404.19737)), ([HTML](https://browse.arxiv.org/html/2404.19737v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19737)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19737v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19737)), ([SS](https://api.semanticscholar.org/arXiv:2404.19737))\n  * 04/29 - **Stylus: Automatic Adapter Selection for Diffusion Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.18928)), ([:book:](https://browse.arxiv.org/pdf/2404.18928.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18928.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18928)), ([:house:](https://huggingface.co/papers/2404.18928)), ([HTML](https://browse.arxiv.org/html/2404.18928v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.18928)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18928v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18928)), ([SS](https://api.semanticscholar.org/arXiv:2404.18928))\n  * 04/29 - **SAGS: Structure-Aware 3D Gaussian Splatting** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.19149)), ([:book:](https://browse.arxiv.org/pdf/2404.19149.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.19149.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.19149)), ([:house:](https://huggingface.co/papers/2404.19149)), ([HTML](https://browse.arxiv.org/html/2404.19149v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.19149)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.19149v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.19149)), ([SS](https://api.semanticscholar.org/arXiv:2404.19149))\n  * 04/29 - **Replacing Judges with 
Juries: Evaluating LLM Generations with a Panel of Diverse Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.18796)), ([:book:](https://browse.arxiv.org/pdf/2404.18796.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18796.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18796)), ([:house:](https://huggingface.co/papers/2404.18796)), ([HTML](https://browse.arxiv.org/html/2404.18796v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.18796)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18796v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18796)), ([SS](https://api.semanticscholar.org/arXiv:2404.18796))\n  * 04/29 - **NIST  AI RMF Generative AI Profile** \u003cbr\u003e  ([News](https://www.nist.gov/news-events/news/2024/04/department-commerce-announces-new-actions-implement-president-bidens)), \n  * 04/29 - **LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report** \u003cbr\u003e([:x:](https://arxiv.org/abs/2405.00732)), ([:book:](https://browse.arxiv.org/pdf/2405.00732.pdf)), ([:paperclip:](https://arxiv.org/pdf/2405.00732.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2405.00732)), ([:house:](https://huggingface.co/papers/2405.00732)), ([HTML](https://browse.arxiv.org/html/2405.00732v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2405.00732)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2405.00732v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2405.00732)), ([SS](https://api.semanticscholar.org/arXiv:2405.00732))\n  * 04/29 - **Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.18911)), ([:book:](https://browse.arxiv.org/pdf/2404.18911.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18911.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18911)), ([:house:](https://huggingface.co/papers/2404.18911)), ([HTML](https://browse.arxiv.org/html/2404.18911v1)), 
([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.18911)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18911v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18911)), ([SS](https://api.semanticscholar.org/arXiv:2404.18911))\n  * 04/29 - **Capabilities of Gemini Models in Medicine** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.18416)), ([:book:](https://browse.arxiv.org/pdf/2404.18416.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18416.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18416)), ([:house:](https://huggingface.co/papers/2404.18416)), ([HTML](https://browse.arxiv.org/html/2404.18416v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.18416)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18416v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18416)), ([SS](https://api.semanticscholar.org/arXiv:2404.18416))\n  * 04/28 - **Paint by Inpaint: Learning to Add Image Objects by Removing Them First** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.18212)), ([:book:](https://browse.arxiv.org/pdf/2404.18212.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18212.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18212)), ([:house:](https://huggingface.co/papers/2404.18212)), ([HTML](https://browse.arxiv.org/html/2404.18212v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.18212)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18212v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18212)), ([SS](https://api.semanticscholar.org/arXiv:2404.18212))\n  * 04/28 - **LEGENT: Open Platform for Embodied Agents** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.18243)), ([:book:](https://browse.arxiv.org/pdf/2404.18243.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.18243.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.18243)), ([:house:](https://huggingface.co/papers/2404.18243)), 
([HTML](https://browse.arxiv.org/html/2404.18243v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.18243)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.18243v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.18243)), ([SS](https://api.semanticscholar.org/arXiv:2404.18243))\n  * 04/27 - **Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.17521)), ([:book:](https://browse.arxiv.org/pdf/2404.17521.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.17521.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.17521)), ([:house:](https://huggingface.co/papers/2404.17521)), ([HTML](https://browse.arxiv.org/html/2404.17521v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.17521)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.17521v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.17521)), ([SS](https://api.semanticscholar.org/arXiv:2404.17521))\n  * 04/26 - **MaPa: Text-driven Photorealistic Material Painting for 3D Shapes** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.17569)), ([:book:](https://browse.arxiv.org/pdf/2404.17569.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.17569.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.17569)), ([:house:](https://huggingface.co/papers/2404.17569)), ([HTML](https://browse.arxiv.org/html/2404.17569v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.17569)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.17569v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.17569)), ([SS](https://api.semanticscholar.org/arXiv:2404.17569))\n  * 04/26 - **BlenderAlchemy: Editing 3D Graphics with Vision-Language Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.17672)), ([:book:](https://browse.arxiv.org/pdf/2404.17672.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.17672.pdf)),  
([:orange_book:](https://www.arxiv-vanity.com/papers/2404.17672)), ([:house:](https://huggingface.co/papers/2404.17672)), ([HTML](https://browse.arxiv.org/html/2404.17672v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.17672)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.17672v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.17672)), ([SS](https://api.semanticscholar.org/arXiv:2404.17672))\n  * 04/25 - **Tele-FLM Technical Report** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16645)), ([:book:](https://browse.arxiv.org/pdf/2404.16645.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16645.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16645)), ([:house:](https://huggingface.co/papers/2404.16645)), ([HTML](https://browse.arxiv.org/html/2404.16645v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16645)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16645v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16645)), ([SS](https://api.semanticscholar.org/arXiv:2404.16645))\n  * 04/25 - **SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16790)), ([:book:](https://browse.arxiv.org/pdf/2404.16790.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16790.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16790)), ([:house:](https://huggingface.co/papers/2404.16790)), ([HTML](https://browse.arxiv.org/html/2404.16790v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16790)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16790v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16790)), ([SS](https://api.semanticscholar.org/arXiv:2404.16790))\n  * 04/25 - **Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16820)), 
([:book:](https://browse.arxiv.org/pdf/2404.16820.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16820.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16820)), ([:house:](https://huggingface.co/papers/2404.16820)), ([HTML](https://browse.arxiv.org/html/2404.16820v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16820)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16820v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16820)), ([SS](https://api.semanticscholar.org/arXiv:2404.16820))\n  * 04/25 - **PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16994)), ([:book:](https://browse.arxiv.org/pdf/2404.16994.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16994.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16994)), ([:house:](https://huggingface.co/papers/2404.16994)), ([HTML](https://browse.arxiv.org/html/2404.16994v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16994)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16994v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16994)), ([SS](https://api.semanticscholar.org/arXiv:2404.16994))\n  * 04/25 - **Make Your LLM Fully Utilize the Context** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16811)), ([:book:](https://browse.arxiv.org/pdf/2404.16811.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16811.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16811)), ([:house:](https://huggingface.co/papers/2404.16811)), ([HTML](https://browse.arxiv.org/html/2404.16811v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16811)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16811v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16811)), ([SS](https://api.semanticscholar.org/arXiv:2404.16811))\n  * 04/25 - **List Items One by One: A New Data Source and 
Learning Paradigm for Multimodal LLMs** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16375)), ([:book:](https://browse.arxiv.org/pdf/2404.16375.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16375.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16375)), ([:house:](https://huggingface.co/papers/2404.16375)), ([HTML](https://browse.arxiv.org/html/2404.16375v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16375)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16375v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16375)), ([SS](https://api.semanticscholar.org/arXiv:2404.16375))\n  * 04/25 - **Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16710)), ([:book:](https://browse.arxiv.org/pdf/2404.16710.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16710.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16710)), ([:house:](https://huggingface.co/papers/2404.16710)), ([HTML](https://browse.arxiv.org/html/2404.16710v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16710)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16710v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16710)), ([SS](https://api.semanticscholar.org/arXiv:2404.16710))\n  * 04/25 - **Interactive3D: Create What You Want by Interactive 3D Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16510)), ([:book:](https://browse.arxiv.org/pdf/2404.16510.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16510.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16510)), ([:house:](https://huggingface.co/papers/2404.16510)), ([HTML](https://browse.arxiv.org/html/2404.16510v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16510)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16510v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16510)), 
([SS](https://api.semanticscholar.org/arXiv:2404.16510))\n  * 04/25 - **How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16821)), ([:book:](https://browse.arxiv.org/pdf/2404.16821.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16821.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16821)), ([:house:](https://huggingface.co/papers/2404.16821)), ([HTML](https://browse.arxiv.org/html/2404.16821v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16821)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16821v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16821)), ([SS](https://api.semanticscholar.org/arXiv:2404.16821))\n  * 04/25 - **ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16771)), ([:book:](https://browse.arxiv.org/pdf/2404.16771.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16771.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16771)), ([:house:](https://huggingface.co/papers/2404.16771)), ([HTML](https://browse.arxiv.org/html/2404.16771v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16771)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16771v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16771)), ([SS](https://api.semanticscholar.org/arXiv:2404.16771))\n  * 04/24 - **XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.15420)), ([:book:](https://browse.arxiv.org/pdf/2404.15420.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15420.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15420)), ([:house:](https://huggingface.co/papers/2404.15420)), ([HTML](https://browse.arxiv.org/html/2404.15420v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.15420)), 
([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15420v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15420)), ([SS](https://api.semanticscholar.org/arXiv:2404.15420))\n  * 04/24 - **The Ethics of Advanced AI Assistants** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16244)), ([:book:](https://browse.arxiv.org/pdf/2404.16244.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16244.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16244)), ([:house:](https://huggingface.co/papers/2404.16244)), ([HTML](https://browse.arxiv.org/html/2404.16244v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16244)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16244v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16244)), ([SS](https://api.semanticscholar.org/arXiv:2404.16244))\n  * 04/24 - **PuLID: Pure and Lightning ID Customization via Contrastive Alignment** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16022)), ([:book:](https://browse.arxiv.org/pdf/2404.16022.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16022.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16022)), ([:house:](https://huggingface.co/papers/2404.16022)), ([HTML](https://browse.arxiv.org/html/2404.16022v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16022)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16022v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16022)), ([SS](https://api.semanticscholar.org/arXiv:2404.16022))\n  * 04/24 - **NeRF-XL: Scaling NeRFs with Multiple GPUs** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16221)), ([:book:](https://browse.arxiv.org/pdf/2404.16221.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16221.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16221)), ([:house:](https://huggingface.co/papers/2404.16221)), ([HTML](https://browse.arxiv.org/html/2404.16221v1)), 
([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16221)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16221v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16221)), ([SS](https://api.semanticscholar.org/arXiv:2404.16221))\n  * 04/24 - **MotionMaster: Training-free Camera Motion Transfer For Video Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.15789)), ([:book:](https://browse.arxiv.org/pdf/2404.15789.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15789.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15789)), ([:house:](https://huggingface.co/papers/2404.15789)), ([HTML](https://browse.arxiv.org/html/2404.15789v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.15789)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15789v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15789)), ([SS](https://api.semanticscholar.org/arXiv:2404.15789))\n  * 04/24 - **MoDE: CLIP Data Experts via Clustering** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16030)), ([:book:](https://browse.arxiv.org/pdf/2404.16030.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16030.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16030)), ([:house:](https://huggingface.co/papers/2404.16030)), ([HTML](https://browse.arxiv.org/html/2404.16030v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16030)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16030v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16030)), ([SS](https://api.semanticscholar.org/arXiv:2404.16030))\n  * 04/24 - **MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16006)), ([:book:](https://browse.arxiv.org/pdf/2404.16006.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16006.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16006)), 
([:house:](https://huggingface.co/papers/2404.16006)), ([HTML](https://browse.arxiv.org/html/2404.16006v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16006)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16006v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16006)), ([SS](https://api.semanticscholar.org/arXiv:2404.16006))\n  * 04/24 - **MaGGIe: Masked Guided Gradual Human Instance Matting** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16035)), ([:book:](https://browse.arxiv.org/pdf/2404.16035.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16035.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16035)), ([:house:](https://huggingface.co/papers/2404.16035)), ([HTML](https://browse.arxiv.org/html/2404.16035v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16035)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16035v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16035)), ([SS](https://api.semanticscholar.org/arXiv:2404.16035))\n  * 04/24 - **ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.15449)), ([:book:](https://browse.arxiv.org/pdf/2404.15449.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15449.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15449)), ([:house:](https://huggingface.co/papers/2404.15449)), ([HTML](https://browse.arxiv.org/html/2404.15449v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.15449)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15449v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15449)), ([SS](https://api.semanticscholar.org/arXiv:2404.15449))\n  * 04/24 - **Editable Image Elements for Controllable Synthesis** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16029)), ([:book:](https://browse.arxiv.org/pdf/2404.16029.pdf)), 
([:paperclip:](https://arxiv.org/pdf/2404.16029.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16029)), ([:house:](https://huggingface.co/papers/2404.16029)), ([HTML](https://browse.arxiv.org/html/2404.16029v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16029)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16029v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16029)), ([SS](https://api.semanticscholar.org/arXiv:2404.16029))\n  * 04/24 - **CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.15653)), ([:book:](https://browse.arxiv.org/pdf/2404.15653.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15653.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15653)), ([:house:](https://huggingface.co/papers/2404.15653)), ([HTML](https://browse.arxiv.org/html/2404.15653v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.15653)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15653v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15653)), ([SS](https://api.semanticscholar.org/arXiv:2404.15653))\n  * 04/24 - **BASS: Batched Attention-optimized Speculative Sampling** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.15778)), ([:book:](https://browse.arxiv.org/pdf/2404.15778.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15778.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15778)), ([:house:](https://huggingface.co/papers/2404.15778)), ([HTML](https://browse.arxiv.org/html/2404.15778v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.15778)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15778v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15778)), ([SS](https://api.semanticscholar.org/arXiv:2404.15778))\n  * 04/23 - **Transformers Can Represent n-gram Language Models** 
\u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14994)), ([:book:](https://browse.arxiv.org/pdf/2404.14994.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14994.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14994)), ([:house:](https://huggingface.co/papers/2404.14994)), ([HTML](https://browse.arxiv.org/html/2404.14994v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14994)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14994v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14994)), ([SS](https://api.semanticscholar.org/arXiv:2404.14994))\n  * 04/23 - **Pegasus-v1 Technical Report** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14687)), ([:book:](https://browse.arxiv.org/pdf/2404.14687.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14687.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14687)), ([:house:](https://huggingface.co/papers/2404.14687)), ([HTML](https://browse.arxiv.org/html/2404.14687v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14687)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14687v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14687)), ([SS](https://api.semanticscholar.org/arXiv:2404.14687))\n  * 04/23 - **Multi-Head Mixture-of-Experts** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.15045)), ([:book:](https://browse.arxiv.org/pdf/2404.15045.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.15045.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.15045)), ([:house:](https://huggingface.co/papers/2404.15045)), ([HTML](https://browse.arxiv.org/html/2404.15045v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.15045)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.15045v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.15045)), ([SS](https://api.semanticscholar.org/arXiv:2404.15045))\n  * 04/23 - **FlashSpeech: Efficient Zero-Shot Speech Synthesis** 
\u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14700)), ([:book:](https://browse.arxiv.org/pdf/2404.14700.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14700.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14700)), ([:house:](https://huggingface.co/papers/2404.14700)), ([HTML](https://browse.arxiv.org/html/2404.14700v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14700)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14700v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14700)), ([SS](https://api.semanticscholar.org/arXiv:2404.14700))\n  * 04/22 - **SnapKV: LLM Knows What You are Looking for Before Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14469)), ([:book:](https://browse.arxiv.org/pdf/2404.14469.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14469.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14469)), ([:house:](https://huggingface.co/papers/2404.14469)), ([HTML](https://browse.arxiv.org/html/2404.14469v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14469)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14469v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14469)), ([SS](https://api.semanticscholar.org/arXiv:2404.14469))\n  * 04/22 - **SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14396)), ([:book:](https://browse.arxiv.org/pdf/2404.14396.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14396.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14396)), ([:house:](https://huggingface.co/papers/2404.14396)), ([HTML](https://browse.arxiv.org/html/2404.14396v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14396)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14396v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14396)), 
([SS](https://api.semanticscholar.org/arXiv:2404.14396))\n  * 04/22 - **Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14351)), ([:book:](https://browse.arxiv.org/pdf/2404.14351.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14351.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14351)), ([:house:](https://huggingface.co/papers/2404.14351)), ([HTML](https://browse.arxiv.org/html/2404.14351v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14351)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14351v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14351)), ([SS](https://api.semanticscholar.org/arXiv:2404.14351))\n  * 04/22 - **Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14219)), ([:book:](https://browse.arxiv.org/pdf/2404.14219.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14219.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14219)), ([:house:](https://huggingface.co/papers/2404.14219)), ([HTML](https://browse.arxiv.org/html/2404.14219v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14219)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14219v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14219)), ([SS](https://api.semanticscholar.org/arXiv:2404.14219))\n  * 04/22 - **OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14619)), ([:book:](https://browse.arxiv.org/pdf/2404.14619.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14619.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14619)), ([:house:](https://huggingface.co/papers/2404.14619)), ([HTML](https://browse.arxiv.org/html/2404.14619v1)), 
([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14619)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14619v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14619)), ([SS](https://api.semanticscholar.org/arXiv:2404.14619))\n  * 04/22 - **MultiBooth: Towards Generating All Your Concepts in an Image from Text** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14239)), ([:book:](https://browse.arxiv.org/pdf/2404.14239.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14239.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14239)), ([:house:](https://huggingface.co/papers/2404.14239)), ([HTML](https://browse.arxiv.org/html/2404.14239v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14239)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14239v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14239)), ([SS](https://api.semanticscholar.org/arXiv:2404.14239))\n  * 04/22 - **Learning H-Infinity Locomotion Control** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14405)), ([:book:](https://browse.arxiv.org/pdf/2404.14405.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14405.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14405)), ([:house:](https://huggingface.co/papers/2404.14405)), ([HTML](https://browse.arxiv.org/html/2404.14405v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14405)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14405v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14405)), ([SS](https://api.semanticscholar.org/arXiv:2404.14405))\n  * 04/22 - **How Good Are Low-bit Quantized LLaMA3 Models? 
An Empirical Study** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14047)), ([:book:](https://browse.arxiv.org/pdf/2404.14047.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14047.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14047)), ([:house:](https://huggingface.co/papers/2404.14047)), ([HTML](https://browse.arxiv.org/html/2404.14047v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14047)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14047v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14047)), ([SS](https://api.semanticscholar.org/arXiv:2404.14047))\n  * 04/22 - **Align Your Steps: Optimizing Sampling Schedules in Diffusion Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14507)), ([:book:](https://browse.arxiv.org/pdf/2404.14507.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14507.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14507)), ([:house:](https://huggingface.co/papers/2404.14507)), ([HTML](https://browse.arxiv.org/html/2404.14507v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14507)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14507v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14507)), ([SS](https://api.semanticscholar.org/arXiv:2404.14507))\n  * 04/22 - **A Multimodal Automated Interpretability Agent** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.14394)), ([:book:](https://browse.arxiv.org/pdf/2404.14394.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.14394.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.14394)), ([:house:](https://huggingface.co/papers/2404.14394)), ([HTML](https://browse.arxiv.org/html/2404.14394v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.14394)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.14394v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.14394)), ([SS](https://api.semanticscholar.org/arXiv:2404.14394))\n  
* 04/21 - **Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.13686)), ([:book:](https://browse.arxiv.org/pdf/2404.13686.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13686.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13686)), ([:house:](https://huggingface.co/papers/2404.13686)), ([HTML](https://browse.arxiv.org/html/2404.13686v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.13686)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13686v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13686)), ([SS](https://api.semanticscholar.org/arXiv:2404.13686))\n  * 04/21 - **AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.16873)), ([:book:](https://browse.arxiv.org/pdf/2404.16873.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.16873.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.16873)), ([:house:](https://huggingface.co/papers/2404.16873)), ([HTML](https://browse.arxiv.org/html/2404.16873v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.16873)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.16873v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.16873)), ([SS](https://api.semanticscholar.org/arXiv:2404.16873))\n  * 04/20 - **Music Consistency Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.13358)), ([:book:](https://browse.arxiv.org/pdf/2404.13358.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13358.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13358)), ([:house:](https://huggingface.co/papers/2404.13358)), ([HTML](https://browse.arxiv.org/html/2404.13358v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.13358)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13358v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13358)), 
([SS](https://api.semanticscholar.org/arXiv:2404.13358))\n  * 04/19 - **The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.13208)), ([:book:](https://browse.arxiv.org/pdf/2404.13208.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13208.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13208)), ([:house:](https://huggingface.co/papers/2404.13208)), ([HTML](https://browse.arxiv.org/html/2404.13208v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.13208)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13208v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13208)), ([SS](https://api.semanticscholar.org/arXiv:2404.13208))\n  * 04/19 - **TextSquare: Scaling up Text-Centric Visual Instruction Tuning** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12803)), ([:book:](https://browse.arxiv.org/pdf/2404.12803.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12803.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12803)), ([:house:](https://huggingface.co/papers/2404.12803)), ([HTML](https://browse.arxiv.org/html/2404.12803v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12803)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12803v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12803)), ([SS](https://api.semanticscholar.org/arXiv:2404.12803))\n  * 04/19 - **PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.13026)), ([:book:](https://browse.arxiv.org/pdf/2404.13026.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13026.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13026)), ([:house:](https://huggingface.co/papers/2404.13026)), ([HTML](https://browse.arxiv.org/html/2404.13026v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.13026)), 
([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13026v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13026)), ([SS](https://api.semanticscholar.org/arXiv:2404.13026))\n  * 04/19 - **LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12872)), ([:book:](https://browse.arxiv.org/pdf/2404.12872.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12872.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12872)), ([:house:](https://huggingface.co/papers/2404.12872)), ([HTML](https://browse.arxiv.org/html/2404.12872v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12872)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12872v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12872)), ([SS](https://api.semanticscholar.org/arXiv:2404.12872))\n  * 04/19 - **How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12653)), ([:book:](https://browse.arxiv.org/pdf/2404.12653.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12653.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12653)), ([:house:](https://huggingface.co/papers/2404.12653)), ([HTML](https://browse.arxiv.org/html/2404.12653v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12653)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12653v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12653)), ([SS](https://api.semanticscholar.org/arXiv:2404.12653))\n  * 04/19 - **How Far Can We Go with Practical Function-Level Program Repair?** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12833)), ([:book:](https://browse.arxiv.org/pdf/2404.12833.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12833.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12833)), ([:house:](https://huggingface.co/papers/2404.12833)), 
([HTML](https://browse.arxiv.org/html/2404.12833v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12833)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12833v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12833)), ([SS](https://api.semanticscholar.org/arXiv:2404.12833))\n  * 04/19 - **Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.13013)), ([:book:](https://browse.arxiv.org/pdf/2404.13013.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13013.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13013)), ([:house:](https://huggingface.co/papers/2404.13013)), ([HTML](https://browse.arxiv.org/html/2404.13013v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.13013)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13013v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13013)), ([SS](https://api.semanticscholar.org/arXiv:2404.13013))\n  * 04/19 - **Does Gaussian Splatting need SFM Initialization?** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12547)), ([:book:](https://browse.arxiv.org/pdf/2404.12547.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12547.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12547)), ([:house:](https://huggingface.co/papers/2404.12547)), ([HTML](https://browse.arxiv.org/html/2404.12547v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12547)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12547v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12547)), ([SS](https://api.semanticscholar.org/arXiv:2404.12547))\n  * 04/19 - **AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12753)), ([:book:](https://browse.arxiv.org/pdf/2404.12753.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12753.pdf)),  
([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12753)), ([:house:](https://huggingface.co/papers/2404.12753)), ([HTML](https://browse.arxiv.org/html/2404.12753v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12753)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12753v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12753)), ([SS](https://api.semanticscholar.org/arXiv:2404.12753))\n  * 04/18 - **TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.11912)), ([:book:](https://browse.arxiv.org/pdf/2404.11912.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11912.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11912)), ([:house:](https://huggingface.co/papers/2404.11912)), ([HTML](https://browse.arxiv.org/html/2404.11912v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.11912)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11912v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11912)), ([SS](https://api.semanticscholar.org/arXiv:2404.11912))\n  * 04/18 - **Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12253)), ([:book:](https://browse.arxiv.org/pdf/2404.12253.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12253.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12253)), ([:house:](https://huggingface.co/papers/2404.12253)), ([HTML](https://browse.arxiv.org/html/2404.12253v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12253)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12253v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12253)), ([SS](https://api.semanticscholar.org/arXiv:2404.12253))\n  * 04/18 - **Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment** 
\u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12318)), ([:book:](https://browse.arxiv.org/pdf/2404.12318.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12318.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12318)), ([:house:](https://huggingface.co/papers/2404.12318)), ([HTML](https://browse.arxiv.org/html/2404.12318v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12318)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12318v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12318)), ([SS](https://api.semanticscholar.org/arXiv:2404.12318))\n  * 04/18 - **Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12387)), ([:book:](https://browse.arxiv.org/pdf/2404.12387.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12387.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12387)), ([:house:](https://huggingface.co/papers/2404.12387)), ([HTML](https://browse.arxiv.org/html/2404.12387v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12387)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12387v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12387)), ([SS](https://api.semanticscholar.org/arXiv:2404.12387))\n  * 04/18 - **OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12195)), ([:book:](https://browse.arxiv.org/pdf/2404.12195.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12195.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12195)), ([:house:](https://huggingface.co/papers/2404.12195)), ([HTML](https://browse.arxiv.org/html/2404.12195v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12195)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12195v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12195)), 
([SS](https://api.semanticscholar.org/arXiv:2404.12195))\n  * 04/18 - **MeshLRM: Large Reconstruction Model for High-Quality Mesh** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12385)), ([:book:](https://browse.arxiv.org/pdf/2404.12385.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12385.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12385)), ([:house:](https://huggingface.co/papers/2404.12385)), ([HTML](https://browse.arxiv.org/html/2404.12385v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12385)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12385v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12385)), ([SS](https://api.semanticscholar.org/arXiv:2404.12385))\n  * 04/18 - **Introducing v0.5 of the AI Safety Benchmark from MLCommons** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12241)), ([:book:](https://browse.arxiv.org/pdf/2404.12241.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12241.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12241)), ([:house:](https://huggingface.co/papers/2404.12241)), ([HTML](https://browse.arxiv.org/html/2404.12241v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12241)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12241v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12241)), ([SS](https://api.semanticscholar.org/arXiv:2404.12241))\n  * 04/18 - **Introducing Meta Llama 3: The most capable openly available LLM to date** \u003cbr\u003e  ([Blog](https://ai.meta.com/blog/meta-llama-3/))\n  * 04/18 - **EdgeFusion: On-Device Text-to-Image Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.11925)), ([:book:](https://browse.arxiv.org/pdf/2404.11925.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11925.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11925)), ([:house:](https://huggingface.co/papers/2404.11925)), ([HTML](https://browse.arxiv.org/html/2404.11925v1)), 
([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.11925)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11925v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11925)), ([SS](https://api.semanticscholar.org/arXiv:2404.11925))\n  * 04/18 - **BLINK: Multimodal Large Language Models Can See but Not Perceive** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12390)), ([:book:](https://browse.arxiv.org/pdf/2404.12390.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12390.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12390)), ([:house:](https://huggingface.co/papers/2404.12390)), ([HTML](https://browse.arxiv.org/html/2404.12390v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12390)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12390v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12390)), ([SS](https://api.semanticscholar.org/arXiv:2404.12390))\n  * 04/18 - **AniClipart: Clipart Animation with Text-to-Video Priors** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.12347)), ([:book:](https://browse.arxiv.org/pdf/2404.12347.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.12347.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.12347)), ([:house:](https://huggingface.co/papers/2404.12347)), ([HTML](https://browse.arxiv.org/html/2404.12347v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.12347)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.12347v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.12347)), ([SS](https://api.semanticscholar.org/arXiv:2404.12347))\n  * 04/17 - **MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.11565)), ([:book:](https://browse.arxiv.org/pdf/2404.11565.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11565.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11565)), 
([:house:](https://huggingface.co/papers/2404.11565)), ([HTML](https://browse.arxiv.org/html/2404.11565v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.11565)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11565v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11565)), ([SS](https://api.semanticscholar.org/arXiv:2404.11565))\n  * 04/17 - **FlowMind: Automatic Workflow Generation with LLMs** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.13050)), ([:book:](https://browse.arxiv.org/pdf/2404.13050.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13050.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13050)), ([:house:](https://huggingface.co/papers/2404.13050)), ([HTML](https://browse.arxiv.org/html/2404.13050v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.13050)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13050v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13050)), ([SS](https://api.semanticscholar.org/arXiv:2404.13050))\n  * 04/17 - **Dynamic Typography: Bringing Words to Life** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.11614)), ([:book:](https://browse.arxiv.org/pdf/2404.11614.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.11614.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.11614)), ([:house:](https://huggingface.co/papers/2404.11614)), ([HTML](https://browse.arxiv.org/html/2404.11614v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.11614)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.11614v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.11614)), ([SS](https://api.semanticscholar.org/arXiv:2404.11614))\n  * 04/17 - **Stable Diffusion 3 API Now Available** \u003cbr\u003e  ([twitter](https://twitter.com/StabilityAI/status/1780599024707596508)),  ([Blog](https://stability.ai/news/stable-diffusion-3-api?utm_source=twitter\u0026utm_medium=website\u0026utm_campaign=blog)),  
([Demo](https://platform.stability.ai/docs/api-reference#tag/Generate/paths/~1v2beta~1stable-image~1generate~1sd3/post))\n  * 04/16 - **VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.10667)), ([:book:](https://browse.arxiv.org/pdf/2404.10667.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.10667.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.10667)), ([:house:](https://huggingface.co/papers/2404.10667)), ([HTML](https://browse.arxiv.org/html/2404.10667v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.10667)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.10667v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.10667)), ([SS](https://api.semanticscholar.org/arXiv:2404.10667)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/vasa-1-lifelike-audio-driven-talking-faces))\n  * 04/16 - **U.S. Commerce Secretary Gina Raimondo Announces Expansion of U.S. 
AI Safety Institute Leadership Team** \u003cbr\u003e  ([News](https://www.commerce.gov/news/press-releases/2024/04/us-commerce-secretary-gina-raimondo-announces-expansion-us-ai-safety))\n  * 04/16 - **Long-form music generation with latent diffusion** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.10301)), ([:book:](https://browse.arxiv.org/pdf/2404.10301.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.10301.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.10301)), ([:house:](https://huggingface.co/papers/2404.10301)), ([HTML](https://browse.arxiv.org/html/2404.10301v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.10301)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.10301v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.10301)), ([SS](https://api.semanticscholar.org/arXiv:2404.10301))\n  * 04/15 - **LLM Evaluators Recognize and Favor Their Own Generations** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.13076)), ([:book:](https://browse.arxiv.org/pdf/2404.13076.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.13076.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.13076)), ([:house:](https://huggingface.co/papers/2404.13076)), ([HTML](https://browse.arxiv.org/html/2404.13076v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.13076)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.13076v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.13076)), ([SS](https://api.semanticscholar.org/arXiv:2404.13076))\n  * 04/15 - **Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.09833)), ([:book:](https://browse.arxiv.org/pdf/2404.09833.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09833.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09833)), ([:house:](https://huggingface.co/papers/2404.09833)), 
([HTML](https://browse.arxiv.org/html/2404.09833v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.09833)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09833v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09833)), ([SS](https://api.semanticscholar.org/arXiv:2404.09833)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/video2game-real-time-interactive-realistic))\n  * 04/15 - **Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.09956)), ([:book:](https://browse.arxiv.org/pdf/2404.09956.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09956.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09956)), ([:house:](https://huggingface.co/papers/2404.09956)), ([HTML](https://browse.arxiv.org/html/2404.09956v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.09956)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09956v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09956)), ([SS](https://api.semanticscholar.org/arXiv:2404.09956)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/tango-2-aligning-diffusion-based-text-to)), ([:octocat:](https://github.com/declare-lab/tango)![GitHub Repo stars](https://img.shields.io/github/stars/declare-lab/tango?style=social))\n  * 04/15 - **Taming Latent Diffusion Model for Neural Radiance Field Inpainting** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.09995)), ([:book:](https://browse.arxiv.org/pdf/2404.09995.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09995.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09995)), ([:house:](https://huggingface.co/papers/2404.09995)), ([HTML](https://browse.arxiv.org/html/2404.09995v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.09995)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09995v1/)), 
([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09995)), ([SS](https://api.semanticscholar.org/arXiv:2404.09995)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/taming-latent-diffusion-model-for-neural))\n  * 04/15 - **Opus can operate as a Turing machine** \u003cbr\u003e  ([twitter](https://twitter.com/ctjlewis/status/1779740038852690393)), \n  * 04/15 - **MathGPT: Leveraging Llama 2 to create a platform for highly personalized learning** \u003cbr\u003e \n  * 04/15 - **HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.09990)), ([:book:](https://browse.arxiv.org/pdf/2404.09990.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09990.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09990)), ([:house:](https://huggingface.co/papers/2404.09990)), ([HTML](https://browse.arxiv.org/html/2404.09990v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.09990)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09990v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09990)), ([SS](https://api.semanticscholar.org/arXiv:2404.09990)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/hq-edit-a-high-quality-dataset-for))\n  * 04/15 - **Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.09967)), ([:book:](https://browse.arxiv.org/pdf/2404.09967.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09967.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09967)), ([:house:](https://huggingface.co/papers/2404.09967)), ([HTML](https://browse.arxiv.org/html/2404.09967v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.09967)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09967v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09967)), 
([SS](https://api.semanticscholar.org/arXiv:2404.09967)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/ctrl-adapter-an-efficient-and-versatile))\n  * 04/15 - **Compression Represents Intelligence Linearly** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.09937)), ([:book:](https://browse.arxiv.org/pdf/2404.09937.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09937.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09937)), ([:house:](https://huggingface.co/papers/2404.09937)), ([HTML](https://browse.arxiv.org/html/2404.09937v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.09937)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09937v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09937)), ([SS](https://api.semanticscholar.org/arXiv:2404.09937)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/compression-represents-intelligence-linearly))\n  * 04/15 - **CompGS: Efficient 3D Scene Representation via Compressed Gaussian Splatting** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.09458)), ([:book:](https://browse.arxiv.org/pdf/2404.09458.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.09458.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09458)), ([:house:](https://huggingface.co/papers/2404.09458)), ([HTML](https://browse.arxiv.org/html/2404.09458v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.09458)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09458v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09458)), ([SS](https://api.semanticscholar.org/arXiv:2404.09458)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/compgs-efficient-3d-scene-representation-via))\n  * 04/14 - **TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.09204)), ([:book:](https://browse.arxiv.org/pdf/2404.09204.pdf)), 
([:paperclip:](https://arxiv.org/pdf/2404.09204.pdf)),  ([:orange_book:](https://www.arxiv-vanity.com/papers/2404.09204)), ([:house:](https://huggingface.co/papers/2404.09204)), ([HTML](https://browse.arxiv.org/html/2404.09204v1)), ([SL](https://arxiv-sanity-lite.com/?rank=pid\u0026pid=2404.09204)), ([SP](https://www.summarizepaper.com/en/arxiv-id/2404.09204v1/)), ([GS](https://scholar.google.com/scholar_lookup?arxiv_id=2404.09204)), ([SS](https://api.semanticscholar.org/arXiv:2404.09204)), ([:eight_spoked_asterisk:](https://paperswithcode.com/paper/texthawk-exploring-efficient-fine-grained)), ([:octocat:](https://github.com/yuyq96/texthawk)![GitHub Repo stars](https://img.shields.io/github/stars/yuyq96/texthawk?style=social))\n  * 04/13 - **Cathie Wood Muscles Into ChatGPT Boom With New OpenAI Stake** \u003cbr\u003e  ([News](https://finance.yahoo.com/news/cathie-wood-ark-investment-management-232619722.html)),\n  * 04/12 - **Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies** \u003cbr\u003e([:x:](https://arxiv.org/abs/2404.08197)), ([:book:](https://browse.arxiv.org/pdf/2404.08197.pdf)), ([:paperclip:](https://arxiv.org/pdf/2404.08197.pdf)),  ([:orange_book:](https://www.a","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhollobit%2FGenAI_LLM_timeline","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhollobit%2FGenAI_LLM_timeline","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhollobit%2FGenAI_LLM_timeline/lists"}