https://github.com/InfiniteAICreations/awesome-llm-papers

😎 Awesome lists about all kinds of LLM-related papers

awesome-llm-papers



awesome-llm-projects | awesome-llm-datasets




English | 简体中文




## Text / Information / Knowledge
- [Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](https://arxiv.org/abs/2005.11401): This paper introduces a general-purpose fine-tuning recipe for retrieval-augmented generation (RAG): models that combine pre-trained parametric and non-parametric memory for language generation.
- [Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering](https://arxiv.org/abs/2404.17723)
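
The retrieval-augmented pattern described in the first entry above can be illustrated with a small sketch. It is not the method from either paper: the bag-of-words cosine retriever stands in for a learned dense retriever, no language model is called, and the function names (`retrieve`, `build_prompt`) and prompt layout are assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query (non-parametric memory)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages, k=1):
    """Prepend retrieved passages to the question; a generator would consume this."""
    context = "\n".join(retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

In a real system the prompt returned by `build_prompt` would be passed to a generator model (the parametric memory); that step is omitted here.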

## Image
- [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239): This paper shows that diffusion probabilistic models trained with a denoising objective can generate high-quality images; the resulting models are known as denoising diffusion probabilistic models (DDPMs).
- [Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs](https://arxiv.org/pdf/2404.05719.pdf): Apple's paper on multimodal LLMs for mobile UI understanding.
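
For readers new to DDPMs, the forward (noising) half of the process can be sketched in a few lines. This is a minimal illustration, assuming a linear variance schedule from 1e-4 to 0.02 and data represented as a flat list of floats; the helper names are hypothetical, and the learned reverse (denoising) network is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I) in one step."""
    a = alpha_bars[t]
    return [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for x in x0
    ]
```

As `t` grows, `alpha_bars[t]` shrinks toward zero, so `noise_sample` returns something close to pure Gaussian noise; training teaches a network to undo this corruption step by step.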

## Architecture
- [Attention Is All You Need](https://arxiv.org/abs/1706.03762): This paper introduces the Transformer, an architecture based solely on attention mechanisms (including multi-head attention), with no recurrence or convolutions.
- [Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention](https://arxiv.org/abs/2404.07143): This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation.
- [A Primer on the Inner Workings of Transformer-based Language Models](https://arxiv.org/abs/2405.00208): This paper presents a technical introduction to current techniques used to interpret the inner workings of Transformer-based language models.
- [KAN: Kolmogorov-Arnold Networks](https://arxiv.org/abs/2404.19756v2): Inspired by the Kolmogorov-Arnold representation theorem, the paper proposes Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs).
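
The attention mechanism at the core of the Transformer entries above can be sketched as follows. This is a single-head, pure-Python illustration of scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. It assumes small lists of floats as inputs and leaves out the learned projections, masking, and multiple heads that a full model would use.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors.

    Weights are softmax(q . k / sqrt(d_k)) over all keys.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs
```

When every key is identical, the weights are uniform and each output is simply the mean of the value vectors, which makes a convenient sanity check.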

## Survey
- [A Survey of Large Language Models](https://arxiv.org/abs/2303.18223)
- [A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?](https://arxiv.org/abs/2303.11717)
- [A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT](https://arxiv.org/abs/2302.09419)
- [Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers](https://arxiv.org/abs/2404.04925)
- [A Comprehensive Overview of Large Language Models](https://arxiv.org/abs/2307.06435)
- [RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing](https://arxiv.org/abs/2404.19543)

## Operating System
- [AIOS: LLM Agent Operating System](https://arxiv.org/abs/2403.16971)

## Text To SQL
- [Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task](https://arxiv.org/abs/1809.08887)
- [Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning](https://arxiv.org/abs/1709.00103)
- [DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction](https://arxiv.org/abs/2304.11015)
- [KaggleDBQA: Realistic Evaluation of Text-to-SQL Parsers](https://arxiv.org/abs/2106.11455)
- [Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs](https://arxiv.org/pdf/2305.03111.pdf)

## LLMs
- [CodeGemma: Open Code Models Based on Gemma](https://storage.googleapis.com/deepmind-media/gemma/codegemma_report.pdf)
- [Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone](https://arxiv.org/pdf/2404.14219.pdf)
- [Code Llama: Open Foundation Models for Code](https://arxiv.org/pdf/2308.12950.pdf)
- [Advancing Multimodal Medical Capabilities of Gemini](https://arxiv.org/pdf/2405.03162)
- [An Introduction to Vision-Language Modeling](https://arxiv.org/pdf/2405.17247)

## LLM fine-tuning
- [Efficient Training of Language Models to Fill in the Middle](https://arxiv.org/abs/2207.14255)
- [Better & Faster Large Language Models via Multi-token Prediction](https://arxiv.org/pdf/2404.19737)
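
The fill-in-the-middle idea from the first entry above amounts to a data transformation: a document is cut into prefix, middle, and suffix, and the middle is moved to the end so that an ordinary left-to-right model learns to infill. The sketch below illustrates that rearrangement under stated assumptions: the sentinel strings `<PRE>`, `<SUF>`, and `<MID>` are placeholders rather than any tokenizer's actual special tokens, and the function name and character-level split are hypothetical.

```python
def to_fim_example(document, start, end, pre="<PRE>", suf="<SUF>", mid="<MID>"):
    """Rearrange a document into prefix-suffix-middle order.

    document[start:end] is treated as the span to infill. The sentinel
    strings are illustrative placeholders.
    """
    if not 0 <= start <= end <= len(document):
        raise ValueError("start and end must satisfy 0 <= start <= end <= len(document)")
    prefix = document[:start]
    middle = document[start:end]
    suffix = document[end:]
    return f"{pre}{prefix}{suf}{suffix}{mid}{middle}"
```

At inference time, a model trained on such examples would be prompted with everything up to and including the middle sentinel and asked to generate the missing span.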

## Security
- [LLM Agents can Autonomously Exploit One-day Vulnerabilities](https://arxiv.org/abs/2404.08144v2)

## Evaluation
- [SWE-bench: Can Language Models Resolve Real-World GitHub Issues?](https://arxiv.org/pdf/2310.06770): SWE-bench is an evaluation framework comprising 2,294 software engineering problems from GitHub, designed to test language models on complex code-editing tasks requiring extensive contextual understanding and reasoning.
- [GAIA: A Benchmark for General AI Assistants](https://arxiv.org/pdf/2311.12983)

## Agents
- [AutoDev: Automated AI-Driven Development](https://arxiv.org/pdf/2403.08299.pdf)
- [Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration](https://arxiv.org/pdf/2406.01014)
- [Mixture-of-Agents Enhances Large Language Model Capabilities](https://arxiv.org/abs/2406.04692)

## Vision
- [A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets](https://repo-sam.inria.fr/fungraph/hierarchical-3d-gaussians/)
- [Render and Diffuse: Aligning Image and Action Spaces for Diffusion-based Behaviour Cloning](https://arxiv.org/pdf/2405.18196)

## Audio
- [XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model](https://arxiv.org/abs/2406.04904)
- [VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers](https://arxiv.org/abs/2406.05370)