https://github.com/InfiniteAICreations/awesome-llm-papers

😎 Awesome lists about all kinds of LLM related papers

List: awesome-llm-papers

Host: GitHub
URL: https://github.com/InfiniteAICreations/awesome-llm-papers
Owner: InfiniteAICreations
License: cc0-1.0
Created: 2024-03-12T14:39:41.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-04-23T14:25:42.000Z (about 1 year ago)
Last Synced: 2024-04-23T14:40:50.413Z (about 1 year ago)
Topics: awesome, awesome-list, llm, paper, pr-welcome
Homepage:
Size: 9.77 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

ultimate-awesome - awesome-llm-papers - 😎 Awesome lists about all kinds of LLM related papers. (Other Lists / Julia Lists)

README

        


awesome-llm-papers

awesome-llm-projects | awesome-llm-datasets

English | 简体中文

😎 Awesome lists about all kinds of LLM related papers

## Text / Information / Knowledge

- [Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](https://arxiv.org/abs/2005.11401): This paper introduces a general-purpose fine-tuning recipe for retrieval-augmented generation (RAG), a family of models that combine pre-trained parametric memory with non-parametric memory for language generation.

- [Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering](https://arxiv.org/abs/2404.17723)

## Image

- [Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2006.11239): This paper presents high-quality image synthesis results using diffusion probabilistic models, a class of latent variable models now widely known as denoising diffusion probabilistic models (DDPMs).

- [Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs](https://arxiv.org/pdf/2404.05719.pdf): Apple's paper on multimodal LLMs for mobile UI understanding.

## Architecture

- [Attention Is All You Need](https://arxiv.org/abs/1706.03762): This paper introduces the Transformer architecture, which relies on attention mechanisms, in particular multi-head attention, in place of recurrence and convolution.

- [Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention](https://arxiv.org/abs/2404.07143): This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation.

- [A Primer on the Inner Workings of Transformer-based Language Models](https://arxiv.org/abs/2405.00208): This paper presents a technical introduction to current techniques used to interpret the inner workings of Transformer-based language models.

- [KAN: Kolmogorov-Arnold Networks](https://arxiv.org/abs/2404.19756v2): Inspired by the Kolmogorov-Arnold representation theorem, the paper proposes Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs).

## Survey

- [A Survey of Large Language Models](https://arxiv.org/abs/2303.18223)

- [A Complete Survey on Generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 All You Need?](https://arxiv.org/abs/2303.11717)

- [A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT](https://arxiv.org/abs/2302.09419)

- [Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers](https://arxiv.org/abs/2404.04925)

- [A Comprehensive Overview of Large Language Models](https://arxiv.org/abs/2307.06435)

- [RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing](https://arxiv.org/abs/2404.19543)

## Operating System

- [AIOS: LLM Agent Operating System](https://arxiv.org/abs/2403.16971)

## Text To SQL

- [Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task](https://arxiv.org/abs/1809.08887)

- [Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning](https://arxiv.org/abs/1709.00103)

- [DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction](https://arxiv.org/abs/2304.11015)

- [KaggleDBQA: Realistic Evaluation of Text-to-SQL Parsers](https://arxiv.org/abs/2106.11455)

- [Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs](https://arxiv.org/pdf/2305.03111.pdf)

## LLMs

- [CodeGemma: Open Code Models Based on Gemma](https://storage.googleapis.com/deepmind-media/gemma/codegemma_report.pdf)

- [Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone](https://arxiv.org/pdf/2404.14219.pdf)

- [Code Llama: Open Foundation Models for Code](https://arxiv.org/pdf/2308.12950.pdf)

- [Advancing Multimodal Medical Capabilities of Gemini](https://arxiv.org/pdf/2405.03162)

- [An Introduction to Vision-Language Modeling](https://arxiv.org/pdf/2405.17247)

## LLM fine-tuning

- [Efficient Training of Language Models to Fill in the Middle](https://arxiv.org/abs/2207.14255)

- [Better & Faster Large Language Models via Multi-token Prediction](https://arxiv.org/pdf/2404.19737)

## Security

- [LLM Agents can Autonomously Exploit One-day Vulnerabilities](https://arxiv.org/abs/2404.08144v2)

## Evaluation

- [SWE-bench: Can Language Models Resolve Real-World GitHub Issues?](https://arxiv.org/pdf/2310.06770): SWE-bench is an evaluation framework comprising 2,294 software engineering problems from GitHub, designed to test language models on complex code-editing tasks requiring extensive contextual understanding and reasoning.

- [GAIA: A Benchmark for General AI Assistants](https://arxiv.org/pdf/2311.12983)

## Agents

- [AutoDev: Automated AI-Driven Development](https://arxiv.org/pdf/2403.08299.pdf)

- [Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration](https://arxiv.org/pdf/2406.01014)

- [Mixture-of-Agents Enhances Large Language Model Capabilities](https://arxiv.org/abs/2406.04692)

## Vision

- [A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets](https://repo-sam.inria.fr/fungraph/hierarchical-3d-gaussians/)

- [Render and Diffuse: Aligning Image and Action Spaces for Diffusion-based Behaviour Cloning](https://arxiv.org/pdf/2405.18196)

## Audio

- [XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model](https://arxiv.org/abs/2406.04904)

- [VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers](https://arxiv.org/abs/2406.05370)
