Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Awesome-Resource-Efficient-LLM-Papers
A curated list of high-quality papers on resource-efficient LLMs 🌱
https://github.com/tiingweii-shii/Awesome-Resource-Efficient-LLM-Papers
System Design
Other Systems
Support Infrastructure
- DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale
- Training Large-Vocabulary Neural Language Models by Private Federated Learning for Resource-Constrained Devices
- Federated Fine-Tuning of LLMs on the Very Edge: The Good, the Bad, the Ugly
- Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training
- GPT-NeoX-20B: An Open-Source Autoregressive Language Model
- Large Language Models Empowered Autonomous Edge AI for Connected Intelligence
- EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
- ProFormer: Towards On-Device LSH Projection-Based Transformers
- Generate More Features with Cheap Operations for BERT
- SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
- Lite Transformer with Long-Short Range Attention
Deployment optimization
LLM Inference
Model Compression
- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
- NutePrune: Efficient Progressive Pruning with Numerous Teachers for Large Language Models
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
- Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
- Plug-and-Play: An Efficient Post-training Pruning Method for Large Language Models
- One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models
- Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
- BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation
- SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot
- A Simple and Effective Pruning Approach for Large Language Models
- AccelTran: A Sparsity-Aware Accelerator for Dynamic Inference With Transformers
- LLM-Pruner: On the Structural Pruning of Large Language Models
- LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
- Structured Pruning for Efficient Generative Pre-trained Language Models
- ZipLM: Inference-Aware Structured Pruning of Language Models
- Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
- FlexRound: Learnable Rounding Based on Element-wise Division for Post-Training Quantization
- Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling
- OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
- GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
- Dynamic Stashing Quantization for Efficient Transformer Training
- Quantization-aware and tensor-compressed training of transformers for natural language understanding
- QLoRA: Efficient Finetuning of Quantized LLMs
- Stable and low-precision training for large-scale vision-language models
- PreQuant: A Task-agnostic Quantization Approach for Pre-trained Language Models
- OliVe: Accelerating Large Language Models via Hardware-friendly Outlier-Victim Pair Quantization
- AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
- SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
- SqueezeLLM: Dense-and-Sparse Quantization
- LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
- GACT: Activation Compressed Training for Generic Network Architectures
- Boost Vision Transformer with GPU-Friendly Sparsity and Quantization
- AC-GC: Lossy Activation Compression with Guaranteed Convergence
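For readers new to the quantization work listed above, the snippet below is a minimal, generic sketch of symmetric round-to-nearest INT8 weight quantization in PyTorch. It is illustrative only and does not reproduce any specific method above; approaches such as GPTQ, AWQ, and SpQR add calibration data, outlier handling, and error compensation on top of this basic idea.

```python
import torch

def quantize_int8_symmetric(w: torch.Tensor):
    """Per-tensor symmetric round-to-nearest INT8 quantization.

    Returns the quantized weights and the scale needed to dequantize them.
    """
    scale = w.abs().max() / 127.0                       # map max magnitude to 127
    q = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scale

# Quantize a random weight matrix and check the reconstruction error.
w = torch.randn(4096, 4096)
q, scale = quantize_int8_symmetric(w)
print(f"mean abs error: {(dequantize(q, scale) - w).abs().mean():.6f}")
```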
Dynamic Acceleration
- Learned Token Pruning for Transformers
- Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference
- PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
- Infor-Coef: Information Bottleneck-based Dynamic Token Downsampling for Compact and Efficient language model
- SmartTrim: Adaptive Tokens and Parameters Pruning for Efficient Vision-Language Models
- Transkimmer: Transformer Learns to Layer-wise Skim
- TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference
- Efficient sparse attention architecture with cascade token and head pruning
LLM Architecture Design
Efficient Transformer Architecture
- xFormers - Toolbox to Accelerate Research on Transformers
- FasterTransformer: A Faster Transformer Framework
- An Attention Free Transformer
- Simple linear attention language models balance the recall-throughput tradeoff
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
- LoMA: Lossless Compressed Memory Attention
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
- KDEformer: Accelerating Transformers via Kernel Density Estimation
- Mega: Moving Average Equipped Gated Attention
- Efficient attention: Attention with linear complexities
- Self-attention Does Not Need O(n^2) Memory
- LightSeq: A High Performance Inference Library for Transformers
- Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
- Reformer: The efficient transformer
Non-transformer Architecture
- You Only Cache Once: Decoder-Decoder Architectures for Language Models
- Scalable MatMul-free Language Modeling
- RWKV: Reinventing RNNs for the Transformer Era
- Auto-Regressive Next-Token Predictors are Universal Learners
- Hyena Hierarchy: Towards Larger Convolutional Language models
- Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
- GLaM: Efficient Scaling of Language Models with Mixture-of-Experts
- Mixture-of-Experts with Expert Choice Routing
- Efficient Large Scale Language Modeling with Mixtures of Experts
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
LLM Pre-Training
Memory Efficiency
- FairScale: A general purpose modular PyTorch library for high performance and large scale training
- Mesh-TensorFlow: Deep Learning for Supercomputers
- MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
- PaLM: Scaling Language Modeling with Pathways
- BPipe: Memory-Balanced Pipeline Parallelism for Training Large Language Models
- Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
- GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
- Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
- PipeDream: generalized pipeline parallelism for DNN training
- BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Mixed Precision Training
Data Efficiency
- LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
- A Survey on Efficient Training of Transformers
- Data-Juicer: A One-Stop Data Processing System for Large Language Models
- INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Language Models
- Machine Learning Force Fields with Data Cost Aware Training
- Beyond neural scaling laws: beating power law scaling via data pruning
- Deep Learning on a Data Diet: Finding Important Examples Early in Training
- Training Deep Models Faster with Robust, Approximate Importance Sampling
- Not All Samples Are Created Equal: Deep Learning with Importance Sampling
- MixGen: A New Multi-Modal Data Augmentation
- Augmentation-Aware Self-Supervision for Data-Efficient GAN Training
- Improving End-to-End Speech Processing by Efficient Text Data Utilization with Latent Synthesis
- FaMeSumm: Investigating and Improving Faithfulness of Medical Summarization
- Challenges and Applications of Large Language Models
- Efficient Data Learning for Open Information Extraction with Pre-trained Language Models
- Scaling Language-Image Pre-training via Masking
- Masked Autoencoders Are Scalable Vision Learners
- MASS: Masked Sequence to Sequence Pre-training for Language Generation
Resource-Efficiency Evaluation Metrics & Benchmarks
🧮 Computation Metrics
- End-to-end latency in seconds
- Throughput in tokens/s
- Wall-clock time in minutes or days
- Inference time speed-up (e.g., as reported for [FasterTransformer](https://github.com/NVIDIA/FasterTransformer))
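To make these metrics concrete, here is a minimal sketch of how end-to-end latency and tokens/s can be measured around a single generation call; `generate` is a placeholder for whatever model or inference runtime is being benchmarked.

```python
import time

def measure_generation(generate, prompt: str, max_new_tokens: int = 128):
    """Return (end-to-end latency in seconds, throughput in tokens/s).

    `generate` is a placeholder: any callable that runs inference and
    returns the number of tokens it produced.
    """
    start = time.perf_counter()
    num_tokens = generate(prompt, max_new_tokens)   # hypothetical inference call
    latency = time.perf_counter() - start           # end-to-end latency (s)
    return latency, num_tokens / latency            # tokens per second

# Dummy generator standing in for a real model, just to show the call shape.
latency, tps = measure_generation(lambda p, n: n, "Hello", 128)
print(f"latency: {latency:.4f} s, throughput: {tps:.0f} tokens/s")
```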
⚡️ Energy Metrics
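Energy figures in this category are usually reported in joules or watt-hours consumed during training or inference. As a rough sketch of one common do-it-yourself approach, the code below reads GPU power draw via NVML before and after a workload and multiplies the average by wall-clock time; it assumes an NVIDIA GPU and the `pynvml` package, and is no substitute for the dedicated measurement setups used in the papers.

```python
import time
import pynvml  # NVIDIA Management Library bindings (assumed installed)

def estimate_gpu_energy_joules(run_workload, device_index: int = 0) -> float:
    """Back-of-the-envelope GPU energy estimate for one workload run."""
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)

    watts_before = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # mW -> W
    start = time.perf_counter()
    run_workload()                                                  # step to measure
    elapsed = time.perf_counter() - start
    watts_after = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0

    pynvml.nvmlShutdown()
    return 0.5 * (watts_before + watts_after) * elapsed             # joules = W * s
```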
Benchmarks
- GLUE - evaluates general language understanding across a suite of NLU tasks, reported alongside related benchmarks such as [SQuAD](https://arxiv.org/pdf/1606.05250.pdf) | [A Comprehensive Overview of Large Language Models](https://arxiv.org/pdf/2307.06435.pdf)
- Long Range Arena: A Benchmark for Efficient Transformers
- Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
- MS MARCO - tracks query latency and cost alongside accuracy, facilitating a comprehensive evaluation of IR systems | [Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking](https://arxiv.org/pdf/2212.01340.pdf)
- Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking
- NeurIPS 2020 EfficientQA - evaluates open-domain question answering systems with an emphasis on resource efficiency | [NeurIPS 2020 EfficientQA Competition: Systems, Analyses and Lessons Learned](https://proceedings.mlr.press/v133/min21a/min21a.pdf)
- SustaiNLP 2020 - encourages efficient NLP models by assessing their performance across eight NLU tasks using SuperGLUE metrics and evaluating their energy consumption during inference | [Overview of the SustaiNLP 2020 Shared Task](https://aclanthology.org/2020.sustainlp-1.24.pdf)
- VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
💾 Memory Metrics
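Peak GPU memory is the metric most commonly reported here; a minimal PyTorch sketch of reading it around a single step is shown below (illustrative only, and assumes a CUDA device is available).

```python
import torch

def peak_gpu_memory_gib(run_step) -> float:
    """Return the peak GPU memory (GiB) allocated while `run_step` executes."""
    torch.cuda.reset_peak_memory_stats()
    run_step()                                    # e.g. one forward/backward pass
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / (1024 ** 3)

# Example: a large matmul standing in for a model step.
if torch.cuda.is_available():
    x = torch.randn(4096, 4096, device="cuda")
    print(f"peak memory: {peak_gpu_memory_gib(lambda: x @ x):.2f} GiB")
```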
📨 Network Communication Metric
💡 Other Metrics
LLM Fine-Tuning
Parameter-Efficient Fine-Tuning
- Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively
- BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
- Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
- Unlearning Bias in Language Models by Partitioning Gradients
- SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
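Many of the methods above update only a small fraction of the model's parameters. As one concrete, simplified example in the spirit of BitFit, the sketch below freezes everything except bias terms before fine-tuning; it is a minimal PyTorch illustration, not the authors' exact recipe.

```python
import torch
from torch import nn

def apply_bitfit(model: nn.Module) -> list:
    """Freeze all parameters except bias terms (BitFit-style fine-tuning)."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith("bias")
        if param.requires_grad:
            trainable.append(name)
    return trainable

# Toy example: only the two bias vectors remain trainable.
toy = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
print(apply_bitfit(toy))                     # ['0.bias', '2.bias']
optimizer = torch.optim.AdamW(
    (p for p in toy.parameters() if p.requires_grad), lr=1e-4
)
```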
Full-Parameter Fine-Tuning
- A Comparative Study between Full-Parameter and LoRA-based Fine-Tuning on Chinese Instruction Data for Instruction Following Large Language Model
- Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification
- Full Parameter Fine-tuning for Large Language Models with Limited Resources
- Fine-Tuning Language Models with Just Forward Passes
- PMC-LLaMA: Towards Building Open-source Language Models for Medicine
- Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution
Categories
Sub Categories
- Model Compression (33)
- Data Efficiency (18)
- Efficient Transformer Architecture (17)
- Memory Efficiency (13)
- Non-transformer Architecture (12)
- Support Infrastructure (11)
- Benchmarks (11)
- Dynamic Acceleration (8)
- ⚡️ Energy Metrics (8)
- Full-Parameter Fine-Tuning (6)
- Parameter-Efficient Fine-Tuning (5)
- 🧮 Computation Metrics (5)
- 💾 Memory Metrics (4)
- Deployment optimization (3)
- 📨 Network Communication Metric (2)
- 💡 Other Metrics (2)
- Other Systems (2)