Awesome-Code-LLM
[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.
https://github.com/codefuse-ai/Awesome-Code-LLM
News
- codefuse-ai/CodeFuse-CGM
- codefuse-ai/RepoFuse
- codefuse-ai/EasyDeploy
- codefuse-ai/rodimus
- codefuse-ai/CodeFuse-muAgent
- codefuse-ai/CodeFuse-CGE
- codefuse-ai/D2LLM
- codefuse-ai/CodeFuse-MFT-VLM
- codefuse-ai/MFTCoder
- Qwen2.5-Omni Technical Report
- F2LLM - ai/CodeFuse-Embeddings)] [[model & data](https://huggingface.co/collections/codefuse-ai/codefuse-embeddings-68d4b32da791bbba993f8d14)]
- SWE-Compass: Towards Unified Evaluation of Agentic Coding Abilities for Large Language Models
- CudaForge: An Agent Framework with Hardware Feedback for CUDA Kernel Optimization
- CodeClash: Benchmarking Goal-Oriented Software Engineering
- Instella: Fully Open Language Models with Stellar Performance
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
- 2025/12/09
- LLaDA2.0: Scaling Up Diffusion Language Models to 100B
- T5Gemma 2: Seeing, Reading, and Understanding Longer
- Olmo 3
- Scaling Laws for Code: Every Programming Language Matters
- NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents
- SimpleDevQA: Benchmarking Large Language Models on Development Knowledge QA
- C2LLM Technical Report: A New Frontier in Code Retrieval via Adaptive Cross-Attention Pooling
- Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
- MiMo-V2-Flash Technical Report
- K-EXAONE Technical Report
- SWE-RM: Execution-free Feedback For Software Engineering Agents
- Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model
- Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces
- X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests
- CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
- GLM-5: from Vibe Coding to Agentic Engineering
- SWE-Universe: Scale Real-World Verifiable Environments to Millions
- Composer 2 Technical Report
- F2LLM-v2 - state-of-the-art on at least 11 MTEB benchmarks. [[code](https://github.com/codefuse-ai/CodeFuse-Embeddings)] [[model & data](https://huggingface.co/collections/codefuse-ai/f2llm)]
- Beyond Retrieval: A Multitask Benchmark and Model for Code Search
- ML-Embed - ai/CodeFuse-Embeddings)] [[model & data](https://huggingface.co/collections/codefuse-ai/codefuse-embeddings)]
Categories

| Category | Count |
| --- | --- |
| 5. Methods/Models for Downstream Tasks | 1,248 |
| 8. Datasets | 583 |
| 3. When Coding Meets Reasoning | 315 |
| 2. Models | 286 |
| 6. Analysis of AI-Generated Code | 246 |
| 4. Code LLM for Low-Resource, Low-Level, and Domain-Specific Languages | 122 |
| 7. Human-LLM Interaction | 73 |
| News | 62 |
| 9. Recommended Readings | 32 |
| 5. Datasets | 29 |
| 4. Datasets | 20 |
| 1. Surveys | 17 |
| 6. Datasets | 4 |
| Other Awesome LLM Reading Lists | 3 |
| Star History | 2 |
| 7. User-LLM Interaction | 1 |
Sub Categories

| Sub Category | Count |
| --- | --- |
| 8.2 Benchmarks | 613 |
| 3.5 Frontend Navigation | 179 |
| Text-To-SQL | 171 |
| 3.3 Code Agents | 119 |
| Vulnerability Detection | 116 |
| Others | 113 |
| 2.1 Base LLMs and Pretraining Strategies | 98 |
| Code Generation | 92 |
| Code Commenting and Summarization | 83 |
| Test Generation | 79 |
| 2.4 (Instruction) Fine-Tuning on Code | 76 |
| Malicious Code Detection | 75 |
| Program Repair | 75 |
| 3.1 Coding for Reasoning | 66 |
| Security and Vulnerabilities | 59 |
| 3.4 Interactive Coding | 55 |
| 2.3 General Pretraining on Code | 54 |
| Code Review | 49 |
| Code Translation | 47 |
| Frontend Development | 46 |
| 2.5 Reinforcement Learning on Code | 44 |
| Repository-Level Coding | 42 |
| Code Similarity and Embedding (Clone Detection, Code Search) | 38 |
| Correctness | 34 |
| Issue Resolution | 32 |
| 5.2 Benchmarks | 30 |
| Requirement Engineering | 28 |
| Log Analysis | 26 |
| Program Proof | 26 |
| Automated Machine Learning | 25 |
| AI-Generated Code Detection | 25 |
| Compiler Optimization | 24 |
| Code RAG | 23 |
| Code Refactoring and Migration | 23 |
| Binary Analysis and Decompilation | 23 |
| 4.2 Benchmarks | 20 |
| Efficiency | 20 |
| 3.2 Code Simulation | 18 |
| Software Configuration | 17 |
| Code Ranking | 16 |
| Code QA & Reasoning | 16 |
| Robustness | 15 |
| Oracle Generation | 15 |
| 2.2 Existing LLM Adapted to Code | 14 |
| Hallucination | 13 |
| Fuzz Testing | 12 |
| Interpretability | 12 |
| Software Modeling | 10 |
| API Usage | 10 |
| Privacy | 9 |
| Commit Message Generation | 8 |
| Mutation Testing | 7 |
| Bias | 7 |
| 8.1 Pretraining | 6 |
| Type Prediction | 4 |
| 6.2 Benchmarks | 4 |
| Contamination | 3 |