{"id":19691818,"url":"https://github.com/yueyuel/reliablelm4code","last_synced_at":"2026-01-31T13:05:59.040Z","repository":{"id":203894793,"uuid":"691969285","full_name":"yueyueL/ReliableLM4Code","owner":"yueyueL","description":"Collections of research, benchmarks and tools towards more robust and reliable language models for code;  LM4Code; LM4SE; reliable LLM; LLM4Code","archived":false,"fork":false,"pushed_at":"2023-12-14T13:29:12.000Z","size":4074,"stargazers_count":26,"open_issues_count":1,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-06-18T03:02:15.649Z","etag":null,"topics":["code-generation","code-intelligence","language-models","llm4code","lm4se","reliability","software-"],"latest_commit_sha":null,"homepage":"https://yueyuel.github.io/ReliableLM4Code/","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yueyueL.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-09-15T09:19:22.000Z","updated_at":"2025-04-02T02:29:19.000Z","dependencies_parsed_at":null,"dependency_job_id":"ca924e73-e706-4a7c-9fa8-3b632a040b11","html_url":"https://github.com/yueyueL/ReliableLM4Code","commit_stats":{"total_commits":125,"total_committers":3,"mean_commits":"41.666666666666664","dds":0.392,"last_synced_commit":"49b051a1d3e0c15b9d1af2ad75a6bc8eb35d1761"},"previous_names":["yueyuel/reliablelm4code"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/yueyueL/ReliableLM4Code","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yueyueL%2FReliableLM4Code","tags_url":"https://repos.ecosyste.ms/api/v1/
hosts/GitHub/repositories/yueyueL%2FReliableLM4Code/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yueyueL%2FReliableLM4Code/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yueyueL%2FReliableLM4Code/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yueyueL","download_url":"https://codeload.github.com/yueyueL/ReliableLM4Code/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yueyueL%2FReliableLM4Code/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28943938,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-31T13:02:32.153Z","status":"ssl_error","status_checked_at":"2026-01-31T13:00:07.528Z","response_time":128,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["code-generation","code-intelligence","language-models","llm4code","lm4se","reliability","software-"],"created_at":"2024-11-11T19:11:11.744Z","updated_at":"2026-01-31T13:05:58.981Z","avatar_url":"https://github.com/yueyueL.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# [ReliableLM4Code](https://yueyuel.github.io/ReliableLM4Code/)\n\nThis repository extends from our recent work, \"[Pitfalls in Language Models for Code Intelligence: A Taxonomy and 
Survey](https://arxiv.org/abs/2310.17903)\" and \"[Large language models for software engineering: A systematic literature review](https://arxiv.org/abs/2308.10620)\". It includes necessary information for our research and a curated collection of LM4Code papers and other resources (datasets, tutorials, etc.). The focus is primarily on papers that use pre-trained models, especially large language models, to improve the reliability of language models in Software Engineering research.\n\nFor more details, please access this [site](https://yueyuel.github.io/ReliableLM4Code/)\n\n\n\u003e Modern language models (LMs) have been successfully employed in source code generation and understanding, leading to a significant increase in research focused on learning-based code intelligence, such as automated\nbug repair, and test case generation. Despite their great potential, **language models for code intelligence (LM4Code) are susceptible to potential pitfalls, which hinder realistic performance and further impact their reliability and applicability in real-world deployment**. Such challenges drive the need for a comprehensive understanding - not just identifying these issues but delving into their possible implications and existing solutions to build more reliable language models tailored to code intelligence. Based on a well-defined systematic research approach, we conducted an extensive literature review to uncover the pitfalls inherent in LM4Code. Finally, 67 primary studies from top-tier venues have been identified. After carefully examining these studies, we designed a taxonomy of pitfalls in LM4Code research and conducted a systematic study to summarize the issues, implications, current solutions, and challenges of different pitfalls for LM4Code systems. We developed a comprehensive classification scheme that dissects pitfalls across four crucial aspects: data collection and labeling, system design and learning, performance evaluation, and deployment and maintenance. 
Through this study, we aim to provide a roadmap for researchers and practitioners, facilitating their understanding and utilization of LM4Code in reliable and trustworthy ways.\n\nFeel free to send a pull request to add papers and relevant content not listed here. We uploaded our complete paper lists, with detailed review information, to Google Drive.\n\n## Content\n- [About our survey](https://yueyuel.github.io/ReliableLM4Code/docs/reliable_LM4Code_review)\n- [What is LM4Code?](https://yueyuel.github.io/ReliableLM4Code/docs/LM4Code)\n    - [LLMs](https://yueyuel.github.io/ReliableLM4Code/docs/LM4Code/LMmodels/)\n    - [LM4Code Tasks](https://yueyuel.github.io/ReliableLM4Code/docs/LM4Code/SEtasks/)\n    - [Benchmark Datasets](https://yueyuel.github.io/ReliableLM4Code/docs/LM4Code/benchmark/)\n- [Relevant Surveys and Tutorials](https://yueyuel.github.io/ReliableLM4Code/docs/relevant_surveys/)\n- [Explainable LM4Code](https://yueyuel.github.io/ReliableLM4Code/docs/xai_lm4code/)\n- [Top Researchers in LM4Code](https://yueyuel.github.io/ReliableLM4Code/docs/researchers/)\n- [Relevant Venues](https://yueyuel.github.io/ReliableLM4Code/docs/venus/)\n- [LLMs in Security](https://yueyuel.github.io/ReliableLM4Code/docs/LMinsecurity/)\n\n# Papers\n\n## Data Collection and Labeling\n### *Unbalanced Distribution*\n- **Deep Learning Based Vulnerability Detection: Are We There Yet?** (2021), arxiv, S Chakraborty, R Krishna, Y Ding, et al. [[pdf]](https://arxiv.org/pdf/2009.07235)\n- **Does data sampling improve deep learning-based vulnerability detection? Yeas! and Nays!** (2023), ICSE, X Yang, et al. [[pdf]](https://shaoweiwang2010.github.io/papers/ICSE_2023_Sampling_Vulnerablity.pdf)\n- **On the Value of Oversampling for Deep Learning in Software Defect Prediction** (2021), TSE, R Yedida, T Menzies. [[pdf]](https://arxiv.org/pdf/2008.03835)\n- **Robust Learning of Deep Predictive Models from Noisy and Imbalanced Software Engineering Datasets** (2022), ASE, Z Li, et al. 
[[pdf]](https://dl.acm.org/doi/pdf/10.1145/3551349.3556941)\n- **An empirical study of deep learning models for vulnerability detection** (2023), arxiv, B Steenhoek, et al. [[pdf]](https://arxiv.org/pdf/2212.08109)\n\n### *Label Errors*\n- **Robust Learning of Deep Predictive Models from Noisy and Imbalanced Software Engineering Datasets** (2022), ASE, Z Li, et al. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3551349.3556941)\n- **XCode: Towards Cross-Language Code Representation with Large-Scale Pre-Training** (2022), TOSEM, Z Lin, et al. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3506696)\n- **Understanding and Tackling Label Errors in Deep Learning-Based Vulnerability Detection (Experience Paper)** (2023), ISSTA, X Nie, et al. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3597926.3598037)\n\n### *Data Noise*\n- **Slice-Based Code Change Representation Learning** (2023), SANER, F Zhang, et al. [[pdf]](https://chenbihuan.github.io/paper/saner23-zhang-ccs2vec.pdf)\n- **Are we building on the rock? on the importance of data preprocessing for code summarization** (2022), FSE, L Shi, et al. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3540250.3549145)\n- **Neural-Machine-Translation-Based Commit Message Generation: How Far Are We?** (2018), ASE, Z Liu, et al. [[pdf]](https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=5299\u0026context=sis_research)\n\n## System Design and Learning\n### *Data Snooping*\n- **AutoTransform: automated code transformation to support modern code review process** (2022), ICSE, Thongtanunam, Patanamon, Chanathip Pornprasit, and Chakkrit Tantithamthavorn. [[pdf]](https://www.researchgate.net/profile/Patanamon-Thongtanunam-2/publication/358486098_AutoTransform_Automated_Code_Transformation_to_Support_Modern_Code_Review_Process/links/6204767d075f695e892eb1f4/AutoTransform-Automated-Code-Transformation-to-Support-Modern-Code-Review-Process.pdf)\n- **Can Neural Clone Detection Generalize to Unseen Functionalities?** (2021), ASE, C Liu, et al. 
[[pdf]](https://www.microsoft.com/en-us/research/uploads/prod/2022/01/Can-Neural-Clone-Detection-Generalize-to-UnseenFunctionalities.pdf)\n- **CD-VulD: Cross-Domain Vulnerability Discovery Based on Deep Domain Adaptation** (2020), TDSC, S Liu, et al. [[pdf]](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=\u0026arnumber=9054952)\n- **Deep just-in-time defect prediction: how far are we?** (2021), ISSTA, Z Zeng, et al. [[pdf]](https://par.nsf.gov/servlets/purl/10273272)\n- **Patching as translation: the data and the metaphor** (2020), ASE, Y Ding, et al. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3324884.3416587)\n- **An empirical study of deep learning models for vulnerability detection** (2023), ICSE, B Steenhoek, et al. [[pdf]](https://arxiv.org/pdf/2212.08109)\n- **Keeping Pace with Ever-Increasing Data: Towards Continual Learning of Code Intelligence Models** (2023), ICSE, S Gao, et al. [[pdf]](https://arxiv.org/pdf/2302.03482)\n- **Revisiting Learning-based Commit Message Generation** (2023), ICSE, J Dong, Y Lou, D Hao, et al. [[pdf]](https://www.cs.purdue.edu/homes/lintan/publications/commit-icse23.pdf)\n- **Syntax and Domain Aware Model for Unsupervised Program Translation** (2023), ICSE, F Liu, J Li, L Zhang. [[pdf]](https://arxiv.org/pdf/2302.03908)\n- **How Effective Are Neural Networks for Fixing Security Vulnerabilities** (2023), ISSTA, Y Wu, N Jiang, HV Pham, et al. [[pdf]](https://arxiv.org/pdf/2305.18607)\n- **Towards More Realistic Evaluation for Neural Test Oracle Generation** (2023), ISSTA, Z Liu, K Liu, X Xia, et al. [[pdf]](https://arxiv.org/pdf/2305.17047)\n- **On the Evaluation of Neural Code Summarization** (2022), ICSE, E Shi, Y Wang, L Du, et al. [[pdf]](https://arxiv.org/pdf/2107.07112)\n\n### *Spurious Correlations*\n- **Deep Learning Based Vulnerability Detection: Are We There Yet?** (2021), TSE, S Chakraborty, R Krishna, Y Ding, et al. 
[[pdf]](https://arxiv.org/pdf/2009.07235)\n- **Diet code is healthy: simplifying programs for pre-trained models of code** (2022), FSE, Z Zhang, H Zhang, B Shen, et al. [[pdf]](https://arxiv.org/pdf/2206.14390)\n- **Explaining mispredictions of machine learning models using rule induction** (2021), FSE, J Cito, I Dillig, S Kim, et al. [[pdf]](https://www.cs.utexas.edu/~isil/md.pdf)\n- **Interpreting Deep Learning-based Vulnerability Detector Predictions Based on Heuristic Searching** (2021), TOSEM, D Zou, Y Zhu, S Xu, et al. [[pdf]](https://par.nsf.gov/servlets/purl/10281044)\n- **Thinking Like a Developer? Comparing the Attention of Humans with Neural Models of Code** (2021), ASE, M Paltenghi, M Pradel. [[pdf]](https://www.software-lab.org/publications/ase2021.pdf)\n- **Vulnerability detection with fine-grained interpretations** (2021), FSE, Y Li, S Wang, TN Nguyen. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3468264.3468597)\n- **What do they capture? a structural analysis of pre-trained language models for source code** (2022), ICSE, Y Wan, W Zhao, H Zhang, et al. [[pdf]](https://arxiv.org/pdf/2202.06840)\n- **An empirical study of deep learning models for vulnerability detection** (2023), ICSE, B Steenhoek, MM Rahman, R Jiles, et al. [[pdf]](https://arxiv.org/pdf/2212.08109)\n- **Towards Efficient Fine-Tuning of Pre-trained Code Models: An Experimental Study and Beyond** (2023), ISSTA, E Shi, Y Wang, H Zhang, et al. [[pdf]](https://arxiv.org/pdf/2304.05216)\n\n### *Inappropriate Model Design*\n- **Deep Learning Based Vulnerability Detection: Are We There Yet?** (2021), TSE, S Chakraborty, R Krishna, Y Ding, et al. [[pdf]](https://arxiv.org/pdf/2009.07235)\n- **Enhancing DNN-Based Binary Code Function Search With Low-Cost Equivalence Checking** (2022), TSE, H Wang, P Ma, Y Yuan, et al. 
[[pdf]](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=\u0026arnumber=9707874)\n- **Improving automatic source code summarization via deep reinforcement learning** (2018), ASE, Y Wan, Z Zhao, M Yang, et al. [[pdf]](https://arxiv.org/pdf/1811.07234)\n- **Patching as translation: the data and the metaphor** (2020), ASE, Y Ding, B Ray, P Devanbu, et al. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3324884.3416587)\n- **Reinforcement-Learning-Guided Source Code Summarization Using Hierarchical Attention** (2020), TSE, W Wang, Y Zhang, Y Sui, et al. [[pdf]](https://opus.lib.uts.edu.au/bitstream/10453/139555/3/Binder1.pdf)\n- **XCode: Towards Cross-Language Code Representation with Large-Scale Pre-Training** (2022), TOSEM, Z Lin, G Li, J Zhang, et al. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3506696)\n- **RepresentThemAll: A Universal Learning Representation of Bug Reports** (2023), ICSE, S Fang, T Zhang, Y Tan, et al. [[pdf]](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=\u0026arnumber=10172597)\n- **Template-based Neural Program Repair** (2023), ICSE, X Meng, X Wang, H Zhang, et al. [[pdf]](https://github.com/mxx1219/TENURE/blob/main/paper.pdf)\n\n## Performance Evaluation\n### *Inappropriate Baseline*\n- **Towards More Realistic Evaluation for Neural Test Oracle Generation** (2023), ARXIV, Z Liu, K Liu, X Xia, et al. [[pdf]](https://arxiv.org/pdf/2305.17047)\n\n### *Inappropriate Evaluation Dataset*\n- **Deep Learning Based Program Generation From Requirements Text: Are We There Yet?** (2020), TSE, H Liu, M Shen, J Zhu, et al. [[pdf]](https://liuhuigmail.github.io/publishedPappers/CodeGeneration.pdf)\n- **Generating realistic vulnerabilities via neural code editing: an empirical study** (2022), FSE, Y Nong, Y Ou, M Pradel, et al. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3540250.3549128)\n\n### *Low Reproducibility*\n- **An extensive study on pre-trained models for program understanding and generation** (2022), ISSTA, Z Zeng, H Tan, H Zhang, et al. 
[[pdf]](https://lingming.cs.illinois.edu/publications/issta2022.pdf)\n\n### *Inappropriate Performance Measures*\n- **Deep Learning Based Vulnerability Detection: Are We There Yet?** (2021), TSE, S Chakraborty, R Krishna, Y Ding, et al. [[pdf]](https://arxiv.org/pdf/2009.07235)\n- **Improving automatic source code summarization via deep reinforcement learning** (2018), ASE, Y Wan, Z Zhao, M Yang, et al. [[pdf]](https://arxiv.org/pdf/1811.07234)\n- **Multi-task learning based pre-trained language model for code completion** (2020), ASE, F Liu, G Li, Y Zhao, et al. [[pdf]](https://arxiv.org/pdf/2012.14631)\n- **On the Value of Oversampling for Deep Learning in Software Defect Prediction** (2021), TSE, R Yedida, T Menzies. [[pdf]](https://arxiv.org/pdf/2008.03835)\n- **Patching as translation: the data and the metaphor** (2020), ASE, Y Ding, B Ray, P Devanbu, et al. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3324884.3416587)\n- **Reinforcement-Learning-Guided Source Code Summarization Using Hierarchical Attention** (2020), TSE, W Wang, Y Zhang, Y Sui, et al. [[pdf]](https://opus.lib.uts.edu.au/bitstream/10453/139555/3/Binder1.pdf)\n- **SynShine: Improved Fixing of Syntax Errors** (2022), TSE, Ahmed T, Ledesma N R, Devanbu P. [[pdf]](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=\u0026arnumber=9913705)\n- **An empirical study of deep learning models for vulnerability detection** (2023), ICSE, B Steenhoek, MM Rahman, R Jiles, et al. [[pdf]](https://arxiv.org/pdf/2212.08109)\n- **Revisiting Learning-based Commit Message Generation** (2023), ICSE, J Dong, Y Lou, D Hao, et al. [[pdf]](https://www.cs.purdue.edu/homes/lintan/publications/commit-icse23.pdf)\n- **Tare: Type-Aware Neural Program Repair** (2023), ICSE, Q Zhu, Z Sun, W Zhang, et al. [[pdf]](https://xiongyingfei.github.io/papers/ICSE23a.pdf)\n- **How Effective Are Neural Networks for Fixing Security Vulnerabilities** (2023), ISSTA, Y Wu, N Jiang, HV Pham, et al. 
[[pdf]](https://arxiv.org/pdf/2305.18607)\n- **Towards More Realistic Evaluation for Neural Test Oracle Generation** (2023), ISSTA, Z Liu, K Liu, X Xia, et al. [[pdf]](https://arxiv.org/pdf/2305.17047)\n- **GitHub Copilot AI pair programmer: Asset or Liability?** (2023), JSS, AM Dakhel, V Majdinasab, A Nikanjam, et al. [[pdf]](https://arxiv.org/pdf/2206.15331)\n\n## Deployment and Maintenance\n### *Real-World Constraints*\n- **Examining Zero-Shot Vulnerability Repair with Large Language Models** (2023), S\u0026P, H Pearce, B Tan, B Ahmad, et al. [[pdf]](https://arxiv.org/pdf/2112.02125)\n- **A Performance-Sensitive Malware Detection System Using Deep Learning on Mobile Devices** (2020), TIFS, R Feng, S Chen, X Xie, et al. [[pdf]](https://arxiv.org/pdf/2005.04970)\n- **Diet code is healthy: simplifying programs for pre-trained models of code** (2022), FSE, Z Zhang, H Zhang, B Shen, et al. [[pdf]](https://arxiv.org/pdf/2206.14390)\n- **When Code Completion Fails: A Case Study on Real-World Completions** (2019), ICSE, VJ Hellendoorn, S Proksch, HC Gall, et al. [[pdf]](http://www.sback.it/publications/icse2019b.pdf)\n- **Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants** (2023), arxiv, G Sandoval, H Pearce, T Nys, et al. [[pdf]](https://www.usenix.org/system/files/sec23fall-prepub-353-sandoval.pdf)\n- **Grounded Copilot: How Programmers Interact with Code-Generating Models** (2023), OOPSLA1, S Barke, MB James, N Polikarpova. 
[[pdf]](https://dl.acm.org/doi/pdf/10.1145/3586030)\n- **LLaMA-Reviewer: Advancing Code Review Automation with Large Language Models through Parameter-Efficient Fine-Tuning** (2023), arxiv, J Lu, L Yu, X Li, et al. [[pdf]](https://arxiv.org/pdf/2308.11148)\n- **Compressing Pre-trained Models of Code into 3 MB** (2022), ASE, J Shi, Z Yang, B Xu, et al. [[pdf]](https://dl.acm.org/doi/pdf/10.1145/3551349.3556964)\n\n### *Attack Threats*\n- **You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion** (2021), USENIX Security, R Schuster, C Song, E Tromer, et al. [[pdf]](https://www.usenix.org/system/files/sec21-schuster.pdf)\n- **Adversarial Robustness of Deep Code Comment Generation** (2022), TOSEM, Y Zhou, X Zhang, J Shen, et al. [[pdf]](https://arxiv.org/pdf/2108.00213)\n- **An extensive study on pre-trained models for program understanding and generation** (2022), ISSTA, Z Zeng, H Tan, H Zhang, et al. [[pdf]](https://lingming.cs.illinois.edu/publications/issta2022.pdf)\n- **Generating Adversarial Examples for Holding Robustness of Source Code Processing Models** (2020), AAAI, H Zhang, Z Li, G Li, et al. [[pdf]](https://ojs.aaai.org/index.php/AAAI/article/view/5469/5325)\n- **Semantic Robustness of Models of Source Code** (2020), SANER, G Ramakrishnan, J Henkel, Z Wang, et al. [[pdf]](https://arxiv.org/pdf/2002.03043)\n- **You see what I want you to see: poisoning vulnerabilities in neural code search** (2022), FSE, Y Wan, S Zhang, H Zhang, et al. [[pdf]](https://opus.lib.uts.edu.au/bitstream/10453/164890/2/fse22_code_attack_camera%20%281%29.pdf)\n- **Contrabert: Enhancing code pre-trained models via contrastive learning** (2023), ICSE, S Liu, B Wu, X Xie, et al. [[pdf]](https://arxiv.org/pdf/2301.09072)\n- **On the robustness of code generation techniques: An empirical study on github copilot** (2023), ICSE, A Mastropaolo, L Pascarella, E Guglielmi, et al. 
[[pdf]](https://arxiv.org/pdf/2302.00438)\n- **Two sides of the same coin: Exploiting the impact of identifiers in neural code comprehension** (2023), ICSE, S Gao, C Gao, C Wang, et al. [[pdf]](https://yuyue.github.io/res/paper/NeuralCode-ICSE2023.pdf)\n- **Multi-target Backdoor Attacks for Code Pre-trained Models** (2023), ACL, Y Li, S Liu, K Chen, et al. [[pdf]](https://arxiv.org/pdf/2306.08350)\n- **Backdooring Neural Code Search** (2023), ACL, W Sun, Y Chen, G Tao, et al. [[pdf]](https://arxiv.org/pdf/2305.17506)\n- **ReCode: Robustness Evaluation of Code Generation Models** (2022), ACL, S Wang, Z Li, H Qian, et al. [[pdf]](https://arxiv.org/pdf/2212.10264)\n- **Natural Attack for Pre-trained Models of Code** (2022), ICSE, Z Yang, J Shi, J He, et al. [[pdf]](https://arxiv.org/pdf/2201.08698)\n- **Coprotector: Protect open-source code against unauthorized training usage with data poisoning** (2022), WWW, Z Sun, X Du, F Song, et al. [[pdf]](https://arxiv.org/pdf/2110.12925)\n- **On the Security Vulnerabilities of Text-to-SQL Models** (2023), ISSRE, X Peng, Y Zhang, J Yang, et al. [[pdf]](https://arxiv.org/pdf/2211.15363)\n\n### *Security Concerns in Generated Code*\n- **Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions** (2022), S\u0026P, H Pearce, B Ahmad, B Tan, et al. [[pdf]](https://arxiv.org/pdf/2108.09293.pdf?trk=article-ssr-frontend-pulse_x-social-details_comments-action_comment-text)\n- **Automated repair of programs from large language models** (2023), ICSE, Z Fan, X Gao, M Mirchev, et al. [[pdf]](https://abhikrc.com/pdf/ICSE23.pdf)\n- **Cctest: Testing and repairing code completion systems** (2023), ICSE, Z Li, C Wang, Z Liu, et al. [[pdf]](https://arxiv.org/pdf/2208.08289)\n- **Analyzing Leakage of Personally Identifiable Information in Language Models** (2023), S\u0026P, N Lukas, A Salem, R Sim, et al. 
[[pdf]](https://arxiv.org/pdf/2302.00539)\n- **CodexLeaks: Privacy Leaks from Code Generation Language Models in GitHub Copilot** (2023), USENIX Security, L Niu, S Mirza, Z Maradni, et al. [[pdf]](https://www.usenix.org/system/files/usenixsecurity23-niu.pdf)\n\n\n# Language Models for Code Intelligence\n## Decoder-only Models\n\n### GPT-1\n- Release Date: 2018-06\n- Institute: OpenAI\n- Paper: [Improving Language Understanding by Generative Pre-Training](https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf)\n\n### GPT-2\n- Release Date: 2019-02\n- Institute: OpenAI\n- Paper: [Language Models are Unsupervised Multitask Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)\n\n### GPT-3\n- Release Date: 2020-05\n- Institute: OpenAI\n- Paper: [Language models are few-shot learners](https://papers.nips.cc/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf)\n\n### Codex\n- Release Date: 2021-08\n- Institute: OpenAI\n- Paper: [Evaluating Large Language Models Trained on Code](https://arxiv.org/pdf/2107.03374.pdf)\n\n### GPT-NeoX\n- Release Date: 2022-04\n- Access: [ckpt](https://github.com/EleutherAI/gpt-neox)\n- Paper: [GPT-NeoX-20B: An Open-Source Autoregressive Language Model](https://arxiv.org/pdf/2204.06745.pdf)\n\n### GPT-Neo\n- Release Date: 2021-03\n- Source: [Github](https://github.com/EleutherAI/gpt-neo)\n\n### CodeGen\n- Release Date: 2022/03\n- Paper: [CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis](https://arxiv.org/abs/2203.13474)\n\n### InstructGPT\n- Release Date: 2022/01\n- Paper: [Training language models to follow instructions with human feedback](http://arxiv.org/abs/2203.02155v1)\n\n### CodeGeeX\n- Title: CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X\n- Year: 2023\n- Paper: [Link](https://arxiv.org/abs/2303.17568)\n\n### GPT-J\n- Release Date: 2021/06\n- Access: 
[GPT-J-6B](https://github.com/kingoflolz/mesh-transformer-jax/#gpt-j-6b), [GPT4All-J](https://github.com/nomic-ai/gpt4all#raw-model)\n- Paper: [GPT-J-6B: 6B JAX-Based Transformer](https://arankomatsuzaki.wordpress.com/2021/06/04/gpt-j/)\n\n### LLaMA\n- Release Date: 2023-02\n- Institute: Meta\n- Paper: [LLaMA: Open and Efficient Foundation Language Models](https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/)\n\n### ChatGPT\n- Release Date: 2022-11\n- Access: [demo](https://openai.com/blog/chatgpt/), [api](https://share.hsforms.com/1u4goaXwDRKC9-x9IvKno0A4sk30)\n- Origin: [Blog](https://openai.com/blog/chatgpt/)\n\n### StableLM-Alpha\n- Release Date: 2023/04\n- Access: [StableLM-Alpha](https://github.com/Stability-AI/StableLM#stablelm-alpha)\n- Paper: [Stability AI Launches the First of its StableLM Suite of Language Models](https://stability.ai/blog/stability-ai-launches-the-first-of-its-stablelm-suite-of-language-models)\n\n\n### InCoder\n- Title: \"InCoder: A Generative Model for Code Infilling and Synthesis\"\n- Authors: Daniel Fried et al.\n- Release Date: 2023\n- Paper: [Link](http://arxiv.org/abs/2204.05999)\n\n### GPT-4\n- Release Date: 2023-03\n- Institute: OpenAI\n- Paper: [GPT-4 Technical Report](https://openai.com/research/gpt-4)\n\n### WizardCoder\n- Access: [WizardCoder](https://github.com/nlpxucan/WizardLM)\n- Release Date: 2023\n- Paper: [WizardCoder: Empowering Code Large Language Models with Evol-Instruct](https://arxiv.org/abs/2306.08568)\n\n### PanGu-Coder\n- Part of: PanGu-α\n- Release Date: 2020\n- Paper: [\"PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation\"](https://arxiv.org/abs/2010.11934)\n\n### OPT\n- Release Date: 2022-05\n- Access: [api](https://opt.alpa.ai), [ckpt](https://github.com/facebookresearch/metaseq/tree/main/projects/OPT)\n- Paper: [OPT: Open Pre-trained Transformer Language Models](https://arxiv.org/pdf/2205.01068.pdf)\n\n### 
StarCoder\n- Release Date: 2023/05\n- Access: [starcoder](https://huggingface.co/bigcode/starcoder)\n- Papers: [StarCoder: A State-of-the-Art LLM for Code](https://huggingface.co/blog/starcoder), [StarCoder: May the source be with you!](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view)\n\n### SantaCoder\n- Release Date: 2023/01\n- Access: [santacoder](https://huggingface.co/bigcode/santacoder)\n- Paper: [SantaCoder: don't reach for the stars!](https://arxiv.org/abs/2301.03988)\n\n### PaLM\n- Release Date: 2022-04\n- Institute: Google\n- Paper: [PaLM: Scaling Language Modeling with Pathways](https://arxiv.org/pdf/2204.02311.pdf)\n\n### Vicuna\n- Release Date: 2023/03\n- Blog: [Link](https://lmsys.org/blog/2023-03-30-vicuna/)\n\n### Flan-UL2\n- Release Date: 2023-03\n- Institute: Google\n- Blog: [Flan-UL2 Blog](https://www.yitay.net/blog/flan-ul2-20b)\n\n### CPM-Bee\n- Release Date: 2022-10\n- Institute: Baidu\n- Paper: [CPM: A Large-scale Generative Chinese Pre-trained Language Model](https://arxiv.org/pdf/2012.00413.pdf)\n\n### MT-NLG\n- Release Date: 2022-01\n- Institute: Microsoft\n- Paper: [Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model](https://arxiv.org/pdf/2201.11990.pdf)\n\n### GLM\n- Release Date: 2022-10\n- Institute: Tsinghua University\n- Paper: [GLM-130B: AN OPEN BILINGUAL PRE-TRAINED MODEL](https://arxiv.org/pdf/2210.02414.pdf)\n\n### YaLM\n- Release Date: 2022-06\n- Institute: Yandex\n- Blog: [YaLM Blog](https://medium.yandex/yandex-publishes-yalm-100b-its-the-largest-gpt-like-neural-network-in-open-source-d1df53d0e9a6)\n\n### Alpaca\n- Release Date: 2023-03\n- Institute: Stanford University\n- Access: [Alpaca GitHub](https://github.com/tatsu-lab/stanford_alpaca)\n\n### RWKV-4\n- Release Date: 2022-09\n- Institute: Independent (BlinkDL)\n- Access: [RWKV-4 GitHub](https://github.com/BlinkDL/RWKV-LM)\n\n### Sparrow\n- Release Date: 2022-09\n- Institute: DeepMind\n- Paper: 
[Improving alignment of dialogue agents via targeted human judgements](https://arxiv.org/pdf/2209.14375.pdf)\n\n### Falcon\n- Release Date: 2023-05\n- Institute: Technology Innovation Institute (TII)\n- Access: [Falcon Homepage](https://falconllm.tii.ae)\n\n### Code Llama\n- Release Date: 2023\n- Institute: Meta (Facebook)\n- Paper: [Code Llama: Open Foundation Models for Code](https://ai.meta.com/research/publications/code-llama-open-foundation-models-for-code/)\n\n### RedPajama-INCITE\n- Release Date: Not specified\n- Blog: [RedPajama-INCITE Blog](https://www.together.xyz/blog/redpajama-models-v1)\n\n### DeciCoder-1B\n- Release Date: 2023-08\n- Institute: Deci AI\n- Blog: [DeciCoder Blog](https://deci.ai/blog/decicoder-efficient-and-accurate-code-generation-llm/)\n\n### OpenLLaMA\n- Release Date: 2023-05\n- Institute: Not specified\n- Access: [OpenLLaMA Access](https://huggingface.co/Salesforce/codegen25-7b-multi/blob/main/README.md)\n\n\n### CodeGPT\n- Release Date: 2021\n- Paper: [CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation](https://arxiv.org/pdf/2102.04664.pdf)\n\n\n## Encoder-only Models\n### BERT\n- Release Date: 2018-10\n- Institute: Google\n- Paper: [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://aclanthology.org/N19-1423.pdf)\n\n### ALBERT\n- Release Date: 2019\n- Paper: [ALBERT: A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942)\n\n### RoBERTa\n- Release Date: 2019\n- Paper: [RoBERTa: A Robustly Optimized BERT Pretraining Approach](https://arxiv.org/abs/1907.11692)\n\n### CodeBERT\n- Release Date: 2020-04\n- Institute: Microsoft\n- Paper: [CodeBERT: A Pre-Trained Model for Programming and Natural Languages](https://arxiv.org/abs/2002.08155)\n\n### GraphCodeBERT\n- Release Date: 2022/03\n- Access: [GraphCodeBERT](https://huggingface.co/microsoft/graphcodebert-base)\n- Paper: [GraphCodeBERT: Pre-training Code 
Representations with Data Flow\n](https://arxiv.org/abs/2009.08366)\n\n## Encoder-decoder Models\n### AlphaCode\n- Release Date: 2022/02\n- Access: [AlphaCode](https://alphacode.deepmind.com/)\n- Institute: DeepMind\n\n### T5\n- Release Date: 2019\n- Paper: [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683)\n- Checkpoint: [Link](https://huggingface.co/t5-11b)\n\n### CodeT5\n- Release Date: 2021\n- Access: [CodeT5](https://huggingface.co/salesforce/codet5-small)\n- Paper: [CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation](https://arxiv.org/abs/2109.00859)\n\n\n### CodeT5+\n- Release Date: 2023/05\n- Access: [CodeT5+](https://github.com/salesforce/CodeT5/tree/main/CodeT5+)\n- Paper: [CodeT5+: Open Code Large Language Models for Code Understanding and Generation](https://arxiv.org/abs/2305.07922)\n\n\n### UnixCoder\n- Release Date: 2022\n- Access: [UniXcoder on Hugging Face](https://huggingface.co/microsoft/unixcoder-base)\n- Paper: [UniXcoder: Unified Cross-Modal Pre-training for Code Representation\n](https://arxiv.org/abs/2203.03850)\n\n### PLBART\n- Release Date: 2021\n- Paper: [Unified Pre-training for Program Understanding and Generation\n](https://arxiv.org/abs/2103.06333)\n\n\n### CodeReviewer\n- Release Date: 2022\n- Access: [CodeReviewer](https://huggingface.co/microsoft/codereviewer)\n- Paper: [Automating Code Review Activities by Large-Scale Pre-training](https://arxiv.org/abs/2203.09095)\n\n\n## Relevant Surveys on LM4Code\n- Large Language Models for Software Engineering: Survey and Open Problems, 2023, [paper](https://arxiv.org/pdf/2310.03533)\n- Large Language Models for Software Engineering: A Systematic Literature Review, 2023, [paper](https://arxiv.org/abs/2308.10620)\n- A Survey of Large Language Models for Code: Evolution, Benchmarking, and Future Trends, 2023, [paper](https://arxiv.org/pdf/2311.10372.pdf)\n- Unifying the 
Perspectives of NLP and Software Engineering: A Survey on Language Models for Code, 2023, [paper](https://arxiv.org/abs/2311.07989)\n- Software testing with large language models: Survey, landscape, and vision, 2023, [paper](https://arxiv.org/pdf/2307.07221)\n- Pitfalls in Language Models for Code Intelligence: A Taxonomy and Survey, 2023, [paper](https://arxiv.org/pdf/2310.17903)\n- Generative Artificial Intelligence for Software Engineering--A Research Agenda, 2023, [paper](https://arxiv.org/pdf/2310.18648)\n- A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly, 2023, [paper](https://arxiv.org/abs/2312.02003)\n- Trustworthy and Synergistic Artificial Intelligence for Software Engineering: Vision and Roadmaps, 2023, [paper](https://arxiv.org/pdf/2309.04142)\n- Large language models meet NL2Code: A survey, 2023, [paper](https://aclanthology.org/2023.acl-long.411.pdf)\n- A Survey on Pretrained Language Models for Neural Code Intelligence, 2022, [paper](https://arxiv.org/abs/2212.10079)\n\n## General Surveys on AI4SE\n- A systematic literature review on the use of deep learning in software engineering research, TOSEM 2022, [paper](https://dl.acm.org/doi/pdf/10.1145/3485275)\n- A survey on deep learning for software engineering, CSUR 2022, [paper](https://dl.acm.org/doi/abs/10.1145/3505243)\n- Software engineering for AI-based systems: a survey, TOSEM 2021, [paper](https://dl.acm.org/doi/abs/10.1145/3487043)\n- Machine/deep learning for software engineering: A systematic literature review, TSE 2022, [paper](https://ieeexplore.ieee.org/abstract/document/9772253/)\n- Machine Learning Applied to Software Testing: A Systematic Mapping Study, 2019, [paper](https://ieeexplore.ieee.org/abstract/document/8638573/)\n- A survey of machine learning for big code and naturalness, CSUR 2018, [paper](https://dl.acm.org/doi/abs/10.1145/3212695)\n\n## General Surveys on LLM\n- Large Language Models: A Comprehensive Survey of Applications, 
Challenges, Limitations, and Future Prospects, 2023, [paper](https://d197for5662m48.cloudfront.net/documents/publicationstatus/181139/preprint_pdf/edf41a1f2a93aadb235a3c3aff2dcf08.pdf)\n- A survey of large language models, 2023, [paper](https://arxiv.org/pdf/2303.18223.pdf)\n- A Survey on Evaluation of Large Language Models, 2023, [paper](https://arxiv.org/pdf/2307.03109.pdf)\n- Recent advances in natural language processing via large pre-trained language models: A survey, CSUR 2023, [paper](https://arxiv.org/pdf/2111.01243)\n- A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4, 2023, [paper](https://arxiv.org/pdf/2310.12321.pdf)\n- Challenges and Applications of Large Language Models: A Survey, 2023, [paper](https://arxiv.org/pdf/2307.10169.pdf)\n- Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond, 2023, [paper](https://arxiv.org/pdf/2304.13712.pdf)\n- A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT, 2023, [paper](https://arxiv.org/pdf/2303.04226.pdf)\n\n\n## Repositories and Resources for LM4Code\n- LLM4SE: Large Language Models for Software Engineering\n    - [Repository](https://github.com/gai4se/LLM4SE)\n    - This repository is associated with prominent software engineering conferences like ICSE, FSE, and ASE.\n- Awesome-Code-LLM\n    - [Repository](https://github.com/codefuse-ai/Awesome-Code-LLM)\n    - This is the repository for a survey that comprehensively reviews LLM research for code. Works in each category are ordered chronologically. 
A curated list of language modeling research for code and related datasets.\n- awesome-ai4code-papers\n    - [Repository](https://github.com/bdqnghi/awesome-ai4code-papers)\n    - A collection of recent papers, benchmarks and datasets in the AI4Code domain.\n- ml4code\n    - [Repository](https://ml4code.github.io/)\n    - Research on machine learning for source code.\n- awesome-machine-learning-on-source-code\n    - [Repository](https://github.com/src-d/awesome-machine-learning-on-source-code)\n    - Cool links \u0026 research papers related to Machine Learning applied to source code (MLonCode)\n- saltudelft/ml4se\n    - [Repository](https://github.com/saltudelft/ml4se)\n    - A curated list of papers, theses, datasets, and tools related to the application of Machine Learning for Software Engineering\n- CUHK-ARISE/ml4code-dataset\n    - [Repository](https://github.com/CUHK-ARISE/ml4code-dataset)\n    - A collection of datasets for machine learning for big code\n\n\n\n## Repositories and Resources for LLM\n- Awesome-LLM4Tool: A Curated List of Resources for LLM Tools\n    - [Repository](https://github.com/OpenGVLab/Awesome-LLM4Tool)\n    - Offers a curated list of papers, repositories, tutorials, and resources related to large language models for tools.\n- LLMsPracticalGuide: A Curated List of Practical Resources\n    - [Repository](https://github.com/Mooler0410/LLMsPracticalGuide)\n    - Includes an evolutionary tree of modern Large Language Models that traces their development over the years\n- Hannibal046/Awesome-LLM\n    - [Repository](https://github.com/Hannibal046/Awesome-LLM)\n    - Awesome-LLM: a curated list of Large Language Models\n- awesome-decentralized-llm\n    - [Repository](https://github.com/imaurer/awesome-decentralized-llm)\n    - Collection of LLM resources that can be used to build products you can \"own\" or to perform reproducible research.\n- RUCAIBox/LLMSurvey\n    - [Repository](https://github.com/RUCAIBox/LLMSurvey)\n    - The official GitHub 
page for the survey paper \"A Survey of Large Language Models\".\n- tensorchord/Awesome-LLMOps\n    - [Repository](https://github.com/tensorchord/Awesome-LLMOps)\n    - An awesome \u0026 curated list of best LLMOps tools for developers\n- luban-agi/Awesome-Domain-LLM\n    - [Repository](https://github.com/luban-agi/Awesome-Domain-LLM)\n    - A curated list of domain-specific large language models in Chinese\n- underlines/awesome-ml\n    - [Repository](https://github.com/underlines/awesome-ml)\n    - Curated list of useful LLM / Analytics / Datascience resources\n\n# Benchmarks\n## Bug Repair\n### Defects4J\n- Release year: 2014\n- Paper: [\"Defects4J: A Database of Existing Faults to Enable Controlled Testing Studies for Java Programs\"](https://dl.acm.org/doi/10.1145/2610384.2628055)\n\n### ManyBugs/IntroClass\n- Release year: 2015\n- Paper: [\"The ManyBugs and IntroClass Benchmarks for Automated Repair of C Programs\"](https://ieeexplore.ieee.org/document/7153570)\n\n### BugAID\n- Release year: 2016\n- Paper: [\"Discovering Bug Patterns in JavaScript\"](https://dl.acm.org/doi/10.1145/2950290.2950308)\n\n### CoCoNut\n- Release year: 2020\n- Paper: [\"CoCoNuT: combining context-aware neural translation models using ensemble for program repair\"](https://dl.acm.org/doi/10.1145/3395363.3397369)\n\n### QuixBugs\n- Release year: 2017\n- Paper: [\"QuixBugs: a multi-lingual program repair benchmark set based on the quixey challenge\"](https://dl.acm.org/doi/10.1145/3135932.3135941)\n\n### Bugs.jar\n- Release year: 2018\n- Paper: [\"Bugs.jar: a large-scale, diverse dataset of real-world Java bugs\"](https://dl.acm.org/doi/10.1145/3196398.3196473)\n\n### BugsInPy\n- Release year: 2020\n- Paper: [\"BugsInPy: A Database of Existing Bugs in Python Programs to Enable Controlled Testing and Debugging Studies\"](https://dl.acm.org/doi/abs/10.1145/3368089.3417943)\n\n### DeepFix\n- Release year: 2017\n- Paper: [\"DeepFix: Fixing Common C Language Errors by Deep 
Learning\"](https://ojs.aaai.org/index.php/AAAI/article/view/10742)\n\n\n## Code Generation/Synthesis\n\n### CONCODE\n- Release year: 2018\n- Paper: [\"Mapping Language to Code in Programmatic Context\"](https://arxiv.org/abs/1808.09588)\n\n### HumanEval\n- Release year: 2021\n- Paper: [\"Evaluating Large Language Models Trained on Code\"](https://arxiv.org/abs/2107.03374)\n\n### MBPP/MathQA-Python\n- Release year: 2021\n- Paper: [\"Program Synthesis with Large Language Models\"](https://arxiv.org/abs/2108.07732)\n\n## Code Summarization\n### CODE-NN\n- Release year: 2016\n- Paper: [\"Summarizing Source Code using a Neural Attention Model\"](https://aclanthology.org/P16-1195/)\n\n### TL-CodeSum\n- Release year: 2018\n- Paper: [\"Summarizing Source Code with Transferred API Knowledge\"](https://www.ijcai.org/proceedings/2018/314)\n\n### CodeSearchNet\n- Release year: 2019\n- Paper: [\"CodeSearchNet Challenge: Evaluating the State of Semantic Code Search\"](https://arxiv.org/abs/1909.09436)\n\n## Cites\nIf you find this repository useful, please cite our survey papers:\n```\n@article{she2023pitfalls,\n  title={Pitfalls in Language Models for Code Intelligence: A Taxonomy and Survey},\n  author={She, Xinyu and Liu, Yue and Zhao, Yanjie and He, Yiling and Li, Li and Tantithamthavorn, Chakkrit and Qin, Zhan and Wang, Haoyu},\n  journal={arXiv preprint arXiv:2310.17903},\n  year={2023}\n}\n\n@article{hou2023large,\n  title={Large language models for software engineering: A systematic literature review},\n  author={Hou, Xinyi and Zhao, Yanjie and Liu, Yue and Yang, Zhou and Wang, Kailong and Li, Li and Luo, Xiapu and Lo, David and Grundy, John and Wang, Haoyu},\n  journal={arXiv preprint arXiv:2308.10620},\n  
year={2023}\n}\n```\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyueyuel%2Freliablelm4code","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyueyuel%2Freliablelm4code","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyueyuel%2Freliablelm4code/lists"}