Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-ai4code
A collection of recent papers, benchmarks, and datasets in the AI4Code domain.
https://github.com/bdqnghi/awesome-ai4code
Tools/Products
AI code completion tools
More General Coding Assistants
ChatGPT in your editor
LLM-powered natural language compilers
Academic
Conferences
- Automated Software Engineering (ASE)
- Programming Language Design and Implementation (PLDI)
- International Conference on Learning Representations (ICLR)
- Empirical Methods in Natural Language Processing (EMNLP)
- North American Chapter of the Association for Computational Linguistics (NAACL)
- Annual Meeting of the Association for Computational Linguistics (ACL)
- International Conference on Software Engineering (ICSE)
- Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE)
- International Conference on Machine Learning (ICML)
- Conference on Neural Information Processing Systems (NeurIPS)
Papers (this list is somewhat outdated and needs updating)
- Large Language Models of Code Fail at Completing Code with Potential Bugs - Tuan Dinh, Jinman Zhao, Samson Tan, Renato Negrinho, Leonard Lausen, Sheng Zha, George Karypis.
- Large Language Models Meet NL2Code: A Survey - Daoguang Zan, Bei Chen, Fengji Zhang, Dianjie Lu, Bingchao Wu, Bei Guan, Yongji Wang, Jian-Guang Lou (ACL 2023)
- RepoFusion: Training Code Models to Understand Your Repository - Disha Shrivastava, Denis Kocetkov, Harm de Vries, Dzmitry Bahdanau, Torsten Scholak
- XCODEEVAL: An Execution-based Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval - Mohammad Abdullah Matin Khan, M Saiful Bari, Xuan Long Do, Weishi Wang, Md Rizwan Parvez, Shafiq Joty
Pretrained CodeLLMs
Papers (this list is somewhat outdated and needs updating)
- CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation - Yue Wang, Weishi Wang, Shafiq Joty, Steven C.H. Hoi (EMNLP 2021) (***CodeT5***).
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages - Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou (EMNLP 2020 Findings) (***CodeBERT***).
- Learning and Evaluating Contextual Embedding of Source Code - Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi. (ICML 2020) (***CuBERT***).
- InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees - Nghi D. Q. BUI, Yijun YU, Lingxiao JIANG (ICSE 2021) (***InferCode***).
- Unsupervised Translation of Programming Languages - Marie-Anne Lachaux, Baptiste Roziere, Lowik Chanussot, Guillaume Lample (NeurIPS 2020) (***Transcoder***).
- Contrastive Code Representation Learning
- CoTexT: Multi-task Learning with Code-Text Transformer
- How could Neural Networks understand Programs? - Dinglan Peng, Shuxin Zheng, Yatao Li, Guolin Ke, Di He, Tie-Yan Liu (ICML 2021) (***OSCAR***).
- Unified Pre-training for Program Understanding and Generation - Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang (NAACL 2021) (***PLBART***).
- Exploring Software Naturalness through Neural Language Models (***C-BERT***).
- PYMT5: multi-mode translation of natural language and PYTHON code with transformers
- DOBF: A Deobfuscation Pre-Training Objective for Programming Languages - Baptiste Roziere, Marie-Anne Lachaux, Marc Szafraniec, Guillaume Lample (arXiv 2021) (***DOBF***).
- Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks
- Disentangled Code Representation Learning for Multiple Programming Languages (ACL Findings 2021) (***CODEDISEN***).
- SYNCOBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation
- TreeBERT: A Tree-Based Pre-Trained Model for Programming Language
- Empirical Study of Transformers for Source Code
- CodeTrans: Towards Cracking the Language of Silicone's Code Through Self-Supervised Deep Learning and High Performance Computing
- Self-Supervised Learning for Code Retrieval and Summarization through Semantic-Preserving Program Transformations - Nghi D. Q. BUI, Yijun YU, Lingxiao JIANG (SIGIR 2021) (***Corder***).
Talks and Tutorials
Papers (this list is somewhat outdated and needs updating)
Talk and Tutorial
Dataset and Benchmark
Papers (this list is somewhat outdated and needs updating)
Keywords
- program-synthesis (3)
- ai (3)
- typescript (2)
- python (2)
- chatgpt (2)
- ml (2)
- machine-learning (2)
- data-science (1)
- data (1)
- cnn (1)
- bert (1)
- robotics (1)
- lean (1)
- vscode (1)
- gpt-4 (1)
- gpt-3 (1)
- llms (1)
- gpt (1)
- generative-models (1)
- emacs (1)
- search (1)
- react (1)
- javascript (1)
- frontend (1)
- language-model (1)
- documentation-generator (1)
- stylometry (1)
- source (1)
- jam-programming-competition (1)
- google-code-jam (1)
- dataset (1)
- contest (1)
- code (1)
- authorship-recognition (1)
- authorship-identification (1)
- authorship-attribution (1)
- puzzles (1)
- programming-competitions (1)
- code-generation (1)
- tensorflow (1)
- self-attention (1)
- rnn (1)
- representation-learning (1)
- programming-language-theory (1)
- open-data (1)
- nlp-machine-learning (1)
- nlp (1)
- neural-networks (1)
- natural-language-processing (1)
- machine-learning-on-source-code (1)