https://github.com/Linyxus/awesome-neural-code-intelligence

# Awesome Neural Code Intelligence
A curated list of awesome machine learning methods for neural code intelligence.

## Websites

- [ML for Code](https://ml4code.github.io)

## Paper List

### RNN/LSTM-based

- code2vec: Learning Distributed Representations of Code, Alon et al., Proc. ACM Program. Lang. (2019): 40:1-40:29
[[arXiv]](https://arxiv.org/abs/1803.09473)
[[GitHub]](https://github.com/tech-srl/code2vec)
[[Demo]](https://code2vec.org/)
- code2seq: Generating Sequences from Structured Representations of Code, Alon et al., ICLR (2019)
[[arXiv]](https://arxiv.org/abs/1808.01400)
[[GitHub]](https://github.com/tech-srl/code2seq)
[[Demo]](https://code2seq.org/)

### Transformer-based

- CodeBERT: A Pre-Trained Model for Programming and Natural Languages, Feng et al., EMNLP (2020): 1536-1547
[[arXiv]](https://arxiv.org/abs/2002.08155)
[[GitHub]](https://github.com/microsoft/CodeBERT)
- GraphCodeBERT: Pre-training Code Representations with Data Flow, Guo et al., ICLR (2021)
[[OpenReview]](https://openreview.net/pdf?id=jLoC4ez43PZ)
[[GitHub]](https://github.com/microsoft/CodeBERT)
- CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation, Wang et al., arXiv (2021)
[[arXiv]](https://arxiv.org/abs/2109.00859)
[[GitHub]](https://github.com/salesforce/CodeT5)
- PyMT5: multi-mode translation of natural language and Python code with transformers, Clement et al., EMNLP (2020): 9052-9065
[[arXiv]](https://arxiv.org/abs/2010.03150)
- Evaluating Large Language Models Trained on Code, Chen et al., CoRR (2021)
[[arXiv]](https://arxiv.org/abs/2107.03374)
[[GitHub]](https://github.com/openai/human-eval)
- Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks, Mastropaolo et al., ICSE (2021): 336-347
[[arXiv]](https://arxiv.org/abs/2102.02017)
[[GitHub]](https://github.com/antonio-mastropaolo/T5-learning-ICSE_2021)
- Multi-task Learning based Pre-trained Language Model for Code Completion, Liu et al., ASE (2020): 473-485
[[arXiv]](https://arxiv.org/abs/2012.14631)
[[GitHub]](https://github.com/LiuFang816/CugLM)
- Unsupervised Translation of Programming Languages, Rozière et al., NeurIPS (2020)
[[arXiv]](https://arxiv.org/abs/2006.03511)
[[GitHub]](https://github.com/facebookresearch/CodeGen)
- DOBF: A Deobfuscation Pre-Training Objective for Programming Languages, Rozière et al., CoRR (2021)
[[arXiv]](https://arxiv.org/abs/2102.07492)
[[GitHub]](https://github.com/facebookresearch/CodeGen)
- Leveraging Automated Unit Tests for Unsupervised Code Translation, Rozière et al., CoRR (2021)
[[arXiv]](https://arxiv.org/abs/2110.06773)
[[GitHub]](https://github.com/facebookresearch/CodeGen)
- IntelliCode compose: code generation using transformer, Svyatkovskiy et al., ESEC/SIGSOFT FSE (2020): 1433-1443
[[arXiv]](https://arxiv.org/abs/2005.08025)
- Exploring Software Naturalness through Neural Language Models, Buratti et al., CoRR (2020)
[[arXiv]](https://arxiv.org/abs/2006.12641)
- Unified Pre-training for Program Understanding and Generation, Ahmad et al., NAACL-HLT (2021): 2655-2668
[[arXiv]](https://arxiv.org/abs/2103.06333)

### GNN-based

- Learning Execution through Neural Code Fusion, Shi et al., ICLR (2020)
[[arXiv]](https://arxiv.org/abs/1906.07181)
[[Talk]](https://papertalk.org/papertalks/3759)
- Learning to Represent Programs with Graphs, Allamanis et al., ICLR (2018)
[[arXiv]](https://arxiv.org/abs/1711.00740)
[[GitHub]](https://github.com/Microsoft/graph-based-code-modelling)

### Benchmarks & Surveys

- CodeBLEU: a Method for Automatic Evaluation of Code Synthesis, Ren et al., CoRR (2020)
[[arXiv]](https://arxiv.org/abs/2009.10297)
- CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation, Lu et al., CoRR (2021)
[[arXiv]](https://arxiv.org/abs/2102.04664)
[[GitHub]](https://github.com/microsoft/CodeXGLUE)
[[Website]](https://microsoft.github.io/CodeXGLUE/)
- Measuring Coding Challenge Competence With APPS, Hendrycks et al., CoRR (2021)
[[arXiv]](https://arxiv.org/abs/2105.09938)