# Awesome Source Code Analysis Via Machine Learning Techniques

A list of resources for source code analysis applications using machine learning techniques (e.g., deep learning, PCA, SVM, Bayesian and probabilistic models, reinforcement learning, etc.)

Maintainers - [Peter Teoh](https://github.com/tthtlc)

## Contributing
Please feel free to open [pull requests](https://github.com/tthtlc/awesome-source-analysis/pulls), email Peter Teoh (htmldeveloper@gmail.com), or join our chat to add links.

[Join the chat at https://gitter.im/tthtlc/awesome-source-analysis](https://gitter.im/tthtlc/awesome-source-analysis)

## Table of Contents

Machine-Learning-Guided Selectively Unsound Static Analysis
http://www.seas.upenn.edu/~kheo/home/paper/icse17-heohyi.pdf

A Survey of Machine Learning for Big Code and Naturalness
https://arxiv.org/pdf/1709.06182

Ariadne: Analysis for Machine Learning Programs
https://arxiv.org/pdf/1805.04058

The use of machine learning with signal- and NLP processing of source code to fingerprint, detect, and classify vulnerabilities and weaknesses with MARFCAT
https://arxiv.org/abs/1010.2511

VulDeePecker: A Deep Learning-Based System for Vulnerability Detection
https://arxiv.org/pdf/1801.01681

code2vec: Learning Distributed Representations of Code
https://arxiv.org/pdf/1803.09473

Automated software vulnerability detection with machine learning
https://arxiv.org/abs/1803.04497

Automatic feature learning for vulnerability prediction
https://arxiv.org/pdf/1708.02368

Neural Turing Machines
https://arxiv.org/pdf/1410.5401.pdf

DeepCoder: Learning to Write Programs
https://arxiv.org/abs/1611.01989

Recent Advances in Neural Program Synthesis
https://arxiv.org/pdf/1802.02353

Neural-Guided Deductive Search for Real-Time Program Synthesis
https://arxiv.org/pdf/1804.01186

RobustFill: Neural Program Learning under Noisy I/O
https://arxiv.org/pdf/1703.07469

Neural Program Search: Solving Programming Tasks from Description
https://arxiv.org/pdf/1802.04335

A Syntactic Neural Model for General-Purpose Code Generation
https://arxiv.org/pdf/1704.01696

Building Machines That Learn and Think Like People
https://arxiv.org/pdf/1604.00289

Differentiable Programs with Neural Libraries
https://arxiv.org/pdf/1611.02109

Summary-TerpreT: A Probabilistic Programming Language for Program Induction
https://arxiv.org/pdf/1612.00817

Auto-Documenation for Software Development
https://arxiv.org/pdf/1701.08485

BOOK: Storing Algorithm-Invariant Episodes for Deep Reinforcement Learning
https://arxiv.org/pdf/1709.01308

Boda-RTC: Productive Generation of Portable, Efficient Code ...
https://arxiv.org/pdf/1606.00094

Making Neural Programming Architectures Generalize via Recursion
https://arxiv.org/pdf/1704.06611

Differentiable Functional Program Interpreters
https://arxiv.org/pdf/1611.01988

Deep Probabilistic Programming Languages: A Qualitative Study
https://arxiv.org/pdf/1804.06458

BinPro: A Tool for Binary Source Code Provenance
https://arxiv.org/pdf/1711.00830

A Survey on Compiler Autotuning using Machine Learning
https://arxiv.org/pdf/1801.04405

Estimating defectiveness of source code: A predictive model using GitHub content
https://arxiv.org/pdf/1803.07764

EMBER: An Open Dataset for Training Static PE Malware Machine
https://arxiv.org/pdf/1804.04637

On End-to-End Program Generation from User Intention by Deep Neural Networks
https://arxiv.org/pdf/1510.07211

Utilizing Static Analysis and Code Generation to Accelerate Neural Networks
https://arxiv.org/abs/1206.6466

DLPaper2Code: Auto-generation of Code from Deep Learning Research Paper
https://arxiv.org/pdf/1711.03543

Inferring Generative Model Structure with Static Analysis
https://arxiv.org/pdf/1709.02477

Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similarities
https://arxiv.org/pdf/1707.04742

DeepAPT: Nation-State APT Attribution Using End-to-End Deep Neural Networks
https://arxiv.org/pdf/1711.09666

Automatic Structure Discovery for Large Source Code
https://arxiv.org/pdf/1202.3335

Comment Generation for Source Code: Survey
https://arxiv.org/pdf/1802.02971

Towards Reverse-Engineering Black-Box Neural Networks
https://arxiv.org/abs/1711.01768

Database Reverse Engineering based on Association Rule Mining
https://arxiv.org/pdf/1004.3272.pdf

Automated detection and classification of cryptographic algorithms in binary programs through machine learning
https://arxiv.org/pdf/1503.01186

Automatically Generating Commit Messages from Diffs using Neural Machine Translation
https://arxiv.org/pdf/1708.09492

When Coding Style Survives Compilation: De-anonymizing Programmers from Executable
https://arxiv.org/pdf/1512.08546

Code smells
https://arxiv.org/pdf/1802.06063

Data Driven Exploratory Attacks on Black Box Classifiers in Adversarial Domains
https://arxiv.org/pdf/1703.07909

pix2code: Generating Code from a Graphical User Interface Screenshot
https://arxiv.org/pdf/1705.07962

Deep Learning in Software Engineering
https://arxiv.org/pdf/1805.04825

Predicting Software Defects Through SVM: An Empirical Approach
https://arxiv.org/pdf/1803.03220

A Survey of Reverse Engineering and Program Comprehension
https://arxiv.org/pdf/cs/0503068

https://owasp.org/www-project-top-ten/2017/

https://arxiv.org/pdf/1709.07101.pdf

https://arxiv.org/pdf/1805.05206.pdf

https://arxiv.org/pdf/1807.09160.pdf

https://arxiv.org/pdf/1806.07336.pdf

Alternatively, browse the results of an arxiv.org search (expect some misidentified papers): [recent arxiv.org search](/summary_6dec2018.md)

[LLVM based vulnerabilities search](/summary_llvm_source6dec2018.md)

As an extension, see:

https://ml4code.github.io/

(This site is an offshoot of the paper https://arxiv.org/abs/1709.06182.)