https://github.com/zwang4/awesome-machine-learning-in-compilers

Must read research papers and links to tools and datasets that are related to using machine learning for compilers and systems optimisation

List: awesome-machine-learning-in-compilers

artificial-intelligence auto-tuning compiler machine-learning multi-cores operating-systems optimisation parallel-computing parallel-programming parallelisation parallelism

# Awesome machine learning for compilers and program optimisation
[![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)
[![Maintenance](https://img.shields.io/badge/Maintained%3F-YES-green.svg)](https://github.com/zwang4/awesome-machine-learning-in-compilers/graphs/commit-activity)

A curated list of awesome research papers, datasets, and tools for applying machine learning techniques to compilers and program optimisation.

## Contents
- [Papers](#papers)
- [Survey](#survey)
- [Iterative Compilation and Compiler Option Tuning](#iterative-compilation-and-compiler-option-tuning)
- [Instruction-level Optimisation](#instruction-level-optimisation)
- [Parallelism Mapping and Task Scheduling](#parallelism-mapping-and-task-scheduling)
- [Languages and Compilation](#languages-and-compilation)
- [Auto-tuning and Design Space Exploration](#auto-tuning-and-design-space-exploration)
- [Code Size Reduction](#code-size-reduction)
- [Cost and Performance Models](#cost-and-performance-models)
- [Domain-specific Optimisation](#domain-specific-optimisation)
- [Learning Program Representation](#learning-program-representation)
- [Enabling ML in Compilers and Systems Optimisation](#enabling-ml-in-compilers-and-systems-optimisation)
- [Memory/Cache Modeling/Analysis](#memorycache-modelinganalysis)
- [Books](#books)
- [Talks and Tutorials](#talks-and-tutorials)
- [Software](#software)
- [Benchmarks and Datasets](#benchmarks-and-datasets)
- [Conferences](#conferences)
- [Journals](#journals)
- [How to Contribute](#how-to-contribute)

## Papers
#### Survey
- 23-pages [Machine Learning in Compiler Optimisation](https://zwang4.github.io/publications/pieee18.pdf) - Zheng Wang and Michael O'Boyle, Proceedings of the IEEE, 2018
- 43-pages [A survey on compiler autotuning using machine learning](https://dl.acm.org/doi/abs/10.1145/3197978) - Amir H. Ashouri, William Killian, John Cavazos, Gianluca Palermo, and Cristina Silvano, ACM Computing Surveys (CSUR), 2018
- 43-pages [A survey of machine learning for big code and naturalness](https://arxiv.org/abs/1709.06182) - Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu, and Charles Sutton, ACM Computing Surveys (CSUR), 2018
- 9-pages [A Taxonomy of ML for Systems Problems](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9153088) - Martin Maas, IEEE Micro, 2020
- 34-pages [The Deep Learning Compiler: A Comprehensive Survey](https://arxiv.org/abs/2002.03794) - Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian, IEEE Transactions on Parallel and Distributed Systems, 2021

#### Iterative Compilation and Compiler Option Tuning
- 13-pages [SRTuner: Effective Compiler Optimization Customization by Exposing Synergistic Relations](https://ieeexplore.ieee.org/document/9741263) - Sunghyun Park, Salar Latifi, Yongjun Park, Armand Behroozi, Byungsoo Jeon, Scott Mahlke. CGO 2022.
- 25-pages [Iterative Compilation Optimization Based on Metric Learning and Collaborative Filtering](https://dl.acm.org/doi/full/10.1145/3480250) - Hongzhi Liu, Jie Luo, Ying Li, Zhonghai Wu. ACM TACO 2022.
- 17-pages [Bayesian Optimization is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020](https://arxiv.org/pdf/2104.10201.pdf) - Ryan Turner, David Eriksson, Michael McCourt, Juha Kiili, Eero Laaksonen, Zhen Xu, Isabelle Guyon. arXiv 2021.
- 16-pages [Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models](https://dl.acm.org/doi/abs/10.1145/3453483.3454109) - RB Roy, T Patel, V Gadepally, D Tiwari. PLDI 2021.
- 12-pages [Efficient Compiler Autotuning via Bayesian Optimization](https://drive.google.com/file/d/1uc5d6xn3EUYXWVV8VFSdtfZ9eqvTL3k1/view) - Junjie Chen, Ningxin Xu, Peiqi Chen, Hongyu Zhang. ICSE 2021.
- 11-pages [Customized Monte Carlo Tree Search for LLVM/Polly's Composable Loop Optimization Transformations](https://arxiv.org/abs/2105.04555) - Jaehoon Koo, Prasanna Balaprakash, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall. arXiv 2021.
- 11-pages [Improved basic block reordering](https://arxiv.org/pdf/1809.04676.pdf) - Andy Newell and Sergey Pupyrev. IEEE Transactions on Computers, 2020.
- 11-pages [Static Neural Compiler Optimization via Deep Reinforcement Learning](https://arxiv.org/abs/2008.08951) - Rahim Mammadli, Ali Jannesari, Felix Wolf. LLVM HPC Workshop, 2020.
- 11-pages [Autotuning Search Space for Loop Transformations](https://arxiv.org/pdf/1809.04676.pdf) - Michael Kruse, Hal Finkel, Xingfu Wu. LLVM HPC Workshop, 2020.
- 11-pages [A Collaborative Filtering Approach for the Automatic Tuning of Compiler Optimisations](https://dl.acm.org/doi/abs/10.1145/3372799.3394361) - Stefano Cereda, Gianluca Palermo, Paolo Cremonesi, and Stefano Doni, LCTES 2020.
- 12-pages [Autophase: Compiler phase-ordering for HLS with deep reinforcement learning](https://proceedings.mlsys.org/paper/2020/file/4e732ced3463d06de0ca9a15b6153677-Paper.pdf) - Ameer Haj-Ali, Qijing Huang, William Moses, John Xiang, Ion Stoica, Krste Asanovic, John Wawrzynek. MLSys 2020.
- 10-pages [FuncyTuner: Auto-tuning Scientific Applications With Per-loop Compilation](https://arcb.csc.ncsu.edu/~mueller/ftp/pub/mueller/papers/icpp19.pdf) - Tao Wang, Nikhil Jain, David Beckingsale, David Böhme, Frank Mueller, Todd Gamblin. ICPP 2019.
- 21-pages [Micomp: Mitigating the compiler phase-ordering problem using optimization sub-sequences and machine learning](https://core.ac.uk/download/pdf/93751619.pdf) - Amir H. Ashouri, Andrea Bignoli, Gianluca Palermo, Cristina Silvano, Sameer Kulkarni, and John Cavazos. ACM Transactions on Architecture and Code Optimization (TACO) 2017.
- 26-pages [Iterative Schedule Optimization for Parallelization in the Polyhedron Model](https://www.infosun.fim.uni-passau.de/publications/docs/GGS+17.pdf) - Stefan Ganser, Armin Grösslinger, Norbert Siegmund, Sven Apel, and Christian Lengauer. ACM Transactions on Architecture and Code Optimization (TACO), 2017.
- 14-pages [Learning to superoptimize programs](https://arxiv.org/abs/1611.01787v3) - Rudy Bunel, Alban Desmaison, M. Pawan Kumar, Philip H.S. Torr, Pushmeet Kohli. ICLR 2017
- 25-pages [Continuous learning of compiler heuristics](https://dl.acm.org/doi/abs/10.1145/2400682.2400705) - Michele Tartara and Stefano Crespi Reghizzi. ACM Transactions on Architecture and Code Optimization (TACO), 2013.
- 16-pages [Mitigating the compiler optimization phase-ordering problem using machine learning](https://www.eecis.udel.edu/~cavazos/oopsla-2012.pdf) - Sameer Kulkarni and John Cavazos. OOPSLA 2012.
- 10-pages [An evaluation of different modeling techniques for iterative compilation](https://www.eecis.udel.edu/~cavazos/cases-2011.pdf) - Eunjung Park, Sameer Kulkarni, and John Cavazos. CASES 2011.
- 12-pages [Evaluating iterative optimization across 1000 datasets](https://users.elis.ugent.be/~leeckhou/papers/pldi10.pdf) - Yang Chen, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Liang Peng, Olivier Temam, and Chengyong Wu. PLDI 2010
- 11-pages [Iterative optimization in the polyhedral model: Part II, multidimensional time](https://www.eecis.udel.edu/~cavazos/pldi-2008.pdf) - Louis-Noël Pouchet, Cédric Bastoul, Albert Cohen, and John Cavazos. PLDI 2008.
- 10-pages [Cole: compiler optimization level exploration](https://users.elis.ugent.be/~leeckhou/papers/cgo08.pdf) - Kenneth Hoste and Lieven Eeckhout. CGO 2008.
- 13-pages [MILEPOST GCC: machine learning based research compiler](http://www.fursin.net/papers/fmtp2008.pdf) - Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Ayal Zaks, Bilha Mendelson et al., 2008
- 13-pages [Evaluating heuristic optimization phase order search algorithms](http://www.cs.fsu.edu/~whalley/papers/cgo07.pdf) - J. W. Davidson, Gary S. Tyson, D. B. Whalley, and P. A. Kulkarni. CGO 2007.
- 13-pages [Rapidly selecting good compiler optimizations using performance counters](http://ebonilla.github.io/papers/cavazos-et-al-cgo-2007.pdf) - John Cavazos, Grigori Fursin, Felix Agakov, Edwin Bonilla, Michael FP O'Boyle, and Olivier Temam. CGO 2007.
- 11-pages [Using machine learning to focus iterative optimization](http://homepages.inf.ed.ac.uk/bfranke/Publications/cgo-2006.pdf) - Felix Agakov, Edwin Bonilla, John Cavazos, Björn Franke, Grigori Fursin, Michael FP O'Boyle, John Thomson, Marc Toussaint, and Christopher KI Williams. CGO 2006.
- 12-pages [Method-specific dynamic compilation using logistic regression](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.132.4370&rep=rep1&type=pdf) - John Cavazos and Michael FP O'Boyle. OOPSLA 2005.
- 14-pages [Predicting unroll factors using supervised classification](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.93.2788&rep=rep1&type=pdf) - Mark Stephenson and Saman Amarasinghe. CGO 2005.
- 12-pages [Fast searches for effective optimization phase sequences](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.93.2788&rep=rep1&type=pdf) - Prasad Kulkarni, Stephen Hines, Jason Hiser, David Whalley, Jack Davidson, and Douglas Jones. PLDI 2004.

#### Instruction-level Optimisation
- 12-pages [RL4ReAl: Reinforcement Learning for Register Allocation](https://arxiv.org/pdf/2204.02013.pdf) - S. VenkataKeerthy, Siddharth Jain, Anilava Kundu, Rohit Aggarwal, Albert Cohen, Ramakrishna Upadrasta. CC 2023.
- 12-pages [Reinforcement Learning assisted Loop Distribution for Locality and Vectorization](https://www.researchgate.net/publication/365475992_Reinforcement_Learning_assisted_Loop_Distribution_for_Locality_and_Vectorization) - Shalini Jain, S. VenkataKeerthy, Rohit Aggarwal, Tharun Kumar Dangeti, Dibyendu Das, Ramakrishna Upadrasta. LLVM HPC Workshop 2022.
- 17-pages [Discovering faster matrix multiplication algorithms with reinforcement learning](https://www.nature.com/articles/s41586-022-05172-4.pdf) - Alhussein Fawzi, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov et al. Nature 2022
- 13-pages [A Reinforcement Learning Environment for Polyhedral Optimizations](https://arxiv.org/pdf/2104.13732.pdf) - Alexander Brauckmann, Andrés Goens, Jeronimo Castrillon. PACT, 2021.
- 12-pages [AI Powered Compiler Techniques for DL Code Optimization](https://arxiv.org/pdf/2104.05573.pdf) - Sanket Tavarageri, Gagandeep Goyal, Sasikanth Avancha, Bharat Kaul, Ramakrishna Upadrasta. arXiv 2021.
- 13-pages [VeGen: A Vectorizer Generator for SIMD and Beyond](http://groups.csail.mit.edu/commit/papers/2021/vegen.pdf) - Yishen Chen, Charith Mendis, Michael Carbin, Saman Amarasinghe. ASPLOS 2021.
- 11-pages [Deep Learning-based Hybrid Graph-Coloring Algorithm for Register Allocation](https://arxiv.org/abs/1912.03700) - Dibyendu Das, Shahid Asghar Ahmad, Kumar Venkataramanan. LLVM HPC Workshop, 2020.
- 14-pages [NeuroVectorizer: end-to-end vectorization with deep reinforcement learning](https://people.eecs.berkeley.edu/~krste/papers/neurovectorizer-cgo2020.pdf) - Ameer Haj-Ali, Nesreen K. Ahmed, Ted Willke, Yakun Sophia Shao, Krste Asanovic, and Ion Stoica. CGO 2020.
- 13-pages [Unleashing the Power of Learning: An Enhanced Learning-Based Approach for Dynamic Binary Translation](https://www.usenix.org/system/files/atc19-song_0.pdf) - Changheng Song, Wenwen Wang, Pen-Chung Yew, Antonia Zhai, Weihua Zhang. USENIX ATC 2019.
- 11-pages [Compiler Auto-Vectorization with Imitation Learning](https://papers.nips.cc/paper/9604-compiler-auto-vectorization-with-imitation-learning.pdf) - Charith Mendis, Cambridge Yang, Yewen Pu, Saman P. Amarasinghe, Michael Carbin. NeurIPS 2019.
- 19-pages [Multi-objective Exploration for Practical Optimization Decisions in Binary Translation](https://dl.acm.org/doi/abs/10.1145/3358185) - Sunghyun Park, Youfeng Wu, Janghaeng Lee, Amir Aupov, and Scott Mahlke. ACM Transactions on Embedded Computing Systems (TECS), 2019.
- 12-pages [Automatic construction of inlining heuristics using machine learning.](https://dl.acm.org/doi/abs/10.1109/CGO.2013.6495004) - Sameer Kulkarni, John Cavazos, Christian Wimmer, and Douglas Simon. CGO 2013.
- 11-pages [Automatic tuning of inlining heuristics](http://sc05.supercomputing.org/schedule/pdf/pap274.pdf) - John Cavazos and Michael O'Boyle. SC 2005.
- 13-pages [Inducing heuristics to decide whether to schedule](https://www.eecis.udel.edu/~cavazos/pldi-2004.pdf) - John Cavazos and J. Eliot B. Moss. PLDI 2004.
- 14-pages [Meta optimization: Improving compiler heuristics with machine learning](http://groups.csail.mit.edu/cag/metaopt/papers/metaopt-pldi03.pdf) - Mark Stephenson, Saman Amarasinghe, Martin Martin, and Una-May O'Reilly. PLDI 2003.
- 7-pages [Learning to schedule straight-line code](http://papers.nips.cc/paper/1349-learning-to-schedule-straight-line-code.pdf) - J. Eliot B. Moss, Paul E. Utgoff, John Cavazos, Doina Precup, Darko Stefanovic, Carla E. Brodley, and David Scheeff. NeurIPS 1998.

#### Auto-tuning and Design Space Exploration
- 13-pages [Accelerated Auto-Tuning of GPU Kernels for Tensor Computations](https://dl.acm.org/doi/pdf/10.1145/3650200.3656626) - Chendi Li, Yufan Xu, Sina Mahdipour Saravani, P. Sadayappan. ICS 2024.
- 12-pages [Revealing Compiler Heuristics through Automated Discovery and Optimization](https://www.research.ed.ac.uk/files/389049758/Revealing_computer_heuristics_SEEKER_DOA31072023_AFV_CC_BY.pdf) - Volker Seeker, Chris Cummins, Murray Cole, Björn Franke, Kim Hazelwood, Hugh Leather. CGO 2024.
- 29-pages [The Droplet Search Algorithm for Kernel Scheduling](https://dl.acm.org/doi/10.1145/3650109) - Michael Canesche, Vanderson M. Rosario, Edson Borin, Fernando Magno Quintão Pereira. ACM TACO 2024
- 24-pages [BaCO: A Fast and Portable Bayesian Compiler Optimization Framework](https://arxiv.org/pdf/2212.11142.pdf) - Erik Hellsten, Artur Souza, Johannes Lenfers, Rubens Lacouture, Olivia Hsu, Adel Ejjeh, Fredrik Kjolstad, Michel Steuwer, Kunle Olukotun, Luigi Nardi. ASPLOS 2024.
- 12-pages [(De/Re)-Compositions Expressed Systematically via MDH-Based Schedules](https://doi.org/10.1145/3578360.3580269) - Ari Rasch, Richard Schulze, Denys Shabalin, Anne Elster, Sergei Gorlatch, Mary Hall. CC 2023.
- 23-pages [Autotuning Convolutions is Easier Than You Think](https://dl.acm.org/doi/10.1145/3570641) - Nicolas Tollenaere, Guillaume Iooss, Stéphane Pouget, Hugo Brunie, Christophe Guillon, Albert Cohen, P. Sadayappan, Fabrice Rastello. ACM TACO 2022.
- 12-pages [Transfer-Tuning: Reusing Auto-Schedules for Efficient Tensor Program Code Generation](https://dl.acm.org/doi/10.1145/3559009.3569682) - Perry Gibson, Jose Cano. PACT 2022.
- 6-pages [Glimpse: Mathematical Embedding of Hardware Specification for Neural Compilation](https://dl.acm.org/doi/abs/10.1145/3489517.3530590) - Byung Hoon Ahn, Sean Kinzer, Hadi Esmaeilzadeh. DAC 2022.
- 14-pages [One-shot tuner for deep learning compilers](https://dl.acm.org/doi/abs/10.1145/3497776.3517774) - Jaehun Ryu, Eunhyeok Park, Hyojin Sung. CC 2022.
- 16-pages [A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers](https://mangpo.net/papers/xla-autotuning-pact2021.pdf) - Phitchaya Mangpo Phothilimthana, Amit Sabne, Nikhil Sarda, Karthik Srinivasa Murthy, Yanqi Zhou, Christof Angermueller, Mike Burrows, Sudip Roy, Ketan Mandke, Rezsa Farahani, Yu Emma Wang, Berkin Ilbeyi, Blake Hechtman, Bjarke Roune, Shen Wang, Yuanzhong Xu, and Samuel J. Kaufman. PACT 2021.
- 16-pages [TASO: Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions](https://doi.org/10.1145/3341301.3359630) - Zhihao Jia, Oded Padon, James Thomas, Todd Warszawski, Matei Zaharia, and Alex Aiken. ACM SOSP 2019.
- 12-pages [Value Learning for Throughput Optimization of Deep Neural Workloads](https://proceedings.mlsys.org/paper/2021/file/73278a4a86960eeb576a8fd4c9ec6997-Paper.pdf) - Benoit Steiner, Chris Cummins, Horace He, Hugh Leather. MLSys 2021.
- 14-pages [DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation](https://openreview.net/pdf?id=GTGb3M_KcUl) - Minjia Zhang, Menghao Li, Chi Wang, Mingqin Li. ICLR 2021.
- 17-pages [Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning](https://openreview.net/pdf?id=-6vS_4Kfz0) - Shauharda Khadka, Estelle Aflalo, Mattias Mardar, Avrech Ben-David, Santiago Miret, Shie Mannor, Tamir Hazan, Hanlin Tang, Somdeb Majumdar. ICLR 2021.
- 13-pages [GPTune: Multitask Learning for Autotuning Exascale Applications](https://dl.acm.org/doi/10.1145/3437801.3441621) - Yang Liu, Wissam M. Sid-Lakhdar, Osni Marques, Xinran Zhu, Chang Meng, James W. Demmel, Xiaoye S. Li. PPoPP 2021.
- 16-pages [ApproxTuner: A Compiler and Runtime System for Adaptive Approximations](https://dl.acm.org/doi/10.1145/3437801.3446108) - Hashim Sharif, Yifan Zhao, Maria Kotsifakou, Akash Kothari, Ben Schreiber, Elizabeth Wang, Yasmin Sarita, Nathan Zhao, Keyur Joshi, Vikram S. Adve, Sasa Misailovic, Sarita Adve. PPoPP 2021.
- 26-pages [Efficient Auto-Tuning of Parallel Programs with Interdependent Tuning Parameters via Auto-Tuning Framework (ATF)](https://dl.acm.org/doi/10.1145/3427093) - Ari Rasch, Richard Schulze, Michel Steuwer, Sergei Gorlatch. ACM TACO 2021.
- 17-pages [Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation](https://openreview.net/pdf?id=rygG4AVFvH) - Byung Hoon Ahn, Prannoy Pilligundla, Amir Yazdanbakhsh, Hadi Esmaeilzadeh. ICLR 2020.
- 18-pages [Ansor: Generating High-Performance Tensor Programs for Deep Learning](https://www.usenix.org/system/files/osdi20-zheng.pdf) - Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, Ion Stoica. OSDI 2020. ([slides](https://www.usenix.org/sites/default/files/conference/protected-files/osdi20_slides_zheng.pdf), [presentation](https://www.youtube.com/watch?v=A2hJ_Mj02zk))
- 13-pages [A Pattern Based Algorithmic Autotuner for Graph Processing on GPUs](https://readingxtra.github.io/docs/gpu-graph/MengPPoPP2019.pdf) - Ke Meng, Jiajia Li, Guangming Tan, Ninghui Sun. PPoPP 2019.
- 10-pages [FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search](https://arxiv.org/pdf/1812.03443.pdf) - Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer. CVPR 2019.
- 17-pages [TVM: An automated end-to-end optimizing compiler for deep learning](https://www.usenix.org/system/files/osdi18-chen.pdf) - Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan et al., OSDI 2018
- 10-pages [BOAT: Building auto-tuners with structured Bayesian optimization](https://www.cl.cam.ac.uk/~ey204/teaching/ACS/R244_2018_2019/papers/dalibard_WWW_2017.pdf) - Valentin Dalibard, Michael Schaarschmidt, and Eiko Yoneki, WWW 2017.
- 25-pages [Cobayn: Compiler autotuning framework using bayesian networks](https://groups.csail.mit.edu/commit/papers/2014/ansel-pact14-opentuner.pdf) - Amir H. Ashouri, Giovanni Mariani, Gianluca Palermo, Eunjung Park, John Cavazos, and Cristina Silvano, ACM Transactions on Architecture and Code Optimization (TACO), 2016.
- 12-pages [Autotuning algorithmic choice for input sensitivity](http://groups.csail.mit.edu/commit/papers/2015/yding-pldi15-pbinput.pdf) - Yufei Ding, Jason Ansel, Kalyan Veeramachaneni, Xipeng Shen, Una-May O'Reilly, and Saman Amarasinghe. PLDI 2015
- 26-pages [Fast: A fast stencil autotuning framework based on an optimal-solution space model](http://www.elbagarza.com/pdfs/jia_2015_gpu.pdf) - Yulong Luo, Guangming Tan, Zeyao Mo, and Ninghui Sun. ACM Transactions on Architecture and Code Optimization (TACO), 2015.
- 10-pages [GPU performance and power tuning using regression trees](https://dl.acm.org/doi/abs/10.1145/2751205.2751214) - Wenhao Jia, Elba Garza, Kelly A. Shaw, and Margaret Martonosi. SC 2015.
- 6-pages [Reinforcement learning-based inter-and intra-application thermal optimization for lifetime improvement of multicore systems](https://cfaed.tu-dresden.de/files/user/akumar/pdf/dac2014.pdf) - Anup K Das, Rishad Ahmed Shafik, Geoff V Merrett, Bashir M Al-Hashimi, Akash Kumar, Bharadwaj Veeravalli. DAC 2014
- 13-pages [Opentuner: An extensible framework for program autotuning](https://groups.csail.mit.edu/commit/papers/2014/ansel-pact14-opentuner.pdf) - Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, and Saman Amarasinghe. PACT 2014
- 12-pages [Taming parallel I/O complexity with auto-tuning](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.714.1995&rep=rep1&type=pdf) - Babak Behzad, Huong Vu Thanh Luu, Joseph Huchette, Surendra Byna, Ruth Aydt, Quincey Koziol, and Marc Snir. SC 2013.
- 12-pages [A multi-objective auto-tuning framework for parallel codes](https://www.researchgate.net/profile/Philipp_Gschwandtner/publication/235436717_A_multi-objective_auto-tuning_framework_for_parallel_codes/links/55b5d86b08aed621de02f1d9/A-multi-objective-auto-tuning-framework-for-parallel-codes.pdf) - Herbert Jordan, Peter Thoman, Juan J. Durillo, Simone Pellegrini, Philipp Gschwandtner, Thomas Fahringer, and Hans Moritsch. SC 2012.
- 8-pages [Bandit-based optimization on graphs with application to library performance tuning](https://www.icml.cc/Conferences/2009/papers/494.pdf) - Frédéric De Mesmay, Arpad Rimmel, Yevgen Voronenko, and Markus Püschel. ICML 2009.
- 12-pages [Combining models and guided empirical search to optimize for multiple levels of the memory hierarchy](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.532.9511&rep=rep1&type=pdf) - Chun Chen, Jacqueline Chame, and Mary Hall. CGO 2005
- 11-pages [Active harmony: towards automated performance tuning](http://www.cs.umd.edu/~hollings/papers/sc02a.pdf) - Cristian Tapus, I-Hsin Chung, Jeffrey K. Hollingsworth. SC 2002

#### Parallelism Mapping and Task Scheduling
- 13-pages [Exploration of Convolutional Neural Network models for source code classification](https://doi.org/10.1016/j.engappai.2020.104075) - Francesco Barchi, Emanuele Parisi, Gianvito Urgese, Elisa Ficarra, and Andrea Acquaviva. Engineering Applications of Artificial Intelligence, January 2021.
- 16-pages [Autopilot: workload autoscaling at Google](https://dl.acm.org/doi/10.1145/3342195.3387524) - Krzysztof Rzadca, Pawel Findeisen, Jacek Swiderski, Przemyslaw Zych, Przemyslaw Broniek, Jarek Kusmierek, Pawel Nowak, Beata Strack, Piotr Witusowski, Steven Hand, John Wilkes. EuroSys 2020. [slides](https://www.eurosys2020.org/wp-content/uploads/2020/04/slides/149_rzadca_slides.pdf)
- 13-pages [Modeling and optimizing NUMA effects and prefetching with machine learning](https://dl.acm.org/doi/10.1145/3392717.3392765) - Isaac Sánchez Barrera, David Black-Schaffer, Marc Casas, Miquel Moretó, Anastasiia Stupnikova, and Mihail Popov. ICS 2020.
- 14-pages [Poise: Balancing thread-level parallelism and memory system performance in GPUs using machine learning](https://homepages.inf.ed.ac.uk/vnagaraj/papers/hpca19.pdf) - Saumay Dublish, Vijay Nagarajan, and Nigel Topham. HPCA 2019.
- 10-pages [Data and thread placement in NUMA architectures: A statistical learning approach](https://www.mcs.anl.gov/research/projects/argo/publications/2019-icpp-denoyelle.pdf) - Nicolas Denoyelle, Brice Goglin, Emmanuel Jeannot, and Thomas Ropars. ICPP 2019.
- 6-pages [Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IR](https://iris.polito.it/retrieve/handle/11583/2726074/327896/document_post_print.pdf) - Francesco Barchi, Gianvito Urgese, Enrico Macii, and Andrea Acquaviva. DAC 2019.
- 10-pages [Adaptive optimization for OpenCL programs on embedded heterogeneous systems](https://core.ac.uk/download/pdf/83920402.pdf) - Ben Taylor, Vicent Sanz Marco, and Zheng Wang. LCTES 2017.
- 14-pages [Improving spark application throughput via memory aware task co-location: A mixture of experts approach](https://zwang4.github.io/publications/middleware17.pdf) - Vicent Sanz Marco, Ben Taylor, Barry Porter, and Zheng Wang. Middleware 2017.
- 10-pages [Smart multi-task scheduling for OpenCL programs on CPU/GPU heterogeneous platforms](http://www.lancaster.ac.uk/staff/wangz3/publications/hipc14.pdf) - Yuan Wen, Zheng Wang, and Michael FP O'Boyle. HiPC 2015.
- 17-pages [Quasar: resource-efficient and QoS-aware cluster management](http://csl.stanford.edu/~christos/publications/2014.quasar.asplos.pdf) - Christina Delimitrou and Christos Kozyrakis. ASPLOS 2014.
- 25-pages [Automatic and portable mapping of data parallel programs to OpenCL for GPU-based heterogeneous systems](https://zwang4.github.io/publications/zheng_taco_2015.pdf) - Zheng Wang, Dominik Grewe, and Michael O'Boyle. ACM Transactions on Architecture and Code Optimization (TACO), 2014.
- 26-pages [Integrating Profile-Driven Parallelism Detection and Machine-Learning-Based Mapping](https://zwang4.github.io/publications/taco14.pdf) - Zheng Wang, Georgios Tournavitis, Björn Franke, and Michael FP O'Boyle. ACM Transactions on Architecture and Code Optimization (TACO), 2014.
- 13-pages [Portable Performance on Heterogeneous Architectures](https://mangpo.net/papers/pbgpu-asplos13.pdf) - Phitchaya Mangpo Phothilimthana, Jason Ansel, Jonathan Ragan-Kelley, Saman Amarasinghe. ASPLOS 2013.
- 10-pages [Smart, adaptive mapping of parallelism in the presence of external workload](https://dl.acm.org/doi/abs/10.1109/CGO.2013.6495010) - Murali Krishna Emani, Zheng Wang, and Michael O'Boyle. CGO 2013.
- 12-pages [Partitioning streaming parallelism for multi-cores: a machine learning based approach](https://zwang4.github.io/publications/pact10.pdf) - Zheng Wang and Michael O'Boyle. PACT 2010.
- 11-pages [Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping](http://www.sphong.net/MICRO_2009.pdf) - Chi-Keung Luk, Sunpyo Hong, and Hyesoon Kim. MICRO 2009.
- 10-pages [Mapping parallelism to multi-cores: a machine learning based approach](http://llvm.org/pubs/2009-02-PPoPP-MappingParallelism.pdf) - Zheng Wang and Michael O'Boyle. PPoPP 2009.

#### Domain-specific Optimisation
- 10-pages [Seer: Predictive Runtime Kernel Selection for Irregular Problems](https://ieeexplore.ieee.org/abstract/document/10444812) - Ryan Swann, Muhammad Osama, Karthik Sangaiah, Jalal Mahmud. CGO 2024
- 18-pages [Tensor Program Optimization with Probabilistic Programs](https://arxiv.org/pdf/2205.13603.pdf) - Junru Shao, Xiyou Zhou, Siyuan Feng, Bohan Hou, Ruihang Lai, Hongyi Jin, Wuwei Lin, Masahiro Masuda, Cody Hao Yu, Tianqi Chen. NeurIPS 2022
- 11-pages [moTuner: a compiler-based auto-tuning approach for mixed-precision operators](https://dl.acm.org/doi/abs/10.1145/3528416.3530231) - Zewei Mo, Zejia Lin, Xianwei Zhang, Yutong Lu. CF 2022
- 12-pages [Collage: Automated Integration of Deep Learning Backends](https://dl.acm.org/doi/10.1145/3559009.3569651) - Byungsoo Jeon, Sunghyun Park, Peiyuan Liao, Sheng Xu, Tianqi Chen, Zhihao Jia. PACT 2022
- 15-pages [Learning Nonlinear Loop Invariants with Gated Continuous Logic Networks](https://www.cs.columbia.edu/~rgu/publications/pldi20-yao.pdf) - J. Yao, G. Ryan, J. Wong, S. Jana, and R. Gu. PLDI 2020.
- 16-pages [Learning-based Memory Allocation for C++ Server Workloads](https://www.cs.utexas.edu/users/mckinley/papers/llama-asplos-2020.pdf) - Martin Maas, David G. Andersen, Michael Isard, Mohammad Mahdi Javanmard, Kathryn S. McKinley, and Colin Raffel. ASPLOS 2020. [presentation](https://www.youtube.com/watch?v=gs8m5W-xdDM&feature=emb_title)
- 15-pages [Bridging the gap between deep learning and sparse matrix format selection](https://people.engr.ncsu.edu/xshen5/Publications/ppopp18.pdf) - Yue Zhao, Jiajia Li, Chunhua Liao and Xipeng Shen. PPoPP 2018.
- 10-pages [Camel: Smart, Adaptive Energy Optimization for Mobile Web Interactions](http://eprints.whiterose.ac.uk/155720/1/paper.pdf) - Jie Ren, Y. Lu, Petteri Nurmi, Xiaoming Wang, Miao Ma, Ling Gao, Zhanyong Tang, Jie Zheng, and Zheng Wang. INFOCOM 2020.
- 12-pages [Optimizing sorting with genetic algorithms](http://polaris.cs.uiuc.edu/~garzaran/doc/cgo05.pdf) - Xiaoming Li, Maria Jesus Garzaran, and David Padua. CGO 2005.

#### Languages and Compilation
- 74-pages [(De/Re)-Composition of Data-Parallel Computations via Multi-Dimensional Homomorphisms](https://dl.acm.org/doi/10.1145/3665643) - Ari Rasch, TOPLAS 2024.
- 12-pages [Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines](https://core.ac.uk/download/pdf/20024748.pdf) - Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe, PLDI 2013.
- 12-pages [PetaBricks: a language and compiler for algorithmic choice](http://people.csail.mit.edu/cychan/papers/2009pldi-petabricks.pdf) - Jason Ansel, Cy Chan, Yee Lok Wong, Marek Olszewski, Qin Zhao, Alan Edelman, and Saman Amarasinghe. PLDI 2009.
- 29-pages [Achieving High-performance the Functional Way: a Functional Pearl on Expressing High-performance Optimizations as Rewrite Strategies](https://dl.acm.org/doi/10.1145/3408974) - Bastian Hagedorn, Johannes Lenfers, Thomas Kœhler, Xueying Qin, Sergei Gorlatch, and Michel Steuwer. Proceedings of the ACM on Programming Languages 2020.

#### Code Size Reduction
- 15-pages [Learning Compiler Pass Orders using Coreset and Normalized Value Prediction](https://arxiv.org/pdf/2301.05104.pdf) - Youwei Liang, Kevin Stone, Ali Shameli, Chris Cummins, Mostafa Elhoushi, Jiadong Guo, Benoit Steiner, Xiaomeng Yang, Pengtao Xie, Hugh Leather, Yuandong Tian. ICML 2023.
- 11-pages [POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning](https://ieeexplore.ieee.org/document/9804673) - Shalini Jain, Yashas Andaluri, S. VenkataKeerthy, Ramakrishna Upadrasta. ISPASS 2022.
- 12-pages [Exploring the space of optimization sequences for code-size reduction: insights and tools](https://dl.acm.org/doi/10.1145/3446804.3446849) - Anderson Faustino da Silva, Bernardo N. B. de Lima, and Fernando Magno Quintao Pereira. CC 2021. [Code and Data](https://zenodo.org/record/4416117)
- 9-pages [Using machine learning to predict the code size impact of duplication heuristics in a dynamic compiler](https://dl.acm.org/doi/10.1145/3475738.3480943?sid=SCITRUS) - Raphael Mosaner, David Leopoldseder, Lukas Stadler, and Hanspeter Mössenböck. MPLR 2021.
- 13-pages [ANGHABENCH: a Suite with One Million Compilable C Benchmarks for Code-Size Reduction](https://homepages.dcc.ufmg.br/~fernando/publications/papers/FaustinoCGO21.pdf) - Anderson Faustino da Silva, Bruno Conde Kind, Jose Wesley de Souza Magalhaes, Jeronimo Nunes Rocha, Breno Campos Ferreira Guimaraes, Fernando Magno Quintao Pereira. CGO 2021. [Code and Data](http://cuda.dcc.ufmg.br/angha/home)
- 7-pages [Reinforcement Learning Guided Software Debloating](http://www.csl.sri.com/users/gehani/papers/MLSys-2019.DeepOCCAM.pdf) - Nham Le Van, Ashish Gehani, Arie Gurfinkel, Susmit Jha, and Jorge A. Navas. MLSys 2019.
- 9-pages [Optimizing for reduced code space using genetic algorithms](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.13.1586&rep=rep1&type=pdf) - Keith D. Cooper, Philip J. Schielke, and Devika Subramanian. LCTES 1999.

#### Cost and Performance Models
- 13-pages [TLP: A Deep Learning-Based Cost Model for Tensor Program Tuning](https://arxiv.org/pdf/2211.03578) - Yi Zhai, Yu Zhang, Shuo Liu, Xiaomeng Chu, Jie Peng, Jianmin Ji, Yanyong Zhang, ASPLOS, 2023.
- 13-pages [Performance-Detective: Automatic Deduction of Cheap and Accurate Performance Models](https://mcopik.github.io/assets/pdf/2022_ics_schmid_perf_detective.pdf) - Larissa Schmid, Marcin Copik, Alexandru Calotoiu, Dominik Werle, Andreas Reiter, Michael Selzer, Anne Koziolek, Torsten Hoefler, ICS, 2022.
- 14-pages [Neural Network-based Performance Prediction for Task Migration on S-NUCA Many-Cores](https://ieeexplore.ieee.org/document/9190026) - Martin Rapp, Anuj Pathania, Tulika Mitra, Jörg Henkel, IEEE Transactions on Computers, 2021.
- 13-pages [A Deep Learning Based Cost Model for Automatic Code Optimization](https://proceedings.mlsys.org/paper/2021/file/3def184ad8f4755ff269862ea77393dd-Paper.pdf) - Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham Leghettas, Kamel Abdous, Taha Arbaoui, Karima Benatchba, Saman Amarasinghe. MLSys 2021.
- 11-pages [Comparative Code Structure Analysis using Deep Learning for Performance Prediction](https://arxiv.org/abs/2102.07660) - Nathan Pinnow, Tarek Ramadan, Tanzima Z. Islam, Chase Phelps, Jayaraman J. Thiagarajan, ISPASS 2021
- 15-pages [Extracting Clean Performance Models from Tainted Programs](https://ieeexplore.ieee.org/document/9139798) - Marcin Copik, Alexandru Calotoiu, Tobias Grosser, Nicolas Wicki, Felix Wolf, Torsten Hoefler. PPoPP 2021.
- 15-pages [PMEvo: Portable Inference of Port Mappings for Out-of-Order Processors by Evolutionary Optimization](https://arxiv.org/pdf/2004.10044.pdf) - Fabian Ritter, Sebastian Hack. PLDI 2020.
- 10-pages [An Active Learning Method for Empirical Modeling in Performance Tuning](https://ieeexplore.ieee.org/document/9139798) - Jiepeng Zhang, Jingwei Sun, Wenju Zhou, Guangzhong Sun. IPDPS 2020.
- 12-pages [Learning to Optimize Halide with Tree Search and Random Programs](https://dl.acm.org/doi/pdf/10.1145/3306346.3322967) - Andrew Adams, Karima Ma, Luke Anderson, Riyadh Baghdadi, Tzu-Mao Li, Michael Gharbi, Benoit Steiner, Steven Johnson, Kayvon Fatahalian, Fredo Durand, Jonathan Ragan-Kelley. ACM Trans Graph, 2019.
- 11-pages [Ithemal: Accurate, portable and fast basic block throughput estimation using deep neural networks](http://proceedings.mlr.press/v97/mendis19a/mendis19a.pdf) - Charith Mendis, Alex Renda, Saman Amarasinghe, and Michael Carbin. ICML 2019.
- 13-pages [Absinthe: Learning an Analytical Performance Model to Fuse and Tile Stencil Codes in One Shot](http://unixer.de/publications/img/gysi-absinthe.pdf) - Tobias Gysi, Tobias Grosser, and Torsten Hoefler. PACT 2019.
- 22-pages [Predicting new workload or CPU performance by analyzing public datasets](https://yuemmawang.github.io/publications/wang-taco2019.pdf) - Yu Wang, Victor Lee, Gu-Yeon Wei, and David Brooks. ACM Transactions on Architecture and Code Optimization (TACO), 2019.
- 10-pages [Automatic creation of tile size selection models](http://people.rennes.inria.fr/Tomofumi.Yuki/papers/yuki-cgo2010.pdf) - Tomofumi Yuki, Lakshminarayanan Renganarayanan, Sanjay Rajopadhye, Charles Anderson, Alexandre E. Eichenberger, and Kevin O'Brien. CGO 2010.
- 13-pages [Microarchitecture sensitive empirical models for compiler optimizations](https://www.csa.iisc.ac.in/~srikant/papers-theses/kapil-CGO-2007.pdf) - Kapil Vaswani, Matthew J. Thazhuthaveetil, Y. N. Srikant, and P. J. Joseph. CGO 2007.
- 12-pages [Accurate static estimators for program optimization](https://dl.acm.org/doi/abs/10.1145/178243.178251) - Tim A. Wagner, Vance Maverick, Susan L. Graham, and Michael A. Harrison. PLDI 1994.

#### Learning Program Representation
- 13-pages [Performance Embeddings: A Similarity-Based Transfer Tuning Approach to Performance Optimization](https://dl.acm.org/doi/abs/10.1145/3533767.3534383) - L Trümper, T Ben-Nun, P Schaad, A Calotoiu, T Hoefler. ICS 2023.
- 13-pages [Improving cross-platform binary analysis using representation learning via graph alignment](https://dl.acm.org/doi/abs/10.1145/3533767.3534383) - Geunwoo Kim, Sanghyun Hong, Michael Franz, Dokyung Song. ISSTA 2022.
- 46-pages [Program Representations for Predictive Compilation: State of Affairs in the Early 20's](https://homepages.dcc.ufmg.br/~fernando/publications/papers/FaustinoJCL22.pdf) - Anderson Faustino da Silva, Edson Borin, Fernando Magno Quintao Pereira, Nilton Luiz Queiroz Junior and Otavio Oliveira Napoli. JCL 2022. [Code and Data](https://github.com/otavioon/COLA-2022-Tools)
- 11-pages [Comparative Code Structure Analysis using Deep Learning for Performance Prediction](https://arxiv.org/pdf/2102.07660.pdf) - Nathan Pinnow, Tarek Ramadan, Tanzima Z. Islam, Chase Phelps, Jayaraman J. Thiagarajan. ISPASS 2021.
- 18-pages [GraphCodeBERT: Pre-training Code Representations with Data Flow](https://arxiv.org/pdf/2009.08366.pdf) - Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou. ICLR 2021.
- 12-pages [CodeBERT: A Pre-Trained Model for Programming and Natural Languages](https://www.aclweb.org/anthology/2020.findings-emnlp.139.pdf) - Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou. EMNLP 2020.
- 27-pages [IR2VEC: LLVM IR Based Scalable Program Embeddings](https://dl.acm.org/doi/10.1145/3418463) - S. VenkataKeerthy, Rohit Aggarwal, Shalini Jain, Maunendra Sankar Desarkar, Ramakrishna Upadrasta and Y. N. Srikant. TACO 2020.
- 13-pages [Deep Program Structure Modeling Through Multi-Relational Graph-based Learning](https://zwang4.github.io/publications/pact20.pdf) - Guixin Ye, Zhanyong Tang, Huanting Wang, Jianbin Fang, Songfang Huang and Zheng Wang. PACT 2020.
- 12-pages [Global Relational Models of Source Code](https://openreview.net/pdf?id=B1lnbRNtwr) - Vincent J. Hellendoorn, Charles Sutton, Rishabh Singh, Petros Maniatis, David Bieber, ICLR 2020. ([Data and Code](https://github.com/VHellendoorn/ICLR20-Great))
- 45-pages [Learning Semantic Program Embeddings with Graph Interval Neural Network](https://arxiv.org/pdf/2005.09997.pdf) - Yu Wang, Ke Wang, Fengjuan Gao, and Linzhang Wang. OOPSLA 2020.
- 27-pages [Flow2Vec: Value-Flow-Based Precise Code Embedding](https://yuleisui.github.io/publications/oopsla20.pdf) - Yulei Sui, Xiao Cheng, Guanqin Zhang and Haoyu Wang. OOPSLA 2020.
- 23-pages [MISIM: An End-to-End Neural Code Similarity System](https://arxiv.org/pdf/2006.05265.pdf) - Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Nesime Tatbul, Jesmin Jahan Tithi, Paul Petersen, Timothy Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar and Justin Gottschlich. arXiv 2020.
- 14-pages [Blended, precise semantic program embeddings](https://dl.acm.org/doi/abs/10.1145/3385412.3385999) - Ke Wang and Zhendong Su. PLDI 2020.
- 11-pages [LambdaNet: Probabilistic Type Inference using Graph Neural Networks](https://arxiv.org/pdf/2005.02161.pdf) - Jiayi Wei, Maruth Goyal, Greg Durrett, and Isil Dillig. ICLR 2020.
- 11-pages [Compiler-based graph representations for deep learning models of code](https://cfaed.tu-dresden.de/files/Images/people/chair-cc/publications/2002_Brauckmann_CC.pdf) - Alexander Brauckmann, Andrés Goens, Sebastian Ertel, and Jeronimo Castrillon. CC 2020.
- 24-pages [Generative Code Modeling with Graphs](https://arxiv.org/pdf/1805.08490.pdf) - Marc Brockschmidt, Miltos Allamanis, Alexander L. Gaunt, and Oleksandr Polozov. ICLR 2019.
- 22-pages [code2seq: Generating sequences from structured representations of code](https://arxiv.org/pdf/1808.01400) - Uri Alon, Shaked Brody, Omer Levy, and Eran Yahav. ICLR 2019.
- 29-pages [code2vec: Learning distributed representations of code](http://www.cs.technion.ac.il/~mbs/publications/code2vec-popl19.pdf) - Uri Alon, Meital Zilberstein, Omer Levy, and Eran Yahav. POPL 2019.
- 10-pages [COSET: A Benchmark for Evaluating Neural Program Embeddings](https://arxiv.org/pdf/1905.11445.pdf) - Ke Wang, Mihai Christodorescu. arXiv 2019.
- 16-pages [Learning to Represent Programs with Graphs](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/11/programGraphs.pdf) - Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. ICLR 2018.
- 13-pages [Neural Code Comprehension: A Learnable Representation of Code Semantics](https://papers.nips.cc/paper/7617-neural-code-comprehension-a-learnable-representation-of-code-semantics.pdf) - Tal Ben-Nun, Alice Shoshana Jakobovits, and Torsten Hoefler. NeurIPS 2018.
- 14-pages [End-to-end deep learning of optimization heuristics](http://homepages.inf.ed.ac.uk/hleather/publications/2017-deepopt-pact.pdf) - Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather ([slides](https://speakerdeck.com/chriscummins/end-to-end-deep-learning-of-optimization-heuristics-pact-17)). PACT 2017.
- 6-pages [Semantic-aware program sampling](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/11/nips_2017.pdf) - Pratiksha Thaker, Daniel Tarlow, and Marc Brockschmidt. NeurIPS 2017.
- 20-pages [DeepCoder: Learning to write programs](https://www.microsoft.com/en-us/research/uploads/prod/2017/03/main.pdf) - Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, and Daniel Tarlow. ICLR 2017.
- 7-pages [Convolutional neural networks over tree structures for programming language processing](http://sei.pku.edu.cn/~zhanglu/Download/AAAI16.pdf) - Lili Mou, Ge Li, Lu Zhang, Tao Wang, and Zhi Jin. AAAI 2016.
- 10-pages [A Convolutional Attention Network for Extreme Summarization of Source Code](http://proceedings.mlr.press/v48/allamanis16.pdf) - Miltos Allamanis, Hao Peng, and Charles Sutton. ICML 2016.
- 9-pages [Structured Generative Models of Natural Source Code](http://proceedings.mlr.press/v32/maddison14.pdf) - Chris Maddison and Daniel Tarlow. ICML 2014.
- 11-pages [Using graph-based program characterization for predictive modeling](https://www.eecis.udel.edu/~cavazos/cgo-2012.pdf) - Eunjung Park, John Cavazos, and Marco A. Alvarez. CGO 2012.
- 11-pages [Automatic feature generation for machine learning based optimizing compilation](http://homepages.inf.ed.ac.uk/hleather/publications/2009_autofeatures_cgo.pdf) - Hugh Leather, Edwin Bonilla, and Michael O'Boyle. CGO 2009.
- 14-pages [A Game-Based Framework to Compare Program Classifiers and Evaders](https://homepages.dcc.ufmg.br/~fernando/publications/papers/CGO23_ThaisDamasio.pdf) - Thais Damasio, Michael Canesche, Vinicius Pacheco, Anderson Faustino da Silva, Marcus Botacin and Fernando Magno Quintao Pereira. CGO 2023. [Code and Data](https://zenodo.org/record/7374649)

#### Enabling ML in Compilers and Systems Optimisation
- 33-pages [Meta Large Language Model Compiler: Foundation Models of Compiler Optimization](https://arxiv.org/abs/2407.02524) - Chris Cummins, Volker Seeker, Dejan Grubisic, Baptiste Roziere, Jonas Gehring, Gabriel Synnaeve, Hugh Leather. arXiv 2024.
- 13-pages [The Next 700 ML-Enabled Compiler Optimizations](https://arxiv.org/pdf/2311.10800.pdf) - S. VenkataKeerthy, Siddharth Jain, Umesh Kalvakuntla, Pranav Sai Gorantla, Rajiv S Chitale, Eugene Brevdo, Albert Cohen, Mircea Trofin, Ramakrishna Upadrasta. CC 2024.
- 12-pages [BenchPress: A Deep Active Benchmark Generator](https://arxiv.org/pdf/2208.06555.pdf) - Foivos Tsimpourlas, Pavlos Petoumenos, Min Xu, Chris Cummins, Kim Hazelwood, Ajitha Rajan, Hugh Leather. PACT 2022 ([code](https://github.com/fivosts/BenchPress))
- 14-pages [Automating Reinforcement Learning Architecture Design for Code Optimization](https://zwang4.github.io/publications/cc22.pdf) - Huanting Wang, Zhanyong Tang, Cheng Zhang, Jiaqi Zhao, Chris Cummins, Hugh Leather, Zheng Wang. CC 2022 ([code](https://github.com/HuantWang/SUPERSONIC))
- 14-pages [Learning Semantic Representations to Verify Hardware Designs](https://proceedings.neurips.cc/paper/2021/file/c5aa65949d20f6b20e1a922c13d974e7-Paper.pdf) - Shobha Vasudevan, Wenjie (Joe) Jiang, David Bieber, Rishabh Singh, Hamid Shojaei, C. Richard Ho, Charles Sutton. NeurIPS 2021
- 43-pages [Composable and Modular Code Generation in MLIR: A Structured and Retargetable Approach to Tensor Compiler Construction](https://arxiv.org/pdf/2202.03293.pdf) - Nicolas Vasilache, Oleksandr Zinenko, Aart J.C. Bik, Mahesh Ravishankar, Thomas Raoux, Alexander Belyaev, Matthias Springer, Tobias Gysi, Diego Caballero, Stephan Herhut, Stella Laurenzo, Albert Cohen. arXiv 2022
- 12-pages [Deep NLP-based co-evolvement for synthesizing code analysis from natural language](https://dl.acm.org/doi/10.1145/3446804.3446852) - Zifan Nan, Hui Guan, Xipeng Shen, Chunhua Liao. CC 2021
- 12-pages [MLGO: a Machine Learning Guided Compiler Optimizations Framework](https://arxiv.org/abs/2101.04808) - Mircea Trofin, Yundi Qian, Eugene Brevdo, Zinan Lin, Krzysztof Choromanski, David Li. arXiv. [Code](https://github.com/google/ml-compiler-opt)
- 16-pages [Towards Better Understanding of Black-box Auto-tuning: A Comparative Analysis for Storage Systems](https://www.usenix.org/system/files/conference/atc18/atc18-cao.pdf) - Zhen Cao, Vasily Tarasov, Sachin Tiwari, and Erez Zadok. ATC 2018.
- 12-pages [Synthesizing Benchmarks for Predictive Modeling](https://www.pure.ed.ac.uk/ws/files/29479104/2017_cgo_1.pdf) - Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather ([slides](https://speakerdeck.com/chriscummins/synthesizing-benchmarks-for-predictive-modelling-cgo-17)). CGO 2017.
- 12-pages [Minimizing the cost of iterative compilation with active learning](http://homepages.inf.ed.ac.uk/hleather/publications/2017-minimitercomp-cgo.pdf) - William Ogilvie, Pavlos Petoumenos, Zheng Wang, and Hugh Leather. CGO 2017.
- 28-pages [VESPA: static profiling for binary optimization](https://dl.acm.org/doi/abs/10.1145/3485521) - Angelica Aparecida Moreira, Guilherme Ottoni, and Fernando Magno Quintao Pereira. OOPSLA 2021. [Code and Data](https://zenodo.org/record/5502310)
- 35-pages [Mapping Computations in Heterogeneous Multicore Systems with Statistical Regression on Program Inputs](https://homepages.dcc.ufmg.br/~fernando/publications/papers/JunioTECS21.pdf) - Junio Cezar Ribeiro Da Silva, Lorena Leao, Vinicius Petrucci, Abdoulaye Gamatie and Fernando Magno Quintao Pereira. TECS 2021.

#### Memory/Cache Modeling/Analysis
- 25-pages [Optimizing Memory Mapping Using Deep Reinforcement Learning](https://arxiv.org/pdf/2305.07440.pdf) - Pengming Wang, Mikita Sazanovich, Berkin Ilbeyi, Phitchaya Mangpo Phothilimthana, Manish Purohit, Han Yang Tay, Ngân Vũ, Miaosen Wang, Cosmin Paduraru, Edouard Leurent, Anton Zhernov, Julian Schrittwieser, Thomas Hubert, Robert Tung, Paula Kurylowicz, Kieran Milan, Oriol Vinyals, Daniel J. Mankowitz. arXiv 2023.
- 10-pages [Learning Memory Access Patterns](http://proceedings.mlr.press/v80/hashemi18a/hashemi18a.pdf) - Milad Hashemi, Kevin Swersky, Jamie A. Smith, Grant Ayers, Heiner Litz, Jichuan Chang, Christos Kozyrakis, Parthasarathy Ranganathan. ICML 2018
- 26-pages [Static Prediction of Silent Stores](https://dl.acm.org/doi/10.1145/3280848) - Fernando Magno Quintao Pereira, Guilherme Vieira Leobas and Abdoulaye Gamatie. TACO 2019. [Code and Data](https://www.lirmm.fr/continuum-project/pages/s3a.html)

## Books
- 118-pages [Automatic Tuning of Compilers Using Machine Learning](https://link.springer.com/book/10.1007/978-3-319-71489-9) - Amir H. Ashouri, Gianluca Palermo, John Cavazos, and Cristina Silvano. Springer 2018.
- 377-pages [Software Automatic Tuning - From Concepts to State-of-the-Art Results](https://www.springer.com/gp/book/9781441969347) - K Naono, K Teranishi, J Cavazos, and R Suda. Springer 2010.

## Talks and Tutorials
- Saman Amarasinghe, [Compiler 2.0: Using Machine Learning to Modernize Compiler Technology](https://www.youtube.com/watch?v=a1w_NKDVdkI). LCTES 2020.
- Amir Ashouri, [Compiler Autotuning using Machine Learning: A State-of-the-art Review](https://youtu.be/xNixKfDxDZE) ([slides](http://amirashouri.ca/resources/Amir_CompileAutotuning_Talk_2019_Google.pdf)). Polytechnic University of Milan 2018.

## Software
- [ML-Compiler-Bridge](https://github.com/IITH-Compilers/ML-Compiler-Bridge) - Library to interface Compilers and ML models for ML-Enabled Compiler Optimizations ([paper](https://arxiv.org/pdf/2311.10800.pdf)).
- [Supersonic](https://github.com/HuantWang/SUPERSONIC) - Automate reinforcement learning architecture design ([paper](https://zwang4.github.io/publications/cc22.pdf)).
- [CompilerGym](https://github.com/facebookresearch/CompilerGym) - Reinforcement learning environments for compiler optimizations ([paper](https://arxiv.org/pdf/2109.08267.pdf)).
- [CodeBERT](https://github.com/microsoft/CodeBERT) - Pre-trained DNN models for programming languages ([paper](https://arxiv.org/pdf/2002.08155.pdf)).
- [IR2Vec](https://github.com/IITH-Compilers/IR2Vec) - LLVM IR based program embeddings for machine learning ([paper](https://arxiv.org/pdf/1909.06228.pdf)).
- [programl](https://github.com/ChrisCummins/ProGraML) - LLVM and XLA IR program representation for machine learning ([paper](https://arxiv.org/pdf/2003.10536.pdf)).
- [NeuroVectorizer](https://github.com/intel/neuro-vectorizer) - Using deep reinforcement learning (RL) to predict optimal vectorization compiler pragmas ([paper](https://arxiv.org/pdf/1909.13639)).
- [TVM](https://tvm.apache.org/) - Open Deep Learning Compiler Stack for cpu, gpu and specialized accelerators ([paper](https://www.usenix.org/system/files/osdi18-chen.pdf); [slides](https://www.usenix.org/sites/default/files/conference/protected-files/osdi18_slides_chen.pdf)).
- [clgen](https://github.com/ChrisCummins/clgen) - Benchmark generator using LSTMs ([paper](https://chriscummins.cc/pub/2017-cgo.pdf); [slides](https://speakerdeck.com/chriscummins/synthesizing-benchmarks-for-predictive-modelling-cgo-17)).
- [COBAYN](https://github.com/amirjamez/COBAYN) - Compiler Autotuning using BNs ([paper](http://amirashouri.ca/resources/COBAYN-ashouri_taco16.pdf)).
- [OpenTuner](https://github.com/jansel/opentuner) - Framework for building domain-specific multi-objective program autotuners ([paper](http://groups.csail.mit.edu/commit/papers/2014/ansel-pact14-opentuner.pdf); [slides](http://groups.csail.mit.edu/commit/papers/2014/ansel-pact14-opentuner-slides.pdf))
- [ONNX-MLIR](http://onnx.ai/onnx-mlir/) - Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure ([paper](https://arxiv.org/pdf/2008.08272.pdf)).
- [IREE](https://github.com/openxla/iree) - A retargetable MLIR-based machine learning compiler and runtime toolkit.

## Benchmarks and Datasets
- [TenSet: A Large-scale Program Performance Dataset for Learned Tensor Compilers](https://github.com/tlc-pack/tenset) - A dataset of tensor program performance records for six commonly used hardware platforms ([paper](https://openreview.net/pdf?id=aIfp8kLuvc9)).
- [The Alberta Workloads for the SPEC CPU® 2017 Benchmark Suite](https://webdocs.cs.ualberta.ca/~amaral/AlbertaWorkloadsForSPECCPU2017/) - Additional workloads for the SPEC CPU2017 Benchmark Suite.
- [Project CodeNet](https://github.com/IBM/Project_CodeNet) - Code samples written in 50+ programming languages, annotated with info, such as code size, memory footprint, CPU run time, and status (acceptance/error types)
- [CodeXGLUE](https://github.com/microsoft/CodeXGLUE) - A Machine Learning Benchmark Dataset for Code Understanding and Generation ([paper](https://arxiv.org/pdf/2102.04664.pdf))
- [ANGHABENCH](http://cuda.dcc.ufmg.br/angha/benchmarks) - A suite with One Million Compilable C Benchmarks ([paper](https://homepages.dcc.ufmg.br/~fernando/publications/papers/FaustinoCGO21.pdf))
- [BHive](https://github.com/ithemal/bhive) - A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models ([paper](https://groups.csail.mit.edu/commit/papers/19/ithemal-measurement.pdf)).
- [cBench](https://ctuning.org/wiki/index.php/CTools:CBench) - 32 C benchmarks with datasets and driver scripts.
- [PolyBench](http://web.cs.ucla.edu/~pouchet/software/polybench/) - 30 Stencil and Linear-algebra benchmarks with datasets and driver scripts. See also: [GPU version](https://github.com/cavazos-lab/PolyBench-ACC), [pre-computed datasets](https://github.com/stefanocereda/polybench_data) ([paper](https://dl.acm.org/doi/abs/10.1145/3372799.3394361)).
- [DeepDataFlow](https://github.com/ChrisCummins/ProGraML/blob/master/programl/Documentation/DataflowDataset.md) - 469k LLVM-IR files and 8.6B data-flow analysis labels for classification ([paper](https://arxiv.org/pdf/2003.10536.pdf)).
- [devmap](https://github.com/ChrisCummins/paper-end2end-dl) - 650 OpenCL benchmark features and CPU/GPU classification labels ([paper](https://chriscummins.cc/pub/2017-pact.pdf); [slides](https://speakerdeck.com/chriscummins/end-to-end-deep-learning-of-optimization-heuristics-pact-17)).

## Conferences
- ACM [ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI](https://www.sigplan.org/Conferences/PLDI/)
- ACM [Architectural Support for Programming Languages and Operating Systems, ASPLOS](https://asplos-conference.org/)
- ACM [ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP](https://dl.acm.org/conference/ppopp)
- ACM/IEEE [International Symposium on Code Generation and Optimization, CGO](https://dl.acm.org/conference/cgo)
- ACM/IEEE [International Conference on Parallel Architectures and Compilation Techniques, PACT](https://dl.acm.org/conference/pact)
- ACM [Object-oriented Programming, Systems, Languages, and Applications, OOPSLA](http://www.sigplan.org/Conferences/OOPSLA/)
- ACM [International Conference on Compiler Construction, CC](https://conf.researchr.org/series/CC)
- ACM [International Conference on Supercomputing, ICS](http://dblp.uni-trier.de/db/conf/ics/)
- ACM [International Conference on High Performance and Embedded Architectures and Compilers, HiPEAC](http://dblp.uni-trier.de/db/conf/hipeac/)
- ACM [International Conference on Languages, Compilers and Tools for Embedded Systems, LCTES](http://dblp.uni-trier.de/db/conf/lctrts/)
- ACM [International Conference on Computing Frontiers, CF](http://dblp.uni-trier.de/db/conf/cf)
- IEEE [International Parallel and Distributed Processing Symposium, IPDPS](http://www.ipdps.org/)
- IEEE [International Conference for High Performance Computing, Networking, Storage, and Analysis, SC](http://supercomputing.org/)
- Workshop [Machine Learning and Programming Languages Workshop, MAPL](https://pldi20.sigplan.org/series/mapl)
- Workshop [Languages and Compilers for Parallel Computing, LCPC](https://dblp.org/db/conf/lcpc/index)
- Academic [International Conference on Learning Representations, ICLR](https://dblp1.uni-trier.de/db/conf/iclr/)
- Academic [Conference on Machine Learning and Systems, MLSys](https://mlsys.org/)

## Journals
- ACM [ACM Transactions on Architecture and Code Optimization, TACO](https://dl.acm.org/journal/taco)

## How to Contribute

See [Contribution Guidelines](CONTRIBUTING.md). TL;DR: send one of the [maintainers](MAINTAINERS) a [pull request](https://github.com/zwang4/awesome-machine-learning-in-compilers/pulls).