awesome-machine-learning-in-compilers
Must read research papers and links to tools and datasets that are related to using machine learning for compilers and systems optimisation
https://github.com/zwang4/awesome-machine-learning-in-compilers
Papers
- Autotuning Search Space for Loop Transformations - Michael Kruse, Hal Finkel, Xingfu Wu. LLVM HPC Workshop, 2020.
- Discovering faster matrix multiplication algorithms with reinforcement learning - Fawzi, Alhussein, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov et al. Nature 2022
- code2seq: Generating sequences from structured representations of code - Uri Alon, Shaked Brody, Omer Levy, and Eran Yahav. ICLR 2019.
- A Taxonomy of ML for Systems Problems - Martin Maas, IEEE Micro, 2020
- Seer: Predictive Runtime Kernel Selection for Irregular Problems - Ryan Swann, Muhammad Osama, Karthik Sangaiah, Jalal Mahmud. CGO 2024
- Machine Learning in Compiler Optimisation - Zheng Wang and Michael O'Boyle, Proceedings of the IEEE, 2018
- A survey of machine learning for big code and naturalness - Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu, and Charles Sutton, ACM Computing Surveys (CSUR), 2018
- The Deep Learning Compiler: A Comprehensive Survey - Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian, IEEE Transactions on Parallel and Distributed Systems, 2021
- Efficient Compiler Autotuning via Bayesian Optimization - Junjie Chen, Ningxin Xu, Peiqi Chen, Hongyu Zhang. ICSE 2021.
- Customized Monte Carlo Tree Search for LLVM/Polly's Composable Loop Optimization Transformations - Jaehoon Koo, Prasanna Balaprakash, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall. arXiv, 2021.
- Static Neural Compiler Optimization via Deep Reinforcement Learning - Rahim Mammadli, Ali Jannesari, Felix Wolf. LLVM HPC Workshop, 2020.
- Autophase: Compiler phase-ordering for HLS with deep reinforcement learning - Ameer Haj-Ali, Qijing Huang, William Moses, John Xiang, Ion Stoica, Krste Asanovic, John Wawrzynek. MLSys 2020.
- Micomp: Mitigating the compiler phase-ordering problem using optimization sub-sequences and machine learning - Amir H. Ashouri, Andrea Bignoli, Gianluca Palermo, Cristina Silvano, Sameer Kulkarni, and John Cavazos. ACM Transactions on Architecture and Code Optimization (TACO) 2017.
- Learning to superoptimize programs - Rudy Bunel, Alban Desmaison, M. Pawan Kumar, Philip H.S. Torr, Pushmeet Kohli. ICLR 2017
- Mitigating the compiler optimization phase-ordering problem using machine learning - Sameer Kulkarni and John Cavazos. OOPSLA 2012.
- An evaluation of different modeling techniques for iterative compilation - Eunjung Park, Sameer Kulkarni, and John Cavazos. CASES 2011.
- Evaluating iterative optimization across 1000 datasets - Yang Chen, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Liang Peng, Olivier Temam, and Chengyong Wu. PLDI 2010
- Cole: compiler optimization level exploration - Kenneth Hoste and Lieven Eeckhout. CGO 2008.
- MILEPOST GCC: machine learning based research compiler - Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Ayal Zaks, Bilha Mendelson et al., 2008
- Evaluating heuristic optimization phase order search algorithms - J. W. Davidson, Gary S. Tyson, D. B. Whalley, and P. A. Kulkarni. CGO 2007.
- Rapidly selecting good compiler optimizations using performance counters - John Cavazos, Grigori Fursin, Felix Agakov, Edwin Bonilla, Michael FP O'Boyle, and Olivier Temam. CGO 2007.
- Method-specific dynamic compilation using logistic regression - John Cavazos and Michael FP O'Boyle. OOPSLA 2005.
- RL4ReAl: Reinforcement Learning for Register Allocation - S. VenkataKeerthy, Siddharth Jain, Anilava Kundu, Rohit Aggarwal, Albert Cohen, Ramakrishna Upadrasta. CC 2023.
- Reinforcement Learning assisted Loop Distribution for Locality and Vectorization - Shalini Jain, S. VenkataKeerthy, Rohit Aggarwal, Tharun Kumar Dangeti, Dibyendu Das, Ramakrishna Upadrasta. LLVM HPC Workshop 2022.
- AI Powered Compiler Techniques for DL Code Optimization - Sanket Tavarageri, Gagandeep Goyal, Sasikanth Avancha, Bharat Kaul, Ramakrishna Upadrasta. arXiv, 2021.
- VeGen: A Vectorizer Generator for SIMD and Beyond - Yishen Chen, Charith Mendis, Michael Carbin, Saman Amarasinghe. ASPLOS 2021.
- Deep Learning-based Hybrid Graph-Coloring Algorithm for Register Allocation - Dibyendu Das, Shahid Asghar Ahmad, Kumar Venkataramanan. LLVM HPC Workshop, 2020.
- NeuroVectorizer: end-to-end vectorization with deep reinforcement learning - Ameer Haj-Ali, Nesreen K. Ahmed, Ted Willke, Yakun Sophia Shao, Krste Asanovic, and Ion Stoica. CGO 2020.
- Unleashing the Power of Learning: An Enhanced Learning-Based Approach for Dynamic Binary Translation - Changheng Song, Wenwen Wang, Pen-Chung Yew, Antonia Zhai, Weihua Zhang. USENIX ATC 2019.
- Compiler Auto-Vectorization with Imitation Learning - Charith Mendis, Cambridge Yang, Yewen Pu, Saman P. Amarasinghe, Michael Carbin. NeurIPS 2019.
- Automatic tuning of inlining heuristics - John Cavazos and Michael O'Boyle. SC 2005.
- Inducing heuristics to decide whether to schedule - John Cavazos and J. Eliot B. Moss. PLDI 2003.
- Meta optimization: Improving compiler heuristics with machine learning - Mark Stephenson, Saman Amarasinghe, Martin Martin, and Una-May O'Reilly. PLDI 2003.
- Learning to schedule straight-line code - J. Eliot B. Moss, Paul E. Utgoff, John Cavazos, Doina Precup, Darko Stefanovic, Carla E. Brodley, and David Scheeff. NeurIPS 1998.
- BaCO: A Fast and Portable Bayesian Compiler Optimization Framework - Erik Hellsten, Artur Souza, Johannes Lenfers, Rubens Lacouture, Olivia Hsu, Adel Ejjeh, Fredrik Kjolstad, Michel Steuwer, Kunle Olukotun, Luigi Nardi. ASPLOS 2024.
- A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers - Phitchaya Mangpo Phothilimthana, Amit Sabne, Nikhil Sarda, Karthik Srinivasa Murthy, Yanqi Zhou, Christof Angermueller, Mike Burrows, Sudip Roy, Ketan Mandke, Rezsa Farahani, Yu Emma Wang, Berkin Ilbeyi, Blake Hechtman, Bjarke Roune, Shen Wang, Yuanzhong Xu, and Samuel J. Kaufman. PACT 2021.
- Value Learning for Throughput Optimization of Deep Neural Workloads - Benoit Steiner, Chris Cummins, Horace He, Hugh Leather. MLSys 2021.
- DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation - Minjia Zhang, Menghao Li, Chi Wang, Mingqin Li. ICLR 2021.
- Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning - Shauharda Khadka, Estelle Aflalo, Mattias Mardar, Avrech Ben-David, Santiago Miret, Shie Mannor, Tamir Hazan, Hanlin Tang, Somdeb Majumdar. ICLR 2021.
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation - Byung Hoon Ahn, Prannoy Pilligundla, Amir Yazdanbakhsh, Hadi Esmaeilzadeh. ICLR 2020.
- Ansor: Generating High-Performance Tensor Programs for Deep Learning - Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, Ion Stoica. OSDI 2020. ([slides](https://www.usenix.org/sites/default/files/conference/protected-files/osdi20_slides_zheng.pdf), [presentation](https://www.youtube.com/watch?v=A2hJ_Mj02zk))
- A Pattern Based Algorithmic Autotuner for Graph Processing on GPUs - Ke Meng, Jiajia Li, Guangming Tan, Ninghui Sun. PPoPP 2019.
- FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search - Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer. CVPR 2019.
- TVM: An automated end-to-end optimizing compiler for deep learning - Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan et al., OSDI 2018
- BOAT: Building auto-tuners with structured Bayesian optimization - Valentin Dalibard, Michael Schaarschmidt, and Eiko Yoneki, WWW 2017.
- Autotuning algorithmic choice for input sensitivity - Yufei Ding, Jason Ansel, Kalyan Veeramachaneni, Xipeng Shen, Una-May O'Reilly, and Saman Amarasinghe. PLDI 2015
- Fast: A fast stencil autotuning framework based on an optimal-solution space model - Yulong Luo, Guangming Tan, Zeyao Mo, and
- Reinforcement learning-based inter-and intra-application thermal optimization for lifetime improvement of multicore systems - Anup K Das, Rishad Ahmed Shafik, Geoff V Merrett, Bashir M Al-Hashimi, Akash Kumar, Bharadwaj Veeravalli. DAC 2014
- Taming parallel I/O complexity with auto-tuning - Babak Behzad, Huong Vu Thanh Luu, Joseph Huchette, Surendra Byna, Ruth Aydt, Quincey Koziol, and Marc Snir. SC 2013.
- A multi-objective auto-tuning framework for parallel codes - Herbert Jordan, Peter Thoman, Juan J. Durillo, Simone Pellegrini, Philipp Gschwandtner, Thomas Fahringer, and Hans Moritsch. SC 2012.
- Bandit-based optimization on graphs with application to library performance tuning - Frédéric De Mesmay, Arpad Rimmel, Yevgen Voronenko, and Markus Püschel. ICML 2009.
- Combining models and guided empirical search to optimize for multiple levels of the memory hierarchy - Chun Chen, Jacqueline Chame, and Mary Hall. CGO 2005
- Active harmony: towards automated performance tuning - Cristian Tapus, I-Hsin Chung, Jeffrey K. Hollingsworth. SC 2002
- Exploration of Convolutional Neural Network models for source code classification - Francesco Barchi, Emanuele Parisi, Gianvito Urgese, Elisa Ficarra, and Andrea Acquaviva. Engineering Applications of Artificial Intelligence, January 2021.
- Poise: Balancing thread-level parallelism and memory system performance in GPUs using machine learning - Saumay Dublish, Vijay Nagarajan, and Nigel Topham. HPCA 2019.
- Data and thread placement in NUMA architectures: A statistical learning approach - Nicolas Denoyelle, Brice Goglin, Emmanuel Jeannot, and Thomas Ropars. ICPP 2019.
- Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IR - Francesco Barchi, Gianvito Urgese, Enrico Macii, and Andrea Acquaviva. DAC 2019.
- Adaptive optimization for OpenCL programs on embedded heterogeneous systems - Ben Taylor, Vicent Sanz Marco, and Zheng Wang. LCTES 2017.
- Improving spark application throughput via memory aware task co-location: A mixture of experts approach - Vicent Sanz Marco, Ben Taylor, Barry Porter, and Zheng Wang. Middleware 2017.
- Smart multi-task scheduling for OpenCL programs on CPU/GPU heterogeneous platforms - Yuan Wen, Zheng Wang, and Michael FP O'Boyle. HiPC 2015.
- Quasar: resource-efficient and QoS-aware cluster management - Christina Delimitrou, and Christos Kozyrakis. ASPLOS 2014.
- Automatic and portable mapping of data parallel programs to OpenCL for GPU-based heterogeneous systems - Zheng Wang, Dominik Grewe, and Michael O'Boyle. ACM Transactions on Architecture and Code Optimization (TACO), 2014.
- Portable Performance on Heterogeneous Architectures - Phitchaya Mangpo Phothilimthana, Jason Ansel, Jonathan Ragan-Kelley, Saman Amarasinghe. ASPLOS 2013.
- Partitioning streaming parallelism for multi-cores: a machine learning based approach - Zheng Wang and Michael O'Boyle. PACT 2010.
- Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping - Chi-Keung Luk, Sunpyo Hong, and Hyesoon Kim. MICRO 2009.
- Mapping parallelism to multi-cores: a machine learning based approach - Zheng Wang and Michael O'Boyle. PPoPP 2009.
- Tensor Program Optimization with Probabilistic Programs - Junru Shao, Xiyou Zhou, Siyuan Feng, Bohan Hou, Ruihang Lai, Hongyi Jin, Wuwei Lin, Masahiro Masuda, Cody Hao Yu, Tianqi Chen. NeurIPS 2022
- Learning Nonlinear Loop Invariants with Gated Continuous Logic Networks - J. Yao, G. Ryan, J. Wong, S. Jana, and R. Gu. PLDI 2020.
- Learning-based Memory Allocation for C++ Server Workloads - Maas, Martin, David G. Andersen, Michael Isard, Mohammad Mahdi Javanmard, Kathryn S. McKinley, and Colin Raffel. ASPLOS 2020. [presentation](https://www.youtube.com/watch?v=gs8m5W-xdDM&feature=emb_title)
- Bridging the gap between deep learning and sparse matrix format selection - Yue Zhao, Jiajia Li, Chunhua Liao and Xipeng Shen. PPoPP 2018.
- Camel: Smart, Adaptive Energy Optimization for Mobile Web Interactions - Jie Ren, Y. Lu, Petteri Nurmi, Xiaoming Wang, Miao Ma, Ling Gao, Zhanyong Tang, Jie Zheng, and Zheng Wang. INFOCOM 2020.
- Optimizing sorting with genetic algorithms - Xiaoming Li, Maria Jesus Garzaran, and David Padua. CGO 2005.
- PetaBricks: a language and compiler for algorithmic choice - Jason Ansel, Cy Chan, Yee Lok Wong, Marek Olszewski, Qin Zhao, Alan Edelman, and Saman Amarasinghe. PLDI 2009.
- Learning Compiler Pass Orders using Coreset and Normalized Value Prediction - Youwei Liang, Kevin Stone, Ali Shameli, Chris Cummins, Mostafa Elhoushi, Jiadong Guo, Benoit Steiner, Xiaomeng Yang, Pengtao Xie, Hugh Leather, Yuandong Tian. ICML 2023.
- POSET-RL: Phase ordering for Optimizing Size and Execution Time using Reinforcement Learning - Shalini Jain, Yashas Andaluri, S. VenkataKeerthy, Ramakrishna Upadrasta. ISPASS 2022.
- MLGO: a Machine Learning Guided Compiler Optimizations Framework - Mircea Trofin, Yundi Qian, Eugene Brevdo, Zinan Lin, Krzysztof Choromanski, David Li. arXiv. [Code](https://github.com/google/ml-compiler-opt)
- - Anderson Faustino da Silva, Bruno Conde Kind, Jose Wesley de Souza Magalhaes, Jeronimo Nunes Rocha, Breno Campos Ferreira Guimaraes, Fernando Magno Quintao Pereira. CGO 2021. [Code and Data](http://cuda.dcc.ufmg.br/angha/home)
- Reinforcement Learning Guided Software Debloating - Nham Le Van, Ashish Gehani, Arie Gurfinkel, Susmit Jha, and Jorge A. Navas. MLSys 2019.
- Optimizing for reduced code space using genetic algorithms - Keith D. Cooper, Philip J. Schielke, and Devika Subramanian. LCTES 1999.
- TLP: A Deep Learning-Based Cost Model for Tensor Program Tuning - Yi Zhai, Yu Zhang, Shuo Liu, Xiaomeng Chu, Jie Peng, Jianmin Ji, Yanyong Zhang, ASPLOS, 2023.
- Performance-Detective: Automatic Deduction of Cheap and Accurate Performance Models - Larissa Schmid, Marcin Copik, Alexandru Calotoiu, Dominik Werle, Andreas Reiter, Michael Selzer, Anne Koziolek, Torsten Hoefler, ICS, 2022.
- A Deep Learning Based Cost Model for Automatic Code Optimization - Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham Leghettas, Kamel Abdous, Taha Arbaoui, Karima Benatchba, Saman Amarasinghe, MLSys 2021
- Comparative Code Structure Analysis using Deep Learning for Performance Prediction - Nathan Pinnow, Tarek Ramadan, Tanzima Z. Islam, Chase Phelps, Jayaraman J. Thiagarajan, ISPASS 2021
- PMEvo: Portable Inference of Port Mappings for Out-of-Order Processors by Evolutionary Optimization - Fabian Ritter, Sebastian Hack. PLDI 2020.
- Ithemal: Accurate, portable and fast basic block throughput estimation using deep neural networks - Charith Mendis, Alex Renda, Saman Amarasinghe, and Michael Carbin. ICML 2019.
- Absinthe: Learning an Analytical Performance Model to Fuse and Tile Stencil Codes in One Shot - Tobias Gysi, Tobias Grosser, and Torsten Hoefler. PACT 2019.
- Predicting new workload or CPU performance by analyzing public datasets - Yu Wang, Victor Lee, Gu-Yeon Wei, and David Brooks. ACM Transactions on Architecture and Code Optimization (TACO), 2019.
- Automatic creation of tile size selection models - Tomofumi Yuki, Lakshminarayanan Renganarayanan, Sanjay Rajopadhye, Charles Anderson, Alexandre E. Eichenberger, and Kevin O'Brien. CGO 2010.
- Microarchitecture sensitive empirical models for compiler optimizations - Kapil Vaswani, Matthew J. Thazhuthaveetil, Y. N. Srikant, and P. J. Joseph. CGO 2007.
- Program Representations for Predictive Compilation: State of Affairs in the Early 20's - Anderson Faustino da Silva, Edson Borin, Fernando Magno Quintao Pereira, Nilton Luiz Queiroz Junior and Otavio Oliveira Napoli. JCL 2022. [Code and Data](https://github.com/otavioon/COLA-2022-Tools)
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages - Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou. EMNLP 2020.
- Deep Program Structure Modeling Through Multi-Relational Graph-based Learning - Guixin Ye, Zhanyong Tang, Huanting Wang, Jianbin Fang, Songfang Huang and Zheng Wang. PACT 2020.
- Global Relational Models of Source Code - Vincent J. Hellendoorn, Charles Sutton, Rishabh Singh, Petros Maniatis, David Bieber, ICLR 2020. ([Data and Code](https://github.com/VHellendoorn/ICLR20-Great))
- Learning Semantic Program Embeddings with Graph Interval Neural Network - Yu Wang, Ke Wang, Fengjuan Gao, and Linzhang Wang. OOPSLA 2020.
- Flow2Vec: Value-Flow-Based Precise Code Embedding - Yulei Sui, Xiao Cheng, Guanqin Zhang and Haoyu Wang. OOPSLA 2020.
- MISIM: An End-to-End Neural Code Similarity System - Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Nesime Tatbul, Jesmin Jahan Tithi, Paul Petersen, Timothy Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar and Justin Gottschlich. arXiv 2020.
- LambdaNet: Probabilistic Type Inference using Graph Neural Networks - Jiayi Wei, Maruth Goyal, Greg Durrett, and Isil Dillig. ICLR 2020.
- Generative Code Modeling with Graphs - Marc Brockschmidt, Miltos Allamanis, Alexander L. Gaunt, and Oleksandr Polozov. ICLR 2019.
- code2vec: Learning distributed representations of code - Uri Alon, Meital Zilberstein, Omer Levy, and Eran Yahav. POPL 2019.
- COSET: A Benchmark for Evaluating Neural Program Embeddings - Ke Wang, Mihai Christodorescu. arXiv 2019.
- Learning to Represent Programs with Graphs - Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. ICLR 2018.
- Neural Code Comprehension: A Learnable Representation of Code Semantics - Tal Ben-Nun, Alice Shoshana Jakobovits, and Torsten Hoefler. NeurIPS 2018.
- End-to-end deep learning of optimization heuristics - Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather ([slides](https://speakerdeck.com/chriscummins/end-to-end-deep-learning-of-optimization-heuristics-pact-17)). PACT 2017.
- Semantic-aware program sampling - Pratiksha Thaker, Daniel Tarlow, and Marc Brockschmidt. NeurIPS 2017.
- DeepCoder: Learning to write programs - Matej Balog, Alexander L. Gaunt, Marc Brockschmidt,
- Convolutional neural networks over tree structures for programming language processing - Lili Mou, Ge Li, Lu Zhang, Tao Wang, and Zhi Jin. AAAI 2016.
- Structured Generative Models of Natural Source Code - Chris Maddison and Daniel Tarlow. ICML 2014.
- Using graph-based program characterization for predictive modeling - Eunjung Park, John Cavazos, and Marco A. Alvarez. CGO 2011.
- Automatic feature generation for machine learning based optimizing compilation - Hugh Leather, Edwin Bonilla, and Michael O'Boyle. CGO 2009.
- A Game-Based Framework to Compare Program Classifiers and Evaders - Thais Damasio, Michael Canesche, Vinicius Pacheco, Anderson Faustino da Silva, Marcus Botacin and Fernando Magno Quintao Pereira. CGO 2023. [Code and Data](https://zenodo.org/record/7374649)
- - Foivos Tsimpourlas, Pavlos Petoumenos, Min Xu, Chris Cummins, Kim Hazelwood, Ajitha Rajan, Hugh Leather. PACT 2022 ([code](https://github.com/fivosts/BenchPress))
- Automating Reinforcement Learning Architecture Design for Code Optimization - Huanting Wang, Zhanyong Tang, Cheng Zhang, Jiaqi Zhao, Chris Cummins, Hugh Leather, Zheng Wang. CC 2022 ([code](https://github.com/HuantWang/SUPERSONIC))
- Composable and Modular Code Generation in MLIR: A Structured and Retargetable Approach to Tensor Compiler Construction - Nicolas Vasilache, Oleksandr Zinenko, Aart J.C. Bik, Mahesh Ravishankar, Thomas Raoux, Alexander Belyaev, Matthias Springer, Tobias Gysi, Diego Caballero, Stephan Herhut, Stella Laurenzo, Albert Cohen. arXiv 2022
- Towards Better Understanding of Black-box Auto-tuning: A Comparative Analysis for Storage Systems - Zhen Cao, Vasily Tarasov, Sachin Tiwari, and Erez Zadok. ATC 2018.
- Synthesizing Benchmarks for Predictive Modeling - Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather ([slides](https://speakerdeck.com/chriscummins/synthesizing-benchmarks-for-predictive-modelling-cgo-17)). CGO 2017.
- Minimizing the cost of iterative compilation with active learning - William Ogilvie, Pavlos Petoumenos, Zheng Wang, and Hugh Leather. CGO 2017.
- Mapping Computations in Heterogeneous Multicore Systems with Statistical Regression on Program Inputs - Junio Cezar Ribeiro Da Silva, Lorena Leao, Vinicius Petrucci, Abdoulaye Gamatie and Fernando Magno Quintao Pereira. TECS 2021.
- A Collaborative Filtering Approach for the Automatic Tuning of Compiler Optimisations - Stefano Cereda, Gianluca Palermo, Paolo Cremonesi, and Stefano Doni, LCTES 2020.
- Neural Network-based Performance Prediction for Task Migration on S-NUCA Many-Cores - Martin Rapp, Anuj Pathania, Tulika Mitra, Jörg Henkel, IEEE Transactions on Computers, 2021.
- GraphCodeBERT: Pre-training Code Representations with Data Flow - Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou. ICLR 2021.
- (De/Re)-Compositions Expressed Systematically via MDH-Based Schedules - Ari Rasch, Richard Schulze, Denys Shabalin, Anne Elster, Sergei Gorlatch, Mary Hall. CC 2023.
- The Next 700 ML-Enabled Compiler Optimizations - S. VenkataKeerthy, Siddharth Jain, Umesh Kalvakuntla, Pranav Sai Gorantla, Rajiv S Chitale, Eugene Brevdo, Albert Cohen, Mircea Trofin, Ramakrishna Upadrasta. CC 2024.
- Iterative Compilation Optimization Based on Metric Learning and Collaborative Filtering - Hongzhi Liu, Jie Luo, Ying Li, Zhonghai Wu. ACM TACO 2022.
- Revealing Compiler Heuristics through Automated Discovery and Optimization - Volker Seeker, Chris Cummins, Murray Cole, Björn Franke, Kim Hazelwood, Hugh Leather. CGO 2024.
- Using machine learning to focus iterative optimization - Felix Agakov, Edwin Bonilla, John Cavazos, Björn Franke, Grigori Fursin, Michael FP O'Boyle, John Thomson, Marc Toussaint, and Christopher KI Williams. CGO 2006.
- Iterative optimization in the polyhedral model: Part II, multidimensional time - Louis-Noël Pouchet, Cédric Bastoul, Albert Cohen, and John Cavazos. PLDI 2008.
- Compiler-based graph representations for deep learning models of code - Alexander Brauckmann, Andrés Goens, Sebastian Ertel, and Jeronimo Castrillon. CC 2020.
- Meta Large Language Model Compiler: Foundation Models of Compiler Optimization - Chris Cummins, Volker Seeker, Dejan Grubisic, Baptiste Roziere, Jonas Gehring, Gabriel Synnaeve, Hugh Leather. arXiv 2024.
- Fast searches for effective optimization phase sequences - Prasad Kulkarni, Stephen Hines, Jason Hiser, David Whalley, Jack Davidson, and Douglas Jones. PLDI 2004.
- A Reinforcement Learning Environment for Polyhedral Optimizations - Alexander Brauckmann, Andrés Goens, Jeronimo Castrillon. PACT, 2021.
- Integrating Profile-Driven Parallelism Detection and Machine-Learning-Based Mapping - Zheng Wang, Georgios Tournavitis, Björn Franke, and Michael FP O'Boyle. ACM Transactions on Architecture and Code Optimization (TACO), 2014.
- Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines - Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe, PLDI 2013.
- Transfer-Tuning: Reusing Auto-Schedules for Efficient Tensor Program Code Generation - Perry Gibson, Jose Cano. PACT 2022.
- An Active Learning Method for Empirical Modeling in Performance Tuning - Jiepeng Zhang, Jingwei Sun, Wenju Zhou, Guangzhong Sun. IPDPS 2020.
- Continuous learning of compiler heuristics - Michele Tartara and Stefano Crespi Reghizzi. ACM Transactions on Architecture and Code Optimization (TACO), 2013.
- Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models - RB Roy, T Patel, V Gadepally, D Tiwari. PLDI 2021.
- OpenTuner: An extensible framework for program autotuning - Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, and Saman Amarasinghe. PACT 2014.
- Accelerated Auto-Tuning of GPU Kernels for Tensor Computations - Chendi Li and Yufan Xu and Sina Mahdipour Saravani and P. Sadayappan. ICS 2024.
- Autotuning Search Space for Loop Transformations - Michael Kruse, Hal Finkel, Xingfu Wu. LLVM HPC Workshop, 2020.
Memory/Cache Modeling/Analysis
- Learning Memory Access Patterns - Milad Hashemi, Kevin Swersky, Jamie A. Smith, Grant Ayers, Heiner Litz, Jichuan Chang, Christos Kozyrakis, Parthasarathy Ranganathan. ICML 2018
- Optimizing Memory Mapping Using Deep Reinforcement Learning - Pengming Wang, Mikita Sazanovich, Berkin Ilbeyi, Phitchaya Mangpo Phothilimthana, Manish Purohit, Han Yang Tay, Ngân Vũ, Miaosen Wang, Cosmin Paduraru, Edouard Leurent, Anton Zhernov, Julian Schrittwieser, Thomas Hubert, Robert Tung, Paula Kurylowicz, Kieran Milan, Oriol Vinyals, Daniel J. Mankowitz. arXiv 2023.
Books
- Automatic Tuning of Compilers Using Machine Learning - Amir H. Ashouri, Gianluca Palermo, John Cavazos, and Cristina Silvano. Springer 2018.
- Software Automatic Tuning - From Concepts to State-of-the-Art Results - K Naono, K Teranishi, J Cavazos, and R Suda. Springer 2010.
Talks and Tutorials
Software
- TVM - Open Deep Learning Compiler Stack for cpu, gpu and specialized accelerators ([paper](https://www.usenix.org/system/files/osdi18-chen.pdf); [slides](https://www.usenix.org/sites/default/files/conference/protected-files/osdi18_slides_chen.pdf)).
- ONNX-MLIR - Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure ([paper](https://arxiv.org/pdf/2008.08272.pdf)).
Benchmarks and Datasets
- AnghaBench - A suite with One Million Compilable C Benchmarks ([paper](https://homepages.dcc.ufmg.br/~fernando/publications/papers/FaustinoCGO21.pdf))
- cBench - 32 C benchmarks with datasets and driver scripts.
- PolyBench - 30 Stencil and Linear-algebra benchmarks with datasets and driver scripts. See also: [GPU version](https://github.com/cavazos-lab/PolyBench-ACC), [pre-computed datasets](https://github.com/stefanocereda/polybench_data) ([paper](https://dl.acm.org/doi/abs/10.1145/3372799.3394361)).
- DeepDataFlow - 469k LLVM-IR files and 8.6B data-flow analysis labels for classification ([paper](https://arxiv.org/pdf/2003.10536.pdf)).
- The Alberta Workloads for the SPEC CPU® 2017 Benchmark Suite - Additional workloads for the SPEC CPU2017 Benchmark Suite.
Conferences
- ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI
- Architectural Support for Programming Languages and Operating Systems, ASPLOS
- Object-oriented Programming, Systems, Languages, and Applications, OOPSLA
- International Conference on Compiler Construction, CC
- International Conference on Supercomputing, ICS
- International Conference on High Performance and Embedded Architectures and Compilers, HiPEAC
- International Conference on Languages, Compilers and Tools for Embedded Systems, LCTES
- International Conference on Computing Frontiers, CF
- International Parallel and Distributed Processing Symposium, IPDPS
- Machine Learning and Programming Languages Workshop, MAPL
- Languages and Compilers for Parallel Computing, LCPC
- International Conference on Learning Representations, ICLR
- Conference on Machine Learning and Systems, MLSys
- IEEE/ACM International Symposium on Microarchitecture, MICRO
- International Conference on Compilers, Architectures, and Synthesis for Embedded Systems, CASES
- USENIX Annual Technical Conference, ATC
- USENIX Symposium on Operating Systems Design and Implementation, OSDI
- International Conference on High Performance Computing, Data and Analytics, HiPC
- International Conference on Virtual Execution Environments, VEE
- European Conference on Computer Systems, EuroSys
- ACM Symposium on Parallelism in Algorithms and Architectures, SPAA
- International Conference on Parallel Processing, ICPP
- International Middleware Conference, Middleware
- European Conference on Parallel Processing, Euro-Par
Journals