https://github.com/patternex/awesome-ml-for-threat-detection

A curated list of resources to deep dive into the intersection of applied machine learning and threat detection.

List: awesome-ml-for-threat-detection

applied-machine-learning awesome-list cybersecurity machine-learning machine-learning-operations machine-learning-systems papers threat-detection

[![Awesome](https://awesome.re/badge.svg)](https://awesome.re)
![](https://img.shields.io/github/stars/patternex/awesome-ml-for-threat-detection)

# Awesome ML for Threat Detection

A curated list of resources to deep dive into the intersection of applied machine learning and threat detection.

## Table of Contents

- [Threat detection papers](#threat-detection-papers)
- [Threat characterization papers](#threat-characterization-papers)
- [Machine learning systems and operationalization papers](#machine-learning-systems-and-operationalization-papers)
- [PatternEx papers](#patternex-papers)
- [Other machine learning for cybersecurity repos](#other-machine-learning-for-cybersecurity-repos)

### Threat detection papers
* **Malicious URL Detection using Machine Learning: A Survey**. Doyen Sahoo, Chenghao Liu and Steven C.H. Hoi. *arXiv, 2017*. [[PDF]](https://arxiv.org/pdf/1701.07179)
* **SoK: Applying Machine Learning in Security - A Survey**. Heju Jiang, Jasvir Nagra, Parvez Ahammad. *arXiv, 2016*. [[PDF]](https://arxiv.org/pdf/1611.03186)
* **Predicting Domain Generation Algorithms with Long Short-Term Memory Networks**. Jonathan Woodbridge, Hyrum S. Anderson, Anjum Ahuja and Daniel Grant. *arXiv, 2016*. [[PDF]](https://arxiv.org/pdf/1611.00791)
* **Network connectivity graph for malicious traffic dissection**. Enrico Bocchi, Luigi Grimaudo, Marco Mellia, Elena Baralis, Sabyasachi Saha, Stanislav Miskovic, Gaspar Modelo-Howard, Sung-Ju Lee. *24th International Conference on Computer Communication and Networks (ICCCN), 2015*. [[PDF]](https://iris.polito.it/retrieve/handle/11583/2625360/76615/connectivity_graph.pdf)
* **Detecting malicious domains via graph inference**. Pratyusa K. Manadhata, Sandeep Yadav, Prasad Rao, William Horne. *ACM Conference on Computer and Communications Security, 2014*. [[PDF]](https://link.springer.com/content/pdf/10.1007/978-3-319-11203-9_1.pdf)
* **Nazca: Detecting Malware Distribution in Large-Scale Networks**. Luca Invernizzi, Stanislav Miskovic, Ruben Torres, Sabyasachi Saha, Sung-ju Lee, Marco Mellia, Christopher Kruegel and Giovanni Vigna. *NDSS Symposium, 2014*. [[PDF]](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.438.2760&rep=rep1&type=pdf)
* **Machine learning for identifying botnet network traffic**. Matija Stevanovic and Jens Myrup Pedersen. *Aalborg University (Technical report), 2013*. [[PDF]](https://vbn.aau.dk/ws/portalfiles/portal/75720938/paper.pdf)
* **Survey on network‐based botnet detection methods**. Sebastián García, Alejandro Zunino and Marcelo Campo. *Security and Communication Networks, 2013*. [[PDF]](https://onlinelibrary.wiley.com/doi/pdf/10.1002/sec.800)
* **Detecting insider threats in a real corporate database of computer usage activity**. Ted E. Senator et al. *19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2013*. [[PDF]](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.480.1037&rep=rep1&type=pdf)
* **Botnet detection based on traffic behavior analysis and flow intervals**. David Zhao, Issa Traore, Bassam Sayed, Wei Lu, Sherif Saad, Ali Ghorbani, Dan Garant. *Computers & Security, 2013*. [[PDF]](https://www.researchgate.net/profile/Sherif_Saad/publication/259117704_Botnet_detection_based_on_traffic_behavior_analysis_and_flow_intervals/links/5a303435aca27271ec89f8e5/Botnet-detection-based-on-traffic-behavior-analysis-and-flow-intervals.pdf)

### Threat characterization papers
* **A Taxonomy of Network Threats and the Effect of Current Datasets on Intrusion Detection Systems**. Hanan Hindy, David Brosset, Ethan Bayne, Amar Seeam, Christos Tachtatzis, Robert Atkinson, Xavier Bellekens. *IEEE Access, 2020*. [[PDF]](https://ieeexplore.ieee.org/iel7/6287639/8948470/09108270.pdf)
* **A lustrum of malware network communication: Evolution and insights**. Chaz Lever, Platon Kotzias, Davide Balzarotti, Juan Caballero and Manos Antonakakis. *IEEE Symposium on Security and Privacy, 2017*. [[PDF]](http://www.ieee-security.org/TC/SP2017/papers/409.pdf)
* **A comprehensive measurement study of domain generating malware**. Daniel Plohmann, Khaled Yakdan, Michael Klatt, Johannes Bader, Elmar Gerhards-Padilla. *25th USENIX Security Symposium, 2016*. [[PDF]](https://www.usenix.org/system/files/conference/usenixsecurity16/sec16_paper_plohmann.pdf)
* **A Survey on Botnet Architectures, Detection and Defences**. Muhammad Mahmoud, Manjinder Nir and Ashraf Matrawy. *International Journal of Network Security, 2015*. [[PDF]](http://ijns.jalaxy.com.tw/contents/ijns-v17-n3/ijns-v17-n3.pdf#page=48)
* **Practical Comprehensive Bounds on Surreptitious Communication over DNS**. Vern Paxson, Mihai Christodorescu, Mobin Javed, Josyula Rao, Reiner Sailer, Douglas Lee Schales, Marc Ph. Stoecklin, Kurt Thomas, Wietse Venema and Nicholas Weaver. *22nd USENIX Security Symposium, 2013*. [[PDF]](https://www.usenix.org/system/files/conference/usenixsecurity13/sec13-paper_paxson.pdf)
* **Analysis of security data from a large computing organization**. A. Sharma, Z. Kalbarczyk, J. Barlow and R. Iyer. *IEEE/IFIP 41st International Conference on Dependable Systems & Networks (DSN), 2011*. [[PDF]](http://www.academia.edu/download/40319777/Analysis_of_security_data_from_a_large_c20151123-15766-14wy5bo.pdf)

### Machine learning systems and operationalization papers
* **A survey of methods for explaining black box models**. Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, Dino Pedreschi. *ACM Computing Surveys, 2018*. [[PDF]](https://dl.acm.org/doi/pdf/10.1145/3236009)
* **Hidden Technical Debt in Machine Learning Systems**. D. Sculley, Gary Holt, Daniel Golovin, Eugene Davydov, Todd Phillips, Dietmar Ebner, Vinay Chaudhary, Michael Young, Jean-François Crespo, Dan Dennison. *Advances in Neural Information Processing Systems (NIPS), 2015*. [[PDF]](http://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf)
* **Local Outlier Detection with Interpretation**. Xuan Hong Dang, Barbora Micenková, Ira Assent and Raymond T. Ng. *European Conference on Machine Learning and Knowledge Discovery in Databases, 2013*. [[PDF]](https://link.springer.com/content/pdf/10.1007/978-3-642-40994-3_20.pdf)
* **Interpreting and unifying outlier scores**. Hans-Peter Kriegel, Peer Kroger, Erich Schubert and Arthur Zimek. *SIAM International Conference on Data Mining, 2011*. [[PDF]](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.232.2719&rep=rep1&type=pdf)
* **Outside the Closed World: On Using Machine Learning for Network Intrusion Detection**. Robin Sommer and Vern Paxson. *IEEE Symposium on Security and Privacy, 2010*. [[PDF]](https://www.icir.org/robin/papers/oakland10-ml.pdf)
* **Converting output scores from outlier detection algorithms into probability estimates**. Jing Gao and Pang-ning Tan. *International Conference on Data Mining (ICDM), 2006*. [[PDF]](https://core.ac.uk/download/pdf/193238184.pdf)

### PatternEx papers
* **The Holy Grail of “Systems for Machine Learning”: Teaming humans and machine learning for detecting cyber threats**. Ignacio Arnaldo and Kalyan Veeramachaneni. *ACM SIGKDD Explorations Newsletter 21, 2019*. [[PDF]](https://www.kdd.org/exploration_files/5._CR_18._The_challenges_in_teaming_humans_-_Final.pdf)
* **Shooting the moving target: machine learning in cybersecurity**. Ankit Arun and Ignacio Arnaldo. *USENIX Conference on Operational Machine Learning (OpML), 2019*. [[PDF]](https://www.usenix.org/system/files/opml19papers-arun.pdf)
* **eX2: a framework for interactive anomaly detection**. Ignacio Arnaldo, Kalyan Veeramachaneni, Mei Lam. *Intelligent User Interfaces Workshops, 2019*. [[PDF]](http://ceur-ws.org/Vol-2327/IUI19WS-ESIDA-2.pdf)
* **Acquire, adapt, and anticipate: continuous learning to block malicious domains**. Ignacio Arnaldo, Ankit Arun, Sumeeth Kyathanahalli, Kalyan Veeramachaneni. *IEEE International Conference on Big Data, 2018*. [[IEEE Link]](https://ieeexplore.ieee.org/document/8622197)
* **Learning representations for log data in cybersecurity**. Ignacio Arnaldo, Alfredo Cuesta-Infante, Ankit Arun, Mei Lam, Costas Bassias and Kalyan Veeramachaneni. *International Conference on Cyber Security Cryptography and Machine Learning, 2017*. [[PDF]](https://dai.lids.mit.edu/wp-content/uploads/2018/02/2017_CSCML_Learning_log_representations_camera_ready_v2-3-1-1.pdf)
* **AI2: Training a Big Data Machine to Defend**. Kalyan Veeramachaneni, Ignacio Arnaldo, Vamsi Korrapati, Constantinos Bassias and Ke Li. *2nd IEEE International Conference on Big Data Security on Cloud, 2016*. [[PDF]](https://dai.lids.mit.edu/wp-content/uploads/2017/10/AI2_Paper.pdf)

### Other machine learning for cybersecurity repos
* [Awesome Machine Learning for Cyber Security](https://github.com/jivoi/awesome-ml-for-cybersecurity)
* [Awesome Machine Learning And Cybersecurity](https://github.com/mebiux/Awesome-ML-Cybersecurity)
* [Machine Learning for Cyber Security](https://github.com/wtsxDev/Machine-Learning-for-Cyber-Security)
* [Machine Learning and Cyber Security Resources](https://github.com/dleyanlin/Machine-Learning-and-Cyber-Security-Resources)

## Note

The initial intent was to create a repo pointing only to our own papers (PatternEx papers), but we thought it made sense to also include papers that shaped our understanding of this space. Enjoy!