Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/eric-erki/awesome-ml-for-cybersecurity

Machine Learning for Cyber Security
https://github.com/eric-erki/awesome-ml-for-cybersecurity

List: awesome-ml-for-cybersecurity

Last synced: 3 months ago
JSON representation

Machine Learning for Cyber Security

Awesome Lists containing this project

README

        

# Awesome Machine Learning for Cyber Security [![Awesom](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)

[](https://github.com/jivoi/awesome-ml-for-cybersecurity)

A curated list of amazingly awesome tools and resources related to the use of machine learning for cyber security.

## Table of Contents

- [Datasets](#-datasets)
- [Papers](#-papers)
- [Books](#-books)
- [Talks](#-talks)
- [Tutorials](#-tutorials)
- [Courses](#-courses)
- [Miscellaneous](#-miscellaneous)

## [↑](#table-of-contents) Contributing

Please read [CONTRIBUTING](./CONTRIBUTING.md) if you wish to add tools or resources.

## [↑](#table-of-contents) Datasets

* [Samples of Security Related Data](http://www.secrepo.com/)
* [DARPA Intrusion Detection Data Sets](https://www.ll.mit.edu/r-d/datasets) [ [1998](https://www.ll.mit.edu/r-d/datasets/1998-darpa-intrusion-detection-evaluation-dataset) / [1999](https://www.ll.mit.edu/r-d/datasets/1999-darpa-intrusion-detection-evaluation-dataset) ]
* [Stratosphere IPS Data Sets](https://stratosphereips.org/category/dataset.html)
* [Open Data Sets](http://csr.lanl.gov/data/)
* [Data Capture from National Security Agency](http://www.westpoint.edu/crc/SitePages/DataSets.aspx)
* [The ADFA Intrusion Detection Data Sets](https://www.unsw.adfa.edu.au/australian-centre-for-cyber-security/cybersecurity/ADFA-IDS-Datasets/)
* [NSL-KDD Data Sets](https://github.com/defcom17/NSL_KDD)
* [Malicious URLs Data Sets](http://sysnet.ucsd.edu/projects/url/)
* [Multi-Source Cyber-Security Events](http://csr.lanl.gov/data/cyber1/)
* [KDD Cup 1999 Data](http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html)
* [Web Attack Payloads](https://github.com/foospidy/payloads)
* [WAF Malicious Queries Data Sets](https://github.com/faizann24/Fwaf-Machine-Learning-driven-Web-Application-Firewall)
* [Malware Training Data Sets](https://github.com/marcoramilli/MalwareTrainingSets)
* [Aktaion Data Sets](https://github.com/jzadeh/Aktaion/tree/master/data)
* [CRIME Database from DeepEnd Research](https://www.dropbox.com/sh/7fo4efxhpenexqp/AADHnRKtL6qdzCdRlPmJpS8Aa/CRIME?dl=0)
* [Publicly available PCAP files](http://www.netresec.com/?page=PcapFiles)
* [2007 TREC Public Spam Corpus](https://plg.uwaterloo.ca/~gvcormac/treccorpus07/)
* [Drebin Android Malware Dataset](https://www.sec.cs.tu-bs.de/~danarp/drebin/)
* [PhishingCorpus Datset](https://monkey.org/~jose/phishing/)
* [EMBER](https://github.com/endgameinc/ember)
* [Vizsec Research](https://vizsec.org/data/)
* [SHERLOCK](http://bigdata.ise.bgu.ac.il/sherlock/index.html#/)
* [Probing / Port Scan - Dataset ](https://github.com/gubertoli/ProbingDataset)
* [Aegean Wireless Intrusion Dataset (AWID)](http://icsdweb.aegean.gr/awid/)

## [↑](#table-of-contents) Papers

* [Fast, Lean, and Accurate: Modeling Password Guessability Using Neural Networks](https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/melicher)
* [Outside the Closed World: On Using Machine Learning for Network Intrusion Detection](http://ieeexplore.ieee.org/document/5504793/?reload=true)
* [Anomalous Payload-Based Network Intrusion Detection](https://link.springer.com/chapter/10.1007/978-3-540-30143-1_11)
* [Malicious PDF detection using metadata and structural features](http://dl.acm.org/citation.cfm?id=2420987)
* [Adversarial support vector machine learning](https://dl.acm.org/citation.cfm?id=2339697)
* [Exploiting machine learning to subvert your spam filter](https://dl.acm.org/citation.cfm?id=1387709.1387716)
* [CAMP – Content Agnostic Malware Protection](http://www.covert.io/research-papers/security/CAMP%20-%20Content%20Agnostic%20Malware%20Protection.pdf)
* [Notos – Building a Dynamic Reputation System for DNS](http://www.covert.io/research-papers/security/Notos%20-%20Building%20a%20dynamic%20reputation%20system%20for%20dns.pdf)
* [Kopis – Detecting malware domains at the upper dns hierarchy](http://www.covert.io/research-papers/security/Kopis%20-%20Detecting%20malware%20domains%20at%20the%20upper%20dns%20hierarchy.pdf)
* [Pleiades – From Throw-away Traffic To Bots – Detecting The Rise Of DGA-based Malware](http://www.covert.io/research-papers/security/From%20throw-away%20traffic%20to%20bots%20-%20detecting%20the%20rise%20of%20dga-based%20malware.pdf)
* [EXPOSURE – Finding Malicious Domains Using Passive DNS Analysis](http://www.covert.io/research-papers/security/Exposure%20-%20Finding%20malicious%20domains%20using%20passive%20dns%20analysis.pdf)
* [Polonium – Tera-Scale Graph Mining for Malware Detection](http://www.covert.io/research-papers/security/Polonium%20-%20Tera-Scale%20Graph%20Mining%20for%20Malware%20Detection.pdf)
* [Nazca – Detecting Malware Distribution in Large-Scale Networks](http://www.covert.io/research-papers/security/Nazca%20-%20%20Detecting%20Malware%20Distribution%20in%20Large-Scale%20Networks.pdf)
* [PAYL – Anomalous Payload-based Network Intrusion Detection](http://www.covert.io/research-papers/security/PAYL%20-%20Anomalous%20Payload-based%20Network%20Intrusion%20Detection.pdf)
* [Anagram – A Content Anomaly Detector Resistant to Mimicry Attacks](http://www.covert.io/research-papers/security/Anagram%20-%20A%20Content%20Anomaly%20Detector%20Resistant%20to%20Mimicry%20Attack.pdf)
* [Applications of Machine Learning in Cyber Security](https://www.researchgate.net/publication/283083699_Applications_of_Machine_Learning_in_Cyber_Security)
* [Data Mining для построения систем обнаружения сетевых атак (RUS)](http://vak.ed.gov.ru/az/server/php/filer.php?table=att_case&fld=autoref&key%5B%5D=100003407)
* [Выбор технологий Data Mining для систем обнаружения вторжений в корпоративную сеть (RUS)](http://engjournal.ru/articles/987/987.pdf)
* [Нейросетевой подход к иерархическому представлению компьютерной сети в задачах информационной безопасности (RUS)](http://engjournal.ru/articles/534/534.pdf)
* [Методы интеллектуального анализа данных и обнаружение вторжений (RUS)](http://vestnik.sibsutis.ru/uploads/1459329553_3576.pdf)
* [Dimension Reduction in Network Attacks Detection Systems](http://elib.bsu.by/bitstream/123456789/120105/1/v17no3p284.pdf)
* [Rise of the machines: Machine Learning & its cyber security applications](https://www.nccgroup.trust/globalassets/our-research/uk/whitepapers/2017/rise-of-the-machines-preliminaries-wp-new-template-final_web.pdf)
* [Machine Learning in Cyber Security: Age of the Centaurs](https://go.recordedfuture.com/hubfs/white-papers/machine-learning.pdf)
* [Automatically Evading Classifiers A Case Study on PDF Malware Classifiers](https://www.cs.virginia.edu/~evans/pubs/ndss2016/)
* [Weaponizing Data Science for Social Engineering — Automated E2E Spear Phishing on Twitter](https://www.blackhat.com/docs/us-16/materials/us-16-Seymour-Tully-Weaponizing-Data-Science-For-Social-Engineering-Automated-E2E-Spear-Phishing-On-Twitter.pdf)
* [Machine Learning: A Threat-Hunting Reality Check](https://s3-eu-central-1.amazonaws.com/evermade-fsecure-assets/wp-content/uploads/2019/09/17153425/countercept-whitepaper-machine-learning.pdf)
* [Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection](https://arxiv.org/abs/1708.06525)
* [Practical Secure Aggregation for Privacy-Preserving Machine Learning](https://eprint.iacr.org/2017/281.pdf)
* [DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning](https://acmccs.github.io/papers/p1285-duA.pdf)
* [eXpose: A Character-Level Convolutional Neural Network with Embeddings For Detecting Malicious URLs, File Paths and Registry Keys](https://arxiv.org/pdf/1702.08568.pdf)
* [Big Data Technologies for Security Event Correlation Based on Event Type Accounting (RUS)](http://cyberrus.com/wp-content/uploads/2018/02/2-16-524-17_1.-Kotenko.pdf)
* [Investigation of The Use of Neural Networks for Detecting Low-Intensive Ddоs-Atak of Applied Level (RUS)](http://cyberrus.com/wp-content/uploads/2018/02/23-29-524-17_3.-Tarasov.pdf)
* [Detecting Malicious PowerShell Commands using Deep Neural Networks](https://arxiv.org/pdf/1804.04177.pdf)
* [Machine Learning DDoS Detection for Consumer Internet of Things Devices](https://arxiv.org/pdf/1804.04159.pdf)
* [Anomaly Detection in Computer System
by Intellectual Analysis of System Journals (RUS)](http://cyberrus.com/wp-content/uploads/2018/06/33-43-226-18_4.-Sheluhin.pdf)
* [EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models](https://arxiv.org/pdf/1804.04637.pdf)
* [A state-of-the-art survey of malware detection approaches using data mining techniques.](https://link.springer.com/article/10.1186/s13673-018-0125-x)
* [Investigation of malicious portable executable file detection on network using supervised learning techniques.](https://www.researchgate.net/publication/318665164_Investigation_of_malicious_portable_executable_file_detection_on_the_network_using_supervised_learning_techniques)
* [Machine Learning in Cybersecurity: A Guide](https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=633583)
* [Outside the Closed World: On Using Machine Learning For Network Intrusion Detection](https://personal.utdallas.edu/~muratk/courses/dmsec_files/oakland10-ml.pdf)

## [↑](#table-of-contents) Books

* [Data Mining and Machine Learning in Cybersecurity](https://www.amazon.com/Data-Mining-Machine-Learning-Cybersecurity/dp/1439839425)
* [Machine Learning and Data Mining for Computer Security](https://www.amazon.com/Machine-Learning-Mining-Computer-Security/dp/184628029X)
* [Network Anomaly Detection: A Machine Learning Perspective](https://www.amazon.com/Network-Anomaly-Detection-Learning-Perspective/dp/1466582081)
* [Machine Learning and Security: Protecting Systems with Data and Algorithms](https://www.amazon.com/Machine-Learning-Security-Protecting-Algorithms/dp/1491979909)
* [Introduction To Artificial Intelligence For Security Professionals](https://github.com/cylance/IntroductionToMachineLearningForSecurityPros/blob/master/IntroductionToArtificialIntelligenceForSecurityProfessionals_Cylance.pdf)
* [Mastering Machine Learning for Penetration Testing](https://www.packtpub.com/networking-and-servers/mastering-machine-learning-penetration-testing)
* [Malware Data Science: Attack Detection and Attribution](https://nostarch.com/malwaredatascience)

## [↑](#table-of-contents) Talks

* [Using Machine Learning to Support Information Security](https://www.youtube.com/watch?v=tukidI5vuBs)
* [Defending Networks with Incomplete Information](https://www.youtube.com/watch?v=36IT9VgGr0g)
* [Applying Machine Learning to Network Security Monitoring](https://www.youtube.com/watch?v=vy-jpFpm1AU)
* [Measuring the IQ of your Threat Intelligence Feeds](https://www.youtube.com/watch?v=yG6QlHOAWiE)
* [Data-Driven Threat Intelligence: Metrics On Indicator Dissemination And Sharing](https://www.youtube.com/watch?v=6JMEKnes-w0)
* [Applied Machine Learning for Data Exfil and Other Fun Topics](https://www.youtube.com/watch?v=dGwH7m4N8DE)
* [Secure Because Math: A Deep-Dive on ML-Based Monitoring](https://www.youtube.com/watch?v=TYVCVzEJhhQ)
* [Machine Duping 101: Pwning Deep Learning Systems](https://www.youtube.com/watch?v=JAGDpJFFM2A)
* [Delta Zero, KingPhish3r – Weaponizing Data Science for Social Engineering](https://www.youtube.com/watch?v=l7U0pDcsKLg)
* [Defeating Machine Learning What Your Security Vendor Is Not Telling You](https://www.youtube.com/watch?v=oiuS1DyFNd8)
* [CrowdSource: Crowd Trained Machine Learning Model for Malware Capability Det](https://www.youtube.com/watch?v=u6a7afsD39A)
* [Defeating Machine Learning: Systemic Deficiencies for Detecting Malware](https://www.youtube.com/watch?v=sPtbDUJjhbk)
* [Packet Capture Village – Theodora Titonis – How Machine Learning Finds Malware](https://www.youtube.com/watch?v=2cQRSPFSY-s)
* [Build an Antivirus in 5 Min – Fresh Machine Learning #7. A fun video to watch](https://www.youtube.com/watch?v=iLNHVwSu9EA&t=245s)
* [Hunting for Malware with Machine Learning](https://www.youtube.com/watch?v=zT-4zdtvR30)
* [Machine Learning for Threat Detection](https://www.youtube.com/watch?v=qVwktOa-F34)
* [Machine Learning and the Cloud: Disrupting Threat Detection and Prevention](https://www.youtube.com/watch?v=fRklX97iGIw)
* [Fraud detection using machine learning & deep learning](https://www.youtube.com/watch?v=gHtN4jU69W0)
* [The Applications Of Deep Learning On Traffic Identification](https://www.youtube.com/watch?v=yZ-Y1WCM0lc)
* [Defending Networks With Incomplete Information: A Machine Learning Approach](https://www.youtube.com/watch?v=_0CRSF6yPB4)
* [Machine Learning & Data Science](https://vimeo.com/112702666)
* [Advances in Cloud-Scale Machine Learning for Cyber-Defense](https://www.youtube.com/watch?v=skSIIvvZFIk)
* [Applied Machine Learning: Defeating Modern Malicious Documents](https://www.youtube.com/watch?v=ZAuCEgA3itI)
* [Automated Prevention of Ransomware with Machine Learning and GPOs](https://www.rsaconference.com/writable/presentations/file_upload/spo2-t11_automated-prevention-of-ransomware-with-machine-learning-and-gpos.pdf)
* [Learning to Detect Malware by Mining the Security Literature](https://www.usenix.org/conference/enigma2017/conference-program/presentation/dumitras)
* [Clarence Chio and Anto Joseph - Practical Machine Learning in Infosecurity](https://conference.hitb.org/hitbsecconf2017ams/materials/D1T3%20-%20Clarence%20Chio%20and%20Anto%20Joseph%20-%20Practical%20Machine%20Learning%20in%20Infosecurity.pdf)
* [Advances in Cloud-Scale Machine Learning for Cyberdefense](https://www.youtube.com/watch?v=6Slj2FV9CLA)
* [Machine Learning-Based Techniques For Network Intrusion Detection](https://www.youtube.com/watch?v=-EUJgpiJ8Jo)
* [Practical Machine Learning in Infosec](https://www.youtube.com/watch?v=YF2dm6GZf2U)
* [AI and Security](https://www.microsoft.com/en-us/research/wp-content/uploads/2017/07/AI_and_Security_Dawn_Song.pdf)
* [AI in InfoSec](https://vimeo.com/230502013)
* [Beyond the Blacklists: Detecting Malicious URL Through Machine Learning](https://www.youtube.com/watch?v=Kd3svc9HZ0Y)
* [Machine Learning Fueled Cyber Threat Hunting](https://www.youtube.com/watch?v=c-c-IQ5pFXw)
* [Weaponizing Machine Learning: Humanity Was Overrated](https://www.youtube.com/watch?v=QbX7BhjOOvY)
* [Machine Learning, Offense, and the future of Automation](https://www.youtube.com/watch?v=BWFdxAG_TGk)
* [Bringing Red vs. Blue to Machine Learning](https://www.youtube.com/watch?v=e5O0Oxt5dYI)
* [Explaining Machine Learning with Azure and the Titanic Dataset](https://www.youtube.com/watch?v=x1DfjUEYm0k)
* [Using Machines to exploit Machines](https://www.youtube.com/watch?v=VuLvzL-WbBQ)
* [Analyze active directory event logs using visualize and ML](https://www.youtube.com/watch?v=ISbbzaCGBns)
* [Hardening Machine Learning Defenses Against Adversarial Attacks](https://www.youtube.com/watch?v=CAwua_lugV8)
* [Deep Neural Networks for Hackers: Methods, Applications, and Open Source Tools](https://www.youtube.com/watch?v=fKJ8sTi6H88)
* [ML in the daily work of a threat hunter](https://www.youtube.com/watch?v=vWMRVhDCpao)

## [↑](#table-of-contents) Tutorials

* [Machine Learning based Password Strength Classification](http://web.archive.org/web/20170606022743/http://fsecurify.com/machine-learning-based-password-strength-checking/)
* [Using Machine Learning to Detect Malicious URLs](http://web.archive.org/web/20170514093208/http://fsecurify.com/using-machine-learning-detect-malicious-urls/)
* [Using deep learning to break a Captcha system](https://deepmlblog.wordpress.com/2016/01/03/how-to-break-a-captcha-system/)
* [Data mining for network security and intrusion detection](https://www.r-bloggers.com/data-mining-for-network-security-and-intrusion-detection/)
* [Applying Machine Learning to Improve Your Intrusion Detection System](https://securityintelligence.com/applying-machine-learning-to-improve-your-intrusion-detection-system/)
* [Analyzing BotNets with Suricata & Machine Learning](http://blogs.splunk.com/2017/01/30/analyzing-botnets-with-suricata-machine-learning/)
* [fWaf – Machine learning driven Web Application Firewall](http://web.archive.org/web/20170706222016/http://fsecurify.com/fwaf-machine-learning-driven-web-application-firewall/)
* [Deep Session Learning for Cyber Security](https://blog.cyberreboot.org/deep-session-learning-for-cyber-security-e7c0f6804b81#.eo2m4alid)
* [DMachine Learning for Malware Detection](http://resources.infosecinstitute.com/machine-learning-malware-detection/)
* [ShadowBrokers Leak: A Machine Learning Approach](https://marcoramilli.blogspot.ru/2017/04/shadowbrokers-leak-machine-learning.html)
* [Practical Machine Learning in Infosec - Virtualbox Image and Stuff](https://docs.google.com/document/d/1v4plS1EhLBfjaz-9GHBqspTH7vnrJfqLrLjeP9k9i9A/edit)
* [A Machine-Learning Toolkit for Large-scale eCrime Forensics](http://blog.trendmicro.com/trendlabs-security-intelligence/defplorex-machine-learning-toolkit-large-scale-ecrime-forensics/)
* [WebShells Detection by Machine Learning](https://github.com/lcatro/WebShell-Detect-By-Machine-Learning)
* [Building Machine Learning Models for the SOC](https://www.fireeye.com/blog/threat-research/2018/06/build-machine-learning-models-for-the-soc.html)
* [Detecting Web Attacks With Recurrent Neural Networks](https://aivillage.org/posts/detecting-web-attacks-rnn/)
* [Machine Learning for Red Teams, Part 1](https://silentbreaksecurity.com/machine-learning-for-red-teams-part-1/)
* [Detecting Reverse Shell with Machine Learning](https://www.cyberbit.com/blog/endpoint-security/detecting-reverse-shell-with-machine-learning/)
* [Obfuscated Command Line Detection Using Machine Learning](https://www.fireeye.com/blog/threat-research/2018/11/obfuscated-command-line-detection-using-machine-learning.html)
* [Обнаружение веб-атак с помощью рекуррентных нейронных сетей (RUS)](https://habr.com/ru/company/pt/blog/439202/)
* [Clear and Creepy Danger of Machine Learning: Hacking Passwords](https://towardsdatascience.com/clear-and-creepy-danger-of-machine-learning-hacking-passwords-a01a7d6076d5)

## [↑](#table-of-contents) Courses

* [Data Mining for Cyber Security by Stanford](http://web.stanford.edu/class/cs259d/)
* [Data Science and Machine Learning for Infosec](http://www.pentesteracademy.com/course?id=30)
* [Cybersecurity Data Science on Udemy](https://www.udemy.com/cybersecurity-data-science)

## [↑](#table-of-contents) Miscellaneous

* [System predicts 85 percent of cyber-attacks using input from human experts](http://news.mit.edu/2016/ai-system-predicts-85-percent-cyber-attacks-using-input-human-experts-0418)
* [A list of open source projects in cyber security using machine learning](http://www.mlsecproject.org/#open-source-projects)
* [Source code about machine learning and security](https://github.com/13o-bbr-bbq/machine_learning_security)
* [Source code for Mastering Machine Learning for Penetration Testing](https://github.com/PacktPublishing/Mastering-Machine-Learning-for-Penetration-Testing)
* [Convolutional neural network for analyzing pentest screenshots](https://github.com/BishopFox/eyeballer)
* [Big Data and Data Science for Security and Fraud Detection](http://www.kdnuggets.com/2015/12/big-data-science-security-fraud-detection.html)
* [StringSifter - a machine learning tool that ranks strings based on their relevance for malware analysis](https://github.com/fireeye/stringsifter)

## License

![cc license](http://i.creativecommons.org/l/by-sa/4.0/88x31.png)

This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International](http://creativecommons.org/licenses/by-sa/4.0/) license.