awesome-machine-learning-datascience_resources
Curated Collection of Online and Free Resources for serious learning of Machine Learning and Data Science.
https://github.com/rohan-paul/awesome-machine-learning-datascience_resources
-
Most Important Deep Learning Papers
- Generative Adversarial Networks
- DenseNet
- ResNet
- Fast R-CNN
- Xavier Initialization
- Dynamic Routing Between Capsules
- GoogLeNet
- R-CNN
- Spatial Transformer Networks
- DCGAN
- Synthetic Gradients
- Neural Turing Machines
- A3C
- Gradient Descent by Gradient Descent
- Rethinking Generalization
- Densely Connected CNNs
- Style Transfer
- Pixel RNN
- Dynamic Coattention Networks
- Convolutional Seq2Seq Learning
- Adam
- ReLU
- Saddle Points and Non-convexity of Neural Networks
- Quasi-Recurrent Neural Networks
- Unsupervised Machine Translation with Monolingual Corpora
- Learned Index Structures
- Visualizing Loss Landscapes
- Wasserstein GAN
- Relational Networks
- Influence Functions
- Large Batch Training
- Learning from Imbalanced Data
- AlexNet
- VGGNet
- ZFNet
- Adversarial Images
- Memory Networks
- Mixture of Experts
- AlphaGo
- Atari DQN
- Word2Vec
- GloVe
- EBGAN
- Seq2Seq
- Dropout
- Batch Norm
- Speech Recognition
- Overcoming Catastrophic Forgetting in NNs
- Escaping Saddle Points Efficiently
- Attention is All You Need
- Population Based Training of NN's
- SqueezeNet
- WaveNet
- Hidden Technical Debt in ML Systems
- MobileNets
-
General Datasets
- Institute of Computer Graphics and Vision
- LIMSI-CNRS/CHM/IMM/vision
- LIMSI-CNRS
- Computational Vision Lab
- OSU/SAMPL Database: Range Images, 3D Models, Stills, Motion Sequences - Over 1000 range images, 3D object models, still images and motion sequences (Formats: gif, ppm, vrml, homebrew)
- Signal Analysis and Machine Perception Laboratory
- The AR Face Database - Contains over 4,000 color images corresponding to 126 people's faces (70 men and 56 women). Frontal views with variations in facial expressions, illumination, and occlusions. (Formats: RAW (RGB 24-bit))
- enigma.com - Navigate the world of public data - Quickly search and analyze billions of public records published by governments, companies and organizations.
- United States Census Bureau
- datacite.org
- GeoLite Legacy Downloadable Databases
- A Deep Catalog of Human Genetic Variation
- World Bank Data
- El Salvador Atlas of Gastrointestinal VideoEndoscopy - High-resolution images and videos of studies taken from gastrointestinal video endoscopy. (Formats: jpg, mpg, gif)
- Purdue Robot Vision Lab
- MNIST
- Google House Numbers
- CIFAR-10 and CIFAR-100
- hadoopilluminated.com
- data.gov - The home of the U.S. Government's open data
- Tiny Images
- Flickr Data
- Berkeley Segmentation Dataset 500
- Flickr 30k
- Microsoft COCO
- datahub.io
- Image QA
- AT&T Laboratories Cambridge face database
- AVHRR Pathfinder
- usgovxml.com
- Amsterdam Library of Object Images - ALOI is a color image collection of one-thousand small objects, recorded for scientific purposes. In order to capture the sensory variation in object recordings, we systematically varied viewing angle, illumination angle, and illumination color for each object, and additionally captured wide-baseline stereo images. We recorded over a hundred images of each object, yielding a total of 110,250 images for the collection. (Formats: png)
- Annotated face, hand, cardiac & meat images - Most images & annotations are supplemented by various ASM/AAM analyses using the AAM-API. (Formats: bmp,asf)
- Image Analysis and Computer Graphics
- Brown University Stimuli - A variety of datasets including geons, objects, and "greebles". Good for testing recognition algorithms. (Formats: pict)
- CCITT Fax standard images - 8 images (Formats: gif)
- CMU PIE Database - A database of 41,368 face images of 68 people captured under 13 poses, 43 illumination conditions, and with 4 different expressions.
- CMU VASC Image Database - Images, sequences, stereo pairs (thousands of images) (Formats: Sun Rasterimage)
- Caltech Image Database - about 20 images - mostly top-down views of small objects and toys. (Formats: GIF)
- Columbia-Utrecht Reflectance and Texture Database - Texture and reflectance measurements for over 60 samples of 3D texture, observed with over 200 different combinations of viewing and illumination directions. (Formats: bmp)
- Computational Colour Constancy Data - A dataset oriented towards computational color constancy, but useful for computer vision in general. It includes synthetic data, camera sensor data, and over 700 images. (Formats: tiff)
- Content-based image retrieval database - 11 sets of color images for testing algorithms for content-based retrieval. Most sets have a description file with names of objects in each image. (Formats: jpg)
- Densely Sampled View Spheres - Upper half of the view sphere of two toy objects with 2500 images each. (Formats: tiff)
- Digital Embryos - Digital embryos are novel objects which may be used to develop and test object recognition systems. They have an organic appearance. (Formats: various formats are available on request)
- University of Minnesota Vision Lab
- FG-NET Facial Aging Database - Database contains 1002 face images showing subjects at different ages. (Formats: jpg)
- FVC2000 Fingerprint Databases - FVC2000 is the First International Competition for Fingerprint Verification Algorithms. Four fingerprint databases constitute the FVC2000 benchmark (3520 fingerprints in all).
- Biometric Systems Lab - University of Bologna
- German Fingerspelling Database - The database contains 35 gestures and consists of 1400 image sequences that contain gestures of 20 different persons recorded under non-uniform daylight lighting conditions. (Formats: mpg,jpg)
- Language Processing and Pattern Recognition
- Groningen Natural Image Database - 4000+ 1536x1024 (16 bit) calibrated outdoor images (Formats: homebrew)
- ICG Testhouse sequence - 2 turntable sequences from different viewing heights, 36 images each, resolution 1000x750, color (Formats: PPM)
- IEN Image Library - 1000+ images, mostly outdoor sequences (Formats: raw, ppm)
- INRIA's Syntim images database - 15 color images of simple objects (Formats: gif)
- INRIA's Syntim stereo databases - 34 calibrated color stereo pairs (Formats: gif)
- Image Analysis Laboratory - Images obtained from a variety of imaging modalities -- raw CFA images, range images and a host of "medical images". (Formats: homebrew)
- Image Database - An image database including some textures
- JAFFE Facial Expression Image Database - The JAFFE database consists of 213 images of Japanese female subjects posing 6 basic facial expressions as well as a neutral pose. Ratings on emotion adjectives are also available, free of charge, for research purposes. (Formats: TIFF Grayscale images.)
- ATR Research, Kyoto, Japan
- MIT Vision Texture - Image archive (100+ images) (Formats: ppm)
- Machine Vision - Images from the textbook by Jain, Kasturi, Schunck (20+ images) (Formats: GIF TIFF)
- Mammography Image Databases - 100 or more images of mammograms with ground truth. Additional images available by request, and links to several other mammography databases are provided. (Formats: homebrew)
- Middlebury Stereo Data Sets with Ground Truth - Six multi-frame stereo data sets of scenes containing planar regions. Each data set contains 9 color images and subpixel-accuracy ground-truth data. (Formats: ppm)
- Middlebury Stereo Vision Research Page - Middlebury College
- Modis Airborne simulator, Gallery and data set - High Altitude Imagery from around the world for environmental modeling in support of NASA EOS program (Formats: JPG and HDF)
- National Design Repository - Over 55,000 3D CAD and solid models of (mostly) mechanical/machined engineering designs. (Formats: gif,vrml,wrl,stp,sat)
- Geometric & Intelligent Computing Laboratory
- OSU (MSU) 3D Object Model Database - several sets of 3D object models collected over several years to use in object recognition research (Formats: homebrew, vrml)
- OSU (MSU/WSU) Range Image Database - Hundreds of real and synthetic images (Formats: gif, homebrew)
- Otago Optical Flow Evaluation Sequences - Synthetic and real sequences with machine-readable ground truth optical flow fields, plus tools to generate ground truth for new sequences. (Formats: ppm,tif,homebrew)
- Vision Research Group
- Photometric 3D Surface Texture Database - This is the first 3D texture database which provides both full real surface rotations and registered photometric stereo data (30 textures, 1680 images). (Formats: TIFF)
- SEQUENCES FOR OPTICAL FLOW ANALYSIS (SOFA) - 9 synthetic sequences designed for testing motion analysis applications, including full ground truth of motion and camera parameters. (Formats: gif)
- Computer Vision Group
- Stereo Images with Ground Truth Disparity and Occlusion - a small set of synthetic images of a hallway with varying amounts of noise added. Use these images to benchmark your stereo algorithm. (Formats: raw, viff (khoros), or tiff)
- Stuttgart Range Image Database - A collection of synthetic range images taken from high-resolution polygonal models available on the web (Formats: homebrew)
- The MIT-CSAIL Database of Objects and Scenes - Database for testing multiclass object detection and scene recognition algorithms. Over 72,000 images with 2873 annotated frames. More than 50 annotated object classes. (Formats: jpg)
- The RVL SPEC-DB (SPECularity DataBase) - A collection of over 300 real images of 100 objects taken under three different illumination conditions (Diffuse/Ambient/Directed). Use these images to test algorithms for detecting and compensating specular highlights in color images. (Formats: TIFF)
- Robot Vision Laboratory
- The Xm2vts database - The XM2VTSDB contains four digital recordings of 295 people taken over a period of four months. This database contains both image and video data of faces.
- Centre for Vision, Speech and Signal Processing
- Traffic Image Sequences and 'Marbled Block' Sequence - thousands of frames of digitized traffic image sequences as well as the 'Marbled Block' sequence (grayscale images) (Formats: GIF)
- IAKS/KOGS
- U Oulu wood and knots database - Includes classifications - 1000+ color images (Formats: ppm)
- UCID - an Uncompressed Colour Image Database - a benchmark database for image retrieval with predefined ground truth. (Formats: tiff)
- UMass Vision Image Archive - Large image database with aerial, space, stereo, medical images and more. (Formats: homebrew)
- USF Range Image Data with Segmentation Ground Truth - 80 image sets (Formats: Sun rasterimage)
- University of Oulu Physics-based Face Database - contains color images of faces under different illuminants and camera calibration conditions as well as skin spectral reflectance measurements of each person.
- Machine Vision and Media Processing Unit
- University of Oulu Texture Database - Database of 320 surface textures, each captured under three illuminants, six spatial resolutions and nine rotation angles. A set of test suites is also provided so that texture segmentation, classification, and retrieval algorithms can be tested in a standard manner. (Formats: bmp, ras, xv)
- Machine Vision Group
- View Sphere Database - Images of 8 objects seen from many different view points. The view sphere is sampled using a geodesic with 172 images/sphere. Two sets for training and testing are available. (Formats: ppm)
- PRIMA, GRAVIR
- Wiry Object Recognition Database - Thousands of images of a cart, ladder, stool, bicycle, chairs, and cluttered scenes with ground truth labelings of edges and regions. (Formats: jpg)
- 3D Vision Group
- Yale Face Database - 165 images (15 individuals) with different lighting, expression, and occlusion configurations.
- Yale Face Database B - 5760 single light source images of 10 subjects each seen under 576 viewing conditions (9 poses x 64 illumination conditions). (Formats: PGM)
- Center for Computational Vision and Control
- DeepMind QA Corpus - Textual QA corpus from CNN and DailyMail. More than 300K documents in total. [Paper](http://arxiv.org/abs/1506.03340) for reference.
- YouTube-8M Dataset - YouTube-8M is a large-scale labeled video dataset that consists of 8 million YouTube video IDs and associated labels from a diverse vocabulary of 4800 visual entities.
- Open Images dataset - Open Images is a dataset of ~9 million URLs to images that have been annotated with labels spanning over 6000 categories.
- Visual Object Classes Challenge 2012 (VOC2012) - VOC2012 dataset containing 12k images with 20 annotated classes for object detection and segmentation.
- Fashion-MNIST - MNIST like fashion product dataset consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.
- Large-scale Fashion (DeepFashion) Database - Contains over 800,000 diverse fashion images. Each image in this dataset is labeled with 50 categories, 1,000 descriptive attributes, bounding box and clothing landmarks
- FakeNewsCorpus - Contains about 10 million news articles classified using [opensources.co](http://opensources.co) types
- databib.org
- quandl.com - Get the data you need in the form you want; instant download, API or direct to your app.
- figshare.com
- Quora's Big Datasets Answer
- Houston Data Portal
- Kaggle Data Sources
- NYC Taxi data
- A list of useful sources
- grouplens.org
- Image Analysis Laboratory
-
Logistic Regression Tutorials
- Stanford Logistic Regression Overview
- Guide to an in-depth understanding of logistic regression
- Coursera Course - Logistic Regression and Classification
- Logistic Regression - An Introduction
- Logistic Regression Wiki
- Siraj Raval Logistic Regression Tutorial
- Geometric Intuition of Logistic Regression
-
Best Courses
- Intro to Neural Nets and ML (Univ of Toronto)
- CMU Neural Networks for NLP
- Stanford CS 231N - CNNs
- Stanford CS 224D - Deep Learning for NLP
- David Silver's Reinforcement Learning Course
- UC Berkeley Kaggle Decal
- Short MIT Intro to DL Course
- Andrew Ng Machine Learning Course
- Stanford CS 229 - Pretty much the same as the Coursera course
- Udacity Deep Learning
- Deep RL Bootcamp
- Bay Area Deep Learning School Day 1 2016
- Caltech CS 156 - Machine Learning
- Berkeley EE 227C - Convex Optimization
-
Numpy
- numpy - Multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions to operate on these arrays.
- Introduction-to-NumPy
- Understanding-Data-Types
- The-Basics-Of-NumPy-Arrays
- Computation-on-arrays-ufuncs
- Computation-on-arrays-aggregates
- Computation-on-arrays-broadcasting
- Boolean-Arrays-and-Masks
- Fancy-Indexing
- Sorting
- Structured-Data-NumPy
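
As a quick orientation to the topics listed above, here is a minimal sketch that touches ufuncs, aggregates, broadcasting, boolean masks, fancy indexing, sorting, and structured arrays in turn. The sample values and variable names are made up for illustration and are not taken from the linked notebooks.

```python
import numpy as np

# Hypothetical sample data: 3 rows (samples) x 4 columns (features).
data = np.arange(12, dtype=np.float64).reshape(3, 4)

# Ufunc: element-wise operation without a Python loop.
squared = np.square(data)

# Aggregate: per-column means, shape (4,).
col_means = data.mean(axis=0)

# Broadcasting: the (4,) vector is stretched across all 3 rows.
centered = data - col_means

# Boolean mask: keep only the elements above the overall mean.
above_mean = data[data > data.mean()]

# Fancy indexing: pick rows 2 and 0, in that order.
picked_rows = data[[2, 0]]

# Sorting: argsort returns the indices that would sort the array.
order = np.argsort(np.array([3, 1, 2]))

# Structured array: named fields with different dtypes (illustrative records).
people = np.array(
    [("Ada", 36), ("Linus", 28)],
    dtype=[("name", "U10"), ("age", "i4")],
)
mean_age = people["age"].mean()
```

Each step works on whole arrays rather than individual elements, which is the habit the listed notebooks aim to build.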
-
Kaggle Competitions WriteUp
-
Linear Regression Tutorials
- Applying and Interpreting Linear Regression
- Coursera Course - Linear Regression with One Variable
- Simple Linear Regression
- Linear Regression for Machine Learning
- Assumptions of Linear Regression
- Is linear regression valid when the dependent variable is not normally distributed?
- Dummy Variable Trap | Multicollinearity
- Dealing with multicollinearity using VIFs
- Elastic Net - Regularization and Variable Selection via the Elastic Net
-
Best Deep Learning Courses
-
If you are new to Data Science
- <img src="http://i.imgur.com/0OoLaa5.png" width="150" />
- <img src="http://i.imgur.com/W2t2Roz.png" width="150" />
- <img src="http://i.imgur.com/XBgKF2l.png" width="150" />
- <img src="http://i.imgur.com/l9ZGtal.jpg" width="150" />
- <img src="http://i.imgur.com/b9xYdZB.jpg" width="150" /> - by Berkeley Science Review.
- <img src="http://i.imgur.com/TWkB4X6.png" width="150" />
- <img src="https://cloud.githubusercontent.com/assets/182906/19517857/604f88d8-960c-11e6-97d6-16c9738cb824.png" width="150" />
- <img src="http://i.imgur.com/rb9ruaa.png" width="150" />
-
Blogs and other Community based resources
- Microsoft Machine Learning Blog
- Applied AI for Business: Industry Newsletter for Executives Applying Machine Learning to Enterprises - TOPBOTS
- Data Science Community Newsletter – NYU Center for Data Science
- The Week in Data – The ODI
- Import AI
- Yannic Kilcher
- algorithmia.com/blog
- Towards AI — Multidisciplinary Science Journal – Medium
- Towards Data Science
- KDnuggets
- Andrew Ng
- I Am Trask
- colah
- Most Viewed Machine Learning writers
- Data Science Topic on Quora
- William Chen's Answers
- Michael Hochster's Answers
- Ricardo Vladimiro's Answers
- Storytelling with Statistics
- Data Science FAQs on Quora
- Machine Learning FAQs on Quora
- Great Machine Learnings Blog
- News on Artificial Intelligence and Machine Learning
- Machine Learning | Latest News, Photos & Videos | WIRED
- Analytics India Magazine | Artificial Intelligence & Data Science stories
- Blogs on Artificial Intelligence, Machine Learning, Python & Data Science
- Data Elixir - News and resources for data science practitioners
- Dataconomy Home - Dataconomy
- Newest 'machine-learning' Questions - Stack Overflow
- Weekly topics - Dataquest Community
- Reddit - Machine Learning
- Reddit - Learn Machine Learning
- Reddit - Deep Learning
- Reddit - A Powerful Machine Intelligence Library
- Reddit GoogleColabNotebooks
- machinelearningmastery.com/blog/
- Machine Learning(Theory)
- Machine Learning Blog, ML@CMU, Carnegie Mellon University
- Amazon Science Blog
- AWS Machine Learning Blog
- distill
- BAIR - Berkeley Artificial Intelligence Research
- openai.com/
- MIT Machine Learning
- Facebook AI
-
ML Math
- CS229 Notes on Linear Algebra
- AWS Math for Machine Learning
- Essential Math for Machine Learning - Python Edition
- Introduction to Linear Algebra
- Linear Algebra - Hefferon
- Deep Learning Math
- Computational Linear Algebra
-
Inspirational Stories of people breaking into Machine Learning and Data-Science
- Krish Naik
- Success Story of Saugata Paul : A career break from Infosys to Data Scientist at SPi Global
- Success Story of Ramesh Vellanki: A non-programmer to Data Scientist @ CitiBank
- Success Story of Sriganesh: From 10+ years @ IT Services to a Data Scientist @ Amazon
- How I Got 4 Data Science Offers and Doubled my Income 2 Months after being Laid Off
- I got 7 job offers during the worst job market in history. Here’s the data
- Brian Ross - from Carpenter to Data-Engineer
-
Finance Related Datasets
- BIS Statistics - BIS statistics, compiled in cooperation with central banks and other national authorities.
- Complete FAANG Stock data - This data set contains all the stock data
- Yahoo Finance
- OANDA
- Blockmodo Coin Registry - A registry of JSON formatted information files
- CBOE Futures Exchange
- NASDAQ
-
Super Large Kaggle Datasets
- Sketches and Strokes from the QuickDraw Game
- GloVe Reddit Comments
- melanoma 2019 orig
- OpenML - Open science datasets for machine learning
- Spoken Language Identification
- COVID19-Engineering-Books-NLP-Dataset
- Microsoft Malware Prediction
- LANL Earthquake Prediction
- APTOS 2019 Blindness Detection
- IEEE's Signal Processing Society - Camera Model Identification
- Lyft 3D Object Detection for Autonomous Vehicles
-
TensorFlow
-
K Nearest Neighbors Tutorials
-
Top Machine Learning Podcasts
- Data Crunch Corporation
- O’Reilly Data Show Podcast
- Data Skeptic
- Data Stories (Spotify)
- DataFramed (Spotify)
- Linear Digressions
- Not So Standard Deviations (Spotify)
- Making Data Simple (Spotify)
- Data Engineering Podcast (Spotify)
- SuperDataScience (Spotify)
- Data Science at Home (Spotify)
- The Digital Analytics Power Hour (Spotify)
- HumAIn Podcast
- HumAIn Podcast - Artificial Intelligence, Data Science, and Developer Education (Spotify)
- Talking Machines
- The TWIML AI Podcast, formerly This Week in Machine Learning & Artificial Intelligence (Spotify)
- Lex Fridman Podcast (Spotify)
- AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
- AI in Business (Spotify)
-
Natural Language Project Ideas
- Chat-bot architecture
- Reddit Dataset
- Essays with human graded scores
- Quora question pairs
- Toxic comments on Kaggle
- Facebook's FAIR
- NCERT books for K-12/school students in India, [NarrativeQA by Google DeepMind](https://github.com/deepmind/narrativeqa) and [SQuAD by Stanford](https://rajpurkar.github.io/SQuAD-explorer/)
- new speeches in Obama's style
- @realDonaldTrump
- personal website
- English Transcript of Modi speeches
-
Kaggle Strategies and skills
- Methods to tackle class imbalance
- shuffle for augmentation
- synthetic samples in the dataset
- Signal denoising
- missing data
- encoding techniques for categorical data
- shuffling of data
- Synthetic Minority Oversampling Technique
- model to predict missing values
- data loading with pandas
- reduce the size of data by 70%
- reducing the size of some attributes
- Dask to read and manipulate the data
- cudf
- parquet
- feather
- optimizing RAM
- malware detection
- EDA for malware detection
- EDA for home credit loan prediction
- VSB Power Line Fault Detection
- encoding cross validation
- handle categories
- cyclic features for deep learning
- feature engineering methods
- using featuretools
- microsoft malware detection
- feature extraction.
- RAPIDS framework.
- processing features using LGBM.
- Principal component analysis
- frequency features
- Aggregate time series features
- Home default risk competition.
- Santander Transaction Prediction.
- features selection using sklearn.
- Permutation feature importance.
- Adversarial feature validation.
- null importances.
- Random forest classifier.
- XGBoost : Gradient boosted decision trees.
- LightGBM for distributed and faster training.
- Naive bayes classifier.
- Gaussian naive bayes model.
- LGBM + CNN model used in 3rd place solution of Santander Customer Transaction Prediction
- Knowledge distillation in Neural Network.
- Follow the regularized leader method.
- LGB boosting methods
- NN + focal loss experiment.
- Keras NN with timeseries splitter.
- 5th place NN architecture with code for Santander Transaction prediction.
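
To make the "cyclic features for deep learning" entry above concrete, here is a minimal sketch of one common approach: mapping a periodic value such as hour of day onto a sine/cosine pair, so that the end of one cycle sits next to the start of the next. The function name `encode_cyclic` and the 24-hour example are assumptions for illustration and are not taken from the linked kernels.

```python
import math
from typing import Tuple


def encode_cyclic(value: float, period: float) -> Tuple[float, float]:
    """Map a periodic value onto the unit circle as (sin, cos).

    Hypothetical helper: `value` could be an hour (period=24),
    a weekday (period=7), or a month index (period=12).
    """
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


# Hours 23 and 0 are one hour apart, but far apart as raw integers.
late = encode_cyclic(23, 24)
midnight = encode_cyclic(0, 24)
noon = encode_cyclic(12, 24)

# After encoding, 23:00 lies much closer to 00:00 than 12:00 does.
gap_late = math.dist(late, midnight)
gap_noon = math.dist(noon, midnight)
```

Both components are needed: sine alone cannot tell 03:00 from 09:00, and cosine alone cannot tell 06:00 from 18:00.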
-
Natural Language Processing (NLP)
-
Statistics
-
Machine Learning & Deep Learning Tutorials
-
Interview-Related-Links
-
Genetic Algorithms
- Genetic Algorithms Wikipedia Page
- Genetic Algorithms Explained in Plain English
- Genetic Programming
- Genetic Programming in Python (GitHub)
- Genetic Algorithms vs Genetic Programming (Quora)
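
For readers who want to see the basic loop before following the links above, here is a minimal sketch of a genetic algorithm on a toy problem (maximizing the number of 1-bits in a bit string). The objective, the parameter defaults, and the names `one_max` and `evolve` are all assumptions chosen for illustration; the linked resources may present the operators differently.

```python
import random


def one_max(bits):
    """Toy fitness: the number of 1-bits in the individual."""
    return sum(bits)


def evolve(n_bits=20, pop_size=30, generations=60, mutation_rate=0.02, seed=0):
    """Run a small genetic algorithm and return the best individual found."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)
    ]

    def select():
        # Tournament selection: the fitter of two random individuals.
        a, b = rng.sample(population, 2)
        return a if one_max(a) >= one_max(b) else b

    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        next_population = [max(population, key=one_max)]
        while len(next_population) < pop_size:
            parent_a, parent_b = select(), select()
            cut = rng.randint(1, n_bits - 1)  # single-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation with a small per-bit probability.
            child = [
                bit ^ 1 if rng.random() < mutation_rate else bit
                for bit in child
            ]
            next_population.append(child)
        population = next_population

    return max(population, key=one_max)


best = evolve()
```

Because the best individual is copied into every new generation, the best fitness never decreases from one generation to the next.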
-
List of Most Starred Github Projects related to Deep Learning
- tensorflow
- keras
- opencv
- pytorch
- tesseract
- face_recognition
- faceswap
- transformers - State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
- 100-Days-Of-ML-Code
- julia
- awesome-scalability - The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
- basics
- bert - TensorFlow code and pre-trained models for BERT
- xgboost
- Real-Time-Voice-Cloning - Clone a voice in 5 seconds to generate arbitrary speech in real-time
- openpose - Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
- Qix
- spleeter
- Virgilio - Your new Mentor for Data Science E-Learning.
-
Deep Learning Resources
-
Awesome Deep Learning Projects
- 15 AI and Machine Learning Events
- 188 examples of artificial intelligence in action
- A curated list of automated machine learning papers, articles, tutorials, slides and projects
- A curated list of awesome Machine Learning frameworks, libraries and software.
- A curated list of awesome places to learn and/or practice algorithms.
- A curated list of awesome R packages and tools
- A curated list of awesome SLAM tutorials, projects and communities.
- A curated list of resources dedicated to bridging cognitive science and deep learning
- A curated list of resources dedicated to Natural Language Processing (NLP)
- A curated list of resources for NLP (Natural Language Processing) for Chinese
- Another curated list of deep learning resources
- A list of artificial intelligence tools you can use today
- A list of deep learning implementations in biology
- Awesome-2vec
- Awesome Action Recognition
-
Project Ideas for deep learning and general machine learning
-
Forecasting Project Ideas
-
Recommendation systems Project Ideas
-
Some of the Best Kaggle Competitions for Beginners