awesome-machine-learning-datascience_resources
Curated Collection of Online and Free Resources for serious learning of Machine Learning and Data Science.
https://github.com/rohan-paul/awesome-machine-learning-datascience_resources
-
Awesome Deep Learning Projects
- A curated list of awesome R packages and tools
- A curated list of awesome Machine Learning frameworks, libraries and software.
- A curated list of awesome places to learn and/or practice algorithms.
- 188 examples of artificial intelligence in action
- A list of artificial intelligence tools you can use today
- 15 AI and Machine Learning Events
- A curated list of resources dedicated to Natural Language Processing (NLP)
- A curated list of automated machine learning papers, articles, tutorials, slides and projects
- Awesome Action Recognition
- Another curated list of deep learning resources
- A curated list of awesome SLAM tutorials, projects and communities.
- A curated list of resources dedicated to bridging cognitive science and deep learning
- A list of deep learning implementations in biology
- Awesome-2vec
- A curated list of resources for NLP (Natural Language Processing) for Chinese
-
Deep Learning Resources
-
TensorFlow
-
Machine Learning & Deep Learning Tutorials
-
General Datasets
- figshare.com
- YouTube-8M Dataset - YouTube-8M is a large-scale labeled video dataset that consists of 8 million YouTube video IDs and associated labels from a diverse vocabulary of 4800 visual entities.
- Flickr 30k
- MIT Vision Texture - Image archive (100+ images) (Formats: ppm)
- Microsoft COCO
- Image QA
- datahub.io
- CIFAR-10 and CIFAR-100
- Flickr Data
- Berkeley Segmentation Dataset 500
- AT&T Laboratories Cambridge face database
- AVHRR Pathfinder
- Amsterdam Library of Object Images - ALOI is a color image collection of one-thousand small objects, recorded for scientific purposes. In order to capture the sensory variation in object recordings, we systematically varied viewing angle, illumination angle, and illumination color for each object, and additionally captured wide-baseline stereo images. We recorded over a hundred images of each object, yielding a total of 110,250 images for the collection. (Formats: png)
- Annotated face, hand, cardiac & meat images - Most images & annotations are supplemented by various ASM/AAM analyses using the AAM-API. (Formats: bmp,asf)
- Brown University Stimuli - A variety of datasets including geons, objects, and "greebles". Good for testing recognition algorithms. (Formats: pict)
- CCITT Fax standard images - 8 images (Formats: gif)
- CMU VASC Image Database - Images, sequences, stereo pairs (thousands of images) (Formats: Sun Rasterimage)
- Caltech Image Database - about 20 images - mostly top-down views of small objects and toys. (Formats: GIF)
- Columbia-Utrecht Reflectance and Texture Database - Texture and reflectance measurements for over 60 samples of 3D texture, observed with over 200 different combinations of viewing and illumination directions. (Formats: bmp)
- Densely Sampled View Spheres - Densely sampled view spheres - upper half of the view sphere of two toy objects with 2500 images each. (Formats: tiff)
- FG-NET Facial Aging Database - Database contains 1002 face images showing subjects at different ages. (Formats: jpg)
- German Fingerspelling Database - The database contains 35 gestures and consists of 1400 image sequences that contain gestures of 20 different persons recorded under non-uniform daylight lighting conditions. (Formats: mpg,jpg)
- Groningen Natural Image Database - 4000+ 1536x1024 (16 bit) calibrated outdoor images (Formats: homebrew)
- ICG Testhouse sequence - 2 turntable sequences from different viewing heights, 36 images each, resolution 1000x750, color (Formats: PPM)
- IEN Image Library - 1000+ images, mostly outdoor sequences (Formats: raw, ppm)
- INRIA's Syntim images database - 15 color images of simple objects (Formats: gif)
- INRIA's Syntim stereo databases - 34 calibrated color stereo pairs (Formats: gif)
- Image Analysis Laboratory - Images obtained from a variety of imaging modalities -- raw CFA images, range images and a host of "medical images". (Formats: homebrew)
- Image Database - An image database including some textures
- JAFFE Facial Expression Image Database - The JAFFE database consists of 213 images of Japanese female subjects posing 6 basic facial expressions as well as a neutral pose. Ratings on emotion adjectives are also available, free of charge, for research purposes. (Formats: TIFF Grayscale images.)
- ATR Research, Kyoto, Japan
- Machine Vision - Images from the textbook by Jain, Kasturi, Schunck (20+ images) (Formats: GIF TIFF)
- Mammography Image Databases - 100 or more images of mammograms with ground truth. Additional images available by request, and links to several other mammography databases are provided. (Formats: homebrew)
- Middlebury Stereo Data Sets with Ground Truth - Six multi-frame stereo data sets of scenes containing planar regions. Each data set contains 9 color images and subpixel-accuracy ground-truth data. (Formats: ppm)
- Middlebury Stereo Vision Research Page - Middlebury College
- Modis Airborne simulator, Gallery and data set - High Altitude Imagery from around the world for environmental modeling in support of NASA EOS program (Formats: JPG and HDF)
- National Design Repository - Over 55,000 3D CAD and solid models of (mostly) mechanical/machined engineering designs. (Formats: gif,vrml,wrl,stp,sat)
- Geometric & Intelligent Computing Laboratory
- Otago Optical Flow Evaluation Sequences - Synthetic and real sequences with machine-readable ground truth optical flow fields, plus tools to generate ground truth for new sequences. (Formats: ppm,tif,homebrew)
- SEQUENCES FOR OPTICAL FLOW ANALYSIS (SOFA) - 9 synthetic sequences designed for testing motion analysis applications, including full ground truth of motion and camera parameters. (Formats: gif)
- Computer Vision Group
- Stereo Images with Ground Truth Disparity and Occlusion - a small set of synthetic images of a hallway with varying amounts of noise added. Use these images to benchmark your stereo algorithm. (Formats: raw, viff (khoros), or tiff)
- Stuttgart Range Image Database - A collection of synthetic range images taken from high-resolution polygonal models available on the web (Formats: homebrew)
- The RVL SPEC-DB (SPECularity DataBase) - A collection of over 300 real images of 100 objects taken under three different illumination conditions (Diffuse/Ambient/Directed). Use these images to test algorithms for detecting and compensating for specular highlights in color images. (Formats: TIFF)
- Robot Vision Laboratory
- The Xm2vts database - The XM2VTSDB contains four digital recordings of 295 people taken over a period of four months. This database contains both image and video data of faces.
- Centre for Vision, Speech and Signal Processing
- Traffic Image Sequences and 'Marbled Block' Sequence - thousands of frames of digitized traffic image sequences as well as the 'Marbled Block' sequence (grayscale images) (Formats: GIF)
- IAKS/KOGS
- U Oulu wood and knots database - Includes classifications - 1000+ color images (Formats: ppm)
- UCID - an Uncompressed Colour Image Database - a benchmark database for image retrieval with predefined ground truth. (Formats: tiff)
- UMass Vision Image Archive - Large image database with aerial, space, stereo, medical images and more. (Formats: homebrew)
- USF Range Image Data with Segmentation Ground Truth - 80 image sets (Formats: Sun rasterimage)
- University of Oulu Physics-based Face Database - contains color images of faces under different illuminants and camera calibration conditions as well as skin spectral reflectance measurements of each person.
- Machine Vision and Media Processing Unit
- University of Oulu Texture Database - Database of 320 surface textures, each captured under three illuminants, six spatial resolutions and nine rotation angles. A set of test suites is also provided so that texture segmentation, classification, and retrieval algorithms can be tested in a standard manner. (Formats: bmp, ras, xv)
- Machine Vision Group
- View Sphere Database - Images of 8 objects seen from many different view points. The view sphere is sampled using a geodesic with 172 images/sphere. Two sets for training and testing are available. (Formats: ppm)
- PRIMA, GRAVIR
- Wiry Object Recognition Database - Thousands of images of a cart, ladder, stool, bicycle, chairs, and cluttered scenes with ground truth labelings of edges and regions. (Formats: jpg)
- 3D Vision Group
- data.gov - The home of the U.S. Government's open data
- Yale Face Database - 165 images (15 individuals) with different lighting, expression, and occlusion configurations.
- Yale Face Database B - 5760 single light source images of 10 subjects each seen under 576 viewing conditions (9 poses x 64 illumination conditions). (Formats: PGM)
- Center for Computational Vision and Control
- Quora's Big Datasets Answer
- grouplens.org
- hadoopilluminated.com
- usgovxml.com
- Houston Data Portal
- Kaggle Data Sources
- A list of useful sources
- CMU PIE Database - A database of 41,368 face images of 68 people captured under 13 poses, 43 illumination conditions, and with 4 different expressions.
- Fashion-MNIST - MNIST-like fashion product dataset consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes.
- Open Images dataset - Open Images is a dataset of ~9 million URLs to images that have been annotated with labels spanning over 6000 categories.
- quandl.com - Get the data you need in the form you want; instant download, API or direct to your app.
- DeepMind QA Corpus - Textual QA corpus from CNN and DailyMail. More than 300K documents in total. [Paper](http://arxiv.org/abs/1506.03340) for reference.
- MNIST
- Google House Numbers
- Image Analysis and Computer Graphics
- Content-based image retrieval database - 11 sets of color images for testing algorithms for content-based retrieval. Most sets have a description file with names of objects in each image. (Formats: jpg)
- University of Minnesota Vision Lab
- FVC2000 Fingerprint Databases - FVC2000 is the First International Competition for Fingerprint Verification Algorithms. Four fingerprint databases constitute the FVC2000 benchmark (3520 fingerprints in all).
- Biometric Systems Lab - University of Bologna
- Language Processing and Pattern Recognition
- OSU (MSU) 3D Object Model Database - several sets of 3D object models collected over several years to use in object recognition research (Formats: homebrew, vrml)
- OSU (MSU/WSU) Range Image Database - Hundreds of real and synthetic images (Formats: gif, homebrew)
- Vision Research Group
- Photometric 3D Surface Texture Database - This is the first 3D texture database which provides both full real surface rotations and registered photometric stereo data (30 textures, 1680 images). (Formats: TIFF)
- The MIT-CSAIL Database of Objects and Scenes - Database for testing multiclass object detection and scene recognition algorithms. Over 72,000 images with 2873 annotated frames. More than 50 annotated object classes. (Formats: jpg)
- Visual Object Classes Challenge 2012 (VOC2012) - VOC2012 dataset containing 12k images with 20 annotated classes for object detection and segmentation.
- Large-scale Fashion (DeepFashion) Database - Contains over 800,000 diverse fashion images. Each image in this dataset is labeled with 50 categories, 1,000 descriptive attributes, bounding box and clothing landmarks
- FakeNewsCorpus - Contains about 10 million news articles classified using [opensources.co](http://opensources.co) types
- Digital Embryos - Digital embryos are novel objects which may be used to develop and test object recognition systems. They have an organic appearance. (Formats: various formats are available on request)
- Tiny Images
- Computational Colour Constancy Data - A dataset oriented towards computational color constancy, but useful for computer vision in general. It includes synthetic data, camera sensor data, and over 700 images. (Formats: tiff)
- databib.org
- NYC Taxi data
- Computational Vision Lab
- The AR Face Database - Contains over 4,000 color images corresponding to 126 people's faces (70 men and 56 women). Frontal views with variations in facial expressions, illumination, and occlusions. (Formats: RAW (RGB 24-bit))
- OSU/SAMPL Database: Range Images, 3D Models, Stills, Motion Sequences - Over 1000 range images, 3D object models, still images and motion sequences (Formats: gif, ppm, vrml, homebrew)
- Signal Analysis and Machine Perception Laboratory
- World Bank Data
- Institute of Computer Graphics and Vision
- Image Analysis Laboratory
- El Salvador Atlas of Gastrointestinal VideoEndoscopy - High-resolution images and videos from gastrointestinal video endoscopy studies. (Formats: jpg, mpg, gif)
- Purdue Robot Vision Lab
- LIMSI-CNRS/CHM/IMM/vision
- LIMSI-CNRS
- Flickr 8k
- Department Image Understanding
- enigma.com - Navigate the world of public data - Quickly search and analyze billions of public records published by governments, companies and organizations.
- datacite.org
- A Deep Catalog of Human Genetic Variation
-
Finance Related Datasets
- NASDAQ
- BIS Statistics - BIS statistics, compiled in cooperation with central
- Blockmodo Coin Registry - A registry of JSON formatted information files
- CBOE Futures Exchange
- Complete FAANG Stock data - This data set contains all the stock data
- OANDA
- Yahoo Finance
-
ML Math
- Computational Linear Algebra
- Introduction to Linear Algebra
- Linear Algebra - Hefferon
- Deep Learning Math
- CS229 Notes on Linear Algebra
- AWS Math for Machine Learning
- Essential Math for Machine Learning - Python Edition
-
Best Deep Learning Courses
- CMU
- Machine Learning, Data Science and Deep Learning with Python
- CUHK
- Theano Tutorial
- UFLDL Tutorial 1
- Deep Learning with R in Motion
- Grokking Deep Learning in Motion
- Deep Learning from the Bottom up
- UFLDL Tutorial 2
- CMU
- Berkeley
- Berkeley
- A Deep Learning Tutorial: From Perceptrons to Deep Networks
- Berkeley
-
Best Courses
- Deep RL Bootcamp
- Andrew Ng Machine Learning Course
- Udacity Deep Learning
- Bay Area Deep Learning School Day 1 2016
- Stanford CS 231N - CNNs
- Stanford CS 224D - Deep Learning for NLP
- David Silver's Reinforcement Learning Course
- Stanford CS 229 - Pretty much the same as the Coursera course
- Short MIT Intro to DL Course
- Intro to Neural Nets and ML (Univ of Toronto)
- CMU Neural Networks for NLP
- Caltech CS 156 - Machine Learning
- Berkeley EE 227C - Convex Optimization
- UC Berkeley Kaggle Decal
-
Blogs and other Community based resources
- openai.com/
- distill
- Machine Learning (Theory)
- machinelearningmastery.com/blog/
- KD Nuggets
- Data Elixir - News and resources for data science practitioners
- Most Viewed Machine Learning writers
- William Chen's Answers
- Data Science FAQs on Quora
- Machine Learning FAQs on Quora
- I Am Trask
- Dataconomy Home - Dataconomy
- colah
- AWS Machine Learning Blog
- Machine Learning Blog, ML@CMU, Carnegie Mellon University
- MIT Machine Learning
- Andrew Ng
- Storytelling with Statistics
- Towards Data Science
- Facebook AI
- Data Science Topic on Quora
- Michael Hochster's Answers
- Ricardo Vladimiro's Answers
- Great Machine Learnings Blog
- Microsoft Machine Learning Blog
- News on Artificial Intelligence and Machine Learning
- Machine Learning | Latest News, Photos & Videos | WIRED
- Analytics India Magazine | Artificial Intelligence & Data Science stories
- Blogs on Artificial Intelligence, Machine Learning, Python & Data Science
- Applied AI for Business: Industry Newsletter for Executives Applying Machine Learning to Enterprises - TOPBOTS
- The Week in Data – The ODI
- Import AI
- Yannic Kilcher
- Towards AI — Multidisciplinary Science Journal – Medium
- Newest 'machine-learning' Questions - Stack Overflow
- Weekly topics - Dataquest Community
- Reddit - Machine Learning
- Reddit - Learn Machine Learning
- Reddit - Deep Learning
- Reddit - A Powerful Machine Intelligence Library
- Reddit GoogleColabNotebooks
- Amazon Science Blog
- BAIR - Berkeley Artificial Intelligence Research
- Data Science Community Newsletter – NYU Center for Data Science
- algorithmia.com/blog
-
Most Important Deep Learning Papers
- Dropout
- GloVe
- WaveNet
- ZFNet
- VGGNet
- AlphaGo
- Atari DQN
- Mixture of Experts
- Population Based Training of NN's
- Word2Vec
- Seq2Seq
- Speech Recognition
- Adversarial Images
- SqueezeNet
- Overcoming Catastrophic Forgetting in NNs
- AlexNet
- Hidden Technical Debt in ML Systems
- MobileNets
- Memory Networks
- Escaping Saddle Points Efficiently
- Attention is All You Need
- Batch Norm
- R-CNN
- Synthetic Gradients
- A3C
- Rethinking Generalization
- EBGAN
- Style Transfer
- Dynamic Coattention Networks
- ReLU
- Xavier Initialization
- Saddle Points and Non-convexity of Neural Networks
- Quasi-Recurrent Neural Networks
- Unsupervised Machine Translation with Monolingual Corpora
- Learned Index Structures
- Visualizing Loss Landscapes
- Influence Functions
- Wasserstein GAN
- Relational Networks
- Fast R-CNN
- ResNet
- Dynamic Routing Between Capsules
- DenseNet
- Generative Adversarial Networks
- GoogLeNet
- Spatial Transformer Networks
- DCGAN
- Neural Turing Machines
- Gradient Descent by Gradient Descent
- Densely Connected CNNs
- Pixel RNN
- Convolutional Seq2Seq Learning
- Adam
- Learning from Imbalanced Data
-
Top Machine Learning Podcasts
- O’Reilly Data Show Podcast – O’Reilly
- Making Data Simple (Spotify)
- The TWIML AI Podcast, formerly This Week in Machine Learning & Artificial Intelligence (Spotify)
- Introduction · Data Skeptic
- Data Stories (Spotify)
- DataFramed (Spotify)
- Hunting for the Higgs · Linear Digressions
- Not So Standard Deviations (Spotify)
- Data Engineering Podcast (Spotify)
- SuperDataScience (Spotify)
- Data Science at Home (Spotify)
- The Digital Analytics Power Hour (Spotify)
- Home - HumAIn Podcast
- HumAIn Podcast - Artificial Intelligence, Data Science, and Developer Education (Spotify)
- Let's Reflect · Talking Machines
- Lex Fridman Podcast | Artificial Intelligence (AI) (Spotify)
- AI Today Podcast: The 2020 State of AI – Interview with Wilson Pang, CTO at Appen · AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
- AI in Business (Spotify)
- Upskilling from Home · Data Crunch Corporation
-
Project Ideas for deep learning and general machine learning
-
Natural Language Project Ideas
- Quora question pairs
- Toxic comments on Kaggle
- new speeches in Obama's style
- personal website
- Essays with human graded scores
- NCERT books - 12/school students in India, [NarrativeQA by Google DeepMind](https://github.com/deepmind/narrativeqa) and [SQuAD by Stanford](https://rajpurkar.github.io/SQuAD-explorer/)
- Facebook's FAIR
- @realDonaldTrump
- English Transcript of Modi speeches
- Chat-bot architecture
- Reddit Dataset
-
Forecasting Project Ideas
-
Recommendation systems Project Ideas
-
Kaggle Strategies and skills
- LightGBM for distributed and faster training.
- XGBoost : Gradient boosted decision trees.
- cudf
- parquet
- Methods to tackle class imbalance
- Synthetic Minority Oversampling Technique
- shuffle for augmentation
- synthetic samples in the dataset
- Signal denoising
- missing data
- encoding techniques for categorical data
- model to predict missing values
- shuffling of data
- data loading with pandas
- reduce the size of data by 70%
- reducing the size of some attributes
- Dask to read and manipulate the data
- feather
- optimizing RAM
- malware detection
- EDA for malware detection
- EDA for home credit loan prediction
- VSB Power Line Fault Detection
- encoding cross validation
- handle categories
- cyclic features for deep learning
- feature engineering methods
- using featuretools
- microsoft malware detection
- feature extraction.
- RAPIDS framework.
- processing features using LGBM.
- Principal component analysis
- frequency features
- Aggregate time series features
- Home default risk competition.
- Santander Transaction Prediction.
- features selection using sklearn.
- Permutation feature importance.
- Adversarial feature validation.
- null importances.
- Random forest classifier.
- Naive bayes classifier.
- Gaussian naive bayes model.
- LGBM + CNN model used in 3rd place solution of Santander Customer Transaction Prediction
- Knowledge distillation in Neural Network.
- Follow the regularized leader method.
- LGB boosting methods
- NN + focal loss experiment.
- Keras NN with timeseries splitter.
- 5th place NN architecture with code for Santander Transaction prediction.
- Tree explainer using SHAP.
- EDA for Santander prediction
- different train and test distribution.
- normalize with sklearn.
- CatBoost to handle categorical data.
-
Interview-Related-Links
-
Genetic Algorithms
- Genetic Algorithms Wikipedia Page
- Genetic Programming
- Genetic Algorithms vs Genetic Programming (Quora)
- Genetic Programming in Python (GitHub)
- Genetic Algorithms Explained in Plain English
-
Kaggle Competitions WriteUp
-
Linear Regression Tutorials
- Is linear regression valid when the dependent variable is not normally distributed?
- Dummy Variable Trap | Multicollinearity
- Dealing with multicollinearity using VIFs
- Elastic Net - [Regularization and Variable Selection via the
- Simple Linear Regression
- Assumptions of Linear Regression
- Coursera Course - Linear Regression with One Variable
- Linear Regression for Machine Learning
- Applying and Interpreting Linear Regression
-
Logistic Regression Tutorials
- Logistic Regression Wiki
- Geometric Intuition of Logistic Regression
- Coursera Course - Logistic Regression and Classification
- Logistic Regression - An Introduction
- Stanford Logistic Regression Overview
- Siraj Raval Logistic Regression Tutorial
- Guide to an in-depth understanding of logistic regression
-
If you are new to Data Science
- <img src="https://cloud.githubusercontent.com/assets/182906/19517857/604f88d8-960c-11e6-97d6-16c9738cb824.png" width="150" />
- <img src="http://i.imgur.com/rb9ruaa.png" width="150" />
- <img src="http://i.imgur.com/0OoLaa5.png" width="150" />
- <img src="http://i.imgur.com/W2t2Roz.png" width="150" />
- <img src="http://i.imgur.com/XBgKF2l.png" width="150" />
- <img src="http://i.imgur.com/l9ZGtal.jpg" width="150" />
- <img src="http://i.imgur.com/b9xYdZB.jpg" width="150" /> - by Berkeley Science Review.
- <img src="http://i.imgur.com/TWkB4X6.png" width="150" />
-
List of Most Starred Github Projects related to Deep Learning
- bert - pre-trained models for BERT
- spleeter
- awesome-scalability - Large-Scale Systems
- tensorflow
- keras
- opencv
- transformers - State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
- julia
- Real-Time-Voice-Cloning
- pytorch
- tesseract
- Virgilio
- Qix
- openpose - Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
- xgboost
- faceswap
- face_recognition
- 100-Days-Of-ML-Code
- basics
-
Natural Language Processing (NLP)
-
Statistics
-
Inspirational Stories of people breaking into Machine Learning and Data-Science
- Brian Ross - from Carpenter to Data-Engineer
- Krish Naik
- Success Story of Saugata Paul : A career break from Infosys to Data Scientist at SPi Global
- Success Story of Ramesh Vellanki: A non-programmer to Data Scientist @ CitiBank
- Success Story of Sriganesh: From 10+ years @ IT Services to a Data Scientist @ Amazon
- How I Got 4 Data Science Offers and Doubled my Income 2 Months after being Laid Off
- I got 7 job offers during the worst job market in history. Here’s the data
-
Super Large Kaggle Datasets
- Sketches and Strokes from the QuickDraw Game
- GloVe Reddit Comments
- melanoma 2019 orig
- Microsoft Malware Prediction
- LANL Earthquake Prediction
- APTOS 2019 Blindness Detection
- IEEE's Signal Processing Society - Camera Model Identification
- OpenML - Open science datasets for machine learning
- Spoken Language Identification
- COVID19-Engineering-Books-NLP-Dataset
- Lyft 3D Object Detection for Autonomous Vehicles
-
Numpy
- numpy - multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions to operate on these arrays.
- Introduction-to-NumPy
- Understanding-Data-Types
- The-Basics-Of-NumPy-Arrays
- Computation-on-arrays-ufuncs
- Computation-on-arrays-aggregates
- Computation-on-arrays-broadcasting
- Boolean-Arrays-and-Masks
- Fancy-Indexing
- Sorting
- Structured-Data-NumPy
-
K Nearest Neighbors Tutorials
-
Some of the Best Kaggle Competitions for Beginners