The Data Science Toolbox

Algorithms
 Conditional Random Field (CRF)
Deep Learning Packages
 Sonnet
 TRFL
 TensorLight
 Keras
 altair
 addepar
 amcharts
 anychart
 bokeh
 Comet
 slemma
 d3plus
 Data-Driven Documents (D3.js)
 dygraphs
 ECharts
 exhibit
 gephi
 ggplot2
 Glue
 Google Chart Gallery
 Highcharts
 import.io
 jqplot
 Matplotlib
 nvd3
 OpenRefine
 plot.ly
 raw
 Seaborn
 techanjs
 Timeline
 variancecharts
 vida
 Wrangler
 r2d3
 NetworkX
 Redash
 C3
 geomap
 PyTorch
 torchvision
 torchtext
 torchaudio
 ignite
 PyTorchNet
 PyToune
 skorch
 PyVarInf
 pytorch_geometric
 GPyTorch
 pyro
 Catalyst
 pytorch_tabular
 YOLOv3
 YOLOv5
 YOLOv8
 TensorFlow
 TensorLayer
 TFLearn
 tensorpack
 Polyaxon
 NeuPy
 tfdeploy
 TensorFlow Fold
 tensorlm
 Mesh TensorFlow
 Ludwig
 TF-Agents
 TensorForce
 keras-contrib
 Hyperas
 Elephas
 Hera
 Spektral
 qkeras
 keras-rl
 Talos
 cartodb
 Cube
 Resseract Lite
 vizzu
Comparison
 Regression
 Stacking
 Conditional Random Field (CRF)
 datacompy  DataComPy is a package to compare two Pandas DataFrames.
 C4.5
 Latent Dirichlet Allocation (LDA)
 Linear Regression
 Ordinary Least Squares
 Logistic Regression
 Stepwise Regression
 Multivariate Adaptive Regression Splines
 Softmax Regression
 Locally Estimated Scatterplot Smoothing
 Decision Trees
 ID3 algorithm
 Ensemble Learning
 Boosting
 Bagging
 Random Forest
 AdaBoost
 Fuzzy clustering
 Mixture models
 Dimension Reduction
 Neural Networks
 Adaptive resonance theory
 Hidden Markov Models (HMM)
 Q Learning
 SARSA (State-Action-Reward-State-Action) algorithm
 Temporal difference learning
 k-Means
 Apriori
 EM (Expectation-Maximization)
 PageRank
 Naive Bayes
 CART (Classification and Regression Trees)
 Multilayer Perceptron
 Convolutional Neural Network (CNN)
 Recurrent Neural Network (RNN)
 Boltzmann Machines
 Autoencoder
 Generative Adversarial Network (GAN)
 Transformer
 SVM (Support Vector Machine)
 Heuristic approaches
 Density-based clustering
 Self-Organizing Maps
 KNN (K-Nearest Neighbors)
 ML System Designs

General Machine Learning Packages
 scikit-learn
 Shogun
 hyperlearn
 scikit-survival
 scikit-multilearn
 sklearn-expertsys
 scikit-feature
 scikit-rebate
 seqlearn
 sklearn-bayes
 sklearn-crfsuite
 sklearn-deap
 sigopt_sklearn
 sklearn-evaluation
 scikit-image
 scikit-opt
 scikit-posthocs
 pystruct
 xLearn
 cuML
 causalml
 mlpack
 MLxtend
 modAL
 Sparkit-learn
 dlib
 imodels
 RuleFit
 pyGAM
 Deepchecks
 interpretable

Miscellaneous Tools
 Data Science Lifecycle Template Repo
 Neptune.ai  Platform that supports data scientists in creating and sharing machine learning models. Neptune facilitates teamwork, infrastructure management, model comparison, and reproducibility.
 Datalab from Google
 Hortonworks Sandbox
 R
 Tidyverse
 RStudio
 Python  Pandas  Anaconda  Enterprise-ready Python distribution for large-scale data processing, predictive analytics, and scientific computing
 Scikit-Learn
 NumPy  Supports multi-dimensional arrays and matrices and includes an assortment of high-level mathematical functions to operate on these arrays.
 Vaex
 SciPy
 Data Science Toolbox
 Datadog  Solutions, code, and devops for high-scale data science.
 Variance
 Kite Development Kit
 Domino Data Labs
 Apache Flink  Platform for distributed, general-purpose data processing.
 Apache Hama  Top-level Apache open source project, allowing you to do advanced analytics beyond MapReduce.
 Weka
 Octave  High-level interpreted language, primarily intended for numerical computations (a free alternative to MATLAB).
 Apache Spark  fast cluster computing 
 Data Mechanics  Platform that aims to make Apache Spark more developer-friendly and cost-effective.
 Caffe
 Torch
 Aerosolve
 Datawrapper
 TensorFlow
 Natural Language Toolkit
 nlptoolkit for node.js
 Julia  High-level, high-performance dynamic programming language for technical computing
 Apache Zeppelin  Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more
 LightTag
 UBIAI  Easy-to-use text annotation tool for teams with comprehensive auto-annotation features. Supports NER, relations, and document classification, as well as OCR annotation for invoice labeling
 AWS Data Wrangler  Open-source Python package that extends the Pandas library to AWS, connecting DataFrames with AWS data-related services (Amazon Redshift, AWS Glue, Amazon Athena, Amazon EMR, etc).
 Amazon Rekognition
 Amazon Textract
 Amazon Lookout for Vision
 Amazon CodeGuru  ML-powered recommendations for code reviews and application performance.
 Dask
 Statsmodels  Python-based inferential statistics, hypothesis testing and regression framework
 Gensim  Open-source library for topic modeling of natural language text
 spaCy
 DAGsHub
 Deepnote  Jupyter-compatible notebook with real-time collaboration, running in the cloud.
 Valohai
 PyMC3
 PyStan
 hmmlearn
 Nimblebox  Full-stack MLOps platform designed to help data scientists and machine learning practitioners around the world discover, create, and launch multi-cloud apps from their web browser.
 Explore Data Science Libraries
 MLflow
 AutoGluon  AutoML for image, text, tabular, time-series, and multimodal data
 Arize AI  Observability tool for monitoring machine learning models in production and root-causing issues such as data quality and performance drift.
 Aureo.io  Low-code platform that focuses on building artificial intelligence. It lets users create pipelines and automations and integrate them with artificial intelligence models, all with their basic data.
 ERD Lab
 Arize-Phoenix  Uncover insights, surface problems, monitor, and fine-tune your models.
 Synthical  AI-powered collaborative environment for research. Find relevant papers, create collections to manage bibliography, and summarize content, all in one place
 Annotation Lab  End-to-end no-code platform for text annotation and DL model training/tuning. Out-of-the-box support for Named Entity Recognition, Classification, Relation Extraction, and Assertion Status Spark NLP models. Unlimited support for users, teams, projects, documents.


Training Resources

Free Courses
 Data Scientist with R
 Data Scientist with Python
 Genetic Algorithms OCW Course
 Convex Optimization  Convex Optimization (basics of convex analysis; least-squares, linear and quadratic programs, semidefinite programming, minimax, extremal volume, and other problems; optimality conditions, duality theory...)
 Skillcombo  Data Science  1000+ free online Data Science courses
 Learning from Data  Introduction to machine learning covering basic theory, algorithms and applications
 Kaggle  Learn about Data Science, Machine Learning, Python etc
 ML Observability Fundamentals  Learn how to monitor and root-cause production ML issues.
 Weights & Biases Effective MLOps: Model Development  Free course and certification for building an end-to-end machine learning pipeline using W&B
 Python for Data Science by Scaler  This course is designed to give beginners the essential skills to excel in today's data-driven world. The curriculum provides a solid foundation in statistics, programming, data visualization, and machine learning.
 MLSys-NYU-2022  Slides, scripts and materials for the Machine Learning in Finance course at NYU Tandon, 2022.
 Prompt Engineering for Vision Models  Learn to prompt cutting-edge computer vision models with natural language, coordinate points, bounding boxes, segmentation masks, and even other images in this free course from DeepLearning.AI.
 AI Expert Roadmap  Roadmap to becoming an Artificial Intelligence Expert
 Hands-on Train and Deploy ML  A hands-on course to train and deploy a serverless API that predicts crypto prices.
 LLMOps: Building Real-World Applications With Large Language Models  Learn to build modern software with LLMs using the newest tools and techniques in the field.

Tutorials
 1000 Data Science Projects
 How To Label Data
 Your Guide to Latent Dirichlet Allocation
 Over 1000 Data Science Online Courses at Classpert Online Search Engine
 Python for Data Science: A Beginner’s Guide
 12 free Data Science projects to practice Python and Pandas
 #tidytuesday
 Data science your way
 PySpark Cheatsheet
 Tutorials of source code from the book Genetic Algorithms with Python by Clinton Sheppard
 Tutorials to get started on signal processing for machine learning
 Minimum Viable Study Plan for Machine Learning Interviews
 Machine Learning, Data Science and Deep Learning with Python
MOOCs
 Coursera Introduction to Data Science
 Data Science  9 Steps Courses, A Specialization on Coursera
 Data Mining  5 Steps Courses, A Specialization on Coursera
 Machine Learning – 5 Steps Courses, A Specialization on Coursera
 OpenIntro
 CS 171 Visualization
 Process Mining: Data science in Action
 Oxford Deep Learning
 Oxford Deep Learning  video
 Oxford Machine Learning
 UBC Machine Learning  video
 Coursera Big Data Specialization
 Statistical Thinking for Data Science and Analytics by edX
 Cognitive Class AI by IBM
 Udacity  Deep Learning
 Keras in Motion
 Microsoft Professional Program for Data Science
 COMP3222/COMP6246  Machine Learning Technologies
 CS231n  Convolutional Neural Networks for Visual Recognition
 Coursera TensorFlow in Practice
 Coursera Deep Learning Specialization
 365 Data Science Course
 Coursera Natural Language Processing Specialization
 Coursera GAN Specialization
 Codecademy's Data Science
 Linear Algebra  Linear Algebra course by Gilbert Strang
 A 2020 Vision of Linear Algebra (G. Strang)
 Python for Data Science Foundation Course
 Data Science: Statistics & Machine Learning
 Machine Learning Engineering for Production (MLOps)
 Recommender Systems Specialization from University of Minnesota
 Stanford Artificial Intelligence Professional Program
 Programming with Julia
 Scaler Data Science & Machine Learning Program
 CS 109 Data Science
 Data Science Specialization
Intensive Programs

Colleges
 Data Science Degree @ Berkeley
 Data Science Degree @ UVA
 Data Science Degree @ Wisconsin
 BS in Data Science & Applications
 MS in Computer Information Systems @ Boston University
 MS in Applied Data Science @ Syracuse
 M.S. Management & Data Science @ Leuphana
 Master of Data Science @ Melbourne University
 MSc in Data Science @ The University of Edinburgh
 Master of Management Analytics @ Queen's University
 Master of Data Science @ Illinois Institute of Technology
 Master of Applied Data Science @ The University of Michigan
 Master Data Science and Artificial Intelligence @ Eindhoven University of Technology
 Master's Degree in Data Science and Computer Engineering @ University of Granada
 A list of colleges and universities offering degrees in data science.
 MS in Business Analytics @ ASU Online


Literature and Media

Journals, Publications and Magazines
 ICML  International Conference on Machine Learning
 GECCO  The Genetic and Evolutionary Computation Conference (GECCO)
 epjdatascience
 Journal of Data Science  an international journal devoted to applications of statistical methods at large
 Big Data Research
 Journal of Big Data
 Big Data & Society
 Data Science Journal
 datatau.com/news  Like Hacker News, but for data
 Data Science Trello Board
 Medium Data Science Topic  Data Science related publications on medium
Bloggers
 datascopeanalytics
 Wes McKinney  Wes McKinney Archives.
 Matthew Russell  Mining The Social Web.
 Greg Reda  Greg Reda Personal Blog
 Julia Evans  Recurse Center alumna
 Hakan Kardas  Personal Web Page
 Sean J. Taylor  Personal Web Page
 Drew Conway  Personal Web Page
 Hilary Mason  Personal Web Page
 Noah Iliinsky  Personal Blog
 Matt Harrison  Personal Blog
 Vamshi Ambati  All Things Data Science
 Prash Chan  Tech Blog on Master Data Management And Every Buzz Surrounding It
 Clare Corthell  The Open Source Data Science Masters
 Paul Miller
 Data Science London  Non-profit organization dedicated to the free, open dissemination of data science.
 Datawrangling
 Quora Data Science  Data Science Questions and Answers from experts
 Siah
 Machine Learning Mastery
 Daniel Forsyth  Personal Blog
 Data Science Weekly  Weekly News Blog
 Revolution Analytics  Data Science Blog
 R Bloggers  R Bloggers
 The Practical Quant
 Yet Another Data Blog
 Spenczar  Covers the whole data pipeline, from model-building to reporting.
 KD Nuggets
 Meta Brown  Personal Blog
 Data Scientist
 WhatSTheBigData
 Tevfik Kosar  Magnus Notitia
 New Data Scientist
 Harvard Data Science  Thoughts on Statistical Computing and Visualization
 Data Science 101  Learning To Be A Data Scientist
 Kaggle Past Solutions
 Adventures in Data Land
 Learning Lover
 Dataists
 DataMania
 DataMagnum
 P-value  Musings on data science, machine learning, and stats.
 Digital transformation
 Data Mania Blog
 [The File Drawer](https://chrissaid.io/)  Chris Said's science blog
 Emilio Ferrara's web page
 DataNews
 Reddit TextMining
 Periscopic
 Hilary Parker
 Data Science Lab
 Meaning of
 DATA MINERS BLOG
 FlowingData  Visualization and Statistics
 Calculated Risk
 O'Reilly Learning Blog
 Dominodatalab
 i am trask  A Machine Learning Craftsmanship Blog
 Vademecum of Practical Data Science  Handbook and recipes for data-driven solutions of real-world problems
 Dataconomy  A blog on the newly emerging data economy
 Springboard  A blog with resources for data science learners
 Analytics Vidhya  A full-fledged website about data science and analytics study material.
 Occam's Razor  Focused on Web Analytics.
 Data School  Data science tutorials for beginners!
 Colah's Blog  Blog for understanding Neural Networks!
 Sebastian's Blog  Blog for NLP and transfer learning!
 Distill  Dedicated to clear explanations of machine learning!
 Chris Albon's Website  Data Science and AI notes
 Andrew Carr  Data Science with Esoteric programming languages
 floydhub  Blog for Evolutionary Algorithms
 Jingles  Review and extract key concepts from academic papers
 nbshare  Data Science notebooks
 Deep and Shallow  All things Deep and Shallow in Data Science
 Loic Tetrel  Data science blog
 Chip Huyen's Blog  ML Engineering, MLOps, and the use of ML in startups
 Maria Khalusova  Data science blog
 Aditi Rastogi  ML,DL,Data Science blog
 Santiago Basulto  Data Science with Python
 Akhil Soni  ML, DL and Data Science
 datascientistjourney

Books
 Deep Learning Cookbook
 Data Science From Scratch: First Principles with Python
 Artificial Intelligence with Python  Tutorialspoint
 Machine Learning from Scratch
 Probabilistic Machine Learning: An Introduction
 A Comprehensive Guide to Machine Learning
 How to Lead in Data Science  Early Access
 Fighting Churn With Data
 Data Science at Scale with Python and Dask
 The Data Science Handbook: Advice and Insights from 25 Amazing Data Scientists
 Think Like a Data Scientist
 Introducing Data Science
 Practical Data Science with R
 Everyday Data Science
 Exploring Data Science  free eBook sampler
 Exploring the Data Jungle  free eBook sampler
 Classic Computer Science Problems in Python
 Math for Programmers
 R in Action, Third Edition
 Data Science Bookcamp
 Data Science Thinking: The Next Scientific, Technological and Economic Revolution
 Applied Data Science: Lessons Learned for the Data-Driven Business
 The Data Science Handbook
 Essential Natural Language Processing  Early access
 Mining Massive Datasets  free ebook complemented by an online course
 Pandas in Action  Early access
 Genetic Algorithms and Genetic Programming
 Advances in Evolutionary Algorithms  Free Download
 Genetic Programming: New Approaches and Successful Applications  Free Download
 Evolutionary Algorithms  Free Download
 Advances in Genetic Programming, Vol. 3  Free Download
 Global Optimization Algorithms: Theory and Application  Free Download
 Genetic Algorithms and Evolutionary Computation  Free Download
 Convex Optimization  Convex Optimization book by Stephen Boyd  Free Download
 R for Data Science
 Build a Career in Data Science
 Machine Learning Bookcamp  Early access
 Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition
 Effective Data Science Infrastructure
 Practical MLOps: How to Get Ready for Production Models
 Regression, a Friendly guide  Early Access
 Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing
 Data Science at the Command Line: Facing the Future with Time-Tested Tools
 Machine Learning  CIn UFPE
 Machine Learning with Python  Tutorialspoint
 Deep Learning
 Designing Cloud Data Platforms  Early Access
 The Elements of Statistical Learning: Data Mining, Inference, and Prediction
 Deep Learning with PyTorch
 Neural Networks and Deep Learning
 Introduction to Machine Learning with Python
 Artificial Intelligence: Foundations of Computational Agents, 2nd Edition  Free HTML version
 The Quest for Artificial Intelligence: A History of Ideas and Achievements  Free Download
 Graph Algorithms for Data Science  Early Access
 Data Mesh in Action  Early Access
 Regular Expression Puzzles and AI Coding Assistants
 Dive into Deep Learning
 Data for All
 Foundations of Data Science
 Comet for Data Science: Enhance your ability to manage and optimize the life cycle of your data science project
 Software Engineering for Data Scientists  Early Access
 Julia for Data Science  Early Access
 Machine Learning For Absolute Beginners
 Causal Machine Learning
 Managing ML Projects
 Causal Inference for Data Science
 Data Analysis with Python and PySpark
 Causal Inference for Data Science  Early Access
 An Introduction to Statistical Learning  Download Page

Newsletters

Presentations

Podcasts
 AI at Home
 AI Today
 Adversarial Learning
 Becoming a Data Scientist
 Chai time Data Science
 Data Crunch
 Data Engineering Podcast
 Data Science at Home
 Data Science Mixer
 Data Skeptic
 Datacast
 DataFramed
 DataTalks.Club
 Gradient Dissent
 Learning Machines 101
 Let's Data (Brazil)
 Linear Digressions
 Not So Standard Deviations
 O'Reilly Data Show Podcast
 Partially Derivative
 Superdatascience
 The Data Engineering Show
 The Radical AI Podcast
 The Robot Brains Podcast
 What's The Point
 How AI Built This
 Data Stories

YouTube Videos & Channels
 What is machine learning?
 Andrew Ng: Deep Learning, Self-Taught Learning and Unsupervised Feature Learning
 Data36  Data Science for Beginners by Tomi Mester
 Deep Learning: Intelligence from Big Data
 Interview with Google's AI and Deep Learning 'Godfather' Geoffrey Hinton
 Introduction to Deep Learning with Python
 What is machine learning, and how does it work?
 Data School  Data Science Education
 Neural Nets for Newbies by Melanie Warrick (May 2015)
 Neural Networks video series by Hugo Larochelle
 Google DeepMind co-founder Shane Legg  Machine Super Intelligence
 Data Science Primer
 Data Science with Genetic Algorithms
 Data Science for Beginners
 DataTalks.Club
 mlops.community  Interviews of industry experts about production ML
 ML Street Talk  Unabashedly technical and non-commercial, so you will hear no annoying pitches.
 Neural networks from scratch by Sentdex
 Manning Publications YouTube channel
 Ask Dr Chong: How to Lead in Data Science  Part 1
 Ask Dr Chong: How to Lead in Data Science  Part 2
 Ask Dr Chong: How to Lead in Data Science  Part 3
 Ask Dr Chong: How to Lead in Data Science  Part 4
 Ask Dr Chong: How to Lead in Data Science  Part 5
 Ask Dr Chong: How to Lead in Data Science  Part 6
 Regression Models: Applying simple Poisson regression
 Deep Learning Architectures
 Time Series Modelling and Analysis
Real World

Disaster
 deprem-ml  Open-sourced at [afet.org](https://afet.org).


What is Data Science?
 What is Data Science @ O'Reilly
 The sexiest job of the 21st century
 Wikipedia
 How to Become a Data Scientist
 a very short history of #datascience  The term “Data Science” has emerged only recently to specifically designate a new profession that is expected to make sense of the vast stores of big data. But making sense of data has a long history and has been discussed by scientists, statisticians, librarians, computer scientists and others for years. The following timeline traces the evolution of the term “Data Science” and its use, attempts to define it, and related terms.
 Software Development Resources for Data Scientists  Resources for writing production-ready code and tools.
 What is Data Science @ Quora
 Data Scientist Roadmap  In today's data-driven world, approximately 328.77 million terabytes of data are generated daily. This number keeps growing, which increases the demand for skilled data scientists who can use this data to drive business growth.

Where do I Start?
 Python  Python has a vibrant ecosystem of user-generated packages. To install packages, there are two main methods: pip (invoked as `pip install`), the package manager that comes bundled with Python, and [Anaconda](https://www.anaconda.com) (invoked as `conda install`), a package manager that can install packages for Python and R and can also download executables like Git.
 Scikit-Learn  A general-purpose data science package that implements the most popular algorithms. It also includes rich documentation, tutorials, and examples of the models it implements. Even if you prefer to write your own implementations, Scikit-Learn is a valuable reference to the nuts and bolts behind many of the common algorithms you'll find. With [Pandas](https://pandas.pydata.org/), one can collect and analyze data in a convenient table format. [NumPy](https://numpy.org/) provides very fast tooling for mathematical operations, with a focus on vectors and matrices. [Seaborn](https://seaborn.pydata.org/), itself based on the [Matplotlib](https://matplotlib.org/) package, is a quick way to generate beautiful visualizations of your data, with many good defaults available out of the box, as well as a gallery showing how to produce many common visualizations.
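
As a rough illustration of how these packages fit together, here is a minimal sketch. It assumes NumPy, Pandas, and Scikit-Learn are already installed (for example with `pip install numpy pandas scikit-learn`). The synthetic data, the `fit_line` helper, and the column names `x` and `y` are made up for this example and are not part of any of the libraries.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def fit_line(n_points: int = 50, seed: int = 0):
    """Build a small synthetic table and fit a straight line to it.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)

    # NumPy: fast vector math. Generate y = 3x + 2 plus a little noise.
    x = np.linspace(0.0, 10.0, n_points)
    y = 3.0 * x + 2.0 + rng.normal(scale=0.1, size=n_points)

    # Pandas: hold the data in a convenient table format.
    df = pd.DataFrame({"x": x, "y": y})

    # Scikit-Learn: fit an ordinary least squares model on the table.
    model = LinearRegression().fit(df[["x"]], df["y"])
    return df, model


df, model = fit_line()
print(df.describe())
print("slope:", model.coef_[0], "intercept:", model.intercept_)
```

The fitted slope and intercept should land close to the values used to generate the data (3 and 2). From here, a Seaborn or Matplotlib scatter plot of `df` would be a natural next step for inspecting the fit visually.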

Socialize

Data Science Competitions

Slack Communities

Facebook Accounts
 Data
 Big Data Scientist
 Data Science Day
 Data Science Academy
 Facebook Data Science Page
 Data Science London
 Data Science Technology and Corporation
 Data Science  Closed Group
 Center for Data Science
 Big data hadoop NOSQL Hive Hbase
 Analytics, Data Mining, Predictive Modeling, Artificial Intelligence
 Big Data Analytics using R
 Big Data Analytics with R and Hadoop
 Big Data Learnings
 Big Data, Data Science, Data Mining & Statistics
 BigData/Hadoop Expert
 Data Mining / Machine Learning / AI
 Data Mining/Big Data  Social Network Ana
 Vademecum of Practical Data Science
 Veri Bilimi Istanbul
 The Data Science Blog
Twitter Accounts
 Big Data Combine  Rapid-fire, live tryouts for data scientists seeking to monetize their models as trading strategies
 Big Data Science
 Chris Said
 Clare Corthell
 DADI CharlesAbner
 Data Science Central
 Data Science London
 Data Science Renee
 Data Science Report
 Data Science Tips
 Data Vizzard
 DataScienceX
 DJ Patil
 Domino Data Lab
 Drew Conway
 Erin Bartolo  enjoying a love/hate relationship with its hype. @iSchoolSU #DataScience Program Mgr. 
 Greg Reda
 Gregory Piatetsky  Co-founder of KDD and SIGKDD, was Chief Scientist at 2 startups, part-time philosopher.
 Hadley Wickham
 Hakan Kardas
 Hilary Mason
 Jeff Hammerbacher
 John Myles White
 Juan Miguel Lavista
 Julia Evans  Pandas  Data Analyze 
 Kenneth Cukier  author of Big Data (http://www.bigdatabook.com/). 
 Kevin Markham
 Kim Rees
 Sean J. Taylor
 Silvia K. Spiva
 Harsh B. Gupta
 Spencer Nelson
 Kirk Borne
 Luis Rei
 Matt Harrison  Full-stack Python guy, author, instructor, currently playing Data Scientist. Occasional fathering, husbanding, organic gardening.
 Matthew Russell
 Mert Nuhoğlu
 Monica Rogati  Ex-gamer, ex-machine coder; namer.
 Noah Iliinsky
 Paul Miller
 Peter Skomoroch  Principal Data Scientist @LinkedIn. Machine Learning, ProductRei, Networks 
 Prash Chan
 Quora Data Science
 RBloggers
 Rand Hindi
 Randy Olson
 Recep Erol
 Ryan Orban
 Tasos Skarlatidis  source. 
 Talha Oz
 Terry Timko
 Tony Baer
 Tony Ojeda  founder @DataCommunityDC. Founder @DistrictDataLab. #DataScience #BigData #DataDC 
 Vamshi Ambati
 Wes McKinney
 WileyEd  @Seagate Big Data Analytics @McKinsey Alum #BigData + #Analytics Evangelist #Hadoop, #Cloud, #Digital, & #R Enthusiast 
 WNYC Data News Team  Data-driven journalism, making it visual, and showing our work.
 Alexey Grigorev
 İlker Arslan
 INEVITABLE  Start-up company based in England, UK

Telegram Channels

GitHub Groups


Fun

Datasets
 grouplens.org
 National Centers for Environmental Information
 ClimateData.us
 r/datasets
 UC Irvine Machine Learning Repository  contains data sets good for machine learning
 research-quality data sets
 Academic Torrents
 ADS-B Exchange  Specific datasets for aircraft and Automatic Dependent Surveillance-Broadcast (ADS-B) sources.
 A Deep Catalog of Human Genetic Variation
 data.gov  The home of the U.S. Government's open data
 United States Census Bureau
 usgovxml.com
 enigma.com  Navigate the world of public data  Quickly search and analyze billions of public records published by governments, companies and organizations.
 datahub.io
 aws.amazon.com/datasets
 datacite.org
 The official portal for European data
 NASDAQ:DATA  Nasdaq Data Link, a premier source for financial, economic and alternative datasets.
 figshare.com
 GeoLite Legacy Downloadable Databases
 Quora's Big Datasets Answer
 Kaggle Datasets
 Google Public Data
 World Bank Data
 Open Data Philly
 MapLight  provides a variety of data free of charge for uses that are freely available to the general public. Click on a data set below to learn more
 GHDx  Institute for Health Metrics and Evaluation  a catalog of health and demographic datasets from around the world and including IHME results
 St. Louis Federal Reserve Economic Data  FRED
 New Zealand Institute of Economic Research – Data1850
 UNICEF Data
 undata
 NASA Socioeconomic Data and Applications Center  SEDAC
 The GDELT Project
 StackExchange Data Explorer  an open source tool for running arbitrary queries against public data from the Stack Exchange network.
 SocialGrep  a collection of open Reddit datasets.
 San Francisco Government Open Data
 IBM Asset Dataset
 Open data Index
 Public Git Archive
 GHTorrent
 Microsoft Research Open Data
 Open Government Data Platform India
 Google Dataset Search (beta)
 Enron Email Dataset
 IBB Open Portal
 The Humanitarian Data Exchange
 Public Big Data Sets
Infographics
 <img src="https://i.imgur.com/0OoLaa5.png" width="150" />
 <img src="https://cloud.githubusercontent.com/assets/182906/19517857/604f88d8960c11e697d616c9738cb824.png" width="150" />
 <img src="https://i.imgur.com/W2t2Roz.png" width="150" />
 <img src="https://i.imgur.com/rb9ruaa.png" width="150" />
 <img src="https://i.imgur.com/XBgKF2l.png" width="150" />
 <img src="https://i.imgur.com/l9ZGtal.jpg" width="150" />
 <img src="https://i.imgur.com/TWkB4X6.png" width="150" />
 <img src="https://i.imgur.com/gtTlW5I.png" width="150" />
 <img src="https://scikitlearn.org/stable/_static/ml_map.png" width="150" />
 <img src="https://i.imgur.com/3JSyUq1.png" width="150" />
 <img src="https://i.imgur.com/DQqFwwy.png" width="150" />
 <img src="https://www.springboard.com/blog/wpcontent/uploads/2016/03/20160324_springboard_vennDiagram.png" width="150" height="150" />  by Springboard
 <img src="https://dataliteracy.geckoboard.com/assets/img/datafallaciestoavoidpreview.jpg" width="150" alt="Data Fallacies To Avoid" />  A way of teaching your non-data scientist/non-statistician colleagues [how to avoid mistakes with data](https://dataliteracy.geckoboard.com/poster/). From Geckoboard's [Data Literacy Lessons](https://dataliteracy.geckoboard.com/).

Comics


Other Awesome Lists

 awesome-awesomeness
 Awesome Machine Learning
 awesome-python
 awesome-r
 awesome-datasets
 awesome-Machine Learning & Deep Learning Tutorials
 Awesome Data Science Ideas
 Community Curated Data Science Resources
 Awesome Computer Vision Models
 Awesome Game Datasets
 Top Data Science Interview Questions
 Top Future Trends in Data Science in 2023
 How Generative AI Is Changing Creative Work
 What is generative AI?
 Glossary of common statistics and ML terms
 Deep Learning Interview Questions
