https://github.com/drakearch/kaggle-courses

Kaggle courses and tutorials to get you started in the Data Science world.

data-science deep-learning machine-learning pandas python


# Kaggle Courses

- [Python](#python)
- [Pandas](#pandas)
- [Data Visualization](#data-visualization)
- [Intro to Machine Learning](#intro-to-machine-learning)
- [Intermediate Machine Learning](#intermediate-machine-learning)
- [Data Cleaning](#data-cleaning)
- [Feature Engineering](#feature-engineering)
- [Feature Engineering (2019)](#feature-engineering-2019)
- [Geospatial Analysis](#geospatial-analysis)
- [Time Series](#time-series)
- [Machine Learning Explainability](#machine-learning-explainability)
- [Intro to AI Ethics](#intro-to-ai-ethics)
- [Intro to Deep Learning](#intro-to-deep-learning)
- [Deep Learning](#deep-learning)
- [Computer Vision](#computer-vision)
- [Natural Language Processing](#natural-language-processing)
- [Intro to Game AI and Reinforcement Learning](#intro-to-game-ai-and-reinforcement-learning)
- [Intro to SQL](#intro-to-sql)
- [Advanced SQL](#advanced-sql)
- [Microchallenges](#microchallenges)

## Python

1. [Hello, Python](python/01-syntax-variables-and-numbers.ipynb)
A quick introduction to Python syntax, variable assignment, and numbers.

2. [Functions and Getting Help](python/02-functions-and-getting-help.ipynb)
Calling functions and defining our own, and using Python's builtin documentation.

3. [Booleans and Conditionals](python/03-booleans-and-conditionals.ipynb)
Using booleans for branching logic.

4. [Lists and Tuples](python/04-lists.ipynb)
Lists and the things you can do with them. Includes indexing, slicing and mutating.

5. [Loops and List Comprehensions](python/05-loops-and-list-comprehensions.ipynb)
For and while loops, and a much-loved Python feature: list comprehensions.

6. [Strings and Dictionaries](python/06-strings-and-dictionaries.ipynb)
Working with strings and dictionaries, two fundamental Python data types.

7. [Working with External Libraries](python/07-working-with-external-libraries.ipynb)
Imports, operator overloading, and survival tips for venturing into the world of external libraries.
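
As a small taste of the list comprehensions covered in lesson 5, the sketch below writes the same filter-and-transform step twice, once as a loop and once as a comprehension. The `planets` list is made up for illustration and is not taken from the notebooks.

```python
# Hypothetical data: a short list of planet names.
planets = ["Mercury", "Venus", "Earth", "Mars"]

# Loop version: build the list step by step.
short_loop = []
for planet in planets:
    if len(planet) < 6:
        short_loop.append(planet.upper())

# List comprehension: the same filter-and-transform in one expression.
short_comp = [planet.upper() for planet in planets if len(planet) < 6]

print(short_comp)
```

Both versions produce the same list; the comprehension simply keeps the filter and the transformation on one line.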

## Pandas

1. [Exercise: Creating, Reading and Writing](pandas/01-creating-reading-and-writing.ipynb)
You can't work with data if you can't read it. Get started here.

2. [Exercise: Indexing, Selecting & Assigning](pandas/02-indexing-selecting-assigning.ipynb)
Pro data scientists do this dozens of times a day. You can, too!

3. [Exercise: Summary Functions and Maps](pandas/03-summary-functions-and-maps.ipynb)
Extract insights from your data.

4. [Exercise: Grouping and Sorting](pandas/04-grouping-and-sorting.ipynb)
Scale up your level of insight. The more complex the dataset, the more this matters.

5. [Exercise: Data Types and Missing Values](pandas/05-data-types-and-missing-values.ipynb)
Deal with the most common progress-blocking problems.

6. [Exercise: Renaming and Combining](pandas/06-renaming-and-combining.ipynb)
Data comes in from many sources. Help it all make sense together.

## Data Visualization

1. [Hello, Seaborn](data_visualization/01-hello-seaborn.ipynb)
Your first introduction to coding for data visualization.

2. [Line Charts](data_visualization/02-line-charts.ipynb)
Visualize trends over time.

3. [Bar Charts and Heatmaps](data_visualization/03-bar-charts-and-heatmaps.ipynb)
Use color or length to compare categories in a dataset.

4. [Scatter Plots](data_visualization/04-scatter-plots.ipynb)
Leverage the coordinate plane to explore relationships between variables.

5. [Distributions](data_visualization/05-distributions.ipynb)
Create histograms and density plots.

6. [Choosing Plot Types and Custom Styles](data_visualization/06-choosing-plot-types-and-custom-styles.ipynb)
Customize your charts and make them look snazzy.

7. [Final Project](data_visualization/07-final-project.ipynb)
Practice for real-world application.

## Intro to Machine Learning

1. [How Models Work](https://www.kaggle.com/dansbecker/how-models-work)
The first step if you're new to machine learning.

2. [Basic Data Exploration](intro_to_machine_learning/02-explore-your-data.ipynb)
Load and understand your data.

3. [Your First Machine Learning Model](intro_to_machine_learning/03-your-first-machine-learning-model.ipynb)
Building your first model. Hurray!

4. [Model Validation](intro_to_machine_learning/04-model-validation.ipynb)
Measure the performance of your model, so you can test and compare alternatives.

5. [Underfitting and Overfitting](intro_to_machine_learning/05-underfitting-and-overfitting.ipynb)
Fine-tune your model for better performance.

6. [Random Forests](intro_to_machine_learning/06-random-forests.ipynb)
Using a more sophisticated machine learning algorithm.

7. [Exercise: Machine Learning Competitions](intro_to_machine_learning/07-machine-learning-competitions.ipynb)
Enter the world of machine learning competitions to keep improving and see your progress.
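
To make the idea of model validation concrete, here is a minimal sketch that scores a deliberately naive model on held-out rows using mean absolute error. The prices, the split, and the "always predict the training mean" model are assumptions chosen for illustration; the notebooks themselves may use different data and tooling.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical house prices (in thousands), split into training and validation rows.
train_prices = [200, 250, 300, 350]
valid_prices = [220, 330]

# A deliberately naive "model": always predict the mean training price.
baseline = sum(train_prices) / len(train_prices)
valid_predictions = [baseline] * len(valid_prices)

# Score the model only on rows it did not see during "training".
mae = mean_absolute_error(valid_prices, valid_predictions)
print(f"Validation MAE: {mae:.1f}")
```

Any candidate model can be compared against this baseline by computing the same metric on the same validation rows.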

## Intermediate Machine Learning

1. [Introduction](intermediate_machine_learning/01-introduction.ipynb)
Review what you need for this Micro-Course.

2. [Missing Values](intermediate_machine_learning/02-missing-values.ipynb)
Missing values happen. Be prepared for this common challenge in real datasets.

3. [Categorical Variables](intermediate_machine_learning/03-categorical-variables.ipynb)
There's a lot of non-numeric data out there. Here's how to use it for machine learning.

4. [Pipelines](intermediate_machine_learning/04-pipelines.ipynb)
A critical skill for deploying (and even testing) complex models with pre-processing.

5. [Cross-Validation](intermediate_machine_learning/05-cross-validation.ipynb)
A better way to test your models.

6. [XGBoost](intermediate_machine_learning/06-xgboost.ipynb)
The most accurate modeling technique for structured data.

7. [Data Leakage](intermediate_machine_learning/07-data-leakage.ipynb)
Find and fix this problem that ruins your model in subtle ways.

## Data Cleaning

1. [Handling Missing Values](data_cleaning/01-handling-missing-values.ipynb)
Drop missing values, or fill them in with an automated workflow.

2. [Scaling and Normalization](data_cleaning/02-scaling-and-normalization.ipynb)
Transform numeric variables to have helpful properties.

3. [Parsing Dates](data_cleaning/03-parsing-dates.ipynb)
Help Python recognize dates as composed of day, month, and year.

4. [Character Encodings](data_cleaning/04-character-encodings.ipynb)
Avoid UnicodeDecodeErrors when loading CSV files.

5. [Inconsistent Data Entry](data_cleaning/05-inconsistent-data-entry.ipynb)
Efficiently fix typos in your data.
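
The date-parsing lesson is about telling Python which part of a string is the day, month, and year. A minimal sketch using only the standard library follows; the sample strings and the month/day/year format are assumptions, and the notebook may rely on a different library for the same job.

```python
from datetime import datetime

# Hypothetical raw date strings in month/day/year order.
raw_dates = ["01/02/2007", "12/31/2007"]

# Tell Python explicitly which part is the month, the day, and the year.
parsed = [datetime.strptime(text, "%m/%d/%Y") for text in raw_dates]

for date in parsed:
    print(date.year, date.month, date.day)
```

Stating the format explicitly avoids ambiguity for strings such as `01/02/2007`, which could otherwise be read as either January 2 or February 1.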

## Feature Engineering

1. [What Is Feature Engineering](https://www.kaggle.com/code/ryanholbrook/what-is-feature-engineering)
Learn the steps and principles of creating better features.

2. [Mutual Information](feature_engineering/02-mutual-information.ipynb)
Locate features with the most potential.

3. [Creating Features](feature_engineering/03-creating-features.ipynb)
Transform features with Pandas to suit your model.

4. [Clustering With K-Means](feature_engineering/04-clustering-with-k-means.ipynb)
Untangle complex spatial relationships with cluster labels.

5. [Principal Component Analysis](feature_engineering/05-principal-component-analysis.ipynb)
Discover new features by analyzing variation.

6. [Target Encoding](feature_engineering/06-target-encoding.ipynb)
Boost any categorical feature with this powerful technique.

7. [Feature Engineering for House Prices](https://www.kaggle.com/code/ryanholbrook/feature-engineering-for-house-prices)
Apply what you've learned, and join the House Prices competition!

## Feature Engineering (2019)

1. [Baseline Model](feature_engineering_2019/01-baseline-model.ipynb)
Building a baseline model as a starting point for feature engineering.

2. [Categorical Encodings](feature_engineering_2019/02-categorical-encodings.ipynb)
There are many ways to encode categorical data for modeling. Some are pretty clever.

3. [Feature Generation](feature_engineering_2019/03-feature-generation.ipynb)
The frequently useful case where you can combine data from multiple rows into useful features.

4. [Feature Selection](feature_engineering_2019/04-feature-selection.ipynb)
You can make a lot of features. Here's how to get the best set of features for your model.

## Geospatial Analysis

1. [Your First Map](geospatial_analysis/01-your-first-map.ipynb)
Get started with plotting in GeoPandas.

2. [Coordinate Reference Systems](geospatial_analysis/02-coordinate-reference-systems.ipynb)
It's pretty amazing that we can represent the Earth's surface in 2 dimensions!

3. [Interactive Maps](geospatial_analysis/03-interactive-maps.ipynb)
Learn how to make interactive heatmaps, choropleth maps, and more!

4. [Manipulating Geospatial Data](geospatial_analysis/04-manipulating-geospatial-data.ipynb)
Find locations with just the name of a place. And, learn how to join data based on spatial relationships.

5. [Proximity Analysis](geospatial_analysis/05-proximity-analysis.ipynb)
Measure distance, and explore neighboring points on a map.

## Time Series

1. [Linear Regression With Time Series](time_series/01-linear-regression-with-time-series.ipynb)
Use two features unique to time series: lags and time steps.

2. [Trend](time_series/02-trend.ipynb)
Model long-term changes with moving averages and the time dummy.

3. [Seasonality](time_series/03-seasonality.ipynb)
Create indicators and Fourier features to capture periodic change.

4. [Time Series as Features](time_series/04-time-series-as-features.ipynb)
Predict the future from the past with a lag embedding.

5. [Hybrid Models](time_series/05-hybrid-models.ipynb)
Combine the strengths of two forecasters with this powerful technique.

6. [Forecasting With Machine Learning](time_series/06-forecasting-with-machine-learning.ipynb)
Apply ML to any forecasting task with these four strategies.
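
The two features named in the first lesson, time steps and lags, can be sketched in plain Python. The `sales` values below are hypothetical, and this is only an illustration of the idea rather than the notebooks' own code.

```python
# Hypothetical daily sales series.
sales = [10, 12, 15, 14, 18]

# Time-step feature: counts observations from the start of the series.
time_steps = list(range(len(sales)))

# Lag-1 feature: the previous observation; the first row has no predecessor.
lag_1 = [None] + sales[:-1]

# Each row pairs the two features with the target value for that day.
rows = list(zip(time_steps, lag_1, sales))
for row in rows:
    print(row)
```

A regression on `time_steps` models trend over time, while a regression on `lag_1` models how each value depends on the one before it.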

## Machine Learning Explainability

1. [Use Cases for Model Insights](https://www.kaggle.com/dansbecker/use-cases-for-model-insights)
Why and when do you need insights?

2. [Permutation Importance](machine_learning_explainability/02-permutation-importance.ipynb)
What features does your model think are important?

3. [Partial Plots](machine_learning_explainability/03-partial-plots.ipynb)
How does each feature affect your predictions?

4. [SHAP Values](machine_learning_explainability/04-shap-values.ipynb)
Understand individual predictions.

5. [Advanced Uses of SHAP Values](machine_learning_explainability/05-advanced-uses-of-shap-values.ipynb)
Aggregate SHAP values for even more detailed model insights.

## Intro to AI Ethics

1. [Introduction to AI Ethics](https://www.kaggle.com/var0101/introduction-to-ai-ethics)
Learn what to expect from the course.

2. [Human-Centered Design for AI](intro_to_ai_ethics/02-human-centered-design-for-ai.ipynb)
Design systems that serve people’s needs. Navigate issues in several real-world scenarios.

3. [Identifying Bias in AI](intro_to_ai_ethics/03-identifying-bias-in-ai.ipynb)
Bias can creep in at any stage in the pipeline. Investigate a simple model that identifies toxic text.

4. [AI Fairness](intro_to_ai_ethics/04-ai-fairness.ipynb)
Learn about four different types of fairness. Assess a toy model trained to judge credit card applications.

5. [Model Cards](intro_to_ai_ethics/05-model-cards.ipynb)
Increase transparency by communicating key information about machine learning models.

## Intro to Deep Learning

1. [A Single Neuron](intro_to_deep_learning/01-a-single-neuron.ipynb)
Learn about linear units, the building blocks of deep learning.

2. [Deep Neural Networks](intro_to_deep_learning/02-deep-neural-networks.ipynb)
Add hidden layers to your network to uncover complex relationships.

3. [Stochastic Gradient Descent](intro_to_deep_learning/03-stochastic-gradient-descent.ipynb)
Use Keras and TensorFlow to train your first neural network.

4. [Overfitting and Underfitting](intro_to_deep_learning/04-overfitting-and-underfitting.ipynb)
Improve performance with extra capacity or early stopping.

5. [Dropout and Batch Normalization](intro_to_deep_learning/05-dropout-and-batch-normalization.ipynb)
Add these special layers to prevent overfitting and stabilize training.

6. [Binary Classification](intro_to_deep_learning/06-binary-classification.ipynb)
Apply deep learning to another common task.

## Deep Learning

1. [Intro to DL for Computer Vision](deep_learning/01-intro-to-dl-for-computer-vision.ipynb)
A quick overview of how models work on images.

2. [Building Models From Convolutions](https://www.kaggle.com/dansbecker/building-models-from-convolutions)
Scale up from simple building blocks to models with beyond-human capabilities.

3. [TensorFlow Programming](deep_learning/03-tensorflow-programming.ipynb)
Start writing code using TensorFlow and Keras.

4. [Transfer Learning](deep_learning/04-transfer-learning.ipynb)
A powerful technique to build highly accurate models even with limited data.

5. [Data Augmentation](deep_learning/05-data-augmentation.ipynb)
Learn a simple trick that effectively increases the amount of data available for model training.

6. [A Deeper Understanding of Deep Learning](https://www.kaggle.com/dansbecker/a-deeper-understanding-of-deep-learning)
How Stochastic Gradient Descent and Back-Propagation train your deep learning model.

7. [Deep Learning From Scratch](deep_learning/07-deep-learning-from-scratch.ipynb)
Build models without transfer learning. Especially important for uncommon image types.

8. [Dropout and Strides for Larger Models](deep_learning/08-dropout-and-strides-for-larger-models.ipynb)
Make your models faster and reduce overfitting.

## Computer Vision

1. [The Convolutional Classifier](computer_vision/01-the-convolutional-classifier.ipynb)
Create your first computer vision model with Keras.

2. [Convolution and ReLU](computer_vision/02-convolution-and-relu.ipynb)
Discover how convnets create features with convolutional layers.

3. [Maximum Pooling](computer_vision/03-maximum-pooling.ipynb)
Learn more about feature extraction with maximum pooling.

4. [The Sliding Window](computer_vision/04-the-sliding-window.ipynb)
Explore two important parameters: stride and padding.

5. [Custom Convnets](computer_vision/05-custom-convnets.ipynb)
Design your own convnet.

6. [Data Augmentation](computer_vision/06-data-augmentation.ipynb)
Boost performance by creating extra training data.

7. [Create Your First Submission](https://www.kaggle.com/ryanholbrook/create-your-first-submission)
Use Kaggle's free TPUs to make a submission to the Petals to the Metal competition!

8. [Getting Started: TPUs + Cassava Leaf Disease](https://www.kaggle.com/jessemostipak/getting-started-tpus-cassava-leaf-disease)
Use Kaggle's free TPUs to make a submission to the Cassava Leaf Disease Classification competition.

## Natural Language Processing

1. [Intro to NLP](natural_language_processing/01-intro-to-nlp.ipynb)
Get started with NLP.

2. [Text Classification](natural_language_processing/02-text-classification.ipynb)
Combine machine learning with your newfound NLP skills.

3. [Word Vectors](natural_language_processing/03-word-vectors.ipynb)
Explore an idea that ushered in a new generation of NLP techniques.

## Intro to Game AI and Reinforcement Learning

1. [Play the Game](intro_to_game_ai_and_reinforcement_learning/01-play-the-game.ipynb)
Write your first game-playing agent.

2. [One-Step Lookahead](intro_to_game_ai_and_reinforcement_learning/02-one-step-lookahead.ipynb)
Make your agent smarter with a few simple changes.

3. [N-Step Lookahead](intro_to_game_ai_and_reinforcement_learning/03-n-step-lookahead.ipynb)
Use the minimax algorithm to dramatically improve your agent.

4. [Deep Reinforcement Learning](intro_to_game_ai_and_reinforcement_learning/04-deep-reinforcement-learning.ipynb)
Explore advanced techniques for creating intelligent agents.
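
The N-step lookahead lesson is built around the minimax algorithm. The sketch below applies minimax to a much simpler game than the one used in the course: a hypothetical stone-taking game in which players alternately remove one to three stones and whoever takes the last stone wins. The game, the function names, and the scoring are assumptions made to keep the example short.

```python
def minimax(stones, depth, maximizing):
    """Score a position from the maximizing player's point of view."""
    if stones == 0:
        # The previous player took the last stone and won.
        return -1 if maximizing else 1
    if depth == 0:
        return 0  # Search horizon reached: treat the position as neutral.
    scores = [
        minimax(stones - take, depth - 1, not maximizing)
        for take in (1, 2, 3)
        if take <= stones
    ]
    return max(scores) if maximizing else min(scores)


def best_move(stones, depth):
    """Pick the move whose resulting position scores highest."""
    moves = [take for take in (1, 2, 3) if take <= stones]
    return max(moves, key=lambda take: minimax(stones - take, depth - 1, False))


print(best_move(5, depth=4))
```

With five stones the agent takes one, leaving the opponent with four stones, a position from which every reply hands the agent a winning move.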

## Intro to SQL

1. [Getting Started With SQL and BigQuery](intro_to_sql/01-getting-started-with-sql-and-bigquery.ipynb)
Learn the workflow for handling big datasets with BigQuery and SQL.

2. [Select, From & Where](intro_to_sql/02-select-from-where.ipynb)
The foundational components for all SQL queries.

3. [Group By, Having & Count](intro_to_sql/03-group-by-having-count.ipynb)
Get more interesting insights directly from your SQL queries.

4. [Order By](intro_to_sql/04-order-by.ipynb)
Order your results to focus on the most important data for your use case.

5. [As & With](intro_to_sql/05-as-with.ipynb)
Organize your query for better readability. This becomes especially important for complex queries.

6. [Joining Data](intro_to_sql/06-joining-data.ipynb)
Combine data sources. Critical for almost all real-world data problems.
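
The clauses covered in these lessons can be tried without a BigQuery account. The sketch below swaps in Python's built-in `sqlite3` module and a made-up `pets` table; the SQL dialect differs from BigQuery in places, so treat it as an illustration of `GROUP BY`, `HAVING`, and `ORDER BY` rather than a copy of the course workflow.

```python
import sqlite3

# In-memory SQLite database with a hypothetical table of pet records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (name TEXT, animal TEXT)")
conn.executemany(
    "INSERT INTO pets VALUES (?, ?)",
    [("Moon", "Dog"), ("Ripley", "Cat"), ("Napoleon", "Cat"), ("Rex", "Rabbit")],
)

# Count pets per animal, keep groups with more than one row, and sort by animal.
query = """
    SELECT animal, COUNT(*) AS num_pets
    FROM pets
    GROUP BY animal
    HAVING COUNT(*) > 1
    ORDER BY animal
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

Only the `Cat` group survives the `HAVING` filter, because it is the only animal with more than one row.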

## Advanced SQL

1. [JOINs and UNIONs](advanced_sql/01-joins-and-unions.ipynb)
Combine information from multiple tables.

2. [Analytic Functions](advanced_sql/02-analytic-functions.ipynb)
Perform complex calculations on groups of rows.

3. [Nested and Repeated Data](advanced_sql/03-nested-and-repeated-data.ipynb)
Learn to query complex datatypes in BigQuery.

4. [Writing Efficient Queries](advanced_sql/04-writing-efficient-queries.ipynb)
Write queries to run faster and use less data.

## Microchallenges

1. [Blackjack Microchallenge](microchallenges/01-blackjack-microchallenge.ipynb)
Test your logic and programming skills by building a better Blackjack player.

2. [Airline Price Optimization Micro-Challenge](microchallenges/02-airline-price-optimization-micro-challenge.ipynb)
Can you set the best airfare prices in our Airline Sales simulator?