https://github.com/drakearch/kaggle-courses
Kaggle courses and tutorials to get you started in the Data Science world.
data-science deep-learning machine-learning pandas python
- Host: GitHub
- URL: https://github.com/drakearch/kaggle-courses
- Owner: drakearch
- License: mit
- Created: 2019-09-05T01:29:45.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2022-04-11T00:04:58.000Z (over 3 years ago)
- Last Synced: 2024-11-08T14:22:56.588Z (12 months ago)
- Topics: data-science, deep-learning, machine-learning, pandas, python
- Language: Jupyter Notebook
- Homepage:
- Size: 8.69 MB
- Stars: 195
- Watchers: 5
- Forks: 63
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-data-science-resources - Kaggle Courses
README
# Kaggle Courses
- [Python](#python)
- [Pandas](#pandas)
- [Data Visualization](#data-visualization)
- [Intro to Machine Learning](#intro-to-machine-learning)
- [Intermediate Machine Learning](#intermediate-machine-learning)
- [Data Cleaning](#data-cleaning)
- [Feature Engineering](#feature-engineering)
- [Feature Engineering (2019)](#feature-engineering-2019)
- [Geospatial Analysis](#geospatial-analysis)
- [Time Series](#time-series)
- [Machine Learning Explainability](#machine-learning-explainability)
- [Intro to AI Ethics](#intro-to-ai-ethics)
- [Intro to Deep Learning](#intro-to-deep-learning)
- [Deep Learning](#deep-learning)
- [Computer Vision](#computer-vision)
- [Natural Language Processing](#natural-language-processing)
- [Intro to Game AI and Reinforcement Learning](#intro-to-game-ai-and-reinforcement-learning)
- [Intro to SQL](#intro-to-sql)
- [Advanced SQL](#advanced-sql)
- [Microchallenges](#microchallenges)
## Python
1. [Hello, Python](python/01-syntax-variables-and-numbers.ipynb)
A quick introduction to Python syntax, variable assignment, and numbers.
2. [Functions and Getting Help](python/02-functions-and-getting-help.ipynb)
Calling functions and defining our own, and using Python's built-in documentation.
3. [Booleans and Conditionals](python/03-booleans-and-conditionals.ipynb)
Using booleans for branching logic.
4. [Lists and Tuples](python/04-lists.ipynb)
Lists and the things you can do with them. Includes indexing, slicing and mutating.
5. [Loops and List Comprehensions](python/05-loops-and-list-comprehensions.ipynb)
For and while loops, and a much-loved Python feature: list comprehensions.
6. [Strings and Dictionaries](python/06-strings-and-dictionaries.ipynb)
Working with strings and dictionaries, two fundamental Python data types.
7. [Working with External Libraries](python/07-working-with-external-libraries.ipynb)
Imports, operator overloading, and survival tips for venturing into the world of external libraries.
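As a small taste of lesson 5, the sketch below builds the same list twice: once with a `for` loop and once with a list comprehension. The variable names are illustrative and are not taken from the course notebooks.

```python
# Squares of the even numbers below 10, built two equivalent ways.
numbers = range(10)

# Explicit for loop.
squares_loop = []
for n in numbers:
    if n % 2 == 0:
        squares_loop.append(n * n)

# The same logic written as a list comprehension.
squares_comp = [n * n for n in numbers if n % 2 == 0]

print(squares_comp)  # [0, 4, 16, 36, 64]
```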
## Pandas
1. [Exercise: Creating, Reading and Writing](pandas/01-creating-reading-and-writing.ipynb)
You can't work with data if you can't read it. Get started here.
2. [Exercise: Indexing, Selecting & Assigning](pandas/02-indexing-selecting-assigning.ipynb)
Pro data scientists do this dozens of times a day. You can, too!
3. [Exercise: Summary Functions and Maps](pandas/03-summary-functions-and-maps.ipynb)
Extract insights from your data.
4. [Exercise: Grouping and Sorting](pandas/04-grouping-and-sorting.ipynb)
Scale up your level of insight. The more complex the dataset, the more this matters.
5. [Exercise: Data Types and Missing Values](pandas/05-data-types-and-missing-values.ipynb)
Deal with the most common progress-blocking problems.
6. [Exercise: Renaming and Combining](pandas/06-renaming-and-combining.ipynb)
Data comes in from many sources. Help it all make sense together.
## Data Visualization
1. [Hello, Seaborn](data_visualization/01-hello-seaborn.ipynb)
Your first introduction to coding for data visualization.
2. [Line Charts](data_visualization/02-line-charts.ipynb)
Visualize trends over time.
3. [Bar Charts and Heatmaps](data_visualization/03-bar-charts-and-heatmaps.ipynb)
Use color or length to compare categories in a dataset.
4. [Scatter Plots](data_visualization/04-scatter-plots.ipynb)
Leverage the coordinate plane to explore relationships between variables.
5. [Distributions](data_visualization/05-distributions.ipynb)
Create histograms and density plots.
6. [Choosing Plot Types and Custom Styles](data_visualization/06-choosing-plot-types-and-custom-styles.ipynb)
Customize your charts and make them look snazzy.
7. [Final Project](data_visualization/07-final-project.ipynb)
Practice for real-world application.
## Intro to Machine Learning
1. [How Models Work](https://www.kaggle.com/dansbecker/how-models-work)
The first step if you're new to machine learning.
2. [Basic Data Exploration](intro_to_machine_learning/02-explore-your-data.ipynb)
Load and understand your data.
3. [Your First Machine Learning Model](intro_to_machine_learning/03-your-first-machine-learning-model.ipynb)
Building your first model. Hurray!
4. [Model Validation](intro_to_machine_learning/04-model-validation.ipynb)
Measure the performance of your model, so you can test and compare alternatives.
5. [Underfitting and Overfitting](intro_to_machine_learning/05-underfitting-and-overfitting.ipynb)
Fine-tune your model for better performance.
6. [Random Forests](intro_to_machine_learning/06-random-forests.ipynb)
Using a more sophisticated machine learning algorithm.
7. [Exercise: Machine Learning Competitions](intro_to_machine_learning/07-machine-learning-competitions.ipynb)
Enter the world of machine learning competitions to keep improving and see your progress.
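The idea behind model validation (lesson 4) can be sketched in plain Python: hold out part of the data, predict it, and score the predictions. The toy prices and the "predict the training mean" baseline below are assumptions made for illustration, not the model or data used in the notebooks.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy targets, split into a training part and a validation part.
prices = [100, 120, 130, 150, 170, 200]
train_y, valid_y = prices[:4], prices[4:]

# Baseline "model": always predict the mean of the training targets.
baseline = sum(train_y) / len(train_y)
valid_pred = [baseline] * len(valid_y)

mae = mean_absolute_error(valid_y, valid_pred)
print(mae)  # 60.0
```

Scoring on rows the model never saw is what makes the number usable for comparing alternatives.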
## Intermediate Machine Learning
1. [Introduction](intermediate_machine_learning/01-introduction.ipynb)
Review what you need for this Micro-Course.
2. [Missing Values](intermediate_machine_learning/02-missing-values.ipynb)
Missing values happen. Be prepared for this common challenge in real datasets.
3. [Categorical Variables](intermediate_machine_learning/03-categorical-variables.ipynb)
There's a lot of non-numeric data out there. Here's how to use it for machine learning.
4. [Pipelines](intermediate_machine_learning/04-pipelines.ipynb)
A critical skill for deploying (and even testing) complex models with pre-processing.
5. [Cross-Validation](intermediate_machine_learning/05-cross-validation.ipynb)
A better way to test your models.
6. [XGBoost](intermediate_machine_learning/06-xgboost.ipynb)
The most accurate modeling technique for structured data.
7. [Data Leakage](intermediate_machine_learning/07-data-leakage.ipynb)
Find and fix this problem that ruins your model in subtle ways.
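To illustrate the cross-validation idea from lesson 5, here is a hand-rolled sketch that splits row indices into k folds so that each row is used for validation exactly once. The helper `k_fold_indices` is hypothetical; in practice a library routine such as scikit-learn's `cross_val_score` would typically handle this.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, valid_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, valid_idx
        start += size

folds = list(k_fold_indices(6, 3))
for train_idx, valid_idx in folds:
    print(train_idx, valid_idx)
```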
## Data Cleaning
1. [Handling Missing Values](data_cleaning/01-handling-missing-values.ipynb)
Drop missing values, or fill them in with an automated workflow.
2. [Scaling and Normalization](data_cleaning/02-scaling-and-normalization.ipynb)
Transform numeric variables to have helpful properties.
3. [Parsing Dates](data_cleaning/03-parsing-dates.ipynb)
Help Python recognize dates as composed of day, month, and year.
4. [Character Encodings](data_cleaning/04-character-encodings.ipynb)
Avoid UnicodeDecodeErrors when loading CSV files.
5. [Inconsistent Data Entry](data_cleaning/05-inconsistent-data-entry.ipynb)
Efficiently fix typos in your data.
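As an example of what date parsing (lesson 3) involves, the sketch below converts date strings into `datetime` objects using only the standard library. The sample strings and their month/day/year layout are assumptions for illustration.

```python
from datetime import datetime

# Hypothetical date strings in month/day/two-digit-year format.
raw_dates = ["01/17/07", "02/28/07", "12/05/07"]

# Tell Python which part is the month, the day, and the year.
parsed = [datetime.strptime(s, "%m/%d/%y") for s in raw_dates]

# Once parsed, individual components are easy to extract.
days = [d.day for d in parsed]
print(days)  # [17, 28, 5]
```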
## Feature Engineering
1. [What Is Feature Engineering](https://www.kaggle.com/code/ryanholbrook/what-is-feature-engineering)
Learn the steps and principles of creating better features.
2. [Mutual Information](feature_engineering/02-mutual-information.ipynb)
Locate features with the most potential.
3. [Creating Features](feature_engineering/03-creating-features.ipynb)
Transform features with Pandas to suit your model.
4. [Clustering With K-Means](feature_engineering/04-clustering-with-k-means.ipynb)
Untangle complex spatial relationships with cluster labels.
5. [Principal Component Analysis](feature_engineering/05-principal-component-analysis.ipynb)
Discover new features by analyzing variation.
6. [Target Encoding](feature_engineering/06-target-encoding.ipynb)
Boost any categorical feature with this powerful technique.
7. [Feature Engineering for House Prices](https://www.kaggle.com/code/ryanholbrook/feature-engineering-for-house-prices)
Apply what you've learned, and join the House Prices competition!
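The core of target encoding (lesson 6) can be sketched without any libraries: replace each category with the average target value observed for it. The rows below are hypothetical, and this bare mean encoding leaves out refinements such as smoothing for rare categories or fitting the encoding on a separate split to limit overfitting.

```python
from collections import defaultdict

# Hypothetical rows of (category, target).
rows = [("ford", 10.0), ("ford", 14.0), ("bmw", 30.0), ("bmw", 34.0), ("kia", 8.0)]

sums = defaultdict(float)
counts = defaultdict(int)
for category, target in rows:
    sums[category] += target
    counts[category] += 1

# Mean encoding: each category maps to its average target.
encoding = {c: sums[c] / counts[c] for c in sums}
print(encoding)  # {'ford': 12.0, 'bmw': 32.0, 'kia': 8.0}
```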
## Feature Engineering (2019)
1. [Baseline Model](feature_engineering_2019/01-baseline-model.ipynb)
Building a baseline model as a starting point for feature engineering.
2. [Categorical Encodings](feature_engineering_2019/02-categorical-encodings.ipynb)
There are many ways to encode categorical data for modeling. Some are pretty clever.
3. [Feature Generation](feature_engineering_2019/03-feature-generation.ipynb)
The frequently useful case where you can combine data from multiple rows into useful features.
4. [Feature Selection](feature_engineering_2019/04-feature-selection.ipynb)
You can make a lot of features. Here's how to get the best set of features for your model.
## Geospatial Analysis
1. [Your First Map](geospatial_analysis/01-your-first-map.ipynb)
Get started with plotting in GeoPandas.
2. [Coordinate Reference Systems](geospatial_analysis/02-coordinate-reference-systems.ipynb)
It's pretty amazing that we can represent the Earth's surface in 2 dimensions!
3. [Interactive Maps](geospatial_analysis/03-interactive-maps.ipynb)
Learn how to make interactive heatmaps, choropleth maps, and more!
4. [Manipulating Geospatial Data](geospatial_analysis/04-manipulating-geospatial-data.ipynb)
Find locations with just the name of a place. And, learn how to join data based on spatial relationships.
5. [Proximity Analysis](geospatial_analysis/05-proximity-analysis.ipynb)
Measure distance, and explore neighboring points on a map.
## Time Series
1. [Linear Regression With Time Series](time_series/01-linear-regression-with-time-series.ipynb)
Use two features unique to time series: lags and time steps.
2. [Trend](time_series/02-trend.ipynb)
Model long-term changes with moving averages and the time dummy.
3. [Seasonality](time_series/03-seasonality.ipynb)
Create indicators and Fourier features to capture periodic change.
4. [Time Series as Features](time_series/04-time-series-as-features.ipynb)
Predict the future from the past with a lag embedding.
5. [Hybrid Models](time_series/05-hybrid-models.ipynb)
Combine the strengths of two forecasters with this powerful technique.
6. [Forecasting With Machine Learning](time_series/06-forecasting-with-machine-learning.ipynb)
Apply ML to any forecasting task with these four strategies.
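The two features named in lesson 1, time steps and lags, can be sketched in a few lines. The series values are made up, and plain lists stand in for the data frames a notebook would normally use.

```python
# Hypothetical observations, oldest first.
series = [3, 5, 4, 6, 8]

# Time-step feature: a simple counter 0, 1, 2, ...
time_step = list(range(len(series)))

# Lag-1 feature: the previous observation (None where no history exists).
lag_1 = [None] + series[:-1]

for t, lag, y in zip(time_step, lag_1, series):
    print(t, lag, y)
```

A regression on `time_step` models dependence on time itself, while a regression on `lag_1` models dependence on the recent past.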
## Machine Learning Explainability
1. [Use Cases for Model Insights](https://www.kaggle.com/dansbecker/use-cases-for-model-insights)
Why and when do you need insights?
2. [Permutation Importance](machine_learning_explainability/02-permutation-importance.ipynb)
What features does your model think are important?
3. [Partial Plots](machine_learning_explainability/03-partial-plots.ipynb)
How does each feature affect your predictions?
4. [SHAP Values](machine_learning_explainability/04-shap-values.ipynb)
Understand individual predictions.
5. [Advanced Uses of SHAP Values](machine_learning_explainability/05-advanced-uses-of-shap-values.ipynb)
Aggregate SHAP values for even more detailed model insights.
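Permutation importance (lesson 2) can be sketched from scratch: shuffle one feature column, re-score the model, and see how much the error grows. The `model` function below is a hypothetical stand-in for a fitted model, deliberately written to ignore the second feature; the data is made up.

```python
import random

def model(row):
    """Hypothetical fitted model: uses feature 0 only, ignores feature 1."""
    return 2.0 * row[0]

def mae(X, y):
    """Mean absolute error of the model on (X, y)."""
    return sum(abs(model(r) - t) for r, t in zip(X, y)) / len(y)

def permutation_importance(X, y, col, seed=0):
    """Increase in error after shuffling a single column."""
    rng = random.Random(seed)
    shuffled = [r[col] for r in X]
    rng.shuffle(shuffled)
    X_perm = [r[:col] + [v] + r[col + 1:] for r, v in zip(X, shuffled)]
    return mae(X_perm, y) - mae(X, y)

X = [[1.0, 9.0], [2.0, 7.0], [3.0, 5.0], [4.0, 3.0]]
y = [2.0 * r[0] for r in X]

imp_used = permutation_importance(X, y, col=0)
imp_ignored = permutation_importance(X, y, col=1)
print(imp_used, imp_ignored)  # the ignored feature always scores 0.0
```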
## Intro to AI Ethics
1. [Introduction to AI Ethics](https://www.kaggle.com/var0101/introduction-to-ai-ethics)
Learn what to expect from the course.
2. [Human-Centered Design for AI](intro_to_ai_ethics/02-human-centered-design-for-ai.ipynb)
Design systems that serve people’s needs. Navigate issues in several real-world scenarios.
3. [Identifying Bias in AI](intro_to_ai_ethics/03-identifying-bias-in-ai.ipynb)
Bias can creep in at any stage in the pipeline. Investigate a simple model that identifies toxic text.
4. [AI Fairness](intro_to_ai_ethics/04-ai-fairness.ipynb)
Learn about four different types of fairness. Assess a toy model trained to judge credit card applications.
5. [Model Cards](intro_to_ai_ethics/05-model-cards.ipynb)
Increase transparency by communicating key information about machine learning models.
## Intro to Deep Learning
1. [A Single Neuron](intro_to_deep_learning/01-a-single-neuron.ipynb)
Learn about linear units, the building blocks of deep learning.
2. [Deep Neural Networks](intro_to_deep_learning/02-deep-neural-networks.ipynb)
Add hidden layers to your network to uncover complex relationships.
3. [Stochastic Gradient Descent](intro_to_deep_learning/03-stochastic-gradient-descent.ipynb)
Use Keras and TensorFlow to train your first neural network.
4. [Overfitting and Underfitting](intro_to_deep_learning/04-overfitting-and-underfitting.ipynb)
Improve performance with extra capacity or early stopping.
5. [Dropout and Batch Normalization](intro_to_deep_learning/05-dropout-and-batch-normalization.ipynb)
Add these special layers to prevent overfitting and stabilize training.
6. [Binary Classification](intro_to_deep_learning/06-binary-classification.ipynb)
Apply deep learning to another common task.
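The linear unit from lesson 1 computes a weighted sum of its inputs plus a bias. A minimal sketch in plain Python follows, with made-up weights and inputs; the notebooks themselves build such units with Keras.

```python
def linear_unit(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias term."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

# Hypothetical unit with two inputs.
weights = [0.5, -1.0]
bias = 1.0

out = linear_unit([2.0, 3.0], weights, bias)  # 0.5*2.0 + (-1.0)*3.0 + 1.0
print(out)  # -1.0
```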
## Deep Learning
1. [Intro to DL for Computer Vision](deep_learning/01-intro-to-dl-for-computer-vision.ipynb)
A quick overview of how models work on images.
2. [Building Models From Convolutions](https://www.kaggle.com/dansbecker/building-models-from-convolutions)
Scale up from simple building blocks to models with beyond-human capabilities.
3. [TensorFlow Programming](deep_learning/03-tensorflow-programming.ipynb)
Start writing code using TensorFlow and Keras.
4. [Transfer Learning](deep_learning/04-transfer-learning.ipynb)
A powerful technique to build highly accurate models even with limited data.
5. [Data Augmentation](deep_learning/05-data-augmentation.ipynb)
Learn a simple trick that effectively increases the amount of data available for model training.
6. [A Deeper Understanding of Deep Learning](https://www.kaggle.com/dansbecker/a-deeper-understanding-of-deep-learning)
How Stochastic Gradient Descent and Back-Propagation train your deep learning model.
7. [Deep Learning From Scratch](deep_learning/07-deep-learning-from-scratch.ipynb)
Build models without transfer learning. Especially important for uncommon image types.
8. [Dropout and Strides for Larger Models](deep_learning/08-dropout-and-strides-for-larger-models.ipynb)
Make your models faster and reduce overfitting.
## Computer Vision
1. [The Convolutional Classifier](computer_vision/01-the-convolutional-classifier.ipynb)
Create your first computer vision model with Keras.
2. [Convolution and ReLU](computer_vision/02-convolution-and-relu.ipynb)
Discover how convnets create features with convolutional layers.
3. [Maximum Pooling](computer_vision/03-maximum-pooling.ipynb)
Learn more about feature extraction with maximum pooling.
4. [The Sliding Window](computer_vision/04-the-sliding-window.ipynb)
Explore two important parameters: stride and padding.
5. [Custom Convnets](computer_vision/05-custom-convnets.ipynb)
Design your own convnet.
6. [Data Augmentation](computer_vision/06-data-augmentation.ipynb)
Boost performance by creating extra training data.
7. [Create Your First Submission](https://www.kaggle.com/ryanholbrook/create-your-first-submission)
Use Kaggle's free TPUs to make a submission to the Petals to the Metal competition!
8. [Getting Started: TPUs + Cassava Leaf Disease](https://www.kaggle.com/jessemostipak/getting-started-tpus-cassava-leaf-disease)
Use Kaggle's free TPUs to make a submission to the Cassava Leaf Disease Classification competition.
## Natural Language Processing
1. [Intro to NLP](natural_language_processing/01-intro-to-nlp.ipynb)
Get started with NLP.
2. [Text Classification](natural_language_processing/02-text-classification.ipynb)
Combine machine learning with your newfound NLP skills.
3. [Word Vectors](natural_language_processing/03-word-vectors.ipynb)
Explore an idea that ushered in a new generation of NLP techniques.
## Intro to Game AI and Reinforcement Learning
1. [Play the Game](intro_to_game_ai_and_reinforcement_learning/01-play-the-game.ipynb)
Write your first game-playing agent.
2. [One-Step Lookahead](intro_to_game_ai_and_reinforcement_learning/02-one-step-lookahead.ipynb)
Make your agent smarter with a few simple changes.
3. [N-Step Lookahead](intro_to_game_ai_and_reinforcement_learning/03-n-step-lookahead.ipynb)
Use the minimax algorithm to dramatically improve your agent.
4. [Deep Reinforcement Learning](intro_to_game_ai_and_reinforcement_learning/04-deep-reinforcement-learning.ipynb)
Explore advanced techniques for creating intelligent agents.
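The minimax algorithm mentioned in lesson 3 can be sketched on a tiny hand-made game tree. The nested-list tree format and the scores are assumptions for illustration; the notebooks apply the idea to a real game environment instead.

```python
def minimax(node, maximizing):
    """Return the minimax value of a game tree given as nested lists.

    Leaves are numeric scores from the maximizing player's point of view.
    """
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Hypothetical two-ply tree: our move, then the opponent's reply.
tree = [[3, 5], [2, 9]]
best = minimax(tree, maximizing=True)
print(best)  # 3
```

The second branch contains the highest leaf (9), but the opponent would answer with 2, so the first branch is the better choice.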
## Intro to SQL
1. [Getting Started With SQL and BigQuery](intro_to_sql/01-getting-started-with-sql-and-bigquery.ipynb)
Learn the workflow for handling big datasets with BigQuery and SQL.
2. [Select, From & Where](intro_to_sql/02-select-from-where.ipynb)
The foundational components for all SQL queries.
3. [Group By, Having & Count](intro_to_sql/03-group-by-having-count.ipynb)
Get more interesting insights directly from your SQL queries.
4. [Order By](intro_to_sql/04-order-by.ipynb)
Order your results to focus on the most important data for your use case.
5. [As & With](intro_to_sql/05-as-with.ipynb)
Organize your query for better readability. This becomes especially important for complex queries.
6. [Joining Data](intro_to_sql/06-joining-data.ipynb)
Combine data sources. Critical for almost all real-world data problems.
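The clauses covered in lessons 2 to 4 combine naturally in a single query. The sketch below uses Python's built-in `sqlite3` module as a local stand-in for BigQuery, which the course itself uses; the `comments` table and its rows are hypothetical.

```python
import sqlite3

# In-memory database with a small hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (author TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?)",
    [("ana", 5), ("ana", 3), ("bob", 1), ("cy", 4), ("cy", 2), ("cy", 7)],
)

# Count comments per author, keep authors with more than one, most active first.
query = """
    SELECT author, COUNT(*) AS num_comments
    FROM comments
    GROUP BY author
    HAVING COUNT(*) > 1
    ORDER BY num_comments DESC
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('cy', 3), ('ana', 2)]
conn.close()
```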
## Advanced SQL
1. [JOINs and UNIONs](advanced_sql/01-joins-and-unions.ipynb)
Combine information from multiple tables.
2. [Analytic Functions](advanced_sql/02-analytic-functions.ipynb)
Perform complex calculations on groups of rows.
3. [Nested and Repeated Data](advanced_sql/03-nested-and-repeated-data.ipynb)
Learn to query complex datatypes in BigQuery.
4. [Writing Efficient Queries](advanced_sql/04-writing-efficient-queries.ipynb)
Write queries to run faster and use less data.
## Microchallenges
1. [Blackjack Microchallenge](microchallenges/01-blackjack-microchallenge.ipynb)
Test your logic and programming skills by building a better Blackjack player.
2. [Airline Price Optimization Micro-Challenge](microchallenges/02-airline-price-optimization-micro-challenge.ipynb)
Can you set the best airfare prices in our Airline Sales simulator?