https://github.com/felixcharotte/ibm_datascience_capstone
In this project, we predicted if the SpaceX Falcon 9 first stage will land successfully by following the data science methodology. We also summarized the results for the business stakeholders.
- Host: GitHub
- URL: https://github.com/felixcharotte/ibm_datascience_capstone
- Owner: FelixCharotte
- Created: 2025-03-02T18:37:13.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-03-12T21:25:07.000Z (7 months ago)
- Last Synced: 2025-06-04T21:24:29.619Z (4 months ago)
- Topics: analysis, data-analysis, data-science, data-visualization, databases, folium, jupyter-notebook, machine-learning, machine-learning-alrgorithms, matplotlib, pandas, plotly, plotly-dash, python, scikit-learn, scipy, seaborn, sql
- Language: Jupyter Notebook
- Homepage:
- Size: 20.5 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# IBM Applied Data Science Capstone
## 📄 Summary
This capstone project aims to **predict whether the SpaceX Falcon 9 first stage will land successfully**. The full report can be found [here](https://github.com/FelixCharotte/IBM_DataScience_Capstone/blob/b7c9bf0e404447cb190498c6ebe3083a6ddd2eee/IBM%20Data%20Science%20Capstone%20Project%202025.pdf).
### Context and Business Understanding
- SpaceX launches Falcon 9 rockets at a cost of around $62m. This is considerably cheaper than other providers (which usually charge upwards of $165m), and much of the saving comes from SpaceX being able to land and then reuse the first stage of the rocket.
- If we can predict whether the first stage will land, we can estimate the cost of a launch and use this information to assess whether a competing company should bid against SpaceX for a rocket launch.
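
The link between a landing prediction and a launch cost can be illustrated with a small expected-cost calculation. This is only a sketch: the $62m and $165m figures come from the text above, but using $165m as the price of a launch whose first stage is not recovered is an assumption made purely for illustration, and the function name `expected_launch_cost` is hypothetical rather than part of the project.

```python
def expected_launch_cost(p_landing, cost_reused=62.0, cost_expended=165.0):
    """Return an expected launch cost in millions of USD.

    p_landing     -- predicted probability that the first stage lands
    cost_reused   -- cost when the stage is recovered (figure from the text)
    cost_expended -- cost when it is not (ASSUMPTION: competitor figure reused)
    """
    if not 0.0 <= p_landing <= 1.0:
        raise ValueError("p_landing must be between 0 and 1")
    # Weight each scenario's cost by its predicted probability.
    return p_landing * cost_reused + (1.0 - p_landing) * cost_expended
```

Under these assumptions, a higher predicted landing probability pulls the estimate towards the reusable-launch price, which is the intuition behind using the classifier to inform a bid.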
## 📑 Main Topics
This project follows these steps:
1. [Data Collection](https://github.com/FelixCharotte/IBM_DataScience_Capstone/tree/b7c9bf0e404447cb190498c6ebe3083a6ddd2eee/01.%20Data%20Collection)
- Making GET requests to the SpaceX REST API
- Web Scraping
2. [Data Wrangling](https://github.com/FelixCharotte/IBM_DataScience_Capstone/tree/b7c9bf0e404447cb190498c6ebe3083a6ddd2eee/02.%20Data%20Wrangling)
- Using the `.fillna()` method to replace NaN values
- Using the `.value_counts()` method to determine the following:
  - Number of launches on each site
  - Number and occurrence of each orbit
  - Number and occurrence of mission outcome per orbit type
- Creating a landing outcome label that shows the following:
  - 0 when the booster did not land successfully
  - 1 when the booster did land successfully
3. [Exploratory Data Analysis](https://github.com/FelixCharotte/IBM_DataScience_Capstone/tree/b7c9bf0e404447cb190498c6ebe3083a6ddd2eee/03.%20Exploratory%20Data%20Analysis)
- Using SQL queries to manipulate and evaluate the SpaceX dataset
- Using Pandas and Matplotlib to visualize relationships between variables, and determine patterns
4. [Interactive Visual Analytics](https://github.com/FelixCharotte/IBM_DataScience_Capstone/tree/b7c9bf0e404447cb190498c6ebe3083a6ddd2eee/04.%20Interactive%20Visual%20Analytics)
- Geospatial analytics using Folium
- Creating an interactive dashboard using Plotly Dash
5. [Predictive Analysis (Classification)](https://github.com/FelixCharotte/IBM_DataScience_Capstone/tree/b7c9bf0e404447cb190498c6ebe3083a6ddd2eee/05.%20Predicitve%20Analysis%20(Classification))
- Using Scikit-Learn to:
  - Pre-process (standardize) the data
  - Split the data into training and testing data using `train_test_split`
  - Train different classification models
  - Find hyperparameters using `GridSearchCV`
- Plotting confusion matrices for each classification model
- Assessing the accuracy of each classification model
## 🔑 Key Skills Learned/Used
- Using data science methodologies to define and formulate a real-world business problem
- Using data analysis and data visualisation to load a dataset, clean it, and extract insights from it
- Interactive dashboard development with Plotly Dash
- Interactive map development using Folium
- Using machine learning to build a predictive model to help a business function more efficiently
- Structuring and building a data-findings report
## 🏆 Certificates
To verify the certificates, click the images to follow the links.

Acknowledgement to DanielBarnes18
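
As an illustration of the Data Wrangling step listed under Main Topics, the sketch below applies `.fillna()` and `.value_counts()` and builds a 0/1 landing label with pandas. The sample rows, the column names (`LaunchSite`, `Orbit`, `PayloadMass`, `Outcome`, `Class`), the mean-fill strategy, and the rule that a successful outcome starts with `"True"` are all assumptions made for this example; the project notebooks may differ.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample; column names and values are hypothetical.
launches = pd.DataFrame({
    "LaunchSite": ["Site A", "Site A", "Site B", "Site B", "Site A"],
    "Orbit": ["LEO", "GTO", "LEO", "ISS", "LEO"],
    "PayloadMass": [500.0, np.nan, 3100.0, 2300.0, np.nan],
    "Outcome": ["True Ocean", "False Ocean", "True RTLS", "None None", "True ASDS"],
})

# Replace missing payload masses with the column mean (one possible strategy).
mean_mass = launches["PayloadMass"].mean()
launches["PayloadMass"] = launches["PayloadMass"].fillna(mean_mass)

# Count launches per site and occurrences of each orbit.
launches_per_site = launches["LaunchSite"].value_counts()
orbit_counts = launches["Orbit"].value_counts()

# Landing label: 1 for a successful landing, 0 otherwise.
# ASSUMPTION: successful outcomes are the strings starting with "True".
launches["Class"] = launches["Outcome"].str.startswith("True").astype(int)
```

Filling rather than dropping rows keeps every launch in the dataset, which matters when the sample is small.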
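
The Predictive Analysis step listed under Main Topics can be sketched in the same spirit with scikit-learn. The data here is synthetic (`make_classification`), and the choice of logistic regression, the `C` grid, the 80/20 split, and 5-fold cross-validation are assumptions for illustration only, not the project's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the launch features and landing labels.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Standardize: fit the scaler on the training rows, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)

# Search a small, hypothetical grid of regularization strengths with 5-fold CV.
search = GridSearchCV(
    LogisticRegression(max_iter=1000), {"C": [0.01, 0.1, 1.0]}, cv=5
)
search.fit(X_train_std, y_train)

# Evaluate the best model on the held-out data.
test_accuracy = search.score(X_test_std, y_test)
cm = confusion_matrix(y_test, search.predict(X_test_std))
```

Here the scaler is fitted on the training split only, a common way to keep test-set statistics out of training; the project may order these steps differently. Other classifiers can be swapped in by changing the estimator and its parameter grid.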