https://github.com/dbriane208/python-for-data-science
Machine Learning and Data Science repository. Love crafting Machine Learning models.
data-analysis data-science data-visualization machine-learning numpy pandas python seaborn
Last synced: 12 months ago
- Host: GitHub
- URL: https://github.com/dbriane208/python-for-data-science
- Owner: Dbriane208
- Created: 2023-05-29T14:49:07.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-06-29T12:03:00.000Z (over 1 year ago)
- Last Synced: 2024-06-29T13:25:36.490Z (over 1 year ago)
- Topics: data-analysis, data-science, data-visualization, machine-learning, numpy, pandas, python, seaborn
- Language: Jupyter Notebook
- Homepage:
- Size: 21.9 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Python-for-Data-Science
Machine Learning and Data Science repository. This repository contains machine learning projects ranging from beginner to advanced. By the end, you will understand how to build an end-to-end machine learning project using Python and Jupyter Notebook.
# Steps followed in creating the project
1. Form a hypothesis
2. Find and explore the data
3. Do data pre-processing
4. Do data visualization
5. Train the model
# Installation
To follow along with the projects, you need the following tools on your device:
1. Python 3+ installed locally
2. Python packages
- pandas
- NumPy
- Seaborn
- Matplotlib
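
A quick way to confirm the packages listed above are available is sketched below. It uses only the standard library; the helper name `missing_packages` and the suggested `pip` command are illustrative assumptions, not part of this repository.

```python
import importlib.util

# Import names of the packages listed in this README.
REQUIRED = ["pandas", "numpy", "seaborn", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
        # Assumes pip is the installer in use; adapt for conda or similar.
        print("Try: python -m pip install " + " ".join(missing))
    else:
        print("All required packages are available.")
```

Run it with the same interpreter that Jupyter uses, since packages installed for one Python environment are not visible from another.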
# The Data Science and Machine Learning workflow
The data science workflow consists of:
1. Scope of the project: This clearly states the aim of the project and the direction it will take.
2. Gathering Data: A project is only as strong as its underlying data, which is the foundation of the analysis. Data sources include files, databases, websites, and APIs.
3. Cleaning Data: Cleaning is important for producing accurate and reliable results. Data cleaning tasks include:
- Correcting data types
- Imputing missing data
- Dealing with inconsistent data
- Reformatting the data
4. Exploring Data: Exploratory data analysis (EDA) helps you understand and visualize the data you're working with. EDA tasks may include:
- Slicing and dicing the data
- Summarizing the data
- Visualizing the data
5. Modeling Data: Involves structuring and preparing data for specific modeling techniques and applying models to make predictions. Data modeling tasks include:
- Restructuring the data
- Feature engineering (adding new fields)
- Applying Machine Learning Algorithms
6. Sharing Insights: Involves summarizing your key findings and sharing insights with end users and stakeholders. Tasks include:
- Reiterate the problem
- Interpret the results of your analysis
- Share recommendations and insights
- Focus on potential impact rather than technical details
- Deploy the model or put it into production
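
The cleaning, exploring, and modeling steps of the workflow can be sketched on a tiny table as follows. This is a minimal illustration under stated assumptions, not the code used in the notebooks: the `raw` data, the column names, and the helper functions are all hypothetical, and a NumPy least-squares line fit stands in for whatever machine learning algorithm a given project applies.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data, invented purely for illustration.
raw = pd.DataFrame({
    "city": ["Nairobi", "nairobi ", "Mombasa", "Mombasa", "Kisumu", "Kisumu"],
    "area_sqm": ["50", "80", "60", None, "40", "90"],
    "price": [100.0, 160.0, 150.0, 225.0, 60.0, 135.0],
})


def clean(df):
    """Cleaning: correct data types, impute missing data, fix inconsistent labels."""
    out = df.copy()
    # Correct the data type: the areas arrive as strings.
    out["area_sqm"] = pd.to_numeric(out["area_sqm"])
    # Impute missing data with the column median (one simple choice among many).
    out["area_sqm"] = out["area_sqm"].fillna(out["area_sqm"].median())
    # Deal with inconsistent data: stray whitespace and mixed casing.
    out["city"] = out["city"].str.strip().str.title()
    return out


def explore(df):
    """Exploring: slice by city and summarize the price of each group."""
    return df.groupby("city")["price"].agg(["count", "mean"])


def add_features(df):
    """Feature engineering: add a new field derived from existing ones."""
    out = df.copy()
    out["price_per_sqm"] = out["price"] / out["area_sqm"]
    return out


def fit_line(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    design = np.column_stack([x, np.ones(len(x))])
    (slope, intercept), *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(slope), float(intercept)


if __name__ == "__main__":
    cleaned = clean(raw)
    print(explore(cleaned))
    features = add_features(cleaned)
    slope, intercept = fit_line(features["area_sqm"], features["price"])
    print(f"price ~ {slope:.2f} * area_sqm + {intercept:.2f}")
```

Each helper returns a new DataFrame instead of modifying its input, so the steps can be rerun in any order inside a notebook. For the visualization task, a plot such as Seaborn's `scatterplot` of `area_sqm` against `price` would be a natural addition after `clean`.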