https://github.com/dbriane208/python-for-data-science

Machine Learning and Data Science repository. Love crafting Machine Learning models.

data-analysis data-science data-visualization machine-learning numpy pandas python seaborn

# Python-for-Data-Science
Machine Learning and Data Science repository. This repository contains machine learning projects ranging from beginner to advanced level. By the end, you will understand how to build an end-to-end machine learning project using Python and Jupyter Notebook.

# Steps followed in creating the project
1. Form a hypothesis
2. Find and explore the data
3. Do data pre-processing
4. Do data visualization
5. Train the model
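
As an illustration of step 4, the following is a minimal sketch of a scatter plot drawn with Seaborn. The data, the column names `hours_studied` and `exam_score`, and the helper `plot_relationship` are made up for this example and are not taken from the projects in this repository.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up illustrative data (an assumption, not a dataset from this repository).
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "exam_score": [52, 55, 61, 70, 74, 83],
})


def plot_relationship(data, x, y):
    """Draw a scatter plot of column y against column x and return the Axes."""
    fig, ax = plt.subplots(figsize=(6, 4))
    sns.scatterplot(data=data, x=x, y=y, ax=ax)
    ax.set_title(f"{y} vs {x}")
    return ax


ax = plot_relationship(df, "hours_studied", "exam_score")
```

In a Jupyter notebook the figure is normally displayed below the cell. In a plain script, calling `plt.show()` or `ax.figure.savefig(...)` would be needed to see it.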

# Installation
To follow along with the projects, you need the following tools on your device:
1. Python 3 or later, installed locally
2. Python packages
- Pandas
- NumPy
- Seaborn
- Matplotlib
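
One way to confirm that the packages listed above are available is a small check like the sketch below. The helper name `check_packages` is hypothetical. Any package reported as missing can usually be installed with `pip install pandas numpy seaborn matplotlib`, assuming `pip` is the package manager in use.

```python
from importlib.util import find_spec

# Import names of the packages listed above (all lowercase when imported).
REQUIRED = ["pandas", "numpy", "seaborn", "matplotlib"]


def check_packages(names=REQUIRED):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, ok in check_packages().items():
        print(f"{name}: {'installed' if ok else 'missing'}")
```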

# The Data Science and Machine Learning workflow
The data science workflow consists of:
1. Scope of the project: This states the aim of the project and the direction it will take.
2. Gathering Data: A project is only as strong as its underlying data, which is the foundation of the analysis. Data sources include files, databases, websites, and APIs.
3. Cleaning Data: Clean data is important for producing accurate and reliable results. Data cleaning tasks include:
- Correcting data types
- Imputing missing data
- Dealing with inconsistent data
- Reformatting the data
4. Exploring Data: Exploratory data analysis (EDA) helps you understand and visualize the data you're working with. EDA tasks may include:
- Slicing and dicing the data
- Summarizing the data
- Visualizing the data
5. Modeling Data: Involves structuring and preparing data for specific modeling techniques and applying models to make predictions. Data modeling tasks include:
- Restructuring the data
- Feature engineering (adding new fields)
- Applying Machine Learning Algorithms
6. Sharing Insights: Involves summarizing your key findings and sharing insights with end users and stakeholders. Tasks include:
- Reiterating the problem
- Interpreting the results of your analysis
- Sharing recommendations and insights
- Focusing on potential impact rather than technical details
- Deploying the model or putting it into production
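
The cleaning, exploring, and modeling steps above can be sketched with Pandas and NumPy as follows. Everything in this example is an assumption made for illustration: the data is made up, the column names `region`, `size_sqm`, and `price` are hypothetical, and a straight-line fit with `np.polyfit` stands in for a Machine Learning algorithm. The projects in this repository may use different data and models.

```python
import numpy as np
import pandas as pd

# Made-up raw data with typical problems: numbers stored as text,
# a missing value, and inconsistent category labels.
raw = pd.DataFrame({
    "region": ["North", "north ", "South", "South", "East"],
    "size_sqm": ["50", "80", "65", None, "40"],
    "price": [110.0, 170.0, 140.0, 125.0, 90.0],
})


def clean(df):
    """Return a cleaned copy of the raw data."""
    out = df.copy()
    # Correct data types: convert text to numbers (invalid entries become NaN).
    out["size_sqm"] = pd.to_numeric(out["size_sqm"], errors="coerce")
    # Impute missing data with the column median.
    out["size_sqm"] = out["size_sqm"].fillna(out["size_sqm"].median())
    # Deal with inconsistent data: strip whitespace and unify letter case.
    out["region"] = out["region"].str.strip().str.title()
    return out


tidy = clean(raw)

# Exploring: summarize the data by slicing it per region.
avg_price = tidy.groupby("region")["price"].mean()

# Feature engineering: add a new field derived from existing ones.
tidy["price_per_sqm"] = tidy["price"] / tidy["size_sqm"]

# Modeling: fit price = slope * size_sqm + intercept by least squares.
slope, intercept = np.polyfit(tidy["size_sqm"], tidy["price"], deg=1)

# Use the fitted line to predict the price of a hypothetical 70 sqm unit.
predicted_price = slope * 70 + intercept
```

The median is used for imputation here because it is less sensitive to outliers than the mean. Other strategies, such as dropping incomplete rows, may suit a given dataset better.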