https://github.com/riteshghorse/data-science-r-python
Basic data science approaches for working with data, from data cleaning to model building. It also covers text mining in R to build a word cloud, and data visualisation in Python to analyse data.
- Host: GitHub
- URL: https://github.com/riteshghorse/data-science-r-python
- Owner: riteshghorse
- License: apache-2.0
- Created: 2018-04-19T05:33:19.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-04-19T05:43:57.000Z (almost 8 years ago)
- Last Synced: 2025-02-08T16:37:28.595Z (12 months ago)
- Topics: data-science, text-mining, visualize-data, wordcloud
- Language: Jupyter Notebook
- Homepage:
- Size: 1.25 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.txt
- License: LICENSE
README
Basic Data Science in R/Python
1. Data-Science-1-Air Quality & Data-Science-1-Facebook Metrics
This module contains basic data science approaches for handling data.
It contains steps for-
-Creating subsets of a dataframe
-Merging datasets
-Sorting the dataframe based on a column
-Transposing a dataframe
-Melting the dataframe (wide to long)
-Casting the dataframe (long to wide)
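The six steps above can be sketched in Python with pandas. This is only an illustration under assumptions: the README does not say which language or library the notebooks use for this module, and the toy data and column names (`city`, `ozone`, `temp`, `region`) are hypothetical.

```python
import pandas as pd

# Hypothetical toy data; the repository's own datasets are not reproduced here.
readings = pd.DataFrame({
    "city": ["A", "B", "C"],
    "ozone": [41, 36, 12],
    "temp": [67, 72, 74],
})
regions = pd.DataFrame({
    "city": ["A", "B", "C"],
    "region": ["north", "south", "north"],
})

# Subset: rows with ozone above 20, keeping two columns.
subset = readings.loc[readings["ozone"] > 20, ["city", "ozone"]]

# Merge: join the two frames on the shared "city" column.
merged = readings.merge(regions, on="city")

# Sort: order rows by a single column, highest first.
by_temp = readings.sort_values("temp", ascending=False)

# Transpose: swap rows and columns.
transposed = readings.set_index("city").T

# Melt (wide to long): one row per (city, variable) pair.
long_form = readings.melt(id_vars="city", var_name="variable", value_name="value")

# Cast (long to wide): pivot the long form back to one column per variable.
wide_form = long_form.pivot(index="city", columns="variable", values="value").reset_index()
```

Melting and casting are inverses here: `wide_form` holds the same values as `readings`, one column per variable.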
2. Data-Science-2-Breast-Cancer
This module contains further data science steps, listed below-
-Data cleaning (removing NA, ?, and negative values)
-Error correction (outlier detection and removal)
-Data transformation
-Building data models using regression and a naive Bayes classifier
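The cleaning, outlier removal, and transformation steps can be sketched with pandas as follows. The README does not name the outlier method or the transformation used, so the 1.5 * IQR rule and min-max scaling below are assumed choices, and the `size` column is hypothetical toy data.

```python
import pandas as pd

# Hypothetical toy column; "?" and None stand in for missing values.
raw = pd.DataFrame({"size": ["5", "?", "3", "-1", "4", "6", "120", None]})

# Data cleaning: non-numeric entries such as "?" become NaN, then NA and
# negative values are dropped.
clean = pd.DataFrame({"size": pd.to_numeric(raw["size"], errors="coerce")})
clean = clean.dropna()
clean = clean[clean["size"] >= 0]

# Outlier removal with the 1.5 * IQR rule (an assumed choice of method).
q1, q3 = clean["size"].quantile([0.25, 0.75])
iqr = q3 - q1
inliers = clean[clean["size"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)].copy()

# Data transformation: min-max scaling to the range [0, 1] (also assumed).
col = inliers["size"]
inliers["size_scaled"] = (col - col.min()) / (col.max() - col.min())
```

With this toy data the value 120 falls outside the IQR fence and is removed, leaving 5, 3, 4, and 6 to be scaled.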
3. Text-mining-R-1 & Text-mining-R-2
This module contains text mining approaches with R.
It contains the following implementations-
-Text mining operations
-Calculating the tf (term frequency) count
-Generating a word cloud
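The module itself is written in R. As a rough Python analogue of the term frequency step, the sketch below counts words with the standard library only. The `term_frequencies` helper, the stop-word list, and the sample sentence are all hypothetical and not taken from the repository.

```python
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "is", "of", "and", "to", "in"}


def term_frequencies(text):
    """Return a Counter of lowercase word counts, ignoring stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


sample = "The cloud is a cloud of words, and the words form the cloud."
tf = term_frequencies(sample)
print(tf.most_common(2))  # [('cloud', 3), ('words', 2)]
```

Counts like these are the input a word cloud needs: each word is drawn at a size proportional to its frequency.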
4. Visualizations-in-Python-1 & Visualizations-in-Python-2
This module explores the data visualization options in Python with matplotlib and seaborn.
The following visualizations are implemented-
-Histograms
-Dot Plots
-Bar Plots
-Line Charts
-Pie Charts
-Box Plots
-Scatter Plots
-Point Plot
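Three of the plot types listed above can be sketched with matplotlib alone. The sample values are hypothetical, and the notebooks may build their figures differently (for example with seaborn).

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook
import matplotlib.pyplot as plt

values = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]  # hypothetical sample data

# One row of three panels, one plot type per panel.
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

axes[0].hist(values, bins=5)
axes[0].set_title("Histogram")

axes[1].boxplot(values)
axes[1].set_title("Box plot")

axes[2].scatter(range(len(values)), values)
axes[2].set_title("Scatter plot")

fig.tight_layout()
# In a notebook the figure displays inline; elsewhere, fig.savefig("plots.png")
# writes it to a file (the file name is only an example).
```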
Note: All Python notebooks were created in a Python 3 environment.
: It is best to install Anaconda for Python 3 to run this code.