
https://github.com/riteshghorse/data-science-r-python

Basic data science approaches for working with data, from data cleaning to model building. It also covers text mining with R to build a word cloud, and data visualisation in Python to analyse the data.

data-science text-mining visualize-data wordcloud

Basic Data Science in R/Python

1. Data-Science-1-Air Quality & Data-Science-1-Facebook Metrics

This module covers basic data-wrangling operations on dataframes.
It contains the steps for:
-Creating subsets of a dataframe
-Merging datasets
-Sorting a dataframe by a column
-Transposing a dataframe
-Melting a dataframe (wide to long)
-Casting a dataframe (long to wide)
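
The six operations above can be sketched in Python with pandas. This is only an illustration under assumptions: the toy data, column names, and variable names are hypothetical, and the repository's own notebooks may do these steps differently (melt and cast are also the names used by R's reshape tooling).

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
readings = pd.DataFrame({
    "city": ["A", "B", "C"],
    "ozone": [41, 36, 12],
    "temp": [67, 72, 74],
})
regions = pd.DataFrame({
    "city": ["A", "B", "C"],
    "region": ["north", "south", "north"],
})

# Subset: rows with ozone above 20, keeping two columns.
subset = readings.loc[readings["ozone"] > 20, ["city", "ozone"]]

# Merge: join the two frames on the shared "city" key.
merged = readings.merge(regions, on="city", how="inner")

# Sort by a column, highest temperature first.
by_temp = readings.sort_values("temp", ascending=False)

# Transpose: rows become columns and vice versa.
transposed = readings.set_index("city").T

# Melt (wide to long): one row per (city, variable) pair.
long_form = readings.melt(id_vars="city", var_name="variable", value_name="value")

# Cast (long to wide): pivot the long table back to one column per variable.
wide_form = long_form.pivot(index="city", columns="variable", values="value").reset_index()
```

Melting and then casting returns the original columns, which is a quick way to check that the reshape keys were chosen correctly.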

2. Data-Science-2-Breast-Cancer

This module contains the further data science steps listed below:
-Data cleaning (removing NA, "?", and negative values)
-Error correction (outlier detection and removal)
-Data transformation
-Building models using regression and a naive Bayes classifier
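
As a rough sketch of the first three steps, the following Python snippet cleans a small column, removes outliers, and rescales the result. Everything here is an assumption made for illustration: the data and the column names `radius` and `label` are hypothetical, the 1.5 * IQR rule is one common outlier criterion (the module may use another), and min-max scaling stands in for whatever transformation the module applies. Model fitting is left out.

```python
import pandas as pd

# Hypothetical raw data containing the three kinds of bad values named above.
raw = pd.DataFrame({
    "radius": ["12.1", "?", "13.4", "-5.0", "12.8", "95.0", "13.0", None],
    "label": [0, 1, 0, 1, 0, 1, 1, 0],
})

# Data cleaning: "?" and None both become NaN when coerced to numbers,
# then rows with NaN or negative values are dropped.
clean = raw.copy()
clean["radius"] = pd.to_numeric(clean["radius"], errors="coerce")
clean = clean.dropna(subset=["radius"])
clean = clean[clean["radius"] >= 0]

# Error correction: drop points more than 1.5 * IQR outside the quartiles.
q1, q3 = clean["radius"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = clean["radius"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
no_outliers = clean[in_range]

# Data transformation: min-max scale the feature to the range [0, 1].
r = no_outliers["radius"]
scaled = no_outliers.assign(radius_scaled=(r - r.min()) / (r.max() - r.min()))
```

The order matters: computing quartiles before the invalid values are removed would let them distort the outlier bounds.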


3. Text-mining-R-1 & Text-mining-R-2

This module contains text mining approaches with R.
It contains the following implementations:
-Text mining operations
-Calculating the tf (term frequency) count
-Generating a word cloud
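
The module itself is written in R, but the term frequency step is language independent and can be sketched in Python with the standard library alone. The stop-word list, the function name `term_frequencies`, and the sample sentence are hypothetical and exist only for this illustration.

```python
import re
from collections import Counter

# Small illustrative stop-word list; text-mining packages ship larger ones.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}

def term_frequencies(text):
    """Return a Counter mapping each non-stop-word term to its count."""
    # Text mining operations: lowercase, then keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

doc = "The data is cleaned, and the cleaned data is mined. Mining data takes time."
tf = term_frequencies(doc)
print(tf.most_common(3))
```

A word cloud is a rendering of exactly these counts: each term is drawn at a size proportional to its frequency.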

4. Visualizations-in-Python-1 & Visualizations-in-Python-2

This module explores data visualization options in Python with matplotlib and seaborn.
The following visualizations are implemented:
-Histograms
-Dot Plots
-Bar Plots
-Line Charts
-Pie Charts
-Box Plots
-Scatter Plots
-Point Plot
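
Several of the plot types above can be sketched with matplotlib alone, as below. The data values are hypothetical, and the snippet is not taken from the notebooks. Dot plots and point plots are omitted because they are more naturally drawn with seaborn.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Hypothetical values, chosen only to make each plot readable.
values = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9]
categories = ["a", "b", "c"]
totals = [5, 8, 3]

# One figure with a 2 x 3 grid, one plot type per cell.
fig, axes = plt.subplots(2, 3, figsize=(12, 7))

axes[0, 0].hist(values, bins=5)
axes[0, 0].set_title("Histogram")

axes[0, 1].bar(categories, totals)
axes[0, 1].set_title("Bar plot")

axes[0, 2].plot(range(len(values)), values, marker="o")
axes[0, 2].set_title("Line chart")

axes[1, 0].pie(totals, labels=categories, autopct="%1.0f%%")
axes[1, 0].set_title("Pie chart")

axes[1, 1].boxplot(values)
axes[1, 1].set_title("Box plot")

axes[1, 2].scatter(values, [2 * v + 1 for v in values])
axes[1, 2].set_title("Scatter plot")

fig.tight_layout()
```

In a notebook the figure is displayed inline; in a script, `fig.savefig(...)` writes it to a file instead.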

Note: All Python notebooks were created in a Python 3 environment. Installing Anaconda for Python 3 is the recommended way to run them.