https://github.com/rudreshveerkhare/data_science_lab



# Data Science Lab

## Content

[**1.1 Linear Regression**](./Linear_Regression_on_Auto_MPG.ipynb)

> Performed EDA on the Auto MPG dataset and trained a Linear Regression model.

[**1.2 Logistic Regression**](./Horse_Colic_Dataset_Logistic_Regression.ipynb)

> Performed EDA on the Horse Colic dataset and trained a Logistic Regression model for classification.

[**2 PCA**](./PCA.ipynb)

> Performed EDA on the Adult Census dataset and used PCA for dimensionality reduction (feature extraction).

[**3 Naive Bayes Classifier & LDA**](./Na%C3%AFve_Bayes_%26_LDA.ipynb)

> Performed EDA on the Adult Census dataset and used Naive Bayes and LDA for classification.

[**4 Hypothesis Testing**](./Hypothesis_Testing.ipynb)

> Performed hypothesis testing on male/female income data: derived the critical value from the chosen Type I error rate (significance level) and used it to decide whether to reject the null hypothesis.
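
The critical-value decision described above can be sketched as follows. This is a minimal illustration that assumes a two-sided z-test on a difference in means; the notebook may use a different test, and the function name `z_test_two_means` and all summary statistics here are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_means(mean_a, mean_b, var_a, var_b, n_a, n_b, alpha=0.05):
    """Two-sided z-test for a difference in means (assumed test, for illustration)."""
    # Standard error of the difference between the two sample means.
    se = sqrt(var_a / n_a + var_b / n_b)
    z = (mean_a - mean_b) / se
    # alpha is the Type I error rate; split it across both tails.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Reject the null hypothesis when the statistic falls beyond the critical value.
    return z, z_crit, abs(z) > z_crit


# Hypothetical summary statistics, not taken from the notebook.
z, z_crit, reject = z_test_two_means(52.0, 48.0, 100.0, 120.0, 200, 220, alpha=0.05)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

Lowering `alpha` raises the critical value, so stronger evidence is needed before the null hypothesis is rejected.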

[**5 Measuring the Performance of Model**](./Kaggle_House_Prices_Measuring_the_Performance_of_Model.ipynb)

> Submitted a solution to the Kaggle competition [House Prices - Advanced Regression Techniques](https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques/leaderboard?search=Rudresh+Veerkhare) within 2 hours and reached public leaderboard rank **1429**.

[**6 Time Series Forecasting: Data, Analysis, and Practice**](<./Time_Series_Analysis_(Exp_6).ipynb>)

> Performed time series analysis on a vehicle traffic dataset. Analyzed time-step and lag features, modeled trend and seasonality using Fourier features, examined autocorrelation, and finally fit an ARIMA model.
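
As a rough sketch of the feature types named above, the snippet below builds time-step, lag, and Fourier features from a plain list. The helper `make_features`, the weekly period of 7, and the sample counts are assumptions for illustration; the notebook's own feature pipeline may differ.

```python
import math


def make_features(series, lags=(1,), period=7, order=1):
    """Build one feature row per time step that has all requested lags available."""
    rows = []
    max_lag = max(lags)
    for t in range(max_lag, len(series)):
        # Time-step feature: lets a linear model capture a trend.
        row = {"time_step": t}
        # Lag features: earlier values of the series itself.
        for lag in lags:
            row[f"lag_{lag}"] = series[t - lag]
        # Fourier features: sine/cosine pairs that encode seasonality of the given period.
        for k in range(1, order + 1):
            angle = 2 * math.pi * k * t / period
            row[f"sin_{k}"] = math.sin(angle)
            row[f"cos_{k}"] = math.cos(angle)
        row["target"] = series[t]
        rows.append(row)
    return rows


# Hypothetical traffic counts, not taken from the dataset.
traffic = [10, 12, 15, 11, 9, 20, 22, 11, 13, 16]
rows = make_features(traffic, lags=(1, 2), period=7, order=1)
print(rows[0])
```

Rows like these can be fed to a regression model, with the residuals then examined for autocorrelation before fitting something like ARIMA.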

[**7 Text Analysis**](./Text_Analysis.ipynb)

> Used an email spam dataset for text analysis. The steps are: prepare the corpus, remove punctuation and stopwords, apply a word stemmer, build a document-term matrix, and finally train CART and Random Forest classifiers.
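
The preprocessing steps above can be sketched in plain Python as shown below. This is an illustration only: the tiny stopword list, the `naive_stem` suffix stripper (a crude stand-in for a real stemmer such as Porter's), and the two sample emails are all assumptions, not the notebook's code.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "is", "to", "and", "you", "your", "for"}


def naive_stem(word):
    """Strip a few common suffixes. A crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    # Lowercase and keep alphabetic runs only, which drops punctuation and digits.
    words = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(w) for w in words if w not in STOPWORDS]


def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


# Hypothetical emails, not taken from the dataset.
docs = ["Win a free prize now!!!", "Meeting moved to Monday, see you then."]
vocab, dtm = document_term_matrix(docs)
print(vocab)
print(dtm)
```

Each row of the matrix is the feature vector for one email, which is the form a tree-based classifier such as CART or a Random Forest would take as input.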

[**8 Support Vector Analysis**](./Support_Vector_Machine.ipynb)

> Studied Support Vector Machines and analyzed the margins produced by an SVM and the effect of its hyperparameters.
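
To make the margin and hyperparameter discussion concrete, here is a small sketch for a linear SVM with decision function `w·x + b`. The weights, points, and function names are hypothetical; it only evaluates the standard soft-margin objective and does not train a model.

```python
import math


def margin_width(w):
    """Width of the margin of a linear SVM: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


def hinge_loss(w, b, x, y):
    """Hinge loss for one point with label y in {-1, +1}."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)


def soft_margin_objective(w, b, X, Y, C):
    """0.5 * ||w||^2 + C * sum of hinge losses; C trades margin width for violations."""
    reg = 0.5 * sum(wi * wi for wi in w)
    return reg + C * sum(hinge_loss(w, b, x, y) for x, y in zip(X, Y))


# Hypothetical weights and points, chosen only to illustrate the formulas.
w, b = [3.0, 4.0], -2.0
X = [[1.0, 0.0], [0.5, 0.0]]
Y = [1, 1]

print(margin_width(w))                         # 2 / 5
print([hinge_loss(w, b, x, y) for x, y in zip(X, Y)])
print(soft_margin_objective(w, b, X, Y, C=1.0))
```

A larger `C` penalizes margin violations more heavily, which tends to give a narrower margin that fits the training points more tightly.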