https://github.com/saadhaniftaj/logistic--lasso-regression-data-analysis
Iris dataset analysis with logistic and Lasso regression, using coordinate descent for feature selection and binary classification. Includes preprocessing and data visualizations
data-analysis lasso-regression-model logistic-regression python statistics
- Host: GitHub
- URL: https://github.com/saadhaniftaj/logistic--lasso-regression-data-analysis
- Owner: saadhaniftaj
- Created: 2024-11-10T17:50:47.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-11-11T18:32:21.000Z (6 months ago)
- Last Synced: 2024-11-11T19:33:25.577Z (6 months ago)
- Topics: data-analysis, lasso-regression-model, logistic-regression, python, statistics
- Language: Jupyter Notebook
- Homepage:
- Size: 74.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Iris Dataset Analysis with Coordinate Descent
## Overview
This project explores the Iris dataset, focusing on:
1. **Binary Classification**: Using Logistic Regression and coordinate descent to classify two species.
2. **Feature Selection**: Applying Lasso regression with coordinate descent to select key features predicting petal length.
3. **Data Visualization**: Displaying distributions and relationships between features.
## Project Structure
- **Data Preprocessing**: Cleans and encodes the Iris dataset, making it ready for machine learning.
- **Coordinate Descent for Logistic Regression**: Implements a logistic regression classifier for a binary subset of species in the Iris dataset.
- **Lasso Regression for Feature Selection**: Uses Lasso regularization to identify important features for predicting petal length.
- **Visualization**: Provides histograms and scatter plots to illustrate feature distributions and relationships.
## How to Run
1. Clone or download this repository.
2. Open the notebook `DS_221_Project.ipynb` in Jupyter Notebook or Jupyter Lab.
3. Run each cell sequentially to see preprocessing, analysis, and visualizations.
## Requirements
- Python 3.x
- Libraries: `pandas`, `numpy`, `scikit-learn`, `matplotlib`
## Key Results
- **Logistic Regression Accuracy**: The notebook reports the accuracy of the coordinate-descent logistic regression classifier on the two-species subset.
- **Lasso Regression Coefficients**: The non-zero Lasso coefficients identify the features with the strongest impact on petal length.
## Conclusion
This project demonstrates effective data preprocessing, visualization, and machine learning applications using logistic and Lasso regression, providing insights into feature selection and binary classification within the Iris dataset.
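The two coordinate-descent steps listed under Project Structure can be sketched as follows. This is a minimal illustration, not the notebook's actual code. Several things here are assumptions: the helper names (`standardize`, `soft_threshold`, `lasso_cd`, `logistic_cd`), loading the data with scikit-learn's `load_iris`, the choice of versicolor vs. virginica as the binary pair, the Lasso penalty `alpha=0.1`, and the fixed number of sweeps. The logistic update uses a step of `1/L_j` per coordinate, where `L_j` bounds the curvature of the log-loss; the notebook may use a different update rule.

```python
import numpy as np
from sklearn.datasets import load_iris


def standardize(X):
    """Scale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def soft_threshold(rho, alpha):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(rho) * max(abs(rho) - alpha, 0.0)


def lasso_cd(X, y, alpha, n_sweeps=500):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 one coordinate at a time."""
    n, d = X.shape
    w = np.zeros(d)
    residual = y - X @ w
    for _ in range(n_sweeps):
        for j in range(d):
            residual += X[:, j] * w[j]      # remove feature j's contribution
            rho = X[:, j] @ residual / n
            w[j] = soft_threshold(rho, alpha) / (X[:, j] @ X[:, j] / n)
            residual -= X[:, j] * w[j]      # add the updated contribution back
    return w


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def logistic_cd(X, y, n_sweeps=200):
    """Fit logistic regression by cyclic coordinate descent on the mean log-loss.

    Each coordinate takes a step of size 1/L_j, where L_j = 0.25 * mean(x_j^2)
    is an upper bound on the curvature of the loss along that coordinate.
    """
    n = X.shape[0]
    Xb = np.hstack([np.ones((n, 1)), X])    # column of ones acts as the intercept
    w = np.zeros(Xb.shape[1])
    lipschitz = 0.25 * (Xb ** 2).mean(axis=0)
    z = Xb @ w
    for _ in range(n_sweeps):
        for j in range(Xb.shape[1]):
            grad_j = Xb[:, j] @ (sigmoid(z) - y) / n
            step = grad_j / lipschitz[j]
            w[j] -= step
            z -= step * Xb[:, j]            # keep the linear scores in sync with w
    return w


iris = load_iris()
# Columns: sepal length, sepal width, petal length, petal width
X_all, species = iris.data, iris.target

# Binary classification: versicolor (1) vs. virginica (2); this pair is an assumption.
mask = species != 0
X_bin = standardize(X_all[mask])
y_bin = (species[mask] == 2).astype(float)
w_log = logistic_cd(X_bin, y_bin)
scores = np.hstack([np.ones((X_bin.shape[0], 1)), X_bin]) @ w_log
accuracy = np.mean((scores > 0) == (y_bin == 1))
print(f"training accuracy: {accuracy:.3f}")

# Feature selection: predict petal length (column 2) from the other three features.
feature_idx = [0, 1, 3]
X_reg = standardize(X_all[:, feature_idx])
y_reg = X_all[:, 2] - X_all[:, 2].mean()
w_lasso = lasso_cd(X_reg, y_reg, alpha=0.1)
for idx, coef in zip(feature_idx, w_lasso):
    print(f"{iris.feature_names[idx]}: {coef:.3f}")
```

The printed accuracy is measured on the training data, so it is an optimistic estimate; a held-out split would give a fairer number. Raising `alpha` shrinks more Lasso coefficients to exactly zero, which is what makes the method usable for feature selection.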