Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mehdishahbazi/logistic-regression-tensorflow-ovr
This repository contains the implementation of the Logistic Regression algorithm for classifying the Iris dataset using the One-vs-Rest (OvR) approach, developed with Python 3.12 and TensorFlow v2.16.
- Host: GitHub
- URL: https://github.com/mehdishahbazi/logistic-regression-tensorflow-ovr
- Owner: MehdiShahbazi
- License: mit
- Created: 2024-03-27T11:53:58.000Z (10 months ago)
- Default Branch: master
- Last Pushed: 2024-03-27T20:42:56.000Z (10 months ago)
- Last Synced: 2024-11-13T00:43:31.890Z (2 months ago)
- Topics: classification, classifier, ipynb, ipython, ipython-notebook, iris, iris-dataset, logistic, logistic-regression, machine-learning, one-vs-all, one-vs-rest, ova, ovr, python, regression, tensor, tensorflow, tensorflow2
- Language: Jupyter Notebook
- Homepage:
- Size: 481 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## Description
This repository presents an implementation of Logistic Regression in Python 3.12 and TensorFlow v2.16 on the Iris dataset. My practice with the TensorFlow library led me to create this repository, so it may be a useful resource for anyone using TensorFlow for machine learning tasks. The problem at hand is a multi-class classification task, which is addressed using the One-vs-Rest (OvR) approach. The included `.ipynb` file offers detailed explanations and comments for each part of the implementation.

## Logistic Regression
Linear Regression is for estimating continuous values, such as predicting house prices. However, Linear Regression falls short when the goal shifts to predicting the most likely class for a given data point. In such cases, Logistic Regression emerges as the preferred choice. Unlike Linear Regression, Logistic Regression assesses the probability of a data point belonging to a specific class, thus proving invaluable in classification tasks.
Logistic Regression is a variation of Linear Regression, useful when the dependent variable y is categorical. Despite the name logistic regression, it is a probabilistic classification model. Logistic regression fits a special s-shaped curve by taking the linear regression and transforming the numeric estimate into a probability with the following function:
$$
\text{ProbabilityOfaClass} = \theta(y) = \frac{e^y}{1 + e^y}
$$

which produces values between 0 (as $y$ approaches $-\infty$) and 1 (as $y$ approaches $+\infty$). This now becomes a special kind of non-linear regression.
In the equation, $y$ is the regression result (the sum of the variables weighted by the coefficients), $e$ is the base of the natural logarithm, and $\theta(y)$ is the logistic function, also called the logistic curve. It has the common "S" shape of a sigmoid curve.
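The behavior of the logistic function can be checked numerically. Below is a minimal NumPy sketch (the function name is mine, not from the repository):

```python
import numpy as np

def logistic(y):
    """Logistic (sigmoid) function: maps a regression output y into (0, 1)."""
    return np.exp(y) / (1.0 + np.exp(y))

# Values approach 0 as y -> -inf and 1 as y -> +inf, with exactly 0.5 at y = 0.
print(logistic(np.array([-10.0, 0.0, 10.0])))
```

In practice, TensorFlow provides this same curve as `tf.sigmoid`, which is numerically safer for large negative inputs.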
## One-vs-Rest (OvR) Approach
OvR is a heuristic method for using binary classification algorithms for multi-class classification. The multi-class dataset is split into multiple binary classification problems; a binary classifier is trained on each one, and predictions are made using the model that is most confident. For the Iris dataset we would have:

- Binary Classification 1: Setosa vs. [Versicolor, Virginica]
- Binary Classification 2: Versicolor vs. [Setosa, Virginica]
- Binary Classification 3: Virginica vs. [Setosa, Versicolor]

## Requirements
The code is implemented in Python 3.12. Below are the non-standard libraries and their corresponding versions used in writing the code:
```
matplotlib==3.8.3
numpy==1.26.4
scikit_learn==1.4.1.post1
tensorflow==2.16.1
```

## Results
The plot below shows the effectiveness of each classifier in predicting labels and reducing the loss on its respective binary task. Additionally, the test accuracy obtained by combining the predictions of all classifiers is **0.73** (73%), which is the highest accuracy achieved so far. However, the scikit-learn library can achieve an accuracy of 0.98 (98%), and I am still in the process of understanding how it reaches that result.
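For comparison, the OvR scheme described earlier (one binary classifier per class, most-confident classifier wins) can be reproduced with scikit-learn alongside its built-in multi-class baseline. This is an illustrative sketch, not the repository's TensorFlow implementation; the train/test split and solver settings are my own assumptions:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Manual One-vs-Rest: one binary logistic-regression classifier per class.
per_class_probs = []
for cls in np.unique(y_train):
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, (y_train == cls).astype(int))  # this class vs. the rest
    per_class_probs.append(clf.predict_proba(X_test)[:, 1])

# Final prediction: the class whose binary classifier is most confident.
ovr_pred = np.argmax(np.column_stack(per_class_probs), axis=1)
ovr_acc = (ovr_pred == y_test).mean()

# scikit-learn's built-in multi-class logistic regression for comparison.
base = LogisticRegression(max_iter=1000).fit(X_train, y_train)
base_acc = base.score(X_test, y_test)

print(f"manual OvR accuracy:       {ovr_acc:.2f}")
print(f"sklearn baseline accuracy: {base_acc:.2f}")
```

Both variants typically score well above 0.73 on Iris, which suggests the gap reported above comes from the training setup rather than from OvR itself.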