Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dpb44/exploring-the-intuition-of-neural-networks-on-a-classification-problem-using-only-numpy
Implementing a softmax-based neural network from scratch using NumPy to classify the Iris dataset, leveraging vectorization, gradient descent, and decision boundary visualization.
deep-learning neural-network numpy softmax-classifier
Last synced: 10 days ago
- Host: GitHub
- URL: https://github.com/dpb44/exploring-the-intuition-of-neural-networks-on-a-classification-problem-using-only-numpy
- Owner: dpb44
- Created: 2025-02-10T21:32:04.000Z (11 days ago)
- Default Branch: main
- Last Pushed: 2025-02-10T22:02:10.000Z (11 days ago)
- Last Synced: 2025-02-10T22:31:13.167Z (11 days ago)
- Topics: deep-learning, neural-network, numpy, softmax-classifier
- Language: Jupyter Notebook
- Homepage:
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Exploring the Intuition of Neural Networks on a Classification Problem Using Only NumPy
## Overview
This project explores the intuition behind neural networks for multiclass classification using **only NumPy**, without high-level frameworks like TensorFlow or PyTorch. The goal is to classify the three **Iris species**—Setosa, Versicolor, and Virginica—based on petal and sepal measurements. We built a single-layer neural network using softmax activation, cross-entropy loss, and gradient descent to optimize model parameters.
### Key Features:
- **Softmax activation** for multi-class classification.
- **Cross-entropy loss function** for model optimization.
- **Gradient descent with backpropagation** to update model parameters.
- **Vectorization and broadcasting** for computational efficiency.
- **Decision boundary visualization** to analyze model predictions.

## Dataset
The dataset consists of **150 samples**, each with **four numerical features**:
- **Sepal Length**
- **Sepal Width**
- **Petal Length**
- **Petal Width**

Each sample belongs to one of three classes:
- **Setosa (0)**
- **Versicolor (1)**
- **Virginica (2)**

These features are represented as $X$ in matrix form:
$$
X \in \mathbb{R}^{m \times n_x}
$$

where $m = 150$ (50 samples per species) and $n_x = 4$ (features per sample).
```python
# Load the dataset using sklearn
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target
```
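As a quick sanity check (an illustrative snippet, not from the original notebook), the loaded arrays match the dimensions described above:

```python
# 150 samples with 4 features each, and one integer label (0, 1 or 2) per sample
print(X.shape)  # (150, 4)
print(y.shape)  # (150,)
```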
## One-Hot Encoding

Since we are dealing with a multi-class classification problem, we convert the categorical labels into **one-hot encoded vectors**.

```python
import numpy as np

m, K = y.shape[0], 3  # 150 samples, 3 classes
y_one_hot = np.zeros((m, K))
y_one_hot[np.arange(m), y] = 1
```
This transforms each label into a vector where only the corresponding class index is set to 1.
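For example (an illustrative check, not from the original notebook), taking one sample from each species shows the encoding:

```python
# One sample from each species: integer labels 0, 1, 2 map to the unit rows below
print(y[[0, 50, 100]])           # [0 1 2]
print(y_one_hot[[0, 50, 100]])   # [[1. 0. 0.]
                                 #  [0. 1. 0.]
                                 #  [0. 0. 1.]]
```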
## Model Architecture

We use a **single-layer feed-forward neural network** with softmax activation.

### 1. Softmax Function
Since we have three distinct classes, we use the **softmax function** instead of the sigmoid function. The softmax function is given by:

$$
g_k(\boldsymbol{t}) = \frac{e^{t_k}}{\sum_{j=1}^{K} e^{t_j}}
$$

where $\boldsymbol{t} = (t_k)_{k=1}^K$ represents the vector of unnormalized class scores.
This function converts raw scores into probabilities that sum to 1 across the three classes.
```python
# Compute softmax activation
# Shapes: X is (m, n_x), W is (n_x, K), b is (K,), so Z and y_hat are (m, K)
Z = np.dot(X, W) + b
numerator = np.exp(Z)
denominator = np.sum(numerator, axis=1, keepdims=True)
y_hat = numerator / denominator   # one row of class probabilities per sample
```
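In practice, `np.exp` can overflow for large scores. A common refinement (an optional tweak, not part of the original notebook) is to subtract each row's maximum score before exponentiating, which leaves the softmax output unchanged:

```python
# Numerically stable softmax: shifting by the row-wise maximum does not change the
# result, because the constant factor cancels between numerator and denominator
Z_shifted = Z - np.max(Z, axis=1, keepdims=True)
numerator = np.exp(Z_shifted)
y_hat = numerator / np.sum(numerator, axis=1, keepdims=True)
```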
### 2. Cross-Entropy Loss Function

The loss function quantifies the difference between the predicted and true labels:

$$
\mathcal{J}(\boldsymbol{W},\boldsymbol{b}) = -\frac{1}{m} \sum_{i=1}^m \sum_{j=1}^K \mathbf{y}_j^{(i)} \log(\widehat{y}^{(i)}_j)
$$

```python
# Compute loss
loss = np.sum(y_one_hot * np.log(y_hat), axis=1)
total_cost = - (1/m) * np.sum(loss)
```
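If the model assigns a probability of exactly zero to the true class, `np.log` returns `-inf` and the cost becomes undefined. A small safeguard (an optional addition, not part of the original notebook) is to clip the probabilities before taking the logarithm:

```python
# Clip probabilities away from zero so that the logarithm stays finite
eps = 1e-12
loss = np.sum(y_one_hot * np.log(np.clip(y_hat, eps, 1.0)), axis=1)
total_cost = -(1 / m) * np.sum(loss)
```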
### 3. Backpropagation and Gradient Descent for Optimization

Using backpropagation, we compute the gradients used for the weight and bias updates:

$$
\nabla_{\boldsymbol{W}} \mathcal{J}(\boldsymbol{W},\boldsymbol{b}) = \frac{1}{m} X^\top (\widehat{\boldsymbol{Y}} - \mathbf{Y})
$$

$$
\nabla_{\boldsymbol{b}} \mathcal{J}(\boldsymbol{W},\boldsymbol{b}) = \frac{1}{m} \sum_{i=1}^{m} (\widehat{\boldsymbol{y}}^{(i)} - \mathbf{y}^{(i)})
$$

We update the parameters iteratively using:
$$
\boldsymbol{W} := \boldsymbol{W} - \alpha \nabla_{\boldsymbol{W}} \mathcal{J}(\boldsymbol{W},\boldsymbol{b})
$$

$$
\boldsymbol{b} := \boldsymbol{b} - \alpha \nabla_{\boldsymbol{b}} \mathcal{J}(\boldsymbol{W},\boldsymbol{b})
$$
*Note: This image is only meant to show how training updates the parameters through backpropagation. It is not representative of the single-layer feed-forward neural network we have built.*
```python
# Compute gradients
# (y_hat - y_one_hot) is (m, K) and X is (m, n_x), so W_grad is (n_x, K) and b_grad is (K,)
W_grad = np.dot(X.T, (y_hat - y_one_hot)) / m
b_grad = np.sum(y_hat - y_one_hot, axis=0) / m
```
## Training the Model

We train the model using **gradient descent** over multiple iterations.
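The loop below uses three helpers, `p_model`, `compute_cost`, and `compute_gradients`, which are not defined in this README. A minimal sketch of how they can be assembled from the snippets above is given here; the signatures, the zero initialisation, and the hyperparameter values are illustrative assumptions, not the notebook's exact code.

```python
def p_model(X, W, b):
    """Forward pass: return the (m, K) matrix of softmax class probabilities."""
    Z = np.dot(X, W) + b
    Z -= np.max(Z, axis=1, keepdims=True)            # numerical stability
    numerator = np.exp(Z)
    return numerator / np.sum(numerator, axis=1, keepdims=True)


def compute_cost(y_one_hot, y_hat):
    """Average cross-entropy loss over the m samples."""
    m = y_one_hot.shape[0]
    return -np.sum(y_one_hot * np.log(y_hat)) / m


def compute_gradients(X, y_one_hot, W, b):
    """Gradients of the cross-entropy cost with respect to W and b."""
    m = X.shape[0]
    y_hat = p_model(X, W, b)
    W_grad = np.dot(X.T, (y_hat - y_one_hot)) / m    # (n_x, K)
    b_grad = np.sum(y_hat - y_one_hot, axis=0) / m   # (K,)
    return W_grad, b_grad


# Illustrative initialisation and hyperparameters (the notebook's values may differ)
n_x = X.shape[1]
W = np.zeros((n_x, K))   # (n_x, K) weight matrix
b = np.zeros(K)          # (K,) bias vector
lr, iters = 0.1, 1000    # learning rate and number of gradient-descent steps
```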
```python
# Training loop
costs = []
for i in range(iters):
    y_hat = p_model(X, W, b)
    cost = compute_cost(y_one_hot, y_hat)
    W_grad, b_grad = compute_gradients(X, y_one_hot, W, b)
    W -= lr * W_grad
    b -= lr * b_grad
    if i % 100 == 0:
        costs.append(cost)
        print(f"Cost after iteration {i}: {cost:.4f}")
```

## Model Evaluation & Results
We tested three feature sets (a sketch of how the accuracy can be computed is shown after the table):

| Feature Set | Accuracy |
|------------------|----------|
| Petal Measurements | **96%** |
| Sepal Measurements | **75%** |
| Both Features | **98%** |
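A minimal sketch of how the accuracy for a feature subset can be computed (the `accuracy` helper and the column slicing below are illustrative assumptions, not the notebook's exact code):

```python
# Predicted class = index of the largest softmax probability for each sample.
# W and b must have been trained on the same feature columns that are passed in.
def accuracy(X_feat, y, W, b):
    y_pred = np.argmax(p_model(X_feat, W, b), axis=1)
    return np.mean(y_pred == y)

# In the Iris data, columns 0-1 are the sepal measurements and columns 2-3 the petal measurements
X_sepal, X_petal = X[:, :2], X[:, 2:]
print(f"Both features: {accuracy(X, y, W, b):.2%}")
```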
### Observations:

- **Petal measurements alone** perform better than **sepal measurements alone**.
- **Using both features gives the highest accuracy (98%)**.
- The **decision boundary** was influenced by the number of training iterations and the learning rate.

## Decision Boundary Visualization
To visualize how the model classifies new data, we plot the decision boundary over a grid of points, using parameters trained on the two plotted features (e.g. the petal measurements).

```python
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

x_min, x_max = X[:, 0].min() - 0.5, X[:, 0].max() + 0.5
y_min, y_max = X[:, 1].min() - 0.5, X[:, 1].max() + 0.5
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 300),
                     np.linspace(y_min, y_max, 300))

# W_trained, b_trained: parameters of a model trained on the same two features being plotted
y_pred = p_model(np.c_[xx.ravel(), yy.ravel()], W_trained, b_trained)
y_pred = np.argmax(y_pred, axis=1).reshape(xx.shape)

plt.contourf(xx, yy, y_pred, alpha=0.3, cmap=ListedColormap(['lightgreen', 'pink', 'coral']))
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k')
plt.title("Decision Boundary")
plt.show()
```

### Decision Boundary Analysis:
- **Petal-only features**: The model forms **well-defined decision regions** due to strong class separability.
- **Sepal-only features**: The model performs poorly and is not able to form well-defined boundaries.

## Conclusion
- **Petal measurements provide a stronger predictive signal than sepal measurements**.
- **Gradient descent, softmax activation, and cross-entropy loss optimize the model effectively**.
- **Vectorization and broadcasting improve computational efficiency**.
- **Decision boundaries improve with more training iterations and proper hyperparameter tuning for the petal measurements**.

This project serves as a **minimal yet powerful demonstration** of how a neural network can be implemented from scratch, reinforcing mathematical intuition behind classification tasks.
---
## References
- [Iris Dataset - UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/iris)
- [Softmax Regression - Stanford CS229](https://cs229.stanford.edu/)
- [Medium Article by Srija Neogi - Exploring Multi-Class Classification using Deep Learning](https://medium.com/@srijaneogi31/exploring-multi-class-classification-using-deep-learning-cd3134290887)
- [Medium Article by LM Po - Backpropagation: The Backbone of Neural Network Training (Back Propagation Image)](https://medium.com/@lmpo/backpropagation-the-backbone-of-neural-network-training-64946d6c3ae5)
---

## **Credits & Acknowledgments**
This coursework was completed under the guidance of **Ms. Tatiana Bubba** (Mathematics Professor).