https://github.com/zihaofu245/mini-deep-learning-nn
A mini deep learning implementation
- Host: GitHub
- URL: https://github.com/zihaofu245/mini-deep-learning-nn
- Owner: ZihaoFU245
- License: MIT
- Created: 2025-04-13T07:12:04.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-04-13T08:13:13.000Z (10 months ago)
- Last Synced: 2025-04-13T08:31:18.170Z (10 months ago)
- Topics: algorithms, keras-tensorflow, machine-learning, myproject, neural-networks, numpy
- Language: Python
- Homepage:
- Size: 46.9 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# MDLNN (Mini Deep Learning Neural Network)
A lightweight, modular deep learning framework implemented in pure NumPy, providing a Keras-like API for building and training neural networks.
## Features
- Sequential model architecture
- Various layer types (Dense, Dropout, Input, Flatten)
- Multiple activation functions (ReLU, Sigmoid, Tanh, Softmax)
- Different weight initializers (Xavier/Glorot, He initialization)
- Loss functions (Binary Cross-Entropy, Cross-Entropy, MSE, MAE)
- Adam (with bias correction) and SGD (with momentum) optimizers
- Training with mini-batch support and progress bars
- Transfer learning support with trainable/non-trainable layers
- Model evaluation and prediction capabilities
## Installation
```bash
pip install numpy tqdm h5py
```
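MDLNN itself does not appear to be published on PyPI, so a reasonable setup is to clone the repository and run your scripts from the project root so that `import MDLNN` resolves (a minimal sketch, assuming the `MDLNN` package directory sits at the repository root):
```bash
# Clone the repository and work from the project root
# (assumption: the MDLNN package sits at the top level of the repo)
git clone https://github.com/zihaofu245/mini-deep-learning-nn.git
cd mini-deep-learning-nn
```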
## Quick Start
```python
from MDLNN.models import Sequential
from MDLNN.layers import Input, Dense
from MDLNN.utils import Initializers
# Create a simple binary classification model
model = Sequential([
    Input(input_shape=(2,)),
    Dense(4, activation="tanh", initializer=Initializers.xavier_uniform),
    Dense(1, activation="sigmoid", initializer=Initializers.xavier_uniform)
])
# Compile model with custom optimizer choice and parameters
model.compile(
    loss="binary_cross_entropy",
    optimizer="adam",  # or "sgd"
    optimizer_params={'learning_rate': 0.01}
)
# Train the model with mini-batches and progress bar
model.fit(X, y, epochs=100, batch_size=32, verbose=True, shuffle=True)
# Make predictions
predictions = model.predict(X_test)
```
## Components
### Layers
- **Dense**: Fully connected layer with configurable activation
- **Dropout**: Regularization layer to prevent overfitting
- **Input**: Input layer that validates data shape
- **Flatten**: Reshapes multi-dimensional input for the transition between convolutional and dense layers (see the usage sketch after this list)
- **Conv2D**: (Placeholder for future implementation)
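The Dropout and Flatten layers do not appear in the Quick Start, so here is a minimal sketch of how they might be combined, assuming `Input` accepts a multi-dimensional shape, `Flatten()` takes no constructor arguments, and `Dropout` uses the `keep_p` keep-probability argument shown in the MNIST example below:
```python
from MDLNN.models import Sequential
from MDLNN.layers import Input, Flatten, Dense, Dropout
from MDLNN.utils import Initializers

# Flatten 28x28 inputs into 784-dimensional vectors before the Dense layers;
# Dropout(keep_p=0.8) keeps 80% of activations during training.
model = Sequential([
    Input(input_shape=(28, 28)),
    Flatten(),
    Dense(128, activation="relu", initializer=Initializers.He_uniform),
    Dropout(keep_p=0.8),
    Dense(10, activation="softmax", initializer=Initializers.xavier_uniform)
])
```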
### Activation Functions
- ReLU
- Sigmoid
- Tanh
- Softmax
### Weight Initializers
- Zeros
- Ones
- Random Normal
- Random Uniform
- Xavier/Glorot (Uniform and Normal)
- He (Uniform and Normal)
### Loss Functions
- Binary Cross-Entropy
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
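For regression-style targets, MSE or MAE can be selected at compile time. The sketch below assumes the loss keys follow the same lowercase string convention as `"binary_cross_entropy"`; the exact `"mse"`/`"mae"` key names are an assumption, as they are not shown in the repository's examples:
```python
from MDLNN.models import Sequential
from MDLNN.layers import Input, Dense
from MDLNN.utils import Initializers

# Regression-style model for targets scaled to [0, 1]
# (sigmoid output, since the listed activations do not include a linear option)
reg_model = Sequential([
    Input(input_shape=(3,)),
    Dense(8, activation="tanh", initializer=Initializers.xavier_uniform),
    Dense(1, activation="sigmoid", initializer=Initializers.xavier_uniform)
])

# "mse" is an assumed key name; "mae" would select mean absolute error
reg_model.compile(
    loss="mse",
    optimizer="sgd",
    optimizer_params={'learning_rate': 0.01, 'momentum': 0.9}
)
```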
### Optimizers
The framework supports multiple optimizers:
- **Adam**: Advanced optimizer with adaptive learning rates
```python
model.compile(
    loss="binary_cross_entropy",
    optimizer="adam",
    optimizer_params={
        'learning_rate': 0.001,  # Default: 0.001
        'beta1': 0.9,            # Default: 0.9
        'beta2': 0.999,          # Default: 0.999
        'epsilon': 1e-8          # Default: 1e-8
    }
)
```
- **SGD**: Stochastic Gradient Descent with momentum support
```python
model.compile(
    loss="binary_cross_entropy",
    optimizer="sgd",
    optimizer_params={
        'learning_rate': 0.01,  # Default: 0.01
        'momentum': 0.9         # Default: 0.0
    }
)
```
## Advanced Features
### Transfer Learning
Layers can be frozen for transfer learning:
```python
model = Sequential([
    Input(input_shape=(784,)),
    Dense(256, activation="relu", initializer=Initializers.xavier_uniform, trainable=False),  # Frozen layer
    Dense(10, activation="softmax", initializer=Initializers.xavier_uniform)                  # Trainable layer
])
```
### Model Evaluation
```python
# Evaluate model performance
loss, accuracy = model.evaluate(X_test, y_test)
```
### Model Summary
```python
model.summary()
```
Example output:
```
Model Summary:
--------------------------------------------------
Layer 1: Input (2,)
Layer 2: Dense (2 -> 4) | Params: 12
Layer 3: Dense (4 -> 1) | Params: 5
--------------------------------------------------
Total trainable parameters: 17
```
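The parameter counts follow the usual fully connected formula of (inputs + 1) x units: the hidden layer contributes (2 + 1) x 4 = 12 parameters (8 weights plus 4 biases) and the output layer (4 + 1) x 1 = 5, giving the 17 trainable parameters shown above.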
### Model Saving and Loading
The framework provides two ways to save your models:
1. **Complete Model Saving (Recommended)**
Save and load the entire model including architecture, weights, and training configuration:
```python
# Save complete model
model.save('my_model.h5')
# Load complete model
loaded_model = Sequential.load('my_model.h5')
# Use the loaded model directly
predictions = loaded_model.predict(X_test)
```
2. **Weights-Only Saving**
Save and load just the model weights (requires maintaining model architecture):
```python
# Save model weights
model.save_weights('model_weights.npz')
# To load weights, first recreate the model architecture
new_model = Sequential([
    Input(input_shape=(784,)),
    Dense(512, activation="relu"),
    Dense(10, activation="softmax")
])
new_model.compile(loss="cross_entropy", optimizer="adam")
# Then load the weights
new_model.load_weights('model_weights.npz')
```
The complete model saving (HDF5 format) is recommended as it:
- Saves the full model architecture
- Preserves layer configurations
- Stores optimizer settings
- Maintains loss function configuration
- Requires less code to load and use
- Is safer and more efficient for large models
Example workflow for weights-only saving:
```python
# Train your model
model.fit(X_train, y_train, epochs=10)
# Save the trained weights
model.save_weights('my_model.npz')
# Later, create a new model with the same architecture
new_model = Sequential([
    Input(input_shape=(784,)),
    Dense(512, activation="relu"),
    Dense(10, activation="softmax")
])
new_model.compile(loss="cross_entropy", optimizer="adam")
# Load the saved weights
new_model.load_weights('my_model.npz')
```
The weights are saved in NumPy's .npz format, which is efficient for storing multiple arrays.
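Because the weights file is a standard `.npz` archive, it can also be inspected directly with NumPy, independently of MDLNN (a small sketch; the array names inside the archive depend on how `save_weights` stores them):
```python
import numpy as np

# Open the saved archive and list the arrays it contains
with np.load('my_model.npz') as archive:
    for name in archive.files:
        print(name, archive[name].shape)
```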
## Training Features
- Mini-batch training with progress bars
- Shuffle option for each epoch
- Training/evaluation mode switching
- Batch size optimization for large datasets
- Gradient computation for trainable parameters only
- Automatic weight initialization during compilation
## Optimizer Configuration
You can customize the Adam optimizer during model compilation:
```python
model.compile(
    loss="binary_cross_entropy",
    optimizer="adam",
    optimizer_params={
        'learning_rate': 0.001,  # Default: 0.001
        'beta1': 0.9,            # Default: 0.9
        'beta2': 0.999,          # Default: 0.999
        'epsilon': 1e-8          # Default: 1e-8
    }
)
```
## Real-World Examples
### Example 1: XOR Problem
The XOR problem is a classic non-linearly separable problem that demonstrates the basic capabilities of neural networks:
```python
from MDLNN.models import Sequential
from MDLNN.layers import Input, Dense
from MDLNN.utils import Initializers
import numpy as np
# XOR dataset
X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])
y = np.array([[0],
              [1],
              [1],
              [0]])

# Build the model
model = Sequential([
    Input(input_shape=(2,)),
    Dense(4, activation="tanh", initializer=Initializers.xavier_uniform),
    Dense(1, activation="sigmoid", initializer=Initializers.xavier_uniform)
])

# Compile and train
model.compile(
    loss="binary_cross_entropy",
    optimizer="adam",
    optimizer_params={'learning_rate': 0.01}
)
model.fit(X, y, epochs=200, verbose=True)
# Make predictions
predictions = model.predict(X)
print("\nPredictions (rounded):")
print(np.round(predictions))
```
Expected output:
```
Predictions (rounded):
[[0.]
[1.]
[1.]
[0.]]
```
### Example 2: MNIST Classification
This example shows how to build a deeper network for classifying handwritten digits from the MNIST dataset:
```python
from MDLNN.models import Sequential
from MDLNN.layers import Dense, Input, Dropout
from MDLNN.utils import Initializers
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
# Load and preprocess MNIST data
X, y = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
X = X.astype('float32') / 255.0 # Normalize pixel values
# Convert labels to one-hot encoding
y_onehot = np.zeros((y.shape[0], 10))
y = y.astype(int)
y_onehot[np.arange(y.shape[0]), y] = 1
# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, y_onehot, test_size=0.2, random_state=42
)

# Build a deep model with dropout
model = Sequential([
    Input(input_shape=(784,)),
    Dense(512, activation="relu", initializer=Initializers.He_uniform),
    Dropout(keep_p=0.5),
    Dense(256, activation="relu", initializer=Initializers.He_uniform),
    Dropout(keep_p=0.3),
    Dense(10, activation="softmax", initializer=Initializers.xavier_uniform)
])

# Compile with appropriate loss for multi-class classification
model.compile(
    loss="cross_entropy",
    optimizer="adam",
    optimizer_params={
        'learning_rate': 0.001,
        'beta1': 0.9,
        'beta2': 0.999,
        'epsilon': 1e-8
    }
)

# Train with mini-batches
model.fit(
    X_train,
    y_train,
    epochs=10,
    batch_size=128,
    verbose=True,
    shuffle=True
)
# Evaluate
loss, accuracy = model.evaluate(X_test, y_test)
```
This MNIST example demonstrates several advanced features:
- Data preprocessing and normalization
- One-hot encoding for multi-class classification
- Using Dropout layers for regularization
- He initialization for ReLU activation layers
- Mini-batch training with progress bars
- Model evaluation with accuracy metrics
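If you want to compute accuracy by hand from the raw predictions, a plain NumPy sketch like the following works, assuming `predict` returns one softmax probability vector per sample (matching the one-hot labels used above):
```python
import numpy as np

# Compare predicted class indices against the one-hot test labels
probs = model.predict(X_test)                 # assumed shape: (num_samples, 10)
predicted_classes = np.argmax(probs, axis=1)
true_classes = np.argmax(y_test, axis=1)
manual_accuracy = np.mean(predicted_classes == true_classes)
print(f"Manual accuracy: {manual_accuracy:.4f}")
```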
## Requirements
- NumPy
- tqdm (for progress bars)
- h5py (for complete model saving)
## Version
Current version: 0.1.0
## License
MIT License