# MDLNN (Mini Deep Learning Neural Network)

A lightweight, modular deep learning framework implemented in pure NumPy. This framework provides a Keras-like API for building and training neural networks.

## Features

- Sequential model architecture
- Various layer types (Dense, Dropout, Input, Flatten)
- Multiple activation functions (ReLU, Sigmoid, Tanh, Softmax)
- Different weight initializers (Xavier/Glorot, He initialization)
- Loss functions (Binary Cross-Entropy, MSE, MAE)
- Adam optimizer with bias correction
- Training with mini-batch support and progress bars
- Transfer learning support with trainable/non-trainable layers
- Model evaluation and prediction capabilities

## Installation

```bash
pip install numpy tqdm h5py
```

## Quick Start

```python
from MDLNN.models import Sequential
from MDLNN.layers import Input, Dense
from MDLNN.utils import Initializers

# Create a simple binary classification model
model = Sequential([
    Input(input_shape=(2,)),
    Dense(4, activation="tanh", initializer=Initializers.xavier_uniform),
    Dense(1, activation="sigmoid", initializer=Initializers.xavier_uniform)
])

# Compile model with custom optimizer choice and parameters
model.compile(
    loss="binary_cross_entropy",
    optimizer="adam",  # or "sgd"
    optimizer_params={'learning_rate': 0.01}
)

# Train the model with mini-batches and progress bar
model.fit(X, y, epochs=100, batch_size=32, verbose=True, shuffle=True)

# Make predictions
predictions = model.predict(X_test)
```

## Components

### Layers

- **Dense**: Fully connected layer with configurable activation
- **Dropout**: Regularization layer to prevent overfitting
- **Input**: Input layer that validates data shape
- **Flatten**: Reshapes input for transition between conv and dense layers
- **Conv2D**: (Placeholder for future implementation)
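
As a minimal sketch (using only the constructor arguments that appear in the examples later in this README), layers are composed by passing them to `Sequential` in order:

```python
from MDLNN.models import Sequential
from MDLNN.layers import Input, Dense, Dropout
from MDLNN.utils import Initializers

model = Sequential([
    Input(input_shape=(784,)),    # validates the incoming data shape
    Dense(128, activation="relu", initializer=Initializers.He_uniform),
    Dropout(keep_p=0.5),          # keep_p appears to be the keep probability
    Dense(10, activation="softmax", initializer=Initializers.xavier_uniform)
])
```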

### Activation Functions

- ReLU
- Sigmoid
- Tanh
- Softmax

### Weight Initializers

- Zeros
- Ones
- Random Normal
- Random Uniform
- Xavier/Glorot (Uniform and Normal)
- He (Uniform and Normal)
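
Initializers are passed as callables from `MDLNN.utils.Initializers`. The examples in this README use `Initializers.xavier_uniform` and `Initializers.He_uniform`; the other entries above are presumably exposed under analogous names:

```python
from MDLNN.layers import Dense
from MDLNN.utils import Initializers

# He initialization is a common pairing for ReLU layers,
# Xavier/Glorot for tanh, sigmoid, and softmax layers.
hidden = Dense(256, activation="relu", initializer=Initializers.He_uniform)
output = Dense(10, activation="softmax", initializer=Initializers.xavier_uniform)
```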

### Loss Functions

- Binary Cross-Entropy
- Mean Squared Error (MSE)
- Mean Absolute Error (MAE)
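
Losses are selected by string key in `compile`. The examples in this README use `"binary_cross_entropy"` and `"cross_entropy"`; for the regression losses, keys such as `"mse"` and `"mae"` are assumed from the names above:

```python
# `model` is a Sequential built as in the Quick Start above.
# Binary classification (key confirmed by the examples in this README):
model.compile(loss="binary_cross_entropy", optimizer="adam")

# Regression (the "mse" key is an assumption based on the loss names listed above):
model.compile(loss="mse", optimizer="sgd", optimizer_params={'learning_rate': 0.01})
```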

### Optimizers

The framework supports multiple optimizers:

- **Adam**: Adaptive learning-rate optimizer with bias-corrected first and second moment estimates
```python
model.compile(
    loss="binary_cross_entropy",
    optimizer="adam",
    optimizer_params={
        'learning_rate': 0.001,  # Default: 0.001
        'beta1': 0.9,            # Default: 0.9
        'beta2': 0.999,          # Default: 0.999
        'epsilon': 1e-8          # Default: 1e-8
    }
)
```

- **SGD**: Stochastic Gradient Descent with momentum support
```python
model.compile(
    loss="binary_cross_entropy",
    optimizer="sgd",
    optimizer_params={
        'learning_rate': 0.01,  # Default: 0.01
        'momentum': 0.9         # Default: 0.0
    }
)
```

## Advanced Features

### Transfer Learning

Layers can be frozen for transfer learning:

```python
model = Sequential([
    Input(input_shape=(784,)),
    Dense(256, activation="relu", initializer=Initializers.xavier_uniform, trainable=False),  # Frozen layer
    Dense(10, activation="softmax", initializer=Initializers.xavier_uniform)                  # Trainable layer
])
```

### Model Evaluation

```python
# Evaluate model performance
loss, accuracy = model.evaluate(X_test, y_test)
```

### Model Summary

```python
model.summary()
```

Example output:
```
Model Summary:
--------------------------------------------------
Layer 1: Input (2,)
Layer 2: Dense (2 -> 4) | Params: 12
Layer 3: Dense (4 -> 1) | Params: 5
--------------------------------------------------
Total trainable parameters: 17
```
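
Each Dense layer's parameter count is its weight matrix plus its bias vector: the 2 -> 4 layer has 2×4 + 4 = 12 parameters and the 4 -> 1 layer has 4×1 + 1 = 5, giving the 17 trainable parameters reported above.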

### Model Saving and Loading

The framework provides two ways to save your models:

1. **Complete Model Saving (Recommended)**
Save and load the entire model including architecture, weights, and training configuration:

```python
# Save complete model
model.save('my_model.h5')

# Load complete model
loaded_model = Sequential.load('my_model.h5')

# Use the loaded model directly
predictions = loaded_model.predict(X_test)
```

2. **Weights-Only Saving**
Save and load just the model weights (you must recreate the same model architecture before loading):

```python
# Save model weights
model.save_weights('model_weights.npz')

# To load weights, first recreate the model architecture
new_model = Sequential([
    Input(input_shape=(784,)),
    Dense(512, activation="relu"),
    Dense(10, activation="softmax")
])
new_model.compile(loss="cross_entropy", optimizer="adam")

# Then load the weights
new_model.load_weights('model_weights.npz')
```

The complete model saving (HDF5 format) is recommended as it:
- Saves the full model architecture
- Preserves layer configurations
- Stores optimizer settings
- Maintains loss function configuration
- Requires less code to load and use
- Is safer and more efficient for large models

Example workflow using weights-only saving:
```python
# Train your model
model.fit(X_train, y_train, epochs=10)

# Save the trained weights
model.save_weights('my_model.npz')

# Later, create a new model with the same architecture
new_model = Sequential([
    Input(input_shape=(784,)),
    Dense(512, activation="relu"),
    Dense(10, activation="softmax")
])
new_model.compile(loss="cross_entropy", optimizer="adam")

# Load the saved weights
new_model.load_weights('my_model.npz')
```

The weights are saved in NumPy's .npz format, which is efficient for storing multiple arrays.
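
Because the file is a standard `.npz` archive, the saved arrays can also be inspected directly with NumPy (the key names inside depend on how MDLNN's `save_weights` names each layer's parameters):

```python
import numpy as np

# List every array stored in the archive along with its shape.
with np.load('model_weights.npz') as archive:
    for name in archive.files:
        print(name, archive[name].shape)
```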

## Training Features

- Mini-batch training with progress bars
- Shuffle option for each epoch
- Training/evaluation mode switching
- Batch size optimization for large datasets
- Gradient computation for trainable parameters only
- Automatic weight initialization during compilation

## Optimizer Configuration

You can customize the Adam optimizer during model compilation:

```python
model.compile(
    loss="binary_cross_entropy",
    optimizer="adam",
    optimizer_params={
        'learning_rate': 0.001,  # Default: 0.001
        'beta1': 0.9,            # Default: 0.9
        'beta2': 0.999,          # Default: 0.999
        'epsilon': 1e-8          # Default: 1e-8
    }
)
```

## Real-World Examples

### Example 1: XOR Problem
The XOR problem is a classic non-linearly separable problem that demonstrates the basic capabilities of neural networks:

```python
from MDLNN.models import Sequential
from MDLNN.layers import Input, Dense
from MDLNN.utils import Initializers
import numpy as np

# XOR dataset
X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])
y = np.array([[0],
              [1],
              [1],
              [0]])

# Build the model
model = Sequential([
    Input(input_shape=(2,)),
    Dense(4, activation="tanh", initializer=Initializers.xavier_uniform),
    Dense(1, activation="sigmoid", initializer=Initializers.xavier_uniform)
])

# Compile and train
model.compile(
    loss="binary_cross_entropy",
    optimizer="adam",
    optimizer_params={'learning_rate': 0.01}
)

model.fit(X, y, epochs=200, verbose=True)

# Make predictions
predictions = model.predict(X)
print("\nPredictions (rounded):")
print(np.round(predictions))
```

Expected output:
```
Predictions (rounded):
[[0.]
 [1.]
 [1.]
 [0.]]
```

### Example 2: MNIST Classification
This example shows how to build a deeper network for classifying handwritten digits from the MNIST dataset:

```python
from MDLNN.models import Sequential
from MDLNN.layers import Dense, Input, Dropout
from MDLNN.utils import Initializers
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

# Load and preprocess MNIST data
X, y = fetch_openml('mnist_784', version=1, return_X_y=True, as_frame=False)
X = X.astype('float32') / 255.0 # Normalize pixel values

# Convert labels to one-hot encoding
y_onehot = np.zeros((y.shape[0], 10))
y = y.astype(int)
y_onehot[np.arange(y.shape[0]), y] = 1

# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X, y_onehot, test_size=0.2, random_state=42
)

# Build a deep model with dropout
model = Sequential([
    Input(input_shape=(784,)),
    Dense(512, activation="relu", initializer=Initializers.He_uniform),
    Dropout(keep_p=0.5),
    Dense(256, activation="relu", initializer=Initializers.He_uniform),
    Dropout(keep_p=0.3),
    Dense(10, activation="softmax", initializer=Initializers.xavier_uniform)
])

# Compile with appropriate loss for multi-class classification
model.compile(
    loss="cross_entropy",
    optimizer="adam",
    optimizer_params={
        'learning_rate': 0.001,
        'beta1': 0.9,
        'beta2': 0.999,
        'epsilon': 1e-8
    }
)

# Train with mini-batches
model.fit(
    X_train,
    y_train,
    epochs=10,
    batch_size=128,
    verbose=True,
    shuffle=True
)

# Evaluate
loss, accuracy = model.evaluate(X_test, y_test)
```

This MNIST example demonstrates several advanced features:
- Data preprocessing and normalization
- One-hot encoding for multi-class classification
- Using Dropout layers for regularization
- He initialization for ReLU activation layers
- Mini-batch training with progress bars
- Model evaluation with accuracy metrics

## Requirements

- NumPy
- tqdm (for progress bars)
- h5py (for complete model saving)

## Version

Current version: 0.1.0

## License

MIT License