{"id":31133035,"url":"https://github.com/mlicamele/neural-network","last_synced_at":"2026-04-12T07:33:08.549Z","repository":{"id":313961817,"uuid":"1053603682","full_name":"mlicamele/neural-network","owner":"mlicamele","description":"Project focused on exploring the computations behind neural networks by building one from scratch with only numpy and testing it with the MNIST dataset.","archived":false,"fork":false,"pushed_at":"2025-09-10T14:55:17.000Z","size":20450,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-18T05:22:14.156Z","etag":null,"topics":["gradient-descent","matrix-computations","neural-networks","numpy","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mlicamele.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-09T17:12:13.000Z","updated_at":"2025-09-10T14:55:20.000Z","dependencies_parsed_at":"2025-09-09T21:06:33.541Z","dependency_job_id":null,"html_url":"https://github.com/mlicamele/neural-network","commit_stats":null,"previous_names":["mlicamele/neural-network"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mlicamele/neural-network","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlicamele%2Fneural-network","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlicamele%2Fneural-networ
k/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlicamele%2Fneural-network/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlicamele%2Fneural-network/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mlicamele","download_url":"https://codeload.github.com/mlicamele/neural-network/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlicamele%2Fneural-network/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279000786,"owners_count":26082851,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gradient-descent","matrix-computations","neural-networks","numpy","python"],"created_at":"2025-09-18T05:03:29.482Z","updated_at":"2025-10-08T22:29:42.796Z","avatar_url":"https://github.com/mlicamele.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# neural-network\n\nProject focused on exploring the computations behind neural networks by building one from scratch with only numpy and testing it with the MNIST dataset.\n\n## Academic Project Overview\n\nThis project was developed as part of a Linear Algebra/Multivariable Calculus course to explore the practical applications of 
mathematical concepts in artificial intelligence. The implementation focuses on understanding the underlying mathematics of neural networks, including matrix operations, partial derivatives, and gradient descent optimization.\n\n## Features\n\n- **From-Scratch Implementation**: Built entirely using NumPy for matrix operations\n- **Modular Architecture**: Object-oriented design with separate layer classes\n- **Mathematical Foundation**: Detailed implementation of forward and backward propagation\n- **MNIST Classification**: Digit recognition on the classic 28×28 pixel dataset\n- **Custom Layer Types**: Dense, ReLU activation, and Softmax output layers\n- **Gradient Descent**: Implementation of backpropagation with configurable learning rates\n\n## Results\n\n- **Training Accuracy**: 92.6%\n- **Test Accuracy**: 91.4%\n- **Dataset**: MNIST (28×28 grayscale digit images)\n- **Architecture**: 784 → 10 → 10 → 10 (with ReLU and Softmax activations)\n\n## Architecture\n\n### Network Structure\n```\nInput Layer (784) → Dense Layer (10) → ReLU → Dense Layer (10) → Softmax → Output (10)\n```\n\n- **Input**: 784 features (28×28 pixel values)\n- **Hidden Layer**: 10 neurons with ReLU activation\n- **Output**: 10 classes (digits 0-9) with Softmax probabilities\n\n### Mathematical Foundation\n\nThe implementation includes detailed mathematical derivations for:\n- **Forward Propagation**: Linear combinations with weights and biases\n- **Backward Propagation**: Gradient computation using chain rule\n- **Weight Updates**: Gradient descent optimization\n- **Error Gradients**: Partial derivatives for each layer type\n\n## Installation\n\n### Prerequisites\n- Python 3.7+\n- NumPy for matrix operations\n\n### Data\n- MNIST Dataset used can be found here: https://www.kaggle.com/competitions/digit-recognizer/data\n\n### Setup\n```bash\n# Clone the repository\ngit clone https://github.com/mlicamele/neural-network.git\ncd neural-network\n\n# Install dependencies\npip install 
numpy\n```\n\n## Quick Start\n\n```python\nfrom neural_network import Network, Dense, Relu, Softmax\nimport numpy as np\n\n# Create the network architecture\nnet = Network([\n    Dense(784, 10),    # Input to hidden layer\n    Relu(),            # ReLU activation\n    Dense(10, 10),     # Hidden layer\n    Softmax()          # Output layer with softmax\n], \nlearning_rate=0.5,\nepochs=1000)\n\n# Train the network\nnet.train(X_train, y_train)\n\n# Make predictions\npredictions = net.predict(X_test)\n```\n\n## Layer Classes\n\n### Dense Layer\nImplements fully connected layers with learnable weights and biases:\n```python\nclass Dense(Layer):\n    def __init__(self, input_size, output_size):\n        self.weights = np.random.rand(output_size, input_size) - 0.5\n        self.biases = np.random.rand(output_size, 1) - 0.5\n```\n\n### Activation Layers\n- **ReLU**: Rectified Linear Unit activation function\n- **Softmax**: Probability distribution for multi-class classification\n\n### Mathematical Operations\n- **Forward Pass**: `output = weights @ input + biases`\n- **Backward Pass**: Gradient computation using chain rule\n- **Weight Updates**: `weights -= learning_rate * gradient`\n\n## Mathematical Implementation\n\n### Forward Propagation\nFor each layer, the output is computed as:\n```\nY = W·X + B\n```\nWhere:\n- `W` is the weight matrix\n- `X` is the input vector\n- `B` is the bias vector\n\n### Backward Propagation\nGradients are computed using partial derivatives:\n- **Input gradient**: `∂E/∂X = W^T · ∂E/∂Y`\n- **Weight gradient**: `∂E/∂W = ∂E/∂Y · X^T`\n- **Bias gradient**: `∂E/∂B = ∂E/∂Y`\n\n## Project Structure\n\n```\nneural-network/\n│\n├── network.py                                                # Main Network class\n├── test_network.ipynb                                        # Training network and predicting on MNIST dataset\n├── gradient_descent.ipynb                                    # Non-object oriented approach to get familiar with the basics 
of the computation\n├── data\n│   ├── test.csv                                              # Test dataset\n│   └── train.csv                                             # Train dataset\n├── Applications of Linear ALgebra in Neural Networks.pdf     # Full research paper\n└── README.md\n```\n\n## Key Learning Outcomes\n\nThis project demonstrates:\n- **Matrix Operations**: Extensive use of NumPy for linear algebra\n- **Calculus Applications**: Partial derivatives in backpropagation\n- **Optimization**: Gradient descent implementation\n- **Object-Oriented Design**: Modular layer architecture\n- **Machine Learning Fundamentals**: Training, validation, and testing\n\n## Training Process\n\n1. **Data Preprocessing**: MNIST images flattened to 784-dimensional vectors\n2. **One-Hot Encoding**: Target labels converted to probability distributions\n3. **Forward Propagation**: Data flows through network layers\n4. **Loss Calculation**: Error computed between predictions and targets\n5. **Backward Propagation**: Gradients computed and weights updated\n6. **Iteration**: Process repeated for specified epochs\n\n## Hyperparameters\n\n| Parameter | Value | Description |\n|-----------|-------|-------------|\n| Learning Rate | 0.5 | Step size for gradient descent |\n| Epochs | 1000 | Number of training iterations |\n| Hidden Units | 10 | Neurons in hidden layer |\n| Batch Processing | Full dataset | All samples processed simultaneously |\n\n## Educational Features\n\n- **Mathematical Derivations**: Complete derivations included in documentation\n- **Step-by-Step Implementation**: Each mathematical operation explained\n- **Visualization Ready**: Easy to add plotting for loss curves and accuracy\n- **Extensible Design**: Simple to add new layer types or activation functions\n\n## Academic References\n\nThis implementation is based on fundamental neural network principles detailed in:\n- Cano, A. (2017). 
A survey on graphic processing unit computing for large‐scale data mining\n- Chakrabarti, B. K. (1995). Neural networks\n- Dolson, M. (1989). Machine tongues xii: Neural networks\n- Fan, J., Ma, C., \u0026 Zhong, Y. (2021). A selective overview of deep learning\n- Higham, C. F., \u0026 Higham, D. J. (2019). Deep learning: An introduction for applied mathematicians\n- West, P. M., Brockett, P. L., \u0026 Golden, L. L. (1997). A comparative analysis of neural networks and statistical methods for predicting consumer choice\n- Zhang, C.-H. (2007). Continuous generalized gradient descent\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlicamele%2Fneural-network","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmlicamele%2Fneural-network","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlicamele%2Fneural-network/lists"}