https://github.com/jmaczan/ffnn

Feedforward Neural Network from scratch - backpropagation, gradient descent, activation functions




Host: GitHub
URL: https://github.com/jmaczan/ffnn
Owner: jmaczan
Created: 2023-10-28T09:34:28.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-11-24T12:19:39.000Z (over 2 years ago)
Last Synced: 2025-05-18T11:07:09.484Z (about 1 year ago)
Topics: activation-functions, backpropagation, educational, feedforward-neural-network, from-scratch, gradient-descent, machine-learning, neural-network, nn, python, rectifier, relu, softmax
Language: Python
Homepage:
Size: 57.6 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# neural-network

🐱 Neural Network from scratch - backpropagation, gradient descent, activation functions

## prerequisites

install anaconda, so you have `conda` available in your shell

## development

```
# To activate this environment, use
#
#     $ conda activate nn
#
# To deactivate an active environment, use
#
#     $ conda deactivate
```

export env settings to .yml:

```
conda env export --from-history > environment.yml
```

## notes

softmax derivative:

$$

softmax(z_i) = \frac{e^{z_i}}{\sum_{j=1}^K{e^{z_j}}}

$$
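
a minimal sketch of the definition above in plain Python, using only the standard library. the function name `softmax` and the max-subtraction step are assumptions made for illustration, not necessarily what the repository code does. subtracting the maximum leaves the result unchanged, because the common factor cancels between numerator and denominator, but it keeps `math.exp` from overflowing on large inputs

```python
import math


def softmax(z):
    """Return the softmax of a list of floats (hypothetical helper)."""
    m = max(z)  # shift by the max for numerical stability; the factor cancels out
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
print(probs)       # three positive values, largest for the largest input
print(sum(probs))  # sums to 1 up to floating point rounding
```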

we need two derivatives of softmax - with respect to $z_i$ and with respect to $z_k$ when $k \ne i$

let's say $softmax(z_i) = S_i$. derivative of $S_i$ w.r.t $z_i$:

$$

\frac{\partial{S_i}}{\partial{z_i}}=\frac{\partial}{\partial{z_i}}(\frac{e^{z_i}}{\sum_{j=1}^K{e^{z_j}}})

$$

apply the quotient rule of differentiation

$$

\frac{\partial}{\partial {x}}(\frac{f}{g})=\frac{f'g - fg'}{g^2}

$$

so then

$$

\frac{\partial}{\partial{z_i}}(\frac{e^{z_i}}{\sum_{j=1}^K{e^{z_j}}})=\frac{e^{z_i}\sum_{j=1}^K{e^{z_j}} - e^{z_i}\frac{\partial \sum_{j=1}^K{e^{z_j}}}{\partial{z_i}}}{(\sum_{j=1}^K{e^{z_j}})^2}

$$

the derivative of the sum ${\sum_{j=1}^K{e^{z_j}}}$ w.r.t $z_i$ is ${e^{z_i}}$, because the derivatives of all other terms $e^{z_k}$ (with $k \ne i$) w.r.t $z_i$ are $0$

$$

\frac{e^{z_i}\sum_{j=1}^K{e^{z_j}} - e^{z_i}e^{z_i}}{(\sum_{j=1}^K{e^{z_j}})^2} = \frac{e^{z_i}(\sum_{j=1}^K{e^{z_j}} - e^{z_i})}{(\sum_{j=1}^K{e^{z_j}})^2}=\frac{e^{z_i}}{\sum_{j=1}^K{e^{z_j}}}\frac{(\sum_{j=1}^K{e^{z_j}})-e^{z_i}}{\sum_{j=1}^K{e^{z_j}}}=S_i(1-S_i)

$$
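
the result $S_i(1-S_i)$ can be sanity-checked numerically. the sketch below compares it with a central finite difference. the helper names `softmax`, `diagonal_derivative` and `numeric_derivative`, the step size `h` and the sample input are all assumptions chosen for illustration

```python
import math


def softmax(z):
    """Softmax of a list of floats (hypothetical helper)."""
    m = max(z)  # shift for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def diagonal_derivative(z, i):
    """Analytic dS_i/dz_i = S_i * (1 - S_i)."""
    s = softmax(z)
    return s[i] * (1.0 - s[i])


def numeric_derivative(z, i, k, h=1e-6):
    """Central finite difference approximation of dS_i/dz_k."""
    z_plus = list(z)
    z_plus[k] += h
    z_minus = list(z)
    z_minus[k] -= h
    return (softmax(z_plus)[i] - softmax(z_minus)[i]) / (2 * h)


z = [0.5, -1.0, 2.0]
for i in range(len(z)):
    # the two printed values should agree to several decimal places
    print(i, diagonal_derivative(z, i), numeric_derivative(z, i, i))
```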

now another derivative of $S_i$ w.r.t $z_k$ when $k \ne i$:

$$

\frac{\partial{S_i}}{\partial{z_k}}=\frac{\partial}{\partial{z_k}}(\frac{e^{z_i}}{\sum_{j=1}^K{e^{z_j}}})

$$

once again let's apply the quotient rule of differentiation

$$

\frac{\partial}{\partial{z_k}}(\frac{e^{z_i}}{\sum_{j=1}^K{e^{z_j}}})=\frac{\frac{\partial{e^{z_i}}}{\partial{z_k}}\sum_{j=1}^K{e^{z_j}}-e^{z_i}\frac{\partial \sum_{j=1}^K{e^{z_j}}}{\partial{z_k}}}{(\sum_{j=1}^K{e^{z_j}})^2}

$$

the first term in the numerator is $0$: $e^{z_i}$ does not depend on $z_k$ when $k \ne i$, and the partial derivative of a function with respect to a variable that does not appear in it is $0$

in the second term, the derivative of the sum w.r.t $z_k$ leaves only $e^{z_k}$, because every other term of the sum has derivative $0$. so then:

$$

\frac{-e^{z_i}e^{z_k}}{(\sum_{j=1}^K{e^{z_j}})^2}=S_i\frac{-e^{z_k}}{\sum_{j=1}^K{e^{z_j}}}

$$

$\frac{e^{z_k}}{\sum_{j=1}^K{e^{z_j}}}$ is softmax with ${z_k}$ as the argument, i.e. $S_k$, so the remaining factor is $-S_k$ and then:

$$

\frac{\partial{S_i}}{\partial{z_k}}=-S_i \cdot S_k

$$
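
both cases can be written as one formula, $\frac{\partial{S_i}}{\partial{z_k}} = S_i(\delta_{ik} - S_k)$, where $\delta_{ik}$ is $1$ when $i = k$ and $0$ otherwise. a minimal sketch that builds the full Jacobian from this formula follows. the names `softmax` and `softmax_jacobian` and the nested-list layout are assumptions for illustration, not the repository's API

```python
import math


def softmax(z):
    """Softmax of a list of floats (hypothetical helper)."""
    m = max(z)  # shift for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_jacobian(z):
    """Return J as nested lists, with J[i][k] = dS_i/dz_k = S_i * (delta_ik - S_k)."""
    s = softmax(z)
    n = len(s)
    return [
        [s[i] * ((1.0 if i == k else 0.0) - s[k]) for k in range(n)]
        for i in range(n)
    ]


J = softmax_jacobian([0.5, -1.0, 2.0])
for row in J:
    print(row)
```

two properties are useful as a quick check: the matrix is symmetric, because $-S_i S_k = -S_k S_i$, and each row sums to $0$, because the softmax outputs always sum to $1$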
