{"id":21864112,"url":"https://github.com/raoulluque/imagerecognitionfromscratch","last_synced_at":"2025-04-14T20:54:01.683Z","repository":{"id":264341668,"uuid":"893030099","full_name":"RaoulLuque/ImageRecognitionFromScratch","owner":"RaoulLuque","description":"Deep convolutional neural network implementation from scratch using Python and NumPy to classify the MNIST dataset","archived":false,"fork":false,"pushed_at":"2025-01-17T10:56:44.000Z","size":129950,"stargazers_count":4,"open_issues_count":7,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-28T09:11:08.640Z","etag":null,"topics":["convolutional-neural-networks","digit-recognition","machine-learning","mnist","neural-network","numpy","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RaoulLuque.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-23T10:55:51.000Z","updated_at":"2025-01-17T10:56:46.000Z","dependencies_parsed_at":"2025-01-17T11:39:18.118Z","dependency_job_id":null,"html_url":"https://github.com/RaoulLuque/ImageRecognitionFromScratch","commit_stats":null,"previous_names":["raoulluque/image-recognition-neural-network","raoulluque/imagerecognitionfromscratch"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RaoulLuque%2FImageRecognitionFromScratch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RaoulLuque%2FImageRecognitionFromScratch/tags","releases_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RaoulLuque%2FImageRecognitionFromScratch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RaoulLuque%2FImageRecognitionFromScratch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RaoulLuque","download_url":"https://codeload.github.com/RaoulLuque/ImageRecognitionFromScratch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248961083,"owners_count":21189990,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["convolutional-neural-networks","digit-recognition","machine-learning","mnist","neural-network","numpy","python"],"created_at":"2024-11-28T04:07:29.866Z","updated_at":"2025-04-14T20:54:01.665Z","avatar_url":"https://github.com/RaoulLuque.png","language":"Python","readme":"# Image recognition from scratch\r\nThis repository documents the progress of developing a convolutional neural network from scratch. The goal was to classify the MNIST dataset and the best model achieves an error rate of 0.40%. 
This work is also accompanied by a written report, see [ImageRecognitionFromScratch](Image_recognition_from_scratch.pdf).\r\n\r\n# Models\r\nThe following is a brief summary of the models that represent different checkpoints in the development process.\r\n\r\nTo run the code or a specific model, please refer to the [running a model](#running-a-model) section.\r\n\r\nThe logs of each model can be found by following the link below that model to browse the repository at that state and opening the [best_result.log](best_result.log) or [best_result.txt](best_result.log) file (depending on how old the model is).\r\n\r\nThe best model using this library is the [twelfth](#twelfth-model-0040-error-rate), with an error rate of 0.40% on the MNIST test images. To load this model, download the repository at that state and refer to the [running a model](#running-a-model) section. Note that a pretrained model is stored in the [models](models) directory. Since the goal of this assignment was to achieve an error rate of 0.30% and this repository's implementation does not have GPU support, a Jupyter notebook is provided with which a TensorFlow model can be set up that achieves a sub 0.30% error rate. 
Said model could be set up with this library as well, but would take a very long time to train.\r\n\r\n## First model (09-10% error rate)\r\n[3f5521c](https://github.com/RaoulLuque/image-recognition-neural-network/tree/3f5521c3a99c06911f46d639afd329db93781204)\r\n- Stochastic gradient descent (batch size of 1)\r\n- Mean squared error loss function\r\n- Model layout:\r\n  ```\r\n  model.add_layer(FCLayer(28 * 28, 100))  # input_shape=(1, 28*28)   ;   output_shape=(1, 100)\r\n  model.add_layer(ActivationLayer(ActivationFunction.tanh))\r\n  model.add_layer(FCLayer(100, 50))  # input_shape=(1, 100)          ;   output_shape=(1, 50)\r\n  model.add_layer(ActivationLayer(ActivationFunction.tanh))\r\n  model.add_layer(FCLayer(50, 10))  # input_shape=(1, 50)            ;   output_shape=(1, 10)\r\n  model.add_layer(ActivationLayer(ActivationFunction.tanh))\r\n  ```\r\n- 9-10% error rate\r\n- 100 epochs\r\n- Fixed learning rate of 0.1\r\n\r\n## Second model (06.75% error rate)\r\n[9eac97e](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/9eac97e44408121367c2a4befaad8b49598b5123)\r\n- Mini batch gradient descent (batch size of 32)\r\n- Mean squared error loss function\r\n- Model layout:\r\n  ```\r\n  model.add_layer(FCLayer(28 * 28, 128))  # input_shape=(1, 28*28)   ;   output_shape=(1, 128)\r\n  model.add_layer(ActivationLayer(ActivationFunction.tanh, 128))\r\n  model.add_layer(FCLayer(128, 10))       # input_shape=(1, 128)     ;   output_shape=(1, 10)\r\n  model.add_layer(ActivationLayer(ActivationFunction.tanh, 10))\r\n  ```\r\n- 6.75% error rate\r\n- 100 epochs\r\n- Fixed learning rate of 0.1\r\n\r\n## Third model (03.10% error rate)\r\n[73111ee](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/73111ee333557ac0d6c4aefa3cfc2a775a0cccdd)\r\n- Mini batch gradient descent (batch size of 32)\r\n- Cross entropy loss function\r\n- Softmax activation function on last layer\r\n- Model layout:\r\n  ```\r\n  model.add_layer(FCLayer(28 * 28, 128))  # input_shape=(1, 
28*28)   ;   output_shape=(1, 128)\r\n  model.add_layer(ActivationLayer(ActivationFunction.tanh, 128))\r\n  model.add_layer(FCLayer(128, 10))       # input_shape=(1, 128)     ;   output_shape=(1, 10)\r\n  model.add_layer(ActivationLayer(ActivationFunction.softmax, 10))\r\n  ```\r\n- 3.1% error rate\r\n- 100 epochs\r\n- Fixed learning rate of 0.1\r\n\r\n## Fourth model (02.64% error rate)\r\n[d578b4b](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/d578b4b0c7c053d292ae270f1e7d40fed14926c5)\r\n- Mini batch gradient descent (batch size of 32)\r\n- Cross entropy loss function\r\n- Softmax activation function on last layer\r\n- Adam optimizer\r\n- Model layout:\r\n  ```\r\n  model.add_layer(FCLayer(28 * 28, 128, optimizer=Optimizer.Adam))  # input_shape=(1, 28*28)   ;   output_shape=(1, 128)\r\n  model.add_layer(ActivationLayer(ActivationFunction.tanh, 128))\r\n  model.add_layer(FCLayer(128, 10, optimizer=Optimizer.Adam))       # input_shape=(1, 128)     ;   output_shape=(1, 10)\r\n  model.add_layer(ActivationLayer(ActivationFunction.softmax, 10))\r\n  ```\r\n- 2.64% error rate\r\n- 30 epochs\r\n- Fixed learning rate of 0.01\r\n\r\n## Fifth model (02.19% error rate)\r\n[1a608e1](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/1a608e1aa6394129d516857bde713eeddd258f84)\r\n- Mini batch gradient descent (batch size of 32)\r\n- Cross entropy loss function\r\n- Softmax activation function on last layer\r\n- Adam optimizer\r\n- Dropout layers\r\n- Model layout:\r\n  ```\r\n  model.add_layer(FCLayer(28 * 28, 128, optimizer=Optimizer.Adam))  # input_shape=(1, 28*28)   ;   output_shape=(1, 128)\r\n  model.add_layer(ActivationLayer(ActivationFunction.tanh, 128))\r\n  model.add_layer(DropoutLayer(0.2, 128))\r\n  model.add_layer(FCLayer(128, 10, optimizer=Optimizer.Adam))       # input_shape=(1, 128)     ;   output_shape=(1, 10)\r\n  model.add_layer(ActivationLayer(ActivationFunction.softmax, 10))\r\n  ```\r\n- 2.19% error rate\r\n- 50 epochs\r\n- Fixed 
learning rate of 0.001\r\n\r\n## Sixth model (02.02% error rate)\r\n[251738c](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/251738c9ff68e2344f4ee6ded2dfd62f122815c1)\r\n- Mini batch gradient descent (batch size of 16)\r\n- Cross entropy loss function\r\n- Softmax activation function on last layer\r\n- Adam optimizer\r\n- Dropout layers\r\n- Model layout:\r\n  ```\r\n  model.add_layer(FCLayer(28 * 28, 128, optimizer=Optimizer.Adam))  # input_shape=(1, 28*28)    ;   output_shape=(1, 128)\r\n  model.add_layer(ActivationLayer(ActivationFunction.ReLu, 128))\r\n  model.add_layer(DropoutLayer(0.2, 128))\r\n  model.add_layer(FCLayer(128, 50, optimizer=Optimizer.Adam))       # input_shape=(1, 128)      ;   output_shape=(1, 50)\r\n  model.add_layer(ActivationLayer(ActivationFunction.ReLu, 50))\r\n  model.add_layer(DropoutLayer(0.2, 50))\r\n  model.add_layer(FCLayer(50, 10, optimizer=Optimizer.Adam))        # input_shape=(1, 50)       ;   output_shape=(1, 10)\r\n  model.add_layer(ActivationLayer(ActivationFunction.softmax, 10))\r\n  ```\r\n- 2.02% error rate\r\n- 200 epochs\r\n- Fixed learning rate of 0.0005\r\n\r\n## Seventh model (01.93% error rate)\r\n[0ae6882](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/0ae68824cf0889e8a7dcfc6b965cf504ea153767)\r\n- Mini batch gradient descent (batch size of 16)\r\n- Cross entropy loss function\r\n- Softmax activation function on last layer\r\n- Adam optimizer\r\n- Dropout layers\r\n- (Default) Data augmentation (0.25 Chance to do so)\r\n- Model layout:\r\n    ```\r\n    model.add_layer(FCLayer(28 * 28, 128, optimizer=Optimizer.Adam))    # input_shape=(1, 28*28)    ;   output_shape=(1, 128)\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 128))\r\n    model.add_layer(DropoutLayer(0.2, 128))\r\n\r\n    model.add_layer(FCLayer(128, 50, optimizer=Optimizer.Adam))         # input_shape=(1, 128)      ;   output_shape=(1, 50)\r\n    
model.add_layer(ActivationLayer(ActivationFunction.ReLu, 50))\r\n    model.add_layer(DropoutLayer(0.2, 50))\r\n\r\n    model.add_layer(FCLayer(50, 10, optimizer=Optimizer.Adam))          # input_shape=(1, 50)       ;   output_shape=(1, 10)\r\n    model.add_layer(ActivationLayer(ActivationFunction.softmax, 10))\r\n    ```\r\n- 1.93% error rate\r\n- 100 epochs\r\n- Fixed learning rate of 0.0005\r\n\r\n## Eighth model (01.58% error rate)\r\n[c07da15](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/c07da150ba8dee6527e3e1474645f096351a8467)\r\n- Mini batch gradient descent (batch size of 16)\r\n- Cross entropy loss function\r\n- Softmax activation function on last layer\r\n- Adam optimizer\r\n- Dropout layers\r\n- (Default) Data augmentation (0.25 Chance to do so)\r\n- Early stopping (min relative delta 0.005 and patience of 15)\r\n- He weight initialization\r\n- Model layout:\r\n    ```\r\n    model.add_layer(FCLayer(28 * 28, 128, optimizer=Optimizer.Adam, weight_initialization=WeightInitialization.he_bias_zero))  # input_shape=(1, 28*28)    ;   output_shape=(1, 128)\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 128))\r\n    model.add_layer(DropoutLayer(0.2, 128))\r\n\r\n    model.add_layer(FCLayer(128, 50, optimizer=Optimizer.Adam, weight_initialization=WeightInitialization.he_bias_zero))       # input_shape=(1, 128)      ;   output_shape=(1, 50)\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 50))\r\n    model.add_layer(DropoutLayer(0.2, 50))\r\n\r\n    model.add_layer(FCLayer(50, 10, optimizer=Optimizer.Adam, weight_initialization=WeightInitialization.he_bias_zero))        # input_shape=(1, 50)       ;   output_shape=(1, 10)\r\n    model.add_layer(ActivationLayer(ActivationFunction.softmax, 10))\r\n    ```\r\n- 1.58% error rate\r\n- 175 epochs (early stopping after 91)\r\n- Fixed learning rate of 0.0005\r\n\r\n## Ninth model (00.80% error 
rate)\r\n[712a13e](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/712a13e6be3114f63187c794fe71220213aadf41)\r\n- Mini batch gradient descent (batch size of 16)\r\n- Cross entropy loss function\r\n- Softmax activation function on last layer\r\n- Adam optimizer\r\n- Dropout layers\r\n- (Default) Data augmentation (0.25 Chance to do so)\r\n- Early stopping (min relative delta 0.005 and patience of 20)\r\n- He weight initialization\r\n- 2 2D convolutional layers\r\n- Model layout:\r\n    ```\r\n    # Block 1: input_shape=(BATCH_SIZE, 1, 28, 28) output_shape=(BATCH_SIZE, 8, 14, 14)\r\n    model.add_layer(Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=1, NF_number_of_filters=8, H_height_input=28, W_width_input=28, optimizer=Optimizer.Adam))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(MaxPoolingLayer2D(D_batch_size=BATCH_SIZE, PS_pool_size=2, S_stride=2, C_number_channels=8, H_height_input=28, W_width_input=28))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 2: input_shape=(BATCH_SIZE, 8, 14, 14) output_shape=(BATCH_SIZE, 16, 7, 7)\r\n    model.add_layer(Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=8, NF_number_of_filters=16, H_height_input=14, W_width_input=14, optimizer=Optimizer.Adam))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(MaxPoolingLayer2D(D_batch_size=BATCH_SIZE, PS_pool_size=2, S_stride=2, C_number_channels=16, H_height_input=14, W_width_input=14))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 3: input_shape=(BATCH_SIZE, 16, 7, 7) output_shape=(BATCH_SIZE, 16 * 7 * 7)\r\n    model.add_layer(FlattenLayer(D_batch_size=BATCH_SIZE, C_number_channels=16, H_height_input=7, W_width_input=7))\r\n\r\n    # Block 4: input_shape=(BATCH_SIZE, 16 * 7 * 7) output_shape=(BATCH_SIZE, 10)\r\n    
model.add_layer(FCLayer(16 * 7 * 7, 10, optimizer=Optimizer.Adam, convolutional_network=True))\r\n    model.add_layer(ActivationLayer(ActivationFunction.softmax, 10, convolutional_network=True))\r\n    ```\r\n- 0.80% error rate\r\n- 150 epochs (early stopping after 29)\r\n- Fixed learning rate of 0.001\r\n\r\n## Tenth model (00.44% error rate)\r\n[b883661](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/b8836618da58081a6959b4dbdd59d24a59aab2e7)\r\n- Mini batch gradient descent (batch size of 16)\r\n- Cross entropy loss function\r\n- Softmax activation function on last layer\r\n- Adam optimizer\r\n- Dropout layers\r\n- (Default) Data augmentation (0.5 Chance to do so)\r\n- Early stopping (min relative delta 0.005 and patience of 25)\r\n- He weight initialization\r\n- 3 2D convolutional layers\r\n- Model layout:\r\n    ```\r\n    # Block 1: input_shape=(BATCH_SIZE, 1, 28, 28) output_shape=(BATCH_SIZE, 16, 14, 14)\r\n    model.add_layer(Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=1, NF_number_of_filters=16, H_height_input=28, W_width_input=28, optimizer=Optimizer.Adam))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(MaxPoolingLayer2D(D_batch_size=BATCH_SIZE, PS_pool_size=2, S_stride=2, C_number_channels=16, H_height_input=28, W_width_input=28))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 2: input_shape=(BATCH_SIZE, 16, 14, 14) output_shape=(BATCH_SIZE, 32, 14, 14)\r\n    model.add_layer(Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=16, NF_number_of_filters=32, H_height_input=14, W_width_input=14, optimizer=Optimizer.Adam))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 3: input_shape=(BATCH_SIZE, 32, 14, 14) output_shape=(BATCH_SIZE, 48, 7, 7)\r\n    
model.add_layer(Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=32, NF_number_of_filters=48, H_height_input=14, W_width_input=14, optimizer=Optimizer.Adam))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(MaxPoolingLayer2D(D_batch_size=BATCH_SIZE, PS_pool_size=2, S_stride=2, C_number_channels=48, H_height_input=14, W_width_input=14))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 4: input_shape=(BATCH_SIZE, 48, 7, 7) output_shape=(BATCH_SIZE, 48 * 7 * 7)\r\n    model.add_layer(FlattenLayer(D_batch_size=BATCH_SIZE, C_number_channels=48, H_height_input=7, W_width_input=7))\r\n\r\n    # Block 5: input_shape=(BATCH_SIZE, 48 * 7 * 7) output_shape=(BATCH_SIZE, 10)\r\n    model.add_layer(FCLayer(48 * 7 * 7, 10, optimizer=Optimizer.Adam, convolutional_network=True))\r\n    model.add_layer(ActivationLayer(ActivationFunction.softmax, 10, convolutional_network=True))\r\n    ```\r\n- 0.44% error rate\r\n- 150 epochs (early stopping after 48)\r\n- Tunable learning rate scheduler (starting learning rate of 0.001)\r\n\r\n## Eleventh model (00.42% error rate)\r\n[b91ea97](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/b91ea97b311aa2ebd95666acc398832be9c0db9a)\r\n- Mini batch gradient descent (batch size of 16)\r\n- Cross entropy loss function\r\n- Softmax activation function on last layer\r\n- Adam optimizer\r\n- Dropout layers\r\n- (Default) Data augmentation (0.5 Chance to do so)\r\n- Early stopping (min relative delta 0.005 and patience of 25)\r\n- He weight initialization\r\n- 4 2D convolutional layers\r\n- Model layout:\r\n    ```\r\n    # Block 1: input_shape=(BATCH_SIZE, 1, 28, 28) output_shape=(BATCH_SIZE, 16, 28, 28)\r\n    model.add_layer(\r\n        Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=1, NF_number_of_filters=16, H_height_input=28,\r\n                      W_width_input=28, optimizer=optimizer))\r\n    
model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 2: input_shape=(BATCH_SIZE, 16, 28, 28) output_shape=(BATCH_SIZE, 32, 14, 14)\r\n    model.add_layer(\r\n        Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=16, NF_number_of_filters=32, H_height_input=28,\r\n                      W_width_input=28, optimizer=optimizer))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(MaxPoolingLayer2D(D_batch_size=BATCH_SIZE, PS_pool_size=2, S_stride=2, C_number_channels=32,\r\n                                      H_height_input=28, W_width_input=28))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 3: input_shape=(BATCH_SIZE, 32, 14, 14) output_shape=(BATCH_SIZE, 48, 14, 14)\r\n    model.add_layer(\r\n        Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=32, NF_number_of_filters=48, H_height_input=14,\r\n                      W_width_input=14, optimizer=optimizer))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 4: input_shape=(BATCH_SIZE, 48, 14, 14) output_shape=(BATCH_SIZE, 64, 7, 7)\r\n    model.add_layer(\r\n        Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=48, NF_number_of_filters=64, H_height_input=14,\r\n                      W_width_input=14, optimizer=optimizer))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(MaxPoolingLayer2D(D_batch_size=BATCH_SIZE, PS_pool_size=2, S_stride=2, C_number_channels=64,\r\n                                      H_height_input=14, W_width_input=14))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 5: 
input_shape=(BATCH_SIZE, 64, 7, 7) output_shape=(BATCH_SIZE, 64 * 7 * 7)\r\n    model.add_layer(FlattenLayer(D_batch_size=BATCH_SIZE, C_number_channels=64, H_height_input=7, W_width_input=7))\r\n\r\n    # Block 6: input_shape=(BATCH_SIZE, 64 * 7 * 7) output_shape=(BATCH_SIZE, 10)\r\n    model.add_layer(FCLayer(64 * 7 * 7, 10, optimizer=optimizer, convolutional_network=True))\r\n    model.add_layer(ActivationLayer(ActivationFunction.softmax, 10, convolutional_network=True))\r\n    ```\r\n- 0.42% error rate\r\n- 150 epochs (early stopping after 70)\r\n- Tunable learning rate scheduler (starting learning rate of 0.001), halved after every 5 epochs\r\n\r\n## Twelfth model (00.40% error rate)\r\n[403d08d](https://github.com/RaoulLuque/ImageRecognitionFromScratch/tree/403d08d3e5fdfde9250b921cfb6bbc894aaadee7)\r\n- Mini batch gradient descent (batch size of 16)\r\n- Cross entropy loss function\r\n- Softmax activation function on last layer\r\n- Adam optimizer\r\n- Dropout layers\r\n- (Default) Data augmentation (0.8 Chance to do so)\r\n- Early stopping (min relative delta 0.005 and patience of 15)\r\n- He weight initialization\r\n- 4 2D convolutional layers\r\n- Model layout:\r\n    ```\r\n    # Block 1: input_shape=(BATCH_SIZE, 1, 28, 28) output_shape=(BATCH_SIZE, 16, 28, 28)\r\n    model.add_layer(\r\n        Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=1, NF_number_of_filters=16, H_height_input=28,\r\n                      W_width_input=28, optimizer=optimizer))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 2: input_shape=(BATCH_SIZE, 16, 28, 28) output_shape=(BATCH_SIZE, 32, 14, 14)\r\n    model.add_layer(\r\n        Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=16, NF_number_of_filters=32, H_height_input=28,\r\n                      W_width_input=28, optimizer=optimizer))\r\n    
model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(MaxPoolingLayer2D(D_batch_size=BATCH_SIZE, PS_pool_size=2, S_stride=2, C_number_channels=32,\r\n                                      H_height_input=28, W_width_input=28))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 3: input_shape=(BATCH_SIZE, 32, 14, 14) output_shape=(BATCH_SIZE, 48, 14, 14)\r\n    model.add_layer(\r\n        Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=32, NF_number_of_filters=48, H_height_input=14,\r\n                      W_width_input=14, optimizer=optimizer))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 4: input_shape=(BATCH_SIZE, 48, 14, 14) output_shape=(BATCH_SIZE, 64, 7, 7)\r\n    model.add_layer(\r\n        Convolution2D(D_batch_size=BATCH_SIZE, C_number_channels=48, NF_number_of_filters=64, H_height_input=14,\r\n                      W_width_input=14, optimizer=optimizer))\r\n    model.add_layer(ActivationLayer(ActivationFunction.ReLu, 0, convolutional_network=True))\r\n    model.add_layer(MaxPoolingLayer2D(D_batch_size=BATCH_SIZE, PS_pool_size=2, S_stride=2, C_number_channels=64,\r\n                                      H_height_input=14, W_width_input=14))\r\n    model.add_layer(DropoutLayer(0.2, 0, convolutional_network=True))\r\n\r\n    # Block 5: input_shape=(BATCH_SIZE, 64, 7, 7) output_shape=(BATCH_SIZE, 64 * 7 * 7)\r\n    model.add_layer(FlattenLayer(D_batch_size=BATCH_SIZE, C_number_channels=64, H_height_input=7, W_width_input=7))\r\n\r\n    # Block 6: input_shape=(BATCH_SIZE, 64 * 7 * 7) output_shape=(BATCH_SIZE, 10)\r\n    model.add_layer(FCLayer(64 * 7 * 7, 10, optimizer=optimizer, convolutional_network=True))\r\n    model.add_layer(ActivationLayer(ActivationFunction.softmax, 10, convolutional_network=True))\r\n    
```\r\n- 0.40% error rate\r\n- 150 epochs (early stopping after 42)\r\n- Tunable learning rate scheduler (starting learning rate of 0.001), halved after every 5 epochs (and after every 3 epochs past the 20th epoch)\r\n\r\n# Running a model\r\nTo start the application, the dependencies have to be installed first. Using [uv](https://github.com/astral-sh/uv) is recommended; an installation guide can be found [here](https://docs.astral.sh/uv/getting-started/). If [pipx](https://pipx.pypa.io/stable/) is already installed on the machine, it is as easy as\r\n```commandline\r\npipx install uv\r\n```\r\n\r\nAfter having installed uv, create a venv and install the necessary dependencies by running:\r\n```commandline\r\nuv python install\r\nuv sync --all-extras --dev\r\n```\r\nTo finish the setup of the Python environment, please also run:\r\n```commandline\r\nset -a\r\nsource .env\r\n```\r\n\r\nNow the project could be run with\r\n```commandline\r\nuv run src/main.py\r\n```\r\nHowever, the project uses [poethepoet](https://github.com/nat-n/poethepoet) as a task runner. To install poethepoet with pipx, run\r\n```commandline\r\npipx install poethepoet\r\n```\r\n\r\nNow the application can be started by running\r\n```commandline\r\npoe run\r\n```\r\n\r\nTo run a specific model, click the link provided below the model in this README, download the source code of that specific commit, and proceed as described above.\r\n\r\nFor some models, a pre-trained model is provided in the [models](models) directory. 
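The .pkl checkpoints are Python pickles. As a rough, standalone sketch of what loading one involves (illustrative only: the function name and path below are made up, and the repository's actual loading logic lives in [main.py](src/main.py)):

```python
import pickle


def load_pretrained(path: str):
    """Illustrative sketch: load a pickled model checkpoint from disk.

    This is a hypothetical helper, not part of this repository's API;
    the real code reads the file named by the `model_to_load` variable.
    """
    # Pickle files must be opened in binary mode.
    with open(path, "rb") as f:
        return pickle.load(f)
```

Keep in mind that unpickling executes code from the file, so only load checkpoints from a source you trust.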
The pre-trained model is either a zipped model, which has to be extracted first, or a .pkl file, whose name can be assigned to the `model_to_load` variable at line 61 in [main.py](src/main.py).\r\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fraoulluque%2Fimagerecognitionfromscratch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fraoulluque%2Fimagerecognitionfromscratch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fraoulluque%2Fimagerecognitionfromscratch/lists"}