Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/madlittlemods/zig-ocr-mnist-k-nearest-neighbors
Basic OCR example written in Zig using K-nearest neighbor against the MNIST dataset
k-nearest-neighbors knn mnist mnist-handwriting-recognition ocr zig
- Host: GitHub
- URL: https://github.com/madlittlemods/zig-ocr-mnist-k-nearest-neighbors
- Owner: MadLittleMods
- Created: 2023-09-30T07:09:30.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-01T19:24:01.000Z (over 1 year ago)
- Last Synced: 2024-11-01T13:05:41.330Z (3 months ago)
- Topics: k-nearest-neighbors, knn, mnist, mnist-handwriting-recognition, ocr, zig
- Language: Zig
- Homepage:
- Size: 26.4 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Basic OCR example using K-nearest neighbors against the MNIST dataset
A simple, from-scratch OCR project that recognizes handwritten digits in images from the MNIST
dataset, which is just a bunch of 28x28 images of white number digits centered on a black
background.

We're using K-nearest neighbors to classify the images, which is the simplest way we can
compare our test sample against our training samples. Basically, it takes our test sample
image, compares it to all the training samples, and finds the `k` closest matches (yes, it
is inefficient and slow with a lot of training samples).

Basically a Zig rewrite following this tutorial by Vlad Harbuz from @clumsycomputer:
https://www.youtube.com/watch?v=vzabeKdW9tE

Just getting my feet wet in OCR.
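
The classification step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `squaredDistance` and `classify` are made up for this sketch, and only the `LabeledDistance` name and its two fields are borrowed from the debug output shown under "Building and running". It also assumes squared Euclidean distance over raw pixel bytes, which ranks neighbors the same way as the plain Euclidean distance; the project may use a different metric.

```zig
const std = @import("std");

// Name and fields mirror the debug output; the field types are an assumption.
const LabeledDistance = struct {
    label: u8,
    distance: u64,
};

/// Squared Euclidean distance between two equally sized grayscale images.
/// (Hypothetical helper for this sketch.)
fn squaredDistance(a: []const u8, b: []const u8) u64 {
    std.debug.assert(a.len == b.len);
    var sum: u64 = 0;
    for (a, b) |pixel_a, pixel_b| {
        // Absolute difference computed without leaving unsigned arithmetic.
        const diff: u64 = if (pixel_a > pixel_b) pixel_a - pixel_b else pixel_b - pixel_a;
        sum += diff * diff;
    }
    return sum;
}

/// Predicts a digit (0-9) by majority vote among the `k` nearest training images.
/// Vote ties go to the lowest digit. (Hypothetical helper for this sketch.)
fn classify(
    comptime k: usize,
    train_images: []const []const u8,
    train_labels: []const u8,
    test_image: []const u8,
) u8 {
    // The k smallest distances seen so far, kept sorted in ascending order.
    var nearest: [k]LabeledDistance = undefined;
    var count: usize = 0;

    for (train_images, train_labels) |train_image, label| {
        const candidate = LabeledDistance{
            .label = label,
            .distance = squaredDistance(train_image, test_image),
        };
        // Skip candidates that are no closer than the current k-th nearest.
        if (count == k and candidate.distance >= nearest[k - 1].distance) continue;

        // Insertion step: shift larger entries right, then place the candidate.
        var i: usize = if (count < k) count else k - 1;
        while (i > 0 and nearest[i - 1].distance > candidate.distance) : (i -= 1) {
            nearest[i] = nearest[i - 1];
        }
        nearest[i] = candidate;
        if (count < k) count += 1;
    }

    // Count one vote per neighbor for its digit label.
    var votes = [_]usize{0} ** 10;
    for (nearest[0..count]) |neighbor| votes[neighbor.label] += 1;

    var best_label: u8 = 0;
    for (votes, 0..) |vote_count, label| {
        if (vote_count > votes[best_label]) best_label = @intCast(label);
    }
    return best_label;
}
```

A call such as `classify(5, train_images, train_labels, test_image)` scans every training image once per test image, which is where the slowness mentioned above comes from: the work grows with the number of training samples times the number of pixels.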
## Setup
Download and extract the MNIST dataset from http://yann.lecun.com/exdb/mnist/ to a
directory called `data/` in the root of this project. Here is a copy-paste command
you can run:

```sh
# Make a data/ directory
mkdir data/ &&
cd data/ &&
# Download the MNIST dataset
curl -O http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz &&
curl -O http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz &&
curl -O http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz &&
curl -O http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz &&
# Unzip the files
gunzip *.gz
```

## Building and running
Tested with Zig 0.11.0
```sh
$ zig build run
zig build run
debug: training labels header mnist_data_utils.MnistLabelFileHeader{ .magic_number = 2049, .number_of_labels = 60000 }
debug: training images header mnist_data_utils.MnistImageFileHeader{ .magic_number = 2051, .number_of_images = 60000, .number_of_rows = 28, .number_of_columns = 28 }
debug: testing labels header mnist_data_utils.MnistLabelFileHeader{ .magic_number = 2049, .number_of_labels = 10000 }
debug: testing images header mnist_data_utils.MnistImageFileHeader{ .magic_number = 2051, .number_of_images = 10000, .number_of_rows = 28, .number_of_columns = 28 }
debug: prediction 7
debug: nearest neighbors { k_nearest_neighbors.LabeledDistance{ .label = 7, .distance = 1034 }, k_nearest_neighbors.LabeledDistance{ .label = 7, .distance = 1047 }, k_nearest_neighbors.LabeledDistance{ .label = 7, .distance = 1095 }, k_nearest_neighbors.LabeledDistance{ .label = 7, .distance = 1097 }, k_nearest_neighbors.LabeledDistance{ .label = 7, .distance = 1121 } }
┌──────────┐
│ Label: 7 │
┌────────────────────────────────────────────────────────┐
│ │
│ │
│ │
│ │
│ │
│ │
│ │
│ ▒▒▓▓▓▓▓▓░░░░ │
│ ████████████████████████████▓▓░░ │
│ ▒▒▒▒▒▒▒▒▓▓████████████████████▓▓ │
│ ░░▒▒░░▒▒▒▒▒▒░░░░████▒▒ │
│ ▒▒████░░ │
│ ░░████▒▒ │
│ ▓▓████░░ │
│ ░░████░░ │
│ ▓▓██▓▓░░ │
│ ░░████░░ │
│ ▒▒██▓▓ │
│ ▒▒████░░ │
│ ░░████▓▓ │
│ ░░██████░░ │
│ ░░████▒▒ │
│ ░░████▒▒░░ │
│ ▓▓████░░ │
│ ░░██████░░ │
│ ▒▒██████░░ │
│ ▒▒████░░ │
│ │
└────────────────────────────────────────────────────────┘
...
```

## Results/Accuracy
With `k=5`, running with all 60k training images against the 10k test images, we get
an accuracy of 96.72% (9672/10000).

With `k=10`, running with all 60k training images against the 10k test images, we get
an accuracy of 95.09% (9509/10000).

## Dev notes
See the [*developer notes*](./dev-notes.md) for more information.
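
For readers curious about the header values logged in the run output above: the MNIST files use the IDX layout, where an image file starts with four big-endian 32-bit integers (magic number 2051, image count, rows, columns) followed by the raw pixel bytes. The sketch below shows one way to read that header. The struct name and field names are taken from the debug output, but `readU32BigEndian`, `parseImageHeader`, and the error names are hypothetical and not taken from the project.

```zig
const std = @import("std");

// Name and fields follow the debug output; the parsing code is an assumption.
const MnistImageFileHeader = struct {
    magic_number: u32,
    number_of_images: u32,
    number_of_rows: u32,
    number_of_columns: u32,
};

/// Reads a big-endian u32 from the first four bytes of `bytes`.
/// Done with shifts so the sketch does not depend on a particular std helper.
fn readU32BigEndian(bytes: []const u8) u32 {
    std.debug.assert(bytes.len >= 4);
    return (@as(u32, bytes[0]) << 24) |
        (@as(u32, bytes[1]) << 16) |
        (@as(u32, bytes[2]) << 8) |
        @as(u32, bytes[3]);
}

/// Parses the 16-byte header at the start of an IDX image file.
fn parseImageHeader(bytes: []const u8) !MnistImageFileHeader {
    if (bytes.len < 16) return error.TruncatedHeader;
    const header = MnistImageFileHeader{
        .magic_number = readU32BigEndian(bytes[0..4]),
        .number_of_images = readU32BigEndian(bytes[4..8]),
        .number_of_rows = readU32BigEndian(bytes[8..12]),
        .number_of_columns = readU32BigEndian(bytes[12..16]),
    };
    // 2051 (0x00000803) identifies an IDX file of unsigned bytes with 3 dimensions.
    if (header.magic_number != 2051) return error.UnexpectedMagicNumber;
    return header;
}
```

Label files follow the same idea with a shorter header: magic number 2049 and a label count, then one byte per label.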