https://github.com/bluebie/nzsl-training-data-generator
Tool for reading NZSL-Dictionary dataset, and using PoseNet ML model to extract information and images from video of NZSL sign performances, to generate datasets to train CNNs to recognise traits of visual signed languages
linguistics ml nzsl posenet sign-language
- Host: GitHub
- URL: https://github.com/bluebie/nzsl-training-data-generator
- Owner: Bluebie
- Created: 2018-11-22T06:49:12.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2018-11-29T15:11:27.000Z (almost 7 years ago)
- Last Synced: 2025-04-30T10:14:11.114Z (6 months ago)
- Topics: linguistics, ml, nzsl, posenet, sign-language
- Language: JavaScript
- Size: 46.9 KB
- Stars: 5
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
README
NZSL Training Data Generator
============================

Reads data from @Bluebie/NZSL-Dictionary dataset, and uses ffmpeg to extract video frames, posenet & tensorflow.js to classify poses, and some manual filtering to assess quality. At the output stage, images of features like hands and faces can be extracted from the videos and labeled with metadata from the NZSL-Dictionary dataset like location and handshape.
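
To give a feel for the filtering and cropping steps described above, here is a minimal sketch, not the repository's actual code. It assumes keypoints in the shape PoseNet returns (`{ part, score, position: { x, y } }`); the function name `handCropBox`, the `minScore` threshold, and the idea of using forearm length as a stand-in for hand size are all assumptions made for illustration.

```javascript
// Hypothetical helper, not taken from this repository.
// Keypoints follow PoseNet's output shape: { part, score, position: { x, y } }.
function handCropBox(keypoints, side, minScore = 0.5, scale = 1.0) {
  const find = (part) => keypoints.find((k) => k.part === part);
  const wrist = find(`${side}Wrist`);
  const elbow = find(`${side}Elbow`);

  // Reject the frame if either joint is missing or poorly scored.
  if (!wrist || !elbow || wrist.score < minScore || elbow.score < minScore) {
    return null;
  }

  // Assumption: forearm length is a rough proxy for the size of the hand.
  const dx = wrist.position.x - elbow.position.x;
  const dy = wrist.position.y - elbow.position.y;
  const size = Math.hypot(dx, dy) * scale;

  // The hand lies beyond the wrist, so shift the centre half a forearm
  // further along the elbow-to-wrist direction.
  const cx = wrist.position.x + dx * 0.5;
  const cy = wrist.position.y + dy * 0.5;

  // Square crop box in image pixel coordinates.
  return { x: cx - size / 2, y: cy - size / 2, width: size, height: size };
}
```

A caller would skip any frame where this returns `null`, and otherwise crop the frame to the returned box before labeling it with the dictionary metadata.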
This big mess is intended to be a tool for generating training datasets to teach a convolutional neural network to recognise BANZSL sign language handshapes from video. I hope it can be useful to analyse the linguistic features of related languages like Auslan and BSL, or to build better models for computers to communicate directly using sign languages.
### Files
* `video-reader.js`: a simple ffmpeg interface, to convert video files into a stack of PNG images in a temporary directory
* `pose-machine.js`: a wrapper around PoseNet, with some additional filtering to improve quality signals and reject bad classifications
* `nzsl.js`: a really simple model to read in data from the NZSL-Dictionary dataset, and conveniently access related videos and image files
* `extractor-bot.js`: the main script, which queries the NZSL-Dictionary dataset and bulk exports labeled training samples
* `run-extractor-bot.js`: example of extractor bot in use
* `test.js`: a really bad test script that just pokes a few APIs so I can see they're working, desperately in need of replacement with a proper testing system
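
As a rough illustration of what a frame-extraction step like the one in `video-reader.js` might look like, here is a sketch that shells out to ffmpeg from Node. It is not the file's real interface: `buildFfmpegArgs`, `extractFrames`, the `fps` option, and the `frame-%05d.png` naming are hypothetical, and it assumes an `ffmpeg` binary is available on the `PATH`.

```javascript
const { execFile } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build the ffmpeg argument list for dumping frames as numbered PNG files.
// `fps` is a hypothetical option; leave it out to keep every frame.
function buildFfmpegArgs(videoPath, outDir, fps) {
  const args = ['-loglevel', 'error', '-i', videoPath];
  if (fps) args.push('-vf', `fps=${fps}`);
  args.push(path.join(outDir, 'frame-%05d.png'));
  return args;
}

// Extract frames into a fresh temporary directory and resolve with the
// PNG paths in frame order.
function extractFrames(videoPath, fps) {
  const outDir = fs.mkdtempSync(path.join(os.tmpdir(), 'nzsl-frames-'));
  return new Promise((resolve, reject) => {
    execFile('ffmpeg', buildFfmpegArgs(videoPath, outDir, fps), (err) => {
      if (err) return reject(err);
      const frames = fs.readdirSync(outDir)
        .sort() // zero-padded names sort into frame order
        .map((name) => path.join(outDir, name));
      resolve(frames);
    });
  });
}
```

Usage would look something like `extractFrames('sign.mp4', 10).then((frames) => console.log(frames.length))`. The caller is responsible for deleting the temporary directory once the frames have been processed.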