https://github.com/temporal/vcs

A simple, off-line voice recognition and control system for music control.
speech-api voice-control voice-recognition winamp

Host: GitHub
URL: https://github.com/temporal/vcs
Owner: TeMPOraL
Created: 2017-06-09T23:38:48.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2017-06-09T23:39:09.000Z (over 8 years ago)
Last Synced: 2025-01-19T14:55:06.603Z (12 months ago)
Topics: speech-api, voice-control, voice-recognition, winamp
Language: C++
Size: 458 KB
Stars: 1
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.org

README

#+title: Voice Control System
#+startup: hidestars

/This software is a piece of ancient history; I worked on it circa 2007./

A Voice Control System for controlling WinAmp under Microsoft Windows using voice commands. It works *completely off-line*, relying on user training instead of cloud computing.

I wrote this software to be able to switch music without using the computer directly, which helped me avoid distractions during a period of heavy studying.

The software operates on a simple tree grammar:

- =Computer=
  - =Music [control]=
    - =Playback=
      - =Stop=
      - =Pause=
      - =Resume=
      - =Loop=
      - =Repeat [Playlist]=
      - =Shuffle=
      - =Next=
      - =Previous=
    - =Volume=
      - =Mute=
      - =Full=
      - =Half=
      - =One quarter=
      - =Three quarters=
      - =Louder=
      - =Quieter=
    - =Playlist=
      - =Alpha=
      - =Beta=
      - =Gamma=
      - =Delta=
    - =Track info=
    - =Preserve=
    - =Release=
  - =Exit=

Note the =Preserve= / =Release= pair of commands. The first one locks the system to the =Music control= subtree, so that I didn't have to repeat that part of the tree to issue more music commands. The second one releases the lock, returning the default state of the program to the top level.
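The tree walk and the =Preserve= / =Release= lock can be sketched in a few lines of C++. This is a minimal illustration, not the code from the repository: it assumes each phrase arrives as a list of already-recognized words, it leaves out the optional bracketed words, and the names =Node=, =group=, =makeGrammar=, =Recognizer= and =handle= are all hypothetical.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// One node of the command tree. A node without children is a complete command.
struct Node {
    std::string word;
    std::vector<Node> children;

    // Return the child matching a spoken word, or nullptr if there is none.
    const Node* child(const std::string& w) const {
        for (const Node& c : children)
            if (c.word == w) return &c;
        return nullptr;
    }
};

// Hypothetical helper: build a node whose children are all leaves.
Node group(const std::string& word, std::initializer_list<const char*> leaves) {
    Node n{word, {}};
    for (const char* leaf : leaves) n.children.push_back(Node{leaf, {}});
    return n;
}

// The grammar from the list above, without the optional bracketed words.
Node makeGrammar() {
    Node music = group("Music", {"Track info", "Preserve", "Release"});
    music.children.push_back(group("Playback",
        {"Stop", "Pause", "Resume", "Loop", "Repeat", "Shuffle", "Next", "Previous"}));
    music.children.push_back(group("Volume",
        {"Mute", "Full", "Half", "One quarter", "Three quarters", "Louder", "Quieter"}));
    music.children.push_back(group("Playlist", {"Alpha", "Beta", "Gamma", "Delta"}));

    Node computer{"Computer", {}};
    computer.children.push_back(music);
    computer.children.push_back(Node{"Exit", {}});

    Node root{"", {}};  // assumed virtual top level above "Computer"
    root.children.push_back(computer);
    return root;
}

// Walks phrases through the tree, starting from the current default state.
class Recognizer {
public:
    // The grammar must outlive the recognizer, so temporaries are rejected.
    explicit Recognizer(const Node& root) : root_(&root), home_(&root) {}
    Recognizer(Node&&) = delete;

    // Returns the leaf command word, or "" if the phrase is not a complete
    // path from the current default state.
    std::string handle(const std::vector<std::string>& phrase) {
        const Node* parent = home_;
        const Node* at = home_;
        for (const std::string& w : phrase) {
            parent = at;
            at = at->child(w);
            if (at == nullptr) return "";
        }
        if (!at->children.empty()) return "";  // phrase stopped at an inner node
        if (at->word == "Preserve") home_ = parent;      // lock to the music subtree
        else if (at->word == "Release") home_ = root_;   // back to the top level
        return at->word;
    }

private:
    const Node* root_;
    const Node* home_;  // node that the next phrase starts from
};
```

In this sketch, =handle({"Computer", "Music", "Preserve"})= moves the default state to the =Music= node, so a later =handle({"Volume", "Louder"})= is accepted without the prefix. After =handle({"Release"})= the short form is rejected again and the full path from =Computer= is required.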

The system was trained using the built-in Speech API training in the Control Panel, with the set of all words that appear in the grammar.

After a few training iterations (see [[file:testy.pdf][testy.pdf]] for my training sheet) under various conditions, such as different places in the room and different music playback volumes, the system became pretty reliable, and I could use it from anywhere in the room even with loud music or radio playing.

Overall, this system shows that useful voice recognition does /not/ require computation in the cloud: it was entirely feasible using 2007 tech and a mid-range PC from that era.
