{"id":13744211,"url":"https://github.com/MycroftAI/mimic-recording-studio","last_synced_at":"2025-05-09T02:33:03.953Z","repository":{"id":38673846,"uuid":"149193848","full_name":"MycroftAI/mimic-recording-studio","owner":"MycroftAI","description":"Mimic Recording Studio is a Docker-based application you can install to record voice samples, which can then be trained into a TTS voice with Mimic2","archived":false,"fork":false,"pushed_at":"2023-04-28T16:11:13.000Z","size":6233,"stargazers_count":508,"open_issues_count":41,"forks_count":119,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-05-02T17:00:18.701Z","etag":null,"topics":["docker","hacktoberfest","microphone","mimic","mycroft","mycroftai","recording-studio","tacotron","tts","tts-engine","voice"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MycroftAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-09-17T22:05:30.000Z","updated_at":"2025-05-01T02:32:46.000Z","dependencies_parsed_at":"2023-02-02T20:15:51.705Z","dependency_job_id":"a76e60b6-749a-4b25-936b-0cec78104e1e","html_url":"https://github.com/MycroftAI/mimic-recording-studio","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MycroftAI%2Fmimic-recording-studio","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MycroftAI%2Fmimic-recording-studio/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MycroftAI%2Fmimic-recording-s
tudio/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MycroftAI%2Fmimic-recording-studio/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MycroftAI","download_url":"https://codeload.github.com/MycroftAI/mimic-recording-studio/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253177882,"owners_count":21866414,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","hacktoberfest","microphone","mimic","mycroft","mycroftai","recording-studio","tacotron","tts","tts-engine","voice"],"created_at":"2024-08-03T05:01:05.211Z","updated_at":"2025-05-09T02:33:02.134Z","avatar_url":"https://github.com/MycroftAI.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"# Mimic Recording Studio\n\n![demo](./img/demo.gif)\n\n- [Mimic Recording Studio](#mimic-recording-studio)\n  * [Software Quick Start](#software-quick-start)\n    + [Windows self-hosted Quick Start](#windows-self-hosted-quick-start)\n    + [Linux/Mac self-hosted Quick Start](#linuxmac-self-hosted-quick-start)\n      - [Install Dependencies](#install-dependencies)\n      - [Build and Run](#build-and-run)\n    + [Manual Install, Build and Start](#manual-install--build-and-start)\n      - [Backend](#backend)\n        * [Dependencies](#dependencies)\n        * [Build \u0026 Run](#build---run)\n      - [Frontend](#frontend)\n        * [Dependencies](#dependencies-1)\n        * [Build \u0026 Run](#build---run-1)\n    + [Coming soon!](#coming-soon-)\n  * 
[Data](#data)\n    + [Audio Recordings](#audio-recordings)\n      - [WAV files](#wav-files)\n      - [{uuid}-metadata.txt](#-uuid--metadatatxt)\n    + [Corpus](#corpus)\n      - [Corpora in other languages](#corpora-in-other-languages)\n  * [Technologies](#technologies)\n    + [Frontend](#frontend-1)\n      - [Functions](#functions)\n    + [Backend](#backend-1)\n      - [Functions](#functions-1)\n    + [Docker](#docker)\n- [Recording Tips](#recording-tips)\n- [Advanced](#advanced)\n  * [Query database structure](#query-database-structure)\n    * [Table \"audiomodel\"](#table-\"audiomodel\")\n    * [Table \"usermodel\"](#table-\"usermodel\")\n  * [Modify recorder uuid](#modify-recorder-uuid)\n- [Providing your recording to Mycroft for training](#providing-your-recording-to-mycroft-for-training)\n- [Contributions](#contributions)\n- [Where to get support and assistance](#where-to-get-support-and-assistance)\n\nThe [Mycroft](https://mycroft.ai) open source Mimic technologies are\nText-to-Speech engines which take a piece of written text and convert it into\nspoken audio. The latest generation of this technology,\n[Mimic 2](https://github.com/MycroftAI/mimic2), uses machine learning techniques\nto create a model which can speak a specific language, sounding like the voice\non which it was trained.\n\nThe Mimic Recording Studio simplifies the collection of training data from\nindividuals, each of which can be used to produce a distinct voice for Mimic.\n\n\n\n## Software Quick Start\n\n### Windows self-hosted Quick Start\n\n* `git clone https://github.com/MycroftAI/mimic-recording-studio.git`\n* `cd mimic-recording-studio`\n* `start-windows.bat`\n\n\n### Linux/Mac self-hosted Quick Start\n\n#### Install Dependencies\n* [Docker](https://docs.docker.com/) (community edition is fine)\n* [Docker Compose](https://docs.docker.com/compose/install/)\n\nWhy docker? 
To make this super easy to set up and run cross platforms.\n\n#### Build and Run\n\n* `git clone https://github.com/MycroftAI/mimic-recording-studio.git`\n* `cd mimic-recording-studio`\n* `docker-compose up` to build and run (_Note: You may need to use `sudo docker-compose up` depending on your distribution_)\n\n  Alternatively, you can build and run separately. `docker-compose build` then `docker-compose up`\n* In your browser, go to `http://localhost:3000`\n\n**Note:**\nThe first execution of `docker-compose up` will take a while as this command will also build the docker containers. Subsequent executions of `docker-compose up` should be quicker to boot.\n\n### Manual Install, Build and Start\n\n#### Backend\n\n##### Dependencies\n\n* python 3.5 +\n* [ffmpeg](https://www.ffmpeg.org/)\n\n##### Build \u0026 Run\n\n* `cd backend/`\n* `pip install -r requirements.txt`\n* `python run.py`\n\n#### Frontend\n\n##### Dependencies\n\n* [node \u0026 npm](https://nodejs.org/en/)\n* [create-react-app](https://github.com/facebook/create-react-app)\n* [yarn](https://yarnpkg.com/en/) - optional for faster build, install, and start\n\n##### Build \u0026 Run\n\n* `cd frontend/`\n* `npm install`, alternatively `yarn install`\n* `npm start`, alternatively `yarn start`\n\n### Coming soon!\nOnline, http://mimic.mycroft.ai hosted version requiring zero setup.\n\n\n## Data\n\n### Audio Recordings\n\n#### WAV files\n\nAudio is saved as WAV files to the `backend/audio_file/{uuid}/` directory. The\nbackend automatically trims the beginning and ending silence for all WAV files\nusing [ffmpeg](https://www.ffmpeg.org/).\n\n#### {uuid}-metadata.txt\n\nMetadata is also saved to `backend/audio_file/{uuid}/`. This file maps the WAV\nfile name to the phrase spoken. 
This, along with the WAV files, is what you\nneed to get started on training [Mimic 2](https://github.com/MycroftAI/mimic2).\n\n### Corpus\n\nFor now, an English corpus, `english_corpus.csv`, is available\nin `backend/prompt/`. To use your own corpus, follow these steps.\n\n1. Create a CSV file in the same format as `english_corpus.csv` using tabs\n   (`\\t`) as the delimiter.\n2. Make sure there are no empty lines in the corpus.\n3. Add your corpus to the `backend/prompt` directory.\n4. Change the `CORPUS` environment variable in `docker-compose.yml` to your\n   corpus name.\n   \n#### Corpora in other languages\n\nMimic Recording Studio can also be used to produce voice recordings for TTS voices in languages other than English. If you are building a corpus in another language, we encourage you to choose phrases which: \n\n* occur in natural, everyday speech in the target language\n* have a variety of string lengths\n* cover a wide variety of _phonemes_ (basic sounds)\n\n**IMPORTANT:**\nFor now, you must reset the `sqlite` database to use a new corpus. If you've\nrecorded on another corpus and would like to save that data, you can simply\nrename your `sqlite` db found in `backend/db/` to another name. The backend will\ndetect that `mimicstudio.db` is not there and create a new one for you. You may\nthen continue recording data for your new corpus.\n\n## Technologies\n\n### Frontend\n\nThe web UI is built using JavaScript and [React](https://reactjs.org/), with\n[create-react-app](https://github.com/facebook/create-react-app) as a\nscaffolding tool. 
Refer to [CRA.md](/frontend/CRA.md) to learn more about how to\nuse create-react-app.\n\n#### Functions\n\n* Record and play audio\n* Generate audio visualization\n* Calculate and display metrics\n\n### Backend\n\nThe web service is built using Python, [Flask](http://flask.pocoo.org/) as the\nbackend framework, [gunicorn](https://gunicorn.org/) as an HTTP web server, and\n[sqlite](https://www.sqlite.org/index.html) as the database.\n\n#### Functions\n\n* Process audio\n* Serve corpus and metrics data\n* Record info in database\n* Record data to the file system\n\n### Docker\n\nDocker is used to containerize both applications. By default, the frontend uses\nnetwork port `3000` while the backend uses network port `5000`. You can\nconfigure these in the `docker-compose.yml` file.\n\n_NOTE: If you are running `docker-registry`, this runs by default on port `5000`, so you will need to change which port you use._\n\n# Recording Tips\n\nCreating a voice requires an achievable but significant effort. An individual will need to record 15,000 - 20,000 phrases.  In order to get the best possible Mimic voice, the recordings need to be clean and consistent. To that end, follow these recommendations:\n\n* Record in a quiet environment with noise-dampening material.\n  If your ears can hear outside noise, so can the microphone. For best results,\n  even the sound of air conditioning blowing through a vent should be avoided.\n  Bare walls create subtle echoes and reverberation.  A sound dampening booth\n  is ideal, but you can also create a homemade recording studio using soft\n  materials such as acoustic foam in a closet.  Comforters and mattresses can\n  also be used effectively!\n* Speak at a consistent volume and speed.  
Rushing through the phrases will only\n  result in a lower-quality voice.\n* Use a quality microphone.\n  To obtain consistent results, we recommend a headset microphone so your mouth\n  is always the same distance from the mic.\n* Avoid vocal fatigue.\n  Record a maximum of 4 hours a day, taking a break every half hour.\n* Back up your Mimic-Recording-Studio directory on a regular basis to avoid data loss.\n\n# Advanced\n\n## Query database structure\nMimic-Recording-Studio writes all recordings to a sqlite database file located under /backend/db/. This can be opened with database tools like DBeaver.\n\nThe database includes two tables.\n\n![database_table_overview](./img/database_table_overview.png)\n\n### Table \"audiomodel\"\nAll recordings are persisted in this table with:\n* recording timestamp (created_date)\n* uuid of speaker (matches the filesystem path under /backend/audio_files/id)\n* wav filename in filesystem (audio_id)\n* text of recorded phrase (phrase)\n\nThe database can be used to query your recordings.\n\nHere are some example queries:\n\n```sql\n-- List all recordings\nSELECT * FROM audiomodel;\n\n-- List recordings from January 2020, ordered by phrase\n-- (DATE() drops the time part so that recordings made on January 31 are included)\nSELECT * FROM audiomodel WHERE DATE(created_date) BETWEEN '2020-01-01' AND '2020-01-31' ORDER BY prompt;\n\n-- List the number of recordings per day\nSELECT DATE(created_date), COUNT(*) AS RecordingsPerDay\nFROM audiomodel\nGROUP BY DATE(created_date)\nORDER BY DATE(created_date);\n\n-- Show the average text length of recordings\nSELECT AVG(LENGTH(prompt)) AS avgLength FROM audiomodel;\n```\n\nThere are many ways that querying the sqlite database might be useful. 
For example, looking for recordings in a specific time range might help to remove recordings made in a bad environment.\n\n### Table \"usermodel\"\nMimic-Recording-Studio can be used by more than one speaker using the same sqlite database file.\n\nThis table provides the following information per speaker:\n* Unique identifier of speaker (uuid)\n* Name of speaker (user_name)\n* Newest recorded line number of corpus (prompt_num)\n* Total recording time (total_time_spoken)\n* How many characters have been recorded (len_char_spoken)\n\nThese values are used to calculate metrics. For example, the speaking pace may show if the recorded phrase is too fast or slow compared to previous recordings.\n\nQuery the table \"usermodel\" to list speakers and their uuids; the remaining columns provide recording statistics.\n\n```sql\nSELECT user_name AS [name], uuid FROM usermodel;\n```\n\n![database_table_usermodel](./img/database_table_usermodel.png)\n\n\n## Modify recorder uuid\nThe browser used to record your phrases persists the user's `uuid` and `name` in its localStorage to keep them in sync with sqlite and the filesystem.\n\nIf a problem occurs and your browser loses or changes the uuid mapping for Mimic-Recording-Studio, you may have difficulty continuing a previous recording session.\nIn that case, update the following two attributes in the localStorage of your browser:\n\n* uuid ([Query table \"usermodel\"](#table-\"usermodel\") or check filesystem path under /backend/audio_files/)\n* name ([Query table \"usermodel\"](#table-\"usermodel\"))\n\n\nOpen Mimic-Recording-Studio in your browser, open the web developer tools, go to localStorage, and set name and uuid to the original values.\n\n![browser_local_storage](./img/browser_localStorage.png)\n\nAfter that, you should be able to continue your previous recording session without further problems.\n\n# Providing your recording to Mycroft for training\n\nWe welcome your voice donations to Mycroft for use in Text-to-Speech applications. 
If you would like to provide your voice recordings, you _must_ license them to us under the Creative Commons [CC0 Public Domain license](https://creativecommons.org/share-your-work/public-domain/cc0/) so that we can use them in TTS voices, which are derivative works. If you're ready to donate your voice recordings, email us at hello@mycroft.ai. \n\n# Contributions\n\nPRs are gladly accepted!\n\n# Where to get support and assistance\n\nYou can get help and support with Mimic Recording Studio at:\n\n* The [Mycroft Forum](https://community.mycroft.ai)\n* [Mycroft Chat](https://chat.mycroft.ai)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FMycroftAI%2Fmimic-recording-studio","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FMycroftAI%2Fmimic-recording-studio","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FMycroftAI%2Fmimic-recording-studio/lists"}