{"id":26731766,"url":"https://github.com/stackedcache/lingvobaza","last_synced_at":"2025-03-28T00:37:59.244Z","repository":{"id":283987659,"uuid":"953456679","full_name":"stackedcache/lingvobaza","owner":"stackedcache","description":"Using python to generate sound bytes for Russian language study and practice. ","archived":false,"fork":false,"pushed_at":"2025-03-23T13:39:34.000Z","size":0,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-23T14:32:03.477Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stackedcache.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-23T12:17:08.000Z","updated_at":"2025-03-23T13:39:37.000Z","dependencies_parsed_at":"2025-03-23T14:42:58.885Z","dependency_job_id":null,"html_url":"https://github.com/stackedcache/lingvobaza","commit_stats":null,"previous_names":["stackedcache/lingvobaza"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stackedcache%2Flingvobaza","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stackedcache%2Flingvobaza/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stackedcache%2Flingvobaza/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stackedcache%2Flingvobaza/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stackedca
che","download_url":"https://codeload.github.com/stackedcache/lingvobaza/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245949550,"owners_count":20698916,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-28T00:37:56.821Z","updated_at":"2025-03-28T00:37:59.208Z","avatar_url":"https://github.com/stackedcache.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LINGVOBAZA - The Language Base\n\nLingvobaza is a system I created to generate audio files using Google's text-to-speech Python library.\n\nThe `/scripts/generate-audio-from-csv.py` script reads the CSV file at `/data/phrases.csv` and generates the corresponding audio files.\n\nThe CSV has the following layout: \n\n| id | category | english_phrase | russian_phrase | explanation_ru | explanation_en | audio_generated |\n|----|----------|----------------|----------------|----------------|----------------|-----------------|\n| 1 | directions | You can find him under the tree. | Его можно найти под деревом. | Под... Под… Под | This is the word for under (this is a preposition) | yes/no |\n\n- The script reads the CSV and checks whether the `audio_generated` field of each row is marked yes or no. \n    - Rows marked yes are skipped and not regenerated.\n    - This allows phrases to be added over time. \n\n- For each phrase whose audio has not yet been generated, a subdirectory named after the category is created, and the first four words of \nthe English phrase are used as the filename. 
\n\n- Each resulting audio file: \n    - Reads the English phrase\n    - Reads the Russian phrase\n    - Reads a slower version of the Russian phrase\n    - Reads the Russian explanation\n    - Reads the English explanation\n\n## USING THE SYSTEM \n\n- Clone the repository or set up the file structure shown below, populate your CSV as described above, and run the `generate-audio-from-csv.py` script.\n- Then use the audio player of your choice to listen to the files. \n    - I use ffplay on Debian to play files from the CLI.\n\n## EXAMPLE OUTPUT\n\n```bash\n$ python3 ./generate-audio-from-csv.py \n5 Files to generate!\n[+] Generating: 1_you_can_find_him.mp3\n[+] Generating: 2_first_he_left_then.mp3\n[+] Generating: 3_he_was_angry_not.mp3\n[+] Generating: 4_although_he_was_tired.mp3\n[+] Generating: 5_he_ended_up_in.mp3\nDone generating new audio files\n```\n\n### EXAMPLE STRUCTURE \n\n```bash\n../\n├── audio\n│   └── phrases\n│       ├── connectors\n│       │   ├── 2_first_he_left_then.mp3\n│       │   ├── 3_he_was_angry_not.mp3\n│       │   └── 4_although_he_was_tired.mp3\n│       ├── directions\n│       │   └── 1_you_can_find_him.mp3\n│       └── verbs\n│           └── 5_he_ended_up_in.mp3\n├── data\n│   └── phrases.csv\n├── README.md\n└── scripts\n    └── generate-audio-from-csv.py\n```\n\n## LIVE AUDIO PLAYER! 
(GitHub Pages Frontend)\n\nYou can use the LingvoBaza system directly from your desktop or mobile browser:\n[Live Audio Flashcard Player](https://stackedcache.github.io/lingvobaza/)\n\nThis lightweight front end lets you:\n\n- Play all generated audio files\n- View example sentences and explanations\n- Use it on mobile (Bootstrap-based, dark mode)\n\nMore features coming soon (maybe ;P)\n\n### TECHNICAL EXPLANATION \n\n- The Python generation script now creates a JSON file based on the spreadsheet content.\n- The repo has a `gh-pages-frontend` branch, which serves as the source for GitHub Pages.\n- The `script.js` file parses the JSON file to populate the HTML page.\n- HTML5 audio players load the audio from the audio folders of the repo.\n\n## FUTURE IMPROVEMENTS \n\n- Possibly build a front-end audio player to listen from the web on any device.\n- Possibly build a `play_all` script to loop through all files in the audio directories.\n- Possibly add a `shuffle` feature to either the web front end or the playthrough scripts.\n\n\n## Comments and suggestions\n\n- Please reach out on GitHub or [Substack](https://stackedcache.substack.com/) with any ideas or improvements.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstackedcache%2Flingvobaza","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstackedcache%2Flingvobaza","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstackedcache%2Flingvobaza/lists"}