{"id":16958485,"url":"https://github.com/dohliam/audio-cloze-tests","last_synced_at":"2025-10-08T23:20:49.367Z","repository":{"id":88991751,"uuid":"136404494","full_name":"dohliam/audio-cloze-tests","owner":"dohliam","description":"Audio cloze test generator using open data","archived":false,"fork":false,"pushed_at":"2018-06-07T01:22:16.000Z","size":33,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-09-13T01:30:30.461Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://storybookscanada.ca/cloze/","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dohliam.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-06-07T01:22:09.000Z","updated_at":"2025-02-13T04:12:26.000Z","dependencies_parsed_at":"2023-06-13T11:15:28.301Z","dependency_job_id":null,"html_url":"https://github.com/dohliam/audio-cloze-tests","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/dohliam/audio-cloze-tests","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dohliam%2Faudio-cloze-tests","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dohliam%2Faudio-cloze-tests/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dohliam%2Faudio-cloze-tests/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dohliam%2Faudio-cloze-tests/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/owners/dohliam","download_url":"https://codeload.github.com/dohliam/audio-cloze-tests/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dohliam%2Faudio-cloze-tests/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279000735,"owners_count":26082862,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-13T22:42:43.064Z","updated_at":"2025-10-08T23:20:49.332Z","avatar_url":"https://github.com/dohliam.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Audio cloze test generator using open data\n\nThis repository contains code to automatically generate cloze tests with audio support based on data from the [Global Storybooks](globalstorybooks.net) project. They are designed as an aid for self-study rather than diagnostic testing.\n\nThe generated pages update to test new randomly selected words on refresh, so no test is ever the same. They are separated by level so users can select the most appropriate test for their language ability.\n\nIn order to generate the tests you will need to download a copy of one of the Global Storybooks source repos. 
A good example to start with, containing a variety of different languages, is the [sbc-source](https://github.com/global-asp/sbc-source) repo. You can also (optionally) download a local copy of the audio files rather than streaming them online.\n\n## Demo\n\nYou can use the cloze tests generated by this script live online on the [Storybooks Canada](http://storybookscanada.ca/cloze) website.\n\n## Features\n\nIt is important to note that this is meant as a self-study aid, rather than a classroom assessment tool.\n\nThis is a proof of concept that may have applications for other purposes -- for example, it should be quite simple to use this framework to create all types of test content -- including hand-selected clozes drawing on similar parts of speech.\n\nThe main appeal of the audio cloze test is that it \"gamifies\" listening practice. Some users may find it to be an enjoyable way of checking to see if they have heard the text correctly.\n\nOne thing to note is that although it would in theory be simple to \"guess\" the answer from the context (or just try each option until hitting on the right one), since this is very low stakes there is really no point in doing so. 
You may find that you have to listen much more closely than you would otherwise to see if you can pick out the correct word from the audio.\n\nDesign principles of this experiment:\n\n* Clozes or \"distractors\" are drawn entirely from the [Global Storybooks](globalstorybooks.net) corpus of 40 stories in each given language\n  * In principle, the clozes could be restricted to stories of the same level, something which might be added in the next update.\n* The test updates with new, randomly-selected clozes each time the page is loaded, meaning it is possible to test yourself on the same story multiple times for listening practise.\n* The tests are generated entirely automatically, meaning a practically unlimited number of tests can be created at random from the same corpus of 40 stories.\n* All the tests are audio-linked at the sentence/paragraph level, drawing on our database of recorded audio in multiple languages.\n* There is instant feedback on whether the answer was right or wrong, and a running tally to keep track of your progress\n* Rather than trying to classify words by parts of speech (possible in theory for English, but much more complicated if working with a dozen other languages), the clozes for each word have been selected at random from all the words beginning with the same letter in the corpus\n  * This is a rough but surprisingly adequate proxy for \"similarity\" for this purpose\n  * If you try this with a language you are learning or are not very familiar with, you may find it requires a fair bit of attention to distinguish words beginning with the same sounds from a list!\n* Stopwords (\"the\", \"was\", \"their\" etc) have been left in deliberately to allow readers to practise discriminating basic vocabulary\n  * Although it would be quite simple to remove these using e.g., [this project](https://github.com/6/stopwords-json), for the purposes of listening practise it has been felt that sometimes common words are just as important as \"content\" words in a new 
language -- this could be changed in a future update though (particularly at higher levels where they might be distracting)\n* At the moment, test generation works for all languages that use spaces to separate words (for this reason Chinese and Japanese will require either automated or manual semantic parsing and are not yet included)\n\n### Language-specific features\n\n* Proper names ignored:\n  * Applies to English and most other Latin-based orthographies\n* Case support:\n  * The German corpus preserves letter case distinctions\n* Right-to-left language support:\n  * Special templates are automatically applied to accommodate Arabic, Persian, and Urdu\n  * Includes changes to text direction and language-specific fonts\n\n## Installation\n\nThis script requires a working Ruby installation (ideally 1.9.3 or above), as well as the unicode_utils gem and a copy of a Global Storybooks source repo with markdown files.\n\nTo install unicode_utils:\n\n    gem install unicode_utils\n\nTo get a copy of a markdown source folder you can download the [sbc-source](https://github.com/global-asp/sbc-source) repo and place the entire folder in the root directory of this project.\n\nIf you would like to download the audio rather than streaming online, you can obtain it in all available languages from [this repo](https://github.com/global-asp/gsn-audio). 
Once you have the audio stored locally you will need to update the URLs in `cloze.js` to point to your local filesystem or webhost.\n\n## Usage\n\nIf you have all of the [requirements](#installation) listed above, you should be able to enter the project folder and run the following command:\n\n    ./parse_cloze [LANG]\n\n(Where `[LANG]` is an ISO language code -- for example, `en` for English, or `es` for Spanish.)\n\nThis will create a subfolder in the project root directory called `json_output`, in which there should be a subfolder named after the language (e.g., `en` or `es`), as well as two folders containing the Javascript and CSS files needed to run the site. Within the language subfolder there should be 40 story folders arranged by index number, each of which contains an individual audio cloze test. These tests should work in any Javascript-enabled browser without further configuration.\n\n## Acknowledgements\n\n* CSS: [Spectre CSS](https://github.com/picturepan2/spectre)\n* Audio: [GSN Audio](https://github.com/global-asp/gsn-audio)\n* Original story text: [African Storybook](http://africanstorybook.org)\n* Translations: [Global ASP](https://github.com/global-asp/global-asp)\n\n## License\n\nMIT.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdohliam%2Faudio-cloze-tests","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdohliam%2Faudio-cloze-tests","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdohliam%2Faudio-cloze-tests/lists"}