{"id":29026111,"url":"https://github.com/mtg/freesound-datasets","last_synced_at":"2026-03-17T02:38:30.588Z","repository":{"id":19036845,"uuid":"84932543","full_name":"MTG/freesound-datasets","owner":"MTG","description":"A platform for the collaborative creation of open audio collections labeled by humans and based on Freesound content. ","archived":false,"fork":false,"pushed_at":"2023-10-06T12:17:34.000Z","size":23909,"stargazers_count":133,"open_issues_count":24,"forks_count":12,"subscribers_count":23,"default_branch":"master","last_synced_at":"2024-04-15T00:14:59.860Z","etag":null,"topics":["crowdsourcing","dataset","freesound"],"latest_commit_sha":null,"homepage":"https://annotator.freesound.org/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MTG.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-03-14T09:50:37.000Z","updated_at":"2024-01-04T16:12:18.000Z","dependencies_parsed_at":"2022-08-24T01:11:04.976Z","dependency_job_id":null,"html_url":"https://github.com/MTG/freesound-datasets","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/MTG/freesound-datasets","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MTG%2Ffreesound-datasets","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MTG%2Ffreesound-datasets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MTG%2Ffreesound-datasets/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MTG%2Ffreesound-datasets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/owners/MTG","download_url":"https://codeload.github.com/MTG/freesound-datasets/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MTG%2Ffreesound-datasets/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262003990,"owners_count":23243358,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crowdsourcing","dataset","freesound"],"created_at":"2025-06-26T05:08:27.815Z","updated_at":"2026-03-17T02:38:30.583Z","avatar_url":"https://github.com/MTG.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Freesound Annotator\nThe platform changed its name from *Freesound Datasets* to *Freesound Annotator*.\n\n[Freesound Annotator](https://annotator.freesound.org/) is a platform for the collaborative creation of open audio collections labeled by humans and based on Freesound content. Freesound Annotator offers the following functionalities:\n- **explore** the contents of datasets\n- **contribute** to the creation of the datasets by providing annotations\n- it will allow users to **download** different _timestamped_ releases of the datasets\n- it also promotes **discussions** around both platform and datasets\n\nThis repository serves the following main purposes:\n- **development and maintenance** of the Freesound Annotator\n- allow people to see the ongoing progress in a **transparent** manner\n- concentrate **discussion** from the community\n\nWe would like the community to get involved and to **share comments and suggestions** with us and other users. 
Feel free to take a look at the issues and join ongoing discussions, or create a new issue. We encourage discussion about several aspects of the datasets and the platform, including but not limited to: faulty audio samples, wrong annotations, annotation tasks protocol, etc. You can check the [Discussion page](https://annotator.freesound.org/fsd/discussion/) on the Freesound Annotator for some more ideas for discussion.\n\nThe first dataset created through the Freesound Annotator is [FSD](https://annotator.freesound.org/fsd/): a large-scale, general-purpose dataset composed of [Freesound](https://freesound.org/) content annotated with labels from Google’s [AudioSet Ontology](https://research.google.com/audioset/ontology/index.html). All datasets collected through the platform will be openly available under Creative Commons licenses.\n\nYou can find more information about the platform and the creation of FSD in our paper:\n\n\u003e  E. Fonseca, J. Pons, X. Favory, F. Font, D. Bogdanov, A. Ferraro, S. Oramas, A. Porter and X. Serra. 
“Freesound Datasets: A Platform for the Creation of Open Audio Datasets” In *Proceedings of the 18th International Society for Music Information Retrieval Conference*, Suzhou, China, 2017.\n \n\n\n## Getting Started\n\nYou'll need to have [`docker`](https://docs.docker.com/install/) and [`docker-compose`](https://docs.docker.com/compose/install/) installed.\n\n### Configuration\n\nCopy `freesound_datasets/local_settings.example.py` to `freesound_datasets/local_settings.py`\nand follow the instructions in the file to fill in service credentials for your project.\n\nTo allow downloads, you need to fill in\n\n * `FS_CLIENT_ID`\n * `FS_CLIENT_SECRET`\n\nIf you want to log in with an external service, fill in the relevant `SOCIAL_AUTH_` keys.\n\nOtherwise, to create a user using Django's models, you can run\n\n    docker-compose run --rm web python manage.py createsuperuser\n\nYou will need to install the PostgreSQL [`pg_trgm`](https://www.postgresql.org/docs/9.6/pgtrgm.html) extension to enable text search in the *sound curation task*. 
After having started the containers (`docker-compose up`), from another terminal, you can run\n\n    docker-compose run --rm db psql -h db -U postgres\n    CREATE EXTENSION pg_trgm;\n\n\n### Running\n\nThe first time you load the application you will need to perform migrations:\n\n    docker-compose run --rm web python manage.py migrate\n\nRun Freesound Annotator using docker-compose:\n\n    docker-compose up\n\nThen point your browser at `http://localhost:8000`.\n\n\n### Dummy data\n\nYou will need test data to develop.\nTo load some data, first load the dataset fixtures by running\n\n    docker-compose run --rm web python manage.py loaddata datasets/fixtures/initial.json\n\nThis will create an empty dataset with a loaded taxonomy.\n\nOnce you have the dataset, you can generate fake data (sounds, annotations and votes),\nby running\n\n    docker-compose run --rm web python manage.py generate_fake_data fsd 100 5 1000 1000\n\nThis will create 100 fake sounds, 5 fake users, 1000 fake annotations and 1000 fake annotation votes.\nYou can run this command again to generate more data.\n\n\n## Add a new dataset with a validation task\n\n1) Create a dataset object from the admin\n\n2) Create a taxonomy.json file which looks like the example below and load it using the command (get the dataset ID from the admin): \n\n```python manage.py load_taxonomy DATASET_ID PATH/TO/TAXONOMY_FILE.json```\n\n```json\n{\n    \"id1\":{\n        \"id\":\"id1\",\n        \"name\":\"Main class\",\n        \"child_ids\":[\"id2\"],\n        \"description\":\"A description for this category.\",\n        \"citation_uri\":\"http://en.wikipedia.org/wiki/Artillery\",\n        \"positive_examples_FS\":[\n            253284,\n            85201\n        ]\n    },\n    \"id2\":{\n        \"id\":\"id2\",\n        \"name\":\"Sub class\",\n        \"parent_ids\":[\"id1\"],\n        \"description\":\"A description for this category.\",\n        \"positive_examples_FS\":[\n            1234,\n            1235\n        
]\n    }\n}\n```\n\n3) Create objects for the taxonomy nodes using the command (get the taxonomy ID from the admin):\n\n```python manage.py create_taxonomy_node_instances TAXONOMY_ID```\n\n4) Load candidate sound annotations from a JSON file using the command (`algorithm_name` is ):\n\n```python manage.py load_sounds_for_dataset short_ds_name filepath.json algorithm_name```\n\n```json\n{\n    \"366411\":{\n      \"username\":\"Rach_Capache\",\n      \"description\":\"Cat sniffing: sniffing microphone. Recorded with a ZOOM H6 recorder and X/Y capsule. The sound was post-processed to remove any background noise or room tone.\",\n      \"license\":\"http://creativecommons.org/licenses/by-nc/3.0/\",\n      \"tags\":[\n         \"close\",\n         \"sniffing\",\n         \"cat\",\n         \"kitty\",\n         \"pet\",\n         \"sniff\",\n         \"owi\"\n      ],\n      \"previews\":\"http://www.freesound.org/data/previews/366/366411_5959200-hq.ogg\",\n      \"duration\":5.0,\n      \"category_ids\":[\n         \"id1\"\n      ],\n      \"name\":\"Cat Sniffing.wav\"\n   },\n   \"38121\":{\n      \"username\":\"deprogram\",\n      \"description\":\"dog bark \",\n      \"license\":\"http://creativecommons.org/licenses/by-nc/3.0/\",\n      \"tags\":[\n         \"bark\",\n         \"dog\"\n      ],\n      \"previews\":\"http://www.freesound.org/data/previews/38/38121_389218-hq.ogg\",\n      \"duration\":2.9953968254,\n      \"category_ids\":[\n         \"id2\"\n      ],\n      \"name\":\"bark.wav\"\n   }\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmtg%2Ffreesound-datasets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmtg%2Ffreesound-datasets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmtg%2Ffreesound-datasets/lists"}