{"id":17497535,"url":"https://github.com/simonepri/wn16s","last_synced_at":"2025-03-29T17:22:13.457Z","repository":{"id":66099018,"uuid":"189223834","full_name":"simonepri/WN16S","owner":"simonepri","description":"WordNet dataset with semantic relations only","archived":false,"fork":false,"pushed_at":"2019-10-31T15:34:23.000Z","size":10650,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-04T18:23:17.976Z","etag":null,"topics":["dataset","semantic","synsets","wn","wn16s","wordnet"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/simonepri.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"license","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-05-29T12:45:47.000Z","updated_at":"2019-06-23T08:07:30.000Z","dependencies_parsed_at":"2023-10-16T01:13:58.008Z","dependency_job_id":null,"html_url":"https://github.com/simonepri/WN16S","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonepri%2FWN16S","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonepri%2FWN16S/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonepri%2FWN16S/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonepri%2FWN16S/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/simonepri","download_url":"https://codeload.github.com/simonepri/W
N16S/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246216536,"owners_count":20742003,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataset","semantic","synsets","wn","wn16s","wordnet"],"created_at":"2024-10-19T15:55:25.853Z","updated_at":"2025-03-29T17:22:13.435Z","avatar_url":"https://github.com/simonepri.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e\n  WN16S\n\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/simonepri/WN16S/releases/latest/download/WN16S.tgz\"\u003e\u003cimg src=\"https://img.shields.io/github/downloads/simonepri/WN16S/latest/WN16S.tgz.svg\" alt=\"github downloads\"/\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/simonepri/WN16S/releases\"\u003e\u003cimg src=\"https://img.shields.io/github/tag/simonepri/WN16S.svg\" alt=\"dataset download\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://en.wikipedia.org/wiki/tab-separated_values\"\u003e\u003cimg src=\"https://img.shields.io/badge/format-tsv-e67e22.svg\" alt=\"dataset format\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://wordnet.princeton.edu/\"\u003e\u003cimg src=\"https://img.shields.io/badge/source-WordNet-2ecc71.svg\" alt=\"dataset source\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/simonepri/WN16S/tree/master/license\"\u003e\u003cimg src=\"https://img.shields.io/github/license/simonepri/WN16S.svg\" alt=\"software license\" /\u003e\u003c/a\u003e\n\u003c/p\u003e\n\u003cbr /\u003e\n\u003cp 
align=\"center\"\u003e\n  WordNet dataset with semantic relations only\n\u003c/p\u003e\n\n## Motivation\nIn [WordNet][wn], two kinds of relations are recognized: lexical and semantic. Lexical relations hold between word forms (lemmas); semantic relations hold between word meanings (synsets).\n\nI wanted a dataset with the lexical relations filtered out, so that synset embeddings can be built using only the semantic relations of the WN graph.\n\n## Structure\nIn the [dataset folder][dataset], you can find several `tsv` and `txt` files, whose purpose is explained below.\n\n| file name | purpose | notes |\n| --------- | ------- | ----- |\n| `count_synsets.txt` | File that contains the number of synsets. | |\n| `count_relations.txt`   | File that contains the number of relations. | |\n| `count_edges_all.txt` | File that contains the total number of edges. | |\n| `count_edges_*.tsv`   | Files that contain the number of edges of type *. | |\n| `synset_name_to_id.tsv`   | File that maps each synset's name to a numeric id starting from 0. | The file is sorted on the first column. |\n| `synset_id_to_name.tsv`   | File that maps each synset id to a synset's name. | The file is sorted on the first column. |\n| `relation_name_to_id.tsv` | File that maps each relation to a numeric id starting from 0. | The file is sorted on the first column. |\n| `relation_id_to_name.tsv` | File that maps each relation id to a relation's name. | The file is sorted on the first column. |\n| `edges_as_id_all.tsv` | File that contains all the edges of WordNet's semantic subgraph as triples of ids (id synset 1, id relation, id synset 2). | The file is sorted on the second column. |\n| `edges_as_id_*.tsv`   | Files that contain only the edges of type *. | The file is sorted on the second column. |\n| `edges_as_name_all.tsv` | File that contains all the edges of WordNet's semantic subgraph as triples of names (name synset 1, name relation, name synset 2). 
| The file is sorted on the second column. |\n| `edges_as_name_*.tsv`   | Files that contain only the edges of type *. | The file is sorted on the second column. |\n\n## Download\nA compressed version of the dataset can be downloaded from the [release page][releases] or by clicking [here][download].\n\n## Source\nThe dataset is generated using [nltk][nltk] and is a subset of the [WordNet][wn] dataset.\n\n## License\nAll source code of this project is licensed under the MIT License - see the [license][license] file for details.\n\n[dataset]: https://github.com/simonepri/WN16S/tree/master/dataset\n[releases]: https://github.com/simonepri/WN16S/releases/latest\n[download]: https://github.com/simonepri/WN16S/releases/latest/download/WN16S.tgz\n[license]: https://github.com/simonepri/WN16S/tree/master/license\n\n[wn]: https://wordnet.princeton.edu\n[nltk]: https://github.com/nltk/nltk\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonepri%2Fwn16s","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsimonepri%2Fwn16s","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonepri%2Fwn16s/lists"}