{"id":13810709,"url":"https://github.com/facebookresearch/hypernymysuite","last_synced_at":"2025-05-14T15:31:07.492Z","repository":{"id":47446961,"uuid":"132787783","full_name":"facebookresearch/hypernymysuite","owner":"facebookresearch","description":"Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora","archived":true,"fork":false,"pushed_at":"2021-08-31T14:40:40.000Z","size":4989,"stargazers_count":153,"open_issues_count":1,"forks_count":22,"subscribers_count":61,"default_branch":"main","last_synced_at":"2024-12-17T01:37:34.847Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/facebookresearch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-05-09T16:58:27.000Z","updated_at":"2024-04-03T05:50:12.000Z","dependencies_parsed_at":"2022-09-10T08:12:01.093Z","dependency_job_id":null,"html_url":"https://github.com/facebookresearch/hypernymysuite","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fhypernymysuite","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fhypernymysuite/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fhypernymysuite/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Fhypernymysuite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/facebookresearch","d
ownload_url":"https://codeload.github.com/facebookresearch/hypernymysuite/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254171699,"owners_count":22026493,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-04T03:00:23.898Z","updated_at":"2025-05-14T15:31:02.475Z","avatar_url":"https://github.com/facebookresearch.png","language":"Python","funding_links":[],"categories":["Hypernymy Discovery \u0026 Lexical Entailment"],"sub_categories":[],"readme":"# Hypernymy Suite\n\nHypernymySuite is a tool for evaluating some hypernymy detection modules. Its\npredominant focus is reproducing the results for the following paper.\n\n\u003e Stephen Roller, Douwe Kiela, and Maximilian Nickel. 2018. Hearst Patterns\n\u003e Revisited: Automatic Hypernym Detection from Large Text Corpora. 
ACL.\n\u003e ([arXiv](https://arxiv.org/abs/1806.03191))\n\nWe hope that open sourcing our evaluation will help facilitate future research.\n\n## Example\n\nYou can produce results in a JSON format by calling main.py:\n\n    python main.py cnt --dset hearst_counts.txt.gz\n\nThese results can be made machine readable by piping them into `compile_table`:\n\n    python main.py cnt --dset hearst_counts.txt.gz | python compile_table.py\n\nTo generate the full table from the report, you may simply use `generate_table.sh`:\n\n    bash generate_table.sh results.json\n\nPlease note that due to licensing concerns, we were not able to release our\ntrain/validation/test folds from the paper, so results may differ slightly from\nthose reported.\n\n## Requirements\n\nThe module was developed with python3 in mind, and is not tested for python2.\nNonetheless, cross-platform compatibility may be possible.\n\nThe suite requires several packages you probably already have installed:\n`numpy`, `scipy`, `pandas`, `scikit-learn` and `nltk`. These can be installed\nusing pip:\n\n    pip install -r requirements.txt\n\nIf you've never used `nltk` before, you'll need to install the wordnet module.\n\n    python -c \"import nltk; nltk.download('wordnet')\"\n    \nOn OS X, you may need to install `coreutils` and `gnu-sed` for the script `download_data.sh` to run correctly. These can be installed using brew:\n\n    brew install coreutils gnu-sed\n\nAfter installation, you will either need to modify `download_data.sh` to run `gsort` and `gsed` instead of `sort` and `sed`, or alternatively add a \"gnubin\" directory to your PATH from your bashrc:\n\n    PATH=\"/usr/local/opt/coreutils/libexec/gnubin:$PATH\"\n\nFor more information, see `brew info coreutils` or `brew info gnu-sed`.\n\n## Evaluating your own model\n\nYou can evaluate your own model in two separate ways. The simplest way is\nto create a copy of example.tsv, and fill in your model's predictions in the `sim`\ncolumn. 
You must include a prediction for every pair, but you may set the `is_oov`\ncolumn to `1` to ensure it is correctly calculated.\n\nYou may then evaluate the model:\n\n    python main.py precomputed --dset example.tsv\n\nYou can also implement any model by extending the `base.HypernymySuiteModel` class\nand filling in your own implementation for `predict` or `predict_many`.\n\n## References\n\nIf you find this code useful for your research, please cite the following paper:\n\n    @inproceedings{roller2018hearst,\n        title = {Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora},\n        author = {Roller, Stephen and Kiela, Douwe and Nickel, Maximilian},\n        year = {2018},\n        booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics},\n        location = {Melbourne, Australia},\n        publisher = {Association for Computational Linguistics}\n    }\n\n## License\n\nThis code is licensed under [CC-BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/).\n\nThe data contained in `hearst_counts.txt` was extracted from a combination of\n[Wikipedia](https://en.wikipedia.org/wiki/Wikipedia:Database_download) and Gigaword.\nPlease see the publication for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2Fhypernymysuite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffacebookresearch%2Fhypernymysuite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2Fhypernymysuite/lists"}