Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/urduhack/urdu-words
📝A text file containing 150,000 Urdu words for all your dictionary/word-based projects e.g: auto-completion / autosuggestion.
autosuggestion backer bigram dictionary ner ner-labels sponsors trigram urdu urdu-words words-collection
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/urduhack/urdu-words
- Owner: urduhack
- License: mit
- Created: 2018-04-11T06:12:35.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2020-10-08T08:09:34.000Z (over 3 years ago)
- Last Synced: 2024-03-24T14:16:04.086Z (3 months ago)
- Topics: autosuggestion, backer, bigram, dictionary, ner, ner-labels, sponsors, trigram, urdu, urdu-words, words-collection
- Language: Python
- Homepage:
- Size: 1.12 MB
- Stars: 38
- Watchers: 7
- Forks: 18
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Lists
- awesome-urdu - UrduHack Words-List - Includes N-grams, NER Labels (Urdu Datasets / Urdu Lexical Resources)
README
# 150k+ unique Urdu words collections
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://github.com/urduhack/urdu-words/blob/master/LICENSE)
![Last commit](https://img.shields.io/github/last-commit/urduhack/urdu-words.svg)
[![Build Status](https://travis-ci.org/urduhack/urdu-words.svg?branch=master)](https://travis-ci.org/urduhack/urdu-words)
[![image](https://img.shields.io/github/contributors/urduhack/urdu-words.svg)](https://github.com/urduhack/urdu-words/graphs/contributors)
[![Join Slack](https://img.shields.io/badge/join-us%20on%20slack-gray.svg?longCache=true&logo=slack&colorB=red)](https://join.slack.com/t/urduhack/shared_invite/zt-5cpkrvz8-Zu_tOyR5AEcspCBCyqhSZQ)
[![Say Thanks!](https://img.shields.io/badge/Say%20Thanks-!-1EAEDB.svg)](https://saythanks.io/to/akkefa)

This repository consists of text files containing 150k+ Urdu words for dictionary and word-based projects, e.g. auto-completion, autosuggestion, embedding networks and tagging.
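
As a rough illustration of the auto-completion use case, the sketch below loads a newline-delimited word file and returns the words that start with a given prefix. It assumes a local copy of `words.txt` with one word per line; the function names `load_words` and `complete` are hypothetical and not part of this repository.

```python
from bisect import bisect_left
from pathlib import Path


def load_words(path):
    """Read a newline-delimited word file into a sorted list of unique words."""
    text = Path(path).read_text(encoding="utf-8")
    return sorted({line.strip() for line in text.splitlines() if line.strip()})


def complete(sorted_words, prefix, limit=10):
    """Return up to `limit` words from `sorted_words` that start with `prefix`."""
    # In a sorted list, all words sharing a prefix form one contiguous run
    # that begins at the first word >= prefix.
    i = bisect_left(sorted_words, prefix)
    matches = []
    while (
        i < len(sorted_words)
        and len(matches) < limit
        and sorted_words[i].startswith(prefix)
    ):
        matches.append(sorted_words[i])
        i += 1
    return matches
```

A call such as `complete(load_words("words.txt"), prefix)` would then return the first few completions for `prefix`. Sorting here is by Unicode code point, which is enough for prefix lookup but is not a linguistic collation order for Urdu.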
## Files you may be interested in

I pulled the words out into simple newline-delimited text files, which are more useful when building apps or importing into databases.

- [words.txt](words.txt) Contains all Urdu words.
- [bigram_words.txt](bigram_words.txt) Contains all Urdu bigram words.
- [trigram_words.txt](trigram_words.txt) Contains all Urdu trigram words.

## NER Labels
I have added word lists for labelling Named Entity Recognition (NER) data. The lists cover categories such as _Persons_, _Locations_, _Organizations_ and _Dates_, and give a good starting point for labelling NER data.
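
One way such lists can be used is as a simple gazetteer that pre-labels tokens before manual review. The sketch below is only an assumption about usage: it assumes the label files in this section hold one entry per line, the label names (`LOCATION`, `PERSON`, and so on) and the functions `load_gazetteer` and `pre_label` are hypothetical, and it matches single tokens only, so multi-word entries would need extra handling.

```python
from pathlib import Path

# Hypothetical label names mapped to the label files in this repository.
LABEL_FILES = {
    "LOCATION": "ner/locations.txt",
    "PERSON": "ner/persons.txt",
    "ORGANIZATION": "ner/organizations.txt",
    "DATE": "ner/dates.txt",
}


def load_gazetteer(label_files, root="."):
    """Build a dict mapping each entry in the label files to its label."""
    gazetteer = {}
    for label, rel_path in label_files.items():
        text = (Path(root) / rel_path).read_text(encoding="utf-8")
        for line in text.splitlines():
            entry = line.strip()
            if entry:
                # If an entry appears in several files, keep the first label seen.
                gazetteer.setdefault(entry, label)
    return gazetteer


def pre_label(tokens, gazetteer, default="O"):
    """Pair each token with its gazetteer label, or `default` when it is absent."""
    return [(token, gazetteer.get(token, default)) for token in tokens]
```

Calling `pre_label(sentence.split(), load_gazetteer(LABEL_FILES))` gives a first-pass labelling that an annotator can then correct. Whitespace splitting is a simplification; a proper Urdu tokenizer would likely give better matches.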
The following files contain the label words:

- [locations.txt](ner/locations.txt) Contains locations from across the world
- [persons.txt](ner/persons.txt) Contains person names
- [organizations.txt](ner/organizations.txt) Contains organization names
- [dates.txt](ner/dates.txt) Contains time- and date-related words

## Table of contents
- [Contributing](#contributing)
- [Bugs and feature requests](#bugs-and-feature-requests)
- [Contributors](#contributors)
- [Copyright and license](#copyright-and-license)

## Contributing
All contributions are more than welcome. A contribution may close an issue, fix a bug (reported or not), improve the existing code, and so on.
If you would like to add a word or a new set of words, send a PR.

## Bugs and feature requests
Have a bug or a feature request? If you wish to remove or update some of the words, please file an issue before sending a PR to the repo. [[Please open a new issue](https://github.com/urduhack/urdu-words/issues/new)]
## Contributors
Special thanks to everyone who contributed to getting Urduhack to its current state.
Thanks to the Center for Language Engineering for providing the word list.

## Backers [![Backers on Open Collective](https://opencollective.com/urduhack/backers/badge.svg)](#backers)

Thank you to all our backers! 🙏 [[Become a backer](https://opencollective.com/urduhack#backer)]

## Sponsors [![Sponsors on Open Collective](https://opencollective.com/urduhack/sponsors/badge.svg)](#sponsors)
Support this project by becoming a sponsor. [[Become a sponsor](https://opencollective.com/urduhack#sponsor)]

## Copyright and license

Code released under the [MIT License](https://github.com/urduhack/urdu-words/blob/master/LICENSE).