https://github.com/abhishekgupta92/lexical_db_bangla
Automatically constructed lexical database for Bangla inspired from Wordnet
- Host: GitHub
- URL: https://github.com/abhishekgupta92/lexical_db_bangla
- Owner: abhishekgupta92
- Created: 2012-06-20T19:59:21.000Z (about 12 years ago)
- Default Branch: master
- Last Pushed: 2012-07-12T09:04:53.000Z (almost 12 years ago)
- Last Synced: 2024-02-18T13:31:07.051Z (4 months ago)
- Language: Python
- Size: 2.08 MB
- Stars: 11
- Watchers: 6
- Forks: 9
- Open Issues: 1
Metadata Files:
- Readme: README.md
Lists
- awesome-bangla - Bengali Lexical Dictionary (2012)
README
Lexical Database Bangla
=======================
Automatic construction of a lexical database for Bangla, inspired by WordNet, using a bilingual dictionary and WordNet.

Usage
=====
1. Download the package and install it:

        python setup.py install

2. In your Python code:

        import lexical_db_bangla
        syns_set = lexical_db_bangla.syns(word)
        print(syns_set)

   where `word` is any Bangla word.

Approach
========
For each Bangla word in the bilingual (Bangla-to-English) dictionary, we look up all of its possible English translations. We then find the synsets for those English words in Princeton WordNet, extract the whole network of those synsets, and copy it into our target wordnet for Bangla. Finally, we translate the structure into Bangla wherever possible: the names of the features attached to each word/synset, the features of these words, and of course the actual words.

Note
====
The Bangla-to-Bangla dictionary has already been generated; it is parsed and used to find the synonyms. You can also generate it yourself using the read_dict.py file. Also, remember that the file has a very high memory footprint.
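
As a rough illustration of how such a Bangla-to-Bangla synonym dictionary can fall out of the approach described earlier, here is a minimal sketch written for Python 3. It is not the code from this repository: the function names, the toy bilingual dictionary, and the made-up synset identifiers (standing in for Princeton WordNet) are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Toy Bangla -> English dictionary (illustrative data, not from the repository).
BN_TO_EN = {
    "জল": ["water"],
    "পানি": ["water"],
    "বই": ["book"],
    "গ্রন্থ": ["book", "volume"],
}

# Toy stand-in for WordNet: English word -> synset identifiers.
# The identifiers are invented; a real run would query WordNet instead.
EN_SYNSETS = {
    "water": {"syn_water"},
    "book": {"syn_book"},
    "volume": {"syn_book", "syn_volume"},
}


def build_bangla_synsets(bn_to_en, en_synsets):
    """Attach each Bangla word to every synset of its English translations."""
    synset_members = defaultdict(set)
    for bn_word, en_words in bn_to_en.items():
        for en_word in en_words:
            for synset_id in en_synsets.get(en_word, ()):
                synset_members[synset_id].add(bn_word)
    return synset_members


def synonyms(word, bn_to_en, en_synsets):
    """Return the other Bangla words that share at least one synset with `word`."""
    result = set()
    for group in build_bangla_synsets(bn_to_en, en_synsets).values():
        if word in group:
            result |= group
    result.discard(word)
    return result
```

Calling `synonyms("জল", BN_TO_EN, EN_SYNSETS)` on this toy data groups the two words that both translate to "water". A real pipeline would also have to deal with ambiguous English translations, which this sketch ignores.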
Dumps in the folder english_bangla_datasets have been downloaded from http://www.bengalinux.org/english-to-bengali-dictionary/dumps/. The license for them can be found in that folder, in the file Copying.txt.

### Dependencies:
----------------

* [Python](http://www.python.org)
* [NLTK Library](http://www.nltk.org)
* Numpy Library (required by nltk)
* NLTK Corpora

After you have installed the NLTK Library, do the following to download the NLTK Corpora:
1. Go to your Python shell and type:

        import nltk
        nltk.download()

2. In the downloader, download the **WordNet** corpus.
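
If you prefer to skip the interactive downloader, the WordNet corpus can also be fetched by name. The sketch below assumes a reasonably recent NLTK on Python 3; the helper name `ensure_wordnet` is hypothetical and not part of this package.

```python
def ensure_wordnet():
    """Download the NLTK WordNet corpus only if it is not already installed."""
    import nltk  # imported lazily so that defining this helper does not require NLTK

    try:
        # nltk.data.find raises LookupError when the resource is missing.
        nltk.data.find("corpora/wordnet")
    except LookupError:
        # Non-interactive download of just the WordNet corpus.
        nltk.download("wordnet")
```

After calling `ensure_wordnet()`, `from nltk.corpus import wordnet` should be able to load the corpus.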