https://github.com/prashantranjan09/improved-word-embeddings

Improving Word Embeddings by combining word embeddings with their POS (Part Of Speech) tag.
https://github.com/prashantranjan09/improved-word-embeddings

part-of-speech-tagger pos wordembeddings

Last synced: about 2 months ago
JSON representation

Improving Word Embeddings by combining word embeddings with their POS (Part Of Speech) tag.

Host: GitHub
URL: https://github.com/prashantranjan09/improved-word-embeddings
Owner: PrashantRanjan09
Created: 2018-06-17T08:55:58.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2020-08-18T17:38:51.000Z (about 5 years ago)
Last Synced: 2024-04-27T23:38:39.208Z (over 1 year ago)
Topics: part-of-speech-tagger, pos, wordembeddings
Language: Python
Homepage:
Size: 314 KB
Stars: 19
Watchers: 2
Forks: 7
Open Issues: 3
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Improved-Word-Embeddings
This is a paper implementation of https://arxiv.org/pdf/1711.08609.pdf. This paper increases the accuracy of word embeddings by enriching the word embedding with its associated POS (Part of Speech) tag. (See Below):

![Optional Text](../master/ImprovedWordEmbeddings.png)

However, this excludes the Lexicon2Vec conversion and follows the following steps:

Inputs:
S = {W1, W2,……., Wn} , Input sentence S contains n words
PT = {T1, T2,……, Tm}, All POS tags
Corpus = Imdb/Your corpus

Output:
IMV: Improved word vectors of corpus

The steps follow the following algorithm:

for j=1 to m do
VTj GenerateVector ( Tj )
Tj < Tj , VTj >
end for

for each Wi in S do
If Wi exist in "your choice of word vector" then extract VecWi
MVi VecWi
endif

# POS ExtractPOS ( Wi )
for k=1 to m do
If POS=Tk then ADD VTk into MVi
end if
end for

#### Usage
This gives you the option of testing it on the Imdb data and also to run it on your corpus.
To run it on the imdb data:

In config.json assign use_imdb : 1

To run it on your corpus:

In config.json assign use_imdb : 0

The model used is a very generic one. Feel free to make changes to the model as per your requirements.
To change model:

In pos_function.py change model_build

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/prashantranjan09/improved-word-embeddings

Awesome Lists containing this project

README