https://github.com/mlampros/glover

Global Vectors for Word Representation

Host: GitHub
URL: https://github.com/mlampros/glover
Owner: mlampros
Created: 2017-01-04T16:42:54.000Z (almost 9 years ago)
Default Branch: master
Last Pushed: 2021-04-17T04:15:41.000Z (over 4 years ago)
Last Synced: 2025-03-26T02:42:58.496Z (7 months ago)
Topics: global-vectors, glove, r, word-representation
Language: R
Homepage: https://mlampros.github.io/GloveR/
Size: 1.83 MB
Stars: 7
Watchers: 3
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

[![tic](https://github.com/mlampros/GloveR/workflows/tic/badge.svg?branch=master)](https://github.com/mlampros/GloveR/actions)

[![codecov.io](https://codecov.io/github/mlampros/GloveR/coverage.svg?branch=master)](https://codecov.io/github/mlampros/GloveR?branch=master)



## GloveR




The GloveR package is an R wrapper for [*Global Vectors for Word Representation*](http://nlp.stanford.edu/projects/glove/) (GloVe). *GloVe* is an unsupervised learning algorithm for obtaining vector representations of words. Training is performed on aggregated global word-word co-occurrence statistics from a corpus, and the resulting representations exhibit interesting linear substructures of the word vector space. For more information, see *Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation*. The COPYRIGHTS and LICENSE files can be found in the *inst* folder of the R package.
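To make "training on co-occurrence statistics" a little more concrete, the sketch below writes out the per-pair weighted least-squares term from the GloVe paper in plain R. The function names `glove_weight` and `glove_pair_loss` are hypothetical and are not part of GloveR. The defaults `x_max = 10` and `alpha = 0.75` mirror the `cutoff` and `alpha_weight` values in the example usage, on the assumption that those arguments correspond to the paper's weighting parameters.

```R
# Weighting function f(x) from the GloVe paper:
#   f(x) = (x / x_max)^alpha  if x < x_max, otherwise 1
# (hypothetical helper, not exported by GloveR)
glove_weight <- function(x, x_max = 10, alpha = 0.75) {
  ifelse(x < x_max, (x / x_max)^alpha, 1)
}

# Weighted squared error for one co-occurrence count x_ij:
#   f(x_ij) * (w_i . w_j + b_i + b_j - log(x_ij))^2
# w_i, w_j are word and context vectors, b_i, b_j their biases.
glove_pair_loss <- function(w_i, w_j, b_i, b_j, x_ij,
                            x_max = 10, alpha = 0.75) {
  err <- sum(w_i * w_j) + b_i + b_j - log(x_ij)
  glove_weight(x_ij, x_max, alpha) * err^2
}

# Rare pairs get a small weight, frequent pairs are capped at 1
glove_weight(c(1, 5, 10, 100))
```

The full objective sums this term over all non-zero co-occurrence counts; the package delegates that optimisation to the wrapped GloVe code.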




This R package has some limitations:

* it works only on Unix-like operating systems

* the input data file must be large enough for the package function *Glove* to work properly
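Both limitations can be checked before any package function is called. The helper below is only a sketch: `can_run_glover` is a hypothetical name, and the `min_bytes` threshold is an arbitrary placeholder, since no minimum size is stated here.

```R
# Hypothetical pre-flight check; not part of GloveR.
# min_bytes is an assumed placeholder, not a documented requirement.
can_run_glover <- function(train_data, min_bytes = 1e6) {
  is_unix <- .Platform$OS.type == "unix"

  size <- file.info(train_data)$size   # NA if the file does not exist
  big_enough <- !is.na(size) && size >= min_bytes

  if (!is_unix) {
    warning("GloveR works only on Unix-like systems")
  }
  if (!big_enough) {
    warning("training file is missing or smaller than min_bytes")
  }
  is_unix && big_enough
}

# A tiny file fails the size check
tiny <- tempfile(fileext = ".txt")
writeLines("a very small corpus", tiny)
suppressWarnings(can_run_glover(tiny))
```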

To install the package from GitHub, use the *install_github* function of the devtools package:

```R
devtools::install_github('mlampros/GloveR')
```




Use the following link to report bugs or issues with the R wrapper:

[https://github.com/mlampros/GloveR/issues](https://github.com/mlampros/GloveR/issues)




#### **Example usage**




```R
# example input data ---> 'dat.txt'

library(GloveR)

#-----------------------------
# vocabulary count computation
#-----------------------------

res = vocabulary_counts(train_data = '/data_GloveR/dat.txt', MAX_vocab = 0,
                        MIN_count = 5, output_vocabulary = '/data_GloveR/VOCAB.txt',
                        trace = TRUE)

#-------------------------
# cooccurrence statistics
#-------------------------

co_mat = cooccurrence_statistics(train_data = '/data_GloveR/dat.txt', vocab_input = '/data_GloveR/VOCAB.txt',
                                 output_cooccurences = '/data_GloveR/COOCUR.bin', symmetric_both = TRUE,
                                 context_words = 15, memory_gb = 4.0, MAX_product = 0, overflowLength = 0,
                                 trace = TRUE)

#---------------------------
# shuffling of cooccurrences
#---------------------------

shfl = shuffle_cooccurrences(input_cooccurences = '/data_GloveR/COOCUR.bin',
                             output_cooccurences = '/data_GloveR/COOCUR_output.bin',
                             memory_gb = 4.0, arraySize = 0, trace = TRUE)

#---------------------------------------
# Global Vectors for Word Representation
#---------------------------------------

gl = Glove(input_cooccurences = '/data_GloveR/COOCUR_output.bin',
           output_vectors = '/data_GloveR/vectors',
           vocab_input = '/data_GloveR/VOCAB.txt',
           model_output = 2, iter_num = 5, learn_rate = 0.05,
           save_squared_grads_file = NULL, alpha_weight = 0.75,
           cutoff = 10, binary_output = 0, vectorSize = 50, threads = 6,
           trace = TRUE)
```




More information about the parameters of each function can be found in the package documentation.
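Once training has finished, the word vectors can be loaded back into R for inspection. The sketch below assumes the text output follows the usual GloVe layout, one word per line followed by its space-separated vector components; it does not confirm the exact file name that `Glove` writes, so a small toy file stands in for the real output. `read_word_vectors` and `cosine_similarity` are hypothetical helpers, not GloveR functions.

```R
# Toy stand-in for a GloVe text output file
# (assumed format: word, then its vector components)
vec_file <- tempfile(fileext = ".txt")
writeLines(c("king 0.5 0.1 0.3",
             "queen 0.4 0.2 0.3",
             "apple -0.3 0.9 0.0"), vec_file)

# Read the file into a numeric matrix with one row per word
read_word_vectors <- function(path) {
  raw <- read.table(path, header = FALSE, sep = " ", quote = "",
                    comment.char = "", stringsAsFactors = FALSE)
  m <- as.matrix(raw[, -1])
  rownames(m) <- raw[[1]]
  colnames(m) <- NULL
  m
}

# Cosine similarity between the vectors of two words
cosine_similarity <- function(m, a, b) {
  sum(m[a, ] * m[b, ]) / sqrt(sum(m[a, ]^2) * sum(m[b, ]^2))
}

vecs <- read_word_vectors(vec_file)
cosine_similarity(vecs, "king", "queen")
```

For a real run, `vec_file` would be replaced by the path of the text file produced under the `output_vectors` prefix.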
