https://github.com/akb89/entropix
Entropy, Zipf's law and distributional semantics
- Host: GitHub
- URL: https://github.com/akb89/entropix
- Owner: akb89
- License: mit
- Created: 2018-12-16T07:40:42.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2022-06-22T03:41:36.000Z (almost 4 years ago)
- Last Synced: 2025-05-29T15:55:42.011Z (about 1 year ago)
- Language: Python
- Size: 964 KB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# entropix
[![GitHub release][release-image]][release-url]
[![PyPI release][pypi-image]][pypi-url]
[![Build][build-image]][build-url]
[![MIT License][license-image]][license-url]
[release-image]:https://img.shields.io/github/release/akb89/entropix.svg?style=flat-square
[release-url]:https://github.com/akb89/entropix/releases/latest
[pypi-image]:https://img.shields.io/pypi/v/entropix.svg?style=flat-square
[pypi-url]:https://pypi.org/project/entropix/
[build-image]:https://img.shields.io/github/workflow/status/akb89/entropix/CI?style=flat-square
[build-url]:https://github.com/akb89/entropix/actions?query=workflow%3ACI
[license-image]:http://img.shields.io/badge/license-MIT-000000.svg?style=flat-square
[license-url]:LICENSE.txt
Generate count-based Distributional Semantic Models by sampling SVD singular vectors instead of using top components.
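To illustrate the idea, here is a minimal NumPy sketch contrasting the conventional "top components" reduction with a model built from an arbitrary subset of singular vectors. It is not entropix's implementation: the helper name `svd_embeddings`, the toy count matrix, the hand-picked indices, and the choice to weight vectors by their singular values are all assumptions made for illustration.

```python
import numpy as np


def svd_embeddings(counts, dims):
    """Return word vectors built from the singular vectors indexed by `dims`.

    Hypothetical helper: weighting U by the singular values is one common
    convention, assumed here for illustration.
    """
    # Thin SVD of the count matrix: counts = U @ diag(S) @ Vt
    U, S, _ = np.linalg.svd(counts, full_matrices=False)
    dims = np.asarray(dims)
    # Keep only the requested columns and scale each by its singular value
    return U[:, dims] * S[dims]


# Toy word-by-context count matrix (6 words, 5 contexts), made up for the example
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(6, 5)).astype(float)

# Conventional reduction: keep the 3 components with the largest singular values
top_k = svd_embeddings(counts, [0, 1, 2])

# Alternative: keep a subset of components that need not be contiguous
sampled = svd_embeddings(counts, [0, 2, 4])

print(top_k.shape, sampled.shape)
```

Both models have the same dimensionality; they differ only in which singular vectors are kept. How to choose that subset is what the `sample` command below is for.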
## Install
```shell
pip install entropix
```
or, after a git clone:
```shell
python3 setup.py install
```
## Use
### Sequential mode
```shell
# --dataset: men, simlex or simverb
# --kfold-size: size of kfold, between 0 and .5
# --metric: spr (spearman), pearson, rmse or both (spr+rmse)
entropix sample \
--model /abs/path/to/dense/numpy/model.npy \
--vocab /abs/path/to/corresponding/model.vocab \
--dataset dataset_to_optimize_on \
--shuffle \
--mode seq \
--kfold-size .2 \
--metric pearson \
--num-threads 5
```
### Limit mode
```shell
# --dataset: men, simlex or simverb
# --limit: number of dimensions to sample
entropix sample \
--model /abs/path/to/dense/numpy/model.npy \
--vocab /abs/path/to/corresponding/model.vocab \
--dataset dataset_to_optimize_on \
--mode limit \
--metric pearson \
--limit 10
```