https://github.com/qlai/stochasticLDA
Python implementation of Stochastic Variational Inference for LDA
- Host: GitHub
- URL: https://github.com/qlai/stochasticLDA
- Owner: qlai
- License: gpl-3.0
- Created: 2016-04-04T01:14:33.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2020-05-26T18:11:58.000Z (over 4 years ago)
- Last Synced: 2024-05-19T23:35:25.071Z (6 months ago)
- Language: Python
- Homepage:
- Size: 35.2 KB
- Stars: 25
- Watchers: 3
- Forks: 23
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
## Stochastic Variational Inference for Latent Dirichlet Allocation
The code structure follows the OnlineVB code provided by Matthew D. Hoffman, and the algorithm is as described in Hoffman's paper below.
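As a rough illustration of the stochastic update described in that paper, here is a minimal NumPy sketch. It is not the code in this repository: the function names (`digamma`, `dirichlet_expectation`, `local_step`, `svi_lda`), the `(word_ids, counts)` document representation, the toy corpus, and the default hyperparameters are all assumptions made for the example. The symbols `alpha`, `eta`, `tau`, and `kappa` follow the paper's notation, and the digamma function is hand-rolled only to avoid a SciPy dependency.

```python
import numpy as np


def digamma(x):
    """Digamma for positive inputs: recurrence up to x >= 6, then asymptotic series."""
    x = np.array(x, dtype=float)  # np.array copies, so the caller's data is untouched
    out = np.zeros_like(x)
    while np.any(x < 6.0):
        small = x < 6.0
        out[small] -= 1.0 / x[small]  # psi(x) = psi(x + 1) - 1/x
        x[small] += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return out + np.log(x) - 0.5 / x - series


def dirichlet_expectation(param):
    """E[log x] for x ~ Dirichlet(param); each row is a separate Dirichlet."""
    return digamma(param) - digamma(param.sum(axis=-1, keepdims=True))


def local_step(word_ids, counts, lam, alpha, n_iter=50, tol=1e-4):
    """Fit one document's variational parameters gamma (K) and phi (K x n_unique)."""
    K = lam.shape[0]
    exp_elog_beta = np.exp(dirichlet_expectation(lam))[:, word_ids]
    gamma = np.ones(K)
    for _ in range(n_iter):
        exp_elog_theta = np.exp(dirichlet_expectation(gamma))
        phi = exp_elog_theta[:, None] * exp_elog_beta
        phi /= phi.sum(axis=0, keepdims=True)  # normalise over topics per word
        new_gamma = alpha + phi @ counts
        converged = np.mean(np.abs(new_gamma - gamma)) < tol
        gamma = new_gamma
        if converged:
            break
    return gamma, phi


def svi_lda(docs, V, K, alpha=0.1, eta=0.01, tau=1.0, kappa=0.7,
            n_steps=200, seed=0):
    """Return the topic-word variational parameter lambda (K x V)."""
    rng = np.random.default_rng(seed)
    D = len(docs)
    lam = rng.gamma(100.0, 0.01, size=(K, V))  # random positive initialisation
    for t in range(n_steps):
        word_ids, counts = docs[rng.integers(D)]  # sample one document uniformly
        _, phi = local_step(word_ids, counts, lam, alpha)
        # Intermediate estimate: pretend the corpus is D copies of this document.
        lam_hat = np.full((K, V), eta)
        lam_hat[:, word_ids] += D * phi * counts
        rho = (t + tau) ** (-kappa)  # decaying step size
        lam = (1.0 - rho) * lam + rho * lam_hat
    return lam


# Toy corpus (made up): each document is (unique word ids, their counts).
docs = [
    (np.array([0, 1, 2]), np.array([3.0, 2.0, 1.0])),
    (np.array([3, 4, 5]), np.array([2.0, 2.0, 2.0])),
    (np.array([0, 2]), np.array([1.0, 4.0])),
    (np.array([4, 5]), np.array([3.0, 1.0])),
]
lam = svi_lda(docs, V=6, K=2)
topics = lam / lam.sum(axis=1, keepdims=True)  # normalised topic-word weights
```

Each step touches a single sampled document, which is what lets the method scale to large corpora; a mini-batch variant would average the intermediate estimates over several documents before blending.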
Based on the following papers:
- [Latent Dirichlet Allocation](https://www.cs.princeton.edu/~blei/papers/BleiNgJordan2003.pdf) by David M. Blei, Andrew Y. Ng and Michael I. Jordan
- [Stochastic Variational Inference](http://www.columbia.edu/~jwp2128/Papers/HoffmanBleiWangPaisley2013.pdf) by Matthew D. Hoffman, David M. Blei, Chong Wang and John Paisley

### Also aiming to implement SVI for HDP as described in the second paper above, work in progress
### How to Use
See 'Help' using

```python stochastic_lda.py -h```

You will need:
- A file [dictionary.csv] containing your vocabulary
- A file [doclist.txt] containing the list of documents in the directory that you want to sample from
- At the moment, your documents can be plain txt files; no pre-processing is required

For classwork, work in progress...
- [x] Basic initial implementation
- [x] Debug for common corpus
- [x] Support Command-Line Usage for user-defined test mode and normal mode
- [x] Run on own data
- [ ] Implement HDP
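
The two input files listed under How to Use could be generated from a directory of txt documents with something like the sketch below. The README does not specify the file layouts, so the formats here (one vocabulary word per line in `dictionary.csv`, one file name per line in `doclist.txt`) are assumptions, as are the helper name `build_inputs` and the simple lowercase tokenisation. Check how `stochastic_lda.py` reads these files before relying on it.

```python
import re
from pathlib import Path


def build_inputs(doc_dir, dictionary_path="dictionary.csv",
                 doclist_path="doclist.txt"):
    """Write a vocabulary file and a document list for the txt files in doc_dir.

    ASSUMED formats (not confirmed by the README):
      - dictionary file: one lowercase word per line, sorted
      - doclist file: one document file name per line, sorted
    """
    doc_dir = Path(doc_dir)
    doclist_path = Path(doclist_path)
    # Skip the doclist itself in case it lives in the same directory.
    doc_names = sorted(
        p.name for p in doc_dir.glob("*.txt")
        if p.resolve() != doclist_path.resolve()
    )

    vocab = set()
    for name in doc_names:
        text = (doc_dir / name).read_text(encoding="utf-8").lower()
        vocab.update(re.findall(r"[a-z]+", text))  # crude tokeniser (assumption)
    words = sorted(vocab)

    Path(dictionary_path).write_text("\n".join(words) + "\n", encoding="utf-8")
    doclist_path.write_text("\n".join(doc_names) + "\n", encoding="utf-8")
    return words, doc_names


# Hypothetical usage; "path/to/docs" is a placeholder directory:
# words, doc_names = build_inputs("path/to/docs")
```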