https://github.com/joewandy/hlda

Gibbs sampler for the Hierarchical Latent Dirichlet Allocation topic model
gibbs-sampler hierarchical-topic-models lda topic-hierarchies topic-modeling

Host: GitHub
URL: https://github.com/joewandy/hlda
Owner: joewandy
License: gpl-3.0
Created: 2016-09-29T23:13:42.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2022-12-08T08:01:46.000Z (over 2 years ago)
Last Synced: 2024-09-23T04:39:07.339Z (7 months ago)
Topics: gibbs-sampler, hierarchical-topic-models, lda, topic-hierarchies, topic-modeling
Language: Jupyter Notebook
Size: 5.74 MB
Stars: 146
Watchers: 6
Forks: 38
Open Issues: 21
Metadata Files:
- Readme: README.md
- License: LICENSE.txt

Awesome Lists containing this project

awesome-topic-models - hlda - Python package based on *Mallet's* Gibbs sampler having a fixed depth on the nCRP tree (Models / Hierarchical LDA (hLDA) [:page_facing_up:](https://dl.acm.org/doi/10.5555/2981345.2981348))

README

Hierarchical Latent Dirichlet Allocation
----------------------------------------

**Note: this repository should be used for educational purposes only. For production use, I'd recommend https://github.com/bab2min/tomotopy, which is more production-ready.**

---

Hierarchical Latent Dirichlet Allocation (hLDA) addresses the problem of learning topic hierarchies from data. The model relies on a non-parametric prior called the nested Chinese restaurant process, which allows for arbitrarily large branching factors and readily accommodates growing data collections. The hLDA model combines this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation.
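To make the nested Chinese restaurant process prior concrete, here is a minimal sketch of how one document's path through a fixed-depth tree can be drawn. It is an illustration only and not code from this repository: the `Node` class, the `sample_path` function, and the `gamma` concentration parameter name are assumptions made for the example.

```python
import random


class Node:
    """A tree node; `customers` counts the documents whose path passes through it."""

    def __init__(self):
        self.customers = 0
        self.children = []


def sample_path(root, depth, gamma, rng):
    """Draw a root-to-leaf path with `depth` nodes and seat one document along it."""
    root.customers += 1
    path = [root]
    node = root
    for _ in range(depth - 1):
        # An existing child is chosen in proportion to its customer count,
        # a brand-new child in proportion to gamma.
        weights = [child.customers for child in node.children] + [gamma]
        idx = rng.choices(range(len(weights)), weights=weights)[0]
        if idx == len(node.children):
            node.children.append(Node())
        node = node.children[idx]
        node.customers += 1
        path.append(node)
    return path


rng = random.Random(0)
root = Node()
paths = [sample_path(root, depth=3, gamma=1.0, rng=rng) for _ in range(20)]
print("documents:", root.customers, "branches at the root:", len(root.children))
```

Larger values of `gamma` make new branches more likely, so the tree grows wider as documents are added; the number of branches is never fixed in advance.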

[Hierarchical Topic Models and the Nested Chinese Restaurant Process](http://www.cs.columbia.edu/~blei/papers/BleiGriffithsJordanTenenbaum2003.pdf)

[The Nested Chinese Restaurant Process and Bayesian Nonparametric Inference of Topic Hierarchies](http://cocosci.berkeley.edu/tom/papers/ncrp.pdf)

Implementation
--------------

- [hlda/sampler.py](hlda/sampler.py) is the Gibbs sampler for hLDA inference. It is based on the implementation in [Mallet](http://mallet.cs.umass.edu/topics.php), which fixes the depth of the nCRP tree.
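With a fixed tree depth, one step of a collapsed Gibbs sampler reassigns each word to a level along its document's path. The sketch below shows the general form of that step under stated assumptions; it is not copied from `hlda/sampler.py`, and the function name, argument names, and the `alpha` and `eta` smoothing parameters are hypothetical choices for illustration.

```python
import random


def sample_level(word, doc_level_counts, path_word_counts, path_totals,
                 alpha, eta, vocab_size, rng):
    """Sample a level for one word; all counts must exclude the word being resampled."""
    weights = []
    for level, n_doc in enumerate(doc_level_counts):
        n_word = path_word_counts[level].get(word, 0)
        # (how much the document uses this level) * (how much this level's topic likes the word)
        weight = (alpha + n_doc) * (eta + n_word) / (vocab_size * eta + path_totals[level])
        weights.append(weight)
    return rng.choices(range(len(weights)), weights=weights)[0]


rng = random.Random(0)
doc_level_counts = [5, 2, 1]                       # words of this document per level
path_word_counts = [{"the": 40}, {"goal": 6}, {"penalty": 3}]  # word counts per topic on the path
path_totals = [100, 20, 5]                         # total words per topic on the path

draws = [
    sample_level("goal", doc_level_counts, path_word_counts, path_totals,
                 alpha=10.0, eta=0.1, vocab_size=50, rng=rng)
    for _ in range(200)
]
print({level: draws.count(level) for level in range(3)})
```

In this toy setup the word "goal" lands mostly on the middle level, because that is the only topic on the path that already contains it.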

Installation
------------

- Install the package with `pip install hlda`.
- An example notebook that infers the hierarchical topics on the BBC Insight corpus can be found in [notebooks/bbc_test.ipynb](notebooks/bbc_test.ipynb).
