Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

awesome-topic-models

✨ Awesome - A curated list of amazing Topic Models (implementations, libraries, and resources)
https://github.com/jonaschn/awesome-topic-models

  • gensim - Python library for topic modelling ![GitHub Repo stars](https://img.shields.io/github/stars/RaRe-Technologies/gensim?style=social)
  • scikit-learn - Python library for machine learning ![GitHub Repo stars](https://img.shields.io/github/stars/scikit-learn/scikit-learn?style=social)
  • tomotopy - Python extension for the Gibbs-sampling-based *tomoto*, which is written in C++ ![GitHub Repo stars](https://img.shields.io/github/stars/bab2min/tomotopy?style=social)
  • tomoto - Ruby extension for the Gibbs-sampling-based *tomoto*, which is written in C++ ![GitHub Repo stars](https://img.shields.io/github/stars/ankane/tomoto?style=social)
  • OCTIS - Python package to integrate, optimize and evaluate topic models ![GitHub Repo stars](https://img.shields.io/github/stars/MIND-Lab/OCTIS?style=social)
  • tmtoolkit - Python topic modeling toolkit with parallel processing power ![GitHub Repo stars](https://img.shields.io/github/stars/WZBSocialScienceCenter/tmtoolkit?style=social)
  • Mallet - Java-based package for topic modeling
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • BIDMach - CPU and GPU-accelerated machine learning library ![GitHub Repo stars](https://img.shields.io/github/stars/BIDData/BIDMach?style=social)
  • BigARTM - Fast topic modeling platform ![GitHub Repo stars](https://img.shields.io/github/stars/bigartm/bigartm?style=social)
  • TopicNet - A high-level Python interface for BigARTM library ![GitHub Repo stars](https://img.shields.io/github/stars/machine-intelligence-laboratory/TopicNet?style=social)
  • stm - R package for the Structural Topic Model (CTM in case of no covariates) [:page_facing_up:](https://github.com/bstewart/stm/blob/master/vignettes/stmVignette.pdf?raw=true)
  • RMallet - R package to interface with the Java machine learning tool MALLET ![GitHub Repo stars](https://img.shields.io/github/stars/mimno/RMallet?style=social)
  • R-lda - R implementation using collapsed Gibbs sampling
  • topicmodels - R package with interface to C code for LDA and CTM ![GitHub Repo stars](https://img.shields.io/github/stars/cran/topicmodels?style=social)
  • lda++ - C++ library for LDA and (fast) supervised LDA (sLDA/fsLDA) using variational inference [:page_facing_up:](https://dl.acm.org/doi/pdf/10.1145/2964284.2967237) [:page_facing_up:](http://www.cs.columbia.edu/~blei/papers/WangBleiFeiFei2009.pdf)
  • scikit-learn - Python library for machine learning ![GitHub Repo stars](https://img.shields.io/github/stars/scikit-learn/scikit-learn?style=social)
  • gensim - Python implementation using multi-pass [randomized SVD solver](https://arxiv.org/pdf/0909.4061.pdf) or a [one-pass merge algorithm](https://rdcu.be/cghAi)
  • SVDlibc - C implementation of SVD by Doug Rohde
  • sparsesvd - Python wrapper for SVDlibc
  • BIDMach - CPU and GPU-accelerated machine learning library ![GitHub Repo stars](https://img.shields.io/github/stars/BIDData/BIDMach?style=social)
  • scikit-learn - Python library for machine learning ![GitHub Repo stars](https://img.shields.io/github/stars/scikit-learn/scikit-learn?style=social)
  • gensim - Python implementation of [online NMF](https://arxiv.org/pdf/1604.02634.pdf)
  • BIDMach - CPU and GPU-accelerated machine learning library ![GitHub Repo stars](https://img.shields.io/github/stars/BIDData/BIDMach?style=social)
  • scikit-learn - Python library for machine learning ![GitHub Repo stars](https://img.shields.io/github/stars/scikit-learn/scikit-learn?style=social)
  • lda - Python implementation using collapsed Gibbs sampling that follows the scikit-learn interface [:page_facing_up:](https://www.pnas.org/content/pnas/101/suppl_1/5228.full.pdf)
  • lda-gensim - Python implementation using online variational inference [:page_facing_up:](https://proceedings.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf)
  • ldamulticore-gensim - Parallelized Python implementation using online variational inference [:page_facing_up:](https://proceedings.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf)
  • GibbsSamplingLDA-TopicModel4J - Java implementation using collapsed Gibbs sampling [:page_facing_up:](https://www.pnas.org/content/pnas/101/suppl_1/5228.full.pdf)
  • CVBLDA-TopicModel4J - Java implementation using collapsed variational Bayesian (CVB) inference [:page_facing_up:](https://papers.nips.cc/paper/2006/file/532b7cbe070a3579f424988a040752f2-Paper.pdf)
  • Mallet - Java-based package for topic modeling
  • gensim-wrapper-Mallet - Python wrapper for Mallet's implementation [:page_facing_up:](https://www.jmlr.org/papers/volume10/newman09a/newman09a.pdf) [:page_facing_up:](https://dl.acm.org/doi/pdf/10.1145/1557019.1557121)
  • PartiallyCollapsedLDA - Various fast parallelized samplers for LDA, including Partially Collapsed LDA, LightLDA, Partially Collapsed Light LDA and a very efficient Polya-Urn LDA
  • Vowpal Wabbit - C++ implementation using online variational Bayes inference [:page_facing_up:](https://proceedings.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf)
  • tomotopy - Python binding for C++ implementation using Gibbs sampling and different [term-weighting](https://www.aclweb.org/anthology/N10-1070.pdf) options [:page_facing_up:](https://www.jmlr.org/papers/volume10/newman09a/newman09a.pdf)
  • topicmodel-lib - Cython library for online/streaming LDA (Online VB, Online CVB0, Online CGS, Online OPE, Online FW, Streaming VB, Streaming OPE, Streaming FW, ML-OPE, ML-CGS, ML-FW)
  • jsLDA - JavaScript implementation of LDA topic modeling in the browser
  • lda-nodejs - Node.js implementation of LDA topic modeling
  • lda-purescript - PureScript, browser-based implementation of LDA topic modeling
  • TopicModels.jl - Julia implementation of LDA
  • turicreate - C++ [LDA](https://github.com/apple/turicreate/blob/master/userguide/text/README.md) and [aliasLDA](https://apple.github.io/turicreate/docs/api/generated/turicreate.topic_model.create.html) implementation with export to Apple's Core ML for use in iOS, macOS, watchOS, and tvOS apps
  • MeTA - C++ implementation of (parallel) collapsed [Gibbs sampling, CVB0 and SCVB](https://meta-toolkit.org/topic-models-tutorial.html)
  • Fugue - Java implementation of collapsed Gibbs sampling with slice sampling for hyper-parameter optimization
  • GA-LDA - R scripts using Genetic Algorithms (GA) for hyper-parameter optimization, based on Panichella [:page_facing_up:](https://doi.org/10.1016/j.infsof.2020.106411)
  • Search-Based-LDA - R scripts using Genetic Algorithms (GA) for hyper-parameter optimization by Panichella [:page_facing_up:](https://doi.org/10.1016/j.infsof.2020.106411)
  • Dodge - Python tuning tool that ignores redundant tunings [:page_facing_up:](https://arxiv.org/pdf/1902.01838.pdf)
  • LDADE - Python tuning tool using differential evolution [:page_facing_up:](https://arxiv.org/pdf/1608.08176.pdf)
  • ldatuning - R package to find optimal number of topics for LDA [:page_facing_up:](https://rpubs.com/siri/ldatuning)
  • Scalable - Scalable Hyperparameter Selection for LDA [:page_facing_up:](https://www.tandfonline.com/doi/full/10.1080/10618600.2020.1741378)
  • topic_interpretability - Computation of the semantic interpretability of topics produced by topic models [:page_facing_up:](https://aclanthology.org/E14-1056.pdf)
  • topic-coherence-sensitivity - Code to compute topic coherence for several topic cardinalities and aggregate scores across them [:page_facing_up:](https://aclanthology.org/N16-1057.pdf)
  • topic-model-diversity - A collection of topic diversity measures for topic modeling [:page_facing_up:](https://dl.acm.org/doi/abs/10.1007/978-3-030-80599-9_4)
  • LDA\* - Tencent's hybrid sampler that uses different samplers for different types of documents in combination with an asymmetric parameter server [:page_facing_up:](http://www.vldb.org/pvldb/vol10/p1406-yu.pdf)
  • FastLDA - C++ implementation of LDA [:page_facing_up:](https://dl.acm.org/doi/pdf/10.1145/1401890.1401960)
  • dmlc - Single- and multi-threaded C++ implementations of [lightLDA](https://arxiv.org/pdf/1412.1576.pdf), [F+LDA](https://arxiv.org/pdf/1412.4986v1.pdf), [AliasLDA](https://dl.acm.org/doi/pdf/10.1145/2623330.2623756), forestLDA and many more
  • SparseLDA - Java algorithm and data structure for evaluating Gibbs sampling distributions used in Mallet [:page_facing_up:](https://dl.acm.org/doi/pdf/10.1145/1557019.1557121)
  • warpLDA - Cache-efficient C++ LDA implementation that samples each token in O(1) [:page_facing_up:](https://arxiv.org/pdf/1510.08628.pdf)
  • lightLDA - C++ implementation using O(1) Metropolis-Hastings sampling [:page_facing_up:](https://arxiv.org/pdf/1412.1576.pdf)
  • F+LDA - C++ implementation of F+LDA using an appropriately modified Fenwick tree [:page_facing_up:](https://arxiv.org/pdf/1412.4986v1.pdf)
  • AliasLDA - C++ implementation using Metropolis-Hastings and the *alias* method [:page_facing_up:](https://dl.acm.org/doi/pdf/10.1145/2623330.2623756)
  • Yahoo-LDA - Yahoo!'s topic modelling framework [:page_facing_up:](https://dl.acm.org/doi/pdf/10.1145/2124295.2124312)
  • PLDA+ - Google's C++ implementation using data placement and pipeline processing [:page_facing_up:](https://dl.acm.org/doi/pdf/10.1145/1961189.1961198)
  • Familia - Apply inference on pre-trained SentenceLDA models [:warning:](https://github.com/baidu/Familia/issues/111) [:page_facing_up:](https://arxiv.org/pdf/1707.09823.pdf)
  • SaberLDA - GPU-based system that implements a sparsity-aware algorithm to achieve sublinear time complexity
  • GS-LDA-BIDMach - CPU and GPU-accelerated Scala implementation using Gibbs sampling
  • VB-LDA-BIDMach - CPU and GPU-accelerated Scala implementation using online variational Bayes inference
  • gensim - Python implementation using online variational inference [:page_facing_up:](http://proceedings.mlr.press/v15/wang11a/wang11a.pdf)
  • tomotopy - Python extension for C++ implementation using Gibbs sampling [:page_facing_up:](https://www.jmlr.org/papers/volume10/newman09a/newman09a.pdf)
  • Mallet - Java-based package for topic modeling
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • hca - C implementation of non-parametric topic models (HDP, HPYP-LDA, etc.) with focus on hyperparameter tuning
  • bnp - Cython reimplementation based on *online-hdp* following scikit-learn's API.
  • Scalable HDP - interesting paper
  • tomotopy - Python extension for C++ implementation using Gibbs sampling
  • Mallet - Java-based package for topic modeling
  • hlda - Python package based on *Mallet's* Gibbs sampler, with a fixed depth for the nCRP tree
  • hLDA - C implementation of hierarchical LDA by David Blei
  • tomotopy - Python extension for C++ implementation using Gibbs sampling based on FastDTM
  • FastDTM - Scalable C++ implementation using Gibbs sampling with Stochastic Gradient Langevin Dynamics (MCMC-based) [:page_facing_up:](https://arxiv.org/pdf/1602.06049.pdf)
  • ldaseqmodel-gensim - Python implementation using online variational inference [:page_facing_up:](https://proceedings.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf)
  • dtm-BigTopicModel - C++ engine for running large-scale topic models
  • tca - C implementation using Gibbs sampling with/without burstiness modelling [:page_facing_up:](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.705.1649&rep=rep1&type=pdf)
  • DETM - Python implementation of the Dynamic Embedded Topic Model [:page_facing_up:](https://arxiv.org/pdf/1907.05545.pdf)
  • gensim - Python implementation with online training (constant in memory w.r.t. the number of documents)
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • Matlab Topic Modeling Toolbox - Matlab implementations of LDA, ATM, HMM-LDA, LDA-COL (Collocation) models by Mark Steyvers and Tom Griffiths
  • Topic-Model - Simple Python implementation using Gibbs sampling
  • tomotopy - Python extension for C++ implementation using Gibbs sampling
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • Mallet - Java-based package for topic modeling
  • gensims_mallet_wrapper - Python wrapper for Mallet using gensim interface
  • STMT - Scala implementation by Daniel Ramage
  • topbox - Python wrapper for labeled LDA implementation of *Stanford TMT*
  • Labeled-LDA-Python - Python implementation (easy to use, does not scale)
  • JGibbLabeledLDA - Java implementation based on the popular [JGibbLDA](http://jgibblda.sourceforge.net) package
  • tomotopy - Python extension for C++ implementation using Gibbs sampling
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • STMT - Scala implementation of PLDA & PLDP by Daniel Ramage
  • tomotopy - Python extension for C++ implementation using Gibbs sampling
  • Mallet - Java-based package for topic modeling
  • tomotopy - Python extension for C++ implementation using Gibbs sampling
  • PTM - Prescription Topic Model for Traditional Chinese Medicine Prescriptions [:page_facing_up:](https://ieeexplore.ieee.org/abstract/document/8242679) (interesting benchmark models)
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • tomotopy - Python extension for C++ implementation using Gibbs sampling [:page_facing_up:](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.922)
  • ctm-c - C implementation of the correlated topic model by David Blei
  • BigTopicModel - C++ engine for running large-scale MedLDA models [:page_facing_up:](https://dl.acm.org/doi/10.1145/2487575.2487658)
  • stm - R package for the Structural Topic Model (CTM in case of no covariates) [:page_facing_up:](https://github.com/bstewart/stm/blob/master/vignettes/stmVignette.pdf?raw=true)
  • BigTopicModel - C++ engine for running large-scale MedLDA models [:page_facing_up:](https://dl.acm.org/doi/10.1145/2487575.2487658)
  • Constrained-RTM - Java implementation of Constrained RTM [:page_facing_up:](https://doi.org/10.1016/j.ins.2019.09.039)
  • R-lda - R implementation using collapsed Gibbs sampling
  • tomotopy - Python extension for C++ implementation using Gibbs sampling
  • R-lda - R implementation using collapsed Gibbs sampling
  • slda - Cython implementation of Gibbs sampling for LDA and various sLDA variants
  • YWWTools - Java-based package for various topic models by Weiwei Yang
  • sLDA - C++ implementation of supervised topic models with a categorical response.
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • Familia - Apply inference on pre-trained SentenceLDA models [:warning:](https://github.com/baidu/Familia/issues/111) [:page_facing_up:](https://arxiv.org/pdf/1707.09823.pdf)
  • GPyM_TM - Python implementation of DMM and Poisson model
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • jLDADMM - Java implementation using collapsed Gibbs sampling [:page_facing_up:](https://arxiv.org/pdf/1808.03835.pdf)
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • tomotopy - Python extension for C++ implementation using Gibbs sampling
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • TopicModel4J - Java implementation using collapsed Gibbs sampling
  • BTM - Original C++ implementation using collapsed Gibbs sampling [:page_facing_up:](https://raw.githubusercontent.com/xiaohuiyan/xiaohuiyan.github.io/master/paper/BTM-WWW13.pdf)
  • BurstyBTM - Original C++ implementation of the Bursty BTM (BBTM) [:page_facing_up:](https://raw.githubusercontent.com/xiaohuiyan/xiaohuiyan.github.io/master/paper/BBTM-AAAI15.pdf)
  • R-BTM - R package wrapping the C++ code from BTM
  • STTM - Java implementation and evaluation of DMM, WNTM, PTM, ETM, GPU-DMM, GPU-DPMM, LF-DMM [:page_facing_up:](https://arxiv.org/pdf/1904.07695.pdf)
  • SATM - Java implementation of Self-Aggregation Topic Model [:page_facing_up:](https://dl.acm.org/doi/10.5555/2832415.2832564)
  • shorttext - Python implementation of various algorithms for Short Text Mining
  • trLDA - Python implementation of streaming LDA based on trust-regions [:page_facing_up:](http://proceedings.mlr.press/v37/theis15.pdf)
  • Logistic LDA - TensorFlow implementation of Discriminative Topic Modeling with Logistic LDA [:page_facing_up:](https://proceedings.neurips.cc/paper/2019/file/54ebdfbbfe6c31c39aaba9a1ee83860a-Paper.pdf)
  • EnsTop - Python implementation of *ENS*emble *TOP*ic modelling with pLSA
  • Dual-Sparse Topic Model - implemented in TopicModel4J using collapsed variational Bayes inference [:page_facing_up:](https://dl.acm.org/doi/10.1145/2566486.2567980)
  • Multi-Grain-LDA - MG-LDA implemented in tomotopy using collapsed Gibbs sampling [:page_facing_up:](https://dl.acm.org/doi/10.1145/1367497.1367513)
  • lda++ - C++ library for LDA and (fast) supervised LDA (sLDA/fsLDA) using variational inference [:page_facing_up:](https://dl.acm.org/doi/pdf/10.1145/2964284.2967237) [:page_facing_up:](http://www.cs.columbia.edu/~blei/papers/WangBleiFeiFei2009.pdf)
  • discLDA - C++ implementation of discLDA based on GibbsLDA++ [:page_facing_up:](https://papers.nips.cc/paper/2008/file/7b13b2203029ed80337f27127a9f1d28-Paper.pdf)
  • GuidedLDA - Python implementation that can be guided by setting some seed words per topic (using Gibbs sampling) [:page_facing_up:](https://www.aclweb.org/anthology/E12-1021.pdf)
  • seededLDA - R package that implements seeded-LDA for semi-supervised topic modeling
  • keyATM - R package for Keyword Assisted Topic Models.
  • hca - C implementation of non-parametric topic models (HDP, HPYP-LDA, etc.) with focus on hyperparameter tuning
  • BayesPA - Python interface for streaming implementation of MedLDA, maximum entropy discrimination LDA (max-margin supervised topic model) [:page_facing_up:](http://proceedings.mlr.press/v32/shi14.pdf)
  • sailing-pmls - Parallel LDA and medLDA implementation
  • BigTopicModel - C++ engine for running large-scale MedLDA models [:page_facing_up:](https://dl.acm.org/doi/10.1145/2487575.2487658)
  • DAPPER - Python implementation of Dynamic Author Persona (DAP) topic model [:page_facing_up:](https://arxiv.org/pdf/1811.01931.pdf)
  • ToT - Python implementation of Topics Over Time (A Non-Markov Continuous-Time Model of Topical Trends) [:page_facing_up:](https://dl.acm.org/doi/10.1145/1150402.1150450)
  • MLTM - C implementation of multilabel topic model (MLTM) [:page_facing_up:](https://www.mitpressjournals.org/doi/pdf/10.1162/NECO_a_00939)
  • sequence-models - Java implementation of block HMM and the mixed membership Markov model (M4)
  • Entropy-Based Topic Modeling - Java implementation of Entropy-Based Topic Modeling on Multiple Domain-Specific Text Collections
  • ST-LDA - Single Topic LDA [:page_facing_up:](https://ywwbill.github.io/files/2016_socinfo_topicDynamic.pdf)
  • MTM - Java implementation of Multilingual Topic Model [:page_facing_up:](https://www.aclweb.org/anthology/D19-1120.pdf)
  • YWWTools - Java-based package for various topic models by Weiwei Yang
  • TEM - Topic Expertise Model [:page_facing_up:](https://dl.acm.org/doi/pdf/10.1145/2505515.2505720)
  • PTM - Prescription Topic Model for Traditional Chinese Medicine Prescriptions [:page_facing_up:](https://ieeexplore.ieee.org/abstract/document/8242679) (interesting benchmark models)
  • KGE-LDA - Knowledge Graph Embedding LDA [:page_facing_up:](https://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/viewFile/14170/14086)
  • LDA-SP - A Latent Dirichlet Allocation Method for Selectional Preferences [:page_facing_up:](https://www.aclweb.org/anthology/P10-1044.pdf)
  • LDA+FFT - LDA and FFTs (Fast and Frugal Trees) for better comprehensibility [:page_facing_up:](https://arxiv.org/pdf/1804.10657.pdf)
  • BERTopic - BERTopic supports guided, (semi-) supervised, and dynamic topic modeling and visualization [:page_facing_up:](https://arxiv.org/pdf/2203.05794.pdf)
  • CTM - CTMs combine contextualized embeddings (e.g., BERT) with topic models
  • ETM - Embedded Topic Model [:page_facing_up:](https://arxiv.org/pdf/1907.04907.pdf)
  • D-ETM - Dynamic Embedded Topic Model [:page_facing_up:](https://arxiv.org/pdf/1907.05545.pdf)
  • ProdLDA - Original TensorFlow implementation of Autoencoding Variational Inference (AEVI) for Topic Models [:page_facing_up:](https://arxiv.org/pdf/1703.01488.pdf)
  • pytorch-ProdLDA - PyTorch implementation of ProdLDA [:page_facing_up:](https://arxiv.org/pdf/1703.01488.pdf)
  • CatE - Discriminative Topic Mining via Category-Name Guided Text Embedding [:page_facing_up:](https://arxiv.org/pdf/1908.07162.pdf)
  • Top2Vec - Python implementation that learns jointly embedded topic, document and word vectors [:page_facing_up:](https://arxiv.org/pdf/2008.09470.pdf)
  • lda2vec - Mixing Dirichlet topic models and word embeddings to make lda2vec [:page_facing_up:](https://arxiv.org/pdf/1605.02019.pdf)
  • lda2vec-pytorch - PyTorch implementation of lda2vec
  • G-LDA - Java implementation of Gaussian LDA using word embeddings [:page_facing_up:](https://www.aclweb.org/anthology/P15-1077.pdf)
  • MG-LDA - Python implementation of (Multi-lingual) Gaussian LDA [:page_facing_up:](https://raw.githubusercontent.com/EliasKB/Multilingual-Gaussian-Latent-Dirichlet-Allocation-MGLDA/master/MGLDA.pdf)
  • MetaLDA - Java implementation using Gibbs sampling that leverages document metadata and word embeddings [:page_facing_up:](https://arxiv.org/pdf/1709.06365.pdf)
  • LFTM - Java implementation of latent feature topic models (improving LDA and DMM with word embeddings) [:page_facing_up:](https://www.aclweb.org/anthology/Q15-1022.pdf)
  • CorEx - Recover latent factors with Correlation Explanation (CorEx) [:page_facing_up:](https://arxiv.org/pdf/1406.1222.pdf)
  • Anchored CorEx - Hierarchical Topic Modeling with Minimal Domain Knowledge [:page_facing_up:](https://arxiv.org/pdf/1611.10277.pdf)
  • Linear CorEx - Latent Factor Models Based on Linear Total CorEx [:page_facing_up:](https://arxiv.org/pdf/1706.03353v3.pdf)
  • Stan - Platform for statistical modeling and high-performance statistical computation, e.g., [LDA](https://mc-stan.org/docs/2_26/stan-users-guide/latent-dirichlet-allocation.html) [:page_facing_up:](https://files.eric.ed.gov/fulltext/ED590311.pdf)
  • PyMC3 - Python package for Bayesian statistical modeling and probabilistic machine learning, e.g., [LDA](http://docs.pymc.io/notebooks/lda-advi-aevb.html) [:page_facing_up:](https://peerj.com/articles/cs-55.pdf)
  • Turing.jl - Julia library for general-purpose probabilistic programming [:page_facing_up:](http://proceedings.mlr.press/v84/ge18b/ge18b.pdf)
  • TFP - Probabilistic reasoning and statistical analysis in TensorFlow, e.g., [LDA](https://github.com/tensorflow/probability/blob/master/tensorflow_probability/examples/latent_dirichlet_allocation_distributions.py) [:page_facing_up:](https://arxiv.org/pdf/2001.11819.pdf)
  • edward2 - Simple PPL with core utilities in the NumPy and TensorFlow ecosystem [:page_facing_up:](https://arxiv.org/pdf/1811.02091.pdf)
  • pyro - PPL built on PyTorch, e.g., [prodLDA](http://pyro.ai/examples/prodlda.html) [:page_facing_up:](https://www.jmlr.org/papers/volume20/18-403/18-403.pdf)
  • edward - A PPL built on TensorFlow, e.g., [LDA](http://edwardlib.org/iclr2017?Figure%2011.%20Latent%20Dirichlet%20allocation) [:page_facing_up:](https://arxiv.org/pdf/1610.09787.pdf)
  • ZhuSuan - A PPL for Bayesian deep learning and generative models, built on TensorFlow, e.g., [LDA](https://zhusuan.readthedocs.io/en/latest/tutorials/lntm.html) [:page_facing_up:](https://arxiv.org/pdf/1709.05870.pdf)
  • lda-c - C implementation using variational EM by David Blei
  • sLDA - C++ implementation of supervised topic models with a categorical response.
  • onlineldavb - Python online variational Bayes implementation by Matthew Hoffman [:page_facing_up:](https://proceedings.neurips.cc/paper/2010/file/71f6278d140af599e06ad9bf1ba03cb0-Paper.pdf)
  • HDP - C++ implementation of hierarchical Dirichlet processes by Chong Wang
  • online-hdp - Python implementation of online hierarchical Dirichlet processes by Chong Wang
  • ctr - C++ implementation of collaborative topic models by Chong Wang
  • dtm - C implementation of dynamic topic models by David Blei & Sean Gerrish
  • ctm-c - C implementation of the correlated topic model by David Blei
  • diln - C implementation of Discrete Infinite Logistic Normal (with HDP option) by John Paisley
  • hLDA - C implementation of hierarchical LDA by David Blei
  • turbotopics - Python implementation that finds significant multiword phrases in topics by David Blei
  • Stanford Topic Modeling Toolbox - Scala implementation of LDA, labeledLDA, PLDA, PLDP by Daniel Ramage and Evan Rosen
  • LDAGibbs - Java implementation of LDA using Gibbs sampling by Liu Yang
  • Matlab Topic Modeling Toolbox - Matlab implementations of LDA, ATM, HMM-LDA, LDA-COL (Collocation) models by Mark Steyvers and Tom Griffiths
  • cvbLDA - Python C extension implementation of collapsed variational Bayesian inference for LDA
  • fast - A Fast And Scalable Topic-Modeling Toolbox (Fast-LDA, CVB0) by Arthur Asuncion and colleagues [:page_facing_up:](https://arxiv.org/pdf/1205.2662.pdf)
  • Stanford Topic Modeling Toolbox - Scala implementation of LDA, labeledLDA, PLDA, PLDP by Daniel Ramage and Evan Rosen
  • Matlab Topic Modeling Toolbox - Matlab implementations of LDA, ATM, HMM-LDA, LDA-COL (Collocation) models by Mark Steyvers and Tom Griffiths
  • GibbsLDA++ - C++ implementation using Gibbs sampling [:page_facing_up:](https://dl.acm.org/doi/pdf/10.1145/1367497.1367510)
  • JGibbLDA - Java implementation using Gibbs sampling
  • Mr.LDA - Scalable Topic Modeling using Variational Inference in MapReduce [:page_facing_up:](https://dl.acm.org/doi/10.1145/2187836.2187955)
  • topic_models - Python implementation of LSA, PLSA and LDA
  • Topic-Model - Python implementation of LDA, Labeled LDA, ATM, Temporal Author-Topic Model using Gibbs sampling
  • LDAvis - R package for interactive topic model visualization
  • pyLDAvis - Python library for interactive topic model visualization
  • scalaLDAvis - Scala port of pyLDAvis
  • dtmvisual - Python package for visualizing DTM (trained with gensim)
  • TMVE online - Online Django variant of topic model visualization engine (*TMVE*)
  • TMVE - Original topic model visualization engine (LDA trained with *lda-c*) [:page_facing_up:](https://www.aaai.org/ocs/index.php/ICWSM/ICWSM12/paper/viewFile/4645/5021)
  • topicmodel-lib - Cython library for online/streaming LDA (Online VB, Online CVB0, Online CGS, Online OPE, Online FW, Streaming VB, Streaming OPE, Streaming FW, ML-OPE, ML-CGS, ML-FW)
  • wordcloud - Python package for visualizing topics via word_cloud
  • Mallet-GUI - GUI for creating and analyzing topic models produced by MALLET
  • TWiC - Topic Words in Context is a highly-interactive, browser-based visualization for MALLET topic models
  • dfr-browser - Explore Mallet's topic models of texts in a web browser
  • Termite - Explore topic models using term-topic matrix, group-in-a-box visualization or scatter plot.
  • Topics - Python library for topic modeling and visualization
  • TopicsExplorer - Explore your own text collection with a topic model – without prior knowledge [:page_facing_up:](https://dh2018.adho.org/a-graphical-user-interface-for-lda-topic-modeling)
  • topicApp - A Simple Shiny App for Topic Modeling
  • stminsights - A Shiny Application for Inspecting Structural Topic Models
  • Slice sampling
  • Minka
  • fastfit
  • dirichlet
  • lightspeed
  • lecture-notes
  • Newton-Raphson Method
  • fixed-point iteration - Wallach's PhD thesis, chapter 2.3
  • David Blei - David Blei's Homepage with introductory materials
  • awesome-machine-learning
  • awesome-datascience
  • awesome-python-data-science
  • ![CC0