https://github.com/maximtrp/tmplot
Visualization of Topic Modeling Results
data-science data-visualization machine-learning plotting python topic-modeling visualization
- Host: GitHub
- URL: https://github.com/maximtrp/tmplot
- Owner: maximtrp
- License: mit
- Created: 2021-05-24T15:50:46.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2025-09-13T11:13:29.000Z (5 months ago)
- Last Synced: 2025-09-13T13:37:55.370Z (5 months ago)
- Topics: data-science, data-visualization, machine-learning, plotting, python, topic-modeling, visualization
- Language: Python
- Homepage: https://tmplot.readthedocs.org
- Size: 17.4 MB
- Stars: 22
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# tmplot
**tmplot** is a Python package for **topic modeling analysis and visualization**. Aimed at data scientists and researchers, it provides interactive reports plus stability, comparison, saliency, and entropy analytics that go beyond what LDAvis/pyLDAvis offer.
**Analyze** • **Visualize** • **Compare** multiple topic models with ease

## Key Features
### Interactive Visualization
- **Topic scatter plots** with customizable coordinates and sizing
- **Term probability charts** with relevance weighting
- **Document analysis** showing top documents per topic
- **Interactive reports** with real-time parameter adjustment
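To illustrate what "relevance weighting" means for the term charts, here is a minimal pure-Python sketch of the relevance score as defined for LDAvis: `lam * log P(term | topic) + (1 - lam) * log(P(term | topic) / P(term))`. The helper names and the toy probabilities are hypothetical, and whether tmplot computes relevance in exactly this way is an assumption.

```python
import math

def relevance(p_term_given_topic, p_term, lam=0.6):
    """LDAvis-style relevance of a term for one topic."""
    lift = p_term_given_topic / p_term
    return lam * math.log(p_term_given_topic) + (1 - lam) * math.log(lift)

# Toy numbers (hypothetical): P(term | topic) and marginal P(term)
topic_probs = {"model": 0.10, "data": 0.08, "tensor": 0.05}
marginal_probs = {"model": 0.09, "data": 0.10, "tensor": 0.01}

def rank_terms(lam):
    """Return terms sorted by decreasing relevance for a given lam."""
    scores = {
        term: relevance(topic_probs[term], marginal_probs[term], lam)
        for term in topic_probs
    }
    return sorted(scores, key=scores.get, reverse=True)

ranking_by_probability = rank_terms(1.0)  # lam = 1: order by P(term | topic)
ranking_by_lift = rank_terms(0.0)         # lam = 0: order by P(term | topic) / P(term)
```

With `lam` close to 1 the ranking favors frequent terms; with `lam` close to 0 it favors terms that are rare overall but concentrated in the topic.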
### Advanced Analytics
- **Topic stability analysis** across multiple model runs
- **Model comparison** with sophisticated distance metrics
- **Saliency calculations** for term importance
- **Entropy metrics** for model optimization
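As a rough illustration of the saliency idea, the sketch below follows the definition by Chuang et al. (2012): a term's overall probability multiplied by its distinctiveness, where distinctiveness is the Kullback-Leibler divergence between `P(topic | term)` and `P(topic)`. The function name, the row-per-topic matrix layout, and the toy data are assumptions made for this example and do not describe tmplot's internals.

```python
import math

def term_saliency(phi, topic_weights):
    """phi[t][w] = P(w | t); topic_weights[t] = P(t). Returns one score per term."""
    n_terms = len(phi[0])
    saliency = []
    for w in range(n_terms):
        # Marginal probability of the term over all topics
        p_w = sum(p_t * row[w] for row, p_t in zip(phi, topic_weights))
        if p_w == 0:
            saliency.append(0.0)
            continue
        distinctiveness = 0.0
        for row, p_t in zip(phi, topic_weights):
            p_t_given_w = row[w] * p_t / p_w  # Bayes' rule
            if p_t_given_w > 0:
                distinctiveness += p_t_given_w * math.log(p_t_given_w / p_t)
        saliency.append(p_w * distinctiveness)
    return saliency

# Two topics over three terms (hypothetical data)
phi = [[0.5, 0.5, 0.0],
       [0.0, 0.5, 0.5]]
topic_weights = [0.5, 0.5]
scores = term_saliency(phi, topic_weights)
```

In this toy case the middle term is shared equally by both topics, so it tells you nothing about the topic and its saliency is zero, while the two topic-specific terms get equal positive scores.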
### Model Support
- **[tomotopy](https://bab2min.github.io/tomotopy/)**: `LDAModel`, `LLDAModel`, `CTModel`, `DMRModel`, `HDPModel`, `PTModel`, `SLDAModel`, `GDMRModel`
- **[gensim](https://radimrehurek.com/gensim/)**: `LdaModel`, `LdaMulticore`
- **[bitermplus](https://github.com/maximtrp/bitermplus)**: `BTM`
### Distance Metrics
- Kullback-Leibler (symmetric & non-symmetric)
- Jensen-Shannon divergence
- Jeffrey's divergence
- Hellinger & Bhattacharyya distances
- Total variation distance
- Jaccard index
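For orientation, a few of these metrics can be sketched in pure Python using their textbook definitions. The function names are hypothetical, and tmplot's own implementations may differ in normalization or logarithm base.

```python
import math

def kl_divergence(p, q):
    """Non-symmetric Kullback-Leibler divergence; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence: average KL divergence to the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def hellinger(p, q):
    """Hellinger distance, bounded in [0, 1]."""
    squared = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(squared / 2)

def total_variation(p, q):
    """Total variation distance, bounded in [0, 1]."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Two toy topic-term distributions over the same three-term vocabulary
topic_a = [0.7, 0.2, 0.1]
topic_b = [0.1, 0.2, 0.7]
js_ab = jensen_shannon(topic_a, topic_b)
tv_ab = total_variation(topic_a, topic_b)
```

Unlike plain Kullback-Leibler divergence, the Jensen-Shannon, Hellinger, and total variation measures are symmetric and stay finite even when the two topics share no terms.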
### Dimensionality Reduction
t-SNE, SpectralEmbedding, MDS, LocallyLinearEmbedding, Isomap
## Donate
If you find this package useful, please consider donating any amount of money. This will help me spend more time on supporting open-source software.
## Quick Start
### Installation
```bash
# From PyPI (recommended)
pip install tmplot
# Development version
pip install git+https://github.com/maximtrp/tmplot.git
```
### Basic Usage
```python
import tmplot as tmp
# Load your topic model and documents
model = your_fitted_model # tomotopy, gensim, or bitermplus
docs = your_documents
# Create interactive report
tmp.report(model, docs=docs)
# Or create individual visualizations
coords = tmp.prepare_coords(model)
tmp.plot_scatter_topics(coords, size_col='size')
```
## Advanced Examples
### Get Stable Topics
```python
import tmplot as tmp
# Find stable topics across multiple models
models = [model1, model2, model3, model4]
closest_topics, distances = tmp.get_closest_topics(models)
stable_topics, stable_distances = tmp.get_stable_topics(closest_topics, distances)
```
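The idea behind stable topics can be sketched conceptually: match each topic of a reference model to its nearest topic in every other model, then keep the reference topics whose matches are all close. The pure-Python code below is only an illustration under that assumption. The helper names, the choice of total variation distance, and the threshold are hypothetical, and tmplot's actual procedure may differ.

```python
def total_variation(p, q):
    """Total variation distance between two term distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def closest_topics(reference, other):
    """For each reference topic, return (index, distance) of its nearest topic in `other`."""
    matches = []
    for ref_topic in reference:
        dists = [total_variation(ref_topic, topic) for topic in other]
        best = min(range(len(other)), key=dists.__getitem__)
        matches.append((best, dists[best]))
    return matches

def stable_topic_ids(reference, others, threshold):
    """Indices of reference topics matched within `threshold` in every other model."""
    matches_per_model = [closest_topics(reference, other) for other in others]
    return [
        i for i in range(len(reference))
        if all(matches[i][1] <= threshold for matches in matches_per_model)
    ]

# Toy topic-term matrices from three runs (rows are topics)
run_a = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]
run_b = [[0.1, 0.2, 0.7], [0.7, 0.2, 0.1]]      # same topics, different order
run_c = [[0.75, 0.15, 0.10], [0.3, 0.4, 0.3]]   # second topic has drifted

matches_ab = closest_topics(run_a, run_b)
stable = stable_topic_ids(run_a, [run_b, run_c], threshold=0.2)
```

Matching by distance rather than by index matters because topic order is arbitrary between runs: here the two topics of `run_b` are swapped relative to `run_a`, yet both are recovered.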
### Analyze Model
```python
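import tmplot as tmp
# Topic-term (phi) and document-topic (theta) matrices of a fitted model
phi = tmp.get_phi(model)
theta = tmp.get_theta(model)
# Calculate entropy for model selection
entropy_score = tmp.entropy(phi)
# Find salient terms
saliency = tmp.get_salient_terms(phi, theta)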
```
### Visualize
```python
import tmplot as tmp
# phi and theta as returned by tmp.get_phi(model) and tmp.get_theta(model)
# Create topic distance matrix with different metrics
topic_dists = tmp.get_topics_dist(phi, method='jensen-shannon')
# Generate coordinates with custom algorithm
coords = tmp.get_topics_scatter(topic_dists, theta, method='tsne')
tmp.plot_scatter_topics(coords, topic=3) # Highlight topic 3
```
## Documentation & Examples
- [Complete Tutorial](https://tmplot.readthedocs.io/en/latest/tutorial.html) - Step-by-step guide
- [API Reference](https://tmplot.readthedocs.io/) - Full documentation
- [Example Notebooks](https://github.com/maximtrp/tmplot/tree/main/examples) - Jupyter examples
## Requirements
**Core dependencies:** `numpy`, `scipy`, `scikit-learn`, `pandas`, `altair`, `ipywidgets`
**Optional models:** `tomotopy`, `gensim`, `bitermplus`
