https://github.com/arnim/stm-methodshub-tutorial
https://github.com/arnim/stm-methodshub-tutorial
Last synced: 20 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/arnim/stm-methodshub-tutorial
- Owner: arnim
- License: mit
- Created: 2026-03-30T20:19:13.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-03-30T21:21:06.000Z (3 months ago)
- Last Synced: 2026-03-30T22:15:12.872Z (3 months ago)
- Language: R
- Size: 48.7 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
Awesome Lists containing this project
README
# Structured Topic Modeling of UN General Debate Speeches in R
[](https://mybinder.org/v2/gh/arnim/stm-methodshub-tutorial/main?labpath=stm_un_tutorial.ipynb)
[](https://hub.nfdi-jupyter.de/v2/gh/arnim/stm-methodshub-tutorial/main?labpath=stm_un_tutorial.ipynb)
This repository contains a Methods Hub-style Jupyter notebook tutorial for **Structured Topic Models (STM)** in **R** using **IRkernel**.
## Tutorial source file
- `stm_un_tutorial.ipynb`
## What this tutorial does
The tutorial closely follows the original notebook in `arnim/stmdemo`:
- source notebook:
It uses:
- the **same dataset** (`UNgeneraldebate.corpus.RData`)
- the **same pre-fitted STM model** (`UNgeneraldebate.stm.RData`)
- the **same topic parametrization** (`K = 40`)
The tutorial is based on the original UN General Debate Corpus release used in the source notebook. For readers looking for the dataset record today, the persistent Harvard Dataverse DOI is:
-
This DOI refers to the same original dataset record first published in 2017, but it now resolves to newer updated versions of the corpus. So it is the same dataset lineage / persistent identifier, not necessarily the exact same file version originally used in the notebook.
The only substantive code update is a small preprocessing change required for compatibility with current versions of `quanteda`.
## Repository structure
- `stm_un_tutorial.ipynb` – tutorial notebook
- `data/` – data files used by the tutorial
- `binder/` – Binder and Methods Hub execution files
- `renv.lock` – local R environment lockfile
- `renv/` – project-local `renv` infrastructure
## Reproducibility
### Local with renv
Open the project in R and run:
```r
renv::restore()
```
To register the R Jupyter kernel locally if needed:
```r
IRkernel::installspec(user = TRUE, name = "ir-renv", displayname = "R (renv IRkernel)")
```
### Binder / Methods Hub
The repository includes:
- `binder/runtime.txt`
- `binder/install.R`
- `binder/postBuild`
These files follow the Methods Hub tutorial guidelines:
## Rendering / execution
If Quarto is available, the notebook can be rendered with:
```bash
quarto render stm_un_tutorial.ipynb --execute
```
Or executed with Jupyter:
```bash
jupyter nbconvert --to notebook --execute --inplace stm_un_tutorial.ipynb
```
## License
MIT License. See `LICENSE`.
## Citation
See `CITATION.cff`.
## Acknowledgment
Parts of this tutorial repository were prepared with assistance from the [pi coding agent](https://shittycodingagent.ai) using OpenAI GPT-5.4. Final review and responsibility remain with the author.