https://github.com/jannis-baum/topic-evolution-model
Graph-based temporal topic modeling for very small corpora
- Host: GitHub
- URL: https://github.com/jannis-baum/topic-evolution-model
- Owner: jannis-baum
- License: gpl-3.0
- Created: 2023-01-13T08:48:20.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-04-16T15:31:58.000Z (about 2 years ago)
- Last Synced: 2024-04-30T02:50:21.251Z (about 2 years ago)
- Topics: graph-based-model, nlp, topic-flow, topic-modeling
- Language: C++
- Homepage:
- Size: 892 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
# Topic Evolution Model
TEM is a graph-based temporal topic model for very small corpora. Detailed
information about TEM can be found in the following publication:
> Liebe L, Baum J, Cech T, Scheibel W, and Dollner J (2024). Detecting and
> Comparing LLM Capabilities to Human Writers through Linguistic Analysis
## Usage
To use TEM, first build the model executable and the Docker container for the
term distance server.
- Build the model:
```sh
mkdir build && cd build
cmake .. && make
```
An executable will be compiled into `build/topic_evolution_model`.
- Build the term distance server:
```sh
make -C term-distance build
```
Note that this may take a while since it downloads word2vec models (around
8 GB in total).
Once this is set up, you can start using TEM.
- Run the term distance server, e.g. with
```sh
make -C term-distance run
```
This will expose the server on `localhost:8000`. To stop the server, run
```sh
make -C term-distance kill
```
- You can then run the model executable (`build/topic_evolution_model`)
directly or use the provided Python interface in [the `script/`
directory](./script), which also supports parallelization.\
Note that you need to specify the URL of the term distance server via the
environment variable `TEM_WORD_DISTANCE_ENDPOINT`, e.g. when running the
executable directly:
```sh
TEM_WORD_DISTANCE_ENDPOINT=http://localhost:8000/similarity \
build/topic_evolution_model [...]
```
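If you run the executable often, the steps above could be wrapped in a small shell helper. The following is only a sketch and not part of the repository: the function name `run_tem` and the variable `TEM_BIN` are hypothetical, and the default endpoint assumes the local server started with `make -C term-distance run`. Arguments to the model are passed through unchanged.

```sh
#!/bin/sh
# Hypothetical helper, not part of the repository.

# Keep an endpoint that is already set; otherwise assume the local
# term distance server from `make -C term-distance run`.
: "${TEM_WORD_DISTANCE_ENDPOINT:=http://localhost:8000/similarity}"
export TEM_WORD_DISTANCE_ENDPOINT

# TEM_BIN is an assumed override for the executable's location;
# the default is the path produced by the build step.
TEM_BIN="${TEM_BIN:-build/topic_evolution_model}"

run_tem() {
    # Fail early with a clear message if the model has not been built.
    if [ ! -x "$TEM_BIN" ]; then
        echo "executable not found: $TEM_BIN (build it first)" >&2
        return 1
    fi
    # Pass all arguments through to the model executable unchanged.
    "$TEM_BIN" "$@"
}
```

After sourcing this file, you would call `run_tem` with the same arguments you would otherwise pass to the executable. Setting `TEM_WORD_DISTANCE_ENDPOINT` beforehand points the model at a different server.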