https://github.com/xianneuro/narraters
Human-in-the-loop pipeline and web UI for narrative recall analysis (transcribe, segment, correct, parse, match, causal rate).
cognitive-neuroscience event-segmentation flask free-recall human-rater memory-research narrative-recall nlp open-source python
- Host: GitHub
- URL: https://github.com/xianneuro/narraters
- Owner: xianNeuro
- License: other
- Created: 2026-05-16T02:48:47.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-05-31T22:55:18.000Z (27 days ago)
- Last Synced: 2026-06-01T00:19:13.651Z (27 days ago)
- Topics: cognitive-neuroscience, event-segmentation, flask, free-recall, human-rater, memory-research, narrative-recall, nlp, open-source, python
- Language: Python
- Homepage: https://xianneuro.github.io/narRaters/
- Size: 101 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
narRaters
Turn complex narratives into structured, reviewable data, with a web UI at every step
GitHub: github.com/xianNeuro/narRaters · PyPI: v0.3.14
Project home · v0.3.14 · Tutorial (PDF) · Issues · Cite · Feedback
**narRaters** (*narrative* + *raters*) is open-source software on [GitHub (xianNeuro/narRaters)](https://github.com/xianNeuro/narRaters) that helps process complex narratives (e.g., audiobooks, text-based stories, interviews, and conversations) for memory, language processing, causal reasoning, and LLM research.
Imagine you ran a memory study: participants listened to a story, then recalled what they remembered (spoken or typed). Before you can analyze memory, you need structured data: what happened in the story, what each person recalled, and how those pieces connect.
**narRaters** helps you get there. It runs common narrative-processing steps (transcribe audio, split a story into events, clean up recall text, parse recalls into clauses, match recalls back to story events, rate causal links between events) and gives you a web interface to review and fix outputs before exporting.
Works for audio or text, stories or other long narratives (including movie annotations), and human-only or human-vs-LLM workflows.
| You have… | narRaters helps you… |
|---|---|
| Story audio or transcript | Transcribe it and break it into numbered **events** |
| Participant recall files | Correct spelling, split into **clauses**, and **match** each clause to story events |
| A segmented story | **Rate causal links** between event pairs (did event A lead to event B?) |
| Automated or AI outputs | **Screen and edit** them in the browser, then export signed-off files |
---
## Get started in 3 steps
1. **Download & open.** Get the v0.3.14 ZIP, unzip, and double-click narRater.app (macOS) or narRaters_installer.bat (Windows). Needs Python 3.10+.
2. **Pick your pipeline.** Your browser opens to the pipeline builder. Drag in only the steps you need (e.g. segment → match → causal rate). Bundled demo data is already loaded so you can explore immediately.
3. **Run, review, export.** On the dashboard, click a cell to run a step. Open the magnifying-glass icon to inspect results, edit in the browser, and export when you are satisfied.
Or via terminal (Python 3.10+; no ZIP download). Run from the folder that contains your data/ and output/ directories (or set NARRATERS_PROJECT_ROOT to that path):
```bash
python --version # must show 3.10 or newer
python3 -m pip install narraters --upgrade # wait for "Successfully installed"
cd /path/to/your/project # folder with data/ and output/
narraters serve # browser opens to the pipeline builder
```
Then continue with **steps 2–3** above: pick your pipeline, run steps on the dashboard, review, and export.
> **First time?** Follow the illustrated **[Tutorial PDF](narRater_Tutorial.pdf)** or see [Installation](#installation) and [Troubleshooting](#troubleshooting).
---
## See the app
① Build a pipeline → ② Dashboard → ③ Rate causal links
① Pipeline dashboard

See every subject/story, run steps, and open results. Green = done; click a cell to process.
② Event segmentation

Move the cursor through the text and click to drop boundary bars. Toggle binary or 1–5 strength (bar colored blue to red).
③ Recall matching

Story events on the left; recall segments on the right. Click a segment, then click events to match, or type event numbers. Optionally turn on Further ratings for per-segment quality checkboxes.
④ Causal rating

Click a grid cell to rate how strongly one story event caused another (0–3 scale).
---
## Table of contents
- What is narRaters?
- Get started in 3 steps
- See the app
- Installation
  - ZIP download (double-click launcher)
  - PyPI (terminal)
  - Alternate install (command line)
  - Using the web UI
  - Troubleshooting
- Where to put your data
- Pipeline overview
- Command-line pipeline
  - Step 1: transcribe
  - Step 2: segment
  - Step 3: correct
  - Step 4: parse
  - Step 5: match
  - Step 6: rate
- Contributing tab
- Library / Python use
- Project layout
- Further reading
- Citation
- Acknowledgements
- Author
- License
> On GitHub, **README**, **Contributing** (research background, prompt templates, acknowledgements, author), and **License** are the tabs in the bar above. Use this table of contents or the **Outline** menu (list icon, top-right) to jump between README sections.
---
## Installation
Needs **[Python 3.10+](https://www.python.org/downloads/)**. Windows: check **"Add python.exe to PATH"** in the Python installer. If anything fails, see **[Troubleshooting](#troubleshooting)**.
### ZIP download (double-click launcher)
1. **[Download the ZIP (v0.3.14)](https://github.com/xianNeuro/narRaters/archive/refs/tags/v0.3.14.zip)** and unzip it, or on the [GitHub repo page](https://github.com/xianNeuro/narRaters) use the green **Code ▾** → **Download ZIP** for the current `main` branch. You'll get **`narRaters-0.3.14`**, **`narRaters-main`**, or **`narRaters`** (if you used `git clone`).
2. **Launch:** **macOS:** double-click **`narRater.app`**. **Windows:** double-click **`narRaters_installer.bat`**. **Linux:** in Terminal, `cd` into the folder and run `bash install.sh`.
3. Your browser opens **`http://127.0.0.1:5000/pipeline-config`** with bundled examples. Put your data in **`data/`**. Restart later by double-clicking the same launcher.
**macOS Gatekeeper or quarantine issues?** See **[Troubleshooting](#troubleshooting)**.
### PyPI (terminal)
```bash
python --version # must show 3.10 or newer
python3 -m pip install narraters --upgrade # use python3 -m pip, not bare pip
narraters serve # browser opens to the pipeline builder
```
On first launch, example **`data/`** and **`output/`** folders are copied into whatever directory you run from (unless you already have a project folder, or set **`NARRATERS_PROJECT_ROOT`**). Package: [`narraters`](https://pypi.org/project/narraters/) (all lowercase). For launchers and the tutorial PDF, use the ZIP install above.
Full PyPI setup (venv from scratch)
```bash
python3 --version # must be 3.10 or newer
mkdir -p ~/narRaters-demo && cd ~/narRaters-demo
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
python3 -m pip install --upgrade pip
python3 -m pip install narraters --upgrade
narraters serve
```
### Alternate install (command line)
git clone + install.sh (project folder with bundled examples)
```bash
# macOS / Linux
cd ~ && git clone https://github.com/xianNeuro/narRaters.git && cd narRaters && bash install.sh
```
```bat
:: Windows
cd %USERPROFILE% && git clone https://github.com/xianNeuro/narRaters.git && cd narRaters && narRaters_installer.bat
```
This is what `narRater.app` does under the hood, just without the click. `git: command not found`? On macOS: `xcode-select --install`. On Windows: install [Git for Windows](https://git-scm.com/download/win).
Optional extras (Whisper, cloud APIs, local Gemma, etc.)
Inside the project folder, with the venv activated:
```bash
python3 -m pip install -e ".[audio]" # Whisper transcription
python3 -m pip install -e ".[api]" # Anthropic / OpenAI
python3 -m pip install -e ".[nlp]" # spaCy segmentation
python3 -m pip install -e ".[grammar]" # grammar checker
python3 -m pip install -e ".[local-llm]" # local Gemma
python3 -m pip install -e ".[match]" # rmatch
python3 -m pip install -e ".[all]" # api + match
```
PyPI users: `python3 -m pip install "narraters[audio]"`, etc.
Heavy methods (`audio`, `local-llm`, `match`) pull multi-GB packages, so the app shows a RAM/disk preflight before downloading. **Ollama (local Gemma):** install [Ollama](https://ollama.com), then `ollama pull gemma4:e4b`. **API keys:** copy `.env.example` to `.env` and edit (see [`SETUP_API.md`](SETUP_API.md)).
Developers
`install.sh` already does an editable install. To work on the codebase:
```bash
git clone https://github.com/xianNeuro/narRaters.git
cd narRaters
python3 -m venv .venv && source .venv/bin/activate && python3 -m pip install -e .
```
Build the standalone macOS app for icon testing: `bash packaging/macos/build_app_bundle.sh`. Build the slim repo-root launcher: `bash packaging/macos/build_repo_app.sh`.
### Using the web UI
The app runs at **`http://127.0.0.1:5000`**. First visit opens **pipeline configuration**; if you already saved a pipeline, you land on the **dashboard**.
| Screen | Route | What you do there |
|--------|--------|-------------------|
| **Pipeline setup** | `/pipeline-config` | Drag steps into **Pipeline Flow**, set per-step **folders**, enter a **rater name** (or 🎲). **Continue** saves config and opens the dashboard. |
| **Dashboard** | `/` | Grid: **rows** = subjects or stories, **columns** = steps. **Click a cell** to run that step (pick **method / model / prompt** when offered). **Batch** runs one step across all rows. |
| **Detail view** | `/subject/…` or `/story/…` | **Tabs** per step for **one** row. Use the **version** dropdown to compare automated output vs your **`{id}_{ratername}-edit`** saves, then **edit** and **save**. |
**Flow:** setup → dashboard (bulk runs) → open a row to **inspect, hand-correct, or compare versions**.
narraters serve options
| Flag | Default | Purpose |
|---|---|---|
| `--port` | `5000` | Another port if `5000` is busy |
| `--host` | `127.0.0.1` | Bind address; use `0.0.0.0` only on a **trusted** network |
| `--no-browser` | off | Do not open a browser tab (SSH, headless) |
| `--debug` | off | Flask debug / auto-reload while hacking on the server |
```bash
narraters serve --port 8080 --no-browser
```
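If you start the server from a script (for example on a headless machine), the flags in the table above can be assembled programmatically. The sketch below is only an illustration: `build_serve_command` and `serve_environment` are hypothetical helper names, not part of the `narraters` package, and they use only the flags and the `NARRATERS_PROJECT_ROOT` variable described in this README.

```python
import os
import shlex


def build_serve_command(port=5000, host="127.0.0.1", open_browser=True, debug=False):
    """Return the argv list for `narraters serve` using the documented flags."""
    argv = ["narraters", "serve", "--port", str(port), "--host", host]
    if not open_browser:
        argv.append("--no-browser")  # useful over SSH or on headless machines
    if debug:
        argv.append("--debug")
    return argv


def serve_environment(project_root, base_env=None):
    """Copy the environment and point NARRATERS_PROJECT_ROOT at the project folder."""
    env = dict(os.environ if base_env is None else base_env)
    env["NARRATERS_PROJECT_ROOT"] = str(project_root)
    return env


cmd = build_serve_command(port=8080, open_browser=False)
print(shlex.join(cmd))  # narraters serve --port 8080 --host 127.0.0.1 --no-browser
# To launch for real (requires narraters to be installed):
# subprocess.run(cmd, env=serve_environment("/path/to/your/project"), check=True)
```

Keeping the command as a list avoids shell quoting problems when the project path contains spaces.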
Before a step would load **Whisper**, **Gemma via Ollama**, **rMatch**, or other heavy local models, the UI runs a **RAM / disk preflight** and may suggest a lighter method (`rules`, `test`, `clause`) if the run looks unsafe for your machine.
### Troubleshooting
| If you see… | Do this |
|--------------|--------|
| `Python 3.10+ required` | Install [Python 3.10+](https://www.python.org/downloads/), close and reopen any Terminal, run again. |
| Blank page on `localhost:5000` | Visit **`http://127.0.0.1:5000/pipeline-config`** instead (IPv6/IPv4 quirk on some Macs). |
| **macOS:** Gatekeeper / "cannot check for malicious software" / no **Open** in the right-click menu | **1.** In **Finder**, try **control-click** **`narRater.app`** → **Open**, then confirm **Open** if the dialog offers it (see [Apple's Gatekeeper overrides](https://support.apple.com/guide/mac-help/mh40617/mac)). **2.** If that path is missing or still blocks: **System Settings** → **Privacy & Security** → scroll to **Security**. After a failed launch, macOS often shows **`narRater` was blocked** (wording varies) with **Allow Anyway** or **Open Anyway**; click it, enter your password, then launch **`narRater.app`** again (that button may only appear for a limited time after the block). **3.** Downloaded folder still quarantined: in Terminal, `xattr -dr com.apple.quarantine /path/to/narRaters-main`, then try **1** or **2** again. |
| **macOS:** "narRater couldn't find the narRaters project folder" | macOS **App Translocation** ran the app from a temp copy. Run `xattr -dr com.apple.quarantine ~/Downloads/narRaters-main` (adjust path) and double-click again, or use [git clone install](#alternate-install-command-line). |
| **Windows:** SmartScreen warns about `narRaters_installer.bat` | Click **More info** → **Run anyway**. |
| Port 5000 already in use | The installer auto-tries 5001–5010 and prints the URL. To free 5000 on macOS: System Settings → General → AirDrop & Handoff → turn off **AirPlay Receiver**. |
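To check by hand whether port 5000 is taken before choosing a `--port` value, a small standard-library probe is enough. This is a sketch, not the installer's own logic: `port_is_free` and `first_free_port` are hypothetical helpers, and the 5000–5010 range simply mirrors the ports named in the table above.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if we can bind to (host, port), i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(candidates, host="127.0.0.1"):
    """Return the first bindable port from candidates, or None if all are busy."""
    for port in candidates:
        if port_is_free(port, host):
            return port
    return None


port = first_free_port(range(5000, 5011))
if port is None:
    print("Ports 5000-5010 are all busy; pick another value for --port.")
else:
    print(f"narraters serve --port {port}")
```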
---
## Where to put your data
After [installation](#installation), place files so the paths match what you configured on the **pipeline** page (defaults below are relative to the **project root**). You can **remap** any step's input/output folders there without moving data.
| You have… | Put it in… | Format / naming |
|---|---|---|
| Story transcript (text) | `data/2_story_transcript/` | `{story}.txt`: plain UTF-8 text, one story per file |
| Story event list (pre-segmented) | `data/3_story_events/` | `{story}_events.xlsx` with columns `event`, `story_texts` |
| Subject recall text | `data/5_recall_texts/` | `{subj_id}.txt`, e.g. `the_siren_sub-01.txt` |
| Story audio (optional, Step 1) | `data/1_story_audio/` | `.wav` / `.mp3` / `.m4a`, named by story |
| Recall audio (optional, Step 1) | `data/4_recall_audio/` | `.wav` / `.mp3` / `.m4a`, named by subject |
Outputs are written under `output/`, one subdirectory per step (`output/recall_corrected/`, `output/recall_parsed/`, `output/recall_rated/`, …). A smaller alternate layout lives in **`demo/data/`** (lighthouse story, three recall `.txt` files).
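Before running a pipeline on your own study, it can help to confirm that the default input folders exist and to see which recall files will be picked up. The following is a minimal sketch under the default layout from the table above; `missing_input_dirs` and `recall_ids` are hypothetical helpers, not functions shipped with narRaters, and remapped folders would need different paths.

```python
from pathlib import Path

# Default input folders, relative to the project root (from the table above).
DEFAULT_INPUT_DIRS = (
    "data/1_story_audio",
    "data/2_story_transcript",
    "data/3_story_events",
    "data/4_recall_audio",
    "data/5_recall_texts",
)


def missing_input_dirs(project_root):
    """List default input folders that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in DEFAULT_INPUT_DIRS if not (root / rel).is_dir()]


def recall_ids(project_root):
    """Subject ids inferred from {subj_id}.txt files in data/5_recall_texts/."""
    folder = Path(project_root) / "data" / "5_recall_texts"
    return sorted(path.stem for path in folder.glob("*.txt"))


print("Missing folders:", missing_input_dirs("."))
print("Recall subjects:", recall_ids("."))
```

The audio folders are optional, so a missing `1_story_audio` or `4_recall_audio` only matters if you plan to run Step 1.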
### Example input/output data
The repository ships **realistic sample inputs and outputs** under `data/` and `output/` so you can see accepted naming and file types before adding your own study. Your private files in those folders stay untracked (see `.gitignore`); only the examples below are committed.
**Stories:** **`pieman_edited`** (story audio + transcript + events) and **`the_siren`** (transcript, events, two recall subjects).
| Role | Folder | Example file(s) |
|------|--------|-----------------|
| Story audio (input) | `data/1_story_audio/` | `pieman_edited.wav` |
| Story transcript (input) | `data/2_story_transcript/` | `pieman_edited.txt`, `the_siren.txt` |
| Story events (input) | `data/3_story_events/` | `pieman_edited_events.xlsx`, `the_siren_events.xlsx` |
| Recall audio (input) | `data/4_recall_audio/` | Your own `.wav` / `.mp3` / `.m4a` / `.mp4` (not shipped publicly) |
| Recall text (input) | `data/5_recall_texts/` | `the_siren_sub-01.txt`, `the_siren_sub-02.txt` |
| Story transcription (output) | `output/story_audio-transcribed/` | `pieman_edited.txt` |
| Recall transcription (output) | `output/recall_audio-transcribed/` | `the_siren_sub-01.txt`, `the_siren_sub-02.txt` |
| Spell/grammar correction (output) | `output/recall_corrected/` | `the_siren_sub-01.txt`, `the_siren_sub-02.txt` |
| Parsed recall (output) | `output/recall_parsed/` | `the_siren_sub-01_parsed.xlsx`, `the_siren_sub-02_parsed.xlsx` |
| Recall → events (output) | `output/recall_rated/` | `the_siren_sub-02_rate-recall-test_mode.xlsx` (method slug in filename) |
| Causal ratings (output) | `output/causal_rated/` | `pieman_edited_causal-linguistic.xlsx`, `the_siren_causal-linguistic.xlsx` |
**Quick try:** after install, point a pipeline at the default folders above and run **`sentenceCorrect` → `textParsing` → `textMatching`** on `the_siren_sub-01` / `the_siren_sub-02`, or open the bundled **`output/`** files in Excel to inspect column layouts. Story **`pieman_edited`** is useful for **`audioTranscribe`** (large `.wav`) and **`causalRating`** on `pieman_edited_events.xlsx`.
**File versioning is a core feature.** Automated runs write `{subj_id}_{method}.ext` (or `{story}_…` for story-level steps); your hand-edited versions are saved as `{subj_id}_{ratername}-edit.ext` and never overwrite the originals. The web UI lets you switch between versions via a dropdown, and the `-edit` files are what you export for analysis.
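Because the `-edit` suffix marks the signed-off versions, an analysis script can select them by name alone. The sketch below assumes only the naming pattern described above; the helper names and the rater name `alex` are made up for illustration.

```python
from pathlib import Path


def edit_filename(item_id, rater_name, ext):
    """Name of a hand-edited save: {id}_{ratername}-edit.ext (ext includes the dot)."""
    return f"{item_id}_{rater_name}-edit{ext}"


def is_hand_edit(filename):
    """True when the file stem ends with the -edit suffix used for hand-edited saves."""
    return Path(filename).stem.endswith("-edit")


def files_to_export(filenames):
    """Keep only hand-edited versions, sorted for stable output."""
    return sorted(name for name in filenames if is_hand_edit(name))


files = [
    "the_siren_sub-01_parsed.xlsx",                      # automated output
    edit_filename("the_siren_sub-01", "alex", ".xlsx"),  # hypothetical rater save
]
print(files_to_export(files))  # ['the_siren_sub-01_alex-edit.xlsx']
```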
---
## Pipeline overview
**Six optional steps: use any subset, in any order.** Each step can run automatically (rules, local models, or cloud APIs) and then be reviewed in the browser.
| Plain English | Step ID | Input → output (typical) |
|---|---|---|
| Transcribe audio | **`audioTranscribe`** | audio file → text transcript |
| Split story into events | **`eventSegment`** | story transcript → numbered event list |
| Fix recall spelling/grammar | **`sentenceCorrect`** | raw recall text → corrected text |
| Split recall into clauses | **`textParsing`** | corrected recall → clause segments |
| Match recall to story | **`textMatching`** | recall segments + story events → rated matches |
| Rate event causality | **`causalRating`** | story events → cause–effect ratings |
Full step reference (commands & folders)
In typical recall work, **`audioTranscribe`** and **`eventSegment`** target the **story**, **`sentenceCorrect`** through **`textMatching`** target each **subject recall**, and **`causalRating`** targets the **story event list**. Text-only projects skip Step 1, and you can equally run just **`eventSegment` + `causalRating`** or **`sentenceCorrect` → `textParsing` → `textMatching`**. Every step is available from the **GUI** or **`narraters` CLI**, has a lightweight default method, and supports hand-editing afterward.
| # | Step ID | What it does | Terminal command | Default in / out |
|---|---------|--------------|------------------|------------------|
| 1 | **`audioTranscribe`** | Audio recordings → text (Whisper/WhisperX); story vs recall via `audioScope` or `--kind` | `narraters transcribe` | `data/4_recall_audio/` (or `data/1_story_audio/` with `--kind story`) → `output/*_audio-transcribed/` |
| 2 | **`eventSegment`** | Story transcript → numbered events | `narraters segment` | `data/2_story_transcript/` → `data/3_story_events/` |
| 3 | **`sentenceCorrect`** | Fix spelling/grammar in recall text (no rewriting) | `narraters correct` | `data/5_recall_texts/` → `output/recall_corrected/` |
| 4 | **`textParsing`** | Corrected recall → clause-level segments | `narraters parse` | `output/recall_corrected/` → `output/recall_parsed/` |
| 5 | **`textMatching`** | Recall segments → story events | `narraters match` | `output/recall_parsed/` + `data/3_story_events/` → `output/recall_rated/` |
| 6 | **`causalRating`** | Causal strength of every story-event pair | `narraters rate` | `data/3_story_events/` → `output/causal_rated/` |
For each step, the GUI runs the same backends as the CLI. **Available methods, flags, and examples** are under **[Command-line pipeline](#command-line-pipeline)** below.
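The step IDs used in the web UI and the CLI subcommands line up one-to-one, which makes it easy to script a chosen subset. As a rough illustration, the table above can be written down as data; the `Step` tuple and `commands_for` helper are hypothetical names for this sketch, and the output folders are the documented defaults (Step 1 shows the recall default).

```python
from typing import NamedTuple


class Step(NamedTuple):
    number: int
    step_id: str         # name used in the web UI
    subcommand: str      # name used after `narraters` on the command line
    default_output: str  # documented default output folder


PIPELINE = (
    Step(1, "audioTranscribe", "transcribe", "output/recall_audio-transcribed/"),
    Step(2, "eventSegment", "segment", "data/3_story_events/"),
    Step(3, "sentenceCorrect", "correct", "output/recall_corrected/"),
    Step(4, "textParsing", "parse", "output/recall_parsed/"),
    Step(5, "textMatching", "match", "output/recall_rated/"),
    Step(6, "causalRating", "rate", "output/causal_rated/"),
)
BY_ID = {step.step_id: step for step in PIPELINE}


def commands_for(step_ids):
    """Translate step IDs into CLI commands, keeping the order given."""
    return [f"narraters {BY_ID[step_id].subcommand}" for step_id in step_ids]


print(commands_for(["sentenceCorrect", "textParsing", "textMatching"]))
# ['narraters correct', 'narraters parse', 'narraters match']
```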
---
## Command-line pipeline
Each of the six steps is a separate **`narraters`** subcommand with its own **`--method`** (and related options). Use the CLI for **scripts**, **clusters**, or **reproducible** runs, **with or without** the web UI, and **with any subset** of steps your study uses. General shape:
```
narraters SUBCOMMAND [--method METHOD] [--model MODEL] [-i INPUT] [-o OUTPUT] [--prompt-version VERSION] ...
```
Discover what's available at any time:
```bash
narraters --help # list all subcommands
narraters SUBCOMMAND --help # step-specific options, e.g. narraters segment --help
narraters segment --list-prompts # list available prompt versions for a step
narraters segment --list-models # list supported model identifiers
```
The method choices below are exactly those accepted by the CLI (`src/narraters/cli.py`).
### Step 1: `transcribe` (audio → text)
```bash
narraters transcribe --model large-v3 --timestamps # recall audio (default)
narraters transcribe --kind story --model small # story audio instead
narraters transcribe -i path/to/audio -o path/to/out # custom directories
narraters transcribe --filter sub-01 # one item only
```
| Option | Choices | Notes |
|---|---|---|
| `--model` | `tiny`, `base`, `small`, `medium`, `large-v2`, `large-v3` | Whisper model name |
| `--timestamps` | flag | Also write Excel files with word-level timestamps |
| `--kind` | `recall` (default), `story` | Picks the conventional directories: `recall` = `data/4_recall_audio/` → `output/recall_audio-transcribed/`; `story` = `data/1_story_audio/` → `output/story_audio-transcribed/` |
| `-i, --input` | path | Input audio directory (overrides the `--kind` default) |
| `-o, --output` | path | Output directory (overrides the `--kind` default) |
| `--filter` | substring | Only transcribe files whose name matches this item id |
Requires `pip install "narraters[audio]"` (or `pip install -e ".[audio]"` from a clone). Text-only projects can skip Step 1 entirely.
### Step 2: `segment` (story → events)
```bash
narraters segment --method clause
narraters segment --method api --model MODEL --prompt-version event_segment
narraters segment --method fine --input data/2_story_transcript/my_story.txt
```
Run `narraters segment --list-models` for the exact `--model` strings (Anthropic, OpenAI, and Ollama-backed presets).
| Option | Choices | Notes |
|---|---|---|
| `--method` | `clause`, `fine`, `coarse`, `api` | `clause` needs no model; `fine`/`coarse` use spaCy if installed; `api` calls an LLM |
| `--model` | see `narraters segment --list-models` | Only used with `--method api` (Anthropic, OpenAI, or Ollama preset keys) |
| `--prompt-version` | see `--list-prompts` | Selects a template from `scripts/prompt/event_segment*.txt` |
| `-i, --input` | path | Single transcript file or a directory (else processes all) |
| `-o, --output` | path | Output directory (default: `data/3_story_events/`) |
### Step 3: `correct` (spell / grammar fixes)
```bash
narraters correct --method rules
narraters correct --method gemma-ollama --ollama-model gemma4:e4b
```
| Option | Choices | Notes |
|---|---|---|
| `--method` | `rules`, `gemma-ollama` | `rules` runs entirely locally with no model; `gemma-ollama` needs a local Ollama server |
| `--ollama-model` | e.g. `gemma4:e4b` | Local Ollama model tag (with `gemma-ollama`) |
| `--prompt-file` | path | Override the instructions file (default: `scripts/prompt/spell_gram.txt`) |
| `-i, --input` | path | Single recall text file |
| `-o, --output` | path | Output directory |
Minimal corrections only: Step 3 fixes spelling/grammar errors and never rewrites or paraphrases.
### Step 4: `parse` (recall text → clause-level segments)
```bash
narraters parse --method rules
narraters parse --method ollama --model gemma4:e4b --prompt-version recall_parse_clause
narraters parse --filter-pattern sub-02 # process one subject only
```
| Option | Choices | Notes |
|---|---|---|
| `--method` | `rules`, `ollama` | `rules` is the default (regex, no model); `ollama` uses local Gemma |
| `--model` | e.g. `gemma4:e4b` | Ollama model tag (with `--method ollama`) |
| `--prompt-version` | see `scripts/prompt/recall_parse_*.txt` | Prompt template name |
| `-i, --input` | path | Input directory (default: `output/recall_corrected/`) |
| `-o, --output` | path | Output directory (default: `output/recall_parsed/`) |
| `--filter-pattern` | substring | Optional filter to process a single subject |
### Step 5: `match` (recall segments → story events)
```bash
narraters match --test-mode # simulated keyword matching, no model/API
narraters match --method api --story-events data/3_story_events
narraters match --method gemma-ollama
narraters match --method rmatch # embedding matcher (requires [match])
```
| Option | Choices | Notes |
|---|---|---|
| `--method` | `test`, `api`, `gemma-ollama`, `rmatch` | `test` is keyword-based, free, and always available; `rmatch` needs `pip install "narraters[match]"` |
| `--story-events` | path | Directory of `{story}_events.xlsx` (default: `data/3_story_events`) |
| `-i, --input` | path | Recall-parsed input directory (default: `output/recall_parsed/`) |
| `-o, --output` | path | Output directory (default: `output/recall_rated/`) |
| `--test-mode` | flag | Equivalent to `--method test`: simulated matching, no API calls |
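The `test` method is described above as keyword-based. To give a feel for what keyword-overlap matching means, here is a toy sketch. It is not the package's implementation: the function names, the four-letter word cutoff, and the three-event story are all invented for illustration.

```python
import re


def content_words(text, min_length=4):
    """Lowercase alphabetic tokens with at least min_length characters."""
    return {word for word in re.findall(r"[a-z']+", text.lower()) if len(word) >= min_length}


def match_clause(clause, events):
    """Return event numbers sharing content words with the clause, best overlap first.

    events maps event number -> event text.
    """
    clause_words = content_words(clause)
    scored = []
    for number, text in events.items():
        overlap = len(clause_words & content_words(text))
        if overlap:
            scored.append((overlap, number))
    # Highest overlap first; ties broken by event number.
    return [number for overlap, number in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


# Made-up story events, for illustration only.
events = {
    1: "A dog finds a bone in the garden.",
    2: "The dog buries the bone under a tree.",
    3: "A cat watches from the fence.",
}
print(match_clause("the dog hid the bone under the tree", events))  # [2, 1]
```

A scheme like this is cheap and needs no model, but it misses paraphrases, which is why the LLM and embedding methods exist and why the results are meant to be reviewed in the browser.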
### Step 6: `rate` (causal relationships between event pairs)
```bash
narraters rate --method linguistic
narraters rate --method api --model MODEL --prompt-version causal_rating
narraters rate --method manual # write an empty matrix for hand rating
```
Use `narraters rate --help` and the Step 6 model dropdown in the web UI for supported `--model` values when using `--method api`.
| Option | Choices | Notes |
|---|---|---|
| `--method` | `linguistic`, `api`, `manual` | `linguistic` is rule-based (no model); `manual` scaffolds an N×N matrix to fill in by hand |
| `--model` | see web UI / provider docs | Only used with `--method api` |
| `--prompt-version` | see `scripts/prompt/causal_rating*.txt` | Prompt template name |
| `-i, --input` | path | Input file/directory |
| `-o, --output` | path | Output directory |
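The `manual` method scaffolds an N×N matrix, and the web UI rates each event pair on a 0–3 scale. The data structure can be pictured with the sketch below. This is an assumption-laden illustration, not the file format narRaters writes: the helper names are hypothetical, and treating rows as causes and columns as effects is a choice made here for the example.

```python
def empty_causal_matrix(n_events):
    """N x N grid of None; here row = candidate cause, column = candidate effect."""
    return [[None] * n_events for _ in range(n_events)]


def set_rating(matrix, cause, effect, rating):
    """Store a 0-3 rating for 1-based event numbers cause -> effect."""
    if rating not in (0, 1, 2, 3):
        raise ValueError("rating must be an integer from 0 to 3")
    matrix[cause - 1][effect - 1] = rating


matrix = empty_causal_matrix(3)
set_rating(matrix, cause=1, effect=2, rating=3)  # event 1 strongly caused event 2
for row in matrix:
    print(row)
# [None, 3, None]
# [None, None, None]
# [None, None, None]
```

Leaving unrated cells as `None` rather than 0 keeps "not yet rated" distinct from "rated as no causal link".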
---
## Library / Python use
```python
from narraters import __version__, project_root
print(__version__, project_root())
```
Direct per-step imports are planned for a future release; for now, programmatic use should call the CLI via `subprocess` or import the modules under `scripts/`.
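One way to drive the CLI from Python in the meantime is a thin `subprocess` wrapper. This is a minimal sketch: `step_command` and `run_step` are hypothetical names, and only the `--method`, `-i`, and `-o` options documented in the Command-line pipeline section are used (`transcribe` has no `--method`, so leave it unset there).

```python
import shutil
import subprocess


def step_command(subcommand, method=None, input_path=None, output_path=None):
    """Build the argv list for one pipeline step."""
    argv = ["narraters", subcommand]
    if method is not None:
        argv += ["--method", method]
    if input_path is not None:
        argv += ["-i", str(input_path)]
    if output_path is not None:
        argv += ["-o", str(output_path)]
    return argv


def run_step(subcommand, **options):
    """Run one step and return the CompletedProcess; raises if the step fails."""
    if shutil.which("narraters") is None:
        raise RuntimeError("narraters CLI not found on PATH; install it with pip first")
    return subprocess.run(
        step_command(subcommand, **options),
        check=True,           # raise CalledProcessError on a non-zero exit code
        capture_output=True,  # keep stdout/stderr for logging
        text=True,
    )


print(" ".join(step_command("correct", method="rules")))  # narraters correct --method rules
# Example chain, run from the project root with narraters installed:
# for name, method in [("correct", "rules"), ("parse", "rules"), ("match", "test")]:
#     run_step(name, method=method)
```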
---
## Project layout
After unzipping, everything lives under a single **`narRaters/`** project root. Paths, contents, and naming conventions:
### Folder structure
```text
narRaters/
├── README.md                     # This file: user guide & pipeline docs
├── CONTRIBUTING.md               # Research background, prompt templates, acknowledgements, author
├── LICENSE
├── narRater_Tutorial.pdf         # Illustrated web UI tour
├── narRater.app                  # macOS double-click launcher
├── narRaters_installer.bat       # Windows launcher
├── install.sh                    # macOS / Linux installer
├── pyproject.toml                # Package metadata & pip extras
├── SETUP_API.md, .env.example    # API key setup
│
├── data/                         # Inputs (see Where to put your data)
│   ├── 1_story_audio/            # Optional Step 1: story audio
│   │   └── {story}.wav | .mp3 | .m4a
│   ├── 2_story_transcript/       # Story text
│   │   └── {story}.txt           # plain UTF-8, one story per file
│   ├── 3_story_events/           # Pre-segmented or segmented story events
│   │   └── {story}_events.xlsx   # columns: event, story_texts
│   ├── 4_recall_audio/           # Optional Step 1: recall audio
│   │   └── {subj_id}.wav | .mp3 | .m4a | .mp4
│   └── 5_recall_texts/           # Recall text
│       └── {subj_id}.txt         # e.g. the_siren_sub-01.txt
│
├── output/                       # Pipeline outputs (one subfolder per step)
│   ├── story_audio-transcribed/  # Step 1 (story) → {story}.txt
│   ├── recall_audio-transcribed/ # Step 1 (recall) → {subj_id}.txt
│   ├── recall_corrected/         # Step 3 → {subj_id}.txt
│   ├── recall_parsed/            # Step 4 → {subj_id}_parsed.xlsx
│   ├── recall_rated/             # Step 5 → {subj_id}_{method}.xlsx
│   └── causal_rated/             # Step 6 → {story}_causal-{method}.xlsx
│
├── scripts/                      # Pipeline backends (CLI & web UI call these)
│   ├── 1_audio-transcribe.py     # audioTranscribe
│   ├── 2_story-event-segment.py  # eventSegment
│   ├── 3_spell-grammar-correct.py # sentenceCorrect
│   ├── 4_parse-texts.py          # textParsing
│   ├── 5_recall-rater.py         # textMatching
│   ├── 6_causal-rater.py         # causalRating
│   └── prompt/                   # LLM prompt templates (.txt)
│       ├── event_segment.txt
│       ├── spell_gram.txt
│       ├── recall_parse_clause.txt
│       ├── recall_rating.txt
│       └── causal_rating.txt
│
├── server/                       # Flask web UI
│   ├── web-interface.py          # Routes & subprocess orchestration
│   └── START_HERE.command        # macOS launcher script
│
├── templates/                    # Web UI HTML (pipeline, dashboard, subject/story)
├── static/                       # CSS, JS, app icon
│
├── src/narraters/                # pip package
│   ├── cli.py                    # narraters command entry point
│   ├── paths.py                  # Project-root resolution
│   └── runtime_install.py        # Bundled-example copy on first serve
│
├── helpers/                      # Shared utilities & smoke tests
│   ├── software_paths.py         # Canonical path resolution
│   ├── step_files.py             # Flexible step input/output file recognition
│   ├── resource_preflight.py     # RAM / disk checks for heavy methods
│   └── test_*.py                 # Pipeline validation scripts
│
├── docs/                         # GitHub Pages site & README assets
│   ├── index.html                # Project landing page
│   └── screenshots/              # README GIFs (+ recall-matching.png for site og:image)
│
├── demo/                         # Smaller lighthouse example
│   ├── data/                     # the_lighthouse transcript + recall texts
│   └── output/                   # Sample outputs for the demo story
│
├── developer/                    # Contributor handbook & tooling
│   ├── README.md                 # Per-step I/O contracts & design principles
│   └── SETUP_API.md              # API key setup (developer copy)
│
└── packaging/macos/              # App bundle / DMG build scripts
    └── build_app_bundle.sh
```
Bundled examples: **`pieman_edited`**, **`the_siren`** β see [Example input/output data](#example-inputoutput-data).
**Versioning:** automated files use `{id}_{method}.ext`; hand-edited exports use `{id}_{ratername}-edit.ext` (never overwritten).
---
## Further reading
- **[Project home (GitHub Pages)](https://xianneuro.github.io/narRaters/)**: landing page for search and sharing.
- **[`narRater_Tutorial.pdf`](narRater_Tutorial.pdf)**: illustrated, click-by-click tour of the web UI; good next step after [Installation](#installation).
- **[`SETUP_API.md`](SETUP_API.md)**: API keys for Anthropic, OpenAI, and Hugging Face; which pipeline steps need which.
- **[`scripts/prompt/README.md`](scripts/prompt/README.md)**: prompt template conventions for LLM-backed methods.
---
## Citation
If you use narRaters in research, please cite the archived release.
**Reference list (APA 7):**
> Li, X. (2026). *narRaters: Naturalistic narratives processing platform* (Version 0.3.14) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.20486080
Replace the version number with the release you used (see [Zenodo](https://doi.org/10.5281/zenodo.20486080) for the latest).
**Examples in a manuscript:**
*Methods, in-text:*
> Narrative recall data were processed with narRaters (Li, 2026).
*Methods, first mention (optional):*
> We used narRaters (Li, 2026), an open-source pipeline for transcribing, segmenting, parsing, matching, and rating narrative recall data, with human review at each step.
*Software / code availability:*
> narRaters (Version 0.3.14) is available at https://doi.org/10.5281/zenodo.20486080.
*Data processing statement:*
> Story events, parsed recall clauses, recall-to-event matches, and causal ratings were produced with narRaters (Li, 2026; https://doi.org/10.5281/zenodo.20486080).
---
## Acknowledgements
- **Janice Chen** for brainstorming the causal-rating step interface and for help testing and improving package functionality.
- **Gabi Kressin Palacios** and **Dhruva Arekar** for an additional method for the recall-matching step (matching human recall text to story events). See [GabrielKP/rMatch](https://github.com/GabrielKP/rMatch) for human-data-validated AI-assisted recall rating.
- **Xiyu Li (Rita)** for contributions to the `recall_rating` prompt development and for validating model performance on human recall data (commercial LLM APIs were close to human raters).
- **Sebastian Michelmann** for feedback on the event-segmentation step (see [Michelmann et al., 2023](https://arxiv.org/abs/2301.10297)).
- **Colette Youstra** and **[Quinton Covington](https://qcovington.com)** for testing the app's manual-rating functions.
- **Samira Tavassoli** and **Yuye Huang** for help testing the app's segmentation and causal-reasoning functions.
---
## Author
**[Xian Li](https://www.xian-li.com)** · [xianl.cogneuro@gmail.com](mailto:xianl.cogneuro@gmail.com)
---
## License
See **[LICENSE](LICENSE)** (**narRaters Research and Non-Commercial License**). Free for research, education, and other non-commercial use; commercial or for-profit use requires prior written permission. Contact [xianl.cogneuro@gmail.com](mailto:xianl.cogneuro@gmail.com) for commercial licensing.