https://github.com/xianneuro/narraters

Human-in-the-loop pipeline and web UI for narrative recall analysis (transcribe, segment, correct, parse, match, causal rate).
cognitive-neuroscience event-segmentation flask free-recall human-rater memory-research narrative-recall nlp open-source python
Host: GitHub
URL: https://github.com/xianneuro/narraters
Owner: xianNeuro
License: other
Created: 2026-05-16T02:48:47.000Z (2 months ago)
Default Branch: main
Last Pushed: 2026-05-31T22:55:18.000Z (about 2 months ago)
Last Synced: 2026-06-01T00:19:13.651Z (about 2 months ago)
Topics: cognitive-neuroscience, event-segmentation, flask, free-recall, human-rater, memory-research, narrative-recall, nlp, open-source, python
Language: Python
Homepage: https://xianneuro.github.io/narRaters/
Size: 101 MB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff

README

          


  



# narRaters


Turn complex narratives into structured, reviewable data — with a web UI at every step


GitHub: github.com/xianNeuro/narRaters · PyPI: v0.3.14




  

  

  

  





🏠 Project home · 📦 v0.3.14 · 📖 Tutorial (PDF) · 🐛 Issues · 📚 Cite · 💬 Feedback





**narRaters** (*narrative* + *raters*) is open-source software on [GitHub (xianNeuro/narRaters)](https://github.com/xianNeuro/narRaters) that helps process complex narratives (e.g., audiobooks, text-based stories, interviews, and conversations) for research on memory, language processing, causal reasoning, and LLMs.

Imagine you ran a memory study: participants listened to a story, then recalled what they remembered (spoken or typed). Before you can analyze memory, you need structured data — what happened in the story, what each person recalled, and how those pieces connect.

**narRaters** helps you get there. It runs common narrative-processing steps (transcribe audio, split a story into events, clean up recall text, parse recalls into clauses, match recalls back to story events, rate causal links between events) and gives you a web interface to review and fix outputs before exporting.

Works for audio or text, stories or other long narratives (including movie annotations), and human-only or human-vs-LLM workflows.

| You have… | narRaters helps you… |
|---|---|
| Story audio or transcript | Transcribe it and break it into numbered **events** |
| Participant recall files | Correct spelling, split into **clauses**, and **match** each clause to story events |
| A segmented story | **Rate causal links** between event pairs (did event A lead to event B?) |
| Automated or AI outputs | **Screen and edit** them in the browser, then export signed-off files |



  



---



## Get started in 3 steps



  

1. **Download & open.** Get the v0.3.14 ZIP, unzip, and double-click narRater.app (macOS) or narRaters_installer.bat (Windows). Needs Python 3.10+.
2. **Pick your pipeline.** Your browser opens to the pipeline builder. Drag in only the steps you need (e.g. segment → match → causal rate). Bundled demo data is already loaded so you can explore immediately.
3. **Run, review, export.** On the dashboard, click a cell to run a step. Open the magnifying-glass icon to inspect results, edit in the browser, and export when you are satisfied.

  

Or via terminal (Python 3.10+; no ZIP download). Run from the folder that contains your data/ and output/ directories (or set NARRATERS_PROJECT_ROOT to that path):


```bash
python --version                              # must show 3.10 or newer
python3 -m pip install narraters --upgrade    # wait for “Successfully installed”
cd /path/to/your/project                      # folder with data/ and output/
narraters serve                               # browser opens to the pipeline builder
```

Then continue with **steps 2–3** above — pick your pipeline, run steps on the dashboard, review, and export.

> **First time?** Follow the illustrated **[Tutorial PDF](narRater_Tutorial.pdf)** or see [Installation](#installation) and [Troubleshooting](#troubleshooting).

---



## See the app





  

  


① Build a pipeline  →  ② Dashboard  →  ③ Rate causal links

- **① Pipeline dashboard:** See every subject/story, run steps, and open results. Green = done; click a cell to process.
- **② Event segmentation:** Move the cursor through the text and click to drop boundary bars. Toggle binary or 1–5 strength (bar colored blue→red).
- **③ Recall matching:** Story events on the left; recall segments on the right. Click a segment, then click events to match, or type event numbers. Optionally turn on Further ratings for per-segment quality checkboxes.
- **④ Causal rating:** Click a grid cell to rate how strongly one story event caused another (0–3 scale).

    

  

---



## Table of contents





- What is narRaters?
- Get started in 3 steps
- See the app
- Installation
  - ZIP download (double-click launcher)
  - PyPI (terminal)
  - Alternate install (command line)
  - Using the web UI
  - Troubleshooting
- Where to put your data
  - Example input/output data
- Pipeline overview
- Command-line pipeline
  - Step 1 — transcribe
  - Step 2 — segment
  - Step 3 — correct
  - Step 4 — parse
  - Step 5 — match
  - Step 6 — rate
- Contributing tab
  - Research background
  - Prompt templates
  - Acknowledgements
  - Author
- Library / Python use
- Project layout
  - Folder structure
- Further reading
- Citation
- Acknowledgements
- Author
- License



> On GitHub, **README**, **Contributing** (research background, prompt templates, acknowledgements, author), and **License** are the tabs in the bar above. Use this table of contents or the **Outline** menu (list icon, top-right) to jump between README sections.

---



## Installation



Needs **[Python 3.10+](https://www.python.org/downloads/)**. Windows: check **“Add python.exe to PATH”** in the Python installer. If anything fails, see **[Troubleshooting](#troubleshooting)**.

### ZIP download (double-click launcher)

1. **[Download the ZIP (v0.3.14)](https://github.com/xianNeuro/narRaters/archive/refs/tags/v0.3.14.zip)** and unzip it — or on the [GitHub repo page](https://github.com/xianNeuro/narRaters) use green **Code ▾** → **Download ZIP** for the current `main` branch. You'll get **`narRaters-0.3.14`**, **`narRaters-main`**, or **`narRaters`** (if you used `git clone`).

2. **Launch:** **macOS** — double-click **`narRater.app`**. **Windows** — double-click **`narRaters_installer.bat`**. **Linux** — in Terminal, `cd` into the folder and run `bash install.sh`.

3. Your browser opens **`http://127.0.0.1:5000/pipeline-config`** with bundled examples. Put your data in **`data/`**. Restart later by double-clicking the same launcher.

**macOS Gatekeeper or quarantine issues?** See **[Troubleshooting](#troubleshooting)**.

### PyPI (terminal)

```bash
python --version                              # must show 3.10 or newer
python3 -m pip install narraters --upgrade    # use python3 -m pip, not bare pip
narraters serve                               # browser opens to the pipeline builder
```

On first launch, example **`data/`** and **`output/`** folders are copied into whatever directory you run from (unless you already have a project folder, or set **`NARRATERS_PROJECT_ROOT`**). Package: [`narraters`](https://pypi.org/project/narraters/) (all lowercase). For launchers and the tutorial PDF, use the ZIP install above.
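
The lookup order described above (an explicit `NARRATERS_PROJECT_ROOT`, otherwise the directory you run from) can be sketched in a few lines. This is an illustration only: `resolve_project_root` is a hypothetical helper, not the package's own `project_root()` implementation, whose exact rules may differ.

```python
import os
from pathlib import Path


def resolve_project_root(env=None, cwd=None):
    """Hypothetical helper: prefer NARRATERS_PROJECT_ROOT, else the working directory."""
    env = os.environ if env is None else env
    override = env.get("NARRATERS_PROJECT_ROOT", "").strip()
    if override:
        # An explicit override wins over wherever the command was launched.
        return Path(override).expanduser()
    return Path(cwd) if cwd is not None else Path.cwd()


if __name__ == "__main__":
    print(resolve_project_root())
```

Passing `env` and `cwd` explicitly keeps the function easy to test without touching the real environment.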

**Full PyPI setup (venv from scratch)**

```bash
python3 --version        # must be 3.10 or newer
mkdir -p ~/narRaters-demo && cd ~/narRaters-demo
python3 -m venv .venv
source .venv/bin/activate     # Windows: .venv\Scripts\activate
python3 -m pip install --upgrade pip
python3 -m pip install narraters --upgrade
narraters serve
```

### Alternate install (command line)

**git clone + install.sh (project folder with bundled examples)**

```bash

# macOS / Linux

cd ~ && git clone https://github.com/xianNeuro/narRaters.git && cd narRaters && bash install.sh

```

```bat

:: Windows

cd %USERPROFILE% && git clone https://github.com/xianNeuro/narRaters.git && cd narRaters && narRaters_installer.bat

```

This is what `narRater.app` does under the hood, just without the click. `git: command not found`? On macOS: `xcode-select --install`. On Windows: install [Git for Windows](https://git-scm.com/download/win).

**Optional extras (Whisper, cloud APIs, local Gemma, etc.)**

Inside the project folder, with the venv activated:

```bash
python3 -m pip install -e ".[audio]"     # Whisper transcription
python3 -m pip install -e ".[api]"       # Anthropic / OpenAI
python3 -m pip install -e ".[nlp]"       # spaCy segmentation
python3 -m pip install -e ".[grammar]"   # grammar checker
python3 -m pip install -e ".[local-llm]" # local Gemma
python3 -m pip install -e ".[match]"     # rmatch
python3 -m pip install -e ".[all]"       # api + match
```

PyPI users: `python3 -m pip install "narraters[audio]"`, etc.

Heavy methods (`audio`, `local-llm`, `match`) pull multi-GB packages — the app shows a RAM/disk preflight before downloading. **Ollama (local Gemma):** install [Ollama](https://ollama.com), then `ollama pull gemma4:e4b`. **API keys:** copy `.env.example` to `.env` and edit (see [`SETUP_API.md`](SETUP_API.md)).

**Developers**

`install.sh` already does an editable install. To work on the codebase:

```bash
git clone https://github.com/xianNeuro/narRaters.git
cd narRaters
python3 -m venv .venv && source .venv/bin/activate && python3 -m pip install -e .
```

Build the standalone macOS app for icon testing: `bash packaging/macos/build_app_bundle.sh`.  Build the slim repo-root launcher: `bash packaging/macos/build_repo_app.sh`.

### Using the web UI

The app runs at **`http://127.0.0.1:5000`**. First visit opens **pipeline configuration**; if you already saved a pipeline, you land on the **dashboard**.

| Screen | Route | What you do there |
|--------|--------|-------------------|
| **Pipeline setup** | `/pipeline-config` | Drag steps into **Pipeline Flow**, set per-step **folders**, enter a **rater name** (or 🎲). **Continue** saves config and opens the dashboard. |
| **Dashboard** | `/` | Grid: **rows** = subjects or stories, **columns** = steps. **Click a cell** to run that step (pick **method / model / prompt** when offered). **Batch** runs one step across all rows. |
| **Detail view** | `/subject/…` or `/story/…` | **Tabs** per step for **one** row. Use the **version** dropdown to compare automated output vs your **`{id}_{ratername}-edit`** saves, then **edit** and **save**. |

**Flow:** setup → dashboard (bulk runs) → open a row to **inspect, hand-correct, or compare versions**.

**`narraters serve` options**

| Flag | Default | Purpose |
|---|---|---|
| `--port` | `5000` | Another port if `5000` is busy |
| `--host` | `127.0.0.1` | Bind address; use `0.0.0.0` only on a **trusted** network |
| `--no-browser` | off | Do not open a browser tab (SSH, headless) |
| `--debug` | off | Flask debug / auto-reload while hacking on the server |

```bash

narraters serve --port 8080 --no-browser

```
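
If you launch the server from a script, it can help to pick a value for `--port` before starting. The sketch below probes ports 5000 to 5010 (the same range the Troubleshooting table says the installer falls back through) and returns the first one that can be bound. `first_free_port` is a hypothetical helper written for this README, not part of the `narraters` package, and the installer's own fallback logic may work differently.

```python
import socket


def first_free_port(host="127.0.0.1", start=5000, end=5010):
    """Return the first port in [start, end] that can be bound on host, or None."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is busy (or not permitted); try the next one
            return port  # the probe socket is closed on leaving the with-block
    return None


if __name__ == "__main__":
    port = first_free_port()
    if port is None:
        print("No free port between 5000 and 5010")
    else:
        print(f"narraters serve --port {port} --no-browser")
```

There is a small window between probing and launching in which another process could take the port, so treat the result as a hint rather than a guarantee.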

Before a step would load **Whisper**, **Gemma via Ollama**, **rMatch**, or other heavy local models, the UI runs a **RAM / disk preflight** and may suggest a lighter method (`rules`, `test`, `clause`) if the run looks unsafe for your machine.
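
To make the idea of a preflight concrete, here is a minimal sketch of the disk half of such a check, with a fallback to a lighter method when space is short. It is an assumption-laden illustration: the function names and the 5 GB threshold are hypothetical, the real preflight's thresholds are not documented here, and the RAM check is left out because the Python standard library has no portable call for it.

```python
import shutil


def has_free_disk(path=".", required_gb=5.0):
    """Return (ok, free_gb) for the filesystem that contains path."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return free_gb >= required_gb, free_gb


def pick_method(preferred, fallback, path=".", required_gb=5.0):
    """Keep the preferred (heavy) method only if enough disk space is free."""
    ok, _ = has_free_disk(path, required_gb)
    return preferred if ok else fallback


if __name__ == "__main__":
    # Example: fall back from the embedding matcher to the keyword-based method.
    print(pick_method("rmatch", "test"))
```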

### Troubleshooting

| If you see… | Do this |
|--------------|--------|
| `Python 3.10+ required` | Install [Python 3.10+](https://www.python.org/downloads/), close and reopen any Terminal, run again. |
| Blank page on `localhost:5000` | Visit **`http://127.0.0.1:5000/pipeline-config`** instead (IPv6/IPv4 quirk on some Macs). |
| **macOS:** Gatekeeper / “cannot check for malicious software” / no **Open** in the right-click menu | **1.** In **Finder**, try **control-click** **`narRater.app`** → **Open**, then confirm **Open** if the dialog offers it — [Apple’s Gatekeeper overrides](https://support.apple.com/guide/mac-help/mh40617/mac). **2.** If that path is missing or still blocks: **System Settings** → **Privacy & Security** → scroll to **Security** — after a failed launch, macOS often shows **`narRater` was blocked** (wording varies) with **Allow Anyway** or **Open Anyway**; click it, enter your password, then launch **`narRater.app`** again (that button may only appear for a limited time after the block). **3.** Downloaded folder still quarantined: in Terminal, `xattr -dr com.apple.quarantine /path/to/narRaters-main`, then try **1** or **2** again. |
| **macOS:** “narRater couldn't find the narRaters project folder” | macOS **App Translocation** ran the app from a temp copy. Run `xattr -dr com.apple.quarantine ~/Downloads/narRaters-main` (adjust path) and double-click again, or use [git clone install](#alternate-install-command-line). |
| **Windows:** SmartScreen warns about `narRaters_installer.bat` | Click **More info** → **Run anyway**. |
| Port 5000 already in use | The installer auto-tries 5001–5010 and prints the URL. To free 5000: macOS → System Settings → General → AirDrop & Handoff → turn off **AirPlay Receiver**. |

---



## Where to put your data



After [installation](#installation), place files so the paths match what you configured on the **pipeline** page (defaults below are relative to the **project root**). You can **remap** any step’s input/output folders there without moving data.

| You have… | Put it in… | Format / naming |
|---|---|---|
| Story transcript (text) | `data/2_story_transcript/` | `{story}.txt` — plain UTF-8 text, one story per file |
| Story event list (pre-segmented) | `data/3_story_events/` | `{story}_events.xlsx` — columns `event`, `story_texts` |
| Subject recall text | `data/5_recall_texts/` | `{subj_id}.txt` — e.g. `the_siren_sub-01.txt` |
| Story audio (optional, Step 1) | `data/1_story_audio/` | `.wav` / `.mp3` / `.m4a`, named by story |
| Recall audio (optional, Step 1) | `data/4_recall_audio/` | `.wav` / `.mp3` / `.m4a`, named by subject |

Outputs are written under `output/` — one subdirectory per step (`output/recall_corrected/`, `output/recall_parsed/`, `output/recall_rated/`, …). A smaller alternate layout lives in **`demo/data/`** (lighthouse story, three recall `.txt` files).
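
Before running anything, you may want a quick inventory of what is already sitting in the default input folders. The sketch below walks the folders from the table above and lists files with the expected extensions. It assumes the default layout (it knows nothing about folders you remapped on the pipeline page), and `summarize_inputs` is a hypothetical helper, not a `narraters` function.

```python
from pathlib import Path

# Default input folders from the table above, with the extensions each accepts.
DEFAULT_INPUTS = {
    "data/1_story_audio": {".wav", ".mp3", ".m4a"},
    "data/2_story_transcript": {".txt"},
    "data/3_story_events": {".xlsx"},
    "data/4_recall_audio": {".wav", ".mp3", ".m4a"},
    "data/5_recall_texts": {".txt"},
}


def summarize_inputs(project_root):
    """Return {folder: sorted file names} for each default folder that exists."""
    root = Path(project_root)
    summary = {}
    for folder, extensions in DEFAULT_INPUTS.items():
        directory = root / folder
        if not directory.is_dir():
            continue  # missing folders are simply skipped
        summary[folder] = sorted(
            p.name
            for p in directory.iterdir()
            if p.is_file() and p.suffix.lower() in extensions
        )
    return summary


if __name__ == "__main__":
    for folder, names in summarize_inputs(".").items():
        print(f"{folder}: {len(names)} file(s)")
```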

### Example input/output data

The repository ships **realistic sample inputs and outputs** under `data/` and `output/` so you can see accepted naming and file types before adding your own study. Your private files in those folders stay untracked (see `.gitignore`); only the examples below are committed.

**Stories:** **`pieman_edited`** (story audio + transcript + events) and **`the_siren`** (transcript, events, two recall subjects).

| Role | Folder | Example file(s) |
|------|--------|-----------------|
| Story audio (input) | `data/1_story_audio/` | `pieman_edited.wav` |
| Story transcript (input) | `data/2_story_transcript/` | `pieman_edited.txt`, `the_siren.txt` |
| Story events (input) | `data/3_story_events/` | `pieman_edited_events.xlsx`, `the_siren_events.xlsx` |
| Recall audio (input) | `data/4_recall_audio/` | Your own `.wav` / `.mp3` / `.m4a` / `.mp4` (not shipped publicly) |
| Recall text (input) | `data/5_recall_texts/` | `the_siren_sub-01.txt`, `the_siren_sub-02.txt` |
| Story transcription (output) | `output/story_audio-transcribed/` | `pieman_edited.txt` |
| Recall transcription (output) | `output/recall_audio-transcribed/` | `the_siren_sub-01.txt`, `the_siren_sub-02.txt` |
| Spell/grammar correction (output) | `output/recall_corrected/` | `the_siren_sub-01.txt`, `the_siren_sub-02.txt` |
| Parsed recall (output) | `output/recall_parsed/` | `the_siren_sub-01_parsed.xlsx`, `the_siren_sub-02_parsed.xlsx` |
| Recall ↔ events (output) | `output/recall_rated/` | `the_siren_sub-02_rate-recall-test_mode.xlsx` (method slug in filename) |
| Causal ratings (output) | `output/causal_rated/` | `pieman_edited_causal-linguistic.xlsx`, `the_siren_causal-linguistic.xlsx` |

**Quick try:** after install, point a pipeline at the default folders above and run **`sentenceCorrect` → `textParsing` → `textMatching`** on `the_siren_sub-01` / `the_siren_sub-02`, or open the bundled **`output/`** files in Excel to inspect column layouts. Story **`pieman_edited`** is useful for **`audioTranscribe`** (large `.wav`) and **`causalRating`** on `pieman_edited_events.xlsx`.

**File versioning is a core feature.** Automated runs write `{subj_id}_{method}.ext` (or `{story}_…` for story-level steps); your hand-edited versions are saved as `{subj_id}_{ratername}-edit.ext` and never overwrite the originals. The web UI lets you switch between versions via a dropdown, and the `-edit` files are what you export for analysis.
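
The naming convention is simple enough to reproduce in your own analysis scripts, for example to pick out only the hand-edited files. The helpers below are a sketch based on the patterns stated above; they are hypothetical names for illustration, and some bundled outputs use longer method slugs than this minimal pattern shows.

```python
from pathlib import Path


def automated_name(item_id, method, ext=".xlsx"):
    """File name for an automated run: {id}_{method}.ext"""
    return f"{item_id}_{method}{ext}"


def edited_name(item_id, rater, ext=".xlsx"):
    """File name for a hand-edited save: {id}_{ratername}-edit.ext"""
    return f"{item_id}_{rater}-edit{ext}"


def is_hand_edit(filename):
    """True if the file name (without extension) ends in the -edit suffix."""
    return Path(filename).stem.endswith("-edit")


if __name__ == "__main__":
    print(edited_name("the_siren_sub-01", "alex"))
    print(is_hand_edit("pieman_edited_causal-linguistic.xlsx"))
```

Checking the end of the stem, rather than searching for "edit" anywhere, avoids false positives on story names such as `pieman_edited`.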

---



## Pipeline overview



**Six optional steps — use any subset, in any order.** Each step can run automatically (rules, local models, or cloud APIs) and then be reviewed in the browser.

| Plain English | Step ID | Input → output (typical) |
|---|---|---|
| Transcribe audio | **`audioTranscribe`** | audio file → text transcript |
| Split story into events | **`eventSegment`** | story transcript → numbered event list |
| Fix recall spelling/grammar | **`sentenceCorrect`** | raw recall text → corrected text |
| Split recall into clauses | **`textParsing`** | corrected recall → clause segments |
| Match recall to story | **`textMatching`** | recall segments + story events → rated matches |
| Rate event causality | **`causalRating`** | story events → cause–effect ratings |

**Full step reference (commands & folders)**

In typical recall work, **`audioTranscribe`** and **`eventSegment`** operate on the **story**, **`sentenceCorrect`** through **`textMatching`** operate on each **subject recall**, and **`causalRating`** operates on the **story event list**. Text-only projects skip Step 1, and you can equally run just **`eventSegment` + `causalRating`** or **`sentenceCorrect` → `textParsing` → `textMatching`**. Every step is available from the **GUI** or the **`narraters` CLI**, has a lightweight default method, and supports hand-editing afterward.

| # | Step ID | What it does | Terminal command | Default in / out |
|---|---------|--------------|------------------|------------------|
| 1 | **`audioTranscribe`** | Audio recordings → text (Whisper/WhisperX); story vs recall via `audioScope` or `--kind` | `narraters transcribe` | `data/4_recall_audio/` (or `data/1_story_audio/` with `--kind story`) → `output/*_audio-transcribed/` |
| 2 | **`eventSegment`** | Story transcript → numbered events | `narraters segment` | `data/2_story_transcript/` → `data/3_story_events/` |
| 3 | **`sentenceCorrect`** | Fix spelling/grammar in recall text (no rewriting) | `narraters correct` | `data/5_recall_texts/` → `output/recall_corrected/` |
| 4 | **`textParsing`** | Corrected recall → clause-level segments | `narraters parse` | `output/recall_corrected/` → `output/recall_parsed/` |
| 5 | **`textMatching`** | Recall segments ↔ story events | `narraters match` | `output/recall_parsed/` + `data/3_story_events/` → `output/recall_rated/` |
| 6 | **`causalRating`** | Causal strength of every story-event pair | `narraters rate` | `data/3_story_events/` → `output/causal_rated/` |

For each step, the GUI runs the same backends as the CLI. **Available methods, flags, and examples** are under **[Command-line pipeline](#command-line-pipeline)** below.
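
If you script a subset of steps, a small lookup table keeps step IDs, subcommands, and default output folders in one place. The sketch below copies those values from the step reference above; the `Step` tuple and `commands_for` helper are hypothetical conveniences, not something the package exports.

```python
from typing import NamedTuple


class Step(NamedTuple):
    number: int
    step_id: str
    command: str
    default_output: str


# Values copied from the step reference table (Step 1 shown with its recall default).
STEPS = [
    Step(1, "audioTranscribe", "transcribe", "output/recall_audio-transcribed/"),
    Step(2, "eventSegment", "segment", "data/3_story_events/"),
    Step(3, "sentenceCorrect", "correct", "output/recall_corrected/"),
    Step(4, "textParsing", "parse", "output/recall_parsed/"),
    Step(5, "textMatching", "match", "output/recall_rated/"),
    Step(6, "causalRating", "rate", "output/causal_rated/"),
]


def commands_for(step_ids):
    """Return `narraters <subcommand>` strings for the chosen step IDs, in table order."""
    wanted = set(step_ids)
    unknown = wanted - {s.step_id for s in STEPS}
    if unknown:
        raise ValueError(f"unknown step IDs: {sorted(unknown)}")
    return [f"narraters {s.command}" for s in STEPS if s.step_id in wanted]


if __name__ == "__main__":
    for line in commands_for(["sentenceCorrect", "textParsing", "textMatching"]):
        print(line)
```

Returning commands in table order is a choice made for this sketch; the pipeline itself lets you run steps in any order.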

---



## Command-line pipeline



Each of the six steps is a separate **`narraters`** subcommand with its own **`--method`** (and related options). Use the CLI for **scripts**, **clusters**, or **reproducible** runs—**with or without** the web UI, and **with any subset** of steps your study uses. General shape:

```
narraters <step> [--method METHOD] [--model MODEL] [-i INPUT] [-o OUTPUT] [--prompt-version VERSION] ...
```

Discover what's available at any time:

```bash
narraters --help                 # list all subcommands
narraters <step> --help          # step-specific options
narraters segment --list-prompts # list available prompt versions for a step
narraters segment --list-models  # list supported model identifiers
```

The method choices below are exactly those accepted by the CLI (`src/narraters/cli.py`).

### Step 1 — `transcribe` (audio → text)

```bash
narraters transcribe --model large-v3 --timestamps          # recall audio (default)
narraters transcribe --kind story --model small              # story audio instead
narraters transcribe -i path/to/audio -o path/to/out         # custom directories
narraters transcribe --filter sub-01                         # one item only
```

| Option | Choices | Notes |
|---|---|---|
| `--model` | `tiny`, `base`, `small`, `medium`, `large-v2`, `large-v3` | Whisper model name |
| `--timestamps` | flag | Also write Excel files with word-level timestamps |
| `--kind` | `recall` (default), `story` | Picks the conventional directories: `recall` = `data/4_recall_audio/` → `output/recall_audio-transcribed/`; `story` = `data/1_story_audio/` → `output/story_audio-transcribed/` |
| `-i, --input` | path | Input audio directory (overrides the `--kind` default) |
| `-o, --output` | path | Output directory (overrides the `--kind` default) |
| `--filter` | substring | Only transcribe files whose name matches this item id |

Requires `pip install "narraters[audio]"` (or `pip install -e ".[audio]"` from a clone). Text-only projects can skip Step 1 entirely.
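
To preview which files a given `--kind` and `--filter` combination would pick up, you can mimic the selection in plain Python. The sketch below assumes a simple substring match on the file name and the three extensions listed in the input table; the CLI's actual matching rules and accepted extensions may be broader, and `select_audio` is a hypothetical helper.

```python
from pathlib import Path

AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a"}

# Conventional (input, output) directories for each --kind value, from the table above.
KIND_DIRS = {
    "recall": ("data/4_recall_audio", "output/recall_audio-transcribed"),
    "story": ("data/1_story_audio", "output/story_audio-transcribed"),
}


def select_audio(project_root, kind="recall", name_filter=None):
    """List audio files for a kind, optionally keeping names that contain name_filter."""
    input_dir, _ = KIND_DIRS[kind]
    directory = Path(project_root) / input_dir
    if not directory.is_dir():
        return []
    return sorted(
        p
        for p in directory.iterdir()
        if p.is_file()
        and p.suffix.lower() in AUDIO_EXTENSIONS
        and (name_filter is None or name_filter in p.name)
    )


if __name__ == "__main__":
    for path in select_audio(".", kind="recall", name_filter="sub-01"):
        print(path)
```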

### Step 2 — `segment` (story → events)

```bash
narraters segment --method clause
narraters segment --method api --model <model> --prompt-version event_segment
narraters segment --method fine --input data/2_story_transcript/my_story.txt
```

Run `narraters segment --list-models` for the exact `--model` strings (Anthropic, OpenAI, and Ollama-backed presets).

| Option | Choices | Notes |
|---|---|---|
| `--method` | `clause`, `fine`, `coarse`, `api` | `clause` needs no model; `fine`/`coarse` use spaCy if installed; `api` calls an LLM |
| `--model` | see `narraters segment --list-models` | Only used with `--method api` (Anthropic, OpenAI, or Ollama preset keys) |
| `--prompt-version` | see `--list-prompts` | Selects a template from `scripts/prompt/event_segment*.txt` |
| `-i, --input` | path | Single transcript file or a directory (else processes all) |
| `-o, --output` | path | Output directory (default: `data/3_story_events/`) |

### Step 3 — `correct` (spell / grammar fixes)

```bash
narraters correct --method rules
narraters correct --method gemma-ollama --ollama-model gemma4:e4b
```

| Option | Choices | Notes |
|---|---|---|
| `--method` | `rules`, `gemma-ollama` | `rules` runs entirely locally with no model; `gemma-ollama` needs a local Ollama server |
| `--ollama-model` | e.g. `gemma4:e4b` | Local Ollama model tag (with `gemma-ollama`) |
| `--prompt-file` | path | Override the instructions file (default: `scripts/prompt/spell_gram.txt`) |
| `-i, --input` | path | Single recall text file |
| `-o, --output` | path | Output directory |

Minimal corrections only — Step 3 fixes spelling/grammar errors and never rewrites or paraphrases.

### Step 4 — `parse` (recall text → clause-level segments)

```bash
narraters parse --method rules
narraters parse --method ollama --model gemma4:e4b --prompt-version recall_parse_clause
narraters parse --filter-pattern sub-02            # process one subject only
```

| Option | Choices | Notes |
|---|---|---|
| `--method` | `rules`, `ollama` | `rules` is the default (regex, no model); `ollama` uses local Gemma |
| `--model` | e.g. `gemma4:e4b` | Ollama model tag (with `--method ollama`) |
| `--prompt-version` | see `scripts/prompt/recall_parse_*.txt` | Prompt template name |
| `-i, --input` | path | Input directory (default: `output/recall_corrected/`) |
| `-o, --output` | path | Output directory (default: `output/recall_parsed/`) |
| `--filter-pattern` | substring | Optional filter to process a single subject |
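
To give a feel for what regex-based clause splitting looks like, here is a deliberately small sketch. It is not the `rules` method shipped with narRaters (those rules live in the package and are not reproduced here); the boundary pattern and the `split_clauses` name are assumptions chosen for illustration.

```python
import re

# Split after sentence-ending punctuation, or at a comma/semicolon that is
# followed by a common conjunction. The conjunction stays with the next clause.
_BOUNDARY = re.compile(
    r"(?<=[.!?])\s+"
    r"|[,;]\s+(?=(?:and|but|so|then|because|while)\b)",
    flags=re.IGNORECASE,
)


def split_clauses(text):
    """Return a list of clause-like segments from free recall text."""
    parts = _BOUNDARY.split(text.strip())
    return [p.strip() for p in parts if p and p.strip()]


if __name__ == "__main__":
    for clause in split_clauses("She heard the siren, and then she ran. The door was locked."):
        print(clause)
```

A pattern this simple will mis-split abbreviations and miss clauses without punctuation, which is one reason a review step in the browser is useful after automated parsing.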

### Step 5 — `match` (recall segments ↔ story events)

```bash
narraters match --test-mode                       # simulated keyword matching, no model/API
narraters match --method api --story-events data/3_story_events
narraters match --method gemma-ollama
narraters match --method rmatch                   # embedding matcher (requires [match])
```

| Option | Choices | Notes |
|---|---|---|
| `--method` | `test`, `api`, `gemma-ollama`, `rmatch` | `test` is keyword-based, free, and always available; `rmatch` needs `pip install "narraters[match]"` |
| `--story-events` | path | Directory of `{story}_events.xlsx` (default: `data/3_story_events`) |
| `-i, --input` | path | Recall-parsed input directory (default: `output/recall_parsed/`) |
| `-o, --output` | path | Output directory (default: `output/recall_rated/`) |
| `--test-mode` | flag | Equivalent to `--method test` — simulated matching, no API calls |
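
As a rough intuition for keyword-based matching, the sketch below links a recall clause to every story event that shares at least one non-trivial word with it. This is an illustration of the general technique, not the `test` method's actual code; the stopword list, the `match_clause` helper, and the example sentences are all made up for this README.

```python
import re

STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "on", "was", "is", "it", "she", "he", "they"}


def keywords(text):
    """Lowercased word set with common function words removed."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}


def match_clause(clause, events, min_overlap=1):
    """Return 1-based event numbers sharing at least min_overlap keywords with the clause."""
    clause_words = keywords(clause)
    hits = []
    for number, event_text in enumerate(events, start=1):
        if len(clause_words & keywords(event_text)) >= min_overlap:
            hits.append(number)
    return hits


if __name__ == "__main__":
    story_events = [
        "The keeper climbed the lighthouse stairs.",
        "A storm rolled in from the sea.",
        "He lit the lamp.",
    ]
    print(match_clause("there was a big storm", story_events))
```

Exact word overlap misses paraphrases ("tempest" for "storm"), which is where the LLM and embedding methods, plus human review, come in.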

### Step 6 — `rate` (causal relationships between event pairs)

```bash
narraters rate --method linguistic
narraters rate --method api --model <model> --prompt-version causal_rating
narraters rate --method manual                    # write an empty matrix for hand rating
```

Use `narraters rate --help` and the Step 6 model dropdown in the web UI for supported `--model` values when using `--method api`.

| Option | Choices | Notes |
|---|---|---|
| `--method` | `linguistic`, `api`, `manual` | `linguistic` is rule-based (no model); `manual` scaffolds an N×N matrix to fill in by hand |
| `--model` | see web UI / provider docs | Only used with `--method api` |
| `--prompt-version` | see `scripts/prompt/causal_rating*.txt` | Prompt template name |
| `-i, --input` | path | Input file/directory |
| `-o, --output` | path | Output directory |
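
The N×N matrix that `manual` scaffolds can be pictured as a grid with one row per candidate cause and one column per candidate effect. The sketch below builds such a grid in memory and writes it as CSV text, using the 0–3 scale shown in the web UI. The real output is an Excel file whose exact layout is not described here, so treat the structure and the helper names as assumptions.

```python
import csv
import io


def empty_causal_matrix(n_events):
    """N x N grid of None; cell [i][j] is meant to hold 'event i+1 caused event j+1'."""
    return [[None] * n_events for _ in range(n_events)]


def set_rating(matrix, cause, effect, rating):
    """Store a 0-3 rating for a (cause, effect) pair given as 1-based event numbers."""
    if rating not in (0, 1, 2, 3):
        raise ValueError("rating must be 0, 1, 2, or 3")
    matrix[cause - 1][effect - 1] = rating


def to_csv(matrix):
    """Render the grid as CSV text with event numbers as row and column labels."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["cause_event"] + list(range(1, len(matrix) + 1)))
    for i, row in enumerate(matrix, start=1):
        writer.writerow([i] + ["" if value is None else value for value in row])
    return buffer.getvalue()


if __name__ == "__main__":
    grid = empty_causal_matrix(3)
    set_rating(grid, cause=1, effect=2, rating=3)
    print(to_csv(grid))
```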

---



## Library / Python use



```python
from narraters import __version__, project_root
print(__version__, project_root())
```

Direct per-step imports are planned for a future release; for now, programmatic use should call the CLI via `subprocess` or import the modules under `scripts/`.
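
One way to follow the `subprocess` suggestion is to assemble the argument list in a small function and run it with `subprocess.run`. The sketch below uses only flags documented in the Command-line pipeline section (`--method`, `-i`, `-o`); `build_command` and `run_step` are hypothetical wrapper names, not part of the package.

```python
import shutil
import subprocess


def build_command(step, method=None, input_path=None, output_path=None):
    """Assemble an argv list for `narraters <step>` from documented flags."""
    argv = ["narraters", step]
    if method:
        argv += ["--method", method]
    if input_path:
        argv += ["-i", str(input_path)]
    if output_path:
        argv += ["-o", str(output_path)]
    return argv


def run_step(step, **options):
    """Run one step; raises CalledProcessError if the CLI exits non-zero."""
    if shutil.which("narraters") is None:
        raise RuntimeError("narraters is not on PATH; install it or activate your venv")
    return subprocess.run(
        build_command(step, **options),
        check=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    print(" ".join(build_command("correct", method="rules", output_path="output/recall_corrected")))
```

Passing a list rather than a shell string avoids quoting problems with paths that contain spaces.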

---



## Project layout



After unzipping, everything lives under a single **`narRaters/`** project root. Paths, contents, and naming conventions:

### Folder structure

```text
narRaters/
├── README.md                    # This file — user guide & pipeline docs
├── CONTRIBUTING.md              # Research background, prompt templates, acknowledgements, author
├── LICENSE
├── narRater_Tutorial.pdf        # Illustrated web UI tour
├── narRater.app                 # macOS double-click launcher
├── narRaters_installer.bat      # Windows launcher
├── install.sh                   # macOS / Linux installer
├── pyproject.toml               # Package metadata & pip extras
├── SETUP_API.md, .env.example   # API key setup
│
├── data/                        # Inputs (see Where to put your data)
│   ├── 1_story_audio/           # Optional Step 1 — story audio
│   │   └── {story}.wav | .mp3 | .m4a
│   ├── 2_story_transcript/      # Story text
│   │   └── {story}.txt          # plain UTF-8, one story per file
│   ├── 3_story_events/          # Pre-segmented or segmented story events
│   │   └── {story}_events.xlsx  # columns: event, story_texts
│   ├── 4_recall_audio/          # Optional Step 1 — recall audio
│   │   └── {subj_id}.wav | .mp3 | .m4a | .mp4
│   └── 5_recall_texts/          # Recall text
│       └── {subj_id}.txt        # e.g. the_siren_sub-01.txt
│
├── output/                      # Pipeline outputs (one subfolder per step)
│   ├── story_audio-transcribed/ # Step 1 (story) — {story}.txt
│   ├── recall_audio-transcribed/# Step 1 (recall) — {subj_id}.txt
│   ├── recall_corrected/        # Step 3 — {subj_id}.txt
│   ├── recall_parsed/           # Step 4 — {subj_id}_parsed.xlsx
│   ├── recall_rated/            # Step 5 — {subj_id}_{method}.xlsx
│   └── causal_rated/            # Step 6 — {story}_causal-{method}.xlsx
│
├── scripts/                     # Pipeline backends (CLI & web UI call these)
│   ├── 1_audio-transcribe.py    # audioTranscribe
│   ├── 2_story-event-segment.py # eventSegment
│   ├── 3_spell-grammar-correct.py # sentenceCorrect
│   ├── 4_parse-texts.py         # textParsing
│   ├── 5_recall-rater.py        # textMatching
│   ├── 6_causal-rater.py        # causalRating
│   └── prompt/                  # LLM prompt templates (.txt)
│       ├── event_segment.txt
│       ├── spell_gram.txt
│       ├── recall_parse_clause.txt
│       ├── recall_rating.txt
│       └── causal_rating.txt
│
├── server/                      # Flask web UI
│   ├── web-interface.py         # Routes & subprocess orchestration
│   └── START_HERE.command       # macOS launcher script
│
├── templates/                   # Web UI HTML (pipeline, dashboard, subject/story)
├── static/                      # CSS, JS, app icon
│
├── src/narraters/               # pip package
│   ├── cli.py                   # narraters command entry point
│   ├── paths.py                 # Project-root resolution
│   └── runtime_install.py       # Bundled-example copy on first serve
│
├── helpers/                     # Shared utilities & smoke tests
│   ├── software_paths.py        # Canonical path resolution
│   ├── step_files.py            # Flexible step input/output file recognition
│   ├── resource_preflight.py    # RAM / disk checks for heavy methods
│   └── test_*.py                # Pipeline validation scripts
│
├── docs/                        # GitHub Pages site & README assets
│   ├── index.html               # Project landing page
│   └── screenshots/             # README GIFs (+ recall-matching.png for site og:image)
│
├── demo/                        # Smaller lighthouse example
│   ├── data/                    # the_lighthouse transcript + recall texts
│   └── output/                  # Sample outputs for the demo story
│
├── developer/                   # Contributor handbook & tooling
│   ├── README.md                # Per-step I/O contracts & design principles
│   └── SETUP_API.md             # API key setup (developer copy)
│
└── packaging/macos/             # App bundle / DMG build scripts
    └── build_app_bundle.sh
```

Bundled examples: **`pieman_edited`**, **`the_siren`** — see [Example input/output data](#example-inputoutput-data).

**Versioning:** automated files use `{id}_{method}.ext`; hand-edited exports use `{id}_{ratername}-edit.ext` (never overwritten).

---



## Further reading



- **[Project home (GitHub Pages)](https://xianneuro.github.io/narRaters/)** — landing page for search and sharing.

- **[`narRater_Tutorial.pdf`](narRater_Tutorial.pdf)** — illustrated, click-by-click tour of the web UI; good next step after [Installation](#installation).

- **[`SETUP_API.md`](SETUP_API.md)** — API keys for Anthropic, OpenAI, and Hugging Face; which pipeline steps need which.

- **[`scripts/prompt/README.md`](scripts/prompt/README.md)** — prompt template conventions for LLM-backed methods.

---



## Citation



If you use narRaters in research, please cite the archived release.

**Reference list (APA 7):**

> Li, X. (2026). *narRaters: Naturalistic narratives processing platform* (Version 0.3.14) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.20486080

Replace the version number with the release you used (see [Zenodo](https://doi.org/10.5281/zenodo.20486080) for the latest).

**Examples in a manuscript:**

*Methods — in-text:*

> Narrative recall data were processed with narRaters (Li, 2026).

*Methods — first mention (optional):*

> We used narRaters (Li, 2026), an open-source pipeline for transcribing, segmenting, parsing, matching, and rating narrative recall data, with human review at each step.

*Software / code availability:*

> narRaters (Version 0.3.14) is available at https://doi.org/10.5281/zenodo.20486080.

*Data processing statement:*

> Story events, parsed recall clauses, recall-to-event matches, and causal ratings were produced with narRaters (Li, 2026; https://doi.org/10.5281/zenodo.20486080).

---



## Acknowledgements



- **Janice Chen** for brainstorming the causal-rating step interface and for help testing and improving package functionality.

- **Gabi Kressin Palacios** and **Dhruva Arekar** for an additional method for the recall-matching step (matching human recall text to story events). See [GabrielKP/rMatch](https://github.com/GabrielKP/rMatch) for human-data–validated AI-assisted recall rating.

- **Xiyu Li (Rita)** for contributions to the `recall_rating` prompt development and for validating model performance on human recall data (commercial LLM APIs were close to human raters).

- **Sebastian Michelmann** for feedback on the event-segmentation step (see [Michelmann et al., 2023](https://arxiv.org/abs/2301.10297)).

- **Colette Youstra** and **[Quinton Covington](https://qcovington.com)** for testing the app's manual-rating functions.

- **Samira Tavassoli** and **Yuye Huang** for help testing the app's segmentation and causal-reasoning functions.

---



## Author



**[Xian Li](https://www.xian-li.com)** — [xianl.cogneuro@gmail.com](mailto:xianl.cogneuro@gmail.com)

---



## License



See **[LICENSE](LICENSE)** — **narRaters Research and Non-Commercial License**. Free for research, education, and other non-commercial use; commercial or for-profit use requires prior written permission. Contact [xianl.cogneuro@gmail.com](mailto:xianl.cogneuro@gmail.com) for commercial licensing.