https://github.com/xianneuro/narraters

Human-in-the-loop pipeline and web UI for narrative recall analysis (transcribe, segment, correct, parse, match, causal rate).

narRaters

Turn complex narratives into structured, reviewable data, with a web UI at every step

GitHub: github.com/xianNeuro/narRaters · PyPI: v0.3.14



**narRaters** (*narrative* + *raters*) is open-source software on [GitHub (xianNeuro/narRaters)](https://github.com/xianNeuro/narRaters) that helps process complex narratives (e.g., audiobooks, text-based stories, interviews, and conversations) for memory, language-processing, causal-reasoning, and LLM research.

Imagine you ran a memory study: participants listened to a story, then recalled what they remembered (spoken or typed). Before you can analyze memory, you need structured data: what happened in the story, what each person recalled, and how those pieces connect.

**narRaters** helps you get there. It runs common narrative-processing steps (transcribe audio, split a story into events, clean up recall text, parse recalls into clauses, match recalls back to story events, rate causal links between events) and gives you a web interface to review and fix outputs before exporting.

Works for audio or text, stories or other long narratives (including movie annotations), and human-only or human-vs-LLM workflows.

| You have… | narRaters helps you… |
|---|---|
| Story audio or transcript | Transcribe it and break it into numbered **events** |
| Participant recall files | Correct spelling, split into **clauses**, and **match** each clause to story events |
| A segmented story | **Rate causal links** between event pairs (did event A lead to event B?) |
| Automated or AI outputs | **Screen and edit** them in the browser, then export signed-off files |


Typical workflow: story side (transcribe, segment, causal rate) and recall side (correct, parse, match to story)

---

## Get started in 3 steps


1. **Download & open.** Get the v0.3.14 ZIP, unzip, and double-click `narRater.app` (macOS) or `narRaters_installer.bat` (Windows). Needs Python 3.10+.
2. **Pick your pipeline.** Your browser opens to the pipeline builder. Drag in only the steps you need (e.g. segment → match → causal rate). Bundled demo data is already loaded so you can explore immediately.
3. **Run, review, export.** On the dashboard, click a cell to run a step. Open the magnifying-glass icon to inspect results, edit in the browser, and export when you are satisfied.

Or via terminal (Python 3.10+; no ZIP download). Run from the folder that contains your data/ and output/ directories (or set NARRATERS_PROJECT_ROOT to that path):

```bash
python3 --version # must show 3.10 or newer
python3 -m pip install narraters --upgrade # wait for "Successfully installed"
cd /path/to/your/project # folder with data/ and output/
narraters serve # browser opens to the pipeline builder
```

Then continue with **steps 2–3** above: pick your pipeline, run steps on the dashboard, review, and export.

> **First time?** Follow the illustrated **[Tutorial PDF](narRater_Tutorial.pdf)** or see [Installation](#installation) and [Troubleshooting](#troubleshooting).

---

## See the app


*[Animation: building a pipeline, the dashboard status grid, and rating causal links between story events]*

① Build a pipeline → ② Dashboard → ③ Rate causal links

**① Pipeline dashboard**

*[Animation: tour of the pipeline dashboard status grid]*

See every subject/story, run steps, and open results. Green = done; click a cell to process.

**② Event segmentation**

*[Animation: segmenting a story into events by placing boundary bars]*

Move the cursor through the text and click to drop boundary bars. Toggle binary or 1–5 strength (bar colored blue→red).

**③ Recall matching**

*[Animation: linking recall segments to story events]*

Story events on the left; recall segments on the right. Click a segment, then click events to match, or type event numbers. Optionally turn on Further ratings for per-segment quality checkboxes.

**④ Causal rating**

*[Animation: tour of the causal rating grid]*

Click a grid cell to rate how strongly one story event caused another (0–3 scale).

---

## Table of contents

> On GitHub, **README**, **Contributing** (research background, prompt templates, acknowledgements, author), and **License** are the tabs in the bar above. Use this table of contents or the **Outline** menu (list icon, top-right) to jump between README sections.

---

## Installation

Needs **[Python 3.10+](https://www.python.org/downloads/)**. Windows: check **“Add python.exe to PATH”** in the Python installer. If anything fails, see **[Troubleshooting](#troubleshooting)**.

### ZIP download (double-click launcher)

1. **[Download the ZIP (v0.3.14)](https://github.com/xianNeuro/narRaters/archive/refs/tags/v0.3.14.zip)** and unzip it, or on the [GitHub repo page](https://github.com/xianNeuro/narRaters) use green **Code ▾** → **Download ZIP** for the current `main` branch. You'll get **`narRaters-0.3.14`**, **`narRaters-main`**, or **`narRaters`** (if you used `git clone`).
2. **Launch:** **macOS**: double-click **`narRater.app`**. **Windows**: double-click **`narRaters_installer.bat`**. **Linux**: in Terminal, `cd` into the folder and run `bash install.sh`.
3. Your browser opens **`http://127.0.0.1:5000/pipeline-config`** with bundled examples. Put your data in **`data/`**. Restart later by double-clicking the same launcher.

**macOS Gatekeeper or quarantine issues?** See **[Troubleshooting](#troubleshooting)**.

### PyPI (terminal)

```bash
python3 --version # must show 3.10 or newer
python3 -m pip install narraters --upgrade # use python3 -m pip, not bare pip
narraters serve # browser opens to the pipeline builder
```

On first launch, example **`data/`** and **`output/`** folders are copied into whatever directory you run from (unless you already have a project folder, or set **`NARRATERS_PROJECT_ROOT`**). Package: [`narraters`](https://pypi.org/project/narraters/) (all lowercase). For launchers and the tutorial PDF, use the ZIP install above.
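
The project-root rule described above can be pictured with a short Python sketch. It assumes that `NARRATERS_PROJECT_ROOT`, when set, takes precedence over the directory you run from; the helper names `resolve_project_root` and `missing_project_dirs` are hypothetical, and the package's own logic (in `src/narraters/paths.py`) may differ.

```python
import os
from pathlib import Path


def resolve_project_root(env=None, cwd=None):
    """Hypothetical helper: pick the folder expected to hold data/ and output/.

    Assumption (not confirmed by the package): NARRATERS_PROJECT_ROOT,
    when set and non-empty, wins over the current working directory.
    """
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)
    override = env.get("NARRATERS_PROJECT_ROOT", "").strip()
    return Path(override).expanduser() if override else cwd


def missing_project_dirs(root):
    """Return which of data/ and output/ do not exist under root."""
    return [name for name in ("data", "output") if not (Path(root) / name).is_dir()]
```

Running `missing_project_dirs(resolve_project_root())` before `narraters serve` is one way to check whether you are in the folder you intended.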

**Full PyPI setup (venv from scratch)**

```bash
python3 --version # must be 3.10 or newer
mkdir -p ~/narRaters-demo && cd ~/narRaters-demo
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
python3 -m pip install --upgrade pip
python3 -m pip install narraters --upgrade
narraters serve
```

### Alternate install (command line)

**git clone + install.sh (project folder with bundled examples)**

```bash
# macOS / Linux
cd ~ && git clone https://github.com/xianNeuro/narRaters.git && cd narRaters && bash install.sh
```

```bat
:: Windows
cd %USERPROFILE% && git clone https://github.com/xianNeuro/narRaters.git && cd narRaters && narRaters_installer.bat
```

This is what `narRater.app` does under the hood, just without the click. `git: command not found`? On macOS: `xcode-select --install`. On Windows: install [Git for Windows](https://git-scm.com/download/win).

**Optional extras (Whisper, cloud APIs, local Gemma, etc.)**

Inside the project folder, with the venv activated:

```bash
python3 -m pip install -e ".[audio]" # Whisper transcription
python3 -m pip install -e ".[api]" # Anthropic / OpenAI
python3 -m pip install -e ".[nlp]" # spaCy segmentation
python3 -m pip install -e ".[grammar]" # grammar checker
python3 -m pip install -e ".[local-llm]" # local Gemma
python3 -m pip install -e ".[match]" # rmatch
python3 -m pip install -e ".[all]" # api + match
```

PyPI users: `python3 -m pip install "narraters[audio]"`, etc.

Heavy methods (`audio`, `local-llm`, `match`) pull multi-GB packages; the app shows a RAM/disk preflight before downloading. **Ollama (local Gemma):** install [Ollama](https://ollama.com), then `ollama pull gemma4:e4b`. **API keys:** copy `.env.example` to `.env` and edit (see [`SETUP_API.md`](SETUP_API.md)).

**Developers**

`install.sh` already does an editable install. To work on the codebase:

```bash
git clone https://github.com/xianNeuro/narRaters.git
cd narRaters
python3 -m venv .venv && source .venv/bin/activate && python3 -m pip install -e .
```

Build the standalone macOS app for icon testing: `bash packaging/macos/build_app_bundle.sh`. Build the slim repo-root launcher: `bash packaging/macos/build_repo_app.sh`.

### Using the web UI

The app runs at **`http://127.0.0.1:5000`**. First visit opens **pipeline configuration**; if you already saved a pipeline, you land on the **dashboard**.

| Screen | Route | What you do there |
|--------|--------|-------------------|
| **Pipeline setup** | `/pipeline-config` | Drag steps into **Pipeline Flow**, set per-step **folders**, enter a **rater name** (or 🎲). **Continue** saves config and opens the dashboard. |
| **Dashboard** | `/` | Grid: **rows** = subjects or stories, **columns** = steps. **Click a cell** to run that step (pick **method / model / prompt** when offered). **Batch** runs one step across all rows. |
| **Detail view** | `/subject/…` or `/story/…` | **Tabs** per step for **one** row. Use the **version** dropdown to compare automated output vs your **`{id}_{ratername}-edit`** saves, then **edit** and **save**. |

**Flow:** setup → dashboard (bulk runs) → open a row to **inspect, hand-correct, or compare versions**.

**`narraters serve` options**

| Flag | Default | Purpose |
|---|---|---|
| `--port` | `5000` | Another port if `5000` is busy |
| `--host` | `127.0.0.1` | Bind address; use `0.0.0.0` only on a **trusted** network |
| `--no-browser` | off | Do not open a browser tab (SSH, headless) |
| `--debug` | off | Flask debug / auto-reload while hacking on the server |

```bash
narraters serve --port 8080 --no-browser
```
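
If you are unsure which port to pass to `--port`, a small probe can find a free one first. This is a minimal sketch and not the installer's own logic: `first_free_port` is a hypothetical helper, and the 5000–5010 range simply mirrors the ports mentioned in this README.

```python
import socket


def first_free_port(host="127.0.0.1", candidates=range(5000, 5011)):
    """Return the first port in `candidates` that can be bound on `host`, or None."""
    for port in candidates:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                continue  # port busy (or not permitted); try the next one
            return port  # the probe socket is closed when the `with` block exits
    return None
```

The returned number can then be passed on the command line, for example `narraters serve --port 5001`. Another process could still claim the port between the probe and the launch, so treat the result as a hint.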

Before a step would load **Whisper**, **Gemma via Ollama**, **rMatch**, or other heavy local models, the UI runs a **RAM / disk preflight** and may suggest a lighter method (`rules`, `test`, `clause`) if the run looks unsafe for your machine.

### Troubleshooting

| If you see… | Do this |
|--------------|--------|
| `Python 3.10+ required` | Install [Python 3.10+](https://www.python.org/downloads/), close and reopen any Terminal, run again. |
| Blank page on `localhost:5000` | Visit **`http://127.0.0.1:5000/pipeline-config`** instead (IPv6/IPv4 quirk on some Macs). |
| **macOS:** Gatekeeper / “cannot check for malicious software” / no **Open** in the right-click menu | **1.** In **Finder**, try **control-click** **`narRater.app`** → **Open**, then confirm **Open** if the dialog offers it (see [Apple’s Gatekeeper overrides](https://support.apple.com/guide/mac-help/mh40617/mac)). **2.** If that path is missing or still blocks: **System Settings** → **Privacy & Security** → scroll to **Security**. After a failed launch, macOS often shows **`narRater` was blocked** (wording varies) with **Allow Anyway** or **Open Anyway**; click it, enter your password, then launch **`narRater.app`** again (that button may only appear for a limited time after the block). **3.** Downloaded folder still quarantined: in Terminal, `xattr -dr com.apple.quarantine /path/to/narRaters-main`, then try **1** or **2** again. |
| **macOS:** “narRater couldn't find the narRaters project folder” | macOS **App Translocation** ran the app from a temp copy. Run `xattr -dr com.apple.quarantine ~/Downloads/narRaters-main` (adjust path) and double-click again, or use [git clone install](#alternate-install-command-line). |
| **Windows:** SmartScreen warns about `narRaters_installer.bat` | Click **More info** → **Run anyway**. |
| Port 5000 already in use | The installer auto-tries 5001–5010 and prints the URL. To free 5000: macOS → System Settings → General → AirDrop & Handoff → turn off **AirPlay Receiver**. |

---

## Where to put your data

After [installation](#installation), place files so the paths match what you configured on the **pipeline** page (defaults below are relative to the **project root**). You can **remap** any step’s input/output folders there without moving data.

| You have… | Put it in… | Format / naming |
|---|---|---|
| Story transcript (text) | `data/2_story_transcript/` | `{story}.txt`: plain UTF-8 text, one story per file |
| Story event list (pre-segmented) | `data/3_story_events/` | `{story}_events.xlsx` with columns `event`, `story_texts` |
| Subject recall text | `data/5_recall_texts/` | `{subj_id}.txt`, e.g. `the_siren_sub-01.txt` |
| Story audio (optional, Step 1) | `data/1_story_audio/` | `.wav` / `.mp3` / `.m4a`, named by story |
| Recall audio (optional, Step 1) | `data/4_recall_audio/` | `.wav` / `.mp3` / `.m4a`, named by subject |

Outputs are written under `output/`, one subdirectory per step (`output/recall_corrected/`, `output/recall_parsed/`, `output/recall_rated/`, …). A smaller alternate layout lives in **`demo/data/`** (lighthouse story, three recall `.txt` files).
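
Before opening the app, it can help to confirm that your files sit where the defaults expect them. The sketch below is only an illustration: `DEFAULT_INPUTS` and `scan_inputs` are hypothetical names, and the folder and extension lists are copied from the table above rather than read from the package.

```python
from pathlib import Path

# Default input folders and the extensions the table above lists for each.
DEFAULT_INPUTS = {
    "data/1_story_audio": {".wav", ".mp3", ".m4a"},
    "data/2_story_transcript": {".txt"},
    "data/3_story_events": {".xlsx"},
    "data/4_recall_audio": {".wav", ".mp3", ".m4a"},
    "data/5_recall_texts": {".txt"},
}


def scan_inputs(project_root):
    """Map each default input folder to the sorted file names found in it."""
    root = Path(project_root)
    found = {}
    for folder, extensions in DEFAULT_INPUTS.items():
        directory = root / folder
        names = []
        if directory.is_dir():
            names = sorted(
                p.name
                for p in directory.iterdir()
                if p.is_file() and p.suffix.lower() in extensions
            )
        found[folder] = names  # empty list when the folder is missing or empty
    return found
```

Calling `scan_inputs(".")` from the project root returns a dictionary you can print to spot empty folders or files with unexpected extensions. If you remapped folders on the pipeline page, adjust the keys accordingly.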

### Example input/output data

The repository ships **realistic sample inputs and outputs** under `data/` and `output/` so you can see accepted naming and file types before adding your own study. Your private files in those folders stay untracked (see `.gitignore`); only the examples below are committed.

**Stories:** **`pieman_edited`** (story audio + transcript + events) and **`the_siren`** (transcript, events, two recall subjects).

| Role | Folder | Example file(s) |
|------|--------|-----------------|
| Story audio (input) | `data/1_story_audio/` | `pieman_edited.wav` |
| Story transcript (input) | `data/2_story_transcript/` | `pieman_edited.txt`, `the_siren.txt` |
| Story events (input) | `data/3_story_events/` | `pieman_edited_events.xlsx`, `the_siren_events.xlsx` |
| Recall audio (input) | `data/4_recall_audio/` | Your own `.wav` / `.mp3` / `.m4a` / `.mp4` (not shipped publicly) |
| Recall text (input) | `data/5_recall_texts/` | `the_siren_sub-01.txt`, `the_siren_sub-02.txt` |
| Story transcription (output) | `output/story_audio-transcribed/` | `pieman_edited.txt` |
| Recall transcription (output) | `output/recall_audio-transcribed/` | `the_siren_sub-01.txt`, `the_siren_sub-02.txt` |
| Spell/grammar correction (output) | `output/recall_corrected/` | `the_siren_sub-01.txt`, `the_siren_sub-02.txt` |
| Parsed recall (output) | `output/recall_parsed/` | `the_siren_sub-01_parsed.xlsx`, `the_siren_sub-02_parsed.xlsx` |
| Recall ↔ events (output) | `output/recall_rated/` | `the_siren_sub-02_rate-recall-test_mode.xlsx` (method slug in filename) |
| Causal ratings (output) | `output/causal_rated/` | `pieman_edited_causal-linguistic.xlsx`, `the_siren_causal-linguistic.xlsx` |

**Quick try:** after install, point a pipeline at the default folders above and run **`sentenceCorrect` → `textParsing` → `textMatching`** on `the_siren_sub-01` / `the_siren_sub-02`, or open the bundled **`output/`** files in Excel to inspect column layouts. Story **`pieman_edited`** is useful for **`audioTranscribe`** (large `.wav`) and **`causalRating`** on `pieman_edited_events.xlsx`.

**File versioning is a core feature.** Automated runs write `{subj_id}_{method}.ext` (or `{story}_…` for story-level steps); your hand-edited versions are saved as `{subj_id}_{ratername}-edit.ext` and never overwrite the originals. The web UI lets you switch between versions via a dropdown, and the `-edit` files are what you export for analysis.
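
The naming convention above is easy to reproduce in your own analysis scripts. The following sketch uses hypothetical helper names and a made-up rater name (`alex`); it only mirrors the two filename patterns and is not taken from the narRaters code.

```python
from pathlib import Path


def automated_name(item_id, method, ext):
    """File name for an automated run, e.g. the_siren_sub-01_rules.txt."""
    return f"{item_id}_{method}.{ext.lstrip('.')}"


def edited_name(item_id, rater, ext):
    """File name for a hand-edited save, e.g. the_siren_sub-01_alex-edit.txt."""
    return f"{item_id}_{rater}-edit.{ext.lstrip('.')}"


def is_hand_edit(path):
    """True when the file stem ends with the '-edit' marker."""
    return Path(path).stem.endswith("-edit")


def hand_edits_in(folder):
    """Sorted names of the hand-edited files in a step's output folder."""
    return sorted(p.name for p in Path(folder).iterdir() if p.is_file() and is_hand_edit(p))
```

Because the `-edit` suffix never matches an automated name, filtering with `is_hand_edit` is a simple way to collect only the reviewed files for analysis.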

---

## Pipeline overview

**Six optional steps: use any subset, in any order.** Each step can run automatically (rules, local models, or cloud APIs) and then be reviewed in the browser.

| Plain English | Step ID | Input → output (typical) |
|---|---|---|
| Transcribe audio | **`audioTranscribe`** | audio file → text transcript |
| Split story into events | **`eventSegment`** | story transcript → numbered event list |
| Fix recall spelling/grammar | **`sentenceCorrect`** | raw recall text → corrected text |
| Split recall into clauses | **`textParsing`** | corrected recall → clause segments |
| Match recall to story | **`textMatching`** | recall segments + story events → rated matches |
| Rate event causality | **`causalRating`** | story events → cause–effect ratings |

**Full step reference (commands & folders)**

In typical recall work, **`audioTranscribe`** / **`eventSegment`** target the **story**, **`sentenceCorrect`**–**`textMatching`** each **subject recall**, and **`causalRating`** the **story event list**. Text-only projects skip Step 1, and you can equally run just **`eventSegment` + `causalRating`** or **`sentenceCorrect` → `textParsing` → `textMatching`**. Every step is available from the **GUI** or **`narraters` CLI**, has a lightweight default method, and supports hand-editing afterward.

| # | Step ID | What it does | Terminal command | Default in / out |
|---|---------|--------------|------------------|------------------|
| 1 | **`audioTranscribe`** | Audio recordings → text (Whisper/WhisperX); story vs recall via `audioScope` or `--kind` | `narraters transcribe` | `data/4_recall_audio/` (or `data/1_story_audio/` with `--kind story`) → `output/*_audio-transcribed/` |
| 2 | **`eventSegment`** | Story transcript → numbered events | `narraters segment` | `data/2_story_transcript/` → `data/3_story_events/` |
| 3 | **`sentenceCorrect`** | Fix spelling/grammar in recall text (no rewriting) | `narraters correct` | `data/5_recall_texts/` → `output/recall_corrected/` |
| 4 | **`textParsing`** | Corrected recall → clause-level segments | `narraters parse` | `output/recall_corrected/` → `output/recall_parsed/` |
| 5 | **`textMatching`** | Recall segments ↔ story events | `narraters match` | `output/recall_parsed/` + `data/3_story_events/` → `output/recall_rated/` |
| 6 | **`causalRating`** | Causal strength of every story-event pair | `narraters rate` | `data/3_story_events/` → `output/causal_rated/` |

For each step, the GUI runs the same backends as the CLI. **Available methods, flags, and examples** are under **[Command-line pipeline](#command-line-pipeline)** below.
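
When you run only a subset of steps, it is useful to know which default folders you must fill yourself and which are produced by an earlier step. The sketch below encodes the step reference table as a plain dictionary; `STEPS` and `plan` are hypothetical names, and the defaults are copied from this README (with the recall variant used for `audioTranscribe`), not queried from the tool.

```python
# Step ID -> (CLI subcommand, default input folders, default output folder),
# copied from the step reference table above.
STEPS = {
    "audioTranscribe": ("transcribe", ["data/4_recall_audio"], "output/recall_audio-transcribed"),
    "eventSegment": ("segment", ["data/2_story_transcript"], "data/3_story_events"),
    "sentenceCorrect": ("correct", ["data/5_recall_texts"], "output/recall_corrected"),
    "textParsing": ("parse", ["output/recall_corrected"], "output/recall_parsed"),
    "textMatching": ("match", ["output/recall_parsed", "data/3_story_events"], "output/recall_rated"),
    "causalRating": ("rate", ["data/3_story_events"], "output/causal_rated"),
}


def plan(step_ids):
    """Return (commands, folders the user must supply) for a chain of steps."""
    produced = set()
    commands, needed = [], []
    for step_id in step_ids:
        subcommand, inputs, output = STEPS[step_id]
        commands.append(f"narraters {subcommand}")
        for folder in inputs:
            # An input counts as "needed" unless an earlier step writes to it.
            if folder not in produced and folder not in needed:
                needed.append(folder)
        produced.add(output)
    return commands, needed
```

For example, `plan(["sentenceCorrect", "textParsing", "textMatching"])` reports that only `data/5_recall_texts` and `data/3_story_events` have to exist before you start.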

---

## Command-line pipeline

Each of the six steps is a separate **`narraters`** subcommand with its own **`--method`** (and related options). Use the CLI for **scripts**, **clusters**, or **reproducible** runs, **with or without** the web UI, and **with any subset** of steps your study uses. General shape:

```
narraters <step> [--method METHOD] [--model MODEL] [-i INPUT] [-o OUTPUT] [--prompt-version VERSION] ...
```

Discover what's available at any time:

```bash
narraters --help # list all subcommands
narraters <step> --help # step-specific options
narraters segment --list-prompts # list available prompt versions for a step
narraters segment --list-models # list supported model identifiers
```

The method choices below are exactly those accepted by the CLI (`src/narraters/cli.py`).

### Step 1: `transcribe` (audio → text)

```bash
narraters transcribe --model large-v3 --timestamps # recall audio (default)
narraters transcribe --kind story --model small # story audio instead
narraters transcribe -i path/to/audio -o path/to/out # custom directories
narraters transcribe --filter sub-01 # one item only
```

| Option | Choices | Notes |
|---|---|---|
| `--model` | `tiny`, `base`, `small`, `medium`, `large-v2`, `large-v3` | Whisper model name |
| `--timestamps` | flag | Also write Excel files with word-level timestamps |
| `--kind` | `recall` (default), `story` | Picks the conventional directories: `recall` = `data/4_recall_audio/` → `output/recall_audio-transcribed/`; `story` = `data/1_story_audio/` → `output/story_audio-transcribed/` |
| `-i, --input` | path | Input audio directory (overrides the `--kind` default) |
| `-o, --output` | path | Output directory (overrides the `--kind` default) |
| `--filter` | substring | Only transcribe files whose name matches this item id |

Requires `pip install "narraters[audio]"` (or `pip install -e ".[audio]"` from a clone). Text-only projects can skip Step 1 entirely.

### Step 2: `segment` (story → events)

```bash
narraters segment --method clause
narraters segment --method api --model <model> --prompt-version event_segment
narraters segment --method fine --input data/2_story_transcript/my_story.txt
```
Run `narraters segment --list-models` for the exact `--model` strings (Anthropic, OpenAI, and Ollama-backed presets).

| Option | Choices | Notes |
|---|---|---|
| `--method` | `clause`, `fine`, `coarse`, `api` | `clause` needs no model; `fine`/`coarse` use spaCy if installed; `api` calls an LLM |
| `--model` | see `narraters segment --list-models` | Only used with `--method api` (Anthropic, OpenAI, or Ollama preset keys) |
| `--prompt-version` | see `--list-prompts` | Selects a template from `scripts/prompt/event_segment*.txt` |
| `-i, --input` | path | Single transcript file or a directory (else processes all) |
| `-o, --output` | path | Output directory (default: `data/3_story_events/`) |

### Step 3: `correct` (spell / grammar fixes)

```bash
narraters correct --method rules
narraters correct --method gemma-ollama --ollama-model gemma4:e4b
```

| Option | Choices | Notes |
|---|---|---|
| `--method` | `rules`, `gemma-ollama` | `rules` runs entirely locally with no model; `gemma-ollama` needs a local Ollama server |
| `--ollama-model` | e.g. `gemma4:e4b` | Local Ollama model tag (with `gemma-ollama`) |
| `--prompt-file` | path | Override the instructions file (default: `scripts/prompt/spell_gram.txt`) |
| `-i, --input` | path | Single recall text file |
| `-o, --output` | path | Output directory |

Minimal corrections only: Step 3 fixes spelling/grammar errors and never rewrites or paraphrases.
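
If you want an independent check that a corrected file stayed close to the original, a word-level similarity ratio is one option. This is a sketch using Python's standard `difflib`, not part of narRaters; the function names and the 0.8 threshold are assumptions you would tune on your own data.

```python
from difflib import SequenceMatcher


def word_similarity(original, corrected):
    """Ratio in [0, 1] of matching words between two texts (1.0 = identical)."""
    return SequenceMatcher(None, original.split(), corrected.split()).ratio()


def looks_rewritten(original, corrected, threshold=0.8):
    """Flag a correction whose word overlap falls below a hypothetical threshold."""
    return word_similarity(original, corrected) < threshold
```

A single fixed typo in a sentence keeps the ratio high, while a paraphrase that swaps most words drops it sharply, so flagged files are good candidates for review in the browser.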

### Step 4: `parse` (recall text → clause-level segments)

```bash
narraters parse --method rules
narraters parse --method ollama --model gemma4:e4b --prompt-version recall_parse_clause
narraters parse --filter-pattern sub-02 # process one subject only
```

| Option | Choices | Notes |
|---|---|---|
| `--method` | `rules`, `ollama` | `rules` is the default (regex, no model); `ollama` uses local Gemma |
| `--model` | e.g. `gemma4:e4b` | Ollama model tag (with `--method ollama`) |
| `--prompt-version` | see `scripts/prompt/recall_parse_*.txt` | Prompt template name |
| `-i, --input` | path | Input directory (default: `output/recall_corrected/`) |
| `-o, --output` | path | Output directory (default: `output/recall_parsed/`) |
| `--filter-pattern` | substring | Optional filter to process a single subject |
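
To give a feel for what regex-based clause splitting involves, here is a deliberately small illustration. It is not the pattern set used by the `rules` method, which this README does not spell out; the boundary pattern and the `split_clauses` name are assumptions made for the example.

```python
import re

# Split after sentence-final punctuation, or at a comma/semicolon that is
# followed by a conjunction such as "and" or "then". Illustrative pattern only.
_BOUNDARY = re.compile(
    r"(?<=[.!?])\s+|\s*[,;]\s*(?=(?:and|but|so|then|because)\b)",
    flags=re.IGNORECASE,
)


def split_clauses(text):
    """Return the non-empty clause strings found in a recall text."""
    parts = _BOUNDARY.split(text.strip())
    return [part.strip() for part in parts if part and part.strip()]
```

Real recall transcripts contain fillers, restarts, and missing punctuation, which is why the parsed output is meant to be reviewed and edited in the web UI.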

### Step 5: `match` (recall segments ↔ story events)

```bash
narraters match --test-mode # simulated keyword matching, no model/API
narraters match --method api --story-events data/3_story_events
narraters match --method gemma-ollama
narraters match --method rmatch # embedding matcher (requires [match])
```

| Option | Choices | Notes |
|---|---|---|
| `--method` | `test`, `api`, `gemma-ollama`, `rmatch` | `test` is keyword-based, free, and always available; `rmatch` needs `pip install "narraters[match]"` |
| `--story-events` | path | Directory of `{story}_events.xlsx` (default: `data/3_story_events`) |
| `-i, --input` | path | Recall-parsed input directory (default: `output/recall_parsed/`) |
| `-o, --output` | path | Output directory (default: `output/recall_rated/`) |
| `--test-mode` | flag | Equivalent to `--method test`: simulated matching, no API calls |
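
Keyword-based matching can be pictured as counting shared content words between a recall clause and each story event. The sketch below is a simplified illustration of that idea, not the `test` method's actual code; the stop-word list, function names, and scoring rule are all assumptions.

```python
import re

# Tiny illustrative stop-word list; a real matcher would use a fuller one.
_STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "on", "he", "she", "it", "was", "then"}


def keywords(text):
    """Lower-cased words of a text, minus the stop words above."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in _STOPWORDS}


def match_clause(clause, events, min_overlap=1):
    """Return event numbers ranked by how many keywords they share with the clause.

    `events` maps an event number to its text, like the `event` and
    `story_texts` columns of a story events file.
    """
    clause_words = keywords(clause)
    scored = []
    for number, event_text in events.items():
        overlap = len(clause_words & keywords(event_text))
        if overlap >= min_overlap:
            scored.append((overlap, number))
    # Highest overlap first; ties broken by event number.
    return [number for overlap, number in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

Exact word overlap misses synonyms and inflections ("heard" vs "hears"), which is one reason the embedding and LLM methods exist and why matches remain editable in the browser.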

### Step 6: `rate` (causal relationships between event pairs)

```bash
narraters rate --method linguistic
narraters rate --method api --model <model> --prompt-version causal_rating
narraters rate --method manual # write an empty matrix for hand rating
```
Use `narraters rate --help` and the Step 6 model dropdown in the web UI for supported `--model` values when using `--method api`.

| Option | Choices | Notes |
|---|---|---|
| `--method` | `linguistic`, `api`, `manual` | `linguistic` is rule-based (no model); `manual` scaffolds an N×N matrix to fill in by hand |
| `--model` | see web UI / provider docs | Only used with `--method api` |
| `--prompt-version` | see `scripts/prompt/causal_rating*.txt` | Prompt template name |
| `-i, --input` | path | Input file/directory |
| `-o, --output` | path | Output directory |
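
The N×N matrix scaffolded by `--method manual` can be thought of as a grid with one cell per ordered event pair, rated on the 0–3 scale used in the web UI. The sketch below is an in-memory illustration only: the helper names, the row-as-cause and column-as-effect orientation, and the skipped diagonal are assumptions, and the real output is an `.xlsx` file whose layout may differ.

```python
VALID_RATINGS = (0, 1, 2, 3)  # the 0-3 scale used by the causal rating grid


def empty_causal_matrix(n_events):
    """Build an N x N grid of None, where None means 'not rated yet'."""
    return [[None] * n_events for _ in range(n_events)]


def set_rating(matrix, cause, effect, rating):
    """Record how strongly event `cause` led to event `effect` (1-based numbers)."""
    if rating not in VALID_RATINGS:
        raise ValueError(f"rating must be one of {VALID_RATINGS}")
    matrix[cause - 1][effect - 1] = rating


def unrated_pairs(matrix):
    """List (cause, effect) pairs still unrated, skipping the diagonal."""
    n = len(matrix)
    return [
        (row + 1, col + 1)
        for row in range(n)
        for col in range(n)
        if row != col and matrix[row][col] is None
    ]
```

A story with N events has N × (N − 1) ordered pairs off the diagonal, so the rating workload grows quickly with the number of events.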

---

## Library / Python use

```python
from narraters import __version__, project_root
print(__version__, project_root())
```

Direct per-step imports are planned for a future release; for now, programmatic use should call the CLI via `subprocess` or import the modules under `scripts/`.
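
A minimal sketch of the `subprocess` route follows. It assumes `narraters` is installed and on your `PATH`; `build_command` and `run_steps` are hypothetical helpers, and they only use the subcommands and the `--method`, `-i`, and `-o` flags documented in the tables above.

```python
import shutil
import subprocess


def build_command(step, method=None, input_path=None, output_path=None):
    """Assemble the argument list for one narraters subcommand."""
    command = ["narraters", step]
    if method:
        command += ["--method", method]
    if input_path:
        command += ["-i", str(input_path)]
    if output_path:
        command += ["-o", str(output_path)]
    return command


def run_steps(commands, cwd=None):
    """Run commands in order; check=True raises on the first failing step."""
    if shutil.which("narraters") is None:
        raise RuntimeError("narraters is not on PATH; install it first")
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)
```

For instance, `run_steps([build_command("correct", method="rules"), build_command("parse", method="rules"), build_command("match", method="test")], cwd="/path/to/your/project")` would run the three recall-side steps with their model-free methods, stopping if any step exits with an error.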

---

## Project layout

After unzipping, everything lives under a single **`narRaters/`** project root. Paths, contents, and naming conventions:

### Folder structure

```text
narRaters/
├── README.md                      # This file: user guide & pipeline docs
├── CONTRIBUTING.md                # Research background, prompt templates, acknowledgements, author
├── LICENSE
├── narRater_Tutorial.pdf          # Illustrated web UI tour
├── narRater.app                   # macOS double-click launcher
├── narRaters_installer.bat        # Windows launcher
├── install.sh                     # macOS / Linux installer
├── pyproject.toml                 # Package metadata & pip extras
├── SETUP_API.md, .env.example     # API key setup
│
├── data/                          # Inputs (see Where to put your data)
│   ├── 1_story_audio/             # Optional Step 1: story audio
│   │   └── {story}.wav | .mp3 | .m4a
│   ├── 2_story_transcript/        # Story text
│   │   └── {story}.txt            # plain UTF-8, one story per file
│   ├── 3_story_events/            # Pre-segmented or segmented story events
│   │   └── {story}_events.xlsx    # columns: event, story_texts
│   ├── 4_recall_audio/            # Optional Step 1: recall audio
│   │   └── {subj_id}.wav | .mp3 | .m4a | .mp4
│   └── 5_recall_texts/            # Recall text
│       └── {subj_id}.txt          # e.g. the_siren_sub-01.txt
│
├── output/                        # Pipeline outputs (one subfolder per step)
│   ├── story_audio-transcribed/   # Step 1 (story): {story}.txt
│   ├── recall_audio-transcribed/  # Step 1 (recall): {subj_id}.txt
│   ├── recall_corrected/          # Step 3: {subj_id}.txt
│   ├── recall_parsed/             # Step 4: {subj_id}_parsed.xlsx
│   ├── recall_rated/              # Step 5: {subj_id}_{method}.xlsx
│   └── causal_rated/              # Step 6: {story}_causal-{method}.xlsx
│
├── scripts/                       # Pipeline backends (CLI & web UI call these)
│   ├── 1_audio-transcribe.py      # audioTranscribe
│   ├── 2_story-event-segment.py   # eventSegment
│   ├── 3_spell-grammar-correct.py # sentenceCorrect
│   ├── 4_parse-texts.py           # textParsing
│   ├── 5_recall-rater.py          # textMatching
│   ├── 6_causal-rater.py          # causalRating
│   └── prompt/                    # LLM prompt templates (.txt)
│       ├── event_segment.txt
│       ├── spell_gram.txt
│       ├── recall_parse_clause.txt
│       ├── recall_rating.txt
│       └── causal_rating.txt
│
├── server/                        # Flask web UI
│   ├── web-interface.py           # Routes & subprocess orchestration
│   └── START_HERE.command         # macOS launcher script
│
├── templates/                     # Web UI HTML (pipeline, dashboard, subject/story)
├── static/                        # CSS, JS, app icon
│
├── src/narraters/                 # pip package
│   ├── cli.py                     # narraters command entry point
│   ├── paths.py                   # Project-root resolution
│   └── runtime_install.py         # Bundled-example copy on first serve
│
├── helpers/                       # Shared utilities & smoke tests
│   ├── software_paths.py          # Canonical path resolution
│   ├── step_files.py              # Flexible step input/output file recognition
│   ├── resource_preflight.py      # RAM / disk checks for heavy methods
│   └── test_*.py                  # Pipeline validation scripts
│
├── docs/                          # GitHub Pages site & README assets
│   ├── index.html                 # Project landing page
│   └── screenshots/               # README GIFs (+ recall-matching.png for site og:image)
│
├── demo/                          # Smaller lighthouse example
│   ├── data/                      # the_lighthouse transcript + recall texts
│   └── output/                    # Sample outputs for the demo story
│
├── developer/                     # Contributor handbook & tooling
│   ├── README.md                  # Per-step I/O contracts & design principles
│   └── SETUP_API.md               # API key setup (developer copy)
│
└── packaging/macos/               # App bundle / DMG build scripts
    └── build_app_bundle.sh
```

Bundled examples: **`pieman_edited`**, **`the_siren`** (see [Example input/output data](#example-inputoutput-data)).

**Versioning:** automated files use `{id}_{method}.ext`; hand-edited exports use `{id}_{ratername}-edit.ext` (never overwritten).

---

## Further reading

- **[Project home (GitHub Pages)](https://xianneuro.github.io/narRaters/)**: landing page for search and sharing.
- **[`narRater_Tutorial.pdf`](narRater_Tutorial.pdf)**: illustrated, click-by-click tour of the web UI; good next step after [Installation](#installation).
- **[`SETUP_API.md`](SETUP_API.md)**: API keys for Anthropic, OpenAI, and Hugging Face; which pipeline steps need which.
- **[`scripts/prompt/README.md`](scripts/prompt/README.md)**: prompt template conventions for LLM-backed methods.

---

## Citation

If you use narRaters in research, please cite the archived release.

**Reference list (APA 7):**

> Li, X. (2026). *narRaters: Naturalistic narratives processing platform* (Version 0.3.14) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.20486080

Replace the version number with the release you used (see [Zenodo](https://doi.org/10.5281/zenodo.20486080) for the latest).

**Examples in a manuscript:**

*Methods, in-text:*
> Narrative recall data were processed with narRaters (Li, 2026).

*Methods, first mention (optional):*
> We used narRaters (Li, 2026), an open-source pipeline for transcribing, segmenting, parsing, matching, and rating narrative recall data, with human review at each step.

*Software / code availability:*
> narRaters (Version 0.3.14) is available at https://doi.org/10.5281/zenodo.20486080.

*Data processing statement:*
> Story events, parsed recall clauses, recall-to-event matches, and causal ratings were produced with narRaters (Li, 2026; https://doi.org/10.5281/zenodo.20486080).

---

## Acknowledgements

- **Janice Chen** for brainstorming the causal-rating step interface and for help testing and improving package functionality.
- **Gabi Kressin Palacios** and **Dhruva Arekar** for an additional method for the recall-matching step (matching human recall text to story events). See [GabrielKP/rMatch](https://github.com/GabrielKP/rMatch) for human-data–validated AI-assisted recall rating.
- **Xiyu Li (Rita)** for contributions to the `recall_rating` prompt development and for validating model performance on human recall data (commercial LLM APIs were close to human raters).
- **Sebastian Michelmann** for feedback on the event-segmentation step (see [Michelmann et al., 2023](https://arxiv.org/abs/2301.10297)).
- **Colette Youstra** and **[Quinton Covington](https://qcovington.com)** for testing the app's manual-rating functions.
- **Samira Tavassoli** and **Yuye Huang** for help testing the app's segmentation and causal-reasoning functions.

---

## Author

**[Xian Li](https://www.xian-li.com)**: [xianl.cogneuro@gmail.com](mailto:xianl.cogneuro@gmail.com)

---

## License

See **[LICENSE](LICENSE)** (**narRaters Research and Non-Commercial License**). Free for research, education, and other non-commercial use; commercial or for-profit use requires prior written permission. Contact [xianl.cogneuro@gmail.com](mailto:xianl.cogneuro@gmail.com) for commercial licensing.