https://github.com/huggon1/class-atlas


# ClassAtlas

Streamlit + LangChain toolkit for turning lecture videos into searchable, multilingual study companions.

## Highlights

- End-to-end lecture pipeline from video transcription to outline generation
- PDF slide alignment for lecture playback and study review
- Multilingual workflow with a single Streamlit interface
- Built-in Q&A over processed lecture materials

## Features

- Transcribe lecture videos with Faster-Whisper
- Build structured lecture outlines with an LLM
- Align lecture slides from an uploaded PDF
- Explore artifacts in a Streamlit UI
- Ask questions over processed lecture content
- Store outputs per lecture under `data/lectures/`
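Because outputs are stored per lecture under `data/lectures/`, processed lectures can be enumerated by listing that directory. The following is a minimal sketch, assuming each lecture gets its own subfolder; `list_lectures` is a hypothetical helper, not part of ClassAtlas, and the files inside each folder are not specified here.

```python
from pathlib import Path


def list_lectures(lecture_root="data/lectures"):
    """Return the names of per-lecture folders under lecture_root, sorted alphabetically."""
    root = Path(lecture_root)
    # A missing root simply means nothing has been processed yet.
    if not root.is_dir():
        return []
    # Only directories count as lectures; stray files are ignored.
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())
```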

## Repository Layout

```text
class-atlas/
  app/
  pipeline/
  scripts/
  data/
  process_lecture.py
  streamlit_app.py
  requirements.txt
```

## Requirements

- Python 3.10+
- FFmpeg available in `PATH`
- An LLM API endpoint and API key matching the `llm_base_url`, `llm_model`, and `llm_api_key` values in `settings.json`
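The Python and FFmpeg requirements can be verified before installing anything. The sketch below uses only the standard library; `check_requirements` is a hypothetical helper, not part of ClassAtlas, and it does not test the LLM endpoint.

```python
import shutil
import sys


def check_requirements(min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ is required, found {found}"
        )
    # shutil.which returns None when no executable named "ffmpeg" is on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("FFmpeg was not found in PATH")
    return problems
```

Calling `check_requirements()` and printing each returned message gives a quick preflight report.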

## Install

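Create and activate a virtual environment, then install the dependencies. The activation command below is for Windows; on macOS or Linux, run `source .venv/bin/activate` instead.

```bash
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
```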

## Configure

Copy `settings.example.json` to `settings.json`, or let the app create it on first run.

Example settings:

```json
{
  "llm_model": "Qwen/Qwen3-8B",
  "llm_base_url": "https://api-inference.modelscope.cn/v1/",
  "llm_api_key": "",
  "whisper_model": "medium",
  "lecture_root": "data/lectures"
}
```

You can also point to a custom settings file with `CLASS_EXTRACT_SETTINGS`.
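The lookup order described here can be sketched as follows. This is an illustration only, not the project's actual loader: it assumes `CLASS_EXTRACT_SETTINGS` is an environment variable holding a file path, and `load_settings` and `DEFAULT_SETTINGS` are hypothetical names.

```python
import json
import os
from pathlib import Path

# Defaults mirror the example settings shown above.
DEFAULT_SETTINGS = {
    "llm_model": "Qwen/Qwen3-8B",
    "llm_base_url": "https://api-inference.modelscope.cn/v1/",
    "llm_api_key": "",
    "whisper_model": "medium",
    "lecture_root": "data/lectures",
}


def load_settings(default_path="settings.json"):
    """Load settings, preferring the file named by CLASS_EXTRACT_SETTINGS if set."""
    # Assumption: CLASS_EXTRACT_SETTINGS is an environment variable with a file path.
    path = Path(os.environ.get("CLASS_EXTRACT_SETTINGS", default_path))
    settings = dict(DEFAULT_SETTINGS)
    # Values from the file override the defaults; a missing file leaves defaults intact.
    if path.is_file():
        with path.open(encoding="utf-8") as handle:
            settings.update(json.load(handle))
    return settings
```

Merging over a copy of the defaults means a partial settings file only needs to list the keys it changes.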

## Quick Start

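Process a lecture from the command line. The example uses Windows Command Prompt syntax (`^` line continuation and backslash paths); on macOS or Linux, end each continued line with `\` and use forward slashes in paths:

```bash
python process_lecture.py ^
  --video path\to\lecture.mp4 ^
  --output data\lectures\lecture-a ^
  --lecture-id lecture-a ^
  --title "Lecture A"
```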
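To avoid shell-specific line continuation and path separators, the same command can be launched from Python. The sketch below uses only the four flags shown above; `build_command` and `process_lecture` are hypothetical wrappers, not part of ClassAtlas, and the script is assumed to be in the current working directory.

```python
import subprocess
import sys
from pathlib import Path


def build_command(video, output, lecture_id, title):
    """Build the argument list for process_lecture.py using the flags shown above."""
    return [
        sys.executable, "process_lecture.py",
        "--video", str(Path(video)),
        "--output", str(Path(output)),
        "--lecture-id", lecture_id,
        "--title", title,
    ]


def process_lecture(video, output, lecture_id, title):
    """Run process_lecture.py and raise CalledProcessError if it exits with an error."""
    # Passing a list (no shell) sidesteps platform-specific quoting rules.
    subprocess.run(build_command(video, output, lecture_id, title), check=True)
```

For example, `process_lecture(Path("path") / "to" / "lecture.mp4", Path("data") / "lectures" / "lecture-a", "lecture-a", "Lecture A")` corresponds to the command above on any platform.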

Launch the UI:

```bash
streamlit run streamlit_app.py
```

Open the app, upload a lecture video, and browse the generated transcript, outline, slides, and Q&A views from one workspace.

## Typical Workflow

1. Configure `settings.json`
2. Process a lecture video from CLI or through the UI
3. Inspect transcript, outline, and aligned slide artifacts
4. Run Q&A over the processed lecture materials

## Notes

- `settings.json`, logs, caches, and generated lecture artifacts are intentionally not committed.
- `scripts/test_pipeline_cache.py` is preserved as a lightweight regression helper.
- The Chinese project readme is preserved in [README_ZH.md](README_ZH.md).