https://github.com/huggon1/class-atlas
Turn lecture videos into searchable, multilingual study companions.
https://github.com/huggon1/class-atlas
education langchain lecture-notes streamlit whisper
Last synced: 1 day ago
JSON representation
Turn lecture videos into searchable, multilingual study companions.
- Host: GitHub
- URL: https://github.com/huggon1/class-atlas
- Owner: huggon1
- License: mit
- Created: 2026-03-14T13:49:40.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-03-14T14:00:34.000Z (4 months ago)
- Last Synced: 2026-03-15T09:13:25.313Z (4 months ago)
- Topics: education, langchain, lecture-notes, streamlit, whisper
- Language: Python
- Size: 38.1 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# ClassAtlas
Streamlit + LangChain toolkit for turning lecture videos into searchable, multilingual study companions.
## Highlights
- End-to-end lecture pipeline from video transcription to outline generation
- PDF slide alignment for lecture playback and study review
- Multilingual workflow with a single Streamlit interface
- Built-in Q&A over processed lecture materials
## Features
- Transcribe lecture videos with Faster-Whisper
- Build structured lecture outlines with an LLM
- Align lecture slides from an uploaded PDF
- Explore artifacts in a Streamlit UI
- Ask questions over processed lecture content
- Store outputs per lecture under `data/lectures/`
## Repository Layout
```text
class-atlas/
app/
pipeline/
scripts/
data/
process_lecture.py
streamlit_app.py
requirements.txt
```
## Requirements
- Python 3.10+
- FFmpeg available in `PATH`
- A valid LLM API endpoint compatible with the configured settings
## Install
```bash
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
```
## Configure
Copy `settings.example.json` to `settings.json`, or let the app create it on first run.
Example settings:
```json
{
"llm_model": "Qwen/Qwen3-8B",
"llm_base_url": "https://api-inference.modelscope.cn/v1/",
"llm_api_key": "",
"whisper_model": "medium",
"lecture_root": "data/lectures"
}
```
You can also point to a custom settings file with `CLASS_EXTRACT_SETTINGS`.
## Quick Start
Process a lecture from the command line:
```bash
python process_lecture.py ^
--video path\to\lecture.mp4 ^
--output data\lectures\lecture-a ^
--lecture-id lecture-a ^
--title "Lecture A"
```
Launch the UI:
```bash
streamlit run streamlit_app.py
```
Open the app, upload a lecture video, and browse the generated transcript, outline, slides, and Q&A views from one workspace.
## Typical Workflow
1. Configure `settings.json`
2. Process a lecture video from CLI or through the UI
3. Inspect transcript, outline, and aligned slide artifacts
4. Run Q&A over the processed lecture materials
## Notes
- `settings.json`, logs, caches, and generated lecture artifacts are intentionally not committed.
- `scripts/test_pipeline_cache.py` is preserved as a lightweight regression helper.
- The Chinese project readme is preserved in [README_ZH.md](README_ZH.md).