https://github.com/ybakhan/lecture-assistant
Real-time lecture transcription and AI-powered Q&A. Record a live lecture, get an instant transcript, and chat with an AI about the content.
- Host: GitHub
- URL: https://github.com/ybakhan/lecture-assistant
- Owner: ybakhan
- License: mit
- Created: 2024-06-09T01:46:17.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2026-06-06T22:21:50.000Z (17 days ago)
- Last Synced: 2026-06-07T00:11:40.539Z (17 days ago)
- Topics: aws, docker, express, nodejs, openai, parcel, real-time, speech-to-text, typescript, websocket
- Language: JavaScript
- Size: 27.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# LECTURE ASSISTANT
> Real-time lecture transcription and AI-powered Q&A — record a live lecture, get an instant transcript, and chat with an AI about the content.
## Overview
**Lecture Assistant** streams microphone audio from the browser to **AWS Transcribe** for real-time speech-to-text, then lets you ask questions about the transcript using **OpenAI GPT**. The live transcript appears word-by-word as you speak, and the AI chat has full context of everything said in the lecture.
```
Browser mic → WebSocket → Node/Express → AWS Transcribe → live transcript
                               ↓
                      OpenAI GPT Q&A chat
```
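AWS Transcribe Streaming marks each result as partial or final, which is what makes a word-by-word display possible: partial results overwrite the in-progress segment, and final results are appended. The following is a minimal sketch of that merge step, not the repository's code. The `IsPartial` and `Alternatives[].Transcript` field names follow the Transcribe Streaming response; `TranscriptState`, `applyResult`, and `renderTranscript` are hypothetical names introduced for illustration.

```typescript
// One streaming result, reduced to the fields used here. Field names follow
// the AWS Transcribe Streaming response; the rest of this sketch is hypothetical.
interface TranscribeResult {
  IsPartial: boolean;
  Alternatives: { Transcript: string }[];
}

interface TranscriptState {
  finalized: string[]; // segments that will not change again
  pending: string; // latest partial guess for the current segment
}

const emptyTranscript: TranscriptState = { finalized: [], pending: "" };

// Fold one result into the state without mutating the previous state.
function applyResult(state: TranscriptState, result: TranscribeResult): TranscriptState {
  const text = result.Alternatives[0]?.Transcript ?? "";
  if (result.IsPartial) {
    // A partial result replaces the previous guess for the same segment.
    return { finalized: state.finalized, pending: text };
  }
  // A final result closes the segment and clears the pending text.
  return { finalized: [...state.finalized, text], pending: "" };
}

// Text to display: finalized segments followed by the in-progress one.
function renderTranscript(state: TranscriptState): string {
  return [...state.finalized, state.pending].filter((s) => s.length > 0).join(" ");
}
```

Keeping the state immutable means the same function can drive the on-screen transcript and the copy that is later persisted.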
## Features
- **Real-time transcription** — live captions stream to the screen as you speak via AWS Transcribe Streaming
- **AI chat** — ask anything about the recorded lecture; GPT answers with full transcript context
- **Server-vended credentials** — the server obtains short-lived AWS tokens via STS and serves them to the client; no secrets in the browser bundle
- **Transcript persistence** — completed transcripts and chat history are stored in S3; local files are cleaned up automatically
- **WebSocket pipeline** — low-latency audio streaming directly from browser to cloud
- **Docker-ready server** — single-file production bundle in a minimal Alpine image
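The server-vended credentials feature can be pictured with a small sketch. It assumes the server has already called STS `AssumeRole` and holds the returned `Credentials` object; the field names `AccessKeyId`, `SecretAccessKey`, `SessionToken`, and `Expiration` come from the STS API, while `toClientCredentials`, `needsRefresh`, and the payload shape are hypothetical and may differ from what the repository does.

```typescript
// Subset of the STS AssumeRole response used here (field names per the AWS API).
interface StsCredentials {
  AccessKeyId: string;
  SecretAccessKey: string;
  SessionToken: string;
  Expiration: Date;
}

// Payload returned to the browser (hypothetical shape).
interface ClientCredentials {
  accessKeyId: string;
  secretAccessKey: string;
  sessionToken: string;
  expiresAt: string; // ISO 8601 timestamp
}

// Convert the STS response into a JSON-friendly payload for the client.
function toClientCredentials(creds: StsCredentials): ClientCredentials {
  return {
    accessKeyId: creds.AccessKeyId,
    secretAccessKey: creds.SecretAccessKey,
    sessionToken: creds.SessionToken,
    expiresAt: creds.Expiration.toISOString(),
  };
}

// True when the credentials expire within `marginMs` of `now`, meaning the
// client should request a fresh set before starting a new recording.
function needsRefresh(creds: ClientCredentials, now: Date, marginMs = 60000): boolean {
  return Date.parse(creds.expiresAt) - now.getTime() <= marginMs;
}
```

Because only temporary credentials with an expiry ever reach the browser, a leaked set stops working once the session ends.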
## Tech Stack
| Layer | Technology |
|---|---|
| Frontend | Vanilla JavaScript, Parcel |
| Backend | Node.js 20, TypeScript, Express, ws (WebSocket) |
| Speech-to-text | AWS Transcribe Streaming |
| AI chat | OpenAI GPT-3.5-turbo |
| Cloud infra | AWS S3, AWS Secrets Manager, AWS STS |
| Container | Docker (Alpine) |
| Build | esbuild, ESLint, Prettier |
## Prerequisites
- Node.js 20+
- AWS account with Transcribe, S3, Secrets Manager, and STS access
- OpenAI API key stored in AWS Secrets Manager under the key `OPENAI_API_KEY`
## Getting Started
### 1. Server
```bash
cd server
yarn install
yarn bundle:prod
node ./dist/index.js
```
The server starts on **port 8080**.
The server resolves AWS credentials from its environment (IAM role, instance profile, or environment variables). Set the following variables:
| Variable | Description |
|---|---|
| `AWS_REGION` | AWS region (e.g. `us-east-2`) |
| `TRANSCRIBE_ROLE_ARN` | ARN of the IAM role the server assumes to vend client credentials |
| `AWS_ACCESS_KEY_ID` | AWS access key ID (not needed when using an IAM role) |
| `AWS_SECRET_ACCESS_KEY` | AWS secret access key (not needed when using an IAM role) |
The OpenAI API key is fetched automatically from AWS Secrets Manager at runtime — no environment variable needed.
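A startup check along these lines can catch a missing variable before the first request arrives. This is a sketch under assumptions: the variable names come from the table above, but `loadConfig` and `ServerConfig` are hypothetical, and the repository may read its configuration differently.

```typescript
interface ServerConfig {
  region: string;
  transcribeRoleArn: string;
  usesStaticKeys: boolean; // false when relying on an IAM role or instance profile
}

// Validate the environment once at startup and fail with a clear message.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const region = env["AWS_REGION"];
  const transcribeRoleArn = env["TRANSCRIBE_ROLE_ARN"];
  if (!region) throw new Error("AWS_REGION is required");
  if (!transcribeRoleArn) throw new Error("TRANSCRIBE_ROLE_ARN is required");

  const hasKeyId = Boolean(env["AWS_ACCESS_KEY_ID"]);
  const hasSecret = Boolean(env["AWS_SECRET_ACCESS_KEY"]);
  // Static keys are optional, but supplying only one half of the pair is a mistake.
  if (hasKeyId !== hasSecret) {
    throw new Error("Set both AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, or neither");
  }

  return { region, transcribeRoleArn, usesStaticKeys: hasKeyId && hasSecret };
}
```

In a Node.js entry point this would typically be called as `loadConfig(process.env)` before the server starts listening.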
### 2. Client
Copy the example env file and set the server host:
```bash
cd client
cp .env.example .env
# edit .env and set PARCEL_PUBLIC_SERVICE_HOST if not running locally
npm install
npm start
```
Open [http://localhost:1234](http://localhost:1234) and allow microphone access when prompted.
### Docker (server only)
```bash
cd server
yarn bundle:prod
docker build -t lecture-assistant .
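# Forward the configuration from the host (add -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY if not using an IAM role)
docker run -p 8080:8080 -e AWS_REGION -e TRANSCRIBE_ROLE_ARN lecture-assistant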
```
## Usage
1. Click **Start Recording** — the client fetches short-lived AWS credentials from the server, then begins streaming your mic to AWS Transcribe
2. Speak; the live transcript appears on screen in real time
3. Click **Stop Recording** — the transcript is uploaded to S3 and the local copy is cleaned up
4. Type a question in the chat box and press **Send** or **Enter**
5. GPT answers based on the full lecture transcript, read from S3
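Step 5 amounts to sending the transcript as context along with the running conversation. The sketch below shows one way to assemble that request body, assuming the chat-style message format (`system`, `user`, and `assistant` roles) used by OpenAI chat models. `buildChatMessages` and the wording of the system prompt are hypothetical, not taken from the repository.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Assemble the message list for one question: the transcript goes into the
// system message, followed by earlier turns and then the new question.
function buildChatMessages(
  transcript: string,
  history: ChatMessage[],
  question: string,
): ChatMessage[] {
  const system: ChatMessage = {
    role: "system",
    content: "Answer questions using only the lecture transcript below.\n\n" + transcript,
  };
  return [system, ...history, { role: "user", content: question }];
}
```

Because the whole transcript travels with every question, a long lecture may exceed the model's context window; truncating or summarizing the transcript would then be needed.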
## Project Structure
```
.
├── client/                      # Browser frontend (Parcel, vanilla JS)
│   ├── .env.example             # Environment variable template
│   └── src/
│       ├── app.js               # Main UI logic and event handling
│       ├── config.js            # Host/port config (reads from env)
│       ├── microphone.js        # Microphone capture
│       ├── transcribe-client.js # AWS Transcribe Streaming client
│       ├── socket.js            # WebSocket connection to server
│       └── question.js          # OpenAI Q&A API calls
└── server/                      # Node.js backend (TypeScript, Express)
    └── src/
        ├── index.ts             # Express + WebSocket server entry point
        ├── aws/
        │   ├── s3-client.ts     # S3 read/write helpers
        │   ├── sts-client.ts    # STS credential vending
        │   └── secrets/         # Secrets Manager client with cache
        ├── openai/              # GPT chat handler
        └── transcribe/          # WebSocket transcription handler
```
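The tree mentions a Secrets Manager client with a cache. How the repository implements it is not shown here, so the following is only a sketch of one common approach: a time-to-live cache that calls a loader on a miss or after expiry. `TtlCache`, `getOrLoad`, and the injectable clock are hypothetical names.

```typescript
// Minimal time-to-live cache. `load` stands in for the Secrets Manager call;
// `now` is injectable so that expiry can be exercised without waiting.
class TtlCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is fresh; otherwise load and store it.
  getOrLoad(key: string, load: () => T): T {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}
```

With `T` set to `Promise<string>`, the cache stores the in-flight promise, so concurrent requests for `OPENAI_API_KEY` share one fetch. The trade-off is that a rejected promise stays cached until it expires, so a fuller version would evict failed entries.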
## License
MIT — see [LICENSE](LICENSE).