https://github.com/hadv/yitam-admin
A minimalist React web application that allows users to upload documents and store their vector embeddings in a Qdrant vector database.
Last synced: 10 months ago
- Host: GitHub
- URL: https://github.com/hadv/yitam-admin
- Owner: hadv
- License: apache-2.0
- Created: 2025-04-03T09:53:15.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2025-08-01T05:00:29.000Z (11 months ago)
- Last Synced: 2025-08-01T06:32:22.769Z (11 months ago)
- Language: TypeScript
- Size: 528 KB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README-JOB-QUEUE.md
- License: LICENSE
# YouTube Video Processing with Self-Contained Job Queue
This implementation avoids HTTP timeouts when processing long YouTube videos. It uses a self-contained job queue that needs no external services.
## Overview
Instead of processing YouTube videos in the HTTP request handler (which can lead to timeouts), we now:
1. Accept the upload request
2. Queue the processing job in an in-memory queue with file persistence
3. Immediately return a response to the client
4. Process the video in a separate worker thread
5. Provide status updates via WebSockets
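Steps 1 to 3 can be sketched as a handler that only validates and enqueues, leaving the video untouched. This is a minimal sketch, not the project's actual code: the `Job` shape, the `url` request field, the `jobId` response field, and the 400 response for a missing URL are all assumptions made for illustration.

```typescript
// Hypothetical shapes: the real request/response types are not shown in this README.
type JobStatus = "pending" | "processing" | "completed" | "failed";

interface Job {
  id: string;
  videoUrl: string;
  status: JobStatus;
}

interface HandlerResult {
  statusCode: number;
  body: { jobId: string } | { error: string };
}

// In-memory list standing in for the real queue.
const pendingJobs: Job[] = [];
let nextId = 1;

// Validate, enqueue, and answer immediately without processing the video.
function handleProcessRequest(body: { url?: string }): HandlerResult {
  if (!body.url) {
    // Assumed error behaviour; the README does not describe validation.
    return { statusCode: 400, body: { error: "Missing YouTube URL" } };
  }
  const job: Job = { id: `job-${nextId++}`, videoUrl: body.url, status: "pending" };
  pendingJobs.push(job);
  // 202 Accepted: the request was received, processing happens later.
  return { statusCode: 202, body: { jobId: job.id } };
}
```

Because the handler does no heavy work, its response time is independent of the video's length.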
## Features
- **No external dependencies** - Uses Node.js built-in worker threads
- **Persistent job queue** - Jobs are saved to disk and can survive server restarts
- **Concurrent processing** - Multiple videos can be processed simultaneously
- **Real-time progress tracking** - Continues to use WebSockets for progress updates
## How It Works
1. **Client-side**: The user submits a YouTube URL for processing
2. **Server-side**:
   - The request is received and a job is added to the queue
   - A 202 (Accepted) response is sent back immediately with a job ID
   - The actual processing happens in a separate worker thread
3. **Progress Tracking**:
   - WebSockets continue to be used for real-time progress updates
   - An additional endpoint is available for checking job status
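One way the two progress channels could share state is a small tracker that both pushes events to subscribers and keeps the latest snapshot for the status endpoint. The sketch below is hypothetical: `ProgressTracker`, `ProgressEvent`, and `JobSnapshot` are illustrative names, and the listener stands in for whatever WebSocket layer the server uses.

```typescript
// Hypothetical event and snapshot shapes, chosen for illustration only.
interface ProgressEvent {
  jobId: string;
  percent: number; // 0 to 100
  error?: string;
}

interface JobSnapshot {
  status: "processing" | "completed" | "failed";
  percent: number;
  error?: string;
}

type Listener = (event: ProgressEvent) => void;

class ProgressTracker {
  private snapshots = new Map<string, JobSnapshot>();
  private listeners = new Set<Listener>();

  // A WebSocket layer would register a listener that forwards events to clients.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Called whenever the worker reports progress.
  report(event: ProgressEvent): void {
    const status: JobSnapshot["status"] = event.error
      ? "failed"
      : event.percent >= 100
        ? "completed"
        : "processing";
    const snapshot: JobSnapshot = { status, percent: event.percent };
    if (event.error) snapshot.error = event.error;
    this.snapshots.set(event.jobId, snapshot);
    this.listeners.forEach((listener) => listener(event));
  }

  // Backs the status endpoint: returns the latest state even with no socket connected.
  getStatus(jobId: string): JobSnapshot | undefined {
    return this.snapshots.get(jobId);
  }
}
```

Keeping the snapshot separate from the event stream lets a client that reconnects, or never opened a socket, still read the job's current state.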
## Technical Implementation
- Uses Node.js worker threads for background processing
- Stores job data in memory with JSON file persistence
- Automatically recovers and resumes pending jobs on server restart
- Limits the number of concurrent jobs to avoid overloading the server
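The points above can be illustrated with a small queue that caps concurrency and serializes its state after every change. This is a sketch under stated assumptions, not the repository's implementation: `JobQueue`, `QueuedJob`, and `recoverJobs` are hypothetical names, and persistence is injected as a `persist` callback, which a real server could back with something like `fs.writeFileSync`. The worker thread is represented by the `processJob` function.

```typescript
type JobState = "pending" | "processing" | "completed" | "failed";

interface QueuedJob {
  id: string;
  videoUrl: string;
  state: JobState;
}

// Jobs that were "processing" when the server stopped are reset so they run again.
function recoverJobs(json: string): QueuedJob[] {
  const saved = JSON.parse(json) as QueuedJob[];
  return saved.map((job): QueuedJob =>
    job.state === "processing" ? { ...job, state: "pending" } : job,
  );
}

class JobQueue {
  private jobs: QueuedJob[];
  private running = 0;
  private maxConcurrent: number;
  private processJob: (job: QueuedJob) => Promise<void>;
  private persist: (json: string) => void;

  constructor(
    jobs: QueuedJob[],
    maxConcurrent: number,
    processJob: (job: QueuedJob) => Promise<void>,
    persist: (json: string) => void,
  ) {
    this.jobs = jobs;
    this.maxConcurrent = maxConcurrent;
    this.processJob = processJob;
    this.persist = persist;
  }

  add(job: QueuedJob): void {
    this.jobs.push(job);
    this.save();
    this.drain();
  }

  // Start pending jobs until the concurrency limit is reached.
  drain(): void {
    for (const job of this.jobs) {
      if (this.running >= this.maxConcurrent) return;
      if (job.state === "pending") this.start(job);
    }
  }

  private start(job: QueuedJob): void {
    job.state = "processing";
    this.running++;
    this.save();
    // Promise.resolve().then(...) also turns a synchronous throw into a rejection.
    Promise.resolve()
      .then(() => this.processJob(job))
      .then(() => { job.state = "completed"; })
      .catch(() => { job.state = "failed"; })
      .then(() => {
        this.running--;
        this.save();
        this.drain(); // a slot is free, pick up the next pending job
      });
  }

  private save(): void {
    this.persist(JSON.stringify(this.jobs));
  }
}
```

On startup, a server following this pattern would read the saved JSON, pass it through `recoverJobs`, build a `JobQueue` from the result, and call `drain()` once.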
## API Endpoints
- `POST /api/youtube/process` - Queue a YouTube video for processing
- `GET /api/youtube/job/:jobId` - Check the status of a processing job
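A client might call these two endpoints roughly as follows. The endpoint paths come from the list above; everything else is an assumption for illustration, including the `url` request field, the `jobId` response field, the 200 status for a successful lookup, and the `FetchLike` type, which is a minimal subset of the Fetch API so that a stub can be injected.

```typescript
// Minimal subset of the Fetch API used here; the global `fetch` should fit this shape.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ status: number; json(): Promise<unknown> }>;

// Queue a video and return the job ID from the 202 response.
async function queueVideo(
  baseUrl: string,
  videoUrl: string,
  http: FetchLike,
): Promise<string> {
  const res = await http(`${baseUrl}/api/youtube/process`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url: videoUrl }), // field name `url` is an assumption
  });
  if (res.status !== 202) throw new Error(`Unexpected status ${res.status}`);
  const data = (await res.json()) as { jobId: string }; // `jobId` field is an assumption
  return data.jobId;
}

// Fetch the current status of a job; the response body shape is not documented here.
async function getJobStatus(
  baseUrl: string,
  jobId: string,
  http: FetchLike,
): Promise<unknown> {
  const res = await http(`${baseUrl}/api/youtube/job/${encodeURIComponent(jobId)}`);
  if (res.status !== 200) throw new Error(`Unexpected status ${res.status}`);
  return res.json();
}
```

Polling `getJobStatus` is a fallback for clients that cannot keep a WebSocket open; the WebSocket channel remains the source of real-time updates.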
## Benefits
- No more HTTP timeouts for long-running processes
- Better user experience with immediate feedback
- No external dependencies (like Redis) to manage
- Server load stays bounded because the number of concurrent jobs is limited
- Better error handling and recovery: pending jobs resume after a server restart
- Processing continues even if the user closes their browser