https://github.com/hadv/yitam-admin

A minimalist React web application that allows users to upload documents and store their vector embeddings in a Qdrant vector database.


# YouTube Video Processing with Self-Contained Job Queue

This implementation avoids HTTP timeouts when processing long YouTube videos by moving the work into a self-contained job queue that requires no external dependencies.

## Overview

Instead of processing YouTube videos in the HTTP request handler (which can lead to timeouts), we now:

1. Accept the upload request
2. Queue the processing job in an in-memory queue with file persistence
3. Immediately return a response to the client
4. Process the video in a separate worker thread
5. Provide status updates via WebSockets
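
Step 2 above (an in-memory queue with file persistence) could look roughly like the following. This is a minimal sketch, not the project's actual code: the `PersistentQueue` class, the job fields (`id`, `url`, `status`, `progress`), and the file location are all assumptions made for illustration.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');
const { randomUUID } = require('node:crypto');

// Hypothetical queue: a Map in memory, mirrored to a JSON file on every change.
class PersistentQueue {
  constructor(filePath) {
    this.filePath = filePath;
    this.jobs = new Map();
    if (fs.existsSync(filePath)) {
      // Reload jobs written by a previous server process.
      for (const job of JSON.parse(fs.readFileSync(filePath, 'utf8'))) {
        this.jobs.set(job.id, job);
      }
    }
  }

  // Adds a job and returns right away; no processing happens here.
  enqueue(url) {
    const job = { id: randomUUID(), url, status: 'pending', progress: 0 };
    this.jobs.set(job.id, job);
    this.save();
    return job;
  }

  save() {
    // Write to a temp file first, then rename, so an interrupted write
    // does not truncate the existing queue file.
    const tmp = `${this.filePath}.tmp`;
    fs.writeFileSync(tmp, JSON.stringify([...this.jobs.values()], null, 2));
    fs.renameSync(tmp, this.filePath);
  }
}

// Usage: the file path is an arbitrary example location.
const file = path.join(os.tmpdir(), `jobs-${process.pid}.json`);
const queue = new PersistentQueue(file);
const job = queue.enqueue('https://www.youtube.com/watch?v=VIDEO_ID');
console.log(job.id, job.status);
```

Writing the whole queue on every change is simple and fine for small queues; a larger deployment might prefer an append-only log or a debounced write.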

## Features

- **No external dependencies** - Uses Node.js built-in worker threads
- **Persistent job queue** - Jobs are saved to disk and can survive server restarts
- **Concurrent processing** - Multiple videos can be processed simultaneously
- **Real-time progress tracking** - Continues to use WebSockets for progress updates

## How It Works

1. **Client-side**: The user submits a YouTube URL for processing
2. **Server-side**:
   - The request is received and a job is added to the queue
   - A 202 (Accepted) response is sent back immediately with a job ID
   - The actual processing happens in a separate worker thread
3. **Progress Tracking**:
   - WebSockets continue to be used for real-time progress updates
   - An additional endpoint is available for checking job status
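
To illustrate how processing in a worker thread and progress reporting fit together, here is a small sketch built on Node's `worker_threads` module. The worker body is a stand-in that only emits three progress steps. The message shape (`type`, `jobId`, `percent`) and the `runInWorker` helper are assumptions, and the `onProgress` callback marks the place where a server would forward updates to its WebSocket clients.

```javascript
const { Worker } = require('node:worker_threads');

// Stand-in worker body (hypothetical): real code would download and process
// the video. Here it only reports three progress steps and then finishes.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  for (const percent of [0, 50, 100]) {
    parentPort.postMessage({ type: 'progress', jobId: workerData.id, percent });
  }
  parentPort.postMessage({ type: 'done', jobId: workerData.id });
`;

// Runs one job off the main thread and resolves when the worker reports 'done'.
function runInWorker(job, onProgress) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: job });
    worker.on('message', (msg) => {
      if (msg.type === 'progress') onProgress(msg.percent);
      if (msg.type === 'done') resolve(msg);
    });
    worker.on('error', reject);
    worker.on('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

// Usage: log progress instead of pushing it over a WebSocket.
runInWorker(
  { id: 'job-1', url: 'https://www.youtube.com/watch?v=VIDEO_ID' },
  (percent) => console.log(`progress: ${percent}%`)
).then((msg) => console.log(`finished ${msg.jobId}`));
```

Because the HTTP handler never awaits this promise, the request can return while the worker is still running.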

## Technical Implementation

- Uses Node.js worker threads for background processing
- Stores job data in memory with JSON file persistence
- Automatically recovers and resumes pending jobs on server restart
- Limits the number of concurrent jobs to avoid overloading the server
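
The concurrency limit and restart recovery listed above can be sketched as a small scheduler. This is an illustrative outline under stated assumptions, not the repository's implementation: the `Scheduler` class, the status names (`pending`, `processing`, `completed`, `failed`), and the choice to restart interrupted jobs from the beginning are all hypothetical.

```javascript
// Hypothetical scheduler: runs at most `maxConcurrent` jobs at a time.
class Scheduler {
  constructor(maxConcurrent, runJob) {
    this.maxConcurrent = maxConcurrent;
    this.runJob = runJob; // async function (job) that does the actual work
    this.waiting = [];
    this.active = 0;
  }

  // Re-queues unfinished jobs loaded from disk after a restart.
  recover(savedJobs) {
    for (const job of savedJobs) {
      if (job.status === 'pending' || job.status === 'processing') {
        job.status = 'pending'; // an interrupted job starts over
        this.waiting.push(job);
      }
    }
    this.drain();
  }

  add(job) {
    job.status = 'pending';
    this.waiting.push(job);
    this.drain();
  }

  // Starts waiting jobs until the concurrency limit is reached.
  drain() {
    while (this.active < this.maxConcurrent && this.waiting.length > 0) {
      const job = this.waiting.shift();
      job.status = 'processing';
      this.active += 1;
      Promise.resolve()
        .then(() => this.runJob(job))
        .then(
          () => { job.status = 'completed'; },
          (err) => { job.status = 'failed'; job.error = String(err); }
        )
        .finally(() => { this.active -= 1; this.drain(); });
    }
  }
}

// Usage: two slots, with a placeholder in place of the real video processing.
const scheduler = new Scheduler(2, async (job) => {
  console.log(`processing ${job.id}`);
});
scheduler.recover([
  { id: 'a', status: 'processing' }, // was interrupted by the restart
  { id: 'b', status: 'completed' },  // already finished, left alone
]);
```

Each finished job frees a slot and triggers another `drain()`, so the queue keeps moving without a polling loop.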

## API Endpoints

- `POST /api/youtube/process` - Queue a YouTube video for processing
- `GET /api/youtube/job/:jobId` - Check the status of a processing job
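
As a rough illustration of how these two endpoints might behave, the sketch below routes requests with Node's built-in `http` module. Only the paths and the 202 response with a job ID come from this README. The request body (`{ url }`), the response field `jobId`, the 400 and 404 responses, and the in-memory `jobs` map are assumptions.

```javascript
const http = require('node:http');
const { randomUUID } = require('node:crypto');

const jobs = new Map(); // stand-in for the real job queue

// Pure routing function, so the logic can be exercised without a socket.
function route(method, url, body) {
  if (method === 'POST' && url === '/api/youtube/process') {
    if (!body || typeof body.url !== 'string') {
      return { status: 400, body: { error: 'url is required' } };
    }
    const job = { id: randomUUID(), url: body.url, status: 'pending' };
    jobs.set(job.id, job);
    // 202 Accepted: the job is queued, not finished.
    return { status: 202, body: { jobId: job.id } };
  }
  const match = url.match(/^\/api\/youtube\/job\/([^/]+)$/);
  if (method === 'GET' && match) {
    const job = jobs.get(match[1]);
    return job
      ? { status: 200, body: job }
      : { status: 404, body: { error: 'job not found' } };
  }
  return { status: 404, body: { error: 'not found' } };
}

// Thin HTTP wrapper around route(); call server.listen(port) to serve requests.
const server = http.createServer((req, res) => {
  let raw = '';
  req.on('data', (chunk) => { raw += chunk; });
  req.on('end', () => {
    let parsed = null;
    try { parsed = raw ? JSON.parse(raw) : null; } catch { /* invalid JSON stays null */ }
    const result = route(req.method, req.url, parsed);
    res.writeHead(result.status, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(result.body));
  });
});
```

A client would POST the video URL, keep the returned job ID, and then either listen for WebSocket progress events or poll the status endpoint with that ID.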

## Benefits

- No more HTTP timeouts for long-running processes
- Better user experience with immediate feedback
- No external dependencies (like Redis) to manage
- The limit on concurrent jobs keeps server load bounded
- Pending jobs are recovered and resumed after a server restart
- Processing continues even if the user closes their browser