{"id":29436887,"url":"https://github.com/eddaoust/chatwithstarterstory","last_synced_at":"2026-04-10T11:02:28.412Z","repository":{"id":303807546,"uuid":"1008210519","full_name":"Eddaoust/ChatWithStarterStory","owner":"Eddaoust","description":"A basic Retrieval-Augmented Generation (RAG) implementation for testing purposes, built to enable conversational interactions with Starter Story YouTube video.","archived":false,"fork":false,"pushed_at":"2025-07-09T13:53:24.000Z","size":113,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-09T14:51:54.473Z","etag":null,"topics":["daisyui","docker-compose","elasticsearch","embedings","llphant","openai","php8","postgresql","rag","rag-chatbot","symfony","tailwindcss"],"latest_commit_sha":null,"homepage":"","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Eddaoust.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-25T07:40:55.000Z","updated_at":"2025-07-09T13:53:28.000Z","dependencies_parsed_at":"2025-07-09T14:52:13.796Z","dependency_job_id":"263873aa-03f9-4111-a28e-27863352ec96","html_url":"https://github.com/Eddaoust/ChatWithStarterStory","commit_stats":null,"previous_names":["eddaoust/chatwithstarterstory"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Eddaoust/ChatWithStarterStory","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Eddaoust%2FChatWithStarterStory","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/Eddaoust%2FChatWithStarterStory/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Eddaoust%2FChatWithStarterStory/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Eddaoust%2FChatWithStarterStory/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Eddaoust","download_url":"https://codeload.github.com/Eddaoust/ChatWithStarterStory/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Eddaoust%2FChatWithStarterStory/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265091734,"owners_count":23710033,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["daisyui","docker-compose","elasticsearch","embedings","llphant","openai","php8","postgresql","rag","rag-chatbot","symfony","tailwindcss"],"created_at":"2025-07-13T05:08:08.112Z","updated_at":"2026-04-10T11:02:28.396Z","avatar_url":"https://github.com/Eddaoust.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Chat with Starter Story - RAG Implementation\n\nA basic **Retrieval-Augmented Generation (RAG)** implementation for testing purposes, built to enable conversational interactions with [Starter Story YouTube video](https://www.youtube.com/@starterstory).\n\n## 🚀 Technology Stack\n\n- **Backend**: Symfony 7.3 (PHP 8.4)\n- **Database**: PostgreSQL 17\n- **Search Engine**: Elasticsearch 8.13.2\n- **AI/ML**: OpenAI API with [LLPhant library](https://github.com/LLPhant/LLPhant)\n- **Frontend**: Tailwind (4.1) \u0026 
DaisyUI\n- **Web Server**: Caddy\n\n## 📋 Prerequisites\n\n- Docker and Docker Compose\n- [OpenAI](https://platform.openai.com/) API key\n- [Supadata](https://supadata.ai/) API access (for YouTube data fetching)\n- [YouTube](https://console.cloud.google.com/) API key\n\n## 🖥️ Demo\n\nYou can try a demo [here](https://chat.eddaoust.com/)\n\n## 🛠️ Installation Setup\n\n### 1. Clone the Repository\n```bash\ngit clone \u003crepository-url\u003e\ncd ChatWithStarterStory\n```\n\n### 2. Environment Configuration\nCopy the environment file and configure it:\n```bash\ncp .env .env.local\n```\n\nAdd your API keys to `.env.local`:\n```env\nOPENAI_API_KEY=your_openai_api_key_here\nSUPADATA_API_KEY=your_supadata_api_key_here\nYOUTUBE_API_KEY=your_youtube_api_key_here\n```\n\n### 3. Start the Application\n```bash\n# Build and start all services\ndocker compose --env-file .env.docker up -d --build\n\n# Access the PHP container\ndocker exec -ti php /bin/bash\n```\n\n### 4. Install Dependencies \u0026 Set Up Database\nInside the PHP container:\n```bash\n# Install Composer dependencies\ncomposer install\n\n# Create database and run migrations\nbin/console doctrine:database:create\nbin/console doctrine:migrations:migrate\n\n# Build Tailwind CSS (in a separate terminal)\nbin/console tailwind:build --watch\n```\n\n### 5. 
Access the Application\n- **Web Interface**: http://localhost:8080 (Caddy will proxy to the Symfony app)\n- **Elasticsearch**: http://localhost:9200\n- **Database**: PostgreSQL on default port with credentials from `.env.docker`\n\n## 📊 Data Generation for Embeddings\n\nThe RAG system requires a three-step data preparation process:\n\n### Step 1: Import YouTube Videos\n```bash\nbin/console app:import-youtube-videos\n```\nThis command:\n- Fetches videos from the Starter Story YouTube channel\n- Retrieves video metadata (title, description, thumbnail, etc.)\n- Stores video information in the PostgreSQL database\n- Processes up to 100 videos in batches\n\n### Step 2: Create Transcription Chunks\n```bash\nbin/console app:create-transcription-chunks\n```\nThis command:\n- Fetches transcriptions for each imported video using Supadata API\n- Breaks transcriptions into manageable chunks with timestamps\n- Creates `TranscriptionChunk` entities with content, offset, and duration\n- Respects API rate limits with built-in delays\n\n### Step 3: Generate Embeddings\n```bash\nbin/console app:generate-embeddings\n```\nThis command:\n- Processes transcription chunks that don't have embeddings\n- Generates vector embeddings using OpenAI's embedding model\n- Stores embeddings for semantic search capabilities\n- Processes chunks in batches of 25 for optimal performance\n\n## 🧠 How It Works\n\n### RAG Architecture Overview\n\n1. **Data Ingestion**: YouTube videos are imported and transcribed into searchable chunks\n2. **Vector Storage**: Text chunks are converted to embeddings and stored in Elasticsearch\n3. **Query Processing**: User questions are converted to embeddings for similarity search\n4. **Context Retrieval**: Most relevant video chunks are retrieved based on semantic similarity\n5. **Response Generation**: OpenAI LLM generates answers using retrieved context\n6. 
**Result Presentation**: Responses include relevant video links with timestamps\n\n### Data Flow\n```\nUser Question → Embedding → Vector Search → Context Building → LLM Query → Response + Video Links\n```\n\n## 🔧 Development Commands\n\n### Docker Management\n```bash\n# Stop all services\ndocker compose down --remove-orphans\n\n# View logs\ndocker compose logs -f\n\n# Rebuild specific service\ndocker compose up -d --build php\n```\n\n### Asset Management\n```bash\n# Build Tailwind CSS\nbin/console tailwind:build\n\n# Watch for changes\nbin/console tailwind:build --watch\n```\n\n## 📄 License\n\nThis project is for testing and educational purposes. Please ensure compliance with YouTube's Terms of Service and OpenAI's usage policies when using this application.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feddaoust%2Fchatwithstarterstory","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feddaoust%2Fchatwithstarterstory","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feddaoust%2Fchatwithstarterstory/lists"}