https://github.com/mxmarchal/youtube-subtitles-search
YouTube Subtitles Search: A Go-based web app that fetches and displays YouTube video transcripts. Built for learning Go and experimenting with AI/ML concepts. Features include transcript retrieval, display with timecodes, and a simple UI. Future plans include vector search and audio-to-text functionality.
docker gin-gonic go golang nlp subtitles transcripts vector-search video-processing weaviate youtube-api
- Host: GitHub
- URL: https://github.com/mxmarchal/youtube-subtitles-search
- Owner: mxmarchal
- License: mit
- Created: 2024-09-15T00:25:11.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-15T00:26:55.000Z (about 1 year ago)
- Last Synced: 2024-10-31T04:42:08.538Z (11 months ago)
- Topics: docker, gin-gonic, go, golang, nlp, subtitles, transcripts, vector-search, video-processing, weaviate, youtube-api
- Language: JavaScript
- Homepage:
- Size: 12.7 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# YouTube Subtitles Search
This project is a web application that allows users to fetch and display transcripts from YouTube videos. It's built with Go for the backend and vanilla JavaScript for the frontend.
## Features
- Fetch transcripts from YouTube videos using their URL
- Display transcripts with timecodes
- Simple and responsive user interface

## Tech Stack
- Backend: Go (Gin framework)
- Frontend: HTML, CSS, JavaScript
- Docker for containerization
- Weaviate for vector search (not implemented yet)

## Project Structure
```
.
├── cmd
│   └── app
│       └── main.go
├── internal
│   ├── api
│   │   ├── handler.go
│   │   └── router.go
│   ├── config
│   │   └── config.go
│   └── service
│       └── transcript_service.go
├── static
│   ├── css
│   │   └── style.css
│   ├── js
│   │   └── app.js
│   └── index.html
├── tests
│   └── youtube_test.go
├── Dockerfile
├── docker-compose.yml
├── go.mod
├── go.sum
└── README.md
```

## Setup and Running
1. Clone the repository
2. Make sure you have Docker and Docker Compose installed
3. Run the following command in the project root:

```
docker-compose up --build
```

4. Open your browser and navigate to `http://localhost:8080`
## API Endpoints
- `GET /health`: Health check endpoint
- `GET /transcript/:videoId`: Fetch transcript for a given YouTube video ID

## Frontend
The frontend is a simple HTML page with JavaScript for interactivity. It allows users to input a YouTube URL and displays the fetched transcript.
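Because the user enters a full URL while the API route takes a video ID, something has to convert one into the other, and the timecodes need a display format. The sketch below shows one way to do both. It is written in Go to match the rest of the project, although the real frontend may do this in `app.js`. The names `extractVideoID` and `formatTimecode` are hypothetical, and the handled URL shapes (`youtube.com/watch?v=...` and `youtu.be/...`) are an assumption about what the app accepts.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// extractVideoID pulls the video ID out of two common YouTube URL shapes.
// Hypothetical helper: not taken from the project's source.
func extractVideoID(raw string) (string, error) {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return "", err
	}
	host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
	switch host {
	case "youtu.be":
		// Short links carry the ID as the first path segment.
		if id := strings.Split(strings.Trim(u.Path, "/"), "/")[0]; id != "" {
			return id, nil
		}
	case "youtube.com", "m.youtube.com":
		// Watch pages carry the ID in the "v" query parameter.
		if id := u.Query().Get("v"); id != "" {
			return id, nil
		}
	}
	return "", errors.New("no video ID found in URL")
}

// formatTimecode renders an offset in seconds as MM:SS,
// or H:MM:SS once the offset reaches one hour.
func formatTimecode(seconds float64) string {
	total := int(seconds) // fractional seconds are truncated
	h, m, s := total/3600, (total%3600)/60, total%60
	if h > 0 {
		return fmt.Sprintf("%d:%02d:%02d", h, m, s)
	}
	return fmt.Sprintf("%02d:%02d", m, s)
}

func main() {
	id, err := extractVideoID("https://www.youtube.com/watch?v=abc123")
	fmt.Println(id, err)
	fmt.Println(formatTimecode(75.4))
}
```

Rejecting unknown hosts, rather than guessing, keeps malformed input from reaching the transcript endpoint.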
## Backend
The backend is written in Go using the Gin framework. It handles API requests and interacts with the YouTube API to fetch transcripts.
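To illustrate the two endpoints listed under API Endpoints, here is a minimal routing sketch. It uses the standard library's `net/http` instead of Gin so that it runs without third-party dependencies; it is not the project's actual `router.go` or `handler.go`. The `Line` type, the JSON field names, the status codes, and the stubbed `fetchTranscript` are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Line is a hypothetical shape for one transcript entry.
type Line struct {
	Start float64 `json:"start"` // offset in seconds
	Text  string  `json:"text"`
}

// fetchTranscript is a stand-in for the real lookup; it returns fixed data.
func fetchTranscript(videoID string) ([]Line, error) {
	return []Line{{Start: 0, Text: "placeholder line for " + videoID}}, nil
}

// writeJSON sends v as a JSON response with the given status code.
func writeJSON(w http.ResponseWriter, status int, v any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	json.NewEncoder(w).Encode(v)
}

// newRouter registers the health check and transcript routes.
func newRouter() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		writeJSON(w, http.StatusOK, map[string]string{"status": "ok"})
	})
	mux.HandleFunc("/transcript/", func(w http.ResponseWriter, r *http.Request) {
		// Everything after the prefix is treated as the video ID.
		id := strings.TrimPrefix(r.URL.Path, "/transcript/")
		if id == "" || strings.Contains(id, "/") {
			writeJSON(w, http.StatusBadRequest, map[string]string{"error": "missing video ID"})
			return
		}
		lines, err := fetchTranscript(id)
		if err != nil {
			writeJSON(w, http.StatusBadGateway, map[string]string{"error": err.Error()})
			return
		}
		writeJSON(w, http.StatusOK, lines)
	})
	return mux
}

// probe sends an in-memory GET request to h and returns the status and body.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe(newRouter(), "/transcript/abc123")
	fmt.Println(code, body)
}
```

In Gin, the same route is declared with the `/transcript/:videoId` pattern and the ID is read with `c.Param("videoId")`, which removes the manual prefix trimming shown here.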
## Testing
There's a basic test for the transcript fetching functionality. To run the tests, use the following command:
```
go test ./tests
```

## Future Improvements
- Implement vector search using Weaviate for more advanced transcript searching
- Add functionality to download and run an audio-to-text model when YouTube transcripts are not available
- Improve error handling and user feedback
- Implement caching for frequently accessed transcripts

## Note
This project is for learning purposes and experimentation with Go and AI/ML concepts. It should not be used in a production environment.