https://github.com/victor141516/yourarch
YouTube subtitles scraper/indexer
chrome-extension indexer scraper search searching subtitles youtube
- Host: GitHub
- URL: https://github.com/victor141516/yourarch
- Owner: victor141516
- Created: 2022-01-07T00:09:55.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2022-01-10T13:30:59.000Z (over 3 years ago)
- Last Synced: 2025-01-29T09:40:57.431Z (4 months ago)
- Topics: chrome-extension, indexer, scraper, search, searching, subtitles, youtube
- Language: TypeScript
- Homepage:
- Size: 266 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# YourArch
Remember that YouTube video you watched 6 months ago? Now you need it, but you don't remember who uploaded it or what the title was. Until now, finding it was a pain.
YourArch is a service that watches your YouTube history and saves every video's captions, so you can search them and find the exact video and timestamp.
## Install
There is a script called `ctl.sh` in the root of this project. It handles most of the management tasks.
For example, you can run the project with:
```sh
./ctl.sh local build && \
./ctl.sh local up && \
./ctl.sh local migrate
```

The usage is `./ctl.sh [env] [action]`.
`env` is the docker-compose file suffix, so for now the available values are `local` and `prod`.
## Usage
This repo contains only the backend and (for now) a simple frontend, so you have to provide the backend with the videos you watched. This was easy back when YouTube had an API for retrieving a user's history, but since Google removed that endpoint, you have to use the [YourArch Chrome Extension](https://github.com/victor141516/YourArch-Chrome-Extension).
Simply install it and go to any YouTube page. The extension injects an iframe containing the history page into any page you visit, scrapes it, and then removes the iframe. This happens once every hour, which is probably more often than necessary, since each run scrapes ~200 videos.
You can change the backend URL by clicking the extension icon in Chrome. It expects something like `https://yourarch.example.com` (without the path).
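
The hourly schedule described above could be gated by a small check like the one below. This is only a minimal sketch, not the extension's actual code: the names `SCRAPE_INTERVAL_MS` and `isScrapeDue` are hypothetical, and it assumes the time of the last scrape is stored as a millisecond timestamp (or `null` if no scrape has happened yet).

```typescript
// Hypothetical helper, not taken from the YourArch Chrome Extension source.
// One hour in milliseconds, matching the interval mentioned in this README.
const SCRAPE_INTERVAL_MS = 60 * 60 * 1000;

/**
 * Returns true when enough time has passed since the last scrape.
 * `lastScrapeAt` and `now` are millisecond timestamps (as from Date.now()).
 */
function isScrapeDue(
  lastScrapeAt: number | null,
  now: number,
  intervalMs: number = SCRAPE_INTERVAL_MS,
): boolean {
  // Never scraped before: run immediately.
  if (lastScrapeAt === null) return true;
  return now - lastScrapeAt >= intervalMs;
}

// Usage: skip the run when less than an hour has passed.
const lastRun = Date.UTC(2022, 0, 10, 12, 0, 0);
const halfHourLater = lastRun + 30 * 60 * 1000;
if (isScrapeDue(lastRun, halfHourLater)) {
  // Inject the history iframe, scrape it, then remove it (not shown here).
}
```

Passing a larger `intervalMs` would be one way to scrape less often if an hourly run turns out to be excessive.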
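
To illustrate the expected shape of the backend URL, here is a minimal sketch that validates a value using the standard `URL` class. The function name `normalizeBackendUrl` is hypothetical, and the choices to accept a trailing slash and to reject any path, query, or fragment are assumptions; the extension may handle these cases differently.

```typescript
// Hypothetical validator, not part of YourArch or its extension.
// Accepts values like "https://yourarch.example.com" and returns the origin.
function normalizeBackendUrl(input: string): string {
  // The URL constructor throws a TypeError on malformed input.
  const url = new URL(input.trim());

  if (url.protocol !== "https:" && url.protocol !== "http:") {
    throw new Error("Backend URL must use http or https");
  }

  // A bare origin parses with pathname "/" and empty search and hash.
  if (url.pathname !== "/" || url.search !== "" || url.hash !== "") {
    throw new Error("Backend URL must not include a path, query, or fragment");
  }

  // The origin is scheme + host (+ port), with no trailing slash.
  return url.origin;
}

// Usage: a trailing slash is tolerated (an assumption), a path is rejected.
const backend = normalizeBackendUrl("https://yourarch.example.com/");
```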