https://github.com/bigdata5911/youtube-video-anlayzer
A sophisticated Node.js application that analyzes YouTube videos for legal compliance by transcribing audio content and comparing it against predefined legal rules.
analysis deepgram ffmpeg gpt-4 legal legal-document-analyzer openai patreon patreon-scraper rag youtube youtube-dl-exec youtube-downloader
Last synced: 7 months ago
- Host: GitHub
- URL: https://github.com/bigdata5911/youtube-video-anlayzer
- Owner: BigData5911
- License: mit
- Created: 2025-01-09T21:46:19.000Z (9 months ago)
- Default Branch: master
- Last Pushed: 2025-01-09T22:30:16.000Z (9 months ago)
- Last Synced: 2025-01-09T23:26:48.522Z (9 months ago)
- Topics: analysis, deepgram, ffmpeg, gpt-4, legal, legal-document-analyzer, openai, patreon, patreon-scraper, rag, youtube, youtube-dl-exec, youtube-downloader
- Language: TypeScript
- Homepage:
- Size: 104 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Youtube-video-analyzer
A sophisticated Node.js application that analyzes YouTube videos for legal compliance by transcribing audio content and comparing it against predefined legal rules.
## Core Features
- YouTube video audio extraction and processing
- Speech-to-text transcription using Deepgram API
- Legal rules extraction from regulatory articles
- Automated compliance analysis using GPT-4
- Multi-language support (optimized for Czech)
- Token cost tracking and optimization

## Technical Stack
- **Runtime**: Node.js
- **Language**: TypeScript
- **APIs**:
  - OpenAI GPT-4
  - Deepgram Speech-to-Text
  - YouTube Data API

## Key Components
1. **Video Processing Pipeline**
   - Downloads YouTube videos as audio files
   - Supports chunked processing for large files
   - Handles multi-speaker transcription
2. **Transcription Engine**
   - Uses Deepgram's Nova-2 model
   - Provides paragraph segmentation
   - Speaker diarization
   - Punctuation and formatting
3. **Legal Analysis System**
   - Extracts rules from regulatory documents
   - Performs compliance checking
   - Generates detailed violation reports

## Environment Setup
Required environment variables:
```bash
OPENAI_API_KEY=your_openai_key
DEEPGRAM_API_KEY=your_deepgram_key
```

## Usage
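Because the application depends on both API keys listed under Environment Setup, it can help to fail fast when one is missing. The following is a minimal sketch, not code from the project: `loadApiKeys` and `REQUIRED_KEYS` are hypothetical names introduced here for illustration, and only the two variable names come from this README.

```typescript
// Hypothetical helper (not part of the original project): fail fast when a
// required API key is missing or blank in the supplied environment object.
const REQUIRED_KEYS = ["OPENAI_API_KEY", "DEEPGRAM_API_KEY"] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];

function loadApiKeys(
  env: Record<string, string | undefined>
): Record<RequiredKey, string> {
  // Collect every key that is unset or contains only whitespace.
  const missing = REQUIRED_KEYS.filter((key) => !env[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return {
    OPENAI_API_KEY: env.OPENAI_API_KEY as string,
    DEEPGRAM_API_KEY: env.DEEPGRAM_API_KEY as string,
  };
}
```

In a Node.js process this would typically be called once at startup as `loadApiKeys(process.env)`. Taking the environment as a parameter keeps the check easy to test without touching real credentials.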
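```typescript
// `main` is the project's entry point; its import path is not shown in this README.
const videoUrls = [
  "https://www.youtube.com/watch?v=example1",
  "https://www.youtube.com/watch?v=example2",
];
const articleUrl = "https://regulatory-article-url";

// Analyze each video against the rules extracted from the regulatory article.
await main(videoUrls, articleUrl);
```

## Contributing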
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request

## License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/BigData5911/youtube-video-anlayzer/blob/master/LICENSE) file for details.