https://github.com/0xichikawa/youtube-video-analyzer
A sophisticated Node.js application that analyzes YouTube videos for legal compliance. It transcribes the audio content of the videos using the Deepgram API and then compares it against predefined legal rules using the GPT-4 language model.
deepgram gpt4 nodejs openai youtube
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/0xichikawa/youtube-video-analyzer
- Owner: 0xichikawa
- License: mit
- Created: 2024-12-10T06:34:06.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2025-02-07T20:57:52.000Z (over 1 year ago)
- Last Synced: 2025-04-01T05:11:11.415Z (about 1 year ago)
- Topics: deepgram, gpt4, nodejs, openai, youtube
- Language: TypeScript
- Homepage:
- Size: 105 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Youtube-video-analyzer
A sophisticated Node.js application that analyzes YouTube videos for legal compliance by transcribing audio content and comparing it against predefined legal rules.
## Core Features
- YouTube video audio extraction and processing
- Speech-to-text transcription using Deepgram API
- Legal rules extraction from regulatory articles
- Automated compliance analysis using GPT-4
- Multi-language support (optimized for Czech)
- Token cost tracking and optimization
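The token cost tracking feature could be approached along the following lines. This is a minimal sketch, not the project's actual code: `CostTracker`, `TokenUsage`, and `TokenPricing` are hypothetical names, and the per-1,000-token prices are supplied by the caller rather than taken from any real price list.

```typescript
// Token counts for one model call. OpenAI chat completion responses report
// these in their `usage` field as `prompt_tokens` and `completion_tokens`.
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
}

// Prices in USD per 1,000 tokens. Caller-supplied assumptions, not values
// taken from this project or from current OpenAI pricing.
interface TokenPricing {
  promptPer1k: number;
  completionPer1k: number;
}

// Hypothetical tracker: accumulates usage across calls and reports total cost.
class CostTracker {
  private readonly pricing: TokenPricing;
  private promptTokens = 0;
  private completionTokens = 0;

  constructor(pricing: TokenPricing) {
    this.pricing = pricing;
  }

  // Add the usage of one completed model call to the running totals.
  record(usage: TokenUsage): void {
    this.promptTokens += usage.promptTokens;
    this.completionTokens += usage.completionTokens;
  }

  // Total cost in USD for everything recorded so far.
  totalCostUsd(): number {
    return (
      (this.promptTokens / 1000) * this.pricing.promptPer1k +
      (this.completionTokens / 1000) * this.pricing.completionPer1k
    );
  }
}
```

Calling `record` once per model response keeps the running total independent of how many videos or transcript chunks are analyzed.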
## Technical Stack
- **Runtime**: Node.js
- **Language**: TypeScript
- **APIs**:
  - OpenAI GPT-4
  - Deepgram Speech-to-Text
  - YouTube Data API
## Key Components
1. **Video Processing Pipeline**
   - Downloads YouTube videos as audio files
   - Supports chunked processing for large files
   - Handles multi-speaker transcription
2. **Transcription Engine**
   - Uses Deepgram's Nova-2 model
   - Provides paragraph segmentation
   - Speaker diarization
   - Punctuation and formatting
3. **Legal Analysis System**
   - Extracts rules from regulatory documents
   - Performs compliance checking
   - Generates detailed violation reports
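The transcription settings listed above map onto query parameters of Deepgram's pre-recorded audio endpoint (`/v1/listen`). The sketch below only builds the request URL and headers. `TranscriptionOptions` and `buildListenRequest` are hypothetical names, and the `cs` language code is an assumption based on the "optimized for Czech" note, so the project's real configuration may differ.

```typescript
// Options mirroring the transcription features listed above.
interface TranscriptionOptions {
  model: string;
  language: string;
  paragraphs: boolean;
  diarize: boolean;
  punctuate: boolean;
}

const DEFAULT_OPTIONS: TranscriptionOptions = {
  model: "nova-2",
  language: "cs", // Czech; assumed, not confirmed by the project
  paragraphs: true, // paragraph segmentation
  diarize: true, // speaker diarization
  punctuate: true, // punctuation
};

// Hypothetical helper: returns the URL and headers for a transcription
// request. The audio itself would be sent as the POST body.
function buildListenRequest(
  apiKey: string,
  options: TranscriptionOptions = DEFAULT_OPTIONS
): { url: string; headers: Record<string, string> } {
  const url = new URL("https://api.deepgram.com/v1/listen");
  url.searchParams.set("model", options.model);
  url.searchParams.set("language", options.language);
  url.searchParams.set("paragraphs", String(options.paragraphs));
  url.searchParams.set("diarize", String(options.diarize));
  url.searchParams.set("punctuate", String(options.punctuate));

  return {
    url: url.toString(),
    // Deepgram expects the API key in a "Token" authorization scheme.
    headers: { Authorization: `Token ${apiKey}` },
  };
}
```

Keeping the options in one object makes it straightforward to switch the language or turn off diarization for single-speaker videos.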
## Environment Setup
Required environment variables:
```bash
OPENAI_API_KEY=your_openai_key
DEEPGRAM_API_KEY=your_deepgram_key
```
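Failing fast when a key is missing avoids discovering the problem halfway through a long transcription run. A minimal sketch of such a check follows. `loadApiKeys` is a hypothetical helper, not part of the project, and it takes the environment as an argument so it can be exercised without touching the real process environment.

```typescript
// The two variables listed above as required.
const REQUIRED_KEYS = ["OPENAI_API_KEY", "DEEPGRAM_API_KEY"] as const;

type RequiredKey = (typeof REQUIRED_KEYS)[number];
type ApiKeys = Record<RequiredKey, string>;

// Hypothetical helper: returns the required keys, or throws an error naming
// every variable that is missing or empty.
function loadApiKeys(env: Record<string, string | undefined>): ApiKeys {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`
    );
  }
  return {
    OPENAI_API_KEY: env["OPENAI_API_KEY"] as string,
    DEEPGRAM_API_KEY: env["DEEPGRAM_API_KEY"] as string,
  };
}
```

In a Node.js entry point this would typically be called as `loadApiKeys(process.env)`.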
## Usage
```typescript
// YouTube videos to analyze
const videoUrls = [
  "https://www.youtube.com/watch?v=example1",
  "https://www.youtube.com/watch?v=example2",
];
// Regulatory article from which the legal rules are extracted
const articleUrl = "https://regulatory-article-url";

await main(videoUrls, articleUrl);
```
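Malformed video URLs are cheaper to catch before any download or transcription starts. The sketch below shows one way to validate the entries of `videoUrls`. `extractVideoId` is a hypothetical helper, not part of the project, and it only recognizes the common `youtube.com/watch?v=...` and `youtu.be/...` URL shapes.

```typescript
// Hypothetical helper: returns the video ID from a YouTube URL, or null if
// the input is not a recognizable YouTube watch or short link.
function extractVideoId(videoUrl: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(videoUrl);
  } catch {
    return null; // not a valid absolute URL
  }

  const host = parsed.hostname.replace(/^www\./, "");

  // Short links carry the ID in the path: https://youtu.be/<id>
  if (host === "youtu.be") {
    const id = parsed.pathname.slice(1);
    return id.length > 0 ? id : null;
  }

  // Watch URLs carry the ID in the `v` query parameter.
  if (host === "youtube.com" || host === "m.youtube.com") {
    const id = parsed.searchParams.get("v");
    return id ? id : null;
  }

  return null;
}
```

Filtering the input with `videoUrls.filter((url) => extractVideoId(url) !== null)` would drop entries that cannot be resolved to a video.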
## Contributing
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## License
This project is licensed under the MIT License. See the [LICENSE](https://github.com/0xichikawa/Youtube-video-analyzer/blob/master/LICENSE) file for details.