https://github.com/yesutkarsh/notebook-ml-open-source
OPEN SOURCE NOTEBOOK ML - PODCAST GENERATOR
artificial-intelligence google-cloud javascript llm notebookml
- Host: GitHub
- URL: https://github.com/yesutkarsh/notebook-ml-open-source
- Owner: yesutkarsh
- License: mit
- Created: 2025-02-18T07:24:10.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-18T07:44:07.000Z (about 1 year ago)
- Last Synced: 2025-02-18T08:35:53.300Z (about 1 year ago)
- Topics: artificial-intelligence, google-cloud, javascript, llm, notebookml
- Language: JavaScript
- Homepage:
- Size: 32 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# 🎙️ NOTEBOOK ML WITH NODE JS OPEN SOURCE
An open-source project that dynamically generates AI-driven podcast-style conversations using **Google Cloud Text-to-Speech** (SSML), saves audio files, and combines them into a final podcast episode.
## ✨ Features
- 🔹 **Text-to-Speech Conversion**: Generates natural conversations using Google Cloud's Studio voices. The conversation script is currently hardcoded, but you can use an LLM to generate it dynamically.
- 🔹 **SSML Markup**: Enhances speech with pauses, intonations, and expressive tones.
- 🔹 **Dynamic Audio Processing**: Saves each segment as an individual file.
- 🔹 **Final Podcast Compilation**: Merges all generated audio into a single **final.mp3**.
- 🔹 **Firebase Integration**: Automatically uploads the final podcast (not detailed here). Client-side support is also in progress.
---
## 📂 Project Structure
```plaintext
📁 server/
├── 📁 public/
│ ├── 📁 audio/ # Stores all generated MP3 files
│ │ ├── 1.mp3
│ │ ├── 2.mp3
│ │ ├── ...
│ │ ├── final.mp3 # The combined podcast
│
├── 📁 routes/
│ ├── generate.js # Main script for generating & combining audio
│
├── 📁 config/
│ ├── google-credentials.json # Google Cloud authentication
│
├── package.json # Dependencies
├── server.js # Express API server
```
---
## ⚙️ How It Works
### 1️⃣ Generate AI-Driven Speech
The conversation is structured in an array, where **male** and **female** AI voices interact using **SSML-enhanced speech**.
```javascript
const conversationData = [
  {
    male: 'Hey! Have you heard about AI-driven learning?',
    female: 'Hmm, sounds interesting! Tell me more.'
  },
  // ...
];
```
Each segment is processed using Google Cloud **Text-to-Speech** and saved as an MP3 file in `/public/audio/`.
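
As a rough illustration of the step above, the sketch below turns the conversation array into one Text-to-Speech request per spoken line. It is not the project's actual code: the helper names (`escapeXml`, `toSsml`, `buildSegments`), the voice names, the 400 ms pause, and the "male line first, then female line" numbering are all assumptions made for the example, and it assumes the dialogue is plain text rather than pre-written SSML. Each `request` has the shape accepted by `synthesizeSpeech` on the `TextToSpeechClient` from `@google-cloud/text-to-speech`; the sketch itself makes no API call.

```javascript
// Assumed voice names; check which Studio voices your project can use.
const VOICES = {
  male: 'en-US-Studio-Q',
  female: 'en-US-Studio-O',
};

// Escape characters that are special in XML so plain text is safe inside SSML.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Wrap one line of dialogue in <speak> and add a short trailing pause.
function toSsml(text) {
  return `<speak>${escapeXml(text)}<break time="400ms"/></speak>`;
}

// Flatten the conversation into ordered segments (male line, then female line)
// and assign sequential file names: 1.mp3, 2.mp3, ...
function buildSegments(conversationData) {
  const segments = [];
  for (const turn of conversationData) {
    for (const speaker of ['male', 'female']) {
      if (!turn[speaker]) continue; // skip a missing side of the exchange
      segments.push({
        fileName: `${segments.length + 1}.mp3`,
        request: {
          input: { ssml: toSsml(turn[speaker]) },
          voice: { languageCode: 'en-US', name: VOICES[speaker] },
          audioConfig: { audioEncoding: 'MP3' },
        },
      });
    }
  }
  return segments;
}
```

Each segment's `request` could then be passed to `synthesizeSpeech`, and the `audioContent` of the response written to the matching `fileName` under the audio directory.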
---
### 2️⃣ Combining Audio Files
Once all audio segments are successfully generated, they are merged into a single **final.mp3**.
```javascript
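const { exec } = require('child_process');

// fileListPath: text file listing the segment files to merge.
// finalFilePath: where the combined MP3 is written.
// Both are assumed to be defined earlier in the script.
const command = `ffmpeg -f concat -safe 0 -i "${fileListPath}" -c copy "${finalFilePath}"`;
exec(command, (error, stdout, stderr) => {
  if (error) {
    console.error('Error combining audio:', error);
  } else {
    console.log('Final podcast created:', finalFilePath);
  }
});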
```
📌 **Ensure FFmpeg is installed** for this process to work.
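
The `-f concat` input used above reads a text file with one `file '<path>'` line per segment. The sketch below shows one way such a list could be produced; `orderSegments` and `buildConcatList` are hypothetical helpers rather than functions from this repository, and the numbered naming scheme (`1.mp3`, `2.mp3`, ...) is taken from the project structure shown earlier.

```javascript
// Keep only numbered segments such as "1.mp3" and sort them numerically,
// so "10.mp3" comes after "9.mp3" and "final.mp3" is left out.
function orderSegments(fileNames) {
  return fileNames
    .filter((name) => /^\d+\.mp3$/.test(name))
    .sort((a, b) => parseInt(a, 10) - parseInt(b, 10));
}

// Build the contents of the concat list: one "file '<path>'" line per segment.
// A single quote inside a path is written as '\'' per FFmpeg's quoting rules.
function buildConcatList(fileNames, audioDir) {
  return orderSegments(fileNames)
    .map((name) => {
      const fullPath = `${audioDir}/${name}`.replace(/'/g, "'\\''");
      return `file '${fullPath}'`;
    })
    .join('\n') + '\n';
}
```

The returned string would be written to the path passed as `fileListPath`, for example with `fs.writeFileSync`. Relative paths in the list are resolved against the list file's location, so absolute paths (which `-safe 0` permits) are often the simpler choice.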
---
### 3️⃣ 🎧 Enjoy Your AI Podcast!
After completion, you can find your full AI-powered podcast at:
```
/public/audio/final.mp3
```
---
## 🛠️ Setup & Run
### Install Dependencies
```bash
npm install
```
### Start the Server
```bash
node server.js
```
### Generate Podcast
Hit the API endpoint:
```http
GET /generate-all
```
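
For reference, here is a minimal sketch of calling the endpoint from Node.js instead of a browser or `curl`. The base URL `http://localhost:3000` is an assumption, since the port is not stated here, and `generateAllUrl` and `triggerGeneration` are illustrative names, not part of the project.

```javascript
// Assumed base URL; adjust to the host and port your server actually uses.
const DEFAULT_BASE_URL = 'http://localhost:3000';

// Build the full URL of the generation endpoint.
function generateAllUrl(baseUrl = DEFAULT_BASE_URL) {
  return new URL('/generate-all', baseUrl).href;
}

// Send the GET request and resolve with the HTTP status code.
// Relies on the global fetch available in Node.js 18 and later.
async function triggerGeneration(baseUrl = DEFAULT_BASE_URL) {
  const response = await fetch(generateAllUrl(baseUrl));
  if (!response.ok) {
    throw new Error(`Generation failed with status ${response.status}`);
  }
  return response.status;
}

// Usage (with the server running):
// triggerGeneration().then((status) => console.log('Status:', status));
```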
---
## 🚀 Contributions & Improvements
This project is **open-source** and welcomes contributions! Feel free to add new features, improve SSML markup, or optimize audio merging.
Happy coding! 🎵