{"id":22165445,"url":"https://github.com/cypher-o/voice-bridge","last_synced_at":"2025-10-08T21:01:53.692Z","repository":{"id":262525323,"uuid":"887534657","full_name":"Cypher-O/voice-bridge","owner":"Cypher-O","description":"APIs for Text-to-Speech (TTS), Speech-to-Text (STT), and Document Reading (DOCX/PDF)","archived":false,"fork":false,"pushed_at":"2024-11-12T22:56:30.000Z","size":83,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-29T21:50:46.850Z","etag":null,"topics":["document-reader","stt-api","tts-api","typescript"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Cypher-O.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-12T21:34:25.000Z","updated_at":"2024-11-12T22:56:34.000Z","dependencies_parsed_at":"2024-11-19T20:48:12.051Z","dependency_job_id":null,"html_url":"https://github.com/Cypher-O/voice-bridge","commit_stats":null,"previous_names":["cypher-o/voice-bridge"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cypher-O%2Fvoice-bridge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cypher-O%2Fvoice-bridge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cypher-O%2Fvoice-bridge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Cypher-O%2Fvoice-bridge/manifests","owner_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/owners/Cypher-O","download_url":"https://codeload.github.com/Cypher-O/voice-bridge/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245304873,"owners_count":20593626,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["document-reader","stt-api","tts-api","typescript"],"created_at":"2024-12-02T05:14:55.461Z","updated_at":"2025-10-08T21:01:53.681Z","avatar_url":"https://github.com/Cypher-O.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Document Processing and Speech API\n\nA TypeScript-based Express API for document processing (PDF/DOCX), speech-to-text, and text-to-speech conversion using Google Cloud services.\n\n## Features\n\n- 📄 Document Processing (PDF \u0026 DOCX)\n- 🎤 Speech to Text Conversion\n- 🔊 Text to Speech Conversion\n- ⚡ Rate Limiting\n- 🔒 Type Safety\n- 📝 Standardized API Responses\n\n## Prerequisites\n\n- Node.js (v14 or higher)\n- TypeScript (v4 or higher)\n- Google Cloud Account with Speech \u0026 Text-to-Speech APIs enabled\n- Service Account Key from Google Cloud\n\n## Installation\n\n1. Clone the repository:\n\n```bash\ngit clone \u003crepository-url\u003e\ncd document-speech-api\n```\n\n2. Install dependencies:\n\n```bash\nnpm install\n```\n\n3. 
Set up environment variables:\nCreate a `.env` file in the root directory:\n\n```env\nPORT=3000\nGOOGLE_APPLICATION_CREDENTIALS=\"path/to/your/service-account-key.json\"\nRATE_LIMIT_WINDOW_MS=900000\nRATE_LIMIT_MAX_REQUESTS=100\n```\n\n## Project Structure\n\n```\nsrc/\n├── controllers/\n│   ├── documentReaderController.ts\n│   ├── speechToTextController.ts\n│   └── textToSpeechController.ts\n├── services/\n│   ├── documentReaderService.ts\n│   ├── speechToTextService.ts\n│   └── textToSpeechService.ts\n├── routes/\n│   └── apiRoutes.ts\n├── types/\n│   ├── express.d.ts\n│   └── api_response.ts\n├── utils/\n│   ├── api_response.ts\n│   └── logger.ts\n├── middlewares/\n│   ├── errorHandlerMiddleware.ts\n│   └── rateLimitMiddleware.ts\n├── app.ts\n└── server.ts\n```\n\n## Dependencies\n\n```json\n{\n  \"dependencies\": {\n    \"@google-cloud/speech\": \"^latest\",\n    \"@google-cloud/text-to-speech\": \"^latest\",\n    \"express\": \"^latest\",\n    \"multer\": \"^latest\",\n    \"pdf-parse\": \"^latest\",\n    \"mammoth\": \"^latest\",\n    \"express-rate-limit\": \"^latest\",\n    \"dotenv\": \"^latest\"\n  },\n  \"devDependencies\": {\n    \"@types/express\": \"^latest\",\n    \"@types/multer\": \"^latest\",\n    \"@types/node\": \"^latest\",\n    \"typescript\": \"^latest\",\n    \"ts-node\": \"^latest\",\n    \"nodemon\": \"^latest\"\n  }\n}\n```\n\n## API Endpoints\n\n### 1. Document Reading\n\n```http\nPOST /api/read-document\nContent-Type: multipart/form-data\n```\n\n#### Request\n\n- `file`: PDF or DOCX file\n\n#### Response\n\n```json\n{\n  \"code\": 0,\n  \"status\": \"success\",\n  \"message\": \"Document read successfully\",\n  \"data\": {\n    \"text\": \"extracted text content\",\n    \"fileName\": \"document.pdf\",\n    \"fileType\": \"application/pdf\"\n  }\n}\n```\n\n### 2. 
Speech to Text\n\n```http\nPOST /api/speech-to-text\nContent-Type: multipart/form-data\n```\n\n#### Request\n\n- `audio`: Audio file (MP3)\n\n#### Response\n\n```json\n{\n  \"code\": 0,\n  \"status\": \"success\",\n  \"message\": \"Audio transcribed successfully\",\n  \"data\": {\n    \"text\": \"transcribed text\",\n    \"audioFileName\": \"audio.mp3\",\n    \"duration\": 10.5\n  }\n}\n```\n\n### 3. Text to Speech\n\n```http\nPOST /api/text-to-speech\nContent-Type: application/json\n```\n\n#### Request\n\n```json\n{\n  \"text\": \"Text to convert to speech\",\n  \"voice\": \"en-US\",  // optional\n  \"speed\": 1.0       // optional\n}\n```\n\n#### Response\n\n- Audio stream (audio/mpeg) if successful\n- Error response if failed:\n\n```json\n{\n  \"code\": 1,\n  \"status\": \"error\",\n  \"message\": \"Error message\"\n}\n```\n\n## Error Codes\n\n- 0: Success\n- 400: Bad Request\n- 401: Unauthorized\n- 403: Forbidden\n- 404: Not Found\n- 500: Internal Server Error\n\n## Usage Examples\n\n### Using axios\n\n```typescript\nimport axios from 'axios';\n\n// Document Reading\nconst readDocument = async (file: File) =\u003e {\n  const formData = new FormData();\n  formData.append('file', file);\n  \n  try {\n    const response = await axios.post('/api/read-document', formData, {\n      headers: {\n        'Content-Type': 'multipart/form-data'\n      }\n    });\n    return response.data;\n  } catch (error) {\n    console.error('Error reading document:', error);\n    throw error;\n  }\n};\n\n// Speech to Text\nconst convertSpeechToText = async (audioFile: File) =\u003e {\n  const formData = new FormData();\n  formData.append('audio', audioFile);\n  \n  try {\n    const response = await axios.post('/api/speech-to-text', formData, {\n      headers: {\n        'Content-Type': 'multipart/form-data'\n      }\n    });\n    return response.data;\n  } catch (error) {\n    console.error('Error converting speech to text:', error);\n    throw error;\n  }\n};\n\n// Text to 
Speech\nconst convertTextToSpeech = async (text: string) =\u003e {\n  try {\n    const response = await axios.post('/api/text-to-speech',\n      { text },\n      { responseType: 'blob' }\n    );\n    return response.data;\n  } catch (error) {\n    console.error('Error converting text to speech:', error);\n    throw error;\n  }\n};\n```\n\n## Running the Application\n\n1. Development mode:\n\n```bash\nnpm run dev\n```\n\n2. Production mode:\n\n```bash\nnpm run build\nnpm start\n```\n\n## Setting Up Google Cloud Credentials\n\n1. Create a project in Google Cloud Console\n2. Enable Speech-to-Text and Text-to-Speech APIs\n3. Create a service account and download the key file\n4. Set the path to your key file in the `GOOGLE_APPLICATION_CREDENTIALS` environment variable\n\n## Rate Limiting\n\nThe API includes rate limiting to prevent abuse. Default settings:\n\n- 100 requests per 15-minute window\n- Customize these values in the `.env` file (`RATE_LIMIT_MAX_REQUESTS` and `RATE_LIMIT_WINDOW_MS`)\n\n## Contributing\n\n1. Fork the repository\n2. Create your feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit your changes (`git commit -m 'Add some amazing feature'`)\n4. Push to the branch (`git push origin feature/amazing-feature`)\n5. Open a Pull Request\n\n## License\n\nThis project is licensed under the MIT License - see the LICENSE file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcypher-o%2Fvoice-bridge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcypher-o%2Fvoice-bridge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcypher-o%2Fvoice-bridge/lists"}