https://github.com/malob/article-to-audio-cloud-function
Google Cloud Function that takes a url, converts the article at that url to audio using Cloud Text-To-Speech, then stores it in a Cloud Storage bucket.
https://github.com/malob/article-to-audio-cloud-function
cloud-function google-cloud nodejs podcast podcast-generator text-to-speech
Last synced: 6 months ago
JSON representation
Google Cloud Function that takes a url, converts the article at that url to audio using Cloud Text-To-Speech, then stores it in a Cloud Storage bucket.
- Host: GitHub
- URL: https://github.com/malob/article-to-audio-cloud-function
- Owner: malob
- Created: 2018-05-14T01:45:41.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-05-16T15:47:54.000Z (almost 8 years ago)
- Last Synced: 2025-04-04T10:52:00.380Z (12 months ago)
- Topics: cloud-function, google-cloud, nodejs, podcast, podcast-generator, text-to-speech
- Language: JavaScript
- Homepage:
- Size: 31.3 KB
- Stars: 23
- Watchers: 3
- Forks: 5
- Open Issues: 1
-
Metadata Files:
- Readme: readme.md
Awesome Lists containing this project
README
# Article to Audio Google Cloud Function
This is a [Google Cloud Function](https://cloud.google.com/functions/) I hacked together that takes a url to an article on the web, and generates an audio file of it using Google's new [Cloud Text-To-Speech](https://cloud.google.com/text-to-speech/) API which has been updated with access to DeepMind's [WaveNet](https://deepmind.com/blog/wavenet-generative-model-raw-audio/) voices.
I created it as part of a project to generate a personal podcast of articles I want to consume. To get the full thing working see my [other repository](https://github.com/malob/generate-podcast-rss-cloud-function) with the Cloud Function that generates the Podcast RSS.
## Sketch of how it works
* The function accepts a POST request with json in the body.
* E.g. `{"url": "http://example.com/somearticle"}`
* It then uses the free [Mercury Web Parser](https://mercury.postlight.com/web-parser/) API to get the body of the article and some metadata.
* Since the body is returned as HTML it then converts it to plain text. I also add some of the metadata at the top of the article, since I wanted this in the audio.
* Then it slits up the body into chunks of no larger then 5,000 characters, since that's the limit on what the TTS API can handle per request.
* From there is then sends each chunk of text to Google's TTS API which returns the audio encoded as MP3, and writes them to a temporary location.
* Since having multiple files for parts of the article is annoying, it then uses FFMPEG to concatenate the audio chunks into one file.
* Finally, it stores the audio file as and object in a Google Cloud Storage bucket, along with some of the metadata.
## Configuration details
To get this working you need a Google Cloud Project with a Cloud Storage bucket setup, and the Cloud Text-To-Speech API enabled.
You'll then need to create a new Cloud Function (see configuration details below), and replace the `undefined` global constants in the code, `gcpProjectID`, `gcpBucketName`, and `mercuryApiKey`, with the appropriate values.
### Cloud Function configuration
* Trigger type: HTTP trigger
* Memory allocated: 256 MB
* Timeout: 240s
* I had to extend this from default of 60s.