https://github.com/umuthopeyildirim/puppateer-screenshot

The Google Cloud function that can take screenshots of websites.

google-cloud-functions google-cloud-storage javascript puppeteer

# Puppateer Screenshot

![Website Screenshot API](./images/website-screenshot-api.png)

In this blog post, I describe the steps I took to set up this API. Let's dive in!

## Puppeteer
[Puppeteer](https://developers.google.com/web/tools/puppeteer) is a Node.js package that allows you to control a headless Chrome browser using JavaScript. A headless Chrome browser is just a browser without a window.

I can use this package to spin up a headless Chrome instance, navigate to a website and take a screenshot.

To start, I'm going to create a local Node.js project and install the puppeteer package.

```bash
npm init
npm install puppeteer
```

Now I can create a file called `index.js` and add the following code.

```js
const puppeteer = require('puppeteer');

takeScreenshot()
  .then(() => {
    console.log("Screenshot taken");
  })
  .catch((err) => {
    console.log("Error occurred!");
    console.dir(err);
  });

async function takeScreenshot() {
  // Launch headless Chrome and open a new tab.
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Wait until network activity has mostly settled before capturing.
  await page.goto("https://medium.com", {waitUntil: 'networkidle2'});

  // Passing `path` writes the PNG to disk.
  await page.screenshot({
    path: './screenshot.png'
  });

  await page.close();
  await browser.close();
}
```

Note that I am making the `takeScreenshot()` function `async`. This way I can use the `await` keyword inside the function to wait for each promise to resolve.

After running the code I get the following screenshot! 🎉

![Screenshot of Medium](./images/medium-screenshot.png)

## Google Cloud Functions
So I now have a local script that I can call to take a screenshot, but I want to build an API. The next logical step is to put this script on a server somewhere.

I don't want to worry about my server running out of memory, so I'm going to put it on [Google Cloud Functions](https://cloud.google.com/functions/). This way it can handle a huge number of requests without me having to worry about buying more RAM.

Once I have the cloud function running, I can call it with an HTTP request, meaning that I will have a working screenshot API 🚀

Let's port the previous code to the Google Cloud Function format. The cloud function I created is `async` and called `run()`.
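
A minimal sketch of what such a port might look like before any storage is involved, assuming the request body is JSON with a `url` field and that the function simply returns the PNG in the response. `makeHandler` is a hypothetical helper introduced here so the handler can be exercised with a fake screenshot function; it is not part of the original project.

```javascript
// Sketch only: an HTTP Cloud Function that returns the screenshot directly.
// Assumes a JSON request body such as {"url": "https://example.com"}.

async function takeScreenshot(url) {
  // Required lazily so this file loads even where puppeteer is not installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({ args: ['--no-sandbox'] });
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2' });
    // Without a `path` option, screenshot() returns the image data.
    return await page.screenshot();
  } finally {
    // Close the browser even if navigation or the capture fails.
    await browser.close();
  }
}

// Hypothetical helper: wraps a screenshot function in an HTTP handler.
function makeHandler(screenshotFn) {
  return async (req, res) => {
    try {
      const buffer = await screenshotFn(req.body.url);
      res.setHeader('content-type', 'image/png');
      res.status(200).send(buffer);
    } catch (error) {
      res.setHeader('content-type', 'application/json');
      res.status(422).send(JSON.stringify({ error: error.message }));
    }
  };
}

// The exported name is the entry point configured for the function.
exports.run = makeHandler(takeScreenshot);
```

Injecting the screenshot function keeps the HTTP handling separate from the browser work, which makes it possible to check the response logic locally without launching Chrome.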

So far I have a working screenshot API. But I'm going to extend it by uploading the screenshots directly to Google Cloud Storage.

I'm going to use the `@google-cloud/storage` npm package for this.
Note that I have already created a Google Cloud Storage bucket for this; its name goes in the `BUCKET_NAME` constant in the code below. Check out [this page](https://cloud.google.com/storage/docs/quickstart-console) for how to set up a storage bucket.

```js
const puppeteer = require('puppeteer');
const { Storage } = require('@google-cloud/storage');

const GOOGLE_CLOUD_PROJECT_ID = "portfolio-umut-yildirim";
const BUCKET_NAME = "screenshot-jobs-portfolio-umut-yildirim";

// HTTP entry point. Expects a JSON body with `url` and `name`.
exports.run = async (req, res) => {
  res.setHeader("content-type", "application/json");

  try {
    const buffer = await takeScreenshot(req.body);

    const screenshotUrl = await uploadToGoogleCloud(buffer, req.body.name + ".png");

    res.status(200).send(JSON.stringify({
      'screenshotUrl': screenshotUrl
    }));

  } catch(error) {
    res.status(422).send(JSON.stringify({
      error: error.message,
    }));
  }
};

async function uploadToGoogleCloud(buffer, filename) {
  const storage = new Storage({
    projectId: GOOGLE_CLOUD_PROJECT_ID,
  });

  const bucket = storage.bucket(BUCKET_NAME);

  const file = bucket.file(filename);
  await uploadBuffer(file, buffer);

  // Make the object publicly readable so the returned URL works without auth.
  await file.makePublic();

  return `https://${BUCKET_NAME}.storage.googleapis.com/${filename}`;
}

async function takeScreenshot(params) {
  const browser = await puppeteer.launch({
    args: ['--no-sandbox']
  });
  try {
    const page = await browser.newPage();
    await page.goto(params.url, {waitUntil: 'networkidle2'});

    // Without a `path` option, screenshot() only returns the image data.
    return await page.screenshot();
  } finally {
    // Close the browser even if navigation or the screenshot fails.
    await browser.close();
  }
}

function uploadBuffer(file, buffer) {
  return new Promise((resolve, reject) => {
    // Reject on failure so upload errors reach the catch block in run().
    file.save(buffer, (err) => {
      if (err) {
        reject(err);
      } else {
        resolve();
      }
    });
  });
}
```
The new result: my Postman client shows the URL of the screenshot 🚀
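
For readers without Postman, the same request can be sketched in Node.js. This assumes Node 18 or later (for the global `fetch`), and `FUNCTION_URL` is a hypothetical placeholder for the trigger URL of your own deployment. The `url` and `name` fields match what the function reads from `req.body`.

```javascript
// Hypothetical placeholder: replace with the trigger URL of your deployed function.
const FUNCTION_URL = 'https://REGION-PROJECT_ID.cloudfunctions.net/run';

// Build the fetch() options for a screenshot request.
function buildScreenshotRequest(url, name) {
  return {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ url, name }),
  };
}

// Call the function and return the screenshot URL from the JSON response.
async function requestScreenshot(url, name) {
  const response = await fetch(FUNCTION_URL, buildScreenshotRequest(url, name));
  const payload = await response.json();
  if (!response.ok) {
    // The function reports failures as {"error": "..."} with status 422.
    throw new Error(payload.error || `Request failed with status ${response.status}`);
  }
  return payload.screenshotUrl;
}

// Usage, once FUNCTION_URL points at a real deployment:
// requestScreenshot('https://medium.com', 'medium').then(console.log);
```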

Note that in the code above each screenshot is saved under the `name` sent in the request body, so two requests with the same name overwrite each other on Google Cloud Storage. In the real world, you would need to generate a random ID for each image.
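
One way to generate such an ID, as a sketch: append a UUID from Node's built-in `crypto` module to the file name. `uniqueFilename` is a hypothetical helper, not part of the original code, and it assumes Node 14.17 or later, where `crypto.randomUUID()` is available.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical helper: build a unique object name for each screenshot.
function uniqueFilename(name) {
  // Keep only characters that are safe in a URL path; fall back to a default.
  const safeName = String(name || 'screenshot').replace(/[^a-zA-Z0-9_-]/g, '_');
  return `${safeName}-${randomUUID()}.png`;
}
```

In the function above, this could replace `req.body.name + ".png"` as the second argument to `uploadToGoogleCloud()`.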

## Conclusion
Here's the source of a [Google Cloud function](https://cloud.google.com/functions/) that, using [Puppeteer](https://pptr.dev/), takes a screenshot of a given website and stores the resulting screenshot in a bucket on Google Cloud Storage.
This was a fun project to do.

You can find my blog post [here](https://portfolio-umut-yildirim.web.app/blog/building-a-website-screenshot-api-with-puppeteer-and-google-cloud-functions).

Thanks for reading!