Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/snirsh/aws-scraper-js
A full JavaScript scraper running on AWS Lambda, configured with SAM and GitHub workflows
- Host: GitHub
- URL: https://github.com/snirsh/aws-scraper-js
- Owner: snirsh
- Created: 2024-06-30T12:49:12.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-07-04T11:17:19.000Z (4 months ago)
- Last Synced: 2024-10-11T10:25:16.834Z (28 days ago)
- Language: JavaScript
- Size: 110 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# JS Scraping example
## Introduction
This is a simple example of how to scrape a website using JavaScript.
I kept the implementation deliberately simple so that you have a skeleton to work with and experiment on.

The main scraper is in the `scraper.js` file. It uses `requests` and parses the HTML using regex. This could have been done with a library like `cheerio`, but I wanted to keep it simple.

The `index.js` file is the entry point of the application. It uses `scraper.js` to scrape the website and then saves the data to a file.
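
To make the regex approach concrete, here is a minimal sketch of the parse-and-save flow. It is not the code from `scraper.js`: the helper names `extractLinks` and `saveResults`, the sample HTML, and the output file name are all assumptions made for illustration, and the fetch step is left out so the example runs offline.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: pull the href and visible text of every <a> tag
// out of an HTML string using a regex. Regex parsing is fragile on
// arbitrary HTML, but it is enough for a small, predictable page.
function extractLinks(html) {
  const linkPattern = /<a\s[^>]*href="([^"]*)"[^>]*>(.*?)<\/a>/gis;
  const links = [];
  for (const match of html.matchAll(linkPattern)) {
    links.push({
      href: match[1],
      // Strip any nested tags from the link text.
      text: match[2].replace(/<[^>]+>/g, "").trim(),
    });
  }
  return links;
}

// Hypothetical helper: save the scraped data as pretty-printed JSON.
function saveResults(filePath, data) {
  fs.writeFileSync(filePath, JSON.stringify(data, null, 2), "utf8");
}

// Inline sample standing in for a fetched page (assumption).
const sampleHtml = `
  <ul>
    <li><a href="/first">First <b>post</b></a></li>
    <li><a class="x" href="https://example.com/second">Second post</a></li>
  </ul>`;

const links = extractLinks(sampleHtml);
const outputPath = path.join(os.tmpdir(), "scraped-links.json");
saveResults(outputPath, links);
console.log(links);
```

If the target markup becomes less regular (attributes in single quotes, nested anchors, malformed tags), a real HTML parser such as `cheerio` is usually the safer choice.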
## What's SAM?
AWS SAM (Serverless Application Model) is an open-source framework for building and deploying serverless applications on AWS.
You can check the `template.yaml` file to see how the resources are defined.
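
For orientation, a SAM function resource names a handler, which is an exported function that Lambda invokes. The sketch below shows what such a handler could look like for a scraper. It is an assumption-laden illustration, not the repository's code: the `scrape` stub, the `event.url` field, and the `TARGET_URL` environment variable are all hypothetical.

```javascript
// Stand-in for the real scraper (assumption); it only echoes the URL back.
async function scrape(url) {
  return { url, items: [] };
}

// Lambda entry point. The URL is read from the event first, then from a
// hypothetical TARGET_URL environment variable, then a placeholder default.
async function handler(event) {
  const url =
    (event && event.url) || process.env.TARGET_URL || "https://example.com";
  try {
    const result = await scrape(url);
    return { statusCode: 200, body: JSON.stringify(result) };
  } catch (err) {
    // Report the failure in the response instead of letting the invocation throw.
    return { statusCode: 500, body: JSON.stringify({ error: err.message }) };
  }
}

module.exports = { handler };
```

If this file were saved as `index.js`, the corresponding `Handler` value in a SAM template would be `index.handler`.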