Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/cdk-dev/link-scraper
Extract Preview Data from Websites
aws-cdk cdk cdk8s cdktf constructs jsii terraform-cdk
- Host: GitHub
- URL: https://github.com/cdk-dev/link-scraper
- Owner: cdk-dev
- License: apache-2.0
- Created: 2020-10-05T08:13:46.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-10-14T21:07:48.000Z (about 2 years ago)
- Last Synced: 2024-06-09T20:38:22.911Z (5 months ago)
- Topics: aws-cdk, cdk, cdk8s, cdktf, constructs, jsii, terraform-cdk
- Language: TypeScript
- Homepage: https://cdk.dev
- Size: 611 KB
- Stars: 8
- Watchers: 5
- Forks: 2
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Content Preview Scraper
This uses [Playwright](https://github.com/microsoft/playwright) to extract content previews from a given URL. This includes:
- Generic Metadata from the DOM
- Open Graph Metadata
- Twitter Tags Metadata
- Screenshot (viewport / full)

Still to do: Scrape author data from a social media profile such as Twitter, GitHub, or LinkedIn.
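
As a rough illustration of the three metadata categories listed above, the sketch below sorts a page's `<meta>` tags into generic, Open Graph, and Twitter groups. It is not taken from this repository: the `MetaTag` and `ContentPreview` types, the `groupMetaTags` function, and the sample values are all hypothetical names chosen for the example. The function takes plain objects so it can run without a browser; with Playwright, the input could plausibly be collected with `page.$$eval("meta", ...)`, as shown in the comment.

```typescript
// Hypothetical shape: one <meta> element reduced to the attributes used for previews.
interface MetaTag {
  name?: string;     // e.g. "description", "twitter:card"
  property?: string; // e.g. "og:title"
  content?: string;
}

// Hypothetical result shape, one bucket per metadata category.
interface ContentPreview {
  generic: Record<string, string>;
  openGraph: Record<string, string>;
  twitter: Record<string, string>;
}

// With Playwright, the input could be gathered roughly like this (not executed here):
//   const tags = await page.$$eval("meta", (els) =>
//     els.map((el) => ({
//       name: el.getAttribute("name") ?? undefined,
//       property: el.getAttribute("property") ?? undefined,
//       content: el.getAttribute("content") ?? undefined,
//     })));

function groupMetaTags(tags: MetaTag[]): ContentPreview {
  const preview: ContentPreview = { generic: {}, openGraph: {}, twitter: {} };
  for (const tag of tags) {
    // Open Graph tags normally use `property`; generic and Twitter tags use `name`.
    const key = (tag.property ?? tag.name ?? "").trim().toLowerCase();
    const value = tag.content?.trim();
    if (!key || !value) continue; // skip tags without a usable key or content
    if (key.startsWith("og:")) {
      preview.openGraph[key.slice("og:".length)] = value;
    } else if (key.startsWith("twitter:")) {
      preview.twitter[key.slice("twitter:".length)] = value;
    } else {
      preview.generic[key] = value; // a later duplicate overwrites an earlier one
    }
  }
  return preview;
}

// Made-up sample input, for illustration only.
const sample = groupMetaTags([
  { name: "description", content: "An example page" },
  { property: "og:title", content: "Example title" },
  { name: "twitter:card", content: "summary_large_image" },
]);
console.log(JSON.stringify(sample, null, 2));
```

Keeping the grouping step separate from the browser call means it can be checked against fixed input. The screenshot variants would need a live page; Playwright's `page.screenshot()` accepts a `fullPage` option that covers the viewport versus full-page distinction.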