https://github.com/owainlewis/webquery
An API exposing web pages as pure data (GraphQL + JSON)
- Host: GitHub
- URL: https://github.com/owainlewis/webquery
- Owner: owainlewis
- Created: 2019-07-03T20:03:52.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2024-09-03T19:29:51.000Z (over 1 year ago)
- Last Synced: 2025-02-01T04:42:11.418Z (about 1 year ago)
- Topics: graphql, webscraper, webscraper-api
- Language: Java
- Size: 70.3 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Web Query
An API exposing web pages as pure data (GraphQL + JSON)
## API
A simple API lets you pull data from web pages: you send a page URI and a selector, and it returns the matching elements as JSON.
To get all the stories from Hacker News:
```
# HOST is a placeholder for wherever the service is running
curl -X POST "$HOST/api/v1/query" \
  -H 'Content-Type: application/json' \
  -d '{"uri": "https://news.ycombinator.com", "selector": ".storylink"}'
```
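The same request can be sketched in Java, the project's language, using the JDK's built-in `java.net.http` client (Java 11+). This is an illustrative sketch, not code from the repository: the class name `WebQueryClient`, its helper methods, and the `http://localhost:8080` address in the usage note are assumptions. Only the `/api/v1/query` path and the `uri` and `selector` fields come from the example above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical client sketch; not part of the webquery code base.
public class WebQueryClient {

    // Minimal escaping: backslashes and double quotes only.
    // Enough for typical URLs and selectors, not a full JSON encoder.
    static String jsonString(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Build the JSON body with the two fields used in the README example.
    static String buildBody(String uri, String selector) {
        return "{\"uri\": " + jsonString(uri) + ", \"selector\": " + jsonString(selector) + "}";
    }

    // Build (but do not send) the POST request.
    // baseUrl is an assumption: the address where the service is running.
    static HttpRequest buildRequest(String baseUrl, String uri, String selector) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/v1/query"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildBody(uri, selector)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No server address given: just print the body that would be sent.
            System.out.println(buildBody("https://news.ycombinator.com", ".storylink"));
            return;
        }
        HttpRequest request = buildRequest(args[0], "https://news.ycombinator.com", ".storylink");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

Running it with no arguments only prints the request body. Passing a base URL (for example `java WebQueryClient.java http://localhost:8080`, assuming the service is listening there) sends the query and prints the raw JSON response.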
The Hacker News query returns:
```json
{
  "elements": [
    {
      "tag": "a",
      "text": "How the Dat Protocol Works",
      "attributes": {
        "href": "https://datprotocol.github.io/how-dat-works/",
        "class": "storylink"
      }
    },
    {
      "tag": "a",
      "text": "Understanding Kafka with Factorio",
      "attributes": {
        "href": "https://hackernoon.com/understanding-kafka-with-factorio-74e8fc9bf181",
        "class": "storylink"
      }
    }
  ]
}
```
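The response shape above maps naturally onto a couple of small Java types. The sketch below is an assumption about how a caller might model it, not the project's own classes: `WebQueryResponse`, `Element`, `QueryResult`, and `hrefs` are hypothetical names, and the sample data is copied from the response shown above. It uses records, so it needs Java 16 or later, and it leaves JSON parsing to whatever library the caller prefers.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical model of the response; not part of the webquery code base.
public class WebQueryResponse {

    // One matched element: tag name, text content, and HTML attributes.
    record Element(String tag, String text, Map<String, String> attributes) {}

    // The top-level response object with its "elements" array.
    record QueryResult(List<Element> elements) {}

    // Collect the href attribute of every element that has one.
    static List<String> hrefs(QueryResult result) {
        return result.elements().stream()
                .map(element -> element.attributes().get("href"))
                .filter(href -> href != null)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample data taken from the README response.
        QueryResult sample = new QueryResult(List.of(
                new Element("a", "How the Dat Protocol Works", Map.of(
                        "href", "https://datprotocol.github.io/how-dat-works/",
                        "class", "storylink")),
                new Element("a", "Understanding Kafka with Factorio", Map.of(
                        "href", "https://hackernoon.com/understanding-kafka-with-factorio-74e8fc9bf181",
                        "class", "storylink"))));

        // Print one story link per line.
        hrefs(sample).forEach(System.out::println);
    }
}
```

Keeping `attributes` as a plain map mirrors the JSON, where the set of attribute names depends on the matched element.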