Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/praneethkarnena/django-scraping-api
A Django API project to scrape web page when requested through an API.
- Host: GitHub
- URL: https://github.com/praneethkarnena/django-scraping-api
- Owner: PraneethKarnena
- Created: 2019-08-06T17:20:39.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T05:59:12.000Z (about 2 years ago)
- Last Synced: 2024-04-17T00:59:19.418Z (9 months ago)
- Topics: api, api-server, django, django-rest-framework, python, python3
- Language: Python
- Homepage:
- Size: 18.6 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 4
Metadata Files:
- Readme: README.md
README
## Django - Asynchronous Web Scraping
A simple Django project to demonstrate the scraping of a website asynchronously.
**Demo:** [https://django-async-web-scraping.herokuapp.com/](https://django-async-web-scraping.herokuapp.com/)
**Tools**:
- Django
- Django Rest Framework
- Heroku

**How this works:**
- The user makes a `POST` request to the API, sending a list of `URLs` and an `Email ID` as the payload.
- The system then downloads each `URL`, along with its HTML files and static assets, asynchronously.
- It compresses the downloaded files and sends an email to the address given in the payload.

**Routes:**
- Base URL: `https://django-async-web-scraping.herokuapp.com/api/v1/`
- Scrape Request: `https://django-async-web-scraping.herokuapp.com/api/v1/scrape/`
- Request Method: `POST`
- Schema:
```json
{
  "urls": [
    "https://www.google.co.in/",
    "https://www.apple.com/"
  ],
  "email": "[email protected]"
}
```
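
As a rough illustration of the scrape request described above, the following sketch builds the `POST` request using only the Python standard library. The endpoint and the `urls`/`email` keys come from the routes and schema above. The helper name `build_scrape_request`, the `application/json` content type, and the placeholder address `user@example.com` are assumptions made for this example and are not taken from the project.

```python
import json
import urllib.request

# Endpoint as listed under "Routes".
SCRAPE_URL = "https://django-async-web-scraping.herokuapp.com/api/v1/scrape/"


def build_scrape_request(urls, email, endpoint=SCRAPE_URL):
    """Build (but do not send) a POST request carrying the JSON payload.

    Hypothetical helper for illustration; it is not part of the project.
    """
    payload = {"urls": list(urls), "email": email}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the API accepts a JSON-encoded request body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_scrape_request(
    ["https://www.google.co.in/", "https://www.apple.com/"],
    "user@example.com",  # placeholder address, not from the project
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To send the request, pass it to `urllib.request.urlopen(request)`. The response format is not documented here, and the demo deployment may no longer be reachable, so error handling around that call is advisable.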