Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/hamidrabedi/digikala-crawler
A crawler for Digikala built with the Django framework, Selenium, and a REST API. It also scrapes data from the gathered URLs.
crawler digikala digikala-crawler django python scraper
- Host: GitHub
- URL: https://github.com/hamidrabedi/digikala-crawler
- Owner: hamidrabedi
- Created: 2022-03-06T13:40:37.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-08-07T21:20:44.000Z (over 2 years ago)
- Last Synced: 2024-02-23T20:25:17.795Z (11 months ago)
- Topics: crawler, digikala, digikala-crawler, django, python, scraper
- Language: Python
- Homepage:
- Size: 20.5 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# DigiKala Crawler
This is a Django app for scraping data from specified Digikala search URLs.
**Install dependencies:**
`pip install -r requirements.txt`
**Make migrations:**
`python manage.py makemigrations`
**Migrate:**
`python manage.py migrate`
**Run project:**
`python manage.py runserver`
# API Guide
**You can use for swagger documentation**
- **URL**
- **Method**
`POST`
- **Data Params**
- Required:
A Digikala search URL, such as: `category=https://www.digikala.com/search/?q=%D8%B4%DB%8C%D8%A7%D8%A6%D9%88%D9%85%DB%8C`
- Optional:
The number of pages to crawl, such as: `pages=15` (default is 5)
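As a rough illustration of the parameters above, the sketch below builds a `POST` request carrying `category` and `pages`. It makes several assumptions: the endpoint route is not given in this README, so `endpoint` is a placeholder you must supply; the helper name `build_crawl_request` is hypothetical; and the body is sent form-encoded, which may not match what the API expects.

```python
from urllib.parse import urlencode
from urllib.request import Request

DEFAULT_PAGES = 5  # default page count stated in this README


def build_crawl_request(endpoint, category, pages=DEFAULT_PAGES):
    """Build (but do not send) a POST request for the crawl API.

    `endpoint` is a placeholder: the README does not state the route.
    """
    if pages < 1:
        raise ValueError("pages must be a positive integer")
    # Assumption: the API accepts a form-encoded body.
    body = urlencode({"category": category, "pages": pages}).encode("utf-8")
    return Request(endpoint, data=body, method="POST")


request = build_crawl_request(
    "http://127.0.0.1:8000/your-endpoint/",  # hypothetical route on the dev server
    "https://www.digikala.com/search/?q=%D8%B4%DB%8C%D8%A7%D8%A6%D9%88%D9%85%DB%8C",
    pages=15,
)
```

With the project running via `python manage.py runserver`, the request could then be sent with `urllib.request.urlopen(request)` once the real route is substituted.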
Planned improvements:
- Tests for API, data, and model validation
- Crawl products based on availability of packs
- Automate crawling based on categories
- More specific models
- Crawl all ratings and comments for a fuller view of each product