https://github.com/zytedata/extract-summit-contest-solutions
Example solutions for the practice and contest websites of the code contest of Web Data Extraction Summit.
- Host: GitHub
- URL: https://github.com/zytedata/extract-summit-contest-solutions
- Owner: zytedata
- License: mit
- Created: 2022-09-25T21:00:33.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-10-21T14:51:55.000Z (3 months ago)
- Last Synced: 2024-10-22T00:48:53.061Z (3 months ago)
- Language: Python
- Size: 16.6 KB
- Stars: 4
- Watchers: 6
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- License: LICENSE
README
===============================================================
Example solution for the Extract Summit 2024 Coding Competition
===============================================================

There are 2 different solution spiders, one that uses AI parsing by default and
only uses custom parsing where AI fails, and one that uses custom parsing code
only, no AI.

Both solutions are implemented with the `e-commerce spider`_ from
zyte-spider-templates.

.. _e-commerce spider: https://zyte-spider-templates.readthedocs.io/en/latest/templates/e-commerce.html

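
The "AI first, custom parsing only where AI fails" idea can be illustrated with a small, framework-free sketch. It is not taken from this repository: ``merge_with_fallback`` and ``is_empty`` are hypothetical names, and it assumes that "AI fails" means a field comes back missing or empty, which the text above does not specify.

```python
from typing import Any, Callable, Dict


def is_empty(value: Any) -> bool:
    """Treat None, empty strings and empty lists as 'the AI found nothing'."""
    return value is None or value == "" or value == []


def merge_with_fallback(
    ai_item: Dict[str, Any],
    custom_parse: Callable[[], Dict[str, Any]],
) -> Dict[str, Any]:
    """Keep AI-extracted fields and fill only the empty ones from custom parsing.

    custom_parse is called lazily, so no custom parsing happens when the
    AI output is already complete.
    """
    missing = [key for key, value in ai_item.items() if is_empty(value)]
    merged = dict(ai_item)
    if not missing:
        return merged

    custom_item = custom_parse()
    for key in missing:
        candidate = custom_item.get(key)
        if not is_empty(candidate):
            merged[key] = candidate
    return merged


# Hypothetical data: the AI output lacks a price, the custom parser supplies it.
ai_output = {"name": "Widget", "price": None}
result = merge_with_fallback(ai_output, lambda: {"name": "Other", "price": "9.99"})
```

In this sketch the AI value always wins when it is present, so the custom parser can only fill gaps and never overrides a field the AI did extract.
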
To run the AI solution::

    scrapy crawl ecommerce -s SOLUTION=ai -a url="https://zzcvcpnfzoogpxiqupsergvrmdopqgrk-744852047878.us-south1.run.app/navigation"

To run the non-AI solution::

    scrapy crawl ecommerce -s SOLUTION=non_ai -a url="https://zzcvcpnfzoogpxiqupsergvrmdopqgrk-744852047878.us-south1.run.app/navigation"
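
The two commands differ only in the value of the ``SOLUTION`` setting, so a script that needs to run either one could build the argument list with a small helper. This is a minimal sketch: ``build_crawl_command`` and ``VALID_SOLUTIONS`` are hypothetical names, and the only values assumed valid are the two shown above (``ai`` and ``non_ai``).

```python
from typing import List

# The two SOLUTION values that appear in the commands above.
VALID_SOLUTIONS = ("ai", "non_ai")


def build_crawl_command(solution: str, url: str) -> List[str]:
    """Return the argument list for one of the two crawl commands."""
    if solution not in VALID_SOLUTIONS:
        raise ValueError(
            f"solution must be one of {VALID_SOLUTIONS}, got {solution!r}"
        )
    return [
        "scrapy", "crawl", "ecommerce",
        "-s", f"SOLUTION={solution}",  # Scrapy setting override
        "-a", f"url={url}",            # spider argument
    ]


# Placeholder URL for illustration only; substitute the contest URL.
command = build_crawl_command("non_ai", "https://example.com/navigation")
```

The resulting list could be passed to ``subprocess.run(command, check=True)`` from the project root. Passing a list rather than a single shell string avoids having to quote the URL.
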