Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/maximiliancw/crawlio
Asynchronous web crawling and scraping with Python for minimalists
asyncio crawler fastapi framework picocss python scraper vuejs
- Host: GitHub
- URL: https://github.com/maximiliancw/crawlio
- Owner: maximiliancw
- License: gpl-3.0
- Created: 2021-11-23T21:18:44.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2023-06-15T00:04:42.000Z (over 1 year ago)
- Last Synced: 2024-08-10T19:04:29.531Z (5 months ago)
- Topics: asyncio, crawler, fastapi, framework, picocss, python, scraper, vuejs
- Language: Python
- Homepage:
- Size: 714 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: COPYING
README
# crawlio
Asynchronous web crawling and scraping with Python for minimalists

## Features
- Crawling: download an entire website in just a few seconds
- Scraping: Customizable XPath & CSS data selectors
- Zero-configuration: get up and running with ~5 LoC
- Interfaces: Python script + Web UI + JSON API

Built with:
- `asyncio`
- `aiohttp`
- `parsel`
- `FastAPI`
- `VueJS`

## Setup
```bash
pip install crawlio
```

## Usage
Create a custom `Crawler` instance and run it:
```python
import asyncio
from crawlio import Crawler, Selector

# Configure the crawler with a start URL and the data selectors to apply
bot = Crawler(
    url='https://quotes.toscrape.com/',
    selectors=[
        Selector('links', '//a/@href'),
        Selector('heading', type='xpath', query='//h3//text()', process=lambda items: ' '.join(items))
    ]
)

# Run the crawl to completion and print each scraped item
output = asyncio.run(bot.run())
for item in output["data"]:
    print(item)
```

## License
Copyright (C) 2022-2023 Maximilian Wolf

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.