https://github.com/not-raspberry/aio_crawler
AIO single website crawler
- Host: GitHub
- URL: https://github.com/not-raspberry/aio_crawler
- Owner: not-raspberry
- License: mit
- Created: 2016-08-13T23:39:22.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2016-08-16T10:26:51.000Z (almost 10 years ago)
- Last Synced: 2025-01-29T06:50:03.755Z (over 1 year ago)
- Topics: asyncio, crawler, python3
- Language: Python
- Size: 12.7 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- License: LICENSE
aio_crawler |status|
====================
.. |status| image:: https://travis-ci.org/not-raspberry/aio_crawler.svg?branch=master
   :target: https://travis-ci.org/not-raspberry/aio_crawler

Single site web crawler using aiohttp.

Usage
-----

Install from source::

    ./setup.py install

.. code::

    $ aio_crawler --help
    Usage: aio_crawler [OPTIONS] SITE_ADDRESS

      Crawl the website and print results to stdout.

    Options:
      -c, --concurrency INTEGER  Number of parallel downloads.
      -t, --timeout FLOAT        Timeout of each single request.
      -v, --verbose
      --help                     Show this message and exit.
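The ``--concurrency`` and ``--timeout`` options describe a common asyncio pattern: cap the number of downloads in flight and bound each request by a deadline. The sketch below illustrates that pattern with the standard library only. It is not taken from this project's source, and it replaces aiohttp requests with a sleep so it runs without network access. The names ``fake_fetch``, ``crawl`` and ``run`` are hypothetical.

```python
import asyncio


async def fake_fetch(url, delay):
    """Hypothetical stand-in for an HTTP request: sleeps, then returns a body."""
    await asyncio.sleep(delay)
    return "body of " + url


async def crawl(urls_with_delays, concurrency, timeout):
    """Fetch every URL, at most `concurrency` at once, each bounded by `timeout`."""
    semaphore = asyncio.Semaphore(concurrency)
    results = {}

    async def fetch_one(url, delay):
        # The semaphore caps the number of simultaneous downloads.
        async with semaphore:
            try:
                # wait_for cancels the request once `timeout` seconds elapse.
                results[url] = await asyncio.wait_for(
                    fake_fetch(url, delay), timeout
                )
            except asyncio.TimeoutError:
                # Record a timed-out page as None instead of aborting the crawl.
                results[url] = None

    await asyncio.gather(*(fetch_one(u, d) for u, d in urls_with_delays))
    return results


def run(coro):
    """Run a coroutine on a fresh event loop and close the loop afterwards."""
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()


if __name__ == "__main__":
    pages = [("/a", 0.01), ("/b", 0.01), ("/slow", 0.5)]
    print(run(crawl(pages, concurrency=2, timeout=0.1)))
```

In this sketch the timeout clock starts only after a download slot is acquired, so time spent waiting for a free slot does not count against the request. Whether ``aio_crawler`` behaves the same way is not stated in this README.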

Development
-----------

It's strongly advised to use a virtualenv.

Install dependencies and the CLI hook::

    ./setup.py develop

Install test dependencies::

    pip install -e '.[tests]'

System requirements
-------------------

Python 3.5.

If your OS does not ship Python 3.5, use pyenv_. It's miserable but better than nothing.

.. _pyenv: https://github.com/yyuu/pyenv