https://github.com/sio/scrapehelper
Helpful library for scraping information from web
- Host: GitHub
- URL: https://github.com/sio/scrapehelper
- Owner: sio
- License: apache-2.0
- Created: 2019-06-12T10:49:28.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2023-05-05T13:25:03.000Z (almost 2 years ago)
- Last Synced: 2025-02-11T20:49:48.796Z (3 months ago)
- Language: Python
- Size: 18.6 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Helpful library for scraping information from web
## Project status
New project, used by one person. The API changes should be backwards
compatible most of the time.

## Overview
This library provides helper tools for efficient and polite web scraping:
- Thread safe `RateLimiter` object
- Nice `BaseDataFetcher` class for creating custom data fetchers

## Installation
1. As a standalone Python package:
`pip install "https://github.com/sio/scrapehelper/tarball/master"`
2. As a dependency in your setup.py:
```python
install_requires=[
    'scrapehelper @ https://github.com/sio/scrapehelper/tarball/master',
    # other dependencies
],
```

## Usage
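The Overview mentions a thread-safe rate limiter as a tool for polite scraping. As a rough illustration of that general idea only, here is a standalone sketch built on the standard library. It is not scrapehelper's code or API: the class `MinIntervalLimiter`, its `wait` method, and the `demo` helper are hypothetical names invented for this example, and the spacing strategy (one reserved time slot per caller) is an assumption about one reasonable way to do it.

```python
import threading
import time


class MinIntervalLimiter:
    """Hypothetical helper, NOT part of scrapehelper.

    Spaces calls from any number of threads at least `interval` seconds apart.
    """

    def __init__(self, interval):
        self.interval = interval
        self._lock = threading.Lock()
        self._next_allowed = 0.0  # monotonic timestamp of the next free slot

    def wait(self):
        """Reserve the next free slot, sleep until it arrives, return the delay."""
        with self._lock:
            now = time.monotonic()
            slot = max(now, self._next_allowed)
            # Book the slot while holding the lock so no other thread takes it
            self._next_allowed = slot + self.interval
        delay = slot - now
        if delay > 0:
            time.sleep(delay)  # sleep outside the lock so others can book slots
        return delay


def demo(workers=4, interval=0.05):
    """Run `workers` threads through one limiter; return sorted finish times."""
    limiter = MinIntervalLimiter(interval)
    stamps = []
    stamps_lock = threading.Lock()

    def work():
        limiter.wait()  # a real scraper would send its HTTP request after this
        with stamps_lock:
            stamps.append(time.monotonic())

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(stamps)
```

Calling `demo()` starts four threads at once, yet their finish times end up spread roughly `interval` seconds apart, because each thread sleeps until the slot it reserved. For the behavior of the real `scrapehelper.limit.RateLimiter`, the module source remains the reference.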
API docs are yet to be written. The primary objects provided by this library
are `scrapehelper.fetch.BaseDataFetcher` and `scrapehelper.limit.RateLimiter`.

Check the code of the corresponding modules for more information.
Documentation improvements submitted via pull requests are very welcome!

## Support and contributing
If you need help with integrating this library into your Python project, please
create **[an issue](https://github.com/sio/scrapehelper/issues)**. Issues are
also the primary venue for reporting bugs and posting feature requests.
General discussion related to this project is also acceptable and very
welcome!

In case you wish to contribute code or documentation, feel free to open **[a
pull request](https://github.com/sio/scrapehelper/pulls)**. That would certainly
make my day!

I'm open to dialog and I promise to behave responsibly and treat all
contributors with respect. Please try to do the same, and treat others the way
you want to be treated.

If for some reason you'd rather not use the issue tracker, contacting me via
email is OK too. Please use a descriptive subject line so that your message
is easy to notice. Also please keep in mind that public discussion channels are
preferable because that way many other people may benefit from reading past
conversations. My email is visible under the GitHub profile and in the commit
log.

## License and copyright
Copyright 2019 Vitaly Potyarkin
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.