https://github.com/skykery/urlfetcher
Microservice for HTTP requests with Tor proxies that can handle HTML parsing
lxml proxies python requests tor
- Host: GitHub
- URL: https://github.com/skykery/urlfetcher
- Owner: skykery
- Created: 2022-09-14T18:56:49.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2022-10-23T14:22:40.000Z (over 3 years ago)
- Last Synced: 2024-12-30T00:34:47.395Z (over 1 year ago)
- Topics: lxml, proxies, python, requests, tor
- Language: Python
- Homepage:
- Size: 16.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
README
# URLFetcher
A microservice that can handle HTTP requests and can serve parsed elements using CSS Selectors and XPath.
The microservice runs on FastAPI as its web framework, relies on lxml for HTML parsing, and routes outgoing requests through Tor nodes and proxies to hide where they originate.
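To give a feel for the parsing side, here is a minimal sketch of extracting elements from an HTML string with lxml and XPath. It is not taken from this repository: the `extract` helper and the sample markup are hypothetical and only illustrate the general technique.

```python
from lxml import html

# Hypothetical sample page, used only for this illustration.
SAMPLE_HTML = """
<html><body>
  <h1 class="title">Example page</h1>
  <ul>
    <li><a href="/first">First</a></li>
    <li><a href="/second">Second</a></li>
  </ul>
</body></html>
"""


def extract(document: str, xpath: str) -> list[str]:
    """Return the text or attribute values matched by an XPath expression."""
    tree = html.fromstring(document)
    results = []
    for match in tree.xpath(xpath):
        if isinstance(match, str):
            # text() and @attribute queries return string results.
            results.append(match.strip())
        else:
            # Element results: collect the text of the element and its children.
            results.append(match.text_content().strip())
    return results


titles = extract(SAMPLE_HTML, "//h1[@class='title']")
links = extract(SAMPLE_HTML, "//li/a/@href")
print(titles, links)
```

lxml can also evaluate CSS selectors through the `cssselect()` method on elements, which needs the separate `cssselect` package to be installed.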
To run it locally:
`docker-compose -f stack.yaml up -d`
I published [URLWorker](https://urlworker.techwetrust.com/), a project based on this microservice, which you can try yourself for free.
### Supports:
- Retries
- Free proxies by default
- Custom proxy
- CSS selectors
- JavaScript rendering
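
As a rough illustration of how retries and proxy selection can fit together, the sketch below uses only the standard library. It is not this repository's implementation: `build_proxies`, `with_retries`, and the Tor address `127.0.0.1:9050` (Tor's usual local SOCKS port) are assumptions made for the example.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Assumption: a Tor client listening on its usual local SOCKS port.
TOR_SOCKS_URL = "socks5h://127.0.0.1:9050"


def build_proxies(custom_proxy: Optional[str] = None) -> dict[str, str]:
    """Use the caller's proxy if one is given, otherwise fall back to Tor."""
    proxy = custom_proxy or TOR_SOCKS_URL
    return {"http": proxy, "https": proxy}


def with_retries(action: Callable[[], T], attempts: int = 3, delay: float = 0.5) -> T:
    """Call `action` until it succeeds or `attempts` tries have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == attempts:
                # Out of tries: let the last error reach the caller.
                raise
            # Wait a little longer after each failed try.
            time.sleep(delay * attempt)
    raise RuntimeError("unreachable")
```

With the `requests` library, the mapping could be passed as `requests.get(url, proxies=build_proxies(), timeout=10)` inside the retried action. SOCKS proxies need the `requests[socks]` extra, and the `socks5h` scheme asks the proxy to resolve hostnames, so DNS lookups also go through Tor.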
You can create a free account at [https://urlworker.techwetrust.com/](https://urlworker.techwetrust.com/) or watch my tutorials on how to use it in the [Examples](https://urlworker.techwetrust.com/examples/) section.
Don't forget to star the repo for further updates.