https://github.com/wetrycode/tegenaria
Tegenaria is a crawler framework written in Go
- Host: GitHub
- URL: https://github.com/wetrycode/tegenaria
- Owner: wetrycode
- License: mit
- Created: 2021-12-04T17:15:24.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2023-12-23T10:13:18.000Z (over 2 years ago)
- Last Synced: 2024-06-19T20:51:30.511Z (almost 2 years ago)
- Topics: crawler, crawler-engine, crawler-framework, framework, go, golang, spider, spiders
- Language: Go
- Homepage: https://wetrycode.github.io/tegenaria/
- Size: 1.45 MB
- Stars: 9
- Watchers: 2
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# Tegenaria crawl framework
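tegenaria is a fast, efficient web crawler framework written in Go.
# Features
- Distributed crawling
- Custom distributed components, including distributed deduplication, the request cache queue, and basic statistics metrics
- Custom event monitoring
- Command-line control
- Remote control over gRPC and a web API
- ~~Scheduled polling to start spiders~~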
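
To illustrate what a pluggable deduplication component might look like, here is a minimal sketch. It does not use tegenaria's actual API: the names `Deduper`, `MemoryDeduper`, and `Fingerprint` are hypothetical, and the in-memory map stands in for whatever shared store a distributed deployment would use. It assumes a request can be identified by its method and URL alone.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// Deduper is a hypothetical interface for this sketch only;
// it is not taken from tegenaria.
type Deduper interface {
	// Seen records the fingerprint and reports whether it was already present.
	Seen(fingerprint string) bool
}

// Fingerprint hashes the method and URL into a fixed-size hex key.
// Assumption: method plus URL is enough to identify a request.
func Fingerprint(method, url string) string {
	sum := sha256.Sum256([]byte(method + " " + url))
	return hex.EncodeToString(sum[:])
}

// MemoryDeduper keeps fingerprints in a process-local set.
// A distributed variant would back the same interface with a shared store.
type MemoryDeduper struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

// NewMemoryDeduper returns an empty, ready-to-use deduper.
func NewMemoryDeduper() *MemoryDeduper {
	return &MemoryDeduper{seen: make(map[string]struct{})}
}

// Seen is safe for concurrent use by multiple crawler goroutines.
func (d *MemoryDeduper) Seen(fingerprint string) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if _, ok := d.seen[fingerprint]; ok {
		return true
	}
	d.seen[fingerprint] = struct{}{}
	return false
}

func main() {
	var d Deduper = NewMemoryDeduper()
	urls := []string{
		"https://example.com/a",
		"https://example.com/b",
		"https://example.com/a", // repeat of the first URL
	}
	for _, u := range urls {
		fmt.Println(u, "duplicate:", d.Seen(Fingerprint("GET", u)))
	}
}
```

Keeping the check behind an interface is what would let the local set be swapped for a shared one without touching the crawl loop.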
## Installation
1. Go 1.19 or later is required
```bash
go get -u github.com/wetrycode/tegenaria@latest
```
2. Import it in your project
```go
import "github.com/wetrycode/tegenaria"
```
# Quick start
See the demo in [example](example)
# Documentation
- [Getting started](https://wetrycode.github.io/tegenaria/#/quickstart)
# TODO
- ~~Management web API~~
# Contribution
Feel free to open pull requests and raise issues.
You can also email me directly at vforfreedom96@gmail.com
## License
[MIT](LICENSE) © wetrycode