https://github.com/crawlerclub/crawler
Crawler4U, a general purpose focused crawler
crawler information-extraction spider
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/crawlerclub/crawler
- Owner: crawlerclub
- Created: 2018-06-15T03:11:28.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2021-06-16T07:50:33.000Z (about 5 years ago)
- Last Synced: 2024-06-20T12:06:46.351Z (about 2 years ago)
- Topics: crawler, information-extraction, spider
- Language: Go
- Homepage: https://crawler.club
- Size: 34.2 KB
- Stars: 37
- Watchers: 5
- Forks: 6
- Open Issues: 3
Metadata Files:
- Readme: README.md
README
# Crawler4U: A general purpose focused crawler
## Overview
Crawler4U is a general-purpose focused crawling and scraping tool driven by JSON configuration files.
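To illustrate what "driven by JSON configuration" can look like in Go, here is a minimal sketch that decodes a crawl configuration into a struct and validates it. The `Config` and `Rule` types, their field names (`seeds`, `max_depth`, `rules`), and the `parseConfig` helper are hypothetical and chosen only for illustration; they are not Crawler4U's actual schema, which is described in the usage guide.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// Rule is a hypothetical extraction rule: a field name plus the
// selector used to locate its value on a page.
type Rule struct {
	Name     string `json:"name"`
	Selector string `json:"selector"`
}

// Config is a hypothetical crawl configuration.
// It is NOT Crawler4U's real schema, only an illustration.
type Config struct {
	Seeds    []string `json:"seeds"`     // URLs the crawl starts from
	MaxDepth int      `json:"max_depth"` // how many links deep to follow
	Rules    []Rule   `json:"rules"`     // what to extract from each page
}

// parseConfig decodes a JSON document into a Config, rejecting
// unknown fields and configurations without any seed URL.
func parseConfig(data string) (*Config, error) {
	dec := json.NewDecoder(strings.NewReader(data))
	dec.DisallowUnknownFields()
	var cfg Config
	if err := dec.Decode(&cfg); err != nil {
		return nil, fmt.Errorf("decode config: %w", err)
	}
	if len(cfg.Seeds) == 0 {
		return nil, errors.New("config needs at least one seed URL")
	}
	return &cfg, nil
}

func main() {
	const example = `{
		"seeds": ["https://example.com/"],
		"max_depth": 2,
		"rules": [{"name": "title", "selector": "h1"}]
	}`
	cfg, err := parseConfig(example)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d seed(s), depth %d, %d rule(s)\n",
		len(cfg.Seeds), cfg.MaxDepth, len(cfg.Rules))
}
```

Keeping the crawl description in data rather than code means a new site can be targeted by editing a JSON file instead of recompiling the tool.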
## Install
With the Go toolchain:
```sh
go get crawler.club/crawler
```
Or download the pre-built binaries from [releases](https://github.com/crawlerclub/crawler/releases) for your system.
## Build from source
Building `crawler` from source requires a working Go development environment. Download Go [here](https://golang.org/dl/), then run the following commands.
```sh
go get -d crawler.club/crawler
cd $GOPATH/src/crawler.club/crawler
make
```
## Usage
[Usage guide (Chinese)](usage_cn.md)
## Companies using crawler.club/crawler
* [elensdata](https://www.elensdata.com/)
* [huawei](https://www.huawei.com/)
* [baidu](https://www.baidu.com)
* [bytedance](https://www.bytedance.com/)
* [zenia](https://www.zenia.ai/)