https://github.com/intina47/ee_error
Implementation of a web crawler using C++
- Host: GitHub
- URL: https://github.com/intina47/ee_error
- Owner: Intina47
- Created: 2023-11-26T02:24:14.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-12-13T18:51:37.000Z (about 1 year ago)
- Last Synced: 2024-12-06T08:06:09.629Z (about 2 months ago)
- Topics: cpp, crawler, curl, gumbo, libcurl, stanford-nlp, web
- Language: C++
- Homepage:
- Size: 6.93 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Web Crawler
This is a C++ program that implements a web crawler. It uses the libcurl library for making HTTP requests and the Gumbo HTML parsing library for extracting information from HTML documents.
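The repository's source is not reproduced here, so the following is only a minimal sketch of how a crawl loop might tie a page fetcher and a link extractor together. It assumes a breadth-first traversal with a page limit, which the project may or may not use. The names `Fetcher`, `LinkExtractor`, and `crawl` are hypothetical and not taken from the project. The two steps are passed in as callbacks so the sketch compiles without libcurl or Gumbo; in a real build, the fetcher could wrap a libcurl easy handle and the extractor could walk a Gumbo parse tree looking for `href` attributes.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical callback types (illustrative names, not from the repository).
// A real Fetcher could wrap libcurl; a real LinkExtractor could use Gumbo.
using Fetcher = std::function<std::string(const std::string& url)>;
using LinkExtractor =
    std::function<std::vector<std::string>(const std::string& html)>;

// Breadth-first crawl starting at `seed`, fetching at most `max_pages`
// distinct URLs. Returns the URLs in the order they were fetched.
std::vector<std::string> crawl(const std::string& seed,
                               const Fetcher& fetch,
                               const LinkExtractor& extract_links,
                               std::size_t max_pages) {
    // Both callbacks must be set; an empty std::function would throw on call.
    assert(fetch && extract_links);

    std::vector<std::string> visited_order;
    std::unordered_set<std::string> seen{seed};  // URLs already queued
    std::queue<std::string> frontier;            // URLs waiting to be fetched
    frontier.push(seed);

    while (!frontier.empty() && visited_order.size() < max_pages) {
        const std::string url = frontier.front();
        frontier.pop();
        visited_order.push_back(url);

        const std::string html = fetch(url);
        for (const std::string& link : extract_links(html)) {
            // insert().second is true only for URLs not seen before,
            // so each URL is queued at most once.
            if (seen.insert(link).second) {
                frontier.push(link);
            }
        }
    }
    return visited_order;
}
```

Keeping the fetch and parse steps behind callbacks also makes the loop easy to exercise with in-memory stand-ins, without touching the network.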
## Installation
1. Clone the repository:
```
git clone https://github.com/Intina47/EE_error.git
```
2. Install the required dependencies. Make sure you have libcurl and Gumbo installed on your system.
3. Build the project using a C++ compiler:
```
make
```
4. Run the executable:
```
./crawler
```