https://github.com/maxhalford/web-crawler-l3sid

Host: GitHub
URL: https://github.com/maxhalford/web-crawler-l3sid
Owner: MaxHalford
Created: 2015-03-20T11:32:45.000Z (almost 11 years ago)
Default Branch: master
Last Pushed: 2015-05-15T14:18:29.000Z (over 10 years ago)
Last Synced: 2025-02-08T16:32:20.998Z (11 months ago)
Language: TeX
Size: 801 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Perl Web Crawler

This was one of my first university assignments (``tp5.pdf``) in my junior year. The idea was to crawl through a list of links and extract the relevant data in an elegant format (i.e. JSON, SQL and HTML). I coded a simple terminal interface so that the user could modify how the script ran (sorry, but it's in French!).

![Terminal](example1.png)

If the user runs the script with ``CTRL+D``, the sample output looks like this.

![Sample](example2.png)

A full explanation is available in the PDF file named ``Rapport.pdf``.
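
The crawl-and-export idea described at the top of this README can be sketched roughly as follows. This is not the Perl code from the repository: it is a minimal Python illustration using only the standard library. It assumes that the "relevant data" is a page's title and its outgoing links, which the README does not specify. The names ``extract_record``, ``to_json``, ``to_sql`` and the ``pages`` table are hypothetical. The HTML is passed in as a string so the example runs without network access.

```python
import json
from html.parser import HTMLParser


class TitleAndLinkParser(HTMLParser):
    """Collects the <title> text and every href found on <a> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors without a target
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def extract_record(url, html):
    """Assumed notion of 'relevant data': the page title and its links."""
    parser = TitleAndLinkParser()
    parser.feed(html)
    parser.close()
    return {"url": url, "title": parser.title.strip(), "links": parser.links}


def to_json(records):
    """Serialize the extracted records as pretty-printed JSON."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_sql(records, table="pages"):
    """Render one INSERT per record; single quotes are doubled.

    For illustration only: a real database client should use
    parameterized queries instead of building SQL strings.
    """
    def quote(value):
        return "'" + value.replace("'", "''") + "'"

    return "\n".join(
        f"INSERT INTO {table} (url, title) "
        f"VALUES ({quote(r['url'])}, {quote(r['title'])});"
        for r in records
    )


if __name__ == "__main__":
    sample = "<html><head><title>Demo</title></head><body><a href='/a'>A</a></body></html>"
    records = [extract_record("http://example.com", sample)]
    print(to_json(records))
    print(to_sql(records))
```

A real crawler would fetch each URL from the input list before calling ``extract_record``. An HTML export could be built from the same list of records.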
