https://github.com/maxhalford/web-crawler-l3sid
- Host: GitHub
- URL: https://github.com/maxhalford/web-crawler-l3sid
- Owner: MaxHalford
- Created: 2015-03-20T11:32:45.000Z (almost 11 years ago)
- Default Branch: master
- Last Pushed: 2015-05-15T14:18:29.000Z (over 10 years ago)
- Last Synced: 2025-02-08T16:32:20.998Z (11 months ago)
- Language: TeX
- Size: 801 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Perl Web Crawler
This was one of my first university assignments (``tp5.pdf``) in my junior year. The idea was to crawl through a list of links and extract the relevant data into a structured format (i.e. JSON, SQL and HTML). I coded a simple terminal interface so that the user can change how the script runs (sorry, but it's in French!).
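
To make the idea concrete, here is a minimal sketch of the extract-and-export step. It is written in Python, not Perl, and it is not the code from this repository. The choice of fields (page title and outgoing links), the ``pages`` table name and every function name are assumptions made for illustration. Fetching is left out so the sketch runs offline; the page markup is passed in as a string.

```python
import json
from html import escape
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects the <title> text and every href found in <a> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(url, markup):
    """Return one record for a page (hypothetical record layout)."""
    parser = PageParser()
    parser.feed(markup)
    return {"url": url, "title": parser.title.strip(), "links": parser.links}


def to_json(records):
    return json.dumps(records, indent=2, ensure_ascii=False)


def to_sql(records, table="pages"):
    # "pages" is an assumed table name; single quotes are doubled for SQL.
    def quote(value):
        return "'" + value.replace("'", "''") + "'"

    return "\n".join(
        f"INSERT INTO {table} (url, title) "
        f"VALUES ({quote(r['url'])}, {quote(r['title'])});"
        for r in records
    )


def to_html(records):
    rows = "\n".join(
        f"<tr><td>{escape(r['url'])}</td><td>{escape(r['title'])}</td></tr>"
        for r in records
    )
    return f"<table>\n{rows}\n</table>"
```

In a real crawler, ``markup`` would come from an HTTP request for each link in the input list, and the three export functions would be called once on the collected records.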

If the user runs the script with ``CTRL+D`` then the sample output is the following.

A full explanation is available in the PDF file named ``Rapport.pdf``.