https://github.com/joonarafael/tiirascraper
Scraper to fetch, parse, and filter bird observations. Automated Telegram messaging functionality.
beautifulsoup4 birdwatching bot python requests scraper server telegram telegram-bot webscraping
- Host: GitHub
- URL: https://github.com/joonarafael/tiirascraper
- Owner: joonarafael
- License: mit
- Created: 2024-03-17T17:11:42.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-06-13T12:17:16.000Z (about 2 years ago)
- Last Synced: 2024-06-13T14:55:03.561Z (about 2 years ago)
- Topics: beautifulsoup4, birdwatching, bot, python, requests, scraper, server, telegram, telegram-bot, webscraping
- Language: Python
- Homepage: https://www.tiira.fi/
- Size: 75.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# TIIRASCRAPER (PYTHON WEB SCRAPER)
Read the [Installation Manual](https://github.com/joonarafael/tiirascraper/blob/main/docs/installation_manual.md "Installation Manual") and the [User Manual](https://github.com/joonarafael/tiirascraper/blob/main/docs/user_manual.md "User Manual") before going further. The user manual explains how to install dependencies, configure the filter files, and set the environment variables.
## About
This is a simple web scraper built with _Python_. Automated tests run as part of the CI pipeline, and the coverage report is uploaded to [Codecov](https://app.codecov.io/gh/joonarafael/tiirascraper/tree/main/ "Codecov Coverage Report").
It fetches the _index_ page of [Tiira](https://www.tiira.fi/ "Tiira.fi"), a popular bird observation site, and parses the latest, most interesting bird observation records.
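The parsing step could look roughly like the sketch below. It is not the project's actual code: the project depends on `requests` and `beautifulsoup4`, while this sketch uses the standard library's `html.parser` so that it runs without extra packages. The table markup, the `(species, city, date)` column order, and the names `RowParser` and `parse_records` are all assumptions made for illustration; the real Tiira page may be structured differently.

```python
from html.parser import HTMLParser


class RowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def parse_records(html):
    """Turn hypothetical (species, city, date) table rows into dicts."""
    parser = RowParser()
    parser.feed(html)
    parser.close()
    keys = ("species", "city", "date")
    return [dict(zip(keys, row)) for row in parser.rows if len(row) == len(keys)]


# Made-up sample markup; the real index page is fetched over HTTP.
SAMPLE_HTML = """
<table>
  <tr><th>Species</th><th>City</th><th>Date</th></tr>
  <tr><td>Merikotka</td><td>Helsinki</td><td>13.6.2024</td></tr>
  <tr><td>Kuukkeli</td><td>Kuusamo</td><td>12.6.2024</td></tr>
</table>
"""

records = parse_records(SAMPLE_HTML)
```

Returning plain dictionaries keeps the later filtering and messaging steps independent of the HTML structure.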
The program also lets you **create filters for individual cities and species**, so that any record that does not match the criteria is disregarded.
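As a rough illustration of such filtering, the sketch below keeps a record only when both its city and its species pass their filter. The record shape, the function names, and the rules that an empty filter matches everything and that matching is case-insensitive are assumptions; the user manual describes how the real filter files behave.

```python
def matches(record, cities=(), species=()):
    """Return True when the record passes both filters.

    Assumption: an empty filter places no restriction on that field.
    """
    wanted_cities = {c.lower() for c in cities}
    wanted_species = {s.lower() for s in species}
    city_ok = not wanted_cities or record["city"].lower() in wanted_cities
    species_ok = not wanted_species or record["species"].lower() in wanted_species
    return city_ok and species_ok


def filter_records(records, cities=(), species=()):
    """Drop every record that does not match the given criteria."""
    return [r for r in records if matches(r, cities, species)]


# Hypothetical records, shaped as a parser might return them.
observations = [
    {"species": "Merikotka", "city": "Helsinki"},
    {"species": "Kuukkeli", "city": "Kuusamo"},
    {"species": "Merikotka", "city": "Oulu"},
]

helsinki_only = filter_records(observations, cities=["helsinki"])
eagles_in_oulu = filter_records(observations, cities=["Oulu"], species=["Merikotka"])
```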
In addition, the program supports automated _Telegram_ messaging. By configuring your own Telegram bot and setting the relevant environment variables, **you can get the latest records straight to your Telegram**!
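For orientation, a message can be sent through the Telegram Bot API's `sendMessage` method, as in the minimal sketch below. The environment variable names `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` and both function names are placeholders chosen for this example, not necessarily the ones the project reads, and the sketch uses `urllib` from the standard library instead of `requests`.

```python
import os
import urllib.parse
import urllib.request

# Bot API endpoint; the token is part of the URL path.
API_URL = "https://api.telegram.org/bot{token}/sendMessage"


def build_request(token, chat_id, text):
    """Build the POST request for sendMessage without sending it."""
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(API_URL.format(token=token), data=data, method="POST")


def send_message(text):
    """Send `text` to the chat configured through environment variables.

    The variable names below are assumptions made for this sketch.
    """
    token = os.environ["TELEGRAM_BOT_TOKEN"]
    chat_id = os.environ["TELEGRAM_CHAT_ID"]
    request = build_request(token, chat_id, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 200
```

Keeping the token and chat ID in environment variables means the secrets never need to be committed to the repository.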
The program is built to run "as a server": it automatically repeats the procedure described above every 5 minutes. On each run it rereads the configuration files and the history, then checks the site for changes. It also recovers from errors in earlier runs, such as unsuccessful connection attempts.
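One way to get this behaviour is a polling loop that catches the exceptions of a single cycle, so a failed connection only costs one round. The sketch below is an assumption about the general shape, not the project's implementation; `run_forever`, `cycle`, and the `max_cycles` and `sleep` parameters are hypothetical names, the last two added so the loop can be tried out without waiting.

```python
import time

POLL_INTERVAL_SECONDS = 5 * 60  # the 5-minute interval described above


def run_forever(cycle, interval=POLL_INTERVAL_SECONDS, sleep=time.sleep, max_cycles=None):
    """Call `cycle` repeatedly, waiting `interval` seconds after each call.

    An exception in one cycle is logged and counted, then the loop continues.
    `max_cycles=None` means the loop never ends; a number stops it early.
    Returns the number of failed cycles.
    """
    completed = 0
    failures = 0
    while max_cycles is None or completed < max_cycles:
        try:
            cycle()  # e.g. reread config and history, fetch, filter, notify
        except Exception as error:
            failures += 1
            print(f"cycle failed, trying again after the next interval: {error}")
        completed += 1
        sleep(interval)
    return failures


# Demonstration with a cycle that fails once and a fake sleep that only records.
attempts = []
waits = []


def flaky_cycle():
    attempts.append(len(attempts) + 1)
    if len(attempts) == 1:
        raise ConnectionError("site unreachable")


failed = run_forever(flaky_cycle, sleep=waits.append, max_cycles=3)
```

Calling `run_forever(cycle)` with the defaults would keep polling every 5 minutes until the process is stopped.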