https://github.com/jmyrberg/finscraper
Web scraping API for Finnish websites
- Host: GitHub
- URL: https://github.com/jmyrberg/finscraper
- Owner: jmyrberg
- License: MIT
- Created: 2020-05-02T13:47:59.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2024-05-14T22:17:17.000Z (about 2 years ago)
- Last Synced: 2025-12-15T05:30:14.059Z (6 months ago)
- Topics: finnish, nlp, scraping, scrapy, web
- Language: Python
- Homepage: https://finscraper.readthedocs.io
- Size: 357 KB
- Stars: 12
- Watchers: 1
- Forks: 1
- Open Issues: 7
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# finscraper
[Spiders workflow status](https://github.com/jmyrberg/finscraper/actions/workflows/spiders.yml)
[Documentation status](https://finscraper.readthedocs.io/en/latest/?badge=latest)

The library provides an easy-to-use API for fetching data from various Finnish websites:
| Website | Type | Spider API class |
|----------------------------------------------------------------|-------------------|--------------------|
| [Ilta-Sanomat](https://www.is.fi) | News article | `ISArticle` |
| [Iltalehti](https://www.il.fi) | News article | `ILArticle` |
| [YLE Uutiset](https://www.yle.fi/uutiset) | News article | `YLEArticle` |
| [Suomi24](https://keskustelu.suomi24.fi) | Discussion thread | `Suomi24Page` |
| [Muusikoiden.net](https://www.muusikoiden.net) | Discussion thread | `MNetPage` |
| [Vauva](https://www.vauva.fi) | Discussion thread | `VauvaPage` |
| [Oikotie Asunnot](https://asunnot.oikotie.fi/myytavat-asunnot) | Apartment ad | `OikotieApartment` |
| [Tori](https://www.tori.fi) | Item deal | `ToriDeal` |
Documentation is available at [https://finscraper.readthedocs.io](https://finscraper.readthedocs.io), and a simple online demo is available [here](https://jmyrberg.com/demo-projects/finscraper).
## Installation
`pip install finscraper`
## Quickstart
Fetch 10 news articles as a pandas DataFrame from [Ilta-Sanomat](https://is.fi):
```python
from finscraper.spiders import ISArticle

# Scrape 10 articles, then collect the results as a pandas DataFrame
spider = ISArticle().scrape(10)
articles = spider.get()
```
The API is similar for all the spiders.
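As a rough sketch of that shared pattern, the Quickstart calls can be wrapped in a small helper that works with any spider class from the table above. The helper name `fetch_items` and the `main` function are hypothetical and not part of finscraper; the sketch also assumes that the other spider classes, like `ISArticle`, can be constructed without arguments and expose the same `scrape()` and `get()` methods.

```python
def fetch_items(spider_cls, n_items=10):
    """Scrape ``n_items`` items using the scrape()/get() pattern from the Quickstart.

    ``fetch_items`` is a hypothetical helper, not part of finscraper.
    Assumption: ``spider_cls`` can be constructed without arguments.
    """
    spider = spider_cls().scrape(n_items)
    return spider.get()


def main():
    # Imported inside the function so the helper above stays importable
    # even where finscraper is not installed.
    from finscraper.spiders import ILArticle, YLEArticle

    # Assumption: these classes follow the same API as ISArticle.
    for spider_cls in (ILArticle, YLEArticle):
        articles = fetch_items(spider_cls, n_items=5)
        print(spider_cls.__name__, len(articles))
```

Calling `main()` needs network access and an installed `finscraper`. Some spiders may require extra setup, so check the documentation for the specific spider class before relying on this pattern.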

## Contributing
Please see [CONTRIBUTING.md](https://github.com/jmyrberg/finscraper/blob/master/CONTRIBUTING.md) for more information.
---
Jesse Myrberg (jesse.myrberg@gmail.com)