{"id":20916317,"url":"https://github.com/marcel0024/fundascraper","last_synced_at":"2026-04-13T21:31:53.343Z","repository":{"id":245858437,"uuid":"817015823","full_name":"Marcel0024/FundaScraper","owner":"Marcel0024","description":"Docker image for monitoring housing listings from Funda and getting notified by webhooks.","archived":false,"fork":false,"pushed_at":"2024-07-09T18:59:03.000Z","size":68,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-12-30T02:46:09.355Z","etag":null,"topics":["docker","docker-image","funda","housing-prices","scraper","scraping","self-hosted","webhooks","webscraper","webscraping"],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Marcel0024.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-18T21:07:29.000Z","updated_at":"2024-07-12T11:28:39.000Z","dependencies_parsed_at":"2024-06-29T08:37:57.436Z","dependency_job_id":"3be54d3c-f075-45fd-a1d7-5874e5973dd9","html_url":"https://github.com/Marcel0024/FundaScraper","commit_stats":null,"previous_names":["marcel0024/fundascraper"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Marcel0024/FundaScraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Marcel0024%2FFundaScraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Marcel0024%2FFundaScraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Marcel0024%2F
FundaScraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Marcel0024%2FFundaScraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Marcel0024","download_url":"https://codeload.github.com/Marcel0024/FundaScraper/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Marcel0024%2FFundaScraper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31771813,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-13T20:17:16.280Z","status":"ssl_error","status_checked_at":"2026-04-13T20:17:08.216Z","response_time":93,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","docker-image","funda","housing-prices","scraper","scraping","self-hosted","webhooks","webscraper","webscraping"],"created_at":"2024-11-18T16:21:52.450Z","updated_at":"2026-04-13T21:31:53.310Z","avatar_url":"https://github.com/Marcel0024.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# FundaScraper - Automate Listings with Docker and Webhooks\n\n[![Build and Publish](https://github.com/Marcel0024/FundaScraper/actions/workflows/build-and-publish-image.yaml/badge.svg?branch=main)](https://github.com/Marcel0024/FundaScraper/actions/workflows/build-and-publish-image.yaml)\n![Static 
Badge](https://img.shields.io/badge/ghcr.io%2Fmarcel0024%2Ffunda--scraper-1.3.0-purple?logo=github\u0026logoSize=auto\u0026labelColor=262b30\u0026link=https%3A%2F%2Fgithub.com%2FMarcel0024%2FFundaScraper)\n\n\u003cbr /\u003e\n\nThe `marcel0024/funda-scraper` Docker image provides an easy way to scrape Funda, the Dutch housing website.\nYou provide the URL to scrape, with the search criteria already filled in, and the image does the rest.\nYou can either receive webhook notifications about new listings (this works best with something like `HomeAssistant`) or review `results.csv`.\nScraping times are set by a CRON expression, so you can run it once a day, twice a day, etc.\n\nWhat makes this scraper unique is that it imitates a real user browsing the website.\nIt opens a tab inside the browser, loads the page, waits for it to finish loading, and then scrapes it. Furthermore, you can override all selectors to keep it working after future changes to the website.\nThat way you don't have to wait for the image to be updated. Note that the browser windows are all opened inside the container, so you won't physically see the browser.\n\nPlease note:\n\n1. Scraping this website is ONLY allowed for personal use (as per Funda's Terms and Conditions).\n2. Any commercial use of this package is prohibited. 
The author holds no liability for any misuse of the package.\n\n\n## Docker examples\n\nNote `--tty` and `--cap-add=SYS_ADMIN` are required.\n\n### Docker run\n\n```bash\ndocker run --tty \\\n    -v /data/fundascraper:/data \\\n    -e FUNDA_URL=\"https://www.funda.nl/zoeken/koop?selected_area=%5B%22amsterdam%22%5D\u0026object_type=%5B%22house%22%5D\u0026price=%22-450000%22\" \\\n    -e WEBHOOK_URL=\"http://homeassistantlocal.ip/api/webhook/123-redacted-key\" \\\n    ghcr.io/marcel0024/funda-scraper:latest\n```\n\n### Docker Compose\n\n```yaml\nservices:\n  funda-scraper:\n    image: ghcr.io/marcel0024/funda-scraper:latest\n    container_name: funda-scraper\n    tty: true\n    environment:\n      - FUNDA_URL=https://www.funda.nl/zoeken/koop?selected_area=%5B%22amsterdam%22%5D\u0026object_type=%5B%22house%22%5D\u0026price=%22-450000%22\n      - WEBHOOK_URL=http://homeassistantlocal.ip/api/webhook/123-redacted-key\n      - CRON=0 7,19 * * * # Everyday at 7am and 7pm\n    volumes:\n      - /data/fundascraper:/data\n```\n\n## Environment Variables\n\n| Variable                   | Required         | Default     | Description                                                                                                                                                                                                                                                         |\n| -------------------------- | ---------------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| `CRON`                     | No (has default) | `0 7 * * *` | Every day at 7AM in the morning.                                                                                                                                                                                            
                                        |\n| `FUNDA_URL`                | Yes              | -           | The starting URL to scrape. You can build the parameters in the browser and just copy the link. Pricing, area, location, etc. are all embedded in the URL, so make sure you filter it on the website before you copy it.                                            |\n| `WEBHOOK_URL`              | No               | -           | The webhook URL to send the new listings to. Note: on the first run of the app for a new area you will get spammed, since everything is considered new.                                                                                                             |\n| `ERROR_WEBHOOK_URL`        | No               | -           | The webhook URL to send errors to when parsing fails and stops the app.                                                                                                                                                                                             |\n| `START_PAGE`               | No               | 1           | The page to start with (pagination)                                                                                                                                                                                                                                 |\n| `TOTAL_PAGES`              | No               | 10          | Total pages to scrape. Increase this if you're querying a big area.                                                                                                                                                                                                 |\n| `RUN_ON_STARTUP`           | No               | false       | Run the crawl on startup. If `false`, the next run depends on the `CRON` value.                                                                                                                                                                                   
  |\n| `TOTAL_PARALLELISM_DEGREE` | No               | 35          | Total tabs inside the browser that can be open at the same time. It's a balance with hardware specs, site limitations against scraping and how fast you want the scraping to be done. These tabs run inside the container, so you won't physically see the browser. |\n\n### Selector variables\n\n| Variable                      | Default                          | Description                                                                                                                                    |\n| ----------------------------- | -------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |\n| `LISTING_CONTAINERS_SELECTOR` | See `FundaScraper/defaults.json` | The selector for the containers holding a listing - should return a list of objects. The rest of the selectors are from inside this container. 
|\n| `TITLE_SELECTOR`              | See `FundaScraper/defaults.json` | The selector for the address                                                                                                                   |\n| `ZIP_CODE_SELECTOR`           | See `FundaScraper/defaults.json` | The selector for the zipcode                                                                                                                   |\n| `URL_SELECTOR`                | See `FundaScraper/defaults.json` | The selector for the URL                                                                                                                       |\n| `PRICE_SELECTOR`              | See `FundaScraper/defaults.json` | The selector for the price                                                                                                                     |\n| `AREA_SELECTOR`               | See `FundaScraper/defaults.json` | The selector for the area                                                                                                                      |\n| `TOTAL_ROOMS_SELECTOR`        | See `FundaScraper/defaults.json` | The selector for total rooms                                                                                                                   |\n\n\n# Webhook object\nA post is done to the WEBHOOK_URL for each listing with the following JSON object:\n\n```json\n {\n  \"name\": \"Lorem Ipsum\",\n  \"price\": \"€ 12334\",\n  \"zipCode\": \"1234\",\n  \"area\": \"100 m²\",\n  \"totalRooms\": \"4\",\n  \"url\": \"https://funda.nl/koop/#example-link\"\n }\n```\n\n## HomeAssistant webhook endpoint example\n\n```yaml\nalias: \"Funda Alerts\"\ntrigger:\n  - platform: webhook\n    allowed_methods:\n      - POST\n    local_only: true\n    webhook_id: \"123-redacted-key\" # Replace with your own\naction:\n  - service: notify.mobile_app_android # Replace with your own\n    data:\n      title: Funda Alert\n      message: \"{{ trigger.json.title }} {{ 
trigger.json.zipCode }} is for sale for {{ trigger.json.price }}\"\n      data:\n        clickAction: \"{{ trigger.json.url }}\"\nmode: single\n```\n\n\n## Troubleshooting/Common issues\n\n### UnauthorizedAccessException: Access to the path '/data/results.csv' is denied\n\nThe app inside the container runs as a non-root user: the predefined `app` user, which has UID 64198.\n\nFor the app to create files in the mounted directory, UID 64198 needs write permission on the host's `/data/fundascraper` directory (the one defined in the volume).\n\nYou can do that by giving public write access on the host using `chmod o+w /data/fundascraper`.\n\nIf that's too permissive, you can create a user on the host with UID 64198 and give that user group access to the directory.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarcel0024%2Ffundascraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarcel0024%2Ffundascraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarcel0024%2Ffundascraper/lists"}