{"id":16162650,"url":"https://github.com/slashtechno/scrape-and-ntfy","last_synced_at":"2026-01-03T00:57:24.418Z","repository":{"id":243162402,"uuid":"811539074","full_name":"slashtechno/scrape-and-ntfy","owner":"slashtechno","description":"An extremely customizable web scraper with a modular notification system and persistent storage via SQLite.","archived":false,"fork":false,"pushed_at":"2024-09-06T22:20:02.000Z","size":182,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-17T00:09:28.243Z","etag":null,"topics":["docker-compose","framework","headless","selenium","selenium-python","web-scraping"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/scrape-and-ntfy/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/slashtechno.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-06T19:47:14.000Z","updated_at":"2024-09-06T22:20:06.000Z","dependencies_parsed_at":"2024-06-14T14:38:11.590Z","dependency_job_id":"c0ea8f0e-d023-40ee-b60f-fe1e0d550c66","html_url":"https://github.com/slashtechno/scrape-and-ntfy","commit_stats":null,"previous_names":["slashtechno/scrape-and-ntfy"],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slashtechno%2Fscrape-and-ntfy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slashtechno%2Fscrape-and-ntfy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slashtech
no%2Fscrape-and-ntfy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/slashtechno%2Fscrape-and-ntfy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/slashtechno","download_url":"https://codeload.github.com/slashtechno/scrape-and-ntfy/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244318560,"owners_count":20433933,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker-compose","framework","headless","selenium","selenium-python","web-scraping"],"created_at":"2024-10-10T02:32:11.312Z","updated_at":"2026-01-03T00:57:24.380Z","avatar_url":"https://github.com/slashtechno.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Scrape and Ntfy  \nAn extremely customizable web scraper with a modular notification system and persistent storage via SQLite.  \n\n![Updates sent to a Slack channel](slack.png)\n\n## Features  \n- Modular notification system  \n    - Currently supports Webhooks (e.g. Discord, Slack, etc.) 
and [ntfy.sh](https://ntfy.sh)  \n- Web scraping via Selenium  \n- Simple configuration of multiple scrapers with conditional notifications  \n\n\n## Usage\n### Prerequisites\n- A browser\n    - Most Chromium-based browsers and Firefox-based browsers should work\n    - Edge is not recommended  \n    - Selenium should also be able to download and cache the appropriate browser if necessary  \n### Basic Configuration  \n- Configuration for the web scraper is handled through a TOML file\n    - For an example configuration, see `config.example.toml`  \n    - This can be copied to `config.toml` and edited to suit your needs\n    - To get the CSS selector for an element, you can use your browser's developer tools (`F12`, `Ctrl+Shift+I`, right-click -\u003e Inspect Element, etc.)  \n        1. If you're not already in inspect element mode, you can press `Ctrl+Shift+C` to enter it (or just click the inspect button in the developer tools)  \n        2. Click on the element you want to select  \n        3. Right-click on the element in the HTML pane\n        4. Click \"Copy\" -\u003e \"Copy selector\"\n- Some other configuration is handled through environment variables and/or command-line arguments (`--help` for more information)  \n    - For example, to set the path to the configuration file, you can set the `PATH_TO_TOML` environment variable or use the `--path-to-toml` command-line argument  \n### Docker (Recommended)  \n#### Specific prerequisites  \n- Docker  \n    - [Docker](https://docs.docker.com/get-docker/) is a platform for developing, shipping, and running applications in containers  \n- Docker Compose  \n#### Installation and usage  \n1. Clone the repository  \n    - `git clone https://github.com/slashtechno/scrape-and-ntfy`\n2. Change directory into the repository\n    - `cd scrape-and-ntfy`\n3. 
Configure via `config.toml`  \n    - Optionally, you can configure some other options via environment variables or command-line arguments in the `docker-compose.yml` file  \n4. `docker compose up -d`\n    - The `-d` flag runs the containers in the background\n    - If you want, you can run [`sqlite-web`](https://github.com/coleifer/sqlite-web) by uncommenting the appropriate lines in `docker-compose.yml` to view the database in a browser on [localhost:5050](http://localhost:5050)  \n\n### `pip`  \n#### Specific prerequisites  \n- Python (3.11+)\n#### Installation and usage  \n1. Install with `pip`  \n    - `pip install scrape-and-ntfy`  \n    - Depending on your system, you may need to use `pip3`, `python3 -m pip`, or `python -m pip` instead of `pip`.  \n2. Configure  \n3. Run `scrape-and-ntfy`  \n    - This assumes `pip`-installed scripts are in your `PATH`  \n\n\n### PDM  \n#### Specific prerequisites\n- Python (3.11+)  \n- [PDM](https://pdm-project.org/en/latest/)\n#### Installation and usage  \n1. Clone the repository  \n    - `git clone https://github.com/slashtechno/scrape-and-ntfy`\n2. Change directory into the repository\n    - `cd scrape-and-ntfy`  \n3. Run `pdm install`  \n    - This will install the dependencies in a virtual environment  \n    - You may need to specify an interpreter with `pdm use`  \n4. Configure  \n5. `pdm run python -m scrape_and_ntfy`  \n    - This will run the bot with the configuration in `config.toml`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fslashtechno%2Fscrape-and-ntfy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fslashtechno%2Fscrape-and-ntfy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fslashtechno%2Fscrape-and-ntfy/lists"}