{"id":41793549,"url":"https://github.com/kkamara/php-scraper","last_synced_at":"2026-04-01T23:55:17.007Z","repository":{"id":106376391,"uuid":"556898915","full_name":"kkamara/php-scraper","owner":"kkamara","description":":office: (Live Link) (2022) Use PHP technologies to crawl and click buttons on websites with GUI. I highly recommend working with Linux (including virtual machines) or MacOs. Laravel 11.","archived":false,"fork":false,"pushed_at":"2025-02-24T20:05:16.000Z","size":17597,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-24T20:36:00.816Z","etag":null,"topics":["bot","crawler","laravel","scraper","spider"],"latest_commit_sha":null,"homepage":"https://github.com/kkamara/php-scraper/actions","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kkamara.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-10-24T18:19:05.000Z","updated_at":"2025-02-24T19:59:36.000Z","dependencies_parsed_at":"2025-01-12T20:18:59.685Z","dependency_job_id":"ed394c34-cf5d-4451-9612-4654771328ca","html_url":"https://github.com/kkamara/php-scraper","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/kkamara/php-scraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kkamara%2Fphp-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kkamara%2Fphp-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kkamara%
2Fphp-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kkamara%2Fphp-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kkamara","download_url":"https://codeload.github.com/kkamara/php-scraper/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kkamara%2Fphp-scraper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28744421,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-25T02:46:29.005Z","status":"ssl_error","status_checked_at":"2026-01-25T02:44:29.968Z","response_time":113,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bot","crawler","laravel","scraper","spider"],"created_at":"2026-01-25T05:08:45.121Z","updated_at":"2026-04-01T23:55:16.992Z","avatar_url":"https://github.com/kkamara.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cimg src=\"https://github.com/kkamara/useful/raw/main/php-scraper.gif\" alt=\"php-scraper.gif\" width=\"\"/\u003e\n\n# PhP Scraper [![API](https://github.com/kkamara/php-scraper/actions/workflows/build.yml/badge.svg)](https://github.com/kkamara/php-scraper/actions/workflows/build.yml)\n\n(2022) Use PHP technologies to crawl and click buttons on websites with GUI. 
I highly recommend working with Linux (including virtual machines) or macOS. Laravel 13.\n\n* [Important note:](#note)\n\n* [Using Postman?](#postman)\n\n* [Requirements](#requirements)\n\n* [Installation](#installation)\n\n* [Usage](#usage)\n\n* [Adding a new command](#adding-commands)\n\n* [Browser Testing](#testing)\n\n* [Misc](#misc)\n\n* [Contributing](#contributing)\n\n* [License](#license)\n\n## Important note: \u003ca name=\"note\"\u003e\u003c/a\u003e\n\nBefore you scrape any website, read its robots.txt file. You can access it via `domainname/robots.txt`. It lists the pages that are allowed and disallowed for scraping. Do not violate the terms of service of any website you scrape.\n\n\u003ca name=\"postman\"\u003e\u003c/a\u003e\n## Using Postman?\n\n[Postman client](https://www.postman.com/).\n\n[Published Postman API Collection](https://documenter.getpostman.com/view/17125932/TzzAKvVe).\n\n## Requirements\n\n* [https://laravel.com/docs](https://laravel.com/docs)\n* [Java](https://www.java.com/en/)\n\n## Installation\n\n```bash\ncp .env.example .env\n# Don't worry if the following step reports errors about the chromedriver binary; we install it right after.\ncomposer install\n```\n\n#### Add chromedriver to Path\n\nMake sure chromedriver is installed and available on your environment Path.\n\n```bash\n# Install chromedriver for the Panther client.\nvendor/bin/bdi detect drivers\nsudo mv drivers/chromedriver /usr/local/bin/chromedriver\n# Or download it manually. Other platform builds:\n# chromedriver_mac64\n# chromedriver_win32\n# See https://chromedriver.storage.googleapis.com\n# for the list of drivers.\nwget https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip\nunzip chromedriver_linux64.zip\nsudo mv chromedriver /usr/local/bin/chromedriver\nchromedriver --version\n```\n\n#### Continue installation\n\n```bash\ncomposer install\nphp artisan key:generate\n# Before running the next command:\n# Update your database details in .env\nphp artisan migrate --seed\nyarn 
install\nyarn build\n```\n\n#### Download Selenium Server jar file\n\n[Download Selenium Server jar file](https://www.selenium.dev/documentation/grid/getting_started/).\n\nRun the following in a new terminal.\n\n```bash\njava -jar selenium-server-4.29.0.jar standalone --override-max-sessions true --max-sessions 10\n```\n\n[CLI options in the Selenium Grid](https://www.selenium.dev/documentation/grid/configuration/cli_options/).\n\n## Usage\n\nUpdate the command at [./app/Console/Commands/BrowserScrape.php](https://raw.githubusercontent.com/kkamara/php-scraper/develop/app/Console/Commands/BrowserScrape.php)\n\n```bash\nphp artisan browser:scrape\n```\n\n[BrowserInvoker.php](https://raw.githubusercontent.com/kkamara/php-scraper/develop/app/Console/Commands/BrowserInvoker.php)\n\n#### Panther Environment Variables\n\n[Panther Environment Variables](https://github.com/symfony/panther?tab=readme-ov-file#environment-variables).\n\n#### Capabilities\n\n[Capabilities](https://www.browserstack.com/docs/automate/capabilities).\n\n[Using Desired Capabilities](https://chromedriver.chromium.org/capabilities#h.p_ID_52).\n\n## Adding a new command \u003ca name=\"adding-commands\"\u003e\u003c/a\u003e\n\n```bash\nphp artisan make:crawler TestCrawler\n```\n\n## Misc\n\n[See Python Selenium web scraper.](https://github.com/kkamara/python-selenium)\n\n[See MRVL Desktop.](https://github.com/kkamara/mrvl-desktop)\n\n[See PHP ReactJS Boilerplate.](https://github.com/kkamara/php-reactjs-boilerplate)\n\n[See PHP Docker Skeleton.](https://github.com/kkamara/php-docker-skeleton)\n\n[See Python Docker Skeleton.](https://github.com/kkamara/python-docker-skeleton)\n\n## Contributing\nPull requests are welcome. 
For major changes, please open an issue first to discuss what you would like to change.\n\nPlease make sure to update tests as appropriate.\n\n## License\n[BSD](https://opensource.org/licenses/BSD-3-Clause)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkkamara%2Fphp-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkkamara%2Fphp-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkkamara%2Fphp-scraper/lists"}