{"id":47719618,"url":"https://github.com/testomato/php-minicrawler","last_synced_at":"2026-04-02T19:16:16.755Z","repository":{"id":279420787,"uuid":"838707236","full_name":"testomato/php-minicrawler","owner":"testomato","description":"PHP Extension for Minicrawler","archived":false,"fork":false,"pushed_at":"2025-10-01T11:22:47.000Z","size":142,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-10-01T11:29:38.274Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://testomato.com/bot","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/testomato.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-08-06T07:25:33.000Z","updated_at":"2025-10-01T11:22:51.000Z","dependencies_parsed_at":null,"dependency_job_id":"07153084-e955-45f4-ba4b-0196cd21ecf3","html_url":"https://github.com/testomato/php-minicrawler","commit_stats":null,"previous_names":["testomato/php-minicrawler"],"tags_count":15,"template":false,"template_full_name":null,"purl":"pkg:github/testomato/php-minicrawler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/testomato%2Fphp-minicrawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/testomato%2Fphp-minicrawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/testomato%2Fphp-minicrawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v
1/hosts/GitHub/repositories/testomato%2Fphp-minicrawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/testomato","download_url":"https://codeload.github.com/testomato/php-minicrawler/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/testomato%2Fphp-minicrawler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31314239,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-02T12:59:32.332Z","status":"ssl_error","status_checked_at":"2026-04-02T12:54:48.875Z","response_time":89,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-02T19:16:16.172Z","updated_at":"2026-04-02T19:16:16.738Z","avatar_url":"https://github.com/testomato.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PHP Minicrawler\n\nPHP Minicrawler executes HTTP requests while handling cookies, network connection management and SSL/TLS protocols.\nBy default, it follows redirects and returns the full response, the final URL, parsed cookies and more.\nIt is designed to handle *many* requests in parallel in a *single thread* by opening a socket for each connection.\n\n## Build\n\nTested platforms: Debian Linux, OS X\n\nBuild and install [minicrawler](https://github.com/testomato/minicrawler) 
first.\n\nThen run:\n\n```\nphpize\n./configure\nmake\nsudo make install\n```\n\nAdd the following to `minicrawler.ini` or to your `php.ini` file, then restart PHP:\n\n```\n[minicrawler]\nextension=\"/usr/local/opt/php-minicrawler/modules/minicrawler.so\"\n```\n\n## Building php-minicrawler container\n\n```shell\ndocker buildx bake --file docker-bake.hcl php-minicrawler --push # --no-cache --print\n```\n\nHow to add other tags:\n\n```shell\ndocker buildx bake \\\n  --file docker-bake.hcl \\\n  --set php-minicrawler.tags.COMMIT=\"dr.brzy.cz/testomato/php-minicrawler:$(git rev-parse --short HEAD)\" \\\n  --set php-minicrawler.tags.COMMIT=\"dr.brzy.cz/testomato/php-minicrawler:v5.2.7\" \\\n  --set php-minicrawler.tags.COMMIT=\"dr.brzy.cz/testomato/php-minicrawler:latest\" \\\n  --push\n```\n\n## Try PHP minicrawler\n\nPull the latest image and run it:\n\n```shell\ndocker pull dr.brzy.cz/testomato/php-minicrawler:latest\ndocker run -it --rm dr.brzy.cz/testomato/php-minicrawler:latest /bin/bash\n```\n\nThen you can install minicrawler locally inside the container:\n\n```shell\n# install php 8.4\n# check https://packages.sury.org/php/README.txt for more info\napt-get update\napt-get -y install lsb-release ca-certificates curl\ncurl -sSLo /tmp/debsuryorg-archive-keyring.deb https://packages.sury.org/debsuryorg-archive-keyring.deb\ndpkg -i /tmp/debsuryorg-archive-keyring.deb\nsh -c 'echo \"deb [signed-by=/usr/share/keyrings/deb.sury.org-php.gpg] https://packages.sury.org/php/ $(lsb_release -sc) main\" \u003e /etc/apt/sources.list.d/php.list'\napt-get update\napt install -qy php8.4-cli\n\n# install minicrawler\ncp -r /var/lib/php-minicrawler/usr /\ncp -r /var/lib/php-minicrawler/etc /\nphpenmod minicrawler\n```\n\nThen start an interactive PHP shell:\n\n```shell\nphp -a\n```\n\nand try \n\n## Test \u0026 Development\n\n```shell\ndocker compose build php-minicrawler-dev\ndocker compose run --rm php-minicrawler-dev /bin/bash\n```\n\nInside the container, 
run:\n\n```shell\nphpize\n./configure\nmake INSTALL_ROOT=\"/var/lib/php-minicrawler\"\n\n# install minicrawler.so\ninstall -v modules/*.so $(php-config --extension-dir)\n\n# install minicrawler.ini\ninstall -v /minicrawler.ini /etc/php/8.4/mods-available\n\n# enable minicrawler\nphpenmod minicrawler\n```\n\nOr just run the `./rebuild` script inside the container.\n\nAfter that, run `php -m | grep minicrawler` to check that minicrawler is enabled.\nThen you can run the tests:\n\n```shell\nmake test\n```\n\n## Install minicrawler into your image\n\n```dockerfile\nCOPY --from=dr.brzy.cz/testomato/php-minicrawler:latest /var/lib/php-minicrawler/usr /usr\nCOPY --from=dr.brzy.cz/testomato/php-minicrawler:latest /var/lib/php-minicrawler/etc /etc\nRUN phpenmod minicrawler\n```\n\nThe `phpenmod` command requires the `php-common` package to be installed.\n\n## Links\n\n* https://gitlab.int.wikidi.net/testomato/php-minicrawler\n* https://github.com/testomato/minicrawler\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftestomato%2Fphp-minicrawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftestomato%2Fphp-minicrawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftestomato%2Fphp-minicrawler/lists"}