{"id":25350258,"url":"https://github.com/arefshojaei/spider","last_synced_at":"2026-02-13T21:20:38.958Z","repository":{"id":277527250,"uuid":"932303336","full_name":"ArefShojaei/Spider","owner":"ArefShojaei","description":"PHP web spider","archived":false,"fork":false,"pushed_at":"2025-03-26T20:36:05.000Z","size":120,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-26T21:33:35.948Z","etag":null,"topics":["bot","crawler","crawling","php","php-library","php-tools","php8","scraper","scrapping","spider","web","web-bot"],"latest_commit_sha":null,"homepage":"","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ArefShojaei.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-13T17:40:00.000Z","updated_at":"2025-03-26T20:36:09.000Z","dependencies_parsed_at":"2025-03-11T18:37:58.293Z","dependency_job_id":null,"html_url":"https://github.com/ArefShojaei/Spider","commit_stats":null,"previous_names":["arefshojaei/spider"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArefShojaei%2FSpider","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArefShojaei%2FSpider/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArefShojaei%2FSpider/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ArefShojaei%2FSpider/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ArefSh
ojaei","download_url":"https://codeload.github.com/ArefShojaei/Spider/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247934806,"owners_count":21020724,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bot","crawler","crawling","php","php-library","php-tools","php8","scraper","scrapping","spider","web","web-bot"],"created_at":"2025-02-14T17:00:06.367Z","updated_at":"2026-02-13T21:20:33.937Z","avatar_url":"https://github.com/ArefShojaei.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"https://github.com/user-attachments/assets/4d307a59-9eff-4513-a2cc-f16375174244\" width=\"400px\" height=\"400px\" /\u003e\n\u003c/div\u003e\n\n\u003ch1 align='center'\u003e\n    PHP web spider\n\u003c/h1\u003e\n\n```php\n\u003c?php\n\nuse Spider\\Spider;\n\n$spider = new Spider;\n\n$page = $spider-\u003eloadHTML(\"http://google.com\");\n\necho $page-\u003efind(\"title\")-\u003etext() . PHP_EOL;\n\n$page-\u003efindAll(\"a\")-\u003eeach(function($key, $link) {\n    echo \"[LINK] \" . $link-\u003eattr(\"href\") . 
PHP_EOL;\n});\n```\n\u003cbr/\u003e\n\n## **Installation**\n\n#### Using Composer\n```bash\ncomposer create-project arefshojaei/spider\n```\n\n#### Using Git\n```bash\ngit clone https://github.com/ArefShojaei/Spider\n```\n\n\n\n\u003e Find element\n* find()\n* findAll()\n\n```php\n$page-\u003efind(\"a\");\n\n$page-\u003efindAll(\".product\");\n```\n\n\u003e Iterate over each element\n* each()\n* map()\n* filter()\n\n```php\n$page-\u003efindAll(\"a\")-\u003eeach(function($key, $anchor) {\n    echo \"[LINK] \" . $anchor-\u003eattr(\"href\") . PHP_EOL;\n    echo \"[TITLE] \" . $anchor-\u003etext() . PHP_EOL;\n    echo \"[HTML] \" . $anchor-\u003ehtml() . PHP_EOL;\n});\n\n# ----------------------------------------\n$anchors = $page-\u003efindAll(\"a\")-\u003emap(function($key, $anchor) {\n    $anchor-\u003eattr(\"data-id\", rand());\n\n    return $anchor;\n});\n\nvar_dump($anchors);\n\n# ----------------------------------------\n$filteredAnchors = $page-\u003efindAll(\"a\")-\u003efilter(fn($key, $anchor) =\u003e $anchor-\u003eattr(\"data-id\"));\n\nvar_dump($filteredAnchors);\n```\n\n\n\u003e Element traversing\n* parent()\n* after()\n* before()\n* append()\n* prepend()\n\n```php\n$parentNode = $page-\u003efind(\".product\")-\u003eparent();\n\n# Add sibling elements (outside the element)\n$page-\u003efind(\".product\")-\u003eafter(\"\u003cp\u003eAfter Element\u003c/p\u003e\");\n$page-\u003efind(\".product\")-\u003ebefore(\"\u003cp\u003eBefore Element\u003c/p\u003e\");\n\n# Add child (local) element\n$page-\u003efind(\".product\")-\u003eappend(\"\u003cp\u003eAppend Element\u003c/p\u003e\");\n$page-\u003efind(\".product\")-\u003eprepend(\"\u003cp\u003ePrepend Element\u003c/p\u003e\");\n```\n\n\u003e Element cleaner\n* empty()\n* remove()\n\n```php\n# Clean element content\n$page-\u003efind(\"p\")-\u003eempty();\n\n# Remove element from the DOM\n$page-\u003efind(\"p\")-\u003eremove();\n```\n\n\u003e Element content\n* text()\n* html()\n\n```php\n# Getter\n$text = 
$page-\u003efind(\"p\")-\u003etext();\n$html = $page-\u003efind(\"p\")-\u003ehtml();\n\n# Setter\n$newText = $page-\u003efind(\"p\")-\u003etext(\"New text content\");\n$newHtml = $page-\u003efind(\"p\")-\u003ehtml(\"\u003cp id='spider'\u003eNew html content\u003c/p\u003e\");\n```\n\n\u003e Element attribute\n* attr()\n* addClass()\n* removeClass()\n* hasClass()\n* addId()\n* removeId()\n* hasId()\n\n```php\n# Getter\n$attributes = $page-\u003efind(\"a\")-\u003eattr();\n\n$link = $page-\u003efind(\"a\")-\u003eattr(\"href\");\n\n# Setter\n$page-\u003efind(\"a\")-\u003eattr(\"data-id\", rand());\n\n# Class\n$page-\u003efind(\"p\")-\u003eaddClass(\"spider\");\n$page-\u003efind(\"p\")-\u003eremoveClass(\"spider\");\n$page-\u003efind(\"p\")-\u003ehasClass(\"spider\");\n\n# ID\n$page-\u003efind(\"p\")-\u003eaddID(\"spider\");\n$page-\u003efind(\"p\")-\u003eremoveID(\"spider\");\n$page-\u003efind(\"p\")-\u003ehasID(\"spider\");\n```\n\n\n\u003e Export current page content\n```php\n$filename = \"app\";\n\n$path = __DIR__ . \"\\\\html\\\\\" . $filename . rand() . \".html\";\n\n$page-\u003eexport($path);\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farefshojaei%2Fspider","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farefshojaei%2Fspider","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farefshojaei%2Fspider/lists"}