{"id":13464490,"url":"https://github.com/FriendsOfPHP/Goutte","last_synced_at":"2025-03-25T11:31:38.616Z","repository":{"id":878537,"uuid":"622166","full_name":"FriendsOfPHP/Goutte","owner":"FriendsOfPHP","description":"Goutte, a simple PHP Web Scraper","archived":true,"fork":false,"pushed_at":"2023-04-01T09:06:44.000Z","size":2977,"stargazers_count":9252,"open_issues_count":138,"forks_count":1005,"subscribers_count":346,"default_branch":"master","last_synced_at":"2025-03-21T00:04:38.798Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FriendsOfPHP.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2010-04-21T19:21:54.000Z","updated_at":"2025-03-15T20:57:51.000Z","dependencies_parsed_at":"2023-07-05T15:16:47.331Z","dependency_job_id":null,"html_url":"https://github.com/FriendsOfPHP/Goutte","commit_stats":{"total_commits":216,"total_committers":77,"mean_commits":"2.8051948051948052","dds":0.662037037037037,"last_synced_commit":"1e6989df37b3a4a74b29c8369db3b42d8e9a1c6b"},"previous_names":["fabpot/goutte"],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FriendsOfPHP%2FGoutte","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FriendsOfPHP%2FGoutte/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FriendsOfPHP%2FGoutte/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FriendsOfPHP%2FGoutte/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FriendsOfPHP","download_url":"https://codeload.github.com/FriendsOfPHP/Goutte/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245454010,"owners_count":20617961,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T14:00:44.579Z","updated_at":"2025-03-25T11:31:38.297Z","avatar_url":"https://github.com/FriendsOfPHP.png","language":"PHP","funding_links":[],"categories":["PHP","All","File abstraction","爬虫 Scraping","Table of Contents","目录","Core Libraries","HTTP"],"sub_categories":["Laravel  docs","Scraping","爬虫 Scraping","PHP","网络请求"],"readme":"Goutte, a simple PHP Web Scraper\n================================\n\nGoutte is a screen scraping and web crawling library for PHP.\n\nGoutte provides a nice API to crawl websites and extract data from the HTML/XML\nresponses.\n\n**WARNING**: This library is deprecated. As of v4, Goutte became a simple proxy\nto the `HttpBrowser class\n\u003chttps://symfony.com/doc/current/components/browser_kit.html#making-external-http-requests\u003e`_\nfrom the `Symfony BrowserKit \u003chttps://symfony.com/browser-kit\u003e`_ component. To\nmigrate, replace ``Goutte\\Client`` by\n``Symfony\\Component\\BrowserKit\\HttpBrowser`` in your code.\n\nRequirements\n------------\n\nGoutte depends on PHP 7.1+.\n\nInstallation\n------------\n\nAdd ``fabpot/goutte`` as a require dependency in your ``composer.json`` file:\n\n.. code-block:: bash\n\n    composer require fabpot/goutte\n\nUsage\n-----\n\nCreate a Goutte Client instance (which extends\n``Symfony\\Component\\BrowserKit\\HttpBrowser``):\n\n.. code-block:: php\n\n    use Goutte\\Client;\n\n    $client = new Client();\n\nMake requests with the ``request()`` method:\n\n.. code-block:: php\n\n    // Go to the symfony.com website\n    $crawler = $client-\u003erequest('GET', 'https://www.symfony.com/blog/');\n\nThe method returns a ``Crawler`` object\n(``Symfony\\Component\\DomCrawler\\Crawler``).\n\nTo use your own HTTP settings, you may create and pass an HttpClient\ninstance to Goutte. For example, to add a 60 second request timeout:\n\n.. code-block:: php\n\n    use Goutte\\Client;\n    use Symfony\\Component\\HttpClient\\HttpClient;\n\n    $client = new Client(HttpClient::create(['timeout' =\u003e 60]));\n\nClick on links:\n\n.. code-block:: php\n\n    // Click on the \"Security Advisories\" link\n    $link = $crawler-\u003eselectLink('Security Advisories')-\u003elink();\n    $crawler = $client-\u003eclick($link);\n\nExtract data:\n\n.. code-block:: php\n\n    // Get the latest post in this category and display the titles\n    $crawler-\u003efilter('h2 \u003e a')-\u003eeach(function ($node) {\n        print $node-\u003etext().\"\\n\";\n    });\n\nSubmit forms:\n\n.. code-block:: php\n\n    $crawler = $client-\u003erequest('GET', 'https://github.com/');\n    $crawler = $client-\u003eclick($crawler-\u003eselectLink('Sign in')-\u003elink());\n    $form = $crawler-\u003eselectButton('Sign in')-\u003eform();\n    $crawler = $client-\u003esubmit($form, ['login' =\u003e 'fabpot', 'password' =\u003e 'xxxxxx']);\n    $crawler-\u003efilter('.flash-error')-\u003eeach(function ($node) {\n        print $node-\u003etext().\"\\n\";\n    });\n\nMore Information\n----------------\n\nRead the documentation of the `BrowserKit`_, `DomCrawler`_, and `HttpClient`_\nSymfony Components for more information about what you can do with Goutte.\n\nPronunciation\n-------------\n\nGoutte is pronounced ``goot`` i.e. it rhymes with ``boot`` and not ``out``.\n\nTechnical Information\n---------------------\n\nGoutte is a thin wrapper around the following Symfony Components:\n`BrowserKit`_, `CssSelector`_, `DomCrawler`_, and `HttpClient`_.\n\nLicense\n-------\n\nGoutte is licensed under the MIT license.\n\n.. _`Composer`: https://getcomposer.org\n.. _`BrowserKit`: https://symfony.com/components/BrowserKit\n.. _`DomCrawler`: https://symfony.com/doc/current/components/dom_crawler.html\n.. _`CssSelector`: https://symfony.com/doc/current/components/css_selector.html\n.. _`HttpClient`: https://symfony.com/doc/current/components/http_client.html\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFriendsOfPHP%2FGoutte","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FFriendsOfPHP%2FGoutte","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FFriendsOfPHP%2FGoutte/lists"}