{"id":13567226,"url":"https://github.com/scrapehero/yellowpages-scraper","last_synced_at":"2025-04-04T01:31:25.200Z","repository":{"id":56201947,"uuid":"124041218","full_name":"scrapehero/yellowpages-scraper","owner":"scrapehero","description":"Yellowpages.com Web Scraper written in Python and LXML to extract business details available based on a particular category and location.","archived":false,"fork":false,"pushed_at":"2020-11-20T19:30:30.000Z","size":17,"stargazers_count":73,"open_issues_count":5,"forks_count":62,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-11-04T22:36:28.385Z","etag":null,"topics":["business-directory","extract","html","lxml","parsing","python","scraper","web-scraper","yellow-pages","yellow-pages-scraper"],"latest_commit_sha":null,"homepage":"https://www.scrapehero.com/how-to-scrape-business-details-from-yellowpages-com-using-python-and-lxml/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/scrapehero.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-03-06T07:50:39.000Z","updated_at":"2024-10-31T06:23:20.000Z","dependencies_parsed_at":"2022-08-15T14:31:49.526Z","dependency_job_id":null,"html_url":"https://github.com/scrapehero/yellowpages-scraper","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapehero%2Fyellowpages-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapehero%2Fyellowpages-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapehero%2Fyellowpages
-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapehero%2Fyellowpages-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/scrapehero","download_url":"https://codeload.github.com/scrapehero/yellowpages-scraper/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247107816,"owners_count":20884793,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["business-directory","extract","html","lxml","parsing","python","scraper","web-scraper","yellow-pages","yellow-pages-scraper"],"created_at":"2024-08-01T13:02:26.310Z","updated_at":"2025-04-04T01:31:22.189Z","avatar_url":"https://github.com/scrapehero.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Yellow Pages Business Details Scraper\n\nYellowpages.com Web Scraper written in Python and LXML to extract business details available based on a particular category and location.\n\nIf you would like to know more about this scraper you can check it out at the blog post 'How to Scrape Business Details from Yellow Pages using Python and LXML' - https://www.scrapehero.com/how-to-scrape-business-details-from-yellowpages-com-using-python-and-lxml/\n\n## Getting Started\n\nThese instructions will get you a copy of the project up and running on your local machine for development and testing purposes.\n\n### Fields to Extract\n\nThis yellow pages scraper can extract the fields below:\n\n1. Rank\n2. Business Name\n3. Phone Number\n4. Business Page\n5. 
Category\n6. Website\n7. Rating\n8. Street name\n9. Locality\n10. Region\n11. Zipcode\n12. URL\n\n### Prerequisites\n\nFor this web scraping tutorial using Python 3, we will need some packages for downloading and parsing the HTML. \nBelow are the package requirements:\n\n - lxml\n - requests\n\n### Installation\n\nUse pip to install the following packages in Python (https://pip.pypa.io/en/stable/installing/) \n\nPython Requests, to make requests and download the HTML content of the pages (http://docs.python-requests.org/en/master/user/install/)\n\nPython LXML, for parsing the HTML tree structure using XPath expressions (learn how to install it here – http://lxml.de/installation.html)\n\n## Running the scraper\nRun the script by name, followed by the positional arguments **keyword** and **place**. Here is an example\nthat finds the business details for restaurants in Boston, MA.\n\n```\npython3 yellow_pages.py restaurants Boston,MA\n```\n## Sample Output\n\nThis will create a CSV file:\n\n[Sample Output](https://raw.githubusercontent.com/scrapehero/yellow_pages/master/restaurants-boston-yellowpages-scraped-data.csv)\n \n \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscrapehero%2Fyellowpages-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fscrapehero%2Fyellowpages-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscrapehero%2Fyellowpages-scraper/lists"}