{"id":25155468,"url":"https://github.com/ddayto21/lead-scraper","last_synced_at":"2025-09-03T10:36:12.376Z","repository":{"id":190870066,"uuid":"515367876","full_name":"ddayto21/Lead-Scraper","owner":"ddayto21","description":"Repository contains a web crawler that searches for emails in a webpage, along with a webscraping script that collects leads from various webpages online filters those links based on some criteria and adds the new links to a queue. All the HTML or some specific information is extracted to be processed by a different pipeline.","archived":false,"fork":false,"pushed_at":"2022-07-19T00:22:42.000Z","size":52,"stargazers_count":15,"open_issues_count":0,"forks_count":5,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-30T08:54:05.070Z","etag":null,"topics":["beautifulsoup4","python","requests","webcrawler","webscraper","yellow-pages"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ddayto21.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2022-07-18T23:10:19.000Z","updated_at":"2025-01-19T19:28:11.000Z","dependencies_parsed_at":"2023-08-26T20:25:14.616Z","dependency_job_id":"6196e631-ff4f-4348-a0f4-e00458bc7d05","html_url":"https://github.com/ddayto21/Lead-Scraper","commit_stats":null,"previous_names":["ddayto21/lead-scraper"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ddayto21/Lead-Scraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ddayto21%2FLead-Scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ddayto21%2FLead-Scrape
r/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ddayto21%2FLead-Scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ddayto21%2FLead-Scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ddayto21","download_url":"https://codeload.github.com/ddayto21/Lead-Scraper/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ddayto21%2FLead-Scraper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273431154,"owners_count":25104487,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-03T02:00:09.631Z","response_time":76,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup4","python","requests","webcrawler","webscraper","yellow-pages"],"created_at":"2025-02-09T00:51:54.895Z","updated_at":"2025-09-03T10:36:12.353Z","avatar_url":"https://github.com/ddayto21.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Repository Overview\nThis repository was built to provide business owners a way to save time by collecting thousands of business leads from Yellow Pages, a website that contains over 27 million businesses in the United States. 
\n\n![Python-Cover](images/python-image.jpg)\n\nWe use Requests, a Python library, to collect large amounts of unstructured data from Yellow Pages. We then use BeautifulSoup to parse the relevant information out of the HTML, and Pandas to build dataframes and save the leads to .csv files that can be used for marketing campaigns.\n\n## Install the Requests Library\n```\n$ pip install requests\n```\n\n## Import the Requests Library\n```python\n\nimport requests\n\n```\n## Send an HTTP Request to the Server\n```python\n\n# Example URL (an assumption); replace it with the page you want to scrape.\nurl = \"https://www.yellowpages.com\"\nresponse = requests.get(url)\n\n```\n\n## Extract Relevant Data from the Response\nWe use BeautifulSoup, a Python library that makes it easy to parse HTML documents.\n\n### Install the Beautiful Soup Library\n```\n$ pip install beautifulsoup4\n```\n\n### Import the Beautiful Soup Library\n\n```python\nfrom bs4 import BeautifulSoup\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fddayto21%2Flead-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fddayto21%2Flead-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fddayto21%2Flead-scraper/lists"}