{"id":20494632,"url":"https://github.com/codelucas/yelpcrawl","last_synced_at":"2025-04-13T17:43:05.732Z","repository":{"id":12919508,"uuid":"15597006","full_name":"codelucas/yelpcrawl","owner":"codelucas","description":"Crawl and scrape Yelp's restaurant data for every zip code in the United States (or a specified zipcode). Yelp Crawler.","archived":false,"fork":false,"pushed_at":"2017-05-12T04:49:17.000Z","size":205,"stargazers_count":55,"open_issues_count":2,"forks_count":41,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-03-27T08:45:07.432Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/codelucas.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-01-03T00:08:42.000Z","updated_at":"2024-12-19T10:24:55.000Z","dependencies_parsed_at":"2022-08-28T14:50:18.958Z","dependency_job_id":null,"html_url":"https://github.com/codelucas/yelpcrawl","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelucas%2Fyelpcrawl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelucas%2Fyelpcrawl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelucas%2Fyelpcrawl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/codelucas%2Fyelpcrawl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/codelucas","download_url":"https://codeload.github.com/codelucas/yelpcrawl/tar.gz/refs/heads/master","host":{"name":
"GitHub","url":"https://github.com","kind":"github","repositories_count":248756184,"owners_count":21156727,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-15T17:42:16.400Z","updated_at":"2025-04-13T17:43:05.701Z","avatar_url":"https://github.com/codelucas.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"YelpCrawl: Exhaustive Yelp! Scraper\n===================================\n\nExample usage for `yelp`_ extraction.\n\nExtract all restaurant data from a specific zipcode.\n\n::\n\n    $ python2.7 crawler.py -z 98029\n\n    ===== Attempting extraction for zipcode \u003c 98029 \u003e=====\n    \n    title: Issaquah Coffee Company\n    categories: Coffee \u0026 Tea\n    rating: 4.0 star rating\n    ...\n\n\nExtract all restaurant data from America (all American zipcodes).\n\n::\n\n    $ python2.7 crawler.py\n\n    **We are attempting to extract all zipcodes in Amerrica!**\n\n    ===== Attempting extraction for zipcode \u003c 35004 \u003e=====\n\n    title: Brasher Sam Tire \u0026amp; Auto Service Inc\n    categories: Tires\n    rating: 5.0 star rating\n    ...\n\n\nInstallation:\n-------------\n\n::\n\n    $ git clone https://github.com/codelucas/yelpcrawl\n    $ cd yelpcrawl\n    $ pip install -r requirements.txt\n\nAnd now you can begin!\n\n::\n\n    $ python2.7 crawler.py -z 98029\n\nFeel free to send in pull requests. We need some test cases please :)\n\nThis code was written when the two of us were still relatively new at python \nso excuse the shittyness. 
This was open sourced just as a keepsake; it's nothing\nfancy, and there are definitely better scraping solutions out there.\n\nWe used slower parsers like `beautifulsoup`_ and no multithreading\nbecause `yelp`_ would've rate-limited us anyway :)\n\nBy: `Lucas`_, `Mathew`_\n\n.. _`yelp`: http://www.yelp.com\n.. _`beautifulsoup`: http://www.crummy.com/software/BeautifulSoup/\n.. _`Lucas`: http://codelucas.com\n.. _`Mathew`: https://www.facebook.com/matsprehn\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodelucas%2Fyelpcrawl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodelucas%2Fyelpcrawl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodelucas%2Fyelpcrawl/lists"}