[![Web Scraping Logo](https://cfe2-static.s3-us-west-2.amazonaws.com/media/courses/web-scraping/images/Web-Scraping.jpg)](https://www.codingforentrepreneurs.com/courses/web-scraping/)

Learn how to leverage Python's amazing tools to scrape data from other websites.

The end goal of this course is to scrape blogs to analyze trending keywords and phrases.

We'll be using Python 3.6, Requests, BeautifulSoup, Asyncio, Pandas, NumPy, and more!

## Section 1: [Your First Scraping Program](https://www.codingforentrepreneurs.com/courses/web-scraping/your-first-scraping-program/)
Watch [here](https://www.codingforentrepreneurs.com/courses/web-scraping/your-first-scraping-program/)

Final code is in [first-web-scraping-program.zip](./first-web-scraping-program.zip)

#### Install Guides
Windows: https://kirr.co/6r8wr9

Mac: https://kirr.co/386c7f

Linux: https://kirr.co/c3uvuu

#### Goals of Your First Scraping Program:

1. Enter any URL (webpage)
2. Open that webpage and scrape each of its words
3. Save that info into a CSV file

###### Third-Party Packages

- Python Requests: http://docs.python-requests.org/en/master/

    ```
    pip install requests
    ```
    Requests opens (downloads) the webpage for us.

- BeautifulSoup 4: https://www.crummy.com/software/BeautifulSoup/bs4/doc/

    ```
    pip install beautifulsoup4
    ```
    BeautifulSoup lets us search & extract content from an HTML webpage.

## Section 2: [Advancing Scraping](https://www.codingforentrepreneurs.com/courses/web-scraping/advancing-scraping/)

#### Goals of Advancing Scraping:
1. Refine the scraping code
2. Scrape links
3. Add scrape depth
4. Scrape & parse the words in a post

[1 - Welcome](../../tree/118bda3462c7452a828702f3e13a573aa5d28b4a/)

[2 - Get URL Input](../../tree/118bda3462c7452a828702f3e13a573aa5d28b4a/)

[3 - Regular Expression Validation](../../tree/2523039e67cf91ed6552dc31fcc2240b2be30c58/)

[4 - Force Quit Program](../../tree/72c74d214655642bb442e6391a09ca6ab57e1e59/)

[5 - Usability](../../tree/d583c77c7013c0399f51e8052d9c9a1bc0ab044e/)

[6 - Fetch URL](../../tree/38506bc8d45722df18087c624f3910bfc6b61f23/)

[7 - Soupify](../../tree/6b6d4a7d384d49f1f7d69ad1beb6317f8547a99b/)

[8 - Extract Data](../../tree/6fc67c4e424ca64600813dd8a4b16916186e149e/)

[9 - Parse Links](../../tree/beb8beef00da709310267c6e6b94c67f71540b93/)

[10 - Get Local Paths](../../tree/056f95c20c4fc447ff840178dff9abe1cc973880/)

[11 - Local Paths by Regular Expression](../../tree/9909d19a9e2bc2b6934bf571253a3661158ed417/)

[12 - Some Lookup Errors](../../tree/32b91cc58332a01b57406453c5802368f25d6f1c/)

[13 - Scrape Local Paths](../../tree/b6da1d3b02148099514ff8446e0fe535f140a030/)

[14 - Parse Words](../../tree/3a808719fd2f343a9a2e279d65a5d71826d40c30/)

[15 - Python Set](../../tree/2bd7b5fc47cd9d4dddb38b0c63b236e2069845d3/)

[16 - A Recursive Function](../../tree/8bae867a8a89d851333a6ab13aa46f4ba2f76930/)

[17 - Mock Fetching](../../tree/898d69d8e6edab45f266c848b9907192546c5e06/)

[18 - All together](../../tree/fa8de79f2bc4cce3142ff3254ec4aa415eb824d4/)

## Section 3: [Asyncio & Web Scraping](https://www.codingforentrepreneurs.com/courses/web-scraping/asyncio-web-scraping/)
_code coming soon_