{"id":14978741,"url":"https://github.com/gajendrasharma-github/web-scraping","last_synced_at":"2026-01-25T01:32:26.298Z","repository":{"id":253414573,"uuid":"843436624","full_name":"gajendrasharma-github/Web-Scraping","owner":"gajendrasharma-github","description":"Using Selenium and Beautiful Soup","archived":false,"fork":false,"pushed_at":"2024-08-19T18:22:55.000Z","size":738,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-16T07:10:04.387Z","etag":null,"topics":["beautifulsoup","python","scraping-websites","selenium"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gajendrasharma-github.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-16T14:08:05.000Z","updated_at":"2024-08-19T18:22:59.000Z","dependencies_parsed_at":"2024-08-19T20:45:52.892Z","dependency_job_id":null,"html_url":"https://github.com/gajendrasharma-github/Web-Scraping","commit_stats":null,"previous_names":["gajendrasharma-github/web-scraping"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gajendrasharma-github%2FWeb-Scraping","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gajendrasharma-github%2FWeb-Scraping/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gajendrasharma-github%2FWeb-Scraping/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/gajendrasharma-github%2FWeb-Scraping/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gajendrasharma-github","download_url":"https://codeload.github.com/gajendrasharma-github/Web-Scraping/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254485043,"owners_count":22078767,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup","python","scraping-websites","selenium"],"created_at":"2024-09-24T13:58:16.906Z","updated_at":"2026-01-25T01:32:26.259Z","avatar_url":"https://github.com/gajendrasharma-github.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Guide to the repositories:\n\n`1.` Scraping Ambition Box to gather company reviews, ratings, and other important information [Link](https://github.com/gajendrasharma-github/Web-Scraping/blob/main/Ambition%20Box%20Data%20Scraping.ipynb)\n\n`2.` Scraping Ajio using Selenium with step by step process and guiding notes [Link](https://github.com/gajendrasharma-github/Web-Scraping/blob/main/Webscraping%20Selenium%20Ajio%20with%20Notes.ipynb)\n\n`3.` Scraping Laptop Details from Amazon for a comprehensive analysis [Link](https://github.com/gajendrasharma-github/Web-Scraping/blob/main/Scraping%20Laptop%20Details%20from%20Amazon.ipynb)\n\n`4.` Scraping Laptop Details from Amazon for the brand Asus [Link](https://github.com/gajendrasharma-github/Web-Scraping/blob/main/Extracting%20Laptop%20Details%20for%20Brand%20Asus%20Using%20Selenium.ipynb)\n\n`5.` Scraping Election Outcomes from 
Election Commission of India Results Website [Link](https://github.com/gajendrasharma-github/Web-Scraping/blob/main/Election%20Results%20Scraping.ipynb)\n\n\n## Introduction to Web Scraping\n\nWeb scraping is the automated process of extracting information from websites. It involves fetching the HTML content of a web page, parsing it, and extracting the desired information for analysis or further processing. Web scraping is a powerful technique for gathering large amounts of data quickly and efficiently, often used for tasks such as price comparison, product review analysis, and job listing aggregation.\n\n\n### Key Components of Web Scraping:\n\n1. **HTML Parsing:** The process of breaking down the HTML structure of a webpage to access specific elements like text, images, and links.\n2. **Data Extraction:** Identifying and extracting the relevant information from the parsed HTML, such as product prices, names, and reviews.\n3. **Handling Dynamic Content:** Many websites use JavaScript to load content dynamically. Scraping such sites often requires driving a real browser so that the fully rendered page can be captured.\n4. **Data Storage:** Once the data is extracted, it is often stored in a structured format such as CSV, JSON, or a database for further analysis.\n\n## Project Overview\n\nThis repository contains a series of scripts and notebooks developed from scratch to scrape data from various popular websites including Amazon, Ambition Box, Ajio, and others. 
Each script demonstrates a practical approach to web scraping, from simple static pages to more complex dynamic websites that require advanced techniques.\n\n### Websites Scraped:\n\n- **Amazon:** Extracting product details such as names, prices, ratings, and reviews.\n- **Ambition Box:** Gathering company reviews, ratings, and employee feedback.\n- **Ajio:** Scraping product listings, prices, discounts, and availability.\n\n### Libraries and Tools Used:\n\n- **BeautifulSoup:** For parsing HTML and navigating the page structure to extract specific elements.\n- **Requests:** For sending HTTP requests to fetch the HTML content of web pages.\n- **Selenium:** Used for handling dynamic content and interacting with JavaScript-rendered pages.\n- **Pandas:** For storing and manipulating the scraped data in a tabular format.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgajendrasharma-github%2Fweb-scraping","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgajendrasharma-github%2Fweb-scraping","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgajendrasharma-github%2Fweb-scraping/lists"}