{"id":26339277,"url":"https://github.com/khushi-sabarad/web_scraping","last_synced_at":"2026-04-29T13:36:05.624Z","repository":{"id":282648854,"uuid":"938141919","full_name":"khushi-sabarad/Web_Scraping","owner":"khushi-sabarad","description":"This project is a Python-based web scraper that extracts the menu from a cafe and saves it to an Excel file. It was created to automate the process of retrieving and updating menu prices, a task that was observed to be done manually  at the hostel.","archived":false,"fork":false,"pushed_at":"2025-04-16T18:23:56.000Z","size":11,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-03T20:43:52.838Z","etag":null,"topics":["beautifulsoup","data-analysis","data-visualization","market-analysis","pandas","python","requests","web-scraping","wordcloud"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/khushi-sabarad.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-02-24T13:42:53.000Z","updated_at":"2025-04-16T18:26:40.000Z","dependencies_parsed_at":"2025-06-08T02:45:35.266Z","dependency_job_id":null,"html_url":"https://github.com/khushi-sabarad/Web_Scraping","commit_stats":null,"previous_names":["khushi-sabarad/web_scraping"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/khushi-sabarad/Web_Scraping","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khushi-sabarad%2FWeb_Scraping","tags_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khushi-sabarad%2FWeb_Scraping/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khushi-sabarad%2FWeb_Scraping/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khushi-sabarad%2FWeb_Scraping/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/khushi-sabarad","download_url":"https://codeload.github.com/khushi-sabarad/Web_Scraping/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/khushi-sabarad%2FWeb_Scraping/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32427869,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-29T13:34:34.882Z","status":"ssl_error","status_checked_at":"2026-04-29T13:34:29.830Z","response_time":110,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup","data-analysis","data-visualization","market-analysis","pandas","python","requests","web-scraping","wordcloud"],"created_at":"2025-03-16T03:16:57.925Z","updated_at":"2026-04-29T13:36:05.618Z","avatar_url":"https://github.com/khushi-sabarad.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Zostel Vagamon Menu Scraper\n\n## Project Description\nThis project is a Python-based web scraper that extracts the menu from the Monkey Tribe Cafe at 
Zostel Vagamon and saves it to an Excel file. It was created to automate retrieving and updating menu prices, a task that I observed being done manually and inefficiently at the hostel.\n\n## Background Story\nWhile volunteering at Zostel Vagamon for two months, I witnessed the manual process of updating the cafe menu prices. Whenever pricing changed (such as during peak season), the receptionist would copy and paste the menu items and their prices from the online menu into a text file. This file was then sent to the property manager, who would manually retype the data, adding the new price in brackets after each item.\n\nThis repetitive, time-consuming process struck me as a perfect example of a task that could be easily automated with web scraping. Instead of error-prone manual entry, a simple script could:\n\n1. Fetch the latest menu data directly from the source (the online cafe menu).\n2. Generate a clean Excel file.\n\nNew prices can then be calculated in that file by applying a percentage increase with Excel functions, significantly reducing the workload and potential for errors.\n\nThis project aims to provide a quick and efficient way to get the menu in a structured format, ready for further processing.\n\n## Features\n- Web Scraping: Extracts menu data (item names and prices) from the Monkey Tribe Cafe Vagamon website.\n- Excel Output: Saves the scraped data into an Excel file (.xlsx) using the `pandas` library.\n- Structured Data: The Excel file is formatted for easy use, with item names and prices in separate columns.\n- Error Handling: Includes error handling for web page fetching and file saving.\n\n## Code Explanation\nThe script is written in Python and uses the following libraries:\n- `requests`: For sending HTTP requests to fetch the HTML content of the webpage.\n- `BeautifulSoup`: For parsing the HTML content and navigating the document structure to locate the menu data.\n- `pandas`: For 
creating a DataFrame to store the scraped data and saving it to an Excel file.\n- `openpyxl`: For advanced manipulation of the Excel file, specifically for formatting the output (making category names bold).\n\n## How the Script Works\n1. Fetching the Webpage:\n- The script uses the `requests.get()` function to retrieve the HTML content from the Monkey Tribe Cafe Vagamon website.\n- It calls `response.raise_for_status()` to raise an error for unsuccessful HTTP responses (e.g., a server error, or the page is not found).\n\n2. Parsing the HTML:\n- The `BeautifulSoup` library is used to parse the HTML content, creating a navigable tree structure.\n- The script then uses BeautifulSoup's methods (e.g., `find_all()`, `find()`) to locate the specific HTML elements that contain the menu data. This is done by identifying HTML tags and CSS classes.\n\n3. Locating Menu Data:\n- The script targets specific `\u003cspan\u003e` tags with the class `wixui-rich-text__text`.\n- It uses the `style` attribute of these tags, specifically the `font-family` value, to differentiate between category headings and menu item names/prices.\n\n4. Data Extraction:\n- The script iterates through the located elements and extracts the relevant text for the category, item name, and price, cleaning the data by removing extra whitespace.\n\n5. Data Structuring:\n- The extracted data is organized into a Python dictionary, where the keys are the menu categories (e.g., \"Appetizers\", \"Soups\"), and the values are lists of dictionaries, with each dictionary containing the \"item\" name and \"price\".\n- This structured data is then converted into a pandas DataFrame.\n\n6. 
Saving to Excel:\n- The DataFrame is saved to an Excel file (.xlsx) using the pandas `to_excel()` method.\n- The `openpyxl` library is then used to format the Excel file (the category rows are made bold).\n\nOpen the `menu.xlsx` file with Microsoft Excel or another spreadsheet program.\n- The data is organized into three columns:\n  - Name: Contains the category name or the item name.\n  - Price: Contains the price of the item.\n  - Category: \"Category\" or \"Item\", to help with filtering.\n- You can then use Excel functions to add a column for the new price (e.g., by applying a percentage increase) as required.\n\n## Disclaimer\n* The structure of the website may change. If the script fails, you may need to modify the element selectors in the code.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkhushi-sabarad%2Fweb_scraping","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkhushi-sabarad%2Fweb_scraping","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkhushi-sabarad%2Fweb_scraping/lists"}