{"id":18553547,"url":"https://github.com/quantumudit/ebooks-extractor-app","last_synced_at":"2026-05-08T05:36:04.796Z","repository":{"id":200181137,"uuid":"704988325","full_name":"quantumudit/Ebooks-Extractor-App","owner":"quantumudit","description":"An efficient web scraping tool for gathering detailed book information from eBooks.com, with user-friendly selection options for categories, subjects, and topics, culminating in the generation of downloadable CSV files.","archived":false,"fork":false,"pushed_at":"2023-10-17T09:20:31.000Z","size":3623,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-12-26T08:42:27.767Z","etag":null,"topics":["python","streamlit","webapp","websraping"],"latest_commit_sha":null,"homepage":"https://ebooks-extractor-app.streamlit.app/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/quantumudit.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-14T17:55:17.000Z","updated_at":"2023-10-14T18:12:24.000Z","dependencies_parsed_at":null,"dependency_job_id":"21ae9611-b705-4c0e-9e74-8be50ed1d65c","html_url":"https://github.com/quantumudit/Ebooks-Extractor-App","commit_stats":null,"previous_names":["quantumudit/ebooks-extractor-app"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantumudit%2FEbooks-Extractor-App","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantumudit%2FEbooks-Extractor-App/tags",
"releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantumudit%2FEbooks-Extractor-App/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quantumudit%2FEbooks-Extractor-App/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/quantumudit","download_url":"https://codeload.github.com/quantumudit/Ebooks-Extractor-App/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239278526,"owners_count":19612329,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["python","streamlit","webapp","websraping"],"created_at":"2024-11-06T21:17:31.781Z","updated_at":"2025-11-01T09:30:37.681Z","avatar_url":"https://github.com/quantumudit.png","language":"Python","funding_links":["https://www.buymeacoffee.com/quantumudit"],"categories":[],"sub_categories":[],"readme":"# ![Project Logo][project_logo]\n\n---\n\n\u003ch4 align=\"center\"\u003eEmpowering users to access tailored book selections from the \u003ca href=\"https://www.ebooks.com/\" target=\"_blank\"\u003eEbooks\u003c/a\u003e website. 
This web application, developed with \u003ca href=\"https://www.python.org/\" target=\"_blank\"\u003ePython\u003c/a\u003e and \u003ca href=\"https://streamlit.io/\" target=\"_blank\"\u003eStreamlit\u003c/a\u003e, streamlines the process of downloading books that match their preferences.\u003c/h4\u003e\n\n\u003cp align='center'\u003e\n\u003cimg src=\"https://forthebadge.com/images/badges/built-with-love.svg\" alt=\"built-with-love\" border=\"0\"\u003e\n\u003cimg src=\"https://forthebadge.com/images/badges/powered-by-coffee.svg\" alt=\"powered-by-coffee\" border=\"0\"\u003e\n\u003cimg src=\"https://forthebadge.com/images/badges/cc-nc-sa.svg\" alt=\"cc-nc-sa\" border=\"0\"\u003e\n\u003c/p\u003e\n\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#overview\"\u003eOverview\u003c/a\u003e •\n  \u003ca href=\"#prerequisites\"\u003ePrerequisites\u003c/a\u003e •\n  \u003ca href=\"#architecture\"\u003eArchitecture\u003c/a\u003e •\n  \u003ca href=\"#demo\"\u003eDemo\u003c/a\u003e •\n  \u003ca href=\"#support\"\u003eSupport\u003c/a\u003e •\n  \u003ca href=\"#license\"\u003eLicense\u003c/a\u003e\n\u003c/p\u003e\n\n## Overview\n\nThe primary goal of this project revolves around the retrieval of comprehensive book data from the [Ebooks][website_link] website.\n\n\u003cp align='center'\u003e\n  \u003ca href=\"https://www.ebooks.com/\"\u003e\n    \u003cimg src=\"./images/website_snippet.png\" alt=\"website-snippet\" style=\"0\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\nThe web application has been meticulously designed to cater to on-demand web scraping. In essence, it selectively extracts essential book information based on the user's specified choices regarding category, subject, and topic.\n\nOnce the user designates a category, the application promptly generates a list of associated subjects for the user to select from. 
Likewise, upon selecting a subject, the application dynamically populates a dropdown menu with relevant topics (if available).\n\n\u003cp align='center'\u003e\n  \u003ca href=\"https://ebooks-extractor-app.streamlit.app/\"\u003e\n    \u003cimg src=\"./images/webapp_image.png\" alt=\"webapp_image\" style=\"0\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\nArmed with these three choices, users can effortlessly obtain their desired information in the form of a downloadable CSV file, simply by clicking the \"Get Data\" button.\n\nThe project repository exhibits the following structure:\n\n```\nEbooks-Extractor-App/\n├─ 📁.streamlit/\n│  └─ ⚙️config.toml\n├─ 🐍app.py\n├─ 🐍scraper_functions.py\n├─ 🗒️readme.md\n├─ 🗒️requirements.txt\n├─ 📜.gitignore\n├─ 🔑LICENSE\n└─ 📁images/\n   ├─ 🖼️books_image.jpg\n   ├─ 🖼️ebooks_logo.png\n   ├─ 🖼️process_workflow.png\n   ├─ 🖼️webapp_graphic.gif\n   ├─ 🖼️webapp_image.png\n   └─ 🖼️website_snippet.png\n```\n\nThe Streamlit application is driven by two fundamental Python scripts:\n\n- **🐍[app.py][app]**: This script capitalizes on functions from the [scraper_functions.py][scraper_funcs] file, enabling seamless web scraping. 
It is also the main script of the Streamlit application.\n\n- **🐍[scraper_functions.py][scraper_funcs]**: This file houses a collection of functions specifically designed for data extraction via web scraping techniques.\n\n\n## Prerequisites\n\nTo follow the concepts and processes involved in this project, a solid understanding of the following is recommended:\n\n- Fundamental knowledge of Python, APIs, and Streamlit\n- Familiarity with the Python libraries listed in the 🗒️[requirements.txt][requirements] file\n- Basic familiarity with browser developer tools\n\nHaving these skills as a foundation will help to ensure a smooth and effective experience while working on this project.\n\n\u003e The selection of applications and their installation process may differ depending on personal preferences and computer configurations.\n\n## Architecture\n\nThe diagram below illustrates the architectural design of this project:\n\n![Process Architecture][process_workflow]\n\nThe project's architectural framework encompasses the following key steps:\n\n### User Interaction\n\nThe user initiates the process by selecting their desired category from the available options.\nBased on the chosen category, the web application dynamically scrapes and presents a list of related subjects for the user's selection.\n\nUpon subject selection, the web app proceeds to scrape topics associated with the selected subject (if available).\n\nThe user can then finalize their selection by clicking the \"Get Data\" button.\n\n### Data Retrieval\n\nSubsequently, the web application conducts a comprehensive scraping operation to gather book-related information. 
This gathered data is then structured into a CSV file format.\n\n### User Output\n\nThe user is provided with a downloadable CSV file containing the acquired book data, facilitating easy access to the information they require.\n\n\n## Demo\n\nThe following illustration demonstrates the process of collecting data by providing necessary inputs to the web application:\n\n\u003cp align='center'\u003e\n  \u003ca href=\"https://ebooks-extractor-app.streamlit.app/\"\u003e\n    \u003cimg src=\"./images/webapp_graphic.gif\" alt=\"webapp-graphic\" style=\"0\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\u003e Access the web application by clicking here: **[Ebooks Extractor App][webapp_link]**\n\n\n\n## Support\n\nIf you have any questions, concerns, or suggestions, feel free to reach out to me through any of the following channels:\n\n[![Linkedin Badge][linkedinbadge]][linkedin] [![Twitter Badge][twitterbadge]][twitter] [![Medium Badge][mediumbadge]][medium]\n\n\nIf you find my work valuable, you can show your appreciation by [buying me a coffee][buy_me_a_coffee]\n\n\u003ca href=\"https://www.buymeacoffee.com/quantumudit\" target=\"_blank\"\u003e\n\u003cimg src=\"https://i.ibb.co/9cyrq6m/buy-me-a-coffee.png\" alt=\"buy-me-a-coffee\" border=\"0\" width=\"170\" height=\"50\"\u003e\n\u003c/a\u003e\n\n## License\n\n\u003ca href = 'https://creativecommons.org/licenses/by-nc-sa/4.0/' target=\"_blank\"\u003e\n    \u003cimg src=\"https://i.ibb.co/mvmWGkm/by-nc-sa.png\" alt=\"by-nc-sa\" border=\"0\" width=\"88\" height=\"31\"\u003e\n\u003c/a\u003e\n\nThis license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator. 
If you remix, adapt, or build upon the material, you must license the modified material under identical terms.\n\n---\n\u003cp align='center'\u003e\n  \u003ca href=\"https://topmate.io/quantumudit\"\u003e\n    \u003cimg src=\"./images/topmate_featured.png\" alt=\"topmate-udit\" style=\"0\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n\u003c!-- Image Links --\u003e\n\n[project_logo]: ./images/ebooks_logo.png\n[process_workflow]: ./images/process_workflow.png\n\n\u003c!-- External Links --\u003e\n\n[website_link]: https://www.ebooks.com/\n[webapp_link]: https://ebooks-extractor-app.streamlit.app/\n[requirements]: ./requirements.txt\n\n\n\u003c!-- Project Specific Links --\u003e\n\n[app]: ./app.py\n[scraper_funcs]: ./scraper_functions.py \n\n\u003c!-- Profile Links --\u003e\n\n[linkedin]: https://www.linkedin.com/in/uditkumarchatterjee/\n[twitter]: https://twitter.com/quantumudit\n[medium]: https://medium.com/@quantumudit\n[buy_me_a_coffee]: https://www.buymeacoffee.com/quantumudit\n\n\u003c!-- Shields Profile Links --\u003e\n\n[linkedinbadge]: https://img.shields.io/badge/-uditkumarchatterjee-0e76a8?style=flat\u0026labelColor=0e76a8\u0026logo=linkedin\u0026logoColor=white\n[twitterbadge]: https://img.shields.io/badge/-quantumudit-000000?style=flat\u0026labelColor=000000\u0026logo=x\u0026logoColor=white\u0026link=https://twitter.com/quantumudit\n[mediumbadge]: https://img.shields.io/badge/-quantumudit-02b875?style=flat\u0026labelColor=02b875\u0026logo=medium\u0026logoColor=white\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantumudit%2Febooks-extractor-app","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquantumudit%2Febooks-extractor-app","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquantumudit%2Febooks-extractor-app/lists"}