{"id":25074940,"url":"https://github.com/nathancordeiro/web-scraper","last_synced_at":"2025-03-31T19:47:31.170Z","repository":{"id":237483986,"uuid":"792648668","full_name":"NathanCordeiro/WEB-SCRAPER","owner":"NathanCordeiro","description":"A python GUI web scraper made with beautiful soup and PyQt5. ","archived":false,"fork":false,"pushed_at":"2024-05-26T18:25:52.000Z","size":999,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-07T00:20:20.404Z","etag":null,"topics":["beautifulsoup4","pyqt5","python","python-gui","web-scraper"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NathanCordeiro.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-27T06:58:41.000Z","updated_at":"2025-01-26T19:11:33.000Z","dependencies_parsed_at":"2024-05-03T20:04:13.294Z","dependency_job_id":"debd4a28-46f5-4794-8bda-df7e7de93d91","html_url":"https://github.com/NathanCordeiro/WEB-SCRAPER","commit_stats":null,"previous_names":["nathancordeiro/web-scraper"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NathanCordeiro%2FWEB-SCRAPER","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NathanCordeiro%2FWEB-SCRAPER/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NathanCordeiro%2FWEB-SCRAPER/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Nath
anCordeiro%2FWEB-SCRAPER/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NathanCordeiro","download_url":"https://codeload.github.com/NathanCordeiro/WEB-SCRAPER/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246531983,"owners_count":20792735,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup4","pyqt5","python","python-gui","web-scraper"],"created_at":"2025-02-07T00:19:46.168Z","updated_at":"2025-03-31T19:47:31.147Z","avatar_url":"https://github.com/NathanCordeiro.png","language":"Python","readme":"---\n\u003ch1 align=center\u003e\u003cb\u003e PYTHON WEB SCRAPER \u003c/b\u003e\u003c/h1\u003e\n  \n---\n\n\u003cdiv align=\"center\"\u003e\n  \u003ci\u003eA powerful and user-friendly web scraping tool built with Python, Beautiful Soup, and PyQt5.\u003c/i\u003e\n\u003c/div\u003e\n\n\u003chr\u003e\n\n\u003ch2\u003eAbout the Project\u003c/h2\u003e\n    \u003cp\u003eThe Web Scraper Application is a versatile and intuitive tool designed to make web scraping tasks effortless. 
Built using Python and PyQt5, it offers a sleek interface coupled with powerful functionality, allowing users to easily scrape content from websites and save it for further analysis.\u003c/p\u003e\n\n\u003chr\u003e\n\n  \u003ch2\u003eFeatures\u003c/h2\u003e\n    \u003cul\u003e\n        \u003cli\u003e\u003cstrong\u003eUser-Friendly Interface:\u003c/strong\u003e The application features an intuitive user interface, making it easy for users of all skill levels to navigate and utilize its features.\u003c/li\u003e\n        \u003cli\u003e\u003cstrong\u003eScraping Capabilities:\u003c/strong\u003e Users can enter a URL and scrape the content of the corresponding webpage with a single click. The scraped content is displayed in real-time within the application.\u003c/li\u003e\n        \u003cli\u003e\u003cstrong\u003eHTML Content Saving:\u003c/strong\u003e The application allows users to save the scraped HTML content to a text file for future reference or analysis.\u003c/li\u003e\n        \u003cli\u003e\u003cstrong\u003eTask Bar Navigation:\u003c/strong\u003e With a built-in task bar, users can seamlessly switch between different functionalities such as scraping, viewing, and accessing the about section.\u003c/li\u003e\n    \u003c/ul\u003e\n\n  \u003chr\u003e\n\n  \u003ch2\u003eGetting Started\u003c/h2\u003e\n    \u003ch3\u003ePrerequisites\u003c/h3\u003e\n    \u003cul\u003e\n        \u003cli\u003ePython 3.x\u003c/li\u003e\n        \u003cli\u003ePyQt5 library\u003c/li\u003e\n        \u003cli\u003eBeautifulSoup4 library\u003c/li\u003e\n        \u003cli\u003eRequests library\u003c/li\u003e\n    \u003c/ul\u003e\n\n   \u003ch3\u003eInstallation\u003c/h3\u003e\n    \u003col\u003e\n        \u003cli\u003eClone the repository:\u003c/li\u003e\n        \u003ccode\u003egit clone https://github.com/NathanCordeiro/WEB-SCRAPER.git\u003c/code\u003e\n        \u003cli\u003eNavigate to the project directory:\u003c/li\u003e\n        \u003ccode\u003ecd WEB-SCRAPER\u003c/code\u003e\n        
\u003cli\u003eInstall the required dependencies:\u003c/li\u003e\n        \u003ccode\u003epip install -r requirements.txt\u003c/code\u003e\n    \u003c/ol\u003e\n\n  \u003chr\u003e\n\n   \u003ch2\u003eUsage\u003c/h2\u003e\n    \u003col\u003e\n        \u003cli\u003eRun the application by executing the \u003ccode\u003emain.py\u003c/code\u003e file.\u003c/li\u003e\n        \u003ccode\u003epython main.py\u003c/code\u003e\n        \u003cli\u003eEnter the URL of the website you want to scrape in the designated input field.\u003c/li\u003e\n        \u003cli\u003eClick on the \"Scrape\" button to initiate the scraping process.\u003c/li\u003e\n        \u003cli\u003eThe scraped HTML content will be displayed in the application's view section. Additionally, it will be saved to a text file named \u003ccode\u003escraped_content.html\u003c/code\u003e.\u003c/li\u003e\n    \u003c/ol\u003e\n\n   \u003chr\u003e\n\n   \u003ch2\u003eContributing\u003c/h2\u003e\n    \u003cp\u003eContributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are \u003cstrong\u003egreatly appreciated\u003c/strong\u003e.\u003c/p\u003e\n    \u003col\u003e\n        \u003cli\u003eFork the Project\u003c/li\u003e\n        \u003cli\u003eCreate your Feature Branch (\u003ccode\u003egit checkout -b feature/AmazingFeature\u003c/code\u003e)\u003c/li\u003e\n        \u003cli\u003eCommit your Changes (\u003ccode\u003egit commit -m 'Add some AmazingFeature'\u003c/code\u003e)\u003c/li\u003e\n        \u003cli\u003ePush to the Branch (\u003ccode\u003egit push origin feature/AmazingFeature\u003c/code\u003e)\u003c/li\u003e\n        \u003cli\u003eOpen a Pull Request\u003c/li\u003e\n    \u003c/ol\u003e\n\n   \u003chr\u003e\n\n  \u003ch2\u003eLicense\u003c/h2\u003e\n    \u003cp\u003eDistributed under the MIT License. 
See \u003ca href=\"LICENSE\"\u003eLICENSE\u003c/a\u003e for more information.\u003c/p\u003e\n\n   \u003chr\u003e\n\n   \u003ch2\u003eContact\u003c/h2\u003e\n    \u003cp\u003eNathan Cordeiro - \u003ca href=\"mailto:nathanjohncordeiro@gmail.com\"\u003enathanjohncordeiro@gmail.com\u003c/a\u003e\u003c/p\u003e\n    \u003cp\u003eProject Link: \u003ca href=\"https://github.com/NathanCordeiro/WEB-SCRAPER\"\u003ehttps://github.com/NathanCordeiro/WEB-SCRAPER\u003c/a\u003e\u003c/p\u003e\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnathancordeiro%2Fweb-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnathancordeiro%2Fweb-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnathancordeiro%2Fweb-scraper/lists"}