{"id":20218328,"url":"https://github.com/behi22/webscraper","last_synced_at":"2026-04-01T17:35:08.235Z","repository":{"id":260254651,"uuid":"880763680","full_name":"behi22/WebScraper","owner":"behi22","description":"(FYI: the free Render PostgreSQL database has expired) A SPA that takes a website URL as input, scrapes its content, and classifies visitors based on their interests or industry. The goal is to dynamically generate questions and multiple-choice options that help categorize users visiting the site.","archived":false,"fork":false,"pushed_at":"2025-02-13T16:49:03.000Z","size":135091,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-03-28T01:35:30.809Z","etag":null,"topics":["ajax","antd","axios","babel","flask","html-css-javascript","jest","linux","node","nodejs","postgres","python","react","redis","redux","render","wsl"],"latest_commit_sha":null,"homepage":"https://webscraper-behbod-babai.vercel.app/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/behi22.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-30T10:08:25.000Z","updated_at":"2025-04-03T22:32:46.000Z","dependencies_parsed_at":"2025-02-13T17:42:22.288Z","dependency_job_id":null,"html_url":"https://github.com/behi22/WebScraper","commit_stats":null,"previous_names":["behi22/webscraper"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/behi22/WebScraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/behi22%2
FWebScraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/behi22%2FWebScraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/behi22%2FWebScraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/behi22%2FWebScraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/behi22","download_url":"https://codeload.github.com/behi22/WebScraper/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/behi22%2FWebScraper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31290537,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-01T13:12:26.723Z","status":"ssl_error","status_checked_at":"2026-04-01T13:12:25.102Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ajax","antd","axios","babel","flask","html-css-javascript","jest","linux","node","nodejs","postgres","python","react","redis","redux","render","wsl"],"created_at":"2024-11-14T06:38:09.102Z","updated_at":"2026-04-01T17:35:08.211Z","avatar_url":"https://github.com/behi22.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"#\n\nWebScraper\n\n\u003e Web Scraper for Visitor Classification.\n\u003e\n\u003e \u003c!--Live demo [_here_]().  
If you have the project hosted somewhere, include the link here. --\u003e\n\n## Table of Contents\n\n- [General Info](#general-information)\n- [Technologies Used](#technologies-used)\n- [Screenshots](#screenshots)\n- [Usage](#usage)\n- [Project Status](#project-status)\n- [Room for Improvement](#room-for-improvement)\n- [Acknowledgements](#acknowledgements)\n- [Contact](#contact)\n\u003c!-- * [License](#license) --\u003e\n\n## General Information\n\nA single-page application that takes a website URL as input, scrapes its content, and classifies visitors based on their interests or industry. The goal is to dynamically generate questions and multiple-choice options that help categorize users visiting the site.\n\n- Project presentation: [Loom Video](https://www.loom.com/share/84359f4e50ee41fc97b21f921d9aa4a0?sid=a69fe144-e797-46e1-8a23-09a507a4b98a)\n\n- The most up-to-date, deployed \u003cb\u003efrontend\u003c/b\u003e repo can be viewed at: https://github.com/behi22/visitor-classifier-frontend\n\n- The most up-to-date, deployed \u003cb\u003ebackend\u003c/b\u003e repo can be viewed at: https://github.com/behi22/visitor-classifier\n\n\u003c!-- You don't have to answer all the questions - just the ones relevant to your project. 
--\u003e\n\n## Technologies Used\n\n- npm - 8.15.0\n- React.js - 18.3.1\n- Redux - 9.1.2\n- antd - 5.22.2\n- HTML5\n- CSS\n- Babel\n- Axios\n- AJAX\n- Git - 2.38.1.windows.1\n- GitHub\n- Linux\n- WSL\n- Python\n- Flask\n- PostgreSQL\n- Vercel\n- Redis\n- Render\n\n## Screenshots\n\n![alt text](image-1.png)\n\n## Usage\n\nThe app should have the following features:\n\n- **Frontend** - A neat, user-friendly, component-based frontend, created with React and deployed on Vercel\n- **Backend API** - A Python-based API that implements web scraping, data extraction, and AI-based content generation, deployed on Render\n- **Storage** - A PostgreSQL database for storage, hosted on Render\n- **Caching** - Redis for caching, hosted on Redis Cloud\n- Effective integration of the frontend and backend components\n\n## Project Status\n\nProject is: Semi-Complete (Demo)\n\n## Room for Improvement\n\n- As indicated in the comments in [Home.js](/visitor-classifier-frontend/src/pages/Home.js), the answers to each question are currently not submitted anywhere, and the logic could be developed further.\n\n- The script for generating questions in [app.py](/visitor-classifier/app.py) is still very primitive. With more time and resources, it could be developed further to generate more meaningful questions.\n\n- There is an issue with the \u003cins\u003eMissing Answers StyledParagraph\u003c/ins\u003e inside [Home.js](/visitor-classifier-frontend/src/pages/Home.js): it remains visible after submitting partial answers and then changing the URL. Resolving this needs more debugging time.\n\n## Acknowledgements\n\n- Many thanks to Brave Career for including me in their Software Engineer assessment project.\n\n## Contact\n\nCreated by Behbod Babai - feel free to contact me via email!\nMy email: behibabai@gmail.com\n\n\u003c!-- Optional --\u003e\n\u003c!-- ## License --\u003e\n\u003c!-- This project is open source and 
available under the [... License](). --\u003e\n\n\u003c!-- You don't have to include all sections - just the one's relevant to your project --\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbehi22%2Fwebscraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbehi22%2Fwebscraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbehi22%2Fwebscraper/lists"}