https://github.com/behi22/webscraper
(FYI: the free Render PostgreSQL database has expired) A SPA that takes a website URL as input, scrapes its content, and classifies visitors based on their interests or industry. The goal is to dynamically generate questions and multiple-choice options that help categorize users visiting the site.
ajax antd axios babel flask html-css-javascript jest linux node nodejs postgres python react redis redux render wsl
Last synced: 3 days ago
- Host: GitHub
- URL: https://github.com/behi22/webscraper
- Owner: behi22
- Created: 2024-10-30T10:08:25.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-13T16:49:03.000Z (about 1 year ago)
- Last Synced: 2025-02-13T17:48:46.604Z (about 1 year ago)
- Topics: ajax, antd, axios, babel, flask, html-css-javascript, jest, linux, node, nodejs, postgres, python, react, redis, redux, render, wsl
- Language: JavaScript
- Homepage: https://webscraper-behbod-babai.vercel.app/
- Size: 129 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# WebScraper
> Web Scraper for Visitor Classification.
## Table of Contents
- [General Info](#general-information)
- [Technologies Used](#technologies-used)
- [Screenshots](#screenshots)
- [Usage](#usage)
- [Project Status](#project-status)
- [Room for Improvement](#room-for-improvement)
- [Acknowledgements](#acknowledgements)
- [Contact](#contact)
## General Information
A Single Page Application that takes a website URL as input, scrapes its content, and classifies visitors based on their interests or industry. The goal is to dynamically generate questions and multiple-choice options that help categorize users visiting the site.
- Project Presentation: [Loom Video](https://www.loom.com/share/84359f4e50ee41fc97b21f921d9aa4a0?sid=a69fe144-e797-46e1-8a23-09a507a4b98a)
- The most up-to-date, deployed frontend repo is at: https://github.com/behi22/visitor-classifier-frontend
- The most up-to-date, deployed backend repo is at: https://github.com/behi22/visitor-classifier
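
The flow described above (take a page, extract its text, derive a question with multiple-choice options) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code in the repository: the names `TextExtractor`, `extract_text`, `top_keywords`, and `build_question` are hypothetical, the keyword-frequency heuristic is an assumption, and fetching the page over HTTP is left out so the sketch runs offline.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Hypothetical stopword list; a real one would be much longer.
STOPWORDS = {"with", "that", "this", "from", "your", "have", "will", "more"}


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def top_keywords(text, limit=4):
    """Return the most frequent words of four or more letters."""
    words = re.findall(r"[a-z]{4,}", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(limit)]


def build_question(html, limit=4):
    """Turn page keywords into one multiple-choice question."""
    options = top_keywords(extract_text(html), limit)
    return {
        "question": "Which of these topics interests you most?",
        "options": options + ["Other"],
    }
```

Calling `build_question(page_html)` returns a dictionary with a `question` string and an `options` list. Frequency counting only surfaces the page's dominant words, so it serves as a baseline here; it does not attempt the AI-based generation the project aims for.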
## Technologies Used
- npm - 8.15.0
- React.js - 18.3.1
- Redux - 9.1.2
- antd - 5.22.2
- HTML5
- CSS
- Babel
- Axios
- AJAX
- Git - 2.38.1.windows.1
- GitHub
- Linux
- WSL
- Python
- Flask
- PostgreSQL
- Vercel
- Redis
- Render
## Screenshots

## Usage
The app is built around the following components:
- **Frontend** - A clean, user-friendly, component-based frontend, built with React and deployed on Vercel
- **Backend API** - A Python-based API that handles web scraping, data extraction, and AI-based content generation, deployed on Render
- **Storage** - A PostgreSQL database for storage, hosted on Render
- **Caching** - Redis for caching, hosted on Redis Cloud
- **Integration** - Effective integration of the frontend and backend components
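
The list above does not say what the cache stores. One common arrangement, assumed here and not confirmed by the repository, is a cache-aside pattern: scrape results are stored under a key derived from the URL with a time-to-live, so repeated requests for the same site skip the scrape. In the sketch below, `InMemoryCache` and `cached_scrape` are hypothetical names. The stand-in cache exposes `get` and `setex` methods with the same call shape as the redis-py client, so the example runs without a Redis server.

```python
import json
import time


class InMemoryCache:
    """Tiny stand-in with get/setex so the sketch runs without Redis."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry has outlived its TTL; drop it and report a miss.
            del self._store[key]
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def cached_scrape(url, cache, scrape, ttl_seconds=3600):
    """Return the cached result for url, scraping only on a cache miss."""
    key = f"scrape:{url}"  # hypothetical key scheme
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)
    result = scrape(url)
    cache.setex(key, ttl_seconds, json.dumps(result))
    return result
```

Passing the `scrape` function in as an argument keeps the caching logic independent of how pages are fetched, which also makes it easy to exercise with a fake scraper.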
## Project Status
Project is: Semi-Complete (Demo)
## Room for Improvement
- As noted in the comments in [Home.js](/visitor-classifier-frontend/src/pages/Home.js), the answers to each question are not currently submitted anywhere, and that logic could be developed further.
- The question-generation script in [app.py](/visitor-classifier/app.py) is still very primitive. With more time and resources, it could be extended to generate more meaningful questions.
- The Missing Answers StyledParagraph in [Home.js](/visitor-classifier-frontend/src/pages/Home.js) stays visible after submitting partial answers and then changing the URL. This bug still needs debugging.
## Acknowledgements
- Many thanks to Brave Career for including me in their Software Engineer assessment project.
## Contact
Created by Behbod Babai. Feel free to contact me by email: behibabai@gmail.com