https://github.com/streamlined2/webcrawler
Web crawler application that collects domain statistical information and saves it to database
dao-layer freemarker front-controller-pattern heroku-deployment heroku-maven-plugin httpclient java-17 jetty-server jpa-hibernate jsoup-library lombok mvc-pattern postgresql service-layer servlet
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/streamlined2/webcrawler
- Owner: streamlined2
- Created: 2021-12-30T21:15:45.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2022-01-03T10:37:38.000Z (over 4 years ago)
- Last Synced: 2026-01-02T13:53:30.577Z (5 months ago)
- Topics: dao-layer, freemarker, front-controller-pattern, heroku-deployment, heroku-maven-plugin, httpclient, java-17, jetty-server, jpa-hibernate, jsoup-library, lombok, mvc-pattern, postgresql, service-layer, servlet
- Language: Java
- Homepage:
- Size: 44.9 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# webCrawler
A simple web crawler that collects statistical information about a domain and saves it to a database.
The application is deployed to Heroku:
https://very-simple-web-crawler.herokuapp.com/crawler
Database setup script: webCrawler/setup.sql
- Java 17
- JEE front controller servlet
- MVC pattern
- Service and DAO layers
- Programmatic JPA transaction management
- JPA/Lombok annotations for entity classes
- Embedded Jetty server
- PostgreSQL 13.4
- Freemarker templates
- Java HttpClient to fetch resources
- Jsoup to analyze HTML
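
The fetch-and-analyze part of the stack above can be sketched roughly as follows. This is not the project's code: the README does not say which statistics are collected, so counting how often each host appears in a page's links is an assumption made for illustration. The class and method names (`DomainStatsSketch`, `fetch`, `countHosts`) are hypothetical, and a naive regular expression stands in for Jsoup so that the example needs only the JDK.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not taken from the repository.
final class DomainStatsSketch {

    // Naive matcher for double-quoted href attributes.
    // A stand-in for Jsoup's link selection; it will miss unusual markup.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Fetches a page body as text with the JDK's built-in HttpClient.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    // Counts how many absolute links point at each host (assumed statistic).
    static Map<String, Long> countHosts(String html) {
        Map<String, Long> counts = new TreeMap<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            String host;
            try {
                host = URI.create(matcher.group(1)).getHost();
            } catch (IllegalArgumentException e) {
                continue; // skip links that are not valid URIs
            }
            if (host != null) { // relative links have no host
                counts.merge(host.toLowerCase(Locale.ROOT), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // With a URL argument the page is fetched; otherwise a small sample is used.
        String html = args.length > 0
                ? fetch(args[0])
                : "<a href=\"https://example.com/a\">a</a>"
                + "<a href=\"https://example.com/b\">b</a>"
                + "<a href=\"https://other.org/\">c</a>";
        countHosts(html).forEach((host, n) -> System.out.println(host + " " + n));
    }
}
```

In the real application Jsoup would replace the regular expression, and the resulting counts would be handed to the service and DAO layers for persistence rather than printed.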