{"id":18725560,"url":"https://github.com/reljicd/spring-boot-web-scraper","last_synced_at":"2025-04-12T16:12:01.068Z","repository":{"id":94100537,"uuid":"108723738","full_name":"reljicd/spring-boot-web-scraper","owner":"reljicd","description":"Simple web scrapping app made using Spring Boot + Thymeleaf + Jsoup + Java 8 Lambdas \u0026 Streams","archived":false,"fork":false,"pushed_at":"2018-03-26T12:06:23.000Z","size":100,"stargazers_count":36,"open_issues_count":0,"forks_count":24,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-26T10:36:26.158Z","etag":null,"topics":["docker","docker-compose","functional-programming","h2","h2-database","java","java-8","java-lambda","java-streams","jsoup","lambda","scraper","spring","spring-boot","spring-data-jpa","spring-mvc","spring-security","stream","thymeleaf","web-scraping"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/reljicd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-10-29T10:15:00.000Z","updated_at":"2025-01-12T11:53:43.000Z","dependencies_parsed_at":"2023-03-13T17:07:36.238Z","dependency_job_id":null,"html_url":"https://github.com/reljicd/spring-boot-web-scraper","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reljicd%2Fspring-boot-web-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reljicd%2Fspring-boot-web-scraper/tags","releases_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reljicd%2Fspring-boot-web-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reljicd%2Fspring-boot-web-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/reljicd","download_url":"https://codeload.github.com/reljicd/spring-boot-web-scraper/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248594139,"owners_count":21130312,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","docker-compose","functional-programming","h2","h2-database","java","java-8","java-lambda","java-streams","jsoup","lambda","scraper","spring","spring-boot","spring-data-jpa","spring-mvc","spring-security","stream","thymeleaf","web-scraping"],"created_at":"2024-11-07T14:10:47.518Z","updated_at":"2025-04-12T16:12:01.062Z","avatar_url":"https://github.com/reljicd.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Spring Boot Web Scraper\n\n## About\n\nThis is a demo project. The idea was to build a basic web scraping app.\n\nIt was made using **Spring Boot**, **Spring Security**, **Thymeleaf**, **Spring Data JPA**, **Spring Data REST** and **Docker**. The database is an in-memory **H2** instance.\n\nLogin and registration functionality is included.\n\nUsers can add web links to their profile and tag them. 
Links are also suggested based on data scraped from the website at the given link.\n\n## Configuration\n\n### Configuration Files\n\nThe folder **src/main/resources/** contains the configuration files for the **web-scraper** Spring Boot application.\n\n* **src/main/resources/application.properties** - the main configuration file. Here you can change the admin username and password,\nas well as the port number.\n\n## How to run\n\nThere are several ways to run the application. You can run it from the command line with the included Maven Wrapper, Maven, or Docker.\n\nOnce the app starts, open a web browser and visit `http://localhost:8090/home`\n\nAdmin username: **admin**\n\nAdmin password: **admin**\n\nUser username: **user**\n\nUser password: **password**\n\n### Maven Wrapper\n\n#### Using the Maven Plugin\n\nGo to the root folder of the application and type:\n```bash\n$ chmod +x scripts/mvnw\n$ scripts/mvnw spring-boot:run\n```\n\n#### Using Executable Jar\n\nAlternatively, build the JAR file with:\n```bash\n$ scripts/mvnw clean package\n```\n\nThen you can run the JAR file:\n```bash\n$ java -jar target/web-scraper-0.0.1-SNAPSHOT.jar\n```\n\n### Maven\n\nOpen a terminal and run the following commands to ensure that you have valid versions of Java and Maven installed:\n\n```bash\n$ java -version\njava version \"1.8.0_102\"\nJava(TM) SE Runtime Environment (build 1.8.0_102-b14)\nJava HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)\n```\n\n```bash\n$ mvn -v\nApache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00)\nMaven home: /usr/local/Cellar/maven/3.3.9/libexec\nJava version: 1.8.0_102, vendor: Oracle Corporation\n```\n\n#### Using the Maven Plugin\n\nThe Spring Boot Maven plugin includes a `run` goal that can be used to quickly compile and run your application. \nApplications run in an exploded form, as they do in your IDE. 
\nThe following example shows a typical Maven command to run a Spring Boot application:\n \n```bash\n$ mvn spring-boot:run\n``` \n\n#### Using Executable Jar\n\nTo create an executable jar, run:\n\n```bash\n$ mvn clean package\n``` \n\nTo run the application, use the `java -jar` command, as follows:\n\n```bash\n$ java -jar target/web-scraper-0.0.1-SNAPSHOT.jar\n```\n\nTo exit the application, press **Ctrl-C**.\n\n### Docker\n\nIt is also possible to run **web-scraper** using Docker.\n\nBuild the Docker image:\n```bash\n$ mvn clean package\n$ docker build -t web-scraper:dev -f docker/Dockerfile .\n```\n\nRun the Docker container:\n```bash\n$ docker run --rm -i -p 8090:8090 \\\n      --name web-scraper \\\n      web-scraper:dev\n```\n\n#### Helper script\n\nIt is possible to run all of the above with the helper script:\n\n```bash\n$ chmod +x scripts/run_docker.sh\n$ scripts/run_docker.sh\n```\n\n## Docker\n\nThe **docker** folder contains:\n\n* **docker/Dockerfile** - Docker build file for the web-scraper Docker image. 
\nIt contains instructions to build the artifacts, copy them into the Docker image, and then run the app on the proper port with the proper configuration file.\n\n## Util Scripts\n\n* **scripts/run_docker.sh** - utility script for running the web-scraper Docker container using **docker/Dockerfile**\n\n## Tests\n\nTests can be run by executing the following command from the root of the project:\n\n```bash\n$ mvn test\n```\n\n## Helper Tools\n\n### HAL REST Browser\n\nOpen a web browser and visit `http://localhost:8090/`\n\nYou need to be authenticated to see this page.\n\n### H2 Database web interface\n\nOpen a web browser and visit `http://localhost:8090/h2-console`\n\nIn the **JDBC URL** field, enter:\n```\njdbc:h2:mem:web_scraper_db\n```\n\nIn the `src/main/resources/application.properties` file it is possible to change both\nthe web interface URL path and the datasource URL.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freljicd%2Fspring-boot-web-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Freljicd%2Fspring-boot-web-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freljicd%2Fspring-boot-web-scraper/lists"}