= spring-cloud-stream-kafka-elasticsearch
The goal of this project is to implement a "News" processing pipeline composed of five https://docs.spring.io/spring-boot/index.html[`Spring Boot`] applications: `producer-api`, `categorizer-service`, `collector-service`, `publisher-api` and `news-client`.
== Proof-of-Concepts & Articles
On https://ivangfr.github.io[ivangfr.github.io], I have compiled my Proof-of-Concepts (PoCs) and articles. You can easily search for the technology you are interested in by using the filter. Who knows, perhaps I have already implemented a PoC or written an article about what you are looking for.
== Additional Readings
* [Medium]: https://medium.com/@ivangfr/implementing-a-kafka-producer-and-consumer-using-spring-cloud-stream-d4b9a6a9eab1[**Implementing a Kafka Producer and Consumer using Spring Cloud Stream**]
* [Medium]: https://medium.com/@ivangfr/implementing-unit-tests-for-a-kafka-producer-and-consumer-that-uses-spring-cloud-stream-f7a98a89fcf2[**Implementing Unit Tests for a Kafka Producer and Consumer that uses Spring Cloud Stream**]
* [Medium]: https://medium.com/@ivangfr/implementing-end-to-end-testing-for-a-kafka-producer-and-consumer-that-uses-spring-cloud-stream-fbf5e666899e[**Implementing End-to-End testing for a Kafka Producer and Consumer that uses Spring Cloud Stream**]
* [Medium]: https://medium.com/@ivangfr/configuring-distributed-tracing-with-zipkin-in-a-kafka-producer-and-consumer-that-uses-spring-cloud-9f1e55468b9e[**Configuring Distributed Tracing with Zipkin in a Kafka Producer and Consumer that uses Spring Cloud Stream**]
* [Medium]: https://medium.com/@ivangfr/using-cloudevents-in-a-kafka-producer-and-consumer-that-uses-spring-cloud-stream-9c51670b5566[**Using CloudEvents in a Kafka Producer and Consumer that uses Spring Cloud Stream**]

== Technologies used
* https://docs.spring.io/spring-cloud-stream/docs/current/reference/html/[`Spring Cloud Stream`] to build highly scalable event-driven applications connected with shared messaging systems;
* https://docs.spring.io/spring-cloud-schema-registry/docs/current/reference/html/spring-cloud-schema-registry.html[`Spring Cloud Schema Registry`], which supports schema evolution so that data can evolve over time; it also stores schema information in a textual format (typically JSON) and makes it accessible to the applications that need it to send and receive data in binary format;
* https://docs.spring.io/spring-data/elasticsearch/reference/[`Spring Data Elasticsearch`] to persist data in https://www.elastic.co/elasticsearch[`Elasticsearch`];
* https://docs.spring.io/spring-cloud-openfeign/docs/current/reference/html/[`Spring Cloud OpenFeign`] to write web service clients easily;
* https://www.thymeleaf.org/[`Thymeleaf`] as the HTML template engine;
* https://zipkin.io[`Zipkin`] to visualize traces between and within applications;
* https://github.com/Netflix/eureka[`Eureka`] as service registration and discovery.

NOTE: The https://github.com/ivangfr/docker-swarm-environment[`docker-swarm-environment`] repository shows how to deploy this project into a cluster of Docker Engines in swarm mode.
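To give a flavor of the functional programming model `Spring Cloud Stream` supports, below is a minimal sketch of a categorization processor. The `NewsEvent` shape, the binding names, and the category logic are illustrative assumptions, not the project's actual code:

[source,java]
----
import java.util.function.Function;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// Hypothetical event type; the real project defines its events in the commons-news module.
record NewsEvent(String id, String title, String category) {}

@Configuration
class CategorizerConfig {

    // Spring Cloud Stream binds this Function to Kafka destinations via properties, e.g.:
    //   spring.cloud.stream.bindings.categorize-in-0.destination=producer.news
    //   spring.cloud.stream.bindings.categorize-out-0.destination=categorizer.news
    @Bean
    Function<NewsEvent, NewsEvent> categorize() {
        return news -> new NewsEvent(news.id(), news.title(), deriveCategory(news.title()));
    }

    private String deriveCategory(String title) {
        return title.toLowerCase().contains("sport") ? "SPORTS" : "GENERAL";
    }
}
----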
== Project Architecture
image::documentation/project-diagram.jpeg[]
== Applications
* *producer-api*
+
`Spring Boot` Web Java application that creates news and pushes news events to `producer.news` topic in `Kafka`.
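+
A hedged sketch of the publishing side using `StreamBridge`; the binding name and event shape are assumptions:
+
[source,java]
----
import org.springframework.cloud.stream.function.StreamBridge;
import org.springframework.stereotype.Service;

@Service
class NewsProducer {

    private final StreamBridge streamBridge;

    NewsProducer(StreamBridge streamBridge) {
        this.streamBridge = streamBridge;
    }

    // "producer-out-0" is assumed to be mapped to the producer.news destination:
    //   spring.cloud.stream.bindings.producer-out-0.destination=producer.news
    void publish(NewsEvent news) {
        streamBridge.send("producer-out-0", news);
    }
}
----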
* *categorizer-service*
+
`Spring Boot` Web Java application that listens to news events in `producer.news` topic in `Kafka`, categorizes them, and pushes them to `categorizer.news` topic.

* *collector-service*
+
`Spring Boot` Web Java application that listens for news events in `categorizer.news` topic in `Kafka`, saves them in `Elasticsearch`, and pushes them on to `collector.news` topic.
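+
A hedged sketch of persisting news with `Spring Data Elasticsearch`; the index and field names are assumptions:
+
[source,java]
----
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

// Hypothetical document; the real mapping can be inspected via the Elasticsearch REST API below.
@Document(indexName = "news")
class News {
    @Id
    private String id;
    private String title;
    private String category;
    // getters and setters omitted for brevity
}

// Spring Data generates CRUD operations such as save() and findAll() for this interface.
interface NewsRepository extends ElasticsearchRepository<News, String> {
}
----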
* *publisher-api*
+
`Spring Boot` Web Java application that reads directly from `Elasticsearch` and exposes a REST API; it doesn't listen to `Kafka`.
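+
A hedged sketch of the read-only REST side; the endpoint path is an assumption, and the real API is visible in the Swagger UI listed under Applications URLs:
+
[source,java]
----
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
class NewsController {

    private final NewsRepository newsRepository;

    NewsController(NewsRepository newsRepository) {
        this.newsRepository = newsRepository;
    }

    // Hypothetical endpoint reading straight from Elasticsearch through the repository.
    @GetMapping("/api/news")
    Iterable<News> findAll() {
        return newsRepository.findAll();
    }
}
----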
* *news-client*
+
`Spring Boot` Web Java application that provides a user interface for reading the news. It implements a `WebSocket` that consumes news events from the `collector.news` topic, so news items are updated on the fly on the main page. Besides, `news-client` communicates directly with `publisher-api` whenever a search for specific news or a news update is needed.
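+
A hedged sketch of the `WebSocket` fan-out; the binding name and STOMP destination are assumptions:
+
[source,java]
----
import java.util.function.Consumer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.messaging.simp.SimpMessagingTemplate;

@Configuration
class NewsWebSocketForwarder {

    private final SimpMessagingTemplate messagingTemplate;

    NewsWebSocketForwarder(SimpMessagingTemplate messagingTemplate) {
        this.messagingTemplate = messagingTemplate;
    }

    // Assumed binding: spring.cloud.stream.bindings.newsUpdate-in-0.destination=collector.news
    @Bean
    Consumer<NewsEvent> newsUpdate() {
        // Push each consumed event to browsers subscribed to a hypothetical /topic/news destination.
        return news -> messagingTemplate.convertAndSend("/topic/news", news);
    }
}
----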
+
The `WebSocket` operation is shown in the short gif below: a news item created in `producer-api` immediately shows up in `news-client`.
+
image::documentation/websocket-operation.gif[]

== Prerequisites
* https://www.oracle.com/java/technologies/downloads/#java21[`Java 21+`]
* Some containerization tool: https://www.docker.com[`Docker`], https://podman.io[`Podman`], etc.

== Generate NewsEvent
* Open a terminal and navigate to the `spring-cloud-stream-kafka-elasticsearch` root folder;
* Run the following command to generate `NewsEvent`:
+
[source]
----
./mvnw clean install --projects commons-news
----
+
It will install `commons-news-1.0.0.jar` in your local `Maven` repository, so that it is visible to all services.

== Start Environment
* In a terminal, navigate to the `spring-cloud-stream-kafka-elasticsearch` root folder, and run:
+
[source]
----
docker compose up -d
----

* Wait for Docker containers to be up and running. To check it, run:
+
[source]
----
docker ps -a
----

== Running Applications with Maven
Inside the `spring-cloud-stream-kafka-elasticsearch` root folder, run the following `Maven` commands in different terminals:
* *eureka-server*
+
[source]
----
./mvnw clean spring-boot:run --projects eureka-server
----

* *producer-api*
+
[source]
----
./mvnw clean spring-boot:run --projects producer-api -Dspring-boot.run.jvmArguments="-Dserver.port=9080"
----

* *categorizer-service*
+
[source]
----
./mvnw clean spring-boot:run --projects categorizer-service -Dspring-boot.run.jvmArguments="-Dserver.port=9081"
----

* *collector-service*
+
[source]
----
./mvnw clean spring-boot:run --projects collector-service -Dspring-boot.run.jvmArguments="-Dserver.port=9082"
----

* *publisher-api*
+
[source]
----
./mvnw clean spring-boot:run --projects publisher-api -Dspring-boot.run.jvmArguments="-Dserver.port=9083"
----

* *news-client*
+
[source]
----
./mvnw clean spring-boot:run --projects news-client
----

== Running Applications as Docker containers
=== Build Applications' Docker Images
* In a terminal, make sure you are in the `spring-cloud-stream-kafka-elasticsearch` root folder;
* To build the applications' Docker images, run the following script:
+
[source]
----
./build-docker-images.sh
----
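The script presumably builds one image per application. Since https://github.com/GoogleContainerTools/jib[`Jib`] is among the project's topics, a single module's image can typically also be built with the Jib Maven plugin directly (a hedged example; the exact plugin configuration lives in the project's POMs):

[source]
----
./mvnw clean compile jib:dockerBuild --projects producer-api
----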
=== Applications' Environment Variables

* *producer-api*
+
|===
|Environment Variable |Description

|`KAFKA_HOST`
|Specify host of the `Kafka` message broker to use (default `localhost`)

|`KAFKA_PORT`
|Specify port of the `Kafka` message broker to use (default `29092`)

|`SCHEMA_REGISTRY_HOST`
|Specify host of the `Schema Registry` to use (default `localhost`)

|`SCHEMA_REGISTRY_PORT`
|Specify port of the `Schema Registry` to use (default `8081`)

|`EUREKA_HOST`
|Specify host of the `Eureka` service discovery to use (default `localhost`)

|`EUREKA_PORT`
|Specify port of the `Eureka` service discovery to use (default `8761`)

|`ZIPKIN_HOST`
|Specify host of the `Zipkin` distributed tracing system to use (default `localhost`)

|`ZIPKIN_PORT`
|Specify port of the `Zipkin` distributed tracing system to use (default `9411`)
|===
* *categorizer-service*
+
|===
|Environment Variable |Description

|`KAFKA_HOST`
|Specify host of the `Kafka` message broker to use (default `localhost`)

|`KAFKA_PORT`
|Specify port of the `Kafka` message broker to use (default `29092`)

|`SCHEMA_REGISTRY_HOST`
|Specify host of the `Schema Registry` to use (default `localhost`)

|`SCHEMA_REGISTRY_PORT`
|Specify port of the `Schema Registry` to use (default `8081`)

|`EUREKA_HOST`
|Specify host of the `Eureka` service discovery to use (default `localhost`)

|`EUREKA_PORT`
|Specify port of the `Eureka` service discovery to use (default `8761`)

|`ZIPKIN_HOST`
|Specify host of the `Zipkin` distributed tracing system to use (default `localhost`)

|`ZIPKIN_PORT`
|Specify port of the `Zipkin` distributed tracing system to use (default `9411`)
|===
* *collector-service*
+
|===
|Environment Variable |Description

|`ELASTICSEARCH_HOST`
|Specify host of the `Elasticsearch` search engine to use (default `localhost`)

|`ELASTICSEARCH_NODES_PORT`
|Specify nodes port of the `Elasticsearch` search engine to use (default `9300`)

|`ELASTICSEARCH_REST_PORT`
|Specify REST port of the `Elasticsearch` search engine to use (default `9200`)

|`KAFKA_HOST`
|Specify host of the `Kafka` message broker to use (default `localhost`)

|`KAFKA_PORT`
|Specify port of the `Kafka` message broker to use (default `29092`)

|`SCHEMA_REGISTRY_HOST`
|Specify host of the `Schema Registry` to use (default `localhost`)

|`SCHEMA_REGISTRY_PORT`
|Specify port of the `Schema Registry` to use (default `8081`)

|`EUREKA_HOST`
|Specify host of the `Eureka` service discovery to use (default `localhost`)

|`EUREKA_PORT`
|Specify port of the `Eureka` service discovery to use (default `8761`)

|`ZIPKIN_HOST`
|Specify host of the `Zipkin` distributed tracing system to use (default `localhost`)

|`ZIPKIN_PORT`
|Specify port of the `Zipkin` distributed tracing system to use (default `9411`)
|===
* *publisher-api*
+
|===
|Environment Variable |Description

|`ELASTICSEARCH_HOST`
|Specify host of the `Elasticsearch` search engine to use (default `localhost`)

|`ELASTICSEARCH_NODES_PORT`
|Specify nodes port of the `Elasticsearch` search engine to use (default `9300`)

|`ELASTICSEARCH_REST_PORT`
|Specify REST port of the `Elasticsearch` search engine to use (default `9200`)

|`EUREKA_HOST`
|Specify host of the `Eureka` service discovery to use (default `localhost`)

|`EUREKA_PORT`
|Specify port of the `Eureka` service discovery to use (default `8761`)

|`ZIPKIN_HOST`
|Specify host of the `Zipkin` distributed tracing system to use (default `localhost`)

|`ZIPKIN_PORT`
|Specify port of the `Zipkin` distributed tracing system to use (default `9411`)
|===
* *news-client*
+
|===
|Environment Variable |Description

|`KAFKA_HOST`
|Specify host of the `Kafka` message broker to use (default `localhost`)

|`KAFKA_PORT`
|Specify port of the `Kafka` message broker to use (default `29092`)

|`SCHEMA_REGISTRY_HOST`
|Specify host of the `Schema Registry` to use (default `localhost`)

|`SCHEMA_REGISTRY_PORT`
|Specify port of the `Schema Registry` to use (default `8081`)

|`EUREKA_HOST`
|Specify host of the `Eureka` service discovery to use (default `localhost`)

|`EUREKA_PORT`
|Specify port of the `Eureka` service discovery to use (default `8761`)

|`ZIPKIN_HOST`
|Specify host of the `Zipkin` distributed tracing system to use (default `localhost`)

|`ZIPKIN_PORT`
|Specify port of the `Zipkin` distributed tracing system to use (default `9411`)
|===
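For example, to run one of the containers with overridden defaults (the image name, tag, and port mapping below are assumptions; use whatever `build-docker-images.sh` produced):

[source]
----
# image name/tag and port mapping are hypothetical
docker run --rm -p 9080:9080 \
  -e KAFKA_HOST=kafka -e KAFKA_PORT=9092 \
  -e SCHEMA_REGISTRY_HOST=schema-registry \
  -e EUREKA_HOST=eureka -e ZIPKIN_HOST=zipkin \
  ivanfranchin/producer-api:1.0.0
----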
=== Run Applications' Docker Containers
* In a terminal, make sure you are inside the `spring-cloud-stream-kafka-elasticsearch` root folder;
* Run the following script:
+
[source]
----
./start-apps.sh
----

== Applications URLs
|===
|Application |URL

|producer-api
|http://localhost:9080/swagger-ui.html

|publisher-api
|http://localhost:9083/swagger-ui.html

|news-client
|http://localhost:8080
|===
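To push a news item through the pipeline, you can call `producer-api`; the concrete endpoint and payload are documented in its Swagger UI, so the path and fields below are illustrative assumptions:

[source]
----
# hypothetical path and payload; check http://localhost:9080/swagger-ui.html for the real contract
curl -X POST localhost:9080/api/news \
  -H 'Content-Type: application/json' \
  -d '{"title": "Spring Cloud Stream ships a new release"}'
----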
== Useful links
* *Eureka*
+
`Eureka` can be accessed at http://localhost:8761
+
image::documentation/eureka.jpg[]

* *Zipkin*
+
`Zipkin` can be accessed at http://localhost:9411
+
image::documentation/zipkin.jpg[]

* *Kafka Topics UI*
+
`Kafka Topics UI` can be accessed at http://localhost:8085

* *Kafka Manager*
+
`Kafka Manager` can be accessed at http://localhost:9001
+
_Configuration_
+
- First, you must create a new cluster. Click on `Cluster` (dropdown button in the header) and then on `Add Cluster`;
- Type the name of your cluster in the `Cluster Name` field, for example: `MyCluster`;
- Type `zookeeper:2181` in the `Cluster Zookeeper Hosts` field;
- Enable the checkbox `Poll consumer information (Not recommended for large # of consumers if ZK is used for offsets tracking on older Kafka versions)`;
- Click on the `Save` button at the bottom of the page.

* *Schema Registry UI*
+
`Schema Registry UI` can be accessed at http://localhost:8001

* *Elasticsearch REST API*
+
Check that `Elasticsearch` is up and running:
+
[source]
----
curl localhost:9200
----
+
Check indexes
+
[source]
----
curl "localhost:9200/_cat/indices?v"
----
+
Check _news_ index mapping
+
[source]
----
curl "localhost:9200/news/_mapping?pretty"
----
+
Simple search
+
[source]
----
curl "localhost:9200/news/_search?pretty"
----
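+
A query-string search can filter results, e.g. by words in the title (a hedged example; the field name depends on the actual `news` mapping checked above, and the search term is arbitrary):
+
[source]
----
curl "localhost:9200/news/_search?q=title:kafka&pretty"
----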
+
Delete _news_ index
+
[source]
----
curl -X DELETE localhost:9200/news
----

== Shutdown
* To stop applications:
** If they were started with `Maven`, go to the terminals where they are running and press `Ctrl+C`;
** If they were started as Docker containers, in a terminal and inside the `spring-cloud-stream-kafka-elasticsearch` root folder, run the script below:
+
[source]
----
./stop-apps.sh
----

* To stop and remove the Docker Compose containers, network and volumes, in a terminal, navigate to the `spring-cloud-stream-kafka-elasticsearch` root folder and run the following command:
+
[source]
----
docker compose down -v
----

== Cleanup
To remove the Docker images created by this project, in a terminal and inside the `spring-cloud-stream-kafka-elasticsearch` root folder, run the script below:
[source]
----
./remove-docker-images.sh
----