
= spring-cloud-stream-kafka-elasticsearch

The goal of this project is to implement a "News" processing pipeline composed of five https://docs.spring.io/spring-boot/index.html[`Spring Boot`] applications: `producer-api`, `categorizer-service`, `collector-service`, `publisher-api` and `news-client`.

== Proof-of-Concepts & Articles

On https://ivangfr.github.io[ivangfr.github.io], I have compiled my Proof-of-Concepts (PoCs) and articles. You can easily search for the technology you are interested in by using the filter. Who knows, perhaps I have already implemented a PoC or written an article about what you are looking for.

== Additional Readings

* [Medium]: https://medium.com/@ivangfr/implementing-a-kafka-producer-and-consumer-using-spring-cloud-stream-d4b9a6a9eab1[**Implementing a Kafka Producer and Consumer using Spring Cloud Stream**]
* [Medium]: https://medium.com/@ivangfr/implementing-unit-tests-for-a-kafka-producer-and-consumer-that-uses-spring-cloud-stream-f7a98a89fcf2[**Implementing Unit Tests for a Kafka Producer and Consumer that uses Spring Cloud Stream**]
* [Medium]: https://medium.com/@ivangfr/implementing-end-to-end-testing-for-a-kafka-producer-and-consumer-that-uses-spring-cloud-stream-fbf5e666899e[**Implementing End-to-End testing for a Kafka Producer and Consumer that uses Spring Cloud Stream**]
* [Medium]: https://medium.com/@ivangfr/configuring-distributed-tracing-with-zipkin-in-a-kafka-producer-and-consumer-that-uses-spring-cloud-9f1e55468b9e[**Configuring Distributed Tracing with Zipkin in a Kafka Producer and Consumer that uses Spring Cloud Stream**]
* [Medium]: https://medium.com/@ivangfr/using-cloudevents-in-a-kafka-producer-and-consumer-that-uses-spring-cloud-stream-9c51670b5566[**Using CloudEvents in a Kafka Producer and Consumer that uses Spring Cloud Stream**]

== Technologies used

* https://docs.spring.io/spring-cloud-stream/docs/current/reference/html/[`Spring Cloud Stream`] to build highly scalable, event-driven applications connected through shared messaging systems (see the sketch after this list);
* https://docs.spring.io/spring-cloud-schema-registry/docs/current/reference/html/spring-cloud-schema-registry.html[`Spring Cloud Schema Registry`] to support schema evolution, so that data formats can evolve over time; it stores schema information in a textual format (typically JSON) and makes it accessible to the applications that need it to send and receive data in binary format;
* https://docs.spring.io/spring-data/elasticsearch/reference/[`Spring Data Elasticsearch`] to persist data in https://www.elastic.co/elasticsearch[`Elasticsearch`];
* https://docs.spring.io/spring-cloud-openfeign/docs/current/reference/html/[`Spring Cloud OpenFeign`] to write web service clients easily;
* https://www.thymeleaf.org/[`Thymeleaf`] as the HTML template engine;
* https://zipkin.io[`Zipkin`] to visualize traces between and within applications;
* https://github.com/Netflix/eureka[`Eureka`] for service registration and discovery.
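
As a rough, hedged illustration of the `Spring Cloud Stream` functional model used across these services, the sketch below shows a consume-process-produce function similar in spirit to what `categorizer-service` does. The class, payload, and binding names are hypothetical, not taken from this project's source:

[source,java]
----
import java.util.function.Function;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CategorizerConfig {

    // Hypothetical payload; the real project ships a NewsEvent class in the
    // commons-news module, presumably generated from a schema.
    public record NewsEvent(String id, String title, String category) {}

    // Spring Cloud Stream binds a java.util.function.Function bean to an input
    // and an output destination, e.g. with:
    //   spring.cloud.stream.bindings.categorize-in-0.destination=producer.news
    //   spring.cloud.stream.bindings.categorize-out-0.destination=categorizer.news
    // Every record consumed from the input topic is transformed, and the
    // result is published to the output topic.
    @Bean
    public Function<NewsEvent, NewsEvent> categorize() {
        return news -> new NewsEvent(news.id(), news.title(), resolveCategory(news));
    }

    private String resolveCategory(NewsEvent news) {
        // Placeholder categorization logic, just for the sketch.
        return news.title().toLowerCase().contains("sport") ? "SPORT" : "GENERAL";
    }
}
----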

NOTE: The https://github.com/ivangfr/docker-swarm-environment[`docker-swarm-environment`] repository shows how to deploy this project to a cluster of Docker Engines in swarm mode.

== Project Architecture

image::documentation/project-diagram.jpeg[]

== Applications

* *producer-api*
+
`Spring Boot` Web Java application that creates news and pushes news events to the `producer.news` topic in `Kafka`.

* *categorizer-service*
+
`Spring Boot` Web Java application that listens for news events on the `producer.news` topic in `Kafka`, categorizes them, and pushes them to the `categorizer.news` topic.

* *collector-service*
+
`Spring Boot` Web Java application that listens for news events on the `categorizer.news` topic in `Kafka`, saves them in `Elasticsearch`, and pushes them to the `collector.news` topic.

* *publisher-api*
+
`Spring Boot` Web Java application that reads directly from `Elasticsearch` and exposes a REST API. It doesn't listen to `Kafka`.

* *news-client*
+
`Spring Boot` Web Java application that provides a user interface for reading the news. It implements a `Websocket` that consumes news events from the `collector.news` topic, so news is updated on the main page on the fly. Besides, `news-client` communicates directly with `publisher-api` whenever a search for specific news or a news update is needed (a sketch of such a client follows this list).
+
The `Websocket` operation is shown in the short gif below: a news item is created in `producer-api` and immediately appears in `news-client`.
+
image::documentation/websocket-operation.gif[]
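
As a hedged illustration of the direct `news-client` -> `publisher-api` communication mentioned above, a minimal `Spring Cloud OpenFeign` client could look like the sketch below; the interface, DTO, and endpoint are hypothetical, not copied from the project:

[source,java]
----
import java.util.List;

import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;

// With a discovery client (Eureka) on the classpath, Feign resolves the
// "publisher-api" service name through service discovery instead of a
// hard-coded host and port. Requires @EnableFeignClients in a configuration
// class.
@FeignClient(name = "publisher-api")
public interface PublisherApiClient {

    // Hypothetical DTO mirroring what publisher-api might return.
    record News(String id, String title, String category) {}

    // Hypothetical endpoint: search news by a free-text query.
    @GetMapping("/api/news")
    List<News> searchNews(@RequestParam("query") String query);
}
----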

== Prerequisites

* https://www.oracle.com/java/technologies/downloads/#java21[`Java 21+`]
* A containerization tool: https://www.docker.com[`Docker`], https://podman.io[`Podman`], etc.

== Generate NewsEvent

* Open a terminal and navigate to the `spring-cloud-stream-kafka-elasticsearch` root folder;

* Run the following command to generate `NewsEvent`:
+
[source]
----
./mvnw clean install --projects commons-news
----
+
It will install `commons-news-1.0.0.jar` in your local `Maven` repository, so that it is visible to all services (the sketch below shows how a service might then publish the shared event).
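
Once the shared event class is installed, any service can publish it. As a hedged sketch, assuming hypothetical class and binding names, this is roughly how `producer-api` might push an event to the `producer.news` topic using Spring Cloud Stream's `StreamBridge`:

[source,java]
----
import org.springframework.cloud.stream.function.StreamBridge;
import org.springframework.stereotype.Service;

@Service
public class NewsProducer {

    private final StreamBridge streamBridge;

    public NewsProducer(StreamBridge streamBridge) {
        this.streamBridge = streamBridge;
    }

    // Sends the event to the "news-out-0" output binding; with
    //   spring.cloud.stream.bindings.news-out-0.destination=producer.news
    // the record ends up on the producer.news topic in Kafka. In the real
    // project the payload would be the NewsEvent from commons-news.
    public void send(Object newsEvent) {
        streamBridge.send("news-out-0", newsEvent);
    }
}
----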

== Start Environment

* In a terminal, navigate to the `spring-cloud-stream-kafka-elasticsearch` root folder, and run:
+
[source]
----
docker compose up -d
----

* Wait for the Docker containers to be up and running. To check their status, run:
+
[source]
----
docker ps -a
----

== Running Applications with Maven

Inside the `spring-cloud-stream-kafka-elasticsearch` root folder, run the following `Maven` commands in different terminals:

* *eureka-server*
+
[source]
----
./mvnw clean spring-boot:run --projects eureka-server
----

* *producer-api*
+
[source]
----
./mvnw clean spring-boot:run --projects producer-api -Dspring-boot.run.jvmArguments="-Dserver.port=9080"
----

* *categorizer-service*
+
[source]
----
./mvnw clean spring-boot:run --projects categorizer-service -Dspring-boot.run.jvmArguments="-Dserver.port=9081"
----

* *collector-service*
+
[source]
----
./mvnw clean spring-boot:run --projects collector-service -Dspring-boot.run.jvmArguments="-Dserver.port=9082"
----

* *publisher-api*
+
[source]
----
./mvnw clean spring-boot:run --projects publisher-api -Dspring-boot.run.jvmArguments="-Dserver.port=9083"
----

* *news-client*
+
[source]
----
./mvnw clean spring-boot:run --projects news-client
----

== Running Applications as Docker containers

=== Build the Applications' Docker Images

* In a terminal, make sure you are in the `spring-cloud-stream-kafka-elasticsearch` root folder;

* To build the applications' Docker images, run the following script:
+
[source]
----
./build-docker-images.sh
----

=== Applications' Environment Variables

* *producer-api*
+
|===
|Environment Variable | Description

|`KAFKA_HOST`
|Specify host of the `Kafka` message broker to use (default `localhost`)

|`KAFKA_PORT`
|Specify port of the `Kafka` message broker to use (default `29092`)

|`SCHEMA_REGISTRY_HOST`
|Specify host of the `Schema Registry` to use (default `localhost`)

|`SCHEMA_REGISTRY_PORT`
|Specify port of the `Schema Registry` to use (default `8081`)

|`EUREKA_HOST`
|Specify host of the `Eureka` service discovery to use (default `localhost`)

|`EUREKA_PORT`
|Specify port of the `Eureka` service discovery to use (default `8761`)

|`ZIPKIN_HOST`
|Specify host of the `Zipkin` distributed tracing system to use (default `localhost`)

|`ZIPKIN_PORT`
|Specify port of the `Zipkin` distributed tracing system to use (default `9411`)

|===

* *categorizer-service*
+
|===
|Environment Variable | Description

|`KAFKA_HOST`
|Specify host of the `Kafka` message broker to use (default `localhost`)

|`KAFKA_PORT`
|Specify port of the `Kafka` message broker to use (default `29092`)

|`SCHEMA_REGISTRY_HOST`
|Specify host of the `Schema Registry` to use (default `localhost`)

|`SCHEMA_REGISTRY_PORT`
|Specify port of the `Schema Registry` to use (default `8081`)

|`EUREKA_HOST`
|Specify host of the `Eureka` service discovery to use (default `localhost`)

|`EUREKA_PORT`
|Specify port of the `Eureka` service discovery to use (default `8761`)

|`ZIPKIN_HOST`
|Specify host of the `Zipkin` distributed tracing system to use (default `localhost`)

|`ZIPKIN_PORT`
|Specify port of the `Zipkin` distributed tracing system to use (default `9411`)

|===

* *collector-service*
+
|===
|Environment Variable | Description

|`ELASTICSEARCH_HOST`
|Specify host of the `Elasticsearch` search engine to use (default `localhost`)

|`ELASTICSEARCH_NODES_PORT`
|Specify nodes port of the `Elasticsearch` search engine to use (default `9300`)

|`ELASTICSEARCH_REST_PORT`
|Specify rest port of the `Elasticsearch` search engine to use (default `9200`)

|`KAFKA_HOST`
|Specify host of the `Kafka` message broker to use (default `localhost`)

|`KAFKA_PORT`
|Specify port of the `Kafka` message broker to use (default `29092`)

|`SCHEMA_REGISTRY_HOST`
|Specify host of the `Schema Registry` to use (default `localhost`)

|`SCHEMA_REGISTRY_PORT`
|Specify port of the `Schema Registry` to use (default `8081`)

|`EUREKA_HOST`
|Specify host of the `Eureka` service discovery to use (default `localhost`)

|`EUREKA_PORT`
|Specify port of the `Eureka` service discovery to use (default `8761`)

|`ZIPKIN_HOST`
|Specify host of the `Zipkin` distributed tracing system to use (default `localhost`)

|`ZIPKIN_PORT`
|Specify port of the `Zipkin` distributed tracing system to use (default `9411`)

|===

* *publisher-api*
+
|===
|Environment Variable | Description

|`ELASTICSEARCH_HOST`
|Specify host of the `Elasticsearch` search engine to use (default `localhost`)

|`ELASTICSEARCH_NODES_PORT`
|Specify nodes port of the `Elasticsearch` search engine to use (default `9300`)

|`ELASTICSEARCH_REST_PORT`
|Specify rest port of the `Elasticsearch` search engine to use (default `9200`)

|`EUREKA_HOST`
|Specify host of the `Eureka` service discovery to use (default `localhost`)

|`EUREKA_PORT`
|Specify port of the `Eureka` service discovery to use (default `8761`)

|`ZIPKIN_HOST`
|Specify host of the `Zipkin` distributed tracing system to use (default `localhost`)

|`ZIPKIN_PORT`
|Specify port of the `Zipkin` distributed tracing system to use (default `9411`)

|===

* *news-client*
+
|===
|Environment Variable | Description

|`KAFKA_HOST`
|Specify host of the `Kafka` message broker to use (default `localhost`)

|`KAFKA_PORT`
|Specify port of the `Kafka` message broker to use (default `29092`)

|`SCHEMA_REGISTRY_HOST`
|Specify host of the `Schema Registry` to use (default `localhost`)

|`SCHEMA_REGISTRY_PORT`
|Specify port of the `Schema Registry` to use (default `8081`)

|`EUREKA_HOST`
|Specify host of the `Eureka` service discovery to use (default `localhost`)

|`EUREKA_PORT`
|Specify port of the `Eureka` service discovery to use (default `8761`)

|`ZIPKIN_HOST`
|Specify host of the `Zipkin` distributed tracing system to use (default `localhost`)

|`ZIPKIN_PORT`
|Specify port of the `Zipkin` distributed tracing system to use (default `9411`)

|===
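
The defaults in the tables above are presumably implemented with Spring's `${VAR:default}` property placeholders. A minimal, hedged Java illustration of the pattern (the class and field names here are hypothetical):

[source,java]
----
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

@Component
public class KafkaConnectionProperties {

    // Spring resolves KAFKA_HOST and KAFKA_PORT from the environment and
    // falls back to the value after the colon, e.g. the equivalent of
    //   spring.kafka.bootstrap-servers: ${KAFKA_HOST:localhost}:${KAFKA_PORT:29092}
    // in application.yml.
    @Value("${KAFKA_HOST:localhost}")
    private String kafkaHost;

    @Value("${KAFKA_PORT:29092}")
    private int kafkaPort;

    public String bootstrapServers() {
        return kafkaHost + ":" + kafkaPort;
    }
}
----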

=== Run the Applications' Docker Containers

* In a terminal, make sure you are inside the `spring-cloud-stream-kafka-elasticsearch` root folder;

* Run the following script:
+
[source]
----
./start-apps.sh
----

== Applications URLs

|===
|Application |URL

|producer-api
|http://localhost:9080/swagger-ui.html

|publisher-api
|http://localhost:9083/swagger-ui.html

|news-client
|http://localhost:8080

|===

== Useful links

* *Eureka*
+
`Eureka` can be accessed at http://localhost:8761
+
image::documentation/eureka.jpg[]

* *Zipkin*
+
`Zipkin` can be accessed at http://localhost:9411
+
image::documentation/zipkin.jpg[]

* *Kafka Topics UI*
+
`Kafka Topics UI` can be accessed at http://localhost:8085

* *Kafka Manager*
+
`Kafka Manager` can be accessed at http://localhost:9001
+
_Configuration_
+
- First, you must create a new cluster. Click on `Cluster` (dropdown button in the header) and then on `Add Cluster`;
- Type the name of your cluster in the `Cluster Name` field, for example: `MyCluster`;
- Type `zookeeper:2181` in the `Cluster Zookeeper Hosts` field;
- Enable the checkbox `Poll consumer information (Not recommended for large # of consumers if ZK is used for offsets tracking on older Kafka versions)`;
- Click on the `Save` button at the bottom of the page.

* *Schema Registry UI*
+
`Schema Registry UI` can be accessed at http://localhost:8001

* *Elasticsearch REST API*
+
Check that `Elasticsearch` is up and running
+
[source]
----
curl localhost:9200
----
+
Check indexes
+
[source]
----
curl "localhost:9200/_cat/indices?v"
----
+
Check _news_ index mapping
+
[source]
----
curl "localhost:9200/news/_mapping?pretty"
----
+
Simple search
+
[source]
----
curl "localhost:9200/news/_search?pretty"
----
+
Delete _news_ index
+
[source]
----
curl -X DELETE localhost:9200/news
----

== Shutdown

* To stop applications:
** If they were started with `Maven`, go to the terminals where they are running and press `Ctrl+C`;
** If they were started as Docker containers, in a terminal and inside the `spring-cloud-stream-kafka-elasticsearch` root folder, run the script below:
+
[source]
----
./stop-apps.sh
----

* To stop and remove docker compose containers, network and volumes, in a terminal, navigate to the `spring-cloud-stream-kafka-elasticsearch` root folder, and run the following command:
+
[source]
----
docker compose down -v
----

== Cleanup

To remove the Docker images created by this project, in a terminal and inside the `spring-cloud-stream-kafka-elasticsearch` root folder, run the script below:

[source]
----
./remove-docker-images.sh
----