https://github.com/mneedham/pinot-wiki



# Building a real-time analytics dashboard with Streamlit, Apache Pinot, and Apache Kafka

Clone repository

[source, bash]
----
git clone git@github.com:mneedham/pinot-wiki.git && cd pinot-wiki
----

Spin up all components

[source, bash]
----
docker-compose up
----

Or, on an M1 Mac:

[source, bash]
----
docker-compose -f docker-compose-m1.yml up
----

Set up Python

[source, bash]
----
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
----
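
Before installing or running anything, it can help to confirm that the virtual environment is actually active. The snippet below is a minimal sketch using only the standard library; the helper name `in_virtualenv` is made up for this example and is not part of the repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```

If this prints `False` after activation, the `source` command most likely pointed at a different directory than the one created by `python -m venv`.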

Create Kafka topic

[source, bash]
----
docker exec -it kafka-wiki kafka-topics.sh \
--bootstrap-server localhost:9092 \
--partitions 5 \
--topic wiki-events \
--create
----

Ingest Wikipedia events

[source, bash]
----
python wiki_to_kafka.py
----
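
The README does not show what `wiki_to_kafka.py` does internally. As a rough sketch, assuming the script reads a server-sent events (SSE) feed of Wikipedia changes, the parsing step that turns the raw stream into JSON objects (ready to be published to `wiki-events`) could look like this. The function name `parse_sse_events` and the sample payload are invented for illustration and are not taken from the repository.

```python
import json
from typing import Iterable, Iterator


def parse_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON object per server-sent event."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # SSE allows a single optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            # An event may spread its payload over several "data:" lines.
            data_parts.append(value)
        elif line == "" and data_parts:
            # A blank line terminates the event.
            yield json.loads("\n".join(data_parts))
            data_parts = []
        # Other fields ("event:", "id:") and ":" comments are ignored here.


# Hypothetical sample input; a real feed carries many more fields.
sample = [
    "event: message\n",
    "id: 1\n",
    'data: {"title": "Apache Pinot", "user": "example-user", "bot": false}\n',
    "\n",
]

for event in parse_sse_events(sample):
    print(event["title"], event["user"])
```

Each yielded dictionary would then be serialised and sent to Kafka by whichever producer client `requirements.txt` installs.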

Check that Wikipedia events are being ingested

[source, bash]
----
docker exec -it kafka-wiki kafka-run-class.sh kafka.tools.GetOffsetShell \
--broker-list localhost:9092 \
--topic wiki-events
----

[source, bash]
----
kafkacat -C -b localhost:9092 -t wiki-events
----
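
`GetOffsetShell` prints one `topic:partition:offset` line per partition, so with five partitions the total has to be added up by hand. The sketch below sums those lines; the helper name `total_offsets` and the offsets in `sample_output` are made up for illustration.

```python
def total_offsets(output: str, topic: str) -> int:
    """Sum the latest offsets across all partitions of one topic."""
    total = 0
    for line in output.splitlines():
        # Split from the right so a topic name containing ":" still works.
        parts = line.strip().rsplit(":", 2)
        if len(parts) == 3 and parts[0] == topic and parts[2].isdigit():
            total += int(parts[2])
    return total


# Hypothetical output for the five partitions created earlier.
sample_output = """\
wiki-events:0:120
wiki-events:1:98
wiki-events:2:131
wiki-events:3:107
wiki-events:4:115
"""

print(total_offsets(sample_output, "wiki-events"))
```

Running the shell command twice a few seconds apart and comparing the two totals is a quick way to confirm that events are still arriving.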

Add Pinot Table

[source, bash]
----
docker exec -it pinot-controller-wiki bin/pinot-admin.sh AddTable \
-tableConfigFile /config/table.json \
-schemaFile /config/schema.json \
-exec
----

Open the Pinot UI at http://localhost:9000/
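
Besides the UI, the table can be queried over HTTP. The following is a minimal sketch, assuming the controller on port 9000 accepts SQL as a JSON POST to `/sql` and answers with Pinot's usual `resultTable` structure. The table name `wikievents`, the column names, and the sample response are placeholders; the real names are defined in `/config/table.json` and `/config/schema.json`.

```python
import json
import urllib.request

# Assumed endpoint: the controller's SQL proxy on the port used for the UI.
PINOT_SQL_URL = "http://localhost:9000/sql"


def build_sql_request(sql: str, url: str = PINOT_SQL_URL) -> urllib.request.Request:
    """Build a POST request carrying the SQL statement as JSON."""
    body = json.dumps({"sql": sql}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def rows_as_dicts(response: dict) -> list:
    """Convert a Pinot query response into a list of {column: value} dicts."""
    table = response.get("resultTable")
    if table is None:
        # Failed queries carry an "exceptions" list instead of a result table.
        raise RuntimeError(f"query failed: {response.get('exceptions')}")
    columns = table["dataSchema"]["columnNames"]
    return [dict(zip(columns, row)) for row in table["rows"]]


def query_pinot(sql: str, url: str = PINOT_SQL_URL) -> list:
    """Send the query and return the rows (needs the containers running)."""
    with urllib.request.urlopen(build_sql_request(sql, url), timeout=10) as resp:
        return rows_as_dicts(json.load(resp))


# Hypothetical response, so the conversion can be tried without a cluster.
sample_response = {
    "resultTable": {
        "dataSchema": {
            "columnNames": ["user", "changes"],
            "columnDataTypes": ["STRING", "LONG"],
        },
        "rows": [["example-user", 42], ["another-user", 17]],
    }
}

print(rows_as_dicts(sample_response))
```

With the containers up, a call such as `query_pinot("select count(*) from wikievents")` would return the live row count, provided the table really is named `wikievents`.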

Run Streamlit app

[source, bash]
----
streamlit run streamlit/app.py
----