https://github.com/mneedham/pinot-wiki
- Host: GitHub
- URL: https://github.com/mneedham/pinot-wiki
- Owner: mneedham
- Created: 2022-01-11T12:27:58.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-05-09T11:07:21.000Z (over 1 year ago)
- Last Synced: 2024-04-14T09:10:05.891Z (9 months ago)
- Topics: kafka, pinot, pulsar, redpanda, wikipedia
- Language: Python
- Homepage:
- Size: 222 KB
- Stars: 19
- Watchers: 3
- Forks: 10
- Open Issues: 2
Metadata Files:
- Readme: README.adoc
# Building a real-time analytics dashboard with Streamlit, Apache Pinot, and Apache Kafka
Clone repository
[source, bash]
----
git clone git@github.com:mneedham/pinot-wiki.git && cd pinot-wiki
----

Spin up all components
[source, bash]
----
docker-compose up
----

Or, on an M1 Mac:
[source, bash]
----
docker-compose -f docker-compose-m1.yml up
----

Set up Python
[source, bash]
----
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
----

Create Kafka topic
[source, bash]
----
docker exec -it kafka-wiki kafka-topics.sh \
--bootstrap-server localhost:9092 \
--partitions 5 \
--topic wiki-events \
--create
----

Ingest Wikipedia events
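
The contents of `wiki_to_kafka.py` are not reproduced in this README. As a rough sketch of what an ingestion script of this kind might do, the example below reads Wikimedia's public recent-changes stream (server-sent events) and publishes each event to the `wiki-events` topic. The function names, the `User-Agent` value, and the choice of the `requests` and `confluent-kafka` libraries are assumptions made for illustration, not the repository's actual code.

```python
import json
from typing import Iterable, Iterator, Union

# Public Wikimedia recent-changes stream (server-sent events).
STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchange"
TOPIC = "wiki-events"  # topic created in the previous step


def parse_sse_events(lines: Iterable[Union[str, bytes]]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line in a server-sent event stream.

    Only single-line `data:` fields are handled; comment lines, other fields,
    and payloads that are not valid JSON are skipped.
    """
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8", errors="replace")
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        try:
            yield json.loads(payload)
        except json.JSONDecodeError:
            continue


def stream_to_kafka(bootstrap_servers: str = "localhost:9092") -> None:
    """Hypothetical driver: read the stream and publish each event to Kafka.

    Assumes `requests` and `confluent-kafka` are installed; both are imported
    lazily so the parser above can be used without them.
    """
    import requests
    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": bootstrap_servers})
    headers = {
        "Accept": "text/event-stream",
        "User-Agent": "pinot-wiki-sketch/0.1",  # placeholder value
    }
    with requests.get(STREAM_URL, stream=True, headers=headers) as resp:
        resp.raise_for_status()
        for event in parse_sse_events(resp.iter_lines()):
            producer.produce(TOPIC, key=str(event.get("id")), value=json.dumps(event))
            producer.poll(0)  # serve delivery callbacks without blocking
    producer.flush()
```

Calling `stream_to_kafka()` would run until interrupted. Keeping the parsing in a separate pure function makes it easy to test without a network connection or a broker.
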
[source, bash]
----
python wiki_to_kafka.py
----

Check Wikipedia events are ingesting
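
The `GetOffsetShell` command in this step prints one `topic:partition:offset` line per partition. A single snapshot shows only the current end offsets, so one way to confirm that events are still arriving is to run the command twice and compare the totals. The helper below is a minimal sketch of that comparison; the function names are hypothetical and it assumes the output format just described.

```python
from typing import Dict


def parse_offsets(output: str) -> Dict[int, int]:
    """Map partition number to end offset from GetOffsetShell output."""
    offsets: Dict[int, int] = {}
    for line in output.splitlines():
        # Split from the right so a topic name containing ':' cannot shift the fields.
        parts = line.strip().rsplit(":", 2)
        if len(parts) != 3:
            continue
        _topic, partition, offset = parts
        if partition.isdigit() and offset.isdigit():
            offsets[int(partition)] = int(offset)
    return offsets


def total_offset(output: str) -> int:
    """Sum of end offsets across all partitions."""
    return sum(parse_offsets(output).values())


def is_ingesting(before: str, after: str) -> bool:
    """True if the summed end offsets grew between two snapshots."""
    return total_offset(after) > total_offset(before)
```

Save the command's output to two strings a few seconds apart and pass them to `is_ingesting`. If it returns `False`, the producer has probably stopped or has not started.
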
[source, bash]
----
docker exec -it kafka-wiki kafka-run-class.sh kafka.tools.GetOffsetShell \
--broker-list localhost:9092 \
--topic wiki-events
----

[source, bash]
----
kafkacat -C -b localhost:9092 -t wiki-events
----

Add Pinot Table
[source, bash]
----
docker exec -it pinot-controller-wiki bin/pinot-admin.sh AddTable \
-tableConfigFile /config/table.json \
-schemaFile /config/schema.json \
-exec
----

Open the Pinot UI at http://localhost:9000/
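
As an alternative to checking the table in the UI, a row count can be requested over HTTP. The sketch below posts a SQL query to the controller's `/sql` endpoint on the same port as the UI and reads the count from the `resultTable` field of the JSON response. The table name `wikievents` is a placeholder (the real name is defined in `/config/table.json`), and it is assumed that the controller is reachable on `localhost:9000` from the host.

```python
import json
import urllib.request

CONTROLLER_URL = "http://localhost:9000"  # same address as the Pinot UI
TABLE = "wikievents"  # hypothetical; use the name from /config/table.json


def build_count_request(table: str, base_url: str = CONTROLLER_URL) -> urllib.request.Request:
    """Build a POST request asking Pinot for the number of rows in `table`."""
    body = json.dumps({"sql": f"select count(*) from {table}"}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_count(response: dict) -> int:
    """Read the single count value from a Pinot query response, or 0 if absent."""
    rows = response.get("resultTable", {}).get("rows", [])
    return int(rows[0][0]) if rows else 0


def count_rows(table: str = TABLE) -> int:
    """Send the count query and return the result (needs a running Pinot cluster)."""
    with urllib.request.urlopen(build_count_request(table), timeout=10) as resp:
        return extract_count(json.load(resp))
```

Calling `count_rows()` twice a few seconds apart should show the count growing while the ingestion script is running.
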
Run Streamlit app
[source, bash]
----
streamlit run streamlit/app.py
----