https://github.com/cfitzsimons/twitch-chat-running-sentiment
An Apache Storm pipeline to do live sentiment analysis on Twitch streams
apache-storm data-science docker visualisation
- Host: GitHub
- URL: https://github.com/cfitzsimons/twitch-chat-running-sentiment
- Owner: CFitzsimons
- Created: 2025-11-23T20:36:20.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2025-11-27T22:59:58.000Z (7 months ago)
- Last Synced: 2025-11-30T12:38:55.362Z (7 months ago)
- Topics: apache-storm, data-science, docker, visualisation
- Language: Java
- Homepage:
- Size: 2.83 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
# Twitch Sentiment Storm
This project consists of the following:
- A Java project that builds a Storm topology
- A Python REST server that handles the data analytics
- A docker-compose file for running everything locally
- A Superset instance for visualisations
There's a full writeup that will be included _after_ the grades have been released to ensure the work is not copied. It will contain a full explanation of the motivations behind this project.
I've also included an example CSV that shows the kind of data the pipeline produces. You can view it here:
- [example_of_processed_data.csv](example_of_processed_data.csv)
It contains roughly 7,000 records gathered during a brief run of the pipeline.
## Architecture

## How to use it?
Right now you need to perform the following actions:
1. Build the Maven project. This produces the topology jar.
2. In `TwitchMessageSpout`, set the Twitch streams you want to connect to and populate your OAuth token.
3. Run `docker-compose up --build -d`. This should bring up the required services.
4. Once the topology has been built, run the `deploy-topology.sh` script. This submits it to the deployed Storm cluster.
5. If you are going to use Superset, run `superset-startup.sh`. This creates the right credentials. You can log in with `admin`/`admin`.
## FAQ
> Q: How can I change the sentiment analysis model being used?
>
> A: Look at `app.py`; you can change the model there. Provided the interface stays the same, you can swap in any model you wish without modifying the topology.

> Q: The scripts aren't working
>
> A: Ensure they have execute permission. Try `chmod +x deploy-topology.sh`, for example.

> Q: How do I compile the topology?
>
> A: The easiest way is to import the project into IntelliJ and let Maven handle it for you. The `pom.xml` should have all the dependencies set up.

> Q: How do I run it in local mode?
>
> A: You need to change the Postgres and Sentiment Bolt URLs to point to localhost (since you'll be running it outside the container). Then open the `pom.xml` and remove the `provided` scope from the `storm` dependency, since the cluster won't be there to supply it when running in local mode.
## What's left to do?
- [ ] Resolve the Python server bottleneck. Possibly deploy to k8s.
- [ ] Resolve the DB write bottleneck. Possibly replace with Apache Cassandra.
- [ ] Replace every hardcoded variable (localhost vs. Docker network names) with a config variable.
- [ ] Build a better interface for subscribing to streamer channels and deploying more spouts.
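
As a rough illustration of the hardcoded-variable item above, here is a minimal sketch of how the endpoints could be read from environment variables, so the same jar works both in local mode and inside the Docker network. Everything in it is an assumption rather than code from this repository: the class name `PipelineConfig`, the variable names `SENTIMENT_URL` and `POSTGRES_URL`, and the localhost defaults (ports and database name included) are all hypothetical.

```java
import java.util.Map;

/**
 * Hypothetical helper (not part of this repository): resolves service
 * endpoints from environment variables, falling back to localhost defaults.
 */
final class PipelineConfig {
    // Assumed variable names; pick whatever suits your deployment.
    static final String SENTIMENT_URL_VAR = "SENTIMENT_URL";
    static final String POSTGRES_URL_VAR = "POSTGRES_URL";

    // Assumed defaults for local mode; the real ports and database name may differ.
    static final String DEFAULT_SENTIMENT_URL = "http://localhost:5000";
    static final String DEFAULT_POSTGRES_URL = "jdbc:postgresql://localhost:5432/twitch";

    private final Map<String, String> env;

    // Taking the environment as a map keeps the class easy to test.
    PipelineConfig(Map<String, String> env) {
        this.env = env;
    }

    // Convenience factory that reads the real process environment.
    static PipelineConfig fromSystemEnv() {
        return new PipelineConfig(System.getenv());
    }

    // Returns the trimmed value for key, or the fallback if unset or blank.
    private String getOrDefault(String key, String fallback) {
        String value = env.get(key);
        if (value == null || value.trim().isEmpty()) {
            return fallback;
        }
        return value.trim();
    }

    String sentimentUrl() {
        return getOrDefault(SENTIMENT_URL_VAR, DEFAULT_SENTIMENT_URL);
    }

    String postgresUrl() {
        return getOrDefault(POSTGRES_URL_VAR, DEFAULT_POSTGRES_URL);
    }
}
```

With something like this in place, the bolts could call `PipelineConfig.fromSystemEnv().sentimentUrl()` instead of embedding a URL, and the docker-compose file could set the two variables to the Docker network names. Running in local mode would then need no source edits, because the localhost defaults apply when the variables are unset.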
## Acknowledgements
The author acknowledges the use of AI-based coding support tools, specifically ChatGPT, during the implementation of sections of this project, including the Twitch spout and database connectivity.
All AI-assisted content was critically reviewed, tested, and incorporated under the author’s responsibility.