https://github.com/jmoussa/go-sentitweet
CLI Application holding a sentiment analysis data (Twitter tweets) pipeline with its own Web API to query results in the database. Written entirely in Go.
https://github.com/jmoussa/go-sentitweet
api channels cli cli-app cobra data-pipeline data-pipelines gin gin-framework gin-gonic go go-twitter golang gorilla-mux mongodb nlp sentiment-analysis twitter-api
Last synced: about 1 month ago
JSON representation
CLI Application holding a sentiment analysis data (Twitter tweets) pipeline with its own Web API to query results in the database. Written entirely in Go.
- Host: GitHub
- URL: https://github.com/jmoussa/go-sentitweet
- Owner: jmoussa
- Created: 2021-12-30T07:31:35.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2022-03-06T15:53:01.000Z (over 4 years ago)
- Last Synced: 2025-02-25T17:36:51.646Z (over 1 year ago)
- Topics: api, channels, cli, cli-app, cobra, data-pipeline, data-pipelines, gin, gin-framework, gin-gonic, go, go-twitter, golang, gorilla-mux, mongodb, nlp, sentiment-analysis, twitter-api
- Language: Go
- Homepage:
- Size: 13.4 MB
- Stars: 1
- Watchers: 1
- Forks: 1
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Sentiment Analysis Platform
> A data-acquisition and enrichment pipeline for loading tweets into a MongoDB database with an API on top to query it.
> All wrapped in a neat CLI
## Components
**Analysis**: Functions available for use in the data pipelines to perform mutations on the data
**API**: handle the API call/logic for fetching tweets and sentiment scores
**CLI**: twitter (tw) CLI utility logic for interacting with tweets and sentiment scores
**Config**: config module based in JSON (enter twitter credentials for use)
**Data Pipelines**: orchestrate/run tweet crawling and sentiment analysis
**DB**: DB-specific connection and query logic
**Monitoring**: monitoring and logging utilities (using AWS SNS for live-streaming insights through SQS Queue subscriptions)
## Architecture
There are two main pieces of architecture
- API
- Sentiment Analysis Pipelines
- Custom (`tw`) CLI utility
- Run the sentiment analysis pipeline and the API webserver to access sentiment analysis results
- *Coming Soon: List, Search, and interact with Twitter through the CLI*
## What the Pipeline Does
The pipeline streams in tweets (based on a search phrase) and performs sentiment analysis on the tweets, before loading all relevant info into the database.
**The Process**:
- Fetch Tweets (based on search term)
- Score Tweets (using the `vader-go` Default Lexicon)
- Upload to DB (MongoDB)
Credentials are configured using JSON config file.
## Running Locally with the CLI:
```bash
# Project Setup (inside project directory)
cp config/config.json.template config/config.json
# fill in your config.json with your credentials
export CONFIG_LOCATION=$(pwd)/config.json
cd bin/ # or add to your $PATH
# Run the sentiment analysis pipeline with no tweet search phrases (default #nft)
# runs in the foreground
./tw pipeline
./tw pipeline --term="#amazon"
# Run the RestAPI server (on port 8080)
./tw server
# (Coming Soon) Output CSV of tweets and sentiment scores
# running the pipeline chunking 100 tweets at a time to csv
tw pipeline --output=csv --chunk-size=100 --term="#amazon" --output-path=./output/
# query the API/database for tweets by time back
tw query --days_back=5 --output=csv --output-path=./output/
```