https://github.com/suryavamsi-p/conflict-nlp-topic-modeling-sentiment-analysis-using-llms

Extracts insights from 26K+ protest events using BERTopic, Top2Vec, and LLMs for real-world applications like crisis monitoring, policy research, and social unrest analysis.
all-mpnet-base-v2 bertopic conflict-data data data-science lda llama2 llms machine-learning mistral-7b nlp nltk protest-analysis pyldavis python3 top2vec topic-modeling transformers visualization

## Conflict NLP: Topic Modeling, Sentiment Analysis, and Classification using LLMs

Extracts insights from 26K+ protest events using BERTopic, Top2Vec, and LLMs for real-world applications like crisis monitoring, policy research, and social unrest analysis.

## Project Overview

This capstone project uses **state-of-the-art NLP techniques** to perform:

- **Topic Modeling** using BERTopic, Top2Vec, and LLaMA2
- **Sentiment Analysis** to assess public sentiment across global conflicts
- **Text Classification** for conflict categorization

The goal is to transform raw conflict data into **actionable intelligence** for policymakers, researchers, and humanitarian aid groups.

## Key Highlights

- **26,000+ conflict records** from ACLED and Google Trends
- Built **4 different topic modeling pipelines** (LDA, BERTopic, Top2Vec, LLaMA2)
- Boosted coherence score for BERT-based topics
- Visualized topic dominance, distributions & coherence
- Preprocessed noisy, multilingual text: stopword removal, tokenization, vectorization
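
The preprocessing steps named in the last bullet (tokenization, stopword removal, vectorization) can be sketched with the standard library alone. This is a minimal illustration, not the notebooks' actual code: the stopword set is a tiny hand-picked sample (the project lists NLTK, whose stopword corpus would be the fuller choice), and `tokenize` and `bag_of_words` are hypothetical helper names.

```python
import re
from collections import Counter

# Tiny illustrative stopword set (assumption); NLTK's stopword corpus,
# which the project lists among its tools, is far more complete.
STOPWORDS = {
    "a", "an", "and", "the", "in", "of", "on", "to", "for",
    "against", "at", "by", "with", "is", "are", "was", "were",
}

# Runs of letters in any script: word characters minus digits and underscores.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Lowercase the text and return its letter-only tokens minus stopwords."""
    tokens = TOKEN_RE.findall(text.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 1]


def bag_of_words(docs):
    """Vectorize each document as a Counter of token frequencies."""
    return [Counter(tokenize(doc)) for doc in docs]


# Hypothetical event notes, invented for illustration only.
notes = [
    "Students protested against the fee hike in Nairobi!",
    "Workers strike; workers march.",
]
for vector in bag_of_words(notes):
    print(dict(vector))
```

A bag-of-words count is the simplest vectorization and suits LDA-style models; the embedding-based pipelines (BERTopic, Top2Vec) would instead pass the cleaned text to a sentence encoder.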

## Techniques Used

| Task                                   | Methodology / Tools                      |
|----------------------------------------|------------------------------------------|
| Preprocessing                          | Python, NLTK, RegEx, Gensim              |
| Topic Modeling                         | BERTopic, LDA, Top2Vec, LLaMA2           |
| Dimensionality Reduction & Clustering  | UMAP (reduction), HDBSCAN (clustering)   |
| Sentiment Analysis                     | Hugging Face Transformers (BERT-based)   |
| Classification                         | Logistic Regression, SVM, RandomForest   |
| Visualization                          | matplotlib, seaborn, pyLDAvis, Plotly    |
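
To show how the topic modeling, UMAP, and HDBSCAN rows of the table fit together, here is a hedged sketch of a BERTopic pipeline with explicit reduction and clustering stages. It assumes the `bertopic`, `umap-learn`, and `hdbscan` packages are installed; the hyperparameter values are common defaults chosen for illustration, not the settings used in the notebooks, and `build_topic_model` and `summarize_topics` are hypothetical helper names. The embedding model name `all-mpnet-base-v2` is taken from the project's tags.

```python
def build_topic_model(min_cluster_size=15, random_state=42):
    """Assemble a BERTopic model with explicit UMAP and HDBSCAN stages."""
    # Imported lazily so this file can be loaded without the heavy dependencies.
    from bertopic import BERTopic
    from hdbscan import HDBSCAN
    from umap import UMAP

    # Stage 1: reduce the sentence embeddings to a few dimensions.
    umap_model = UMAP(
        n_neighbors=15,
        n_components=5,
        min_dist=0.0,
        metric="cosine",
        random_state=random_state,  # fixed seed for repeatable topics
    )
    # Stage 2: cluster the reduced embeddings; each cluster becomes a topic.
    hdbscan_model = HDBSCAN(
        min_cluster_size=min_cluster_size,
        metric="euclidean",
        cluster_selection_method="eom",
        prediction_data=True,
    )
    return BERTopic(
        embedding_model="all-mpnet-base-v2",
        umap_model=umap_model,
        hdbscan_model=hdbscan_model,
    )


def summarize_topics(docs, **kwargs):
    """Fit the model on a list of strings and return the topic overview table."""
    topic_model = build_topic_model(**kwargs)
    topic_model.fit_transform(docs)
    return topic_model.get_topic_info()
```

Calling `summarize_topics(docs)` on a list of event descriptions returns a DataFrame with one row per topic. Raising `min_cluster_size` generally yields fewer, broader topics.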

## Repository Structure

```
├── notebooks/
│   ├── BERTopic_Protest_Classification.ipynb
│   ├── LDA_Protest_Classification.ipynb
│   ├── LLaMA2_TopicModeling_protest_analysis.ipynb
│   └── Top2Vec_TopicModeling_Protest_Analysis.ipynb
│
├── presentations/
│   ├── WorldBank_Final.pptx
│   └── GWU_Capstone_Final.pptx
│
├── data/          # Not uploaded due to size/privacy
└── README.md
```

## Use Cases

- **Crisis Detection**: Detect and visualize emerging unrest topics
- **Policy Research**: Extract protest drivers across countries
- **Social Analytics**: Map sentiment trends over time or region
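
As a rough illustration of the Social Analytics use case, the sketch below averages sentiment scores per region and month. It assumes each event has already been scored by a sentiment model; the record keys (`region`, `date`, `score`) and the function name `sentiment_trend` are hypothetical, and the sample records are invented.

```python
from collections import defaultdict
from statistics import mean


def sentiment_trend(records):
    """Return the mean sentiment score per (region, "YYYY-MM") pair.

    Each record is a dict with hypothetical keys: "region" (str),
    "date" (ISO "YYYY-MM-DD" string), and "score" (float).
    """
    buckets = defaultdict(list)
    for rec in records:
        month = rec["date"][:7]  # keep only the "YYYY-MM" prefix
        buckets[(rec["region"], month)].append(rec["score"])
    # Sort keys so the output is ordered by region, then month.
    return {key: mean(scores) for key, scores in sorted(buckets.items())}


# Invented sample records, for illustration only.
sample = [
    {"region": "A", "date": "2023-01-05", "score": -0.5},
    {"region": "A", "date": "2023-01-20", "score": -1.0},
    {"region": "B", "date": "2023-02-01", "score": 0.5},
]
for (region, month), avg in sentiment_trend(sample).items():
    print(region, month, avg)
```

The resulting mapping can be passed to any of the listed plotting libraries to draw one line per region over time.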

## How to Run

1. Clone the repo: `git clone https://github.com/suryavamsi-p/conflict-nlp-topic-modeling-sentiment-analysis-using-llms`
2. Install dependencies: `pip install -r requirements.txt`
3. Run the Jupyter notebooks inside `notebooks/`

## Contact

**Surya Vamsi Patiballa**
Graduate Student, MS in Data Science — George Washington University (GWU)
- Email: svamsi2002@gmail.com
- LinkedIn: https://www.linkedin.com/in/surya-patiballa-b724851aa/
- Resume: https://drive.google.com/file/d/178IYcArC6YYVdJiIwRmJYodzKZ-JXe-D/view?usp=sharing

> _"Transforming data into dialogue. Insights into action."_

## If you found this project insightful, feel free to star it!