https://github.com/abdul-aa/black-friday-consumer-experience--before-and-after-covid-
Analyzed posts on Reddit related to Black Friday using topic modeling, sentiment analysis, linear regression, and other statistical techniques to uncover user attitudes and trends.
causal-inference lda-model linear-regression natural-language-processing textpreprocessing topic-modeling vader-sentiment-analysis
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/abdul-aa/black-friday-consumer-experience--before-and-after-covid-
- Owner: Abdul-AA
- License: mit
- Created: 2024-02-14T11:02:11.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-02-20T19:09:22.000Z (over 1 year ago)
- Last Synced: 2025-01-21T00:14:59.953Z (4 months ago)
- Topics: causal-inference, lda-model, linear-regression, natural-language-processing, textpreprocessing, topic-modeling, vader-sentiment-analysis
- Language: Jupyter Notebook
- Homepage:
- Size: 5.1 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Analyzing Black Friday Sentiment: Pre- and Post-Pandemic Insights
## Project Overview
This project uses Natural Language Processing (NLP) and regression analysis to study consumer sentiment towards Black Friday before and after the COVID-19 pandemic, drawing on data from Reddit subreddits to uncover changes in consumer attitudes and behaviors.
## Contributors
- Abdul Aroworamimo
- Mohamed Elenany
- Tomy Pelletier
- Joshua Poozhikala
- Valentin Najean
## Methodology and Findings
### Methodology
The analysis of the Reddit data proceeded in the following steps:
- **Data Retrieval & Pre-Processing:** Data from various subreddits were merged, followed by tokenization, lemmatization, POS tagging, N-Gram modeling, and TF-IDF application to assess word significance.
- **Sentiment Analysis:** VADER sentiment analysis was used for sentiment labeling, supplemented by K-Means clustering on TF-IDF vectors to further categorize sentiments.
- **Latent Dirichlet Allocation (LDA):** Implemented to uncover latent topics within the discussions, distinguishing between online and in-store shopping preferences.
- **Logistic Regression Analysis:** Employed to predict sentiment based on variables such as subreddit IDs, comment scores, and year of the post, alongside the LDA results. Negative comments were up-sampled to balance the dataset.
- **Causal Inference Analysis:** Utilized the causalml library to estimate the Average Treatment Effect (ATE) of the pandemic period on sentiment, employing T-learner and S-learner models.
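
The up-sampling and logistic regression step can be sketched as follows. This is a minimal illustration on synthetic data, not the project's notebook code: the feature columns (online-topic weight, comment score, post-pandemic flag), the class sizes, and the variable names are all assumptions made for the example, and the sentiment labels are assumed to have been produced beforehand (for instance by VADER).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

rng = np.random.default_rng(0)

# Hypothetical toy data; columns: online-topic weight, comment score, post-pandemic flag.
n_pos, n_neg = 200, 40
X_pos = np.column_stack([
    rng.uniform(0.4, 1.0, n_pos),
    rng.normal(5.0, 2.0, n_pos),
    rng.integers(0, 2, n_pos),
])
X_neg = np.column_stack([
    rng.uniform(0.0, 0.6, n_neg),
    rng.normal(5.0, 2.0, n_neg),
    rng.integers(0, 2, n_neg),
])

# Up-sample the minority (negative) class with replacement to match the positives.
X_neg_up = resample(X_neg, replace=True, n_samples=n_pos, random_state=0)

# Balanced design matrix and labels (1 = positive sentiment, 0 = negative).
X_bal = np.vstack([X_pos, X_neg_up])
y_bal = np.concatenate([np.ones(n_pos), np.zeros(n_pos)])

model = LogisticRegression(max_iter=1000)
model.fit(X_bal, y_bal)

# A positive coefficient means a higher online-topic weight raises the odds of positive sentiment.
online_coef = model.coef_[0][0]
print(f"coefficient on online-topic weight: {online_coef:.2f}")
```

When a held-out test set is used, up-sampling is normally applied to the training split only, so that duplicated negative comments do not leak into the evaluation data.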
### Findings
- **Sentiment Distribution:** The analysis identified 26,002 positive, 14,746 neutral, and 8,561 negative posts, with notable differences between the sentiment categories derived from VADER and those from K-Means clustering.
- **Latent Topics:** The LDA model indicated that discussions primarily revolved around online shopping and in-store experiences, with a two-topic solution yielding the highest coherence.
- **Logistic Regression Result:** The logistic regression analysis indicated that posts about online shopping are associated with positive sentiment.
- **Impact of Time on Sentiment:** Causal inference suggested a slight decrease in positive sentiment post-pandemic, indicating a potential negative impact of the pandemic on public sentiment towards Black Friday.
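
To make the causal estimate concrete, the idea behind a T-learner can be sketched with plain NumPy. This is not the causalml API used in the project: it fits one ordinary least squares model per period and averages the difference in their predictions. The data are synthetic, and the covariates, the outcome scale, and the simulated effect of -0.05 are assumptions chosen for illustration, not the project's results.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# Hypothetical covariates: standardized comment score and online-topic weight.
score = rng.normal(0.0, 1.0, n)
online = rng.uniform(0.0, 1.0, n)

# Treatment indicator: 1 = post-pandemic post, 0 = pre-pandemic post.
post = rng.integers(0, 2, n)

# Simulated sentiment score with an assumed true effect of -0.05.
true_effect = -0.05
y = 0.2 + 0.3 * online + 0.1 * score + true_effect * post + rng.normal(0.0, 0.1, n)

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n), score, online])


def fit_ols(features, target):
    """Return least squares coefficients for target ~ features."""
    coef, *_ = np.linalg.lstsq(features, target, rcond=None)
    return coef


# T-learner: fit separate outcome models on the treated and control groups.
coef_treated = fit_ols(X[post == 1], y[post == 1])
coef_control = fit_ols(X[post == 0], y[post == 0])

# ATE: average difference between the two models' predictions over all posts.
ate = float(np.mean(X @ coef_treated - X @ coef_control))
print(f"estimated ATE: {ate:.3f}")
```

An S-learner differs in that it fits a single model with the treatment indicator as an extra feature, then compares predictions with the indicator set to 1 and to 0.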
### Key Takeaways
This detailed analysis offers valuable insights into the shifts in consumer sentiment towards Black Friday, providing a basis for businesses, economists, and policymakers to adapt strategies and make informed decisions in response to changing consumer preferences and behaviors.