https://github.com/srosalino/predicting_black_friday_consumption_patterns

Predicting Black Friday consumption patterns based on customer personal information

data-exploration feature-engineering linear-regression machine-learning r

**The Black Friday Concept**

The purpose of this work is to evaluate consumer behavior on Black Friday by analyzing a dataset of purchases made during this special period of retail discounts. It helps to start with a brief explanation of what Black Friday is. It is a sales concept that originated in the United States of America and was created to encourage Christmas shopping. It takes place on the Friday after Thanksgiving, which is celebrated on the fourth Thursday of November. Thanksgiving is one of the most widely celebrated national holidays in the USA and is traditionally spent with family. Because Americans' domestic trips to visit family kept them away from stores, this used to be a very weak sales period for retailers. Thanks to Black Friday, a traditionally slow weekend turned into a very intense shopping period, driven by the generous discounts offered to attract customers. The concept was so successful that the idea spread throughout the world.

If the day was considered “dark” before its creation because of poor revenue, it can still be considered “dark” today, but for customers. Everyone wants to take advantage of promotions and discounts, and the resulting euphoria leads to long queues, people being trampled, and even uncivil consumer behavior, in addition to fraud associated with the discounts offered. Faced with this reality, Black Friday came to play an important role in the sales planning of large consumer companies, which began to invest heavily in marketing campaigns during this period and to rely increasingly on analytical information to understand how customers respond to these stimuli when purchasing.

In this context, exploring large volumes of sales and customer data can give companies in these sectors important competitive advantages, which makes this an extremely relevant field for data science.

**Task to Solve**

Given the importance of Black Friday for company sales in the pre-Christmas period, the objective of this work is to analyze the information available in the dataset to understand consumer behavior and preferences during this special shopping period, organized by relevant consumer attributes and segmented by the products sold.
Such a study can help companies define inventory levels, pricing, and sales force planning, among other relevant dimensions of the sales process, and adjust product offerings to customers' real preferences.


In this sense, this work seeks to understand/answer some of the following questions:
1) What is the relationship between gender and shopping?
2) Is there an association between age group and purchases?
3) Does occupation influence preference for Black Friday?
4) Does the city category influence sales?
5) Does marital status influence consumption?
6) Does prolonged stay in a city influence consumption?
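
A first look at questions like these is usually a group-wise comparison of purchase values. The sketch below is only an illustration in base R, not the code from the project: the data frame is made-up toy data, the column names `Gender` and `Marital_Status` are assumptions (only `Purchase` is named in this report), and `mean_purchase_by` is a hypothetical helper.

```r
# Toy data: all values are invented for illustration, and every column
# name except Purchase is an assumption about the dataset layout.
purchases <- data.frame(
  Gender = c("F", "M", "M", "F", "M", "F"),
  Marital_Status = c("single", "married", "single", "married", "married", "single"),
  Purchase = c(8000, 12000, 9500, 7000, 11000, 10500)
)

# Hypothetical helper: mean purchase value for each level of one attribute.
mean_purchase_by <- function(data, attribute) {
  tapply(data$Purchase, data[[attribute]], mean)
}

by_gender <- mean_purchase_by(purchases, "Gender")
by_marital_status <- mean_purchase_by(purchases, "Marital_Status")

print(by_gender)
print(by_marital_status)
```

Group means like these are purely descriptive. Whether a difference between groups is statistically meaningful is a separate question, which is why a regression model is the natural next step.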

This report investigates how customer preferences vary by city, occupation, marital status, gender, and length of residence in the city. Linear regression models were also used to test the statistical significance of the results. A subsample of the dataset, restricted to customers who have lived in their city for 4 or more years, was used to model the relationship between the total purchase value per person and their sociodemographic information. Finally, a predictive analysis was carried out: a model was fitted on a training dataset and used to predict the purchase value (the Purchase variable) for the test dataset. This report presents some of the results obtained; the code is in the R file submitted together with this document. For brevity, most of the R code is not transcribed in this report.
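
The workflow described above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the analysis from the project: the data are simulated, every column name except `Purchase` is an assumption, `Stay_Years` is a hypothetical numeric coding in which 4 stands for "4 or more years", and the 80/20 split is an arbitrary choice.

```r
set.seed(42)  # make the simulated data reproducible

# Simulated customers: all column names here are assumptions, and the
# values are random, not taken from the project dataset.
n_customers <- 200
customers <- data.frame(
  User_ID = seq_len(n_customers),
  Gender = factor(sample(c("F", "M"), n_customers, replace = TRUE)),
  Marital_Status = factor(sample(c("single", "married"), n_customers, replace = TRUE)),
  Stay_Years = sample(0:4, n_customers, replace = TRUE)  # 4 means "4 or more"
)

# Simulated transactions: several purchase rows per customer.
transactions <- customers[sample(n_customers, 1000, replace = TRUE), ]
transactions$Purchase <- round(runif(nrow(transactions), 500, 20000))

# Total purchase value per person, joined back to the customer attributes.
totals <- aggregate(Purchase ~ User_ID, data = transactions, FUN = sum)
names(totals)[2] <- "Total_Purchase"
per_person <- merge(customers, totals, by = "User_ID")

# Subsample: customers who have lived in their city for 4 or more years.
long_stay <- subset(per_person, Stay_Years >= 4)
fit_long_stay <- lm(Total_Purchase ~ Gender + Marital_Status, data = long_stay)
print(summary(fit_long_stay)$coefficients)  # estimates, t values, p-values

# Predictive step: fit on a training split, predict Purchase on the test split.
train_idx <- sample(nrow(transactions), size = 0.8 * nrow(transactions))
train_set <- transactions[train_idx, ]
test_set <- transactions[-train_idx, ]
fit_train <- lm(Purchase ~ Gender + Marital_Status + Stay_Years, data = train_set)
predicted <- predict(fit_train, newdata = test_set)

# Root mean squared error on the held-out rows.
rmse <- sqrt(mean((test_set$Purchase - predicted)^2))
print(rmse)
```

Because the toy purchases are drawn at random, the coefficients here carry no meaning; the sketch only shows the shape of the steps (aggregate per person, filter, fit, predict, evaluate).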

**Full Report**

See the file '*Trabalho_Grupo_4_CDA1_IMD_Relatório_Final.pdf*' for a detailed explanation of the work carried out and the results obtained.