{"id":20046597,"url":"https://github.com/ncsoft/promotionimpact","last_synced_at":"2025-03-02T07:49:40.366Z","repository":{"id":56935073,"uuid":"159423056","full_name":"ncsoft/promotionImpact","owner":"ncsoft","description":"R package for promotion effect analysis","archived":false,"fork":false,"pushed_at":"2020-04-08T09:56:04.000Z","size":351,"stargazers_count":47,"open_issues_count":2,"forks_count":7,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-02-13T00:05:03.679Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ncsoft.png","metadata":{"files":{"readme":"README.MD","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-11-28T01:12:13.000Z","updated_at":"2024-04-29T06:53:10.000Z","dependencies_parsed_at":"2022-08-21T05:50:10.677Z","dependency_job_id":null,"html_url":"https://github.com/ncsoft/promotionImpact","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncsoft%2FpromotionImpact","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncsoft%2FpromotionImpact/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncsoft%2FpromotionImpact/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ncsoft%2FpromotionImpact/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ncsoft","download_url":"https://codeload.github.com/ncsoft/promotionImpact/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repos
itories_count":241476435,"owners_count":19968916,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T11:24:58.655Z","updated_at":"2025-03-02T07:49:40.333Z","avatar_url":"https://github.com/ncsoft.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# promotionImpact\n\n'promotionImpact' package is for analysis and measurement of promotion effectiveness on a given target variable(e.g. daily sales). \n\nThis package provides convenient tools for converting promotion schedule into dummy or smoothed predictor variables and examining the effects of these variables controlled for trend/periodicity/structural change or some prespecified variables(e.g. start of a month).\n\n## How to install\nJust run the code below in R.\n\n```\ninstall.packages(\"promotionImpact\")\n```\nfor CRAN version.\n\nOr, \n\n```\nlibrary(devtools)\ndevtools::install_github(\"ncsoft/promotionImpact\")\n```\nfor GitHub version.\n\nTo install properly, you should install Rtools for your R version at https://cran.r-project.org/bin/windows/Rtools/.\nNote that when you run the install_github command, you are asked if you want to update other packages. 
Even if you select 'None', promotionImpact will be installed successfully.\n\n## How to use\nFirst, you need the following data (Note that promotionImpact contains sample data for practice).\n\n- Daily target data (e.g., Daily Sales, Daily Active Users(DAU))\n\n- Promotion schedule data (including Promotion ID, Start/End date and Promotion type)\n\n```\npromotionImpact::sim.data  # daily simulated sales data\n```\n|     dt     | simulated_sales |\n| :--------: | :-------------: |\n| 2015-02-11 |  1,601,948,810  |\n| 2015-02-12 |  2,048,650,675  |\n| 2015-02-13 |  2,288,870,304  |\n|    ...     |       ...       |\n| 2017-09-25 |  1,492,506,224  |\n  \n```\npromotionImpact::sim.promotion # simulated promotion schedule data\n```\n| pro_id        | start_dt     | end_dt     | tag_info   |\n|:----------------------:|:-------------:|:-----------:|:-----------:|\n| pro_1_1  | 2015-02-16 | 2015-03-14 |    A     |\n| pro_1_2  | 2015-06-07 | 2015-06-25 |    A     |\n|   ...    |    ...     |    ...     |   ...    
|\n| pro_5_10 | 2017-04-02 | 2017-04-26 |    E     |\n  \nThe sample data contain 50 promotions run between 2015-02-11 and 2017-09-25, 10 for each of the five types A, B, C, D and E.\n\nThe daily sales of the sample data consist of the effects of these promotions, the trend/periodicity factors, the sales surge on the first day of each month and some random errors.\n\n\u003cimg src=\"https://github.com/ncsoft/promotionImpact/blob/master/resources/simulated_daily_sales.png?raw=true\"\u003e\n  \nThe goal is to separate and estimate the effects of each promotion type in the daily sales data.\n\nFirst, if you want to control for the effect of the first day of the month, add a dummy variable for the first day of each month as shown below.\n\n```\nlibrary(dplyr)\nsim.data \u003c- sim.data %\u003e% \n  dplyr::mutate(month_start = ifelse(substr(as.character(dt),9,10) == '01', 1, 0))\n```\n\nIn this way, you can add as many dummy variables as you need to consider when comparing promotional effects.\n\nNow, create the model as shown below.\n\n```\npri1 \u003c- promotionImpact(data=sim.data, promotion=sim.promotion, \n                        time.field = 'dt', target.field = 'simulated_sales', \n                        dummy.field = 'month_start',\n                        trend = T, period = 30.5, trend.param = 0.02, period.param = 2,\n                        logged = TRUE, differencing = TRUE, synergy.promotion = FALSE,\n                        synergy.var = NULL, allow.missing = TRUE)\n```\n\nA description of each parameter used in the model above is given below.\n\n- data : dataset with time.field, target.field and other dummy variables\n- promotion : promotion schedule data\n- trend : whether a trend exists\n- period : If NULL, there is no periodicity. If 'auto', the periodicity is automatically estimated. 
If you specify a numeric value, the periodicity corresponding to the input number is calculated.\n- trend.param : This parameter controls the flexibility of the trend component. The higher the value, the more dynamic it changes.\n- period.param : This parameter controls the flexibility of the periodic component. The higher the value, the more dynamic it changes.\n- logged : If TRUE, target indicator and continuous independent variables are log-transformed.\n- differencing : If TRUE, target indicator and continuous independent variables are transformed to differences.\n- synergy.promotion : whether to consider synergies between promotion types.\n- synergy.var : A list of variables to consider synergy with the promotion type. Inserting c('month_start') takes into account the synergy between each promotion type and the 'month_start' variable.\n- allow.missing : If TRUE, the function will be executed after outputting warning message even if there is no promotion sales data during the promotion period. 
If FALSE, the execution will be aborted with error message.\n\nNow you can see the effects of each type of promotion.\n\n```\npri1$effects\nA        B        C        D        E\n1 19.34965 13.40238 10.46531 7.764716 4.015453\n```\n\nBecause of the log transformation, you can interpret the effect of each promotion type as the 'daily sales growth rate(%) during the promotion'.\n\nFor example, during the A type promotion, the daily sales increase by about 19.3% per day on average compared to periods without promotions.\n\n-----------------------------------------------------------------------------------------------------------------------------------------------------------\n\nIn order to obtain these effect estimates, promotionImpact takes a series of variable processing steps.\n\nThe promotionImpact processes the variables by copying the shape of the average pattern of the promotion effects through a smoothing function.\n\nFor example, in the above model, the change in promotion effect over time is estimated as:\n\n```\npri1$smoothvar$smoothing_graph\n```\n\n\u003cp align=\"center\"\u003e\n\u003cimg width=\"450\" height=\"400\" src=\"https://github.com/ncsoft/promotionImpact/blob/master/resources/smoothing_function.png?raw=true\" style=\"float: center; zoom:60%\"\u003e\n\u003c/p\u003e\n  \nThis effect is greatest at the start of the promotion and its effect decreases over time.\n\nThe model receives values corresponding to the shape and progress of this promotion type as the value of the promotion type variable for each date.\n\nMore information on this process can be found below.\n\n```\npri1$smoothvar$data   # daily final smoothed value\npri1$smoothvar$smooth_except_date   # the date removed when creating the smoothing function\npri1$smoothvar$smoothing_means   # smoothed values\npri1$smoothvar$smoothing_graph   # plot of above smoothed values\npri1$smoothvar$smooth_value   # smoothed values calculated for each promotion type\npri1$smoothvar$smooth_value_mean   
# averages of smoothed values of each promotion type\n```\n\nThe final modeling results are shown below.\n\n```\npri1$model$model  # the final linear model object (including elements of generic lm objects)\npri1$model$final_input_data   # input data (after pre-processing such as variable transformation)\npri1$model$fit_plot   # target vs fitted plot\n```\n\u003cimg src=\"https://github.com/ncsoft/promotionImpact/blob/master/resources/model_call_new.PNG?raw=true\" style=\"float: center; zoom:60%\"\u003e\n  \n\u003cimg src=\"https://github.com/ncsoft/promotionImpact/blob/master/resources/fit_plot.png?raw=true\" style=\"float: center; zoom:60%\"\u003e\n\nThe graph above shows the fitted values against the target values after the log and difference transformations.\n\nThe plot below shows how well the trend/periodicity components used in the model follow the target variable.\n\n```\npri1$model$trend_period_graph_with_target   # view trend+periodicity components with target variable\n```\n\n\u003cimg src=\"https://github.com/ncsoft/promotionImpact/blob/master/resources/trend_periodicity_with_target.png?raw=true\" style=\"float: center; zoom:60%\"\u003e\n  \n  \n-----------------------------------------------------------------------------------------------------------------------------------------------------------\n\nIn the above example, we only have schedule data (start/end date and type) per promotion, so we estimate the effectiveness of each promotion from the daily target variable itself.\n\nHowever, some users may also have data on how much the daily promotion effect was for each promotion. 
(For example, you can see the daily payment amount for each promotion)\n\nTo use this to estimate your promotion effectiveness, enter your promotional data as shown below.\n\n```\npromotionImpact::sim.promotion.sales  # simulated daily promotion sales data\n```\n\n|  pro_id  |  start_dt  |   end_dt   | tag_info |     dt     |    payment    |\n| :------: | :--------: | :--------: | :------: | :--------: | :-----------: |\n| pro_1_1  | 2015-02-16 | 2015-03-14 |    A     | 2015-02-16 | 1,033,921,614 |\n| pro_1_1  | 2015-02-16 | 2015-03-14 |    A     | 2015-02-17 |  971,764,194  |\n|   ...    |    ...     |    ...     |   ...    |    ...     |      ...      |\n| pro_5_10 | 2017-04-02 | 2017-04-26 |    E     | 2017-04-26 |  54,212,694   |\n\nThe above is the data with the daily promotion sales('payment' column) for each promotion.\n\nIn this case, the smoothing function that represents the time-based pattern of the promotion effect is estimated from the daily promotion sales for each promotion, and the rest of the process is the same as the example above.\n\n```\npri2 \u003c- promotionImpact(data=sim.data, promotion=sim.promotion.sales, \n                        time.field = 'dt', target.field = 'simulated_sales',\n                        dummy.field = 'month_start',\n                        trend = T, period = 30.5, trend.param = 0.02, period.param = 2,\n                        logged = T, differencing = T)\n```\n\n\n-----------------------------------------------------------------------------------------------------------------------------------------------------------\n\nOn the other hand, instead of inputting the promotion effect as a smoothing function, you can simply input it as a dummy variable. 
\n\nIn this case, you can set the var.type option to 'dummy' as shown below.\n\n```\npri3 \u003c- promotionImpact(data=sim.data, promotion=sim.promotion, \n                        time.field = 'dt', target.field = 'simulated_sales', \n                        dummy.field = 'month_start', var.type = 'dummy',\n                        trend = T, period = 30.5, trend.param = 0.02, period.param = 2,\n                        structural.change = T, logged = F, differencing = F)\n```\n\nWe also added a structural change element in the time series as well as trend/periodicity components in this example. If the structural.change option is set to TRUE, the model will detect the sudden change in the level of the daily target indicator and add it as a variable.\n\n```\npri3$model$structural_breakpoint\n\"2015-09-16 UTC\" \"2016-02-23 UTC\" \"2016-11-22 UTC\" \"2017-04-20 UTC\"\n```\n\nThen you can see that there has been a sudden change in daily sales on the dates above.\n\nNote that if the promotion has a large impact on the target indicator, the sudden effect of a promotion, such as a promotion launch, can be misinterpreted as a structural change in the average target values. (It is important to make appropriate judgments about this problem with prior knowledge).\n\nThe final data in this model is shown below.\n\n```\npri3$model$final_input_data\n```\n\n\u003cimg src=\"https://github.com/ncsoft/promotionImpact/blob/master/resources/dummy_variables.PNG?raw=true\" style=\"float: center; zoom:100%\"\u003e\n  \nYou can see that the variables A, B, C, D, and E are entered as 1 if the promotion of that type is in progress, and 0 otherwise. \n\nThe 'structure' variable is a factor variable that starts at 1 and increases by 2, 3, ..., etc. 
per structure change point.\n\nSince the log transformation was not performed in this model, the estimated effect value represents the absolute effect of each promotion type, not the relative effect (growth rate; %).\n\n```\npri3$effects\nA         B         C         D         E\n1 383088749 154422868 108831741 113017212 -13252524\n```\n\nCompared to the previous results, we can see that the ranking of effects between promotion types has changed slightly (C \u003c-\u003e D), and type E promotions even show a negative effect. (This differs from the level and ranking of the promotion effects used when generating the simulation data.)\n\nSo far, smoothed variables that reflect how promotion effects change over time appear to give generally more accurate estimates of promotion effects than dummy variables for each promotion type.\n\n-----------------------------------------------------------------------------------------------------------------------------------------------------------\n\nIn summary, you can use promotionImpact to measure and compare promotion effects, assuming that the target indicator is composed of three components: 1) control variables (trend, periodicity, structural change and dummy variables), 2) promotion effects (dummy or smoothed variables of each promotion type) and 3) random errors. 
\n\nA key feature of promotionImpact is to separate/measure promotion effects that change over time through smoothed values while estimating the trend/periodicity/structure change components from the target time series and controlling them.\n\n-----------------------------------------------------------------------------------------------------------------------------------------------------------\n  \n# detectOutliers\n\ndetectOutliers is a function that captures observations that interfere with the promotion effectiveness analysis.\n\nThis function takes a promotionImpact object as an input and returns dates that are considered outliers among observations used in the model.\n\nThe outlier determination criteria have default values, but you can also specify them.\n\n## How to use\nFirst, we need the object that stores the execution result of the promotionImpact function, so we create the first model shown below.\n\n```\npri4 \u003c- promotionImpact(data = sim.data, promotion = sim.promotion.sales, \n                        time.field = 'dt', target.field = 'simulated_sales')\n```\n\nThen use the detectOutliers function to capture observations that are too large or too small and disturb the measurement of the average promotion effect.\n\n```\nout \u003c- detectOutliers(model = pri4, threshold = list(cooks.distance=1, dfbetas=1, dffits=2), option = 1)\n```\n\nA description of each of the above parameters is given below.\n\n- model : the object of the execution result of promotionImpact function\n- threshold : the list of outlier determination criteria. For dfbetas and dffits, applied as an absolute value.\n- option : the number of indicators that must be exceeded for an observation to be considered a final outlier. It can only have a value of 1, 2 or 3. For example, if 2, only the observations exceeding the criteria for at least two of the three indicators are output as the final outliers. 
\n\nYou can see outliers as follows.\n\n```\nout$outliers\ndate      value   ckdist dfbetas.(Intercept)    dfbetas.A   dfbetas.B\n781 2017-04-02 -0.2822406 0.164772          -0.1117467 -0.005641418 0.004097004\ndfbetas.C    dfbetas.D dfbetas.E dfbetas.trend_period_value    dffits\n781 -0.01066382 -0.005173209  -1.07215                -0.05684834 -1.079674\n```\n\nFrom the above results, we can see that the observation on April 2, 2017 was flagged as an outlier because the absolute value of dfbetas for the coefficient corresponding to E (1.07215) exceeded the criterion of 1.\n\nNow, let's remove the outlier and run the promotionImpact function again.\n\n```\nlibrary(dplyr)\nsim.data.new \u003c- sim.data %\u003e% filter(dt != '2017-04-02')\nsim.promotion.sales.new \u003c- sim.promotion.sales %\u003e% filter(dt != '2017-04-02')\npri5 \u003c- promotionImpact(data = sim.data.new, promotion = sim.promotion.sales.new, \n                        time.field = 'dt', target.field = 'simulated_sales')\npri4$effects\nA       B        C       D        E\n1 22.34649 16.8745 11.57992 8.82892 3.970266\npri5$effects\nA        B        C        D        E\n1 22.40018 16.93162 11.61099 8.854282 4.436345\n```\n\nYou can observe how the promotion effect values changed after removing the outlier.\n\nIn particular, the change is small for the other promotion types, but the value for type E, which caused the outlier, changed noticeably (from about 3.97 to about 4.44).\n\n\n-----------------------------------------------------------------------------------------------------------------------------------------------------------\n  \n# compareModels\n\ncompareModels is a function that helps you specify many of the options in the promotionImpact function for your data.\n\nEnter the input data of the promotionImpact function and specify the options that you need to fix, if necessary. 
\n\nIt will find the appropriate options under those constraints.\n\n## How to use\nEnter the data you want to use for promotion effect measurement, and set the date, target, and dummy field names.\n\nIf some options must be held fixed, specify them with the fix option.\n\n```\nlibrary(dplyr)\nsim.data \u003c- sim.data %\u003e% mutate(month_start = ifelse(substr(as.character(dt),9,10) == '01', 1, 0))\ncomparison \u003c- compareModels(data = sim.data, promotion = sim.promotion.sales,\n                            fix = list(logged = T, differencing = T, smooth.origin='tag'), \n                            time.field = 'dt', target.field = 'simulated_sales', \n                            dummy.field = 'month_start',\n                            trend.param = 0.02, period.param = 2)\nAnalysis report\nTo satisfy the assumption of residuals, we recommand logged=TRUE, differencing=TRUE transformation on the response variable.\nAnd the most appropriate options for independent variables are smooth.origin=tag, synergy.promotion=FALSE, trend=FALSE, period=auto, structural.change=FALSE under logged=TRUE, differencing=TRUE, smooth.origin=tag condition.\nBut this may be local optimum not global optimum.\n```\nAs shown above, it suggests the options that minimize the AIC under the given constraints.\n\nSince the decision about the logged and differencing transformations is made mainly through residual analysis, various plots are stored so that users can make a decision for each case.\n\nFor example, the following figures show that the periodicity of the residuals can be removed through the differencing transformation.\n\n```\nlibrary(gridExtra)\ndo.call(grid.arrange, comparison$residualPlot)\n```\n\u003cp align=\"center\"\u003e\n\u003cimg width=\"500\" height=\"280\" src=\"https://github.com/ncsoft/promotionImpact/blob/master/resources/residual_plot.png?raw=true\" /\u003e\n\u003c/p\u003e\n  \n```\ndo.call(grid.arrange, comparison$acfPlot)\n```\n\u003cp 
align=\"center\"\u003e\n\u003cimg width=\"500\" height=\"280\" src=\"https://github.com/ncsoft/promotionImpact/blob/master/resources/acf_plot.png?raw=true\" /\u003e\n\u003c/p\u003e\n\nA normal distribution assumption is required for the test of the coefficients of the model. The pictures below will help you to make a decision. \n\n```\ndo.call(grid.arrange, comparison$qqPlot)\n```\n\u003cp align=\"center\"\u003e\n\u003cimg width=\"500\" height=\"280\" src=\"https://github.com/ncsoft/promotionImpact/blob/master/resources/normal_qqplot.png?raw=true\" /\u003e\n\u003c/p\u003e\n  \n```\ndo.call(grid.arrange, comparison$histPlot)\n```\n\u003cp align=\"center\"\u003e\n\u003cimg width=\"500\" height=\"280\" src=\"https://github.com/ncsoft/promotionImpact/blob/master/resources/hist_plot.png?raw=true\" /\u003e\n\u003c/p\u003e\n\nThe following table shows the results of various models when other options have been changed, with the exception of log and differencing transformations.\n\nAt this time, up to 10 models are compared considering various combinations of options.\n\n```\ncomparison$params\ndifferencing logged smooth.origin synergy.promotion trend period structural.change       AIC      RMSE        MAE  p        \n1          TRUE   TRUE           tag             FALSE  TRUE   NULL             FALSE -1488.699 0.1101252 0.08259737  8        \n2          TRUE   TRUE           tag              TRUE FALSE   auto             FALSE -1492.414 0.1087691 0.08221139 18        \n3          TRUE   TRUE           tag             FALSE FALSE   auto             FALSE -1493.125 0.1098708 0.08260071  8 *final*\n4          TRUE   TRUE           tag             FALSE FALSE   NULL             FALSE -1490.699 0.1101252 0.08259681  7        \n5          TRUE   TRUE           tag              TRUE FALSE   NULL              TRUE -1483.025 0.1089619 0.08221421 21        \n6          TRUE   TRUE           tag              TRUE  TRUE   auto              TRUE -1485.006 0.1087355 0.08218397 22       
 \n7          TRUE   TRUE           tag             FALSE FALSE   auto              TRUE -1485.148 0.1098695 0.08261233 12        \n8          TRUE   TRUE           tag              TRUE FALSE   NULL             FALSE -1491.008 0.1089629 0.08219880 17        \n9          TRUE   TRUE           tag             FALSE  TRUE   NULL              TRUE -1480.721 0.1101239 0.08260761 12        \n10         TRUE   TRUE           tag              TRUE  TRUE   NULL             FALSE -1489.008 0.1089628 0.08219762 18 \n```\n\nIn addition, the promotionImpact object for each model is stored in \"models\" in the form of a list, and the final model which minimizes AIC can be called from \"final_model\".\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fncsoft%2Fpromotionimpact","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fncsoft%2Fpromotionimpact","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fncsoft%2Fpromotionimpact/lists"}