{"id":22848677,"url":"https://github.com/code-str8/time-series-forecasting","last_synced_at":"2025-03-31T06:12:32.426Z","repository":{"id":220845576,"uuid":"742776344","full_name":"Code-str8/time-series-forecasting","owner":"Code-str8","description":"Developing a model that effectively forecasts the unit sales of numerous items across various Favorita stores with precision.","archived":false,"fork":false,"pushed_at":"2024-02-04T20:23:49.000Z","size":747,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-31T06:12:26.733Z","etag":null,"topics":["data","dataanalysis","forcasting","machine-learning","time-series","visualizations"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Code-str8.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-01-13T10:35:48.000Z","updated_at":"2024-05-04T20:41:50.000Z","dependencies_parsed_at":"2024-02-04T20:36:08.363Z","dependency_job_id":"dcfdfce7-dd94-4f1a-9320-c810e4ba8621","html_url":"https://github.com/Code-str8/time-series-forecasting","commit_stats":null,"previous_names":["code-str8/time_series_forecasting","code-str8/time-series-forecasting"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Code-str8%2Ftime-series-forecasting","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Code-str8%2Ftime-series-forecasting/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/Code-str8%2Ftime-series-forecasting/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Code-str8%2Ftime-series-forecasting/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Code-str8","download_url":"https://codeload.github.com/Code-str8/time-series-forecasting/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246423728,"owners_count":20774820,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","dataanalysis","forcasting","machine-learning","time-series","visualizations"],"created_at":"2024-12-13T04:13:49.408Z","updated_at":"2025-03-31T06:12:32.410Z","avatar_url":"https://github.com/Code-str8.png","language":"Jupyter Notebook","readme":"# Time_Series_Forecasting\n\n\nThis is a time series forecasting problem. 
In this project, you'll\npredict store sales on data from Corporation Favorita, a large\nEcuador-based grocery retailer.\n\nSpecifically, you are to **build a model** that more accurately predicts\nthe unit sales for thousands of items sold at different Favorita stores.\n\nThe training data includes dates, store, and product information,\nwhether that item was being promoted, as well as the sales numbers.\nAdditional files include supplementary information that may be useful in\nbuilding your models.\n\n**File Descriptions and Data Field Information**\n\ntrain.csv\n\n-   The training data, comprising time series of features store_nbr, family, \n    and onpromotion as well as the target sales.\n\n-   **store_nbr** identifies the store at which the products are sold.\n\n-   **family** identifies the type of product sold.\n\n-   **sales** gives the total sales for a product family at a particular store\n    on a given date. Fractional values are possible since products can be sold in \n    fractional units (1.5 kg of cheese, for instance, as opposed to 1 bag of chips).\n\n-   **onpromotion** gives the total number of items in a product family that\n    were being promoted at a store on a given date.\n\ntest.csv\n\n-   The test data, having the same features as the training data. You will predict the target sales for the dates in this file.\n\n-   The dates in the test data are for the 15 days after the last date in the training data.\n\ntransaction.csv\n\n-   Contains the date, store_nbr, and the number of transactions made on that date.\n\nsample_submission.csv\n\n-   A sample submission file in the correct format.\n\nstores.csv\n\n-   Store metadata, including city, state, type, and cluster.\n\n-   cluster is a grouping of similar stores.\n\noil.csv\n\n-   **Daily oil price**, which includes values during both the train and\n     test data timeframes. 
(Ecuador is an oil-dependent country and its\n     economic health is highly vulnerable to shocks in oil prices.)\n\nholidays_events.csv\n\n-   Holidays and Events, with metadata.\n\n\u003e **NOTE**: Pay special attention to the transferred column. A holiday\n\u003e that is transferred officially falls on that calendar day but was\n\u003e moved to another date by the government. A transferred day is more\n\u003e like a normal day than a holiday. To find the day that it was\n\u003e celebrated, look for the corresponding row where type is **Transfer**.\n\u003e\n\u003e For example, the holiday Independencia de Guayaquil was transferred\n\u003e from 2012-10-09 to 2012-10-12, which means it was celebrated on\n\u003e 2012-10-12. Days that are type **Bridge** are extra days that are\n\u003e added to a holiday (e.g., to extend the break across a long weekend).\n\u003e These are frequently made up by the type **Work Day**, which is a day\n\u003e not normally scheduled for work (e.g., Saturday) that is meant to\n\u003e pay back the Bridge.\n\n-   Additional holidays are days added to a regular calendar holiday, for\n    example, as typically happens around Christmas (making Christmas\n    Eve a holiday).\n\n**Additional Notes**\n\n-   Wages in the public sector are paid every two weeks on the 15th and\n    on the last day of the month. Supermarket sales could be affected\n    by this.\n\n-   A magnitude 7.8 earthquake struck Ecuador on April 16, 2016. People\n    rallied in relief efforts donating water and other first-need\n    products, which greatly affected supermarket sales for several\n    weeks after the earthquake.\n\n**Data Preparation**\n\n**Hypotheses \u0026 Questions**\n\nThe questions below are to be answered. Note that you are free to\ndraw more hypotheses from the data.\n\n1.  Is the train dataset complete (has all the required dates)?\n\n2.  Which dates have the lowest and highest sales for each year?\n\n3.  Did the earthquake impact sales?\n\n4.  
Are certain groups of stores selling more products? (Cluster, city,\n    state, type)\n\n5.  Are sales affected by promotions, oil prices and holidays?\n\n6.  What analysis can we get from the date and its extractable features?\n\n7.  What is the difference between RMSLE, RMSE, and MSE (or why is the MAE\n    greater than all of them)?\n\nYour task is to **build a model** that more accurately predicts the unit\nsales for thousands of items.\n\n**Important**\n\n-   Document the process: data cleaning, analysis, assumptions, model\n    building, etc. Marks will be awarded for documentation.\n\n**Rubric**\n\n**Documentation**:\n\n-   Excellent: Documented the whole project, i.e., data cleaning,\n    analysis, hypotheses, and model.\n\n-   Good: Gave a summary of some of the processes\n\n-   Fair: Gave a bullet list of the processes with short sentences\n\n-   Poor: No documentation\n\n**Hypothesis Analysis \u0026 Visualization:**\n\n-   Excellent: Validated the hypotheses and answered all questions\n    listed earlier with appropriate charts. Used relevant diagrams and\n    charts to show analysis/metrics.\n\n-   Good: Validated at least 4 hypotheses and answered some of the\n    questions listed with appropriate charts. Used relevant diagrams, but\n    they might need some improvement.\n\n-   Fair: Lack of clarity on whether the hypotheses were true.\n\n-   Poor: Did not address any of the hypotheses\n\n**Model Building:**\n\n-   Excellent: Model has an RMSLE of 0.2\n\n-   Good: Model has an RMSLE of 0.3\n\n-   Fair: Model has an RMSLE of 0.4\n\n-   Poor: Model has an RMSLE above 0.4","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcode-str8%2Ftime-series-forecasting","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcode-str8%2Ftime-series-forecasting","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcode-str8%2Ftime-series-forecasting/lists"}