{"id":19866065,"url":"https://github.com/zhenye-na/da-rnn","last_synced_at":"2025-04-05T18:07:57.948Z","repository":{"id":62112096,"uuid":"134233989","full_name":"Zhenye-Na/DA-RNN","owner":"Zhenye-Na","description":"📃  𝖀𝖓𝖔𝖋𝖋𝖎𝖈𝖎𝖆𝖑 PyTorch Implementation of DA-RNN (arXiv:1704.02971)","archived":false,"fork":false,"pushed_at":"2020-11-22T11:33:08.000Z","size":7918,"stargazers_count":424,"open_issues_count":0,"forks_count":140,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-03-29T17:08:24.999Z","etag":null,"topics":["attention-mechanism","deep-learning","lstm-neural-networks","paper-implementations","recurrent-neural-networks","rnn-pytorch"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Zhenye-Na.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-05-21T07:27:48.000Z","updated_at":"2025-02-20T14:17:27.000Z","dependencies_parsed_at":"2022-10-26T15:45:13.777Z","dependency_job_id":null,"html_url":"https://github.com/Zhenye-Na/DA-RNN","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Zhenye-Na%2FDA-RNN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Zhenye-Na%2FDA-RNN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Zhenye-Na%2FDA-RNN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Zhenye-Na%2FDA-RNN/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Zhenye-Na","download_url":"https://codeload.github.c
om/Zhenye-Na/DA-RNN/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247378142,"owners_count":20929296,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["attention-mechanism","deep-learning","lstm-neural-networks","paper-implementations","recurrent-neural-networks","rnn-pytorch"],"created_at":"2024-11-12T15:24:55.169Z","updated_at":"2025-04-05T18:07:57.918Z","avatar_url":"https://github.com/Zhenye-Na.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PyTorch Implementation of DA-RNN\n\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)\n[![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat-square)](https://github.com/Zhenye-Na/DA-RNN/issues)\n[![HitCount](http://hits.dwyl.io/Zhenye-Na/DA-RNN.svg)](http://hits.dwyl.io/Zhenye-Na/DA-RNN)\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Zhenye-Na/DA-RNN/blob/master/src/da_rnn.ipynb.py)\n\n\n\u003e *Get hands-on experience implementing an RNN (LSTM) in PyTorch;*  \n\u003e *Get familiar with financial data and deep learning;*\n\n\u003cdiv align=\"center\"\u003e\n  \u003ca href=\"https://starchart.cc/Zhenye-Na/DA-RNN\"\u003e\u003cimg src=\"https://starchart.cc/Zhenye-Na/DA-RNN.svg\" alt=\"Stargazers over time\"\u003e\u003c/a\u003e\n  \u003cp align=\"center\"\u003eStargazers over time\u003c/p\u003e\n\u003c/div\u003e\n\n## Table 
of Contents\n\n- [Dataset](#dataset)\n    - [Download](#download)\n    - [Description](#description)\n- [Usage](#usage)\n    - [Train](#train)\n- [Result](#result)\n    - [Training Loss](#training-loss)\n    - [Prediction](#prediction)\n- [DA-RNN](#da-rnn)\n    - [LSTM](#lstm)\n    - [Attention Mechanism](#attention-mechanism)\n    - [Model](#model)\n    - [Experiments and Parameters Settings](#experiments-and-parameters-settings)\n        - [NASDAQ 100 Stock dataset](#nasdaq-100-stock-dataset)\n        - [Training procedure \u0026 Parameters Settings](#training-procedure--parameters-settings)\n- [References](#references)\n\n\n## Dataset\n\n### Download\n\n[NASDAQ 100 stock data](http://cseweb.ucsd.edu/~yaq007/NASDAQ100_stock_data.html)\n\n### Description\n\nThis dataset is a subset of the full `NASDAQ 100 stock dataset` used in \u003csup\u003e[1]\u003c/sup\u003e. It includes 105 days of stock data, from July 26, 2016 to December 22, 2016. Each day contains 390 data points, except for 210 data points on November 25 and 180 data points on December 22.\n\nSome of the corporations under `NASDAQ 100` are not included in this dataset because they have too much missing data. In total, the dataset contains 81 major corporations, and missing data is filled in by linear interpolation.\n\nIn \u003csup\u003e[1]\u003c/sup\u003e, the first 35,100 data points are used as the training set and the following 2,730 data points are used as the validation set. 
The last 2,730 data points are used as the test set.\n\n\n\n## Usage\n\n### Train\n\n```\nusage: main.py [-h] [--dataroot DATAROOT] [--batchsize BATCHSIZE]\n               [--nhidden_encoder NHIDDEN_ENCODER]\n               [--nhidden_decoder NHIDDEN_DECODER] [--ntimestep NTIMESTEP]\n               [--epochs EPOCHS] [--lr LR]\n\nPyTorch implementation of paper 'A Dual-Stage Attention-Based Recurrent Neural\nNetwork for Time Series Prediction'\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --dataroot DATAROOT   path to dataset\n  --batchsize BATCHSIZE\n                        input batch size [128]\n  --nhidden_encoder NHIDDEN_ENCODER\n                        size of hidden states for the encoder m [64, 128]\n  --nhidden_decoder NHIDDEN_DECODER\n                        size of hidden states for the decoder p [64, 128]\n  --ntimestep NTIMESTEP\n                        the number of time steps in the window T [10]\n  --epochs EPOCHS       number of epochs to train [10, 200, 500]\n  --lr LR               learning rate [0.001] reduced by 0.1 after each 10000\n                        iterations\n```\n\nAn example training command:\n\n```\npython3 main.py --lr 0.0001 --epochs 50\n```\n\n## Result\n\n### Training process\n\n| \u003cimg src=\"https://github.com/Zhenye-Na/DA-RNN/blob/master/fig/result_01.png?raw=true\" width=\"80%\"\u003e | \u003cimg src=\"https://github.com/Zhenye-Na/DA-RNN/blob/master/fig/result_02.png?raw=true\" width=\"80%\"\u003e |\n|----------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------|\n\n### Training Loss\n\n| \u003cimg src=\"https://github.com/Zhenye-Na/DA-RNN/blob/master/fig/loss1.png?raw=true\" width=\"80%\"\u003e | \u003cimg src=\"https://github.com/Zhenye-Na/DA-RNN/blob/master/fig/loss2.png?raw=true\" width=\"80%\"\u003e 
|\n|------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------|\n\n\n### Prediction\n\n| \u003cimg src=\"https://github.com/Zhenye-Na/DA-RNN/blob/master/fig/prediction.png?raw=true\" width=\"80%\"\u003e |\n|-----------------------------------------------------------------------------------------------------|\n\n\n## DA-RNN\n\nThe paper [*\"A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction\"*](https://arxiv.org/pdf/1704.02971.pdf) proposes a novel dual-stage attention-based recurrent neural network (DA-RNN) for time series prediction. In the first stage, an input attention mechanism is introduced to adaptively extract relevant driving series (a.k.a., input features) at each time step by referring to the previous encoder hidden state. In the second stage, a temporal attention mechanism is introduced to select relevant encoder hidden states across all time steps.\n\nThe training objective is a squared loss. With these two attention mechanisms, the DA-RNN can adaptively select the most relevant input features and capture the long-term temporal dependencies of a time series. A graphical illustration of the proposed model is shown in Figure 1.\n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"https://github.com/Zhenye-Na/DA-RNN/blob/master/fig/fig1.png?raw=true\" \u003e\n    \u003cp\u003eFigure 1: Graphical illustration of the dual-stage attention-based recurrent neural network.\u003c/p\u003e\n\u003c/div\u003e\n\n\nThe Dual-Stage Attention-Based RNN (a.k.a. 
DA-RNN) model belongs to the general class of \u003cu\u003eNonlinear Autoregressive Exogenous (NARX) models\u003c/u\u003e, which predict the current value of a time series based on historical values of this series plus the historical values of multiple exogenous time series.\n\n### LSTM\n\nThe paper uses a Recurrent Neural Network (RNN). RNNs can capture sophisticated temporal dynamics in sequential data. They come in many forms, one of which is the Long Short-Term Memory (LSTM) model, widely applied in language modeling.\n\n\n### Attention Mechanism\n\nAs the paper describes, the attention mechanism performs feature selection, so the model keeps only the most useful information at each time step.\n\n### Model\n\nThe DA-RNN model consists of two LSTM networks with attention mechanisms: an encoder and a decoder.\n\nIn the encoder, the authors introduce an input attention mechanism that adaptively selects the relevant driving series. In the decoder, a temporal attention mechanism automatically selects relevant encoder hidden states across all time steps.\n\n### Experiments and Parameters Settings\n\n#### NASDAQ 100 Stock dataset\n\n\u003e In the NASDAQ 100 Stock dataset, we collected the stock prices of 81 major corporations under NASDAQ 100, which are used as the driving time series. The index value of the NASDAQ 100 is used as the target series. The frequency of the data collection is minute-by-minute. This data covers the period from July 26, 2016 to December 22, 2016, 105 days in total. Each day contains 390 data points from the opening to closing of the market except that there are 210 data points on November 25 and 180 data points on December 22. In our experiments, we use the first 35,100 data points as the training set and the following 2,730 data points as the validation set. The last 2,730 data points are used as the test set. 
This dataset is publicly available and will be continuously enlarged to aid the research in this direction.\n\n\n#### Training procedure \u0026 Parameters Settings\n\n|                 Category                |                                       Description                                      |\n|:---------------------------------------:|:--------------------------------------------------------------------------------------:|\n|           Optimization method           |      minibatch stochastic gradient descent (SGD) together with the Adam optimizer      |\n|   number of time steps in the window T  |                                         T = 10                                         |\n| size of hidden states for the encoder m |                                     m = p = 64, 128                                    |\n| size of hidden states for the decoder p |                                     m = p = 64, 128                                    |\n|            Objective function           | $$O(y_T, \\hat{y}_T) = \\frac{1}{N} \\sum \\limits_{i=1}^{N} (\\hat{y}_T^i - y_T^i)^2$$ |\n\n## References\n\n[1] Yao Qin, Dongjin Song, Haifeng Chen, Wei Cheng, Guofei Jiang, Garrison W. Cottrell. [*\"A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction\"*](https://arxiv.org/pdf/1704.02971.pdf). arXiv preprint arXiv:1704.02971 (2017).  \n[2] Chandler Zuo. [*\"A PyTorch Example to Use RNN for Financial Prediction\"*](http://chandlerzuo.github.io/blog/2017/11/darnn). (2017).  \n[3] YitongCU. [*\"Dual Staged Attention Model for Time Series prediction\"*](https://github.com/YitongCU/Duel-staged-Attention-for-NYC-Weather-prediction).  \n[4] PyTorch Forum. 
[*\"Why 3d input tensors in LSTM?\"*](https://discuss.pytorch.org/t/why-3d-input-tensors-in-lstm/4455).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzhenye-na%2Fda-rnn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzhenye-na%2Fda-rnn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzhenye-na%2Fda-rnn/lists"}