{"id":21589184,"url":"https://github.com/modanesh/recurrent_implicit_quantile_networks","last_synced_at":"2025-06-14T09:06:29.146Z","repository":{"id":112712047,"uuid":"259685919","full_name":"modanesh/recurrent_implicit_quantile_networks","owner":"modanesh","description":"Implementation of the Recurrent Implicit Quantile Networks (RIQNs), used as a baseline in the OOD detection in the anomalous RL benchmark","archived":false,"fork":false,"pushed_at":"2021-10-24T06:33:28.000Z","size":20021,"stargazers_count":14,"open_issues_count":1,"forks_count":5,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-06-14T09:04:57.439Z","etag":null,"topics":["anomaly-detection","pytorch","reinforcement-learning"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2107.04982","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/modanesh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2020-04-28T16:02:53.000Z","updated_at":"2024-12-31T12:25:01.000Z","dependencies_parsed_at":null,"dependency_job_id":"d7bfe175-327a-4db0-831b-82ff1838a7aa","html_url":"https://github.com/modanesh/recurrent_implicit_quantile_networks","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/modanesh/recurrent_implicit_quantile_networks","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modanesh%2Frecurrent_implicit_quantile_networks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modanesh%2Frecurrent_implicit_quan
tile_networks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modanesh%2Frecurrent_implicit_quantile_networks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modanesh%2Frecurrent_implicit_quantile_networks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/modanesh","download_url":"https://codeload.github.com/modanesh/recurrent_implicit_quantile_networks/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/modanesh%2Frecurrent_implicit_quantile_networks/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259790457,"owners_count":22911547,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anomaly-detection","pytorch","reinforcement-learning"],"created_at":"2024-11-24T16:12:59.821Z","updated_at":"2025-06-14T09:06:29.129Z","avatar_url":"https://github.com/modanesh.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Recurrent Implicit Quantile Networks\n\nThis repository implements Recurrent Implicit Quantile Networks (RIQNs), the baseline for out-of-distribution detection in \nRL benchmarks. The benchmark code is available at \n[https://github.com/modanesh/anomalous_rl_envs](https://github.com/modanesh/anomalous_rl_envs).\nIt contains two sets of environments: one derived from [OpenAI Gym control tasks](https://github.com/openai/gym) and the other\nfrom [PyBullet3](https://github.com/bulletphysics/bullet3). 
\n\nIf you ever used this repo in your work, please cite it with:\n```\n@inproceedings{danesh2021oodd,\n  title={Out-of-Distribution Dynamics Detection: RL-Relevant Benchmarks and Results},\n  author={Danesh, Mohamad H and Fern, Alan},\n  booktitle={International Conference on Machine Learning, Uncertainty \u0026 Robustness in Deep Learning Workshop},\n  journal={},\n  year={2021}\n}\n```\n\n## Installation\n- python 3.6+\n- To install dependencies:\n```commandline\npip install -r requirements.txt\n```\n\n## Usage\nThe two files: `autoregressive_control.py` and `autoregressive_pybullet.py` are quite similar in structure and functionality.\nTheir main difference is their targeted environments. `autoregressive_control.py` works with OpenAI Gym control tasks, while\n`autoregressive_pybullet.py` works with the Bullet physics simulations.\n\nIn the following, the usage of `autoregressive_control.py` is provided. However, the same would apply for the other case.\n\n### Parameters\n```commandline\nusage: autoregressive_control.py [-h] [--predictive_model_training]\n                                 [--predictive_model_testing]\n                                 [--anomaly_detection]\n                                 [--horizon_comparison_as]\n                                 [--samplesize_comparison_as]\n                                 [--avgvsmax_comparison_as]\n                                 [--dataset_analysis] [--dists_cdf]\n                                 [--detection_delay] [--is_recurrent]\n                                 [--is_recurrent_v2] [--feature_part_analysis]\n                                 [--scheduled_sampling_training]\n                                 [--predictive_model_paths PREDICTIVE_MODEL_PATHS [PREDICTIVE_MODEL_PATHS ...]]\n                                 [--batch_size BATCH_SIZE] [--lr LR]\n                                 [--gru_units GRU_UNITS]\n                                 [--num_quantile_sample NUM_QUANTILE_SAMPLE]\n                 
                [--policy_num_quantile_sample POLICY_NUM_QUANTILE_SAMPLE]\n                                 [--num_tau_sample NUM_TAU_SAMPLE]\n                                 [--quantile_embedding_dim QUANTILE_EMBEDDING_DIM]\n                                 [--policy_quantile_embedding_dim POLICY_QUANTILE_EMBEDDING_DIM]\n                                 [--test_interval TEST_INTERVAL]\n                                 [--num_iterations NUM_ITERATIONS]\n                                 [--env_name ENV_NAME] [--data_path DATA_PATH]\n                                 [--test_data_path TEST_DATA_PATH]\n                                 [--noisy_data_path NOISY_DATA_PATH]\n                                 [--anomaly_inserted ANOMALY_INSERTED]\n                                 [--clip_value CLIP_VALUE]\n                                 [--horizons HORIZONS [HORIZONS ...]]\n                                 [--sampling_sizes SAMPLING_SIZES [SAMPLING_SIZES ...]]\n                                 [--given_fpr GIVEN_FPR]\n                                 [--decay_type {linear,exponential}]\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --predictive_model_training\n                        To train autoregressive models\n  --predictive_model_testing\n                        To test autoregressive models\n  --anomaly_detection   Do the AD when anomalies injected into the system\n  --horizon_comparison_as\n                        Studying the affect of horizon on anomaly scores and\n                        AUCs\n  --samplesize_comparison_as\n                        Studying the affect of sampling size on anomaly scores\n                        and AUCs\n  --avgvsmax_comparison_as\n                        Studying the affect of combining anomaly scores using\n                        avg vs. 
max on AUCs\n  --dataset_analysis    Analyzing dataset\n  --dists_cdf           Studying CDFs of internal distributions\n  --detection_delay     Measuring the delay in detecting anomalies\n  --is_recurrent        Determines whether the model has memory or not\n  --is_recurrent_v2     Determines whether the model has memory or not -- v2\n                        RNN model\n  --feature_part_analysis\n                        Analyzing feature participation is calculating anomaly\n                        scores\n  --scheduled_sampling_training\n                        To train autoregressive models using scheduled\n                        sampling\n  --predictive_model_paths PREDICTIVE_MODEL_PATHS [PREDICTIVE_MODEL_PATHS ...]\n                        Path to all predictive models\n  --batch_size BATCH_SIZE\n                        Batch size\n  --lr LR               Learning rate\n  --gru_units GRU_UNITS\n                        Number of cells in the GRU\n  --num_quantile_sample NUM_QUANTILE_SAMPLE\n                        Number of quantile samples for IQN\n  --policy_num_quantile_sample POLICY_NUM_QUANTILE_SAMPLE\n                        Number of quantile samples for policy IQN\n  --num_tau_sample NUM_TAU_SAMPLE\n                        Number of tau samples for IQN, sets the distribution\n                        sampling size.\n  --quantile_embedding_dim QUANTILE_EMBEDDING_DIM\n                        Qunatiles embedding dimension in IQN\n  --policy_quantile_embedding_dim POLICY_QUANTILE_EMBEDDING_DIM\n                        Qunatiles embedding dimension in policy IQN\n  --test_interval TEST_INTERVAL\n                        Intervals between train and test\n  --num_iterations NUM_ITERATIONS\n                        Number of iterations to update model\n  --env_name ENV_NAME   Name of the main environment: to train, test, update\n                        models, find threshold, and calculate performance on\n                        normal envs\n  --data_path 
DATA_PATH\n                        path to the dataset json file\n  --test_data_path TEST_DATA_PATH\n                        path to the test dataset json file\n  --noisy_data_path NOISY_DATA_PATH\n                        path to the test dataset json file\n  --anomaly_inserted ANOMALY_INSERTED\n                        Time when the anomaly is inserted into the system\n  --clip_value CLIP_VALUE\n                        Clipping gradients\n  --horizons HORIZONS [HORIZONS ...]\n                        Horizon to go forward in time\n  --sampling_sizes SAMPLING_SIZES [SAMPLING_SIZES ...]\n                        Size of the sampling to build the tree of\n                        distributions at time t\n  --given_fpr GIVEN_FPR\n                        Acceptable FPR rate to calculate the threshold for\n                        anomaly detection delay\n  --decay_type {linear,exponential}\n                        How to decay epsilon in Scheduled sampling\n```\n\n### How To Run\nFirst, you need to generate a dataset of nominal trajectories by the following command:\n```commandline\npython autoregressive_control.py --test_policy --env_name Acrobot-v1\n```\n\nTo train the RIQN predictor:\n```commandline\npython autoregressive_control.py --predictive_model_training --env_name Acrobot-v1 --is_recurrent_v2 --predictive_model_paths \"SOME PATHS\"\n```\n\nTo test the RIQN predictor:\n```commandline\npython autoregressive_control.py --predictive_model_testing --env_name Acrobot-v1 --predictive_model_paths \"SOME PATHS\" --is_recurrent_v2 --horizons 1 --anomaly_inserted 0\n```\n\nTo detect anomalies using the RIQN predictor:\n```commandline\npython autoregressive_control.py --anomaly_detection --anomaly_inserted 20 --horizons 1 10 --sampling_sizes 4 8 32 128 --is_recurrent_v2 --num_iterations 5 --env_name AcrobotMod-v4 --predictive_model_paths \"SOME 
PATHS\"\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmodanesh%2Frecurrent_implicit_quantile_networks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmodanesh%2Frecurrent_implicit_quantile_networks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmodanesh%2Frecurrent_implicit_quantile_networks/lists"}