{"id":15362508,"url":"https://github.com/patrick-kidger/generalised_shapelets","last_synced_at":"2026-03-01T09:32:31.857Z","repository":{"id":37656819,"uuid":"242159324","full_name":"patrick-kidger/generalised_shapelets","owner":"patrick-kidger","description":"Code for \"Generalised Interpretable Shapelets for Irregular Time Series\"","archived":false,"fork":false,"pushed_at":"2023-07-06T21:59:18.000Z","size":44581,"stargazers_count":57,"open_issues_count":2,"forks_count":9,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-11-17T15:34:56.534Z","etag":null,"topics":["machine-learning","shapelet-transform","shapelets","time-series"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/patrick-kidger.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-02-21T14:34:49.000Z","updated_at":"2025-09-08T02:59:42.000Z","dependencies_parsed_at":"2024-11-08T10:02:32.516Z","dependency_job_id":"4cd1b715-a88d-4bc8-8304-59b6e5a9e41e","html_url":"https://github.com/patrick-kidger/generalised_shapelets","commit_stats":{"total_commits":124,"total_committers":3,"mean_commits":"41.333333333333336","dds":0.3548387096774194,"last_synced_commit":"04930c89dc4673e2af402895fe67655f8375a808"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/patrick-kidger/generalised_shapelets","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patrick-kidger%2Fgeneralised_shapelets","tags_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/repositories/patrick-kidger%2Fgeneralised_shapelets/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patrick-kidger%2Fgeneralised_shapelets/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patrick-kidger%2Fgeneralised_shapelets/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/patrick-kidger","download_url":"https://codeload.github.com/patrick-kidger/generalised_shapelets/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patrick-kidger%2Fgeneralised_shapelets/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29965606,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-01T06:55:38.174Z","status":"ssl_error","status_checked_at":"2026-03-01T06:53:04.810Z","response_time":124,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","shapelet-transform","shapelets","time-series"],"created_at":"2024-10-01T13:02:03.045Z","updated_at":"2026-03-01T09:32:31.838Z","avatar_url":"https://github.com/patrick-kidger.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align='center'\u003e Generalised Interpretable Shapelets for Irregular Time Series\u003cbr\u003e\n    
[\u003ca href=\"https://arxiv.org/abs/2005.13948\"\u003earXiv\u003c/a\u003e] \u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\n\u003cimg align=\"middle\" src=\"./paper/images/new_pendigits.png\" width=\"666\" /\u003e\n\u003c/p\u003e\n\nA generalised approach to _the shapelet method_ used in time series classification, in which a time series is described by its similarity to each of a collection of 'shapelets'. Given enough well-chosen shapelets, you can look at those similarities and conclude that \"This time series is probably of class X, because it has a very high similarity to shapelet Y.\"\n\nWe extend the method by:\n+ Extending to irregularly sampled, partially observed multivariate time series.\n+ Differentiably optimising the shapelet lengths. (Previously a discrete parameter.)\n+ Imposing interpretability via regularisation.\n+ Introducing generalised discrepancy functions for domain adaptation.\n\nThis gives a way to classify time series, whilst being able to answer questions about why that classification was chosen, and even being able to give new insight into the data. 
(For example, we demonstrate the discovery of a kind of spectral gap in an audio classification problem.)\n\nDespite the similar names, shapelets have nothing to do with wavelets.\n\n----\n## Library\nWe provide a PyTorch-compatible library for computing the generalised shapelet transform [here](./torchshapelets).\n\n## Results\nAccuracies on ten different datasets:\n\u003cp align=\"center\"\u003e\n\u003cimg align=\"middle\" src=\"./paper/images/results_table_full.png\" width=\"666\" /\u003e\n\u003c/p\u003e\n\nThe first 14 MFC coefficients for an audio recording from the Speech Commands dataset, along with the learnt shapelet, and the difference between them:\n\u003cp align=\"center\"\u003e\n\u003cimg align=\"middle\" src=\"./paper/images/new_speech_commands_heatmap.png\" width=\"666\" /\u003e\n\u003c/p\u003e\n\nInterpreting why a class was chosen based on similarity to a shapelet, on the PenDigits dataset:\n\u003cp align=\"center\"\u003e\n\u003cimg align=\"middle\" src=\"./paper/images/new_pendigits.png\" width=\"666\" /\u003e\n\u003c/p\u003e\n\nUsing a pseudometric uncovers a spectral gap in an audio classification problem:\n\u003cp align=\"center\"\u003e\n\u003cimg align=\"middle\" src=\"./paper/images/spectral.png\" width=\"300\" /\u003e\n\u003c/p\u003e\n\n## Citation\n\n```bibtex\n@article{kidger2020shapelets,\n    author={Kidger, Patrick and Morrill, James and Lyons, Terry},\n    title={{Generalised Interpretable Shapelets for Irregular Time Series}},\n    year={2020},\n    journal={arXiv:2005.13948}\n}\n```\n\n## Reproducing the experiments\n\n### Requirements\n+ python==3.7.4\n+ numpy==1.18.3\n+ scikit-learn==0.22.2\n+ six==1.15.0 \n+ scipy==1.4.1\n+ sktime==0.3.1 \n+ torch==1.4.0\n+ torchaudio==0.4.0 \n+ tqdm==4.46.0\n+ signatory==1.2.0.1.4.0        [This must be installed _after_ PyTorch]\n\nThe following are also needed if you wish to run the interpretability notebooks:\n+ jupyter==1.0.0          \n+ matplotlib==3.2.1\n+ seaborn==0.10.1\n\nFinally, the 
`torchshapelets` package (in this repository) must be installed via:\n``python torchshapelets/setup.py develop``\n\n### Downloading the data\n+ ``python get_data/uea.py``\n+ ``python get_data/speech_commands.py``\n\n### Running the experiments\nFirst make a folder at `experiments/results`, which is where the results of the experiments will be stored. Each model is saved after training for later analysis, so make this a symlink if you need to save on space. All experiments can be run via:\n+ ``python experiments/uea.py \u003cargument\u003e``\n+ ``python experiments/speech_commands.py \u003cargument2\u003e``\n\nwhere ``\u003cargument\u003e`` is one of:\n+ ``all``: run every experiment. Not recommended, will take forever.\n+ ``hyperparameter_search_old``: do hyperparameter searches for the performance of the classical shapelet transform on the UEA datasets.\n+ ``hyperparameter_search_l2``: do hyperparameter searches for the performance of the generalised shapelet transform on the UEA datasets with missing data.\n+ ``comparison_test``: actually use the hyperparameter searches (hardcoded to the results we found) for the UEA comparison between classical and generalised shapelets.\n+ ``missing_and_length_test``: actually use the hyperparameter searches (hardcoded to the results we found) for the test about learning lengths and missing data.\n+ ``pendigits_interpretability``: run models for just PenDigits, and then save the resulting shapelets.\n\nand ``\u003cargument2\u003e`` is one of:\n+ ``all``: Run every experiment. Not recommended, will take forever.\n+ ``old``: Run just the classical shapelet transform.\n+ ``new``: Run just the generalised shapelet transform.\n\n_Note that the code uses a lot of memory, and takes a long time to run. It's very much research code, not production code. 
See [`LIMITATIONS.md`](./torchshapelets/LIMITATIONS.md) for some discussion on why._\n\n### Model evaluation\nOnce an experiment has been completed, model performance can be viewed using the `experiments/parse_results.py` script. Simply run the file with an argument that corresponds to the name of a folder in `experiments/results`. For example, once the UEA comparison test has been run, its results can be viewed by running:\n+ `python experiments/parse_results.py uea_comparison`\n\nAlso see the notebooks in the [`notebooks`](./notebooks) directory, for an investigation into the interpretability of these models.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpatrick-kidger%2Fgeneralised_shapelets","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpatrick-kidger%2Fgeneralised_shapelets","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpatrick-kidger%2Fgeneralised_shapelets/lists"}