{"id":13822206,"url":"https://github.com/wannesm/dtaidistance","last_synced_at":"2025-05-13T21:10:09.479Z","repository":{"id":37686561,"uuid":"80764246","full_name":"wannesm/dtaidistance","owner":"wannesm","description":"Time series distances: Dynamic Time Warping (fast  DTW implementation in C)","archived":false,"fork":false,"pushed_at":"2024-10-24T07:05:14.000Z","size":1389,"stargazers_count":1085,"open_issues_count":22,"forks_count":184,"subscribers_count":28,"default_branch":"master","last_synced_at":"2024-10-25T02:40:21.679Z","etag":null,"topics":["c","clustering","distance-measure","dtw","dynamic-time-warping","python","timeseries"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wannesm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-02-02T20:11:03.000Z","updated_at":"2024-10-24T10:17:21.000Z","dependencies_parsed_at":"2023-01-31T05:45:50.241Z","dependency_job_id":"a719abc4-d415-4692-b401-956a57f1ded3","html_url":"https://github.com/wannesm/dtaidistance","commit_stats":{"total_commits":727,"total_committers":21,"mean_commits":34.61904761904762,"dds":0.2585969738651994,"last_synced_commit":"17c117bbe13be110eb6cf6b443370288a8557645"},"previous_names":[],"tags_count":49,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wannesm%2Fdtaidistance","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wannesm%2Fdtaidistance/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wannesm%2Fdtaidistance/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wannesm%2Fdtaidistance/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wannesm","download_url":"https://codeload.github.com/wannesm/dtaidistance/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248154030,"owners_count":21056534,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","clustering","distance-measure","dtw","dynamic-time-warping","python","timeseries"],"created_at":"2024-08-04T08:01:48.277Z","updated_at":"2025-04-10T03:38:32.277Z","avatar_url":"https://github.com/wannesm.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":["General-Purpose Machine Learning"],"readme":"[![PyPi Version](https://img.shields.io/pypi/v/dtaidistance.svg)](https://pypi.org/project/dtaidistance/)\n[![Conda Version](https://img.shields.io/conda/vn/conda-forge/dtaidistance.svg)](https://anaconda.org/conda-forge/dtaidistance)\n[![Documentation Status](https://readthedocs.org/projects/dtaidistance/badge/?version=latest)](https://dtaidistance.readthedocs.io/en/latest/?badge=latest)\n[![DOI](https://zenodo.org/badge/80764246.svg)](https://zenodo.org/badge/latestdoi/80764246) \n\n# Time Series Distances\n\nLibrary for time series distances (e.g. Dynamic Time Warping) used in the\n[DTAI Research Group](https://dtai.cs.kuleuven.be). The library offers a pure\nPython implementation and a fast implementation in C. The C implementation\nhas only Cython as a dependency. It is compatible with Numpy and Pandas and\nimplemented such that unnecessary data copy operations are avoided.\n\nDocumentation: http://dtaidistance.readthedocs.io\n\nExample:\n\n    from dtaidistance import dtw\n    import numpy as np\n    s1 = np.array([0.0, 0, 1, 2, 1, 0, 1, 0, 0])\n    s2 = np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0])\n    d = dtw.distance_fast(s1, s2)\n\nCiting this work:\n\n\u003e Wannes Meert, Kilian Hendrickx, Toon Van Craenendonck, Pieter Robberechts, Hendrik Blockeel \u0026 Jesse Davis.  \n\u003e DTAIDistance (Version v2). Zenodo.  \n\u003e http://doi.org/10.5281/zenodo.5901139\n\n**New in v2**:\n\n- Numpy is now an optional dependency, also to compile the C library\n  (only Cython is required).\n- Small optimizations throughout the C code to improve speed.\n- The consistent use of `ssize_t` instead of `int` allows for larger data structures on 64 bit \n  machines and be more compatible with Numpy.\n- The parallelization is now implemented directly in C (included if OpenMP is installed).\n- The `max_dist` argument turned out to be similar to Silva and Batista's work \n  on PrunedDTW [7]. The toolbox now implements a version that is equal to PrunedDTW\n  since it prunes more partial distances. Additionally, a `use_pruning` argument\n  is added to automatically set `max_dist` to the Euclidean distance, as suggested\n  by Silva and Batista, to speed up the computation (a new method `ub_euclidean` is available).\n- Support in the C library for multi-dimensional sequences in the `dtaidistance.dtw_ndim`\n  package.\n- DTW Barycenter Averaging for clustering (v2.2).\n- Subsequence search and local concurrences (v2.3).\n- Support for N-dimensional time series (v2.3.7).\n\n\n## Installation\n\n    $ pip install dtaidistance\n    \nor\n\n    $ conda install -c conda-forge dtaidistance\n\nThe pip installation requires Numpy as a dependency to compile Numpy-compatible\nC code (using Cython). However, this dependency is optional and can be removed.\n\nThe source code is available at\n[github.com/wannesm/dtaidistance](https://github.com/wannesm/dtaidistance).\n\nIf you encounter any problems during compilation (e.g. the C-based implementation or OpenMP\nis not available), see the \n[documentation](https://dtaidistance.readthedocs.io/en/latest/usage/installation.html)\nfor more options.\n\n## Usage\n\n### Dynamic Time Warping (DTW) Distance Measure\n\n    from dtaidistance import dtw\n    from dtaidistance import dtw_visualisation as dtwvis\n    import numpy as np\n    s1 = np.array([0., 0, 1, 2, 1, 0, 1, 0, 0, 2, 1, 0, 0])\n    s2 = np.array([0., 1, 2, 3, 1, 0, 0, 0, 2, 1, 0, 0, 0])\n    path = dtw.warping_path(s1, s2)\n    dtwvis.plot_warping(s1, s2, path, filename=\"warp.png\")\n\n![Dynamic Time Warping (DTW) Example](https://people.cs.kuleuven.be/wannes.meert/dtw/dtw_example.png?v=5)\n\n\n#### DTW Distance Measure Between Two Series\n\nOnly the distance measure based on two sequences of numbers:\n\n    from dtaidistance import dtw\n    s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]\n    s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]\n    distance = dtw.distance(s1, s2)\n    print(distance)\n\nThe fastest version (30-300 times) uses c directly but requires an array as input (with the double type),\nand (optionally) also prunes computations by setting `max_dist` to the Euclidean upper bound:\n\n    from dtaidistance import dtw\n    import array\n    s1 = array.array('d',[0, 0, 1, 2, 1, 0, 1, 0, 0])\n    s2 = array.array('d',[0, 1, 2, 0, 0, 0, 0, 0, 0])\n    d = dtw.distance_fast(s1, s2, use_pruning=True)\n\nOr you can use a numpy array (with dtype double or float):\n\n    from dtaidistance import dtw\n    import numpy as np\n    s1 = np.array([0, 0, 1, 2, 1, 0, 1, 0, 0], dtype=np.double)\n    s2 = np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0])\n    d = dtw.distance_fast(s1, s2, use_pruning=True)\n\n\nCheck the `__doc__` for information about the available arguments:\n\n    print(dtw.distance.__doc__)\n\nA number of options are foreseen to early stop some paths the dynamic programming algorithm is exploring or tune\nthe distance measure computation:\n\n- `window`: Only allow for shifts up to this amount away from the two diagonals.\n- `max_dist`: Stop if the returned distance measure will be larger than this value.\n- `max_step`: Do not allow steps larger than this value.\n- `max_length_diff`: Return infinity if difference in length of two series is larger.\n- `penalty`: Penalty to add if compression or expansion is applied (on top of the distance).\n- `psi`: Psi relaxation to ignore begin and/or end of sequences (for cylical sequences) [2].\n- `use_pruning`: Prune computations based on the Euclidean upper bound.\n\n\n#### DTW Distance Measure all warping paths\n\nIf, next to the distance, you also want the full matrix to see all possible warping paths:\n\n    from dtaidistance import dtw\n    s1 = [0, 0, 1, 2, 1, 0, 1, 0, 0]\n    s2 = [0, 1, 2, 0, 0, 0, 0, 0, 0]\n    distance, paths = dtw.warping_paths(s1, s2)\n    print(distance)\n    print(paths)\n\nThe matrix with all warping paths can be visualised as follows:\n\n    from dtaidistance import dtw\n    from dtaidistance import dtw_visualisation as dtwvis\n    import random\n    import numpy as np\n    x = np.arange(0, 20, .5)\n    s1 = np.sin(x)\n    s2 = np.sin(x - 1)\n    random.seed(1)\n    for idx in range(len(s2)):\n        if random.random() \u003c 0.05:\n            s2[idx] += (random.random() - 0.5) / 2\n    d, paths = dtw.warping_paths(s1, s2, window=25, psi=2)\n    best_path = dtw.best_path(paths)\n    dtwvis.plot_warpingpaths(s1, s2, paths, best_path)\n\n![DTW Example](https://people.cs.kuleuven.be/wannes.meert/dtw/warping_paths.png?v=3)\n\nNotice the `psi` parameter that relaxes the matching at the beginning and end.\nIn this example this results in a perfect match even though the sine waves are slightly shifted.\n\n\n#### DTW Distance Measures Between Set of Series\n\nTo compute the DTW distance measures between all sequences in a list of sequences, use the method `dtw.distance_matrix`.\nYou can set variables to use more or less c code (`use_c` and `use_nogil`) and parallel or serial execution\n(`parallel`).\n\nThe `distance_matrix` method expects a list of lists/arrays:\n\n    from dtaidistance import dtw\n    import numpy as np\n    series = [\n        np.array([0, 0, 1, 2, 1, 0, 1, 0, 0], dtype=np.double),\n        np.array([0.0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0]),\n        np.array([0.0, 0, 1, 2, 1, 0, 0, 0])]\n    ds = dtw.distance_matrix_fast(series)\n\nor a matrix (in case all series have the same length):\n\n    from dtaidistance import dtw\n    import numpy as np\n    series = np.matrix([\n        [0.0, 0, 1, 2, 1, 0, 1, 0, 0],\n        [0.0, 1, 2, 0, 0, 0, 0, 0, 0],\n        [0.0, 0, 1, 2, 1, 0, 0, 0, 0]])\n    ds = dtw.distance_matrix_fast(series)\n\n\n#### DTW Distance Measures Between Set of Series, limited to block\n\nYou can instruct the computation to only fill part of the distance measures matrix.\nFor example to distribute the computations over multiple nodes, or to only \ncompare source series to target series.\n\n    from dtaidistance import dtw\n    import numpy as np\n    series = np.matrix([\n         [0., 0, 1, 2, 1, 0, 1, 0, 0],\n         [0., 1, 2, 0, 0, 0, 0, 0, 0],\n         [1., 2, 0, 0, 0, 0, 0, 1, 1],\n         [0., 0, 1, 2, 1, 0, 1, 0, 0],\n         [0., 1, 2, 0, 0, 0, 0, 0, 0],\n         [1., 2, 0, 0, 0, 0, 0, 1, 1]])\n    ds = dtw.distance_matrix_fast(series, block=((1, 4), (3, 5)))\n\nThe output in this case will be:\n\n    #  0     1    2    3       4       5\n    [[ inf   inf  inf     inf     inf  inf]    # 0\n     [ inf   inf  inf  1.4142  0.0000  inf]    # 1\n     [ inf   inf  inf  2.2360  1.7320  inf]    # 2\n     [ inf   inf  inf     inf  1.4142  inf]    # 3\n     [ inf   inf  inf     inf     inf  inf]    # 4\n     [ inf   inf  inf     inf     inf  inf]]   # 5\n\n\n### Clustering\n\nA distance matrix can be used for time series clustering. You can use existing methods such as\n`scipy.cluster.hierarchy.linkage` or one of two included clustering methods (the latter is a\nwrapper for the SciPy linkage method).\n\n    from dtaidistance import clustering\n    # Custom Hierarchical clustering\n    model1 = clustering.Hierarchical(dtw.distance_matrix_fast, {})\n    cluster_idx = model1.fit(series)\n    # Augment Hierarchical object to keep track of the full tree\n    model2 = clustering.HierarchicalTree(model1)\n    cluster_idx = model2.fit(series)\n    # SciPy linkage clustering\n    model3 = clustering.LinkageTree(dtw.distance_matrix_fast, {})\n    cluster_idx = model3.fit(series)\n\n\nFor models that keep track of the full clustering tree (`HierarchicalTree` or `LinkageTree`), the\ntree can be visualised:\n\n    model.plot(\"myplot.png\")\n\n![Dynamic Time Warping (DTW) hierarchical clusteringt](https://people.cs.kuleuven.be/wannes.meert/dtw/hierarchy.png?v=2)\n\n\n### Subsequence search\n\nDTAIDistance supports various subsequence search algorithms like\nSubsequence Alignment, Subsequence KNN Search and Local Concurrences.\nSee the [documentation](https://dtaidistance.readthedocs.io/en/latest/usage/subsequence.html)\nfor more information.\n\n### Motif Discovery\n\nWhile methods such as `dtw.distance_matrix` and `subsequence.subsequencesearch`\ncan be used for motif discovery in time series (after windowing), a more\nefficient and effective algorithm\nbased on time warping is available in the\n[LoCoMotif package](https://github.com/ML-KULeuven/locomotif).\n\n\n\n## Dependencies\n\n- [Python 3](http://www.python.org)\n\nOptional:\n\n- [Cython](http://cython.org)\n- [Numpy](http://www.numpy.org)\n- [tqdm](https://github.com/tqdm/tqdm)\n- [Matplotlib](https://matplotlib.org)\n- [SciPy](https://www.scipy.org)\n- [PyClustering](https://pyclustering.github.io)\n\nDevelopment:\n\n- [pytest](http://doc.pytest.org)\n- [pytest-benchmark](http://pytest-benchmark.readthedocs.io)\n\n\n## Contact\n\n- https://people.cs.kuleuven.be/wannes.meert\n\n\n## References\n\n1. T. K. Vintsyuk,\n   Speech discrimination by dynamic programming.\n   Kibernetika, 4:81–88, 1968.\n2. H. Sakoe and S. Chiba,\n   Dynamic programming algorithm optimization for spoken word recognition.\n   IEEE Transactions on Acoustics, Speech and Signal Processing, 26(1):43–49, 1978.\n3. C. S. Myers and L. R. Rabiner,\n   A comparative study of several dynamic time-warping algorithms for connected-word recognition.\n   The Bell System Technical Journal, 60(7):1389–1409, Sept 1981.\n4. Mueen, A and Keogh, E, \n   [Extracting Optimal Performance from Dynamic Time Warping](http://www.cs.unm.edu/~mueen/DTW.pdf),\n   Tutorial, KDD 2016\n5. D. F. Silva, G. E. A. P. A. Batista, and E. Keogh.\n   [On the effect of endpoints on dynamic time warping](http://www-bcf.usc.edu/~liu32/milets16/paper/MiLeTS_2016_paper_7.pdf),\n   In SIGKDD Workshop on Mining and Learning from Time Series, II. Association for Computing Machinery-ACM, 2016.\n6. C. Yanping, K. Eamonn, H. Bing, B. Nurjahan, B. Anthony, M. Abdullah and B. Gustavo.\n   [The UCR Time Series Classification Archive](www.cs.ucr.edu/~eamonn/time_series_data/), 2015.\n7. D. F. Silva and G. E. Batista. \n   [Speeding up all-pairwise dynamic time warping matrix calculation](http://sites.labic.icmc.usp.br/dfs/pdf/SDM_PrunedDTW.pdf),\n   In Proceedings of the 2016 SIAM International Conference on Data Mining, pages 837–845. SIAM, 2016.\n\n\n\n## License\n\n    DTAI distance code.\n\n    Copyright 2016-2022 KU Leuven, DTAI Research Group\n\n    Licensed under the Apache License, Version 2.0 (the \"License\");\n    you may not use this file except in compliance with the License.\n    You may obtain a copy of the License at\n\n        http://www.apache.org/licenses/LICENSE-2.0\n\n    Unless required by applicable law or agreed to in writing, software\n    distributed under the License is distributed on an \"AS IS\" BASIS,\n    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n    See the License for the specific language governing permissions and\n    limitations under the License.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwannesm%2Fdtaidistance","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwannesm%2Fdtaidistance","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwannesm%2Fdtaidistance/lists"}