https://github.com/src-d/seriate
Optimal ordering of elements in a set given their distance matrix.
- Host: GitHub
- URL: https://github.com/src-d/seriate
- Owner: src-d
- License: other
- Created: 2019-03-22T15:54:02.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2023-10-02T15:58:29.000Z (over 1 year ago)
- Last Synced: 2025-05-05T05:04:57.761Z (26 days ago)
- Topics: python, seriation
- Language: Python
- Homepage:
- Size: 2.34 MB
- Stars: 17
- Watchers: 8
- Forks: 11
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# seriate
Optimal ordering of elements in a set given their distance matrix.

[Overview](#overview) • [How To Use](#how-to-use) • [Contributions](#contributions) • [License](#license)
## Overview
This is a Python implementation of the [seriation](http://nicolas.kruchten.com/content/2018/02/seriation/)
algorithm. Seriation is an approach for ordering elements in a set so that the
sum of the sequential pairwise distances is minimal. We state this task
as a Travelling Salesman Problem (TSP) and leverage [Google's or-tools](https://github.com/google/or-tools)
to do the heavy lifting. Since TSP is NP-hard, it is not feasible to compute
an exact solution for a large number of elements. However, the or-tools
heuristics work very well in practice, and they are used in e.g. Google Maps.

Any [`numpy.roll`-ed](https://docs.scipy.org/doc/numpy-1.16.0/reference/generated/numpy.roll.html)
result is equivalent.

## How To Use
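To make the objective from the overview concrete, here is a minimal brute-force sketch that tries every ordering and keeps the one with the smallest sum of sequential pairwise distances. It is not how `seriate` works internally (the library delegates to or-tools), and the helper names `path_length` and `brute_force_seriate` are hypothetical, introduced only for illustration. It uses only the standard library and assumes Python 3.8+ for `math.dist`.

```python
from itertools import permutations
from math import dist


def path_length(points, order):
    """Sum of Euclidean distances between consecutive elements of `order`."""
    return sum(dist(points[a], points[b]) for a, b in zip(order, order[1:]))


def brute_force_seriate(points):
    """Exhaustively search all orderings (hypothetical helper).

    The search space grows factorially, so this is only usable
    for a handful of elements.
    """
    return min(
        permutations(range(len(points))),
        key=lambda order: path_length(points, order),
    )


points = [(3, 3, 3), (5, 5, 5), (4, 4, 4), (2, 2, 2), (1, 1, 1)]
best = brute_force_seriate(points)
print(best, path_length(points, best))
```

Because the path length is the same when an ordering is read backwards, the exhaustive search may return either the ascending or the descending arrangement of these points. Exhaustive search stops being practical after roughly ten elements, which is why a heuristic solver is needed for larger inputs.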
```python
import numpy
from scipy.spatial.distance import pdist
from seriate import seriate

elements = numpy.array([
    [3, 3, 3],
    [5, 5, 5],
    [4, 4, 4],
    [2, 2, 2],
    [1, 1, 1]
])

print(seriate(pdist(elements)))
# Output: [4, 3, 0, 2, 1]
```

The example above shows how we order 5 elements: `[3, 3, 3]`,
`[5, 5, 5]`, `[4, 4, 4]`, `[2, 2, 2]` and `[1, 1, 1]`. The result
is as expected:

1. `[1, 1, 1]`
2. `[2, 2, 2]`
3. `[3, 3, 3]`
4. `[4, 4, 4]`
5. `[5, 5, 5]`

`pdist` from [`scipy.spatial.distance`](https://docs.scipy.org/doc/scipy/reference/spatial.distance.html)
uses the Euclidean (L2) distance metric by default, so the distance between
`[x, x, x]` and `[x + 1, x + 1, x + 1]` is constant: √3. Any other distance
is bigger, so the optimal ordering is to list our elements in order of
increasing norm.

## Contributions
Contributions are very welcome and desired! Please follow the [code of conduct](doc/code_of_conduct.md)
and read the [contribution guidelines](doc/contributing.md).

## License
Apache-2.0, see [LICENSE.md](LICENSE.md).