https://github.com/gibsramen/mtdist
Python package to perform mixed-type distance calculations
- Host: GitHub
- URL: https://github.com/gibsramen/mtdist
- Owner: gibsramen
- License: bsd-3-clause
- Created: 2019-12-11T01:17:17.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2019-12-18T04:27:01.000Z (over 5 years ago)
- Last Synced: 2025-01-19T20:37:49.957Z (4 months ago)
- Topics: categorical-data, clustering, data-science, distance-matrix
- Language: Python
- Size: 16.6 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# mtdist
**mtdist** is a Python package for computing **m**ixed-**t**ype **dist**ance metrics on high-dimensional data. Standard distance metrics (Euclidean, Manhattan, etc.) are usually restricted to numeric data, which makes it difficult to combine them with other data types (categorical, ordinal, etc.). Distance metrics built to handle mixed-type data, such as Gower distance [1], are often available only in R, which adds to the burden on data scientists working in Python.
mtdist aims to bring these valuable distance metrics to Python, allowing researchers to more easily analyze mixed-type data for clustering, visualization, and more.
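To illustrate what a mixed-type metric computes, here is a minimal NumPy sketch of an unweighted Gower distance [1]. It is not mtdist's API: the function name `gower_distance` and its two-array signature are assumptions made for this example. Numeric features contribute the absolute difference scaled by the feature's range, categorical features contribute 0 on a match and 1 otherwise, and the result is the mean over all features.

```python
import numpy as np


def gower_distance(numeric, categorical):
    """Pairwise unweighted Gower distance between n samples.

    numeric:     (n, p) array of numeric features
    categorical: (n, q) array of category labels

    Hypothetical helper for illustration only; not part of mtdist.
    """
    numeric = np.asarray(numeric, dtype=float)
    categorical = np.asarray(categorical)

    # Numeric part: absolute difference scaled by each feature's range.
    ranges = numeric.max(axis=0) - numeric.min(axis=0)
    ranges[ranges == 0] = 1.0  # constant features contribute zero distance
    num_diff = np.abs(numeric[:, None, :] - numeric[None, :, :]) / ranges

    # Categorical part: 0 when the labels match, 1 otherwise.
    cat_diff = (categorical[:, None, :] != categorical[None, :, :]).astype(float)

    # Unweighted mean over all p + q features.
    n_features = numeric.shape[1] + categorical.shape[1]
    return (num_diff.sum(axis=2) + cat_diff.sum(axis=2)) / n_features


# Three samples with two numeric features and one categorical feature.
numeric = [[1.0, 10.0], [3.0, 20.0], [5.0, 10.0]]
categorical = [["a"], ["a"], ["b"]]
D = gower_distance(numeric, categorical)
print(D.round(3))
```

The result is a symmetric matrix with a zero diagonal and values in [0, 1]. It can be passed to any clustering or embedding routine that accepts a precomputed distance matrix.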
## References
[1] Gower, J. C. "A General Coefficient of Similarity and Some of Its Properties." _Biometrics_ 27, no. 4 (1971): 857-71. doi:10.2307/2528823.