{"id":19456803,"url":"https://github.com/aidos-lab/mantra","last_synced_at":"2025-04-25T05:31:12.691Z","repository":{"id":242705473,"uuid":"809792062","full_name":"aidos-lab/mantra","owner":"aidos-lab","description":"MANTRA: The Manifold Triangulations Assemblage (A dataset of manifold triangulations)","archived":false,"fork":false,"pushed_at":"2025-04-22T07:49:12.000Z","size":26011,"stargazers_count":4,"open_issues_count":2,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-22T08:24:29.115Z","etag":null,"topics":["dataset","graph-dataset","graph-learning","iclr2025","topological-data-analysis","topological-deep-learning"],"latest_commit_sha":null,"homepage":"https://aidos.group/mantra/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aidos-lab.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-06-03T13:08:31.000Z","updated_at":"2025-04-22T07:49:10.000Z","dependencies_parsed_at":"2024-06-10T10:01:13.534Z","dependency_job_id":"6eb63ec6-55b8-40ea-8c72-afd18b381249","html_url":"https://github.com/aidos-lab/mantra","commit_stats":null,"previous_names":["aidos-lab/mantra","aidos-lab/mantradataset"],"tags_count":19,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aidos-lab%2Fmantra","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aidos-lab%2Fmantra/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aidos-lab%2Fmantra/relea
ses","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aidos-lab%2Fmantra/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aidos-lab","download_url":"https://codeload.github.com/aidos-lab/mantra/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250205691,"owners_count":21392102,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataset","graph-dataset","graph-learning","iclr2025","topological-data-analysis","topological-deep-learning"],"created_at":"2024-11-10T17:18:31.052Z","updated_at":"2025-04-25T05:31:07.682Z","avatar_url":"https://github.com/aidos-lab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# MANTRA: Manifold Triangulations Assemblage\n\n[![Maintainability](https://api.codeclimate.com/v1/badges/82f86d7e2f0aae342055/maintainability)](https://codeclimate.com/github/aidos-lab/MANTRA/maintainability) ![GitHub contributors](https://img.shields.io/github/contributors/aidos-lab/MANTRA) ![GitHub](https://img.shields.io/github/license/aidos-lab/MANTRA) \n\n![image](_static/manifold_triangulation_orbit.gif)\n\n## Getting the Dataset\n\nThe raw MANTRA dataset, consisting of the 2- and 3-manifolds with up to $10$ vertices, \nis provided [here](https://github.com/aidos-lab/mantra/releases/latest). \nFor machine learning applications and research, we provide a custom [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/stable/) dataset in the form of a Python package. 
\n\nThe package can be installed via pip:\n\n```shell\npip install mantra-dataset\n```\n\nAfter installation, the dataset can be used with the following snippet.\n\n```python\nfrom mantra.datasets import ManifoldTriangulations\n\ndataset = ManifoldTriangulations(root=\"./data\", manifold=\"2\", version=\"latest\")\n```\n\n## Folder Structure\n\n## Data Format\n\n\u003e This section is mostly *information-oriented* and provides a brief\n\u003e overview of the data format, followed by a short [example](#example).\n\nEach dataset consists of a list of triangulations, with each\ntriangulation having the following attributes:\n\n* `id` (required, `str`): This attribute refers to the original ID of\n  the triangulation as used by the creator of the dataset (see\n  [below](#acknowledgments)). This facilitates comparisons to the\n  original dataset if necessary.\n\n* `triangulation` (required, `list` of `list` of `int`): A doubly-nested\n  list of the top-level simplices of the triangulation.\n\n* `n_vertices` (required, `int`): The number of vertices in the\n  triangulation. This is **not** the number of simplices.\n\n* `name` (required, `str`): A canonical name of the triangulation, such\n  as `S^2` for the two-dimensional [sphere](https://en.wikipedia.org/wiki/N-sphere).\n  If no canonical name exists, we store an empty string.\n\n* `betti_numbers` (required, `list` of `int`): A list of the [Betti\n  numbers](https://en.wikipedia.org/wiki/Betti_number) of the\n  triangulation, computed using $Z$ coefficients. 
This implies that\n  [torsion](https://en.wikipedia.org/wiki/Homology_(mathematics))\n  coefficients are stored in another attribute.\n\n* `torsion_coefficients` (required, `list` of `str`): A list of the\n  [torsion\n  coefficients](https://en.wikipedia.org/wiki/Homology_(mathematics)) of\n  the triangulation. An empty string `\"\"` indicates that no torsion\n  coefficients are available in that dimension. Otherwise, the original\n  spelling of torsion coefficients is retained, so a valid entry might\n  be `\"Z_2\"`. \n\n* `genus` (optional, `int`): For 2-manifolds, contains the\n  [genus](https://en.wikipedia.org/wiki/Genus_(mathematics)) of the\n  triangulation.\n\n* `orientable` (optional, `bool`): Specifies whether the triangulation\n  is [orientable](https://en.wikipedia.org/wiki/Orientability) or not.\n\n### Example\n\n```json\n[\n  {\n    \"id\": \"manifold_2_4_1\",\n    \"triangulation\": [\n      [1,2,3],\n      [1,2,4],\n      [1,3,4],\n      [2,3,4]\n    ],\n    \"dimension\": 2,\n    \"n_vertices\": 4,\n    \"betti_numbers\": [\n      1,\n      0,\n      1\n    ],\n    \"torsion_coefficients\": [\n      \"\",\n      \"\",\n      \"\"\n    ],\n    \"name\": \"S^2\",\n    \"genus\": 0,\n    \"orientable\": true\n  },\n  {\n    \"id\": \"manifold_2_5_1\",\n    \"triangulation\": [\n      [1,2,3],\n      [1,2,4],\n      [1,3,5],\n      [1,4,5],\n      [2,3,4],\n      [3,4,5]\n    ],\n    \"dimension\": 2,\n    \"n_vertices\": 5,\n    \"betti_numbers\": [\n      1,\n      0,\n      1\n    ],\n    \"torsion_coefficients\": [\n      \"\",\n      \"\",\n      \"\"\n    ],\n    \"name\": \"S^2\",\n    \"genus\": 0,\n    \"orientable\": true\n  }\n]\n```\n\n### Design Decisions\n\n\u003e This section is *understanding-oriented* and provides additional\n\u003e justifications for our data format.\n\nThe datasets are converted from their original (mixed) lexicographical\nformat. 
A triangulation in lexicographical format could look like this:\n\n```\nmanifold_lex_d2_n6_#1=[[1,2,3],[1,2,4],[1,3,4],[2,3,5],[2,4,5],[3,4,6],\n  [3,5,6],[4,5,6]]\n```\n\nA triangulation in *mixed* lexicographical format could look like this:\n\n```\nmanifold_2_6_1=[[1,2,3],[1,2,4],[1,3,5],[1,4,6],\n  [1,5,6],[2,3,4],[3,4,5],[4,5,6]]\n```\n\nThis format is **hard to parse**. Moreover, any *additional* information\nabout the triangulations, including information about homology groups or\norientability, for instance, requires additional files.\n\nWe thus decided to use a format that permits us to keep everything in\none place, including any additional attributes for a specific\ntriangulation. A desirable data format needs to satisfy the following\nproperties:\n\n1. It should be easy to parse and modify, ideally in a number of\n   programming languages.\n\n2. It should be human-readable and `diff`-able in order to permit\n   simplified comparisons.\n\n3. It should scale reasonably well to larger triangulations.\n\nAfter some consideration, we opted for `gzip`-compressed JSON\nfiles. [JSON](https://www.json.org) is well-specified and supported in\nvirtually all major programming languages out of the box. While the\ncompressed file is *not* human-readable on its own, the uncompressed\nversion can easily be used for additional data analysis tasks. This also\ngreatly simplifies maintenance operations on the dataset. While it can\nbe argued that there are formats that scale even better, they are\nnot well suited to our use case since triangulations\ntypically differ in their number of top-level simplices. This\nrules out column-based formats like [Parquet](https://parquet.apache.org/).\n\nWe are open to revisiting this decision in the future.\n\nAs for the *storage* of the data as such, we decided to keep only the\ntop-level simplices (as is done in the original format) since this\nsubstantially saves disk space. 
The drawback is that the client has to\nsupply the remainder of the triangulation. Given that the triangulations\nin our dataset are not too large, we deem this to be an acceptable\ncompromise. Moreover, data structures such as [simplex\ntrees](https://en.wikipedia.org/wiki/Simplex_tree) can be used to\nfurther improve scalability if necessary.\n\nThe decision to keep only top-level simplices is **final**.\n\nFinally, our data format includes, whenever possible and available,\nadditional information about a triangulation, including the [Betti\nnumbers](https://en.wikipedia.org/wiki/Betti_number) and a *name*,\ni.e., a canonical description, of the topological space described\nby the triangulation. We opted to minimize any inconvenience that\nwould arise from having to perform additional parsing operations.\n\nPlease use the following citation for our work:\n\n```bibtex\n@misc{ballester2024mantramanifoldtriangulationsassemblage,\n      title={ {MANTRA}: {T}he {M}anifold {T}riangulations {A}ssemblage}, \n      author={Rub{\\'e}n Ballester and Ernst R{\\\"o}ell and Daniel Bin Schmid and Mathieu Alain and Sergio Escalera and Carles Casacuberta and Bastian Rieck},\n      year={2024},\n      eprint={2410.02392},\n      archivePrefix={arXiv},\n      primaryClass={cs.LG},\n      url={https://arxiv.org/abs/2410.02392}, \n}\n```\n\n## Acknowledgments\n\nThis work is dedicated to [Frank H. Lutz](https://www3.math.tu-berlin.de/IfM/Nachrufe/Frank_Lutz/stellar/),\nwho passed away unexpectedly on November 10, 2023. May his memory be\na blessing.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faidos-lab%2Fmantra","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faidos-lab%2Fmantra","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faidos-lab%2Fmantra/lists"}