{"id":37073378,"url":"https://github.com/lycosystem/lydata-package","last_synced_at":"2026-01-14T08:36:50.131Z","repository":{"id":299821236,"uuid":"1004309754","full_name":"lycosystem/lydata-package","owner":"lycosystem","description":"Python package for programmatic access to the lyDATA tables as well as utilities to handle the datasets.","archived":false,"fork":false,"pushed_at":"2025-09-04T09:23:02.000Z","size":580,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-04T10:37:20.252Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://lydata.readthedocs.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lycosystem.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-06-18T12:34:41.000Z","updated_at":"2025-09-04T09:31:50.000Z","dependencies_parsed_at":"2025-06-18T13:48:12.437Z","dependency_job_id":"dc7f6db4-88f5-481a-aebf-e59e37092b2c","html_url":"https://github.com/lycosystem/lydata-package","commit_stats":null,"previous_names":["lycosystem/lydata-package"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/lycosystem/lydata-package","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lycosystem%2Flydata-package","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lycosystem%2Flydata-package/tags","releases_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/repositories/lycosystem%2Flydata-package/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lycosystem%2Flydata-package/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lycosystem","download_url":"https://codeload.github.com/lycosystem/lydata-package/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lycosystem%2Flydata-package/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28414664,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T08:31:27.429Z","status":"ssl_error","status_checked_at":"2026-01-14T08:31:19.098Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-14T08:36:49.439Z","updated_at":"2026-01-14T08:36:50.100Z","avatar_url":"https://github.com/lycosystem.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Python Library for Loading and Manipulating lyDATA Tables\n\n[![Build](https://github.com/lycosystem/lydata-package/actions/workflows/release.yml/badge.svg)](https://github.com/lycosystem/lydata-package/actions/workflows/release.yml)\n[![Tests](https://github.com/lycosystem/lydata-package/actions/workflows/tests.yml/badge.svg)](https://github.com/lycosystem/lydata-package/actions/workflows/tests.yml)\n[![Documentation 
Status](https://readthedocs.org/projects/lydata/badge/?version=stable)](https://lydata.readthedocs.io/stable/?badge=stable)\n[![Coverage badge](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/lycosystem/lydata-package/python-coverage-comment-action-data/endpoint.json)](https://htmlpreview.github.io/?https://github.com/lycosystem/lydata-package/blob/python-coverage-comment-action-data/htmlcov/index.html)\n\nThis repository provides a Python library for loading, manipulating, and validating the datasets available on [lyDATA](https://github.com/lycosystem/lydata).\n\n\u003e [!WARNING]\n\u003e This Python library is still highly experimental!\n\u003e\n\u003e Also, it has recently been spun off from the repository of datasets, [lyDATA](https://github.com/lycosystem/lydata), and some things might still not work as expected.\n\n## Installation\n\n### 1. Install from PyPI\n\nYou can install the library from PyPI using pip:\n\n```bash\npip install lydata\n```\n\n### 2. Install from Source\n\nIf you want to install the library from source, you can clone the repository and install it using pip:\n\n```bash\ngit clone https://github.com/lycosystem/lydata-package\ncd lydata-package\npip install -e .\n```\n\n## Usage\n\nThe first and most common use case would probably be listing and loading the published datasets:\n\n```python\n\u003e\u003e\u003e import lydata\n\u003e\u003e\u003e for dataset_spec in lydata.available_datasets(\n...     year=2023,              # show all datasets added in 2023\n...     ref=\"61a17e\",           # may be some specific hash/tag/branch\n... ):\n...     print(dataset_spec.name)\n2023-clb-multisite\n2023-isb-multisite\n\n# return generator of datasets that include oropharyngeal tumor patients\n\u003e\u003e\u003e first_dataset = next(lydata.load_datasets(subsite=\"oropharynx\"))\n\u003e\u003e\u003e print(first_dataset.head())\n... # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE\n  patient                              ... 
positive_dissected\n        #                              ...             contra\n       id         institution     sex  ...                III   IV    V\n0    P011  Centre Léon Bérard    male  ...                0.0  0.0  0.0\n1    P012  Centre Léon Bérard  female  ...                0.0  0.0  0.0\n2    P014  Centre Léon Bérard    male  ...                0.0  0.0  NaN\n3    P015  Centre Léon Bérard    male  ...                0.0  0.0  NaN\n4    P018  Centre Léon Bérard    male  ...                NaN  NaN  NaN\n[5 rows x 82 columns]\n\n```\n\nSince the three-level header of the tables can be unwieldy, we also provide shortcodes via a custom pandas accessor. As soon as `lydata` is imported, it can be used like this:\n\n```python\n\u003e\u003e\u003e print(first_dataset.ly.age)\n... # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE\n0      67\n1      62\n      ...\n261    60\n262    60\nName: (patient, #, age), Length: 263, dtype: int64\n\n```\n\nWe have also implemented Django-inspired `Q` and `C` objects that allow easier querying of the tables:\n\n```python\n\u003e\u003e\u003e from lydata import C\n\n# select patients younger than 50 that are not HPV positive (includes NaNs)\n\u003e\u003e\u003e query_result = first_dataset.ly.query((C(\"age\") \u003c 50) \u0026 ~(C(\"hpv\") == True))\n\u003e\u003e\u003e (query_result.ly.age \u003c 50).all()\nnp.True_\n\u003e\u003e\u003e (query_result.ly.hpv == False).all()\nnp.True_\n\n```\n\nFor more details and further examples or use cases, have a look at the [official documentation](https://lydata.readthedocs.org/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flycosystem%2Flydata-package","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flycosystem%2Flydata-package","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flycosystem%2Flydata-package/lists"}