{"id":19958040,"url":"https://github.com/echemdb/metadata-schema","last_synced_at":"2026-03-13T11:01:34.071Z","repository":{"id":37975495,"uuid":"412340070","full_name":"echemdb/metadata-schema","owner":"echemdb","description":"Metadata schema describing electrochemical data","archived":false,"fork":false,"pushed_at":"2026-03-02T17:01:37.000Z","size":861,"stargazers_count":2,"open_issues_count":30,"forks_count":2,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-03-02T20:41:27.592Z","etag":null,"topics":["electrochemistry","json","json-schema","metadata","schema"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/echemdb.png","metadata":{"files":{"readme":"readme.md","changelog":"ChangeLog","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-10-01T05:29:23.000Z","updated_at":"2026-03-02T17:01:59.000Z","dependencies_parsed_at":"2026-03-02T19:04:17.417Z","dependency_job_id":null,"html_url":"https://github.com/echemdb/metadata-schema","commit_stats":{"total_commits":89,"total_committers":3,"mean_commits":"29.666666666666668","dds":0.2471910112359551,"last_synced_commit":"14cb563af3adb728fd552847047c9077fa282ed9"},"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/echemdb/metadata-schema","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echemdb%2Fmetadata-schema","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echemdb%2Fmeta
data-schema/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echemdb%2Fmetadata-schema/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echemdb%2Fmetadata-schema/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/echemdb","download_url":"https://codeload.github.com/echemdb/metadata-schema/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/echemdb%2Fmetadata-schema/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30466310,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-13T11:00:43.441Z","status":"ssl_error","status_checked_at":"2026-03-13T11:00:23.173Z","response_time":60,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["electrochemistry","json","json-schema","metadata","schema"],"created_at":"2024-11-13T01:39:54.931Z","updated_at":"2026-03-13T11:01:34.062Z","avatar_url":"https://github.com/echemdb.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Metadata Schema\n\nDevelopment of a metadata schema for experimental data, specifically electrochemical and electrocatalytic data.\n\n## Install\n\nInstall [pixi](https://pixi.sh) and get a copy of the metadata-schema:\n\n```sh\ngit clone 
https://github.com/echemdb/metadata-schema.git\ncd metadata-schema\n```\n\n## CLI\n\n### Flatten metadata to Excel/CSV\n\nThe `mdstools` package provides tools to flatten nested YAML metadata into tabular Excel/CSV formats with optional schema-based enrichment (descriptions and examples from JSON schemas).\n\nFlatten a YAML file to enriched Excel and CSV:\n\n```sh\nmdstools flatten tests/example_metadata.yaml\n```\n\nThis creates three files in `generated/`:\n- `example_metadata.csv` - Flat CSV with all metadata\n- `example_metadata.xlsx` - Single-sheet Excel file\n- `example_metadata_sheets.xlsx` - Multi-sheet Excel (one sheet per top-level key)\n\nAll exported files include `Description` and `Example` columns populated from the JSON schemas, making it easier for users to understand and fill out the metadata templates.\n\n#### Options\n\n```sh\nmdstools flatten \u003cyaml_file\u003e [--schema-dir DIR] [--output-dir DIR] [--no-enrichment]\n```\n\n- `--schema-dir` - Directory with JSON schemas (default: `schemas`)\n- `--output-dir` - Output directory (default: `generated`)\n- `--no-enrichment` - Disable enrichment (no Description/Example columns)\n\n### Unflatten Excel/CSV back to YAML\n\n```sh\nmdstools unflatten generated/example_metadata.xlsx --schema-file schemas/minimum_echemdb.json\n```\n\n\u003e **Note**: All CLI commands can also be run via pixi, e.g., `pixi run flatten ...` and `pixi run unflatten ...`.\n\n## Python API\n\nThe `mdstools` package can also be used programmatically:\n\n```python\nfrom mdstools.metadata.metadata import Metadata\nfrom mdstools.metadata.enriched_metadata import EnrichedFlattenedMetadata\n\n# Load YAML metadata\nmetadata = Metadata.from_yaml('metadata.yaml')\n\n# Flatten to tabular format\nflattened = metadata.flatten()\n\n# Add schema enrichment (descriptions and examples)\nenriched = EnrichedFlattenedMetadata(flattened.rows, schema_dir='schemas')\n\n# Get enriched DataFrame\ndf = enriched.to_pandas()\n\n# Export to various 
formats\nenriched.to_csv('output.csv')\nenriched.to_excel('output.xlsx')\nenriched.to_excel('output_multi.xlsx', separate_sheets=True)  # One sheet per top-level key\nenriched.to_markdown('output.md')\n```\n\nYou can also load a flat Excel/CSV file, reconstruct the nested dict, and\noptionally write YAML. This workflow expects columns named `Number`, `Key`,\nand `Value` and is intended for unflattening back to dict/YAML.\nAn enriched Excel file can also be loaded.\n\n```python\nfrom mdstools.metadata.flattened_metadata import FlattenedMetadata\n\nflattened = FlattenedMetadata.from_excel(\"generated/example_metadata.xlsx\")\nmetadata = flattened.unflatten()\n\ndata = metadata.data  # Nested dict\nmetadata.to_yaml(\"generated/example_metadata.yaml\")\n```\n\n## Developer\n\n### Run tests\n\n```sh\npixi run test              # Run all tests\npixi run doctest           # Run doctests only\npixi run test-comprehensive # Run integration tests only\n```\n\nOr run all of the above at once:\n\n```sh\npixi run -e dev test-all\n```\n\n### Generate schemas from LinkML\n\nGenerate JSON schemas and Pydantic models from the LinkML definitions in `linkml/`:\n\n```sh\npixi run generate-schemas        # JSON Schema only\npixi run generate-models          # Pydantic models only\npixi run generate-all             # Both\n```\n\nThe generated JSON schemas are written to `schemas/`.\n\nAfter intentional changes to LinkML files, update the expected baseline files:\n\n```sh\npixi run update-expected-schemas\n```\n\n### Validate schema files\n\nTo validate the example files against the JSON schemas:\n\n```sh\npixi run validate              # Run all validations\npixi run validate-objects      # Validate individual object examples\npixi run validate-file-schemas # Validate file-level YAML examples\npixi run validate-package-schemas  # Validate package JSON examples\npixi run check-naming          # Enforce naming conventions\n```\n\nPackage schema validation requires the Frictionless Data Package standard\nschemas.  
They are **downloaded automatically on first run** into\n`schemas/frictionless/` (gitignored) and cached for subsequent offline use.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fechemdb%2Fmetadata-schema","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fechemdb%2Fmetadata-schema","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fechemdb%2Fmetadata-schema/lists"}