{"id":30307102,"url":"https://github.com/biolink/resource-ingest-guide-schema","last_synced_at":"2025-08-17T10:15:07.759Z","repository":{"id":309430268,"uuid":"1036247250","full_name":"biolink/resource-ingest-guide-schema","owner":"biolink","description":"A LinkML schema for describing the scope, rationale, and modeling approach for ingesting content from a single source.","archived":false,"fork":false,"pushed_at":"2025-08-11T22:05:49.000Z","size":926,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-11T22:12:07.633Z","etag":null,"topics":["etl","linkml","model","schema"],"latest_commit_sha":null,"homepage":"https://biolink.github.io/resource-ingest-guide-schema/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/biolink.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-11T19:30:10.000Z","updated_at":"2025-08-11T22:05:14.000Z","dependencies_parsed_at":"2025-08-11T22:24:00.027Z","dependency_job_id":null,"html_url":"https://github.com/biolink/resource-ingest-guide-schema","commit_stats":null,"previous_names":["biolink/resource-ingest-guide-schema"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/biolink/resource-ingest-guide-schema","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/biolink%2Fresource-ingest-guide-schema","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/biolink%2Fresource-ingest-guide-schema/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/biolink%2Fresource-ingest-guide-schema/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/biolink%2Fresource-ingest-guide-schema/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/biolink","download_url":"https://codeload.github.com/biolink/resource-ingest-guide-schema/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/biolink%2Fresource-ingest-guide-schema/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270833769,"owners_count":24653818,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-17T02:00:09.016Z","response_time":129,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["etl","linkml","model","schema"],"created_at":"2025-08-17T10:15:05.616Z","updated_at":"2025-08-17T10:15:07.749Z","avatar_url":"https://github.com/biolink.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Resource Ingest Guide Schema\n\nA LinkML schema for describing Reference Ingest Guides (RIGs) - structured documents that capture the scope, rationale, and modeling approach for ingesting content from external sources into Biolink Model-compliant data repositories.\n\n## Overview\n\nThis repository provides:\n\n- **LinkML Schema**: Formal specification for Reference Ingest Guides in `src/resource_ingest_guide_schema/schema/`\n- **Documentation Generator**: Automated conversion of RIG YAML files to human-readable markdown\n- **Validation Tools**: Schema validation for RIG files using LinkML\n- **Template System**: Standardized templates and creation tools for new RIGs\n- **Example RIGs**: Real-world examples from CTD, DISEASES, and Clinical Trials KP\n\n### What are Reference Ingest Guides (RIGs)?\n\nRIGs are structured documents that describe:\n\n- **Source Information**: Details about data sources (access, formats, licensing)\n- **Ingest Information**: What content is included/excluded and filtering rationale\n- **Target Information**: How data is modeled in the output knowledge graph\n- **Provenance Information**: Contributors and related artifacts\n\nRIGs help ensure reproducible, well-documented data ingestion processes for biomedical knowledge graphs.\n\n## Website\n\n[https://biolink.github.io/resource-ingest-guide-schema](https://biolink.github.io/resource-ingest-guide-schema)\n\n## Repository Structure\n\n```\n├── src/\n│   ├── resource_ingest_guide_schema/\n│   │   └── schema/                    # LinkML schema definition\n│   ├── docs/\n│   │   ├── files/                     # Static documentation files\n│   │   ├── rigs/                      # Example RIG YAML files\n│   │   └── doc-templates/             # Jinja2 templates for docs\n│   └── scripts/                       # Python utilities for RIG processing\n├── docs/                              # Generated documentation\n├── tests/                             # Test suite\n└── project/                           # Generated LinkML artifacts\n```\n\n\n\n## Developer Documentation\n\n### Prerequisites\n\nThis project uses [uv](https://docs.astral.sh/uv/) for dependency management. Install it with:\n\n```bash\n# On macOS and Linux\ncurl -LsSf https://astral.sh/uv/install.sh | sh\n\n# On Windows\npowershell -c \"irm https://astral.sh/uv/install.ps1 | iex\"\n\n# Or with pip\npip install uv\n```\n\n### Getting Started\n\n1. **Install dependencies:**\n   ```bash\n   uv sync --extra dev\n   ```\n\n2. **Run tests:**\n   ```bash\n   make test\n   ```\n\n3. **Generate documentation:**\n   ```bash\n   make gendoc\n   ```\n\n4. **Create a new RIG:**\n   ```bash\n   make new-rig INFORES=infores:example NAME=\"Example Data Source\"\n   ```\n\n### Working with RIGs\n\n#### Creating a New RIG\n\n```bash\n# Create a new RIG from the template\nmake new-rig INFORES=infores:mydatasource NAME=\"My Data Source RIG\"\n\n# This creates src/docs/rigs/mydatasource_rig.yaml\n# Edit the file to fill in your specific information\n```\n\n#### Validating RIGs\n\n```bash\n# Validate all RIG files against the schema\nmake validate-rigs\n\n# Validate a specific RIG\nuv run linkml-validate --schema src/resource_ingest_guide_schema/schema/resource_ingest_guide_schema.yaml src/docs/rigs/my_rig.yaml\n```\n\n#### Building Documentation\n\n```bash\n# Generate all documentation including RIG index and markdown versions\nmake gendoc\n\n# Test documentation locally\nmake testdoc  # Builds docs and starts local server\n```\n\n### Development Workflow\n\n#### 1. Schema Development\n\nThe LinkML schema is defined in `src/resource_ingest_guide_schema/schema/resource_ingest_guide_schema.yaml`. After making changes:\n\n```bash\n# Regenerate Python datamodel and other artifacts\nmake gen-project\n\n# Test the schema\nmake test-schema\n\n# Lint the schema\nmake lint\n```\n\n#### 2. Script Development\n\nPython utilities are in `src/scripts/`:\n- `create_rig.py`: Generate new RIG from template\n- `rig_to_markdown.py`: Convert RIG YAML to markdown\n- `generate_rig_index.py`: Create RIG index table\n\nTo test script changes:\n```bash\n# Run scripts directly\nuv run python src/scripts/create_rig.py --help\nuv run python src/scripts/rig_to_markdown.py --input-dir src/docs/rigs --output-dir docs\n```\n\n#### 3. Documentation Development\n\nTemplates are in `src/docs/doc-templates/` and static files in `src/docs/files/`:\n\n```bash\n# Regenerate docs after template changes\nmake gendoc\n\n# View changes locally\nmake serve  # or make testdoc\n```\n\n### Available Commands\n\n| Command | Description |\n|---------|-------------|\n| `make help` | Show all available commands |\n| `make install` | Install dependencies with uv |\n| `make test` | Run full test suite |\n| `make test-schema` | Test schema generation |\n| `make test-python` | Run Python tests |\n| `make lint` | Lint the LinkML schema |\n| `make gen-project` | Generate LinkML artifacts (Python, JSON Schema, etc.) |\n| `make gendoc` | Generate documentation including RIG processing |\n| `make serve` | Start local documentation server |\n| `make testdoc` | Build docs and start server |\n| `make new-rig` | Create new RIG (requires INFORES and NAME) |\n| `make validate-rigs` | Validate all RIG files |\n| `make clean` | Clean generated files |\n| `make deploy` | Deploy documentation |\n\n### Project Structure Details\n\n#### Key Directories\n\n- **`src/resource_ingest_guide_schema/schema/`**: LinkML schema definition\n- **`src/docs/rigs/`**: Example RIG YAML files (CTD, DISEASES, Clinical Trials KP)\n- **`src/docs/files/`**: Static documentation files copied to output\n- **`src/docs/doc-templates/`**: Jinja2 templates for documentation generation\n- **`src/scripts/`**: Python utilities for RIG creation and processing\n- **`docs/`**: Generated documentation output (do not edit directly)\n- **`project/`**: Generated LinkML artifacts (Python models, JSON Schema, etc.)\n\n#### Generated Artifacts\n\nThe `make gen-project` command generates:\n- **Python datamodel**: `src/resource_ingest_guide_schema/datamodel/`\n- **JSON Schema**: `project/jsonschema/`\n- **OWL ontology**: `project/owl/`\n- **GraphQL schema**: `project/graphql/`\n- **SQL DDL**: `project/sqlschema/`\n- **And more**: See `project/` directory\n\n### Contributing\n\n1. Fork the repository\n2. Create a feature branch\n3. Make changes following the existing patterns\n4. Ensure tests pass: `make test`\n5. Update documentation if needed: `make gendoc`\n6. Submit a pull request\n\n#### Adding New RIG Examples\n\n1. Create YAML file in `src/docs/rigs/`\n2. Follow the schema structure (see existing examples)\n3. Validate: `make validate-rigs`\n4. Regenerate docs: `make gendoc`\n5. The RIG will automatically appear in the documentation index\n\n#### Schema Changes\n\n1. Modify `src/resource_ingest_guide_schema/schema/resource_ingest_guide_schema.yaml`\n2. Regenerate artifacts: `make gen-project`\n3. Update any affected RIG files\n4. Test: `make test`\n5. Update documentation as needed\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbiolink%2Fresource-ingest-guide-schema","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbiolink%2Fresource-ingest-guide-schema","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbiolink%2Fresource-ingest-guide-schema/lists"}