{"id":15048093,"url":"https://github.com/github/kustoschematools","last_synced_at":"2026-02-25T07:13:38.474Z","repository":{"id":185813881,"uuid":"673275529","full_name":"github/KustoSchemaTools","owner":"github","description":"This repository contains C# code to synchronize database schemas from Azure Data Explorer (Kusto) to yaml files and back.","archived":false,"fork":false,"pushed_at":"2025-01-23T10:29:52.000Z","size":228,"stargazers_count":12,"open_issues_count":6,"forks_count":6,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-01-30T07:42:37.956Z","etag":null,"topics":["kusto"],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/github.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":"SECURITY.md","support":"SUPPORT.md","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-01T09:10:14.000Z","updated_at":"2025-01-23T10:29:55.000Z","dependencies_parsed_at":null,"dependency_job_id":"79ca4650-72b2-460c-94a3-bf650cd0e962","html_url":"https://github.com/github/KustoSchemaTools","commit_stats":null,"previous_names":["github/kustoschematools"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/github%2FKustoSchemaTools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/github%2FKustoSchemaTools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/github%2FKustoSchemaTools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/github%2FKu
stoSchemaTools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/github","download_url":"https://codeload.github.com/github/KustoSchemaTools/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":237224879,"owners_count":19275102,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["kusto"],"created_at":"2024-09-24T21:08:03.686Z","updated_at":"2025-10-19T22:32:47.602Z","avatar_url":"https://github.com/github.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# KustoSchemaTools\n\nThis C# project provides functionality to work with schemas in Azure Data Explorer (Kusto). You can load a schema from yaml files or a database into the internal data structure. This can be used for creating diffs of two databases as scripts or markdown, and also to write it back to files or update schemas in a database.\n\nA second project \"[KustoSchemaToolsAction](https://github.com/github/KustoSchemaToolsAction)\" wraps that into a CLI tool inside a Docker container for usage in GitHub Actions.\n\n## Getting started\n\n### Database management\n\nThe `database` object holds all schema-related information for a Kusto database. It can be loaded from, or written to, a cluster using the `KustoDatabaseHandler`, which can be created by the `KustoDatabaseHandlerFactory`. Loading all relevant information from a Kusto database into the `database` object involves several steps. These are covered by different plugins, which can be configured for the `KustoDatabaseHandlerFactory`. 
\n\n```csharp\nvar dbFactory = new KustoDatabaseHandlerFactory(sp.GetService\u003cILogger\u003cKustoDatabaseHandler\u003e\u003e())\n    .WithPlugin\u003cKustoDatabasePrincipalLoader\u003e()\n    .WithPlugin\u003cKustoDatabaseRetentionAndCacheLoader\u003e()\n    .WithPlugin\u003cKustoTableBulkLoader\u003e()\n    .WithPlugin\u003cKustoFunctionBulkLoader\u003e()\n    .WithPlugin\u003cKustoMaterializedViewBulkLoader\u003e()\n    .WithPlugin\u003cDatabaseCleanup\u003e();\n```\n\nFor synchronizing it to files, the `YamlDatabaseHandler` and the `YamlDatabaseHandlerFactory` are the right tools. To avoid very large files, there are plugins that handle reading and writing functions, tables and materialized views to separate files and folders. They can be configured for the `YamlDatabaseHandlerFactory`.\n\n```csharp\nvar yamlFactory = new YamlDatabaseHandlerFactory()\n    .WithPlugin(new TablePlugin())\n    .WithPlugin(new FunctionPlugin())\n    .WithPlugin(new MaterializedViewsPlugin())\n    .WithPlugin\u003cDatabaseCleanup\u003e();\n```\n\nAdditional features can be added with custom plugins. A sample for `table groups`, where some parts of the schema are defined once but applied to several tables, can be found [here](https://github.com/github/KustoSchemaToolsAction/blob/main/KustoSchemaCLI/Plugins/TableGroupPlugin.cs).\n\nThe `KustoSchemaHandler` is the central place for syncing schemas between yaml and a database. It offers functions for generating changes formatted in markdown, writing a database to yaml files and applying changes from yaml files to a database.\n\n### Cluster configuration management\n\nCluster configuration changes are handled by the `KustoClusterOrchestrator`. Currently supported features include [`Capacity Policies`](https://learn.microsoft.com/en-us/kusto/management/capacity-policy?view=azure-data-explorer) and [`Workload Groups`](https://learn.microsoft.com/en-us/kusto/management/workload-groups?view=azure-data-explorer). 
The orchestrator expects a file path to a configuration file. A key design principle is that you only need to specify the properties you wish to set or change. Any property omitted in your policy file will be ignored, preserving its current value on the cluster.\nA sample file could look like this:\n\n```yaml\nconnections:\n- name: test\n  url: test.eastus\n  capacityPolicy:\n    ingestionCapacity:\n      clusterMaximumConcurrentOperations: 512\n      coreUtilizationCoefficient: 0.75\n    extentsMergeCapacity:\n      minimumConcurrentOperationsPerNode: 1\n      maximumConcurrentOperationsPerNode: 3\n    extentsPurgeRebuildCapacity:\n      maximumConcurrentOperationsPerNode: 1\n  workloadGroups:\n  - workloadGroupName: DataScience\n    workloadGroupPolicy:\n      requestRateLimitsEnforcementPolicy:\n        commandsEnforcementLevel: Cluster\n```\n\nThe `KustoClusterOrchestrator` coordinates between cluster handlers to manage cluster configuration changes:\n\n1. **Loading Configuration**: Uses `YamlClusterHandler` to parse the YAML configuration file and load the desired cluster state\n2. **Reading Current State**: Uses `KustoClusterHandler` to connect to each live cluster and retrieve the current capacity policy and workload group settings\n3. **Generating Changes**: Compares the desired state (from YAML) with the current state (from Kusto) to identify differences\n4. **Creating Scripts**: Generates the necessary Kusto control commands (like `.alter-merge cluster policy capacity` and `.create-or-alter workload_group`) to apply the changes\n5. **Applying Updates**: Executes the generated scripts against the live clusters to synchronize them with the desired configuration\n\nCurrently no plugins are supported. 
The orchestrator expects all cluster configuration in a central file.\n\n## Supported Features\n\nThe following features are currently supported:\n\n* Cluster\n    * Capacity Policies\n    * Workload Groups\n* Database\n    * Permissions\n    * Default Retention\n    * Default Hot Cache\n* Tables\n    * Columns\n    * Retention\n    * HotCache\n    * Update Policies\n    * Docstring\n    * Folder\n* Functions\n    * Body\n    * Docstring\n    * Folder\n    * Preformatted\n* Materialized Views\n    * Query\n    * Retention\n    * HotCache\n    * Docstring\n    * Folder\n    * Preformatted\n* External Tables (managed identity/impersonation only)\n    * Storage / Delta / SQL\n    * Folder\n    * Docstring\n* Continuous Exports\n* Entity Groups\n* Deleting existing items using deletions in the database definition\n    * Tables\n    * Columns\n    * Functions\n    * Materialized Views\n    * External Tables\n    * Continuous Exports\n\nThe `DatabaseCleanup` plugin removes redundant retention and hot cache definitions.\nIt also pretty-prints KQL queries in functions (unless the `preformatted` feature is used), update policies, materialized views and continuous exports.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgithub%2Fkustoschematools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgithub%2Fkustoschematools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgithub%2Fkustoschematools/lists"}