{"id":18972371,"url":"https://github.com/oskar11120/apache.druid.querying","last_synced_at":"2025-08-21T15:10:40.457Z","repository":{"id":223682991,"uuid":"758714645","full_name":"oskar11120/Apache.Druid.Querying","owner":"oskar11120","description":"Apache Druid client library for dotnet 8+.","archived":false,"fork":false,"pushed_at":"2025-02-15T16:54:14.000Z","size":2792,"stargazers_count":3,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-06-28T03:51:52.563Z","etag":null,"topics":["client","dotnet","druid","druid-io","http","query","query-builder"],"latest_commit_sha":null,"homepage":"https://www.nuget.org/packages/Apache.Druid.Querying","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oskar11120.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-02-16T22:46:04.000Z","updated_at":"2025-06-06T18:57:02.000Z","dependencies_parsed_at":"2024-03-09T16:43:23.537Z","dependency_job_id":"28e693b6-87f5-4ceb-9f18-2a5bde24963a","html_url":"https://github.com/oskar11120/Apache.Druid.Querying","commit_stats":null,"previous_names":["oskar11120/apache.druid.querying"],"tags_count":19,"template":false,"template_full_name":null,"purl":"pkg:github/oskar11120/Apache.Druid.Querying","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oskar11120%2FApache.Druid.Querying","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oskar11120%2FApache.Druid.Querying/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/oskar11120%2FApache.Druid.Querying/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oskar11120%2FApache.Druid.Querying/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oskar11120","download_url":"https://codeload.github.com/oskar11120/Apache.Druid.Querying/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oskar11120%2FApache.Druid.Querying/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262385320,"owners_count":23302797,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["client","dotnet","druid","druid-io","http","query","query-builder"],"created_at":"2024-11-08T15:08:17.509Z","updated_at":"2025-07-14T11:06:47.784Z","avatar_url":"https://github.com/oskar11120.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# [Apache Druid](http://druid.io/) client library/orm for dotnet 8+ inspired by [EF Core](https://learn.microsoft.com/pl-pl/ef/core/).\n\nhttps://www.nuget.org/packages/Apache.Druid.Querying\n\n## Setup\nTo make your Druid data sources available for querying create a class deriving from `Apache.Druid.Querying.DataSourceProvider`. The class represents collection of data sources available for querying similarly to how `EfCore`'s `DbContext` represents collection of database tables. 
The class contains the methods `Table`, `Lookup` and `Inline`, which you can use to create instances of `Apache.Druid.Querying.DataSource` (similar to `EfCore`'s `DbSet`), which in turn can be used for querying. The instances are thread-safe, so they can be used to execute multiple queries at the same time. Some of the `DataSource`-creating methods require the parameter `id`, which corresponds to the id of the related `Druid` data source.\n\nThe method `Table` additionally requires the generic parameter `TSource` depicting a row of your table data, similarly to how `EfCore`'s `Entities` depict database rows. The type's public properties correspond to the data source columns.\n\nBy default `TSource` property names map 1-to-1 to `Druid` data source column names. This can be overridden in two ways:\n- By decorating `TSource` with an `Apache.Druid.Querying.DataSourceColumnNamingConvention` attribute. The convention will be applied to all of `TSource`'s property names.\n- By decorating `TSource`'s properties with the `Apache.Druid.Querying.DataSourceColumn` attribute. The string parameter passed to the attribute will become the data source column name. 
As most `Druid` data sources contain column `__time` for convenience there exists attribute `Apache.Druid.Querying.DataSourceTimeColumn` equivalent to `Apache.Druid.Querying.DataSourceColumn(\"__time\")`.\n\n```cs\n    [DataSourceColumnNamingConvention.CamelCase]\n    public record Edit(\n        [property: DataSourceTimeColumn] DateTimeOffset Timestamp,\n        bool IsRobot,\n        string Channel,\n        string Flags,\n        bool IsUnpatrolled,\n        string Page,\n        [property: DataSourceColumn(\"diffUrl\")] string DiffUri,\n        int Added,\n        string Comment,\n        int CommentLength,\n        bool IsNew,\n        bool IsMinor,\n        int Delta,\n        bool IsAnonymous,\n        string User,\n        int DeltaBucket,\n        int Deleted,\n        string Namespace,\n        string CityName,\n        string CountryName,\n        string? RegionIsoCode,\n        int? MetroCode,\n        string? CountryIsoCode,\n        string? RegionName);\n\n    public class WikipediaDataSourceProvider : DataSourceProvider\n    {\n        public WikipediaDataSourceProvider()\n        {\n            // Druid's example wikipedia edits data source.\n            Edits = Table\u003cEdit\u003e(\"wikipedia\");\n        }\n\n        public DataSource\u003cEdit\u003e Edits { get; }\n    }\n```\n\nThen connect up your data source provider to a dependency injection framework of your choice:\n- [Microsoft.Extensions.DependencyInjection](Apache.Druid.Querying.Microsoft.Extensions.DependencyInjection/README.md)\n\n## Querying\nChoose query type and models representing query's data using nested types of `Apache.Druid.Querying.Query\u003cTSource\u003e`. Create a query by instantiating chosen nested type. Set query data by calling the instance methods. 
The methods often accept an `Expression\u003cDelegate\u003e`. In it, given an object representing the input data available at that point in a query and an object representing all possible operations on that input data, you create an object representing the results of your chosen operations. To get an idea of what's possible, it's best to look into the project's tests.\n\nGet the query's json representation (to be sent to Druid upon query execution) by calling `Apache.Druid.Querying.DataSource\u003cTSource\u003e.MapQueryToJson`. Execute the query by calling `Apache.Druid.Querying.DataSource\u003cTSource\u003e.ExecuteQuery`.\n\nAvailable query types:\n- TimeSeries\n- TopN\n- GroupBy\n- Scan\n- SegmentMetadata\n- DataSourceMetadata\n\n```cs\n    // Getting DataSourceProvider from dependency injection container.\n    private static WikipediaDataSourceProvider Wikipedia \n        =\u003e Services.GetRequiredService\u003cWikipediaDataSourceProvider\u003e();\n\n    private record Aggregations(int Count, int TotalAdded);\n    private record PostAggregations(double AverageAdded);\n    public void ExampleTimeSeries()\n    { \n        var query = new Query\u003cEdit\u003e\n            .TimeSeries\n            .WithNoVirtualColumns\n            .WithAggregations\u003cAggregations\u003e\n            .WithPostAggregations\u003cPostAggregations\u003e()\n            .Order(OrderDirection.Descending)\n            .Aggregations(type =\u003e new Aggregations( // Explicitly stating data types in the methods for the sake of clarity in the example. 
Query is able to infer them.\n                type.Count(),\n                type.Sum((Edit edit) =\u003e edit.Added)))\n            .PostAggregations(type =\u003e new PostAggregations(type.Arithmetic(\n                ArithmeticFunction.Divide,\n                type.FieldAccess(aggregations =\u003e aggregations.TotalAdded),\n                type.FieldAccess(aggregations =\u003e aggregations.Count))))\n            .Filter(type =\u003e type.Selector(edit =\u003e edit.CountryIsoCode, \"US\"))\n            .Interval(new(DateTimeOffset.UtcNow, DateTimeOffset.UtcNow.AddDays(1)))\n            .Granularity(Granularity.Hour)\n            .Context(new QueryContext.TimeSeries() { SkipEmptyBuckets = true });\n        var json = Wikipedia.Edits.MapQueryToJson(query); // Use MapQueryToJson to look up query's json representation.\n        IAsyncEnumerable\u003cWithTimestamp\u003cAggregations_PostAggregations\u003cAggregations, PostAggregations\u003e\u003e\u003e results \n            = Wikipedia.Edits.ExecuteQuery(query);\n    }\n```\n\n## Data types\n\nIn Apache Druid operations on data have multiple \"variants\". Which variant you may want to choose in which query depends on:\n\n- Data type of column used in the operation.\n- Expected result of the operation.\n\nFor example, to perform a sum over some column's values, you may use:\n\n- doubleSum \n- floatSum \n- longSum.\n\nMost often though, you want the operation to match your column's data type. For this reason, such operations have been \"merged\" into one, accepting optional parameter of type `SimpleDataType`. 
Given example of operation `Sum`:\n\u003ctable\u003e\n\u003cthead\u003e\n  \u003ctr\u003e\n    \u003cth\u003eApache.Druid.Querying\u003c/th\u003e\n    \u003cth\u003eApache Druid\u003c/th\u003e\n  \u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\n\n```cs\nquery\n    .Aggregations(type =\u003e new(\n        type.Sum(edit =\u003e edit.Added, SimpleDataType.Double)));\n```\n\u003c/td\u003e    \n\u003ctd\u003e\n\n```json\n{\n    \"aggregations\": [\n        {\n          \"type\": \"doubleSum\",\n          \"name\": \"TotalAdded\",\n          \"fieldName\": \"added\"\n        }\n      ]\n}\n```\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\n\n```cs\nquery\n    .Aggregations(type =\u003e new(\n        type.Sum(edit =\u003e edit.Added, SimpleDataType.Float)));\n```\n\u003c/td\u003e\n\u003ctd\u003e\n\n```json\n{\n    \"aggregations\": [\n        {\n          \"type\": \"floatSum\",\n          \"name\": \"TotalAdded\",\n          \"fieldName\": \"added\"\n        }\n      ]\n}\n```\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\n\n```cs\nquery\n    .Aggregations(type =\u003e new(\n        type.Sum(edit =\u003e edit.Added, SimpleDataType.Long)));\n```\n\u003c/td\u003e\n\u003ctd\u003e\n\n```json\n{\n    \"aggregations\": [\n        {\n          \"type\": \"longSum\",\n          \"name\": \"TotalAdded\",\n          \"fieldName\": \"added\"\n        }\n      ]\n}\n```\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\n\nIn case `SimpleDataType` has not been specified, the library will infer it from related property type with following logic:\n\u003ctable\u003e\n\u003cthead\u003e\n  \u003ctr\u003e\n    \u003cth\u003eProperty type\u003c/th\u003e\n    \u003cth\u003eDruid data type\u003c/th\u003e\n  \u003c/tr\u003e\n\u003c/thead\u003e\n\u003ctbody\u003e\n  \u003ctr\u003e\n    \u003ctd\u003estring, Guid, char, Uri, Enum\u003c/td\u003e\n    \u003ctd\u003eString\u003c/td\u003e\n  
\u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003edouble\u003c/td\u003e\n    \u003ctd\u003eDouble\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003efloat\u003c/td\u003e\n    \u003ctd\u003eFloat\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eshort, int, long, DateTime, DateTimeOffset\u003c/td\u003e\n    \u003ctd\u003eLong\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eNullable\u0026lt;T\u0026gt;\u003c/td\u003e\n    \u003ctd\u003eResult of type inference on T\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIEnumerable\u0026lt;T\u0026gt;\u003c/td\u003e\n    \u003ctd\u003eArray\u0026lt;Result of type inference on T\u0026gt;\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eIf the property type does not match any of the above types\u003c/td\u003e\n    \u003ctd\u003eComplex\u0026lt;json\u0026gt;\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/tbody\u003e\n\u003c/table\u003e\n\n## Referring to objects representing data\nYou can refer to objects representing your query data in two ways:\n- by their properties, resulting in the library mapping them to Druid columns\n- as a whole, resulting in the library mapping the whole object to a column.\n\nThis means the following queries will give you equivalent results.\n\n```cs\nrecord Aggregations(int AddedSum);\nvar first = new Query\u003cEdit\u003e\n    .TimeSeries\n    .WithNoVirtualColumns\n    .WithAggregations\u003cAggregations\u003e()\n    .Aggregations(type =\u003e new(\n        type.Sum(edit =\u003e edit.Added)));\nvar second = new Query\u003cEdit\u003e\n    .TimeSeries\n    .WithNoVirtualColumns\n    .WithAggregations\u003cint\u003e()\n    .Aggregations(type =\u003e type.Sum(edit =\u003e edit.Added));\n```\n\n## Ternary expressions and type.None\n`Expression\u003cDelegate\u003e` query parameters in your queries may contain [ternary 
expressions](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/conditional-operator). Upon query execution (or mapping of a query to json), the conditions of any ternary expressions are evaluated and each ternary expression is replaced with the result expression matching its condition's value.\n\n```cs\nvar value = 1;\nFunc\u003cbool\u003e valueGreaterThanZero = () =\u003e value \u003e 0;\nvar okTernaryExpressions = new Query\u003cIotMeasurement\u003e\n    .TimeSeries\n    .WithNoVirtualColumns\n    .WithAggregations\u003cAggregationsFromTernary\u003e\n    .WithPostAggregations\u003cint\u003e()\n    .Aggregations(type =\u003e new(\n        value \u003e 0 ? type.Max(data =\u003e data.Value) : type.Min(data =\u003e data.Value),\n        type.Last(data =\u003e valueGreaterThanZero() ? data.Timestamp : data.ProcessedTimestamp),\n        value \u003e 0 ? \n            (value == 1 ?\n                type.First(data =\u003e data.Value) :\n                type.Last(data =\u003e data.Value)) :\n            type.Min(data =\u003e data.Value),\n        type.Last(data =\u003e valueGreaterThanZero() ? \n            (valueGreaterThanZero() ? data.Timestamp : data.ProcessedTimestamp) :\n            data.ProcessedTimestamp)))\n    .PostAggregations(type =\u003e valueGreaterThanZero() ? type.Constant(1) : type.Constant(0));\n```\n\nObjects representing all possible operations on input data contain the method `None`; calling it is equivalent to calling no method at all.\n\n```cs\nbool includeCount = true;\nvar conditionalCount = new Query\u003cEdit\u003e\n    .TimeSeries\n    .WithNoVirtualColumns\n    .WithAggregations\u003cint\u003e()\n    .Aggregations(type =\u003e includeCount ? 
type.Count() : type.None\u003cint\u003e());\n```\n\n## Druid expressions\nThe library accepts [Druid expressions](https://druid.apache.org/docs/latest/querying/math-expr) in the form of a delegate: given an object representing the data available at that point in a query, you return an [interpolated string using $](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/interpolated) where each of the string's parameters is either:\n\n- a property of the object representing data, which will get mapped to the appropriate column\n- a constant, which will get converted to a string.\n\nPassing any other parameters will result in an `InvalidOperationException` being thrown upon execution of the query.\n\n```cs\nvar okExpressions = new Query\u003cEdit\u003e\n    .TimeSeries\n    .WithVirtualColumns\u003cint\u003e()\n    .VirtualColumns(type =\u003e type.Expression\u003cint\u003e(edit =\u003e \"1\"))\n    .VirtualColumns(type =\u003e type.Expression\u003cint\u003e(edit =\u003e $\"{edit.Added} * 2\"))\n    .VirtualColumns(type =\u003e type.Expression\u003cint\u003e(edit =\u003e\n        $\"{edit.Added} * 2\" +\n        $\"- {edit.Deleted}\"));\n```\n\n## Query result deserialization\nThe library serializes queries and deserializes query results using System.Text.Json. 
The serializer has been altered in the following ways:\n- applied `System.Text.Json.JsonSerializerDefaults.Web`\n- `DateTime` and `DateTimeOffset` can additionally be deserialized from unix timestamps\n- `bool` can additionally be deserialized from the quoted string literals \"true\", \"false\", \"True\" and \"False\"\n- `bool` can additionally be deserialized from numbers, where `1` is deserialized to `true` and any other number to `false`\n- applied various json converters for types defined in the library.\n\nGet the default altered serializer options by calling `Apache.Druid.Querying.Json.DefaultSerializerOptions.Create()`.\n\nWherever possible, the query results have been \"flattened\" so they are streamed to consumers as soon as possible.\n\n## Truncated query result handling\nApache Druid returns query results in the form of http/1.1 responses with transfer-encoding: chunked. Because of that, query results may get truncated, leaving consumers with only part of them. `Apache.Druid.Querying.DataSource\u003cTSource\u003e.ExecuteQuery` accepts the parameter `onTruncatedResultsQueryForRemaining`, which, if set to `true` (the default), makes the library request the rest of the results in most such cases, specifically:\n1. Tcp connections closing or resetting before the whole response content has been streamed.\n2. Http responses completing successfully, but containing incomplete json.\n\nIn practice, the only unhandled case is when results are truncated due to the [Apache Druid timeout feature](https://druid.apache.org/docs/latest/querying/query-context/#general-parameters). When the timeout is reached, the related http response completes successfully, with complete json that is missing some of the results. [There is an (unfortunately stale) pull request changing the behaviour to follow case 1. from the list above](https://github.com/apache/druid/pull/13492). I consider this a bug in Druid itself. 
Until this is addressed by the Druid team, I recommend not using Druid timeouts at all. Instead, if needed, apply timeouts through an http proxy or using cancellation tokens passed to `Apache.Druid.Querying.DataSource\u003cTSource\u003e.ExecuteQuery`.\n\nTruncated result handling applies only to truncated results, meaning http responses where at least the response headers have been read successfully. It is therefore not a retry policy. If needed, set up a retry policy yourself, using extensibility points provided by your chosen dependency injection library.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foskar11120%2Fapache.druid.querying","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foskar11120%2Fapache.druid.querying","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foskar11120%2Fapache.druid.querying/lists"}