{"id":17449762,"url":"https://github.com/bolner/fatcatdb","last_synced_at":"2026-04-18T06:37:31.162Z","repository":{"id":38087618,"uuid":"239372962","full_name":"bolner/FatCatDB","owner":"bolner","description":"Zero configuration, high performance database library for ETL workflows","archived":false,"fork":false,"pushed_at":"2022-06-26T13:53:48.000Z","size":153,"stargazers_count":2,"open_issues_count":0,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-10-28T10:44:57.172Z","etag":null,"topics":["csharp","database","dotnet-core","etl","library"],"latest_commit_sha":null,"homepage":null,"language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bolner.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-02-09T20:55:57.000Z","updated_at":"2022-08-03T19:08:57.000Z","dependencies_parsed_at":"2022-09-20T07:03:41.026Z","dependency_job_id":null,"html_url":"https://github.com/bolner/FatCatDB","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bolner%2FFatCatDB","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bolner%2FFatCatDB/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bolner%2FFatCatDB/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bolner%2FFatCatDB/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bolner","download_url":"https://codeload.github.com/bolner/FatCatDB/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.c
om","kind":"github","repositories_count":229847512,"owners_count":18133641,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csharp","database","dotnet-core","etl","library"],"created_at":"2024-10-17T21:52:27.113Z","updated_at":"2026-04-18T06:37:26.130Z","avatar_url":"https://github.com/bolner.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"FatCatDB\n========\n\n`- This project is still in its beta phase. If you want a well tested version, please come back in April. -`\n\nFatCatDB is a `zero configuration` database library for `.NET Core`. Its main target segment is `ETL workflows` (e.g. time-series data), therefore it's optimized for high throughput. Supports class-based [schema definition](#creating-a-table-schema), multiple indices per table and fluid, object-oriented query expressions. One would use it for a smaller project to avoid managing a PostgreSQL or another full-fledged database system. With this library your project will already have data storage capability after just cloned from a GIT repo. 
You don't need to create and maintain Docker images for a database server.\n\n# Example query\n\nYou can make fluid style queries using lambda expressions:\n\n```csharp\nvar db = new DbContext();\nvar cursor = db.Metrics.Query()\n    .Where(x =\u003e x.Date, \"2020-01-02\")\n    .Where(x =\u003e x.AccountID, \"a11\")\n    .FlexFilter(x =\u003e x.Revenue \u003e x.Cost * 2.2 \u0026\u0026 x.Impressions \u003e 10)\n    .OrderByAsc(x =\u003e x.CampaignID)\n    .OrderByDesc(x =\u003e x.Cost)\n    .Limit(100)\n    .AfterBookmark(bookmark)\n    .GetCursor();\n\nforeach(var item in cursor) {\n    ...\n}\n```\n\n# Table of contents\n\n- [FatCatDB](#fatcatdb)\n- [Example query](#example-query)\n- [Table of contents](#table-of-contents)\n- [NuGet package](#nuget-package)\n- [Creating a table schema](#creating-a-table-schema)\n- [Creating a database context class](#creating-a-database-context-class)\n- [Inserting and modifying data](#inserting-and-modifying-data)\n- [Queries](#queries)\n- [Paging with bookmarks](#paging-with-bookmarks)\n- [Query plans](#query-plans)\n- [Atomic operations with the OnUpdate event](#atomic-operations-with-the-onupdate-event)\n- [Making fields unchangeable using OnUpdate](#making-fields-unchangeable-using-onupdate)\n- [Async support](#async-support)\n- [Adding new types](#adding-new-types)\n- [Hinting the query planner](#hinting-the-query-planner)\n- [ACID and durability](#acid-and-durability)\n- [Configurations](#configurations)\n- [TODO](#todo)\n\n# NuGet package\n\nAvailable at: https://www.nuget.org/packages/FatCatDB\n\nTo include it in a `.NET Core` project, execute:\n\n```bash\n$ dotnet add package FatCatDB\n```\n\n# Creating a table schema\n\nSee the example below. You only have to add annotations to a class and some of its public properties. Please find an explanation below the example. 
All annotated columns must have `Nullable` type, which you can either achieve by adding a question mark after non-nullable types, like `long?` or through `Nullable\u003clong\u003e`. They also need to be `comparable`, which means that they have to implement the `IComparable` interface.\n\n```csharp\nusing System;\nusing FatCatDB.Annotation;\nusing NodaTime;\n\nnamespace FatCatDB.Test {\n    [Table(Name = \"test_event\", Unique = \"campaign_id, ad_id\", NullValue = \"n.a.\")]\n    [TableIndex(Name = \"account_date\", Columns = \"account_id, date\")]\n    [TableIndex(Name = \"date_account\", Columns = \"date, account_id\")]\n    public class MetricsRecord {\n        [Column(Name = \"date\")]\n        public LocalDate? Date { get; set; }\n\n        [Column(Name = \"account_id\")]\n        public string AccountID { get; set; }\n\n        [Column(Name = \"campaign_id\")]\n        public string CampaignID { get; set; }\n\n        [Column(Name = \"ad_id\")]\n        public string AdID { get; set; }\n\n        [Column(Name = \"last_updated\")]\n        public LocalDateTime? LastUpdated { get; set; }\n\n        [Column(Name = \"impressions\")]\n        public long? Impressions { get; set; }\n\n        [Column(Name = \"clicks\")]\n        public long? Clicks { get; set; }\n\n        [Column(Name = \"conversion\")]\n        public long? Conversions { get; set; }\n\n        [Column(Name = \"revenue\")]\n        public decimal? Revenue { get; set; }\n\n        [Column(Name = \"cost\")]\n        public decimal? Cost { get; set; }\n    }\n}\n```\n\nAnnotation | Description\n--- | ---\n`Table.Name` | The name of the database table. Used in error messages and in the filesystem structure.\n`Table.Unique` | Each `TableIndex` defines a way to partition the data into packets. This `Unique` property defines uniqueness inside a packet only. Do not use the same list of fields here as in any of the `TableIndex` annotations, because you would end up with packets containing a single record. 
Think of this as a continuation of the indices.\n`Table.NullValue` | The string representation of how to store \"unknown\" or NULL values.\n`TableIndex.Name` | The name of a database table index which speeds up queries. If you define 3 indices, then the data is stored 3 times on the disk, redundantly.\n`TableIndex.Columns` | Comma-separated list of columns. This works the same way as defining composite indices in a relational database. The encoded content of the indexed field cannot be longer than 248 characters. In FatCatDB an index defines a multi-level directory structure on the disk, which contains `.tsv.gz` files, called `packets`. The column list tells how to partition the data into packets. The optimal size of a packet is between 10 KB and 1 MB. This database uses multi-level directory structures for quick queries.\n`Column.Name` | If you want a property to be part of the database table, then add a `Column` annotation to it. The `Name` tells how the `Table.Unique` and the `TableIndex.Columns` fields refer to it. By default, the data is also exported under this name.\n\nAll properties without an annotation are ignored by FatCatDB, and they won't cause any problems. Feel free to include arbitrary logic (methods, custom properties, private members, etc.) in your record classes.\n\n# Creating a database context class\n\nThe design of FatCatDB follows [dependency injection](https://en.wikipedia.org/wiki/Dependency_injection) to make implementing unit tests possible. 
Therefore to use it, you have to instantiate a `database context` class, which is derived from `DbContextBase`.\n\nA minimal database context class contains only your tables:\n\n```csharp\npublic class DbContext : DbContextBase {\n    public Table\u003cCatRecord\u003e Cats { get; } = new Table\u003cCatRecord\u003e();\n    public Table\u003cDogRecord\u003e Dogs { get; } = new Table\u003cDogRecord\u003e();\n    public Table\u003cBunnyRecord\u003e Bunnies { get; } = new Table\u003cBunnyRecord\u003e();\n}\n```\n\nBut you can also change the default configuration:\n\n```csharp\npublic class DbContext : DbContextBase {\n    public Table\u003cMetricsRecord\u003e Metrics { get; } = new Table\u003cMetricsRecord\u003e();\n    \n    protected override void OnConfiguring (TypeConverterSetup typeConverterSetup, Configurator configurator) {\n        configurator\n            .SetTransactionParallelism(8)\n            .SetQueryParallelism(8)\n            .EnableDurability(false);\n    }\n}\n```\n\nTwo important things to note:\n- All tables must be defined inside the context class as shown above: as a property with a `get` accessor, initialized to an instance with the `new` operator. In the above example, `Metrics` is a table that contains `MetricsRecord` records.\n- You can optionally override the `OnConfiguring` method to change the default configuration or to extend the system with your custom types. For the latter, see [the section about custom types](#adding-new-types).\n\nThe available configuration options are the following:\n\nExample | Description\n--- | ---\n`.SetTransactionParallelism(8)` | Specify the number of threads working on a single data modification transaction. This should have a high value for console applications and low for servers. Default value: 4\n`.SetQueryParallelism(8)` | Specify the number of threads working on a single query. This should have a high value for console applications and low for servers. 
Default value: 4\n`.EnableDurability(false)` | If durability is enabled then instead of overwriting files they are first written to a temporary file, and then swapped with the old one. Disabled by default.\n`.SetDatabasePath(\"/path/to/dir\")` | You can configure a custom path to the database folder. By default it is: `{WorkDirectory}/var/data`. Use a relative path to specify a path relative to the working directory.\n\n# Inserting and modifying data\n\nThe data is modified in bigger chunks, called \"transactions\". To create one, just use the `NewTransaction()` method on one of your tables:\n\n```csharp\nvar db = new DbContext();\nvar transaction = db.YourTable.NewTransaction();\n```\n\nAdding and updating records are both done using the same `Add` method. By default FatCatDB is always doing an `upsert`. You can change this behaviour with [OnUpdate event handlers](#atomic-operations-with-the-onupdate-event).\n\n```csharp\nvar record = new MyRecord();\nrecord.Name = \"Name1\";\nrecord.Time = db.NowUTC;\n\ntransaction.Add(record);\n```\n\nThe `unique` fields determine when two records belong to the same entity. If a record exists already, then it gets updated automatically. You can also remove records using:\n\n```csharp\ntransaction.Remove(record);\n```\n\nAt the end you have to commit the transaction to save the changes to disk:\n\n```csharp\nforeach(var record in records) {\n    transaction.Add(record);\n}\n\ntransaction.Commit();\n```\n\nThe bigger the transactions are the higher performance you get. 
Feel free to store multiple gigabytes in a single commit.\nIf you provide a `true` parameter to the Commit method, then it also forces garbage collection in the .NET assembly at the end:\n\n```csharp\ntransaction.Commit(true);\n```\n\n# Queries\n\nThe following are the typical levels that are involved in a query: `database context`, `query`, `cursor` and optionally an `exporter`, when you don't iterate through the records yourself.\n\n```csharp\nvar db = new DbContext();\nvar query = db.MyTable.Query()\n    .Where(x =\u003e x.Name, \"John Smith\")\n    .Where(x =\u003e x.Age, 25)\n    .OrderByAsc(x =\u003e x.LastModified);\n\nvar cursor = query.GetCursor();\nvar exporter = cursor.GetExporter();\nexporter.Print();\n```\n\nThe above example prints the results to the standard output in Linear TSV format, but there's always a shortcut for everything:\n```csharp\ndb.MyTable.Query()\n    .Where(x =\u003e x.Name, \"John Smith\")\n    .Where(x =\u003e x.Age, 25)\n    .OrderByAsc(x =\u003e x.LastModified)\n    .Print();\n```\n\nThe cursor is an enumerable of your record class, which you can loop through:\n```csharp\nforeach(var item in cursor) {\n    ...\n}\n```\n\nYou can fetch the first item with the `FindOne()` method. The result is `null` if no record is found:\n\n```csharp\nvar person = db.MyTable.Query()\n    .Where(x =\u003e x.Name, \"John Smith\")\n    .Where(x =\u003e x.Age, 25)\n    .OrderByAsc(x =\u003e x.LastModified)\n    .FindOne();\n\nif (person != null) {\n    ...\n}\n```\n\nPlease find below a complete list of query directives:\n\nExample directive | Description\n--- | ---\n`.Where(x =\u003e x.Date, \"2020-02-09\")` | Filtering on a specific value (exact match). This kind of filtering is fast, because it uses the indices. You can use `Where` with either a value of the column's original type or its string representation. 
(See [Adding new types](#adding-new-types) about the string conversion.)\n`.Where(x =\u003e x.Date, new LocalDate(2020, 2, 9))` | You can also use the original type of the column in `Where` filters. This also uses the indices.\n`.FlexFilter(x =\u003e x.Cost \u003e x.Revenue \u0026\u0026 x.Impressions \u003e 10)` | In flex filters, you can specify an arbitrary expression over the columns. This filtering is slow as it doesn't use the indices.\n`.OrderByAsc(x =\u003e x.Budget)` `.OrderByDesc(x =\u003e x.Budget)` | Ordering by a column in ascending or descending way. You can append multiple sorting directives to sort over multiple fields, in which case the order of the directives is important.\n`.Limit(limit)` | The limit value specifies the maximum number of items to return. For the `offset` see the next line.\n`.AfterBookmark(bookmark)` | Instead of an `offset` value, FatCatDB uses strings called `Bookmarks`. They provide a much more efficient way to continue a query than offset values. See the chapter [Paging with bookmarks](#paging-with-bookmarks).\n`.HintIndexPriority( IndexPriority.Sorting )` | Hinting an index selection algorithm. See the section [Hinting the query planner](#hinting-the-query-planner) for more details.\n`.HintIndex(\"index_name\")` | Hinting a specific index. See the section [Hinting the query planner](#hinting-the-query-planner) for more details.\n\nNote that since the cursor is an enumerable of record objects, you can use `Linq expressions` on them. But if you do that, then the whole result set gets loaded into the memory (if there's enough memory for it). Therefore it's recommended to use Linq only in the presence of a `Limit` directive.\n\n# Paging with bookmarks\n\nPaging in most database systems is done using the combination of a `limit` and an `offset` directive.\nInstead of `offset` FatCatDB uses bookmarks. Basically a bookmark describes the last item fetched during a query,\nso the query can be continued later in a different request. 
Bookmarks are more efficient than using offsets.\n\nYou can get a bookmark from either a cursor or from an exporter:\n\n```csharp\nvar cursor = db.People.Query()\n    .Where(x =\u003e x.City, \"Amsterdam\")\n    .Where(x =\u003e x.Age, 25)\n    .OrderByAsc(x =\u003e x.ID)\n    .Limit(10)\n    .GetCursor();\n\n// ... process data ...\n\nstring bookmark = cursor.GetBookMark();\n```\n\n```csharp\nvar exporter = db.People.Query()\n    .Where(x =\u003e x.City, \"Amsterdam\")\n    .Where(x =\u003e x.Age, 25)\n    .OrderByAsc(x =\u003e x.ID)\n    .Limit(10)\n    .GetExporter();\n\n// ... export data ...\n\nstring bookmark = exporter.GetBookMark();\n```\n\nThe bookmark is something like:\n```\neyJGcmFnbWVudHMiOlt7InRhYmxlTmFtZSI6InRlc3RfZXZlbnQiLCJpbmRleE5hbWUiOiJhY2NvdW50X2RhdGUiLCJQYXRoIjp7ImFjY291bnRfaWQiOiJhMTAiLCJkYXRlIjoiMjAyMC0wMS0wMSIsImFkX2lkIjoiMTAwMTIifX1dfQ==\n```\n\nThen to continue the same query, supply the bookmark using the `AfterBookmark` directive:\n\n```csharp\nvar cursor = db.People.Query()\n    .Where(x =\u003e x.City, \"Amsterdam\")\n    .Where(x =\u003e x.Age, 25)\n    .OrderByAsc(x =\u003e x.ID)\n    .Limit(10)\n    .AfterBookmark(bookmark)\n    .GetCursor();\n```\n\nIf the bookmark is `null` then it is disabled:\n```csharp\n    .AfterBookmark(null)\n```\n\n# Query plans\n\nYou can generate a user-friendly description of the query plan which would be used for your query. Example:\n\n```csharp\nvar db = new DbContext();\nvar plan = db.Metrics.Query()\n    .Where(x =\u003e x.AccountID, \"a11\")\n    .OrderByAsc(x =\u003e x.Date)\n    .OrderByAsc(x =\u003e x.Cost)\n    .FlexFilter(x =\u003e x.Impressions \u003e x.Clicks \u0026\u0026 x.Revenue \u003e 0)\n    .Limit(100)\n    .GetQueryPlan();\n\nConsole.Write(plan);\n```\n\nThe response:\n\n```\n- The default index selection mode was selected which gives priority to filtering over sorting.\n- The selected index is 'account_date'. The steps of the query are:\n    - Index levels:\n        - 1. 
account_id: Select one (exact match)\n        - 2. date: Sort by (full scan)\n    - Apply flex filtering.\n    - Apply the sorting directives inside the packets, which weren't used for an index level:\n        - cost\n    - Limit: The maximal number of records to return is 100\n```\n\n# Atomic operations with the OnUpdate event\n\nDuring the update of a record, there's a narrow window of time when both the old and the new versions of the record are available in memory, and there's an exclusive lock on the packet of the record. You can exploit this opportunity with the `OnUpdate` event.\n\nYou can specify a `lambda function` as an update event handler on a transaction. It will be called during the `commit` phase, when a record you pushed has the same unique key in a packet as an existing one. (So it would need to be updated.) Its return value is the new version of the record to be stored.\n\n```csharp\nvar db = new DbContext();\nvar transaction = db.MyTable.NewTransaction();\n\ntransaction.OnUpdate((oldRecord, newRecord) =\u003e {\n    if (newRecord.Type == MyRecordTypes.NoUpdate) {\n        // If you return null, then no changes will be made.\n        return null;\n    }\n\n    // This incrementation is an atomic change\n    newRecord.Counter = oldRecord.Counter + 1;\n    \n    return newRecord;\n});\n\n// The contents of \"records\" is probably imported from an external server.\nforeach(var record in records) {\n    ...\n    // These are the \"new records\"\n    transaction.Add(record);\n}\n\ntransaction.Commit();\n```\n\nNote that a lambda function always brings its context with it. Meaning: it can see all variables/fields that are visible inside the method where you defined it. 
This can give great flexibility.\n\nThe return value can be of 4 kinds:\n- You can return the old record (or a modified version of it), if you would like to minimize the changes.\n- You can return the new record (or a modified version of it), if you would like to change most of the fields.\n- You can return `null` in order to avoid any modifications done to the old record. (The new one will just be ignored and not stored anywhere.)\n- You can create a completely new record of the same type and return it.\n\nBut you can also throw an exception to stop the `commit` process. (Packets that were saved already, will remain that way.)\n\nTwo things to note:\n- Changing the fields of the `unique key` is safe. You won't have any duplicate records, no worries.\n- BUT, changing fields which are used in any of the indices defined will result in an exception. It isn't allowed to change the indexed fields inside an `OnUpdate` event, because then the records would need to be relocated into another packet, which cannot be done in an efficient way. (You can do that in application code with the combination of a `remove` and an `add` on a transaction.)\n\nIf you don't provide an OnUpdate event handler, then the default way of operation is to replace the old record with the new one, since by default FatCatDB always does an `upsert` for conflicting unique keys. You can change this behavior with an `OnUpdate` event handler.\n\n# Making fields unchangeable using OnUpdate\n\nIn the previous section we described how to use the OnUpdate event handler in general and specifically for atomic operations.\n\nLet's say that you are importing data from an external server. You would like to insert new records and update the old ones based on the unique key of the data. One of the fields of your schema is the date of creation, called `Created`. You don't want to change that. 
One (inefficient) solution is to query the existing records, modify them based on the imported data and persist the result.\n\nBut you can do this more efficiently by just pushing all your data into the table (without any previous queries), and doing the fine-tuning inside the update event handler:\n\n```csharp\nvar db = new DbContext();\nvar transaction = db.MyTable.NewTransaction();\n\ntransaction.OnUpdate((oldRecord, newRecord) =\u003e {\n    // Keep the creation date always unchanged\n    newRecord.Created = oldRecord.Created;\n    \n    return newRecord;\n});\n\nforeach(var record in importedData) {\n    ...\n    transaction.Add(record);\n}\n\ntransaction.Commit();\n```\n\n# Async support\n\nAsynchronous versions are available for all methods that perform input/output operations. Using async is only recommended for server applications. The only case one would use async in a console application is when there's a source of async events, for example a fast-CGI client, or a hardware interface.\n\nExamples for the query object:\n```csharp\nawait query.FindOneAsync();\nawait query.PrintAsync();\n```\n\nAsync iteration over the cursor:\n```csharp\nMyRecord item;\nwhile ((item = await cursor.FetchNextAsync()) != null) {\n    ...\n}\n```\n\nYou can also fetch multiple items in one call:\n```csharp\nList\u003cMyRecord\u003e items = await cursor.FetchAsync(int count);\n```\n\nAsync methods for the exporter that output the data in `linear TSV` text format:\n```csharp\nawait exporter.PrintAsync();\nawait exporter.PrintToTsvWriterAsync(TsvWriter output);\nawait exporter.PrintToStreamAsync(Stream stream);\nawait exporter.PrintToFileAsync(string path);\n```\n\n# Adding new types\n\nWith FatCatDB you can use columns of arbitrary types. It's very easy to extend it. 
The only thing to do is to use the `TypeConverterSetup` parameter in the `OnConfiguring` event of your database context class.\n\nThe following example adds the `LocalDateTime` type of the [NodaTime](https://github.com/nodatime/nodatime) library to FatCatDB. (This type is added by default already, this is only an example. See below.)\n\n```csharp\ninternal class DbContext : DbContextBase {\n    public Table\u003cMetricsRecord\u003e Metrics { get; } = new Table\u003cMetricsRecord\u003e();\n    \n    private LocalDateTimePattern pattern = LocalDateTimePattern.CreateWithInvariantCulture(\n        \"yyyy-MM-dd HH:mm:ss\"\n    );\n\n    protected override void OnConfiguring (TypeConverterSetup typeConverterSetup, Configurator configurator) {\n        typeConverterSetup\n            .RegisterTypeConverter\u003cLocalDateTime, string\u003e((x) =\u003e {\n                return pattern.Format(x);\n            })\n            .RegisterTypeConverter\u003cstring, LocalDateTime\u003e((x) =\u003e {\n                return pattern.Parse(x).Value;\n            });\n    }\n}\n```\n\nWhen you add a new type, you always have to add 2 converters: one that converts to string, and another that converts back from a string. The 2 template parameters of `RegisterTypeConverter` are the source and the target type. Examples:\n\n```csharp\ntypeConverterSetup\n    .RegisterTypeConverter\u003cMyType, string\u003e((x) =\u003e {\n        return x.ConvertToString( ... );\n    })\n    .RegisterTypeConverter\u003cstring, MyType\u003e((x) =\u003e {\n        return new MyType(x);\n    });\n```\n\nBTW the `LocalDateTime` and `LocalDate` types of [NodaTime](https://github.com/nodatime/nodatime) are added by default to FatCatDB, as this library is the recommended way of dealing with time, instead of the built-in classes of `.NET`.\n\nIf you want to sort by your custom type, then it has to implement the `IComparable` interface.\n\nYou can also override the built-in converters with your own. 
Just use the same `RegisterTypeConverter` method as in the above examples. You can find the built-in ones in [the constructor of TypeConverterSetup](FatCatDB/TypeConverter.cs).\n\n# Hinting the query planner\n\nThe query planner tries to select the best index to execute a query. It has two modes of operation:\n\n- `Filtering priority`: This is the default. Selects the index by first looking at the `Where` statements and only then at the sorting directives. This mode gives the best performance if only a small fraction of all records are queried, but it can happen that sorting is not possible (considering your directives and the indexed fields). In that case you get an error message.\n- `Sorting priority`: Let's say for example that you have 10 GBytes of data in a table, and you want to query 95% of it with a complex sorting on multiple fields. In this case `sorting priority` is the best way to go (performance wise). Use it only when you need the majority of records returned, and you also have sorting directives in your query that match an index you defined.\n\nExample:\n```csharp\nvar cursor = db.Metrics.Query()\n    .FlexFilter(x =\u003e x.Impressions \u003e 10)\n    .OrderByAsc(x =\u003e x.AccountID)\n    .OrderByAsc(x =\u003e x.CampaignID)\n    .OrderByAsc(x =\u003e x.AdID)\n    .OrderByDesc(x =\u003e x.Date)\n    .HintIndexPriority(IndexPriority.Sorting)\n    .GetCursor();\n```\n\nYou can also hint a specific index if you know what you are doing:\n\n```csharp\nquery.HintIndex(\"index_name\")\n```\n\n# ACID and durability\n\nFatCatDB is thread safe, but provides only the `read uncommitted` isolation level for transactions. The primary usage scenario in mind is a single-threaded console application, which loads data (most likely time-series data) from multiple sources, then transforms and stores them before pushing the data to destination endpoints. 
Use in servers is possible (since `async` methods are provided for everything), but not recommended, because of high memory usage (packet size * concurrency) and the lack of complete ACID support.\n\nIn an average case the schema should be the same as - or at least, it should resemble - the export format. So the data transformation ideally happens during the import and before the storage. This means that `high redundancy` in the schema is normal and expected, in contrast to relational databases.\n\nDurability is provided by two different mechanisms. The first is that the data is stored independently for each index defined. If you define 3 indices for a table, then the data is stored [redundantly 3 times](https://www.youtube.com/watch?v=XmCs-3_DGNE) on the disk in separate folder structures.\n\nThe other source of durability is explicit, and can be enabled by a configuration setting in the [database context](#creating-a-database-context-class) class. If that setting is enabled, then instead of overwriting files, the library first creates temporary ones and then swaps them with the old ones.\n\n# Configurations\n\nDebug build for development:\n```bash\n$ cd IntegrationTests\n$ dotnet build -c Debug\n```\n\nRun the integration tests:\n```bash\n$ cd IntegrationTests\n$ dotnet publish -c Release\n$ run.sh\n```\n\nCreate package for NuGet:\n```bash\n$ cd FatCatDB\n$ dotnet build -c Release\n$ dotnet pack -c Release\n```\n\n# TODO\n\n- Add benchmarks\n- Implement tools for data recovery and maintenance\n- Extend the integration tests\n- Implement unit tests after the interfaces are finalized\n- Implement aggregation functionality\n- Implement `left join` and `inner join`\n- Implement query.Delete(), table.Truncate() and db.Drop()\n- Delete packets which became empty after removal of records.\n- Use local thread pool instead of 
global\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbolner%2Ffatcatdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbolner%2Ffatcatdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbolner%2Ffatcatdb/lists"}