{"id":20526419,"url":"https://github.com/fmarotta/blort","last_synced_at":"2025-10-12T19:54:45.501Z","repository":{"id":110570407,"uuid":"283777289","full_name":"fmarotta/blort","owner":"fmarotta","description":"A new sorting algorithm optimised for files made of 'blocks'. A block consists of all the contiguous rows that have the same value in the first field.","archived":false,"fork":false,"pushed_at":"2021-03-21T15:11:29.000Z","size":10,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-16T11:27:03.432Z","etag":null,"topics":["algorithm","sort"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fmarotta.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-07-30T13:05:35.000Z","updated_at":"2022-05-24T11:51:09.000Z","dependencies_parsed_at":"2023-03-14T02:30:40.500Z","dependency_job_id":null,"html_url":"https://github.com/fmarotta/blort","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fmarotta%2Fblort","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fmarotta%2Fblort/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fmarotta%2Fblort/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fmarotta%2Fblort/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fmarotta","do
wnload_url":"https://codeload.github.com/fmarotta/blort/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242133430,"owners_count":20077095,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["algorithm","sort"],"created_at":"2024-11-15T23:14:05.938Z","updated_at":"2025-10-12T19:54:45.405Z","avatar_url":"https://github.com/fmarotta.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# blort\n\nA sorting algorithm optimised for files made of 'blocks'. A block consists of\nall the contiguous rows that have the same value in the first field.\n\nThis program efficiently sorts files where there are several consecutive lines\nwith the same key. Each block of lines with the same key is considered as a\nsingle item as far as the sorting is concerned. This can dramatically reduce\nthe number of comparisons, as well as the I/O operations, with respect to GNU\nsort, thereby speeding up the process. The drawback is that it is not possible\nto sort piped files.\n\nThe time complexity of the algorithm is `O(N + M log(M))` where N is the\nsize of the file and M is the number of different blocks. 
Usually `M \u003c\u003c N`.\n\n## Compilation\n\nFrom the root directory of the repository, compile the program with\n\n```\ngcc blort.c blarr.c -o blort\n```\n\n(Optional) make it executable with\n\n```\nchmod u+x blort\n```\n\n## Execution\n\n```\nblort \u003cSOURCE\u003e [\u003cTARGET\u003e]\n```\n\nwhere:\n\n* _SOURCE_ is the file to be sorted; it should have a block structure,\nwhere the first field stays the same for many consecutive lines before\nchanging. The first field is also the sort key.\n\n* _TARGET_ is the name of the output file. If omitted, the sorted file\nis printed directly to standard output.\n\nNOTE that the blort executable should be in a directory listed in the\n`PATH` environment variable. Otherwise, specify the path to it (for\nexample `./blort` when it is in the working directory).\n\n## TODO (notes to self)\n\n* Multi-thread reading/writing (e.g. divide the file in two, and use\n  fseek to assign each part to one thread). Also see\n  https://www.codeproject.com/Questions/1251979/Multi-threading-to-read-write-file\n\n* Support a field different from the first\n\n* Support numeric sorting etc.\n\n* Ignore header\n\n## Bug reports\n\nfedericomarotta AT mail DOT com\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffmarotta%2Fblort","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffmarotta%2Fblort","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffmarotta%2Fblort/lists"}