{"id":21926652,"url":"https://github.com/plainas/cdi_cardinality","last_synced_at":"2025-03-22T11:49:06.086Z","repository":{"id":65854160,"uuid":"145343327","full_name":"plainas/cdi_cardinality","owner":"plainas","description":"Get the carnality of each column on a CSV file","archived":false,"fork":false,"pushed_at":"2018-09-17T22:58:06.000Z","size":70,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-27T11:23:47.298Z","etag":null,"topics":["cardinality","cli","csv","python","statistics"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/plainas.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-08-19T22:56:33.000Z","updated_at":"2023-02-14T03:55:54.000Z","dependencies_parsed_at":"2023-02-14T04:25:13.129Z","dependency_job_id":null,"html_url":"https://github.com/plainas/cdi_cardinality","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plainas%2Fcdi_cardinality","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plainas%2Fcdi_cardinality/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plainas%2Fcdi_cardinality/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plainas%2Fcdi_cardinality/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/plainas","download_url":"https://codeload.github.com/plainas/cdi_cardinality/tar.gz/refs/heads/master","host":{"na
me":"GitHub","url":"https://github.com","kind":"github","repositories_count":244952771,"owners_count":20537472,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cardinality","cli","csv","python","statistics"],"created_at":"2024-11-28T22:09:36.576Z","updated_at":"2025-03-22T11:49:06.067Z","avatar_url":"https://github.com/plainas.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# cdi_cardinality\n\nPrints the cardinality of each column of a CSV file.\n\n## Installation\n\n```\nsudo pip3 install https://github.com/plainas/cdi_cardinality/zipball/master\n```\n\nThis package is not listed on the Python Package Index (PyPI), and I do **not** intend to submit it. This code is shared as is. 
If you would like to have this available on PyPI or if you would like a certain feature, you are encouraged to fork this project.\n\n## Usage\n\n```\ncdi_cardinality [-h] [-m MAX] [-n] [-v] filename\n```\n\nExamples:\n\n```\n\u003e cdi_cardinality towed-vehicles.csv\nTow_Date 91\nMake 76\nStyle 33\nModel 69\nColor 26\nPlate 4025\nState 39\nTowed_to_Address 5\nTow_Facility_Phone 7\nInventory_Number 4816\n```\n\nSpaces in column names are replaced by underscores for easier integration with other command line tools such as awk, sed, grep, etc.\n\nFor readability by humans, the `-v` or `--valign` switch can be helpful.\n\n```\n\u003e cdi_cardinality -v towed-vehicles.csv\nTow_Date           91\nMake               76\nStyle              33\nModel              69\nColor              26\nPlate              4025\nState              39\nTowed_to_Address   5\nTow_Facility_Phone 7\nInventory_Number   4816\n```\n\nMemory usage is a linear function of column cardinality. If you are working with very large files, you can mitigate this problem by setting an upper limit for cardinality, which stops new values from filling up memory once a column reaches that limit.\n\nThis is achieved by passing a `-m` or `--max` parameter.\n\n```\n\u003e cdi_cardinality -v -m 100 towed-vehicles.csv\nTow_Date           91\nMake               76\nStyle              33\nModel              69\nColor              26\nPlate              100\nState              39\nTowed_to_Address   5\nTow_Facility_Phone 7\nInventory_Number   100\n```\n\nNotice that the columns `Plate` and `Inventory_Number` display the specified maximum rather than their actual cardinality.\n\n\n## Command arguments\n\n```\npositional arguments:\n  filename\n\noptional arguments:\n  -h, --help         show this help message and exit\n  -m MAX, --max MAX  Specify maximum count value.\n  -n, --no-header    Read values from the first line of the input. 
Use this is\n                     the input has no header.\n  -v, --valign       Vertically align columns on the output for beter human\n                     readability\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplainas%2Fcdi_cardinality","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fplainas%2Fcdi_cardinality","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplainas%2Fcdi_cardinality/lists"}