{"id":34039081,"url":"https://github.com/mark-hoffmann/icd","last_synced_at":"2026-04-02T02:09:25.428Z","repository":{"id":52454501,"uuid":"106884565","full_name":"mark-hoffmann/icd","owner":"mark-hoffmann","description":"Tools for working with icd codes and comorbidities","archived":false,"fork":false,"pushed_at":"2021-04-28T20:09:14.000Z","size":20,"stargazers_count":26,"open_issues_count":2,"forks_count":8,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-03-19T06:56:10.767Z","etag":null,"topics":["data-analysis","icd","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mark-hoffmann.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-10-14T00:14:25.000Z","updated_at":"2025-11-02T19:45:13.000Z","dependencies_parsed_at":"2022-09-18T15:23:17.628Z","dependency_job_id":null,"html_url":"https://github.com/mark-hoffmann/icd","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mark-hoffmann/icd","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mark-hoffmann%2Ficd","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mark-hoffmann%2Ficd/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mark-hoffmann%2Ficd/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mark-hoffmann%2Ficd/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mark-hoffmann","download_url":"https://codeload.github.com/mark-hoffmann/icd/tar.gz/refs/heads/master","sbom_url":"
https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mark-hoffmann%2Ficd/sbom","scorecard":{"id":619593,"data":{"date":"2025-08-11","repo":{"name":"github.com/mark-hoffmann/icd","commit":"146b36d7ceb3f0dfd241b94f16fbf5df417cd0e3"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Code-Review","score":0,"reason":"Found 0/20 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous 
patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license 
file: LICENSE.txt:0","Info: FSF or OSI recognized license: MIT License: LICENSE.txt:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 3 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-21T04:53:46.621Z","repository_id":52454501,"created_at":"2025-08-21T04:53:46.621Z","updated_at":"2025-08-21T04:53:46.621Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31294404,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-02T01:43:37.129Z","status":"online","status_checked_at":"2026-04-02T02:00:08.535Z","response_time":89,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","icd","python"],"created_at":"2025-12-13T21:22:16.423Z","updated_at":"2026-04-02T02:09:25.416Z","avatar_url":"https://github.com/mark-hoffmann.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"icd\n===\n\n.. image:: https://img.shields.io/pypi/v/icd.svg\n    :target: https://pypi.python.org/pypi/icd\n    :alt: Latest PyPI version\n\n.. image:: https://travis-ci.org/mark-hoffmann/icd.png\n   :target: https://travis-ci.org/mark-hoffmann/icd\n   :alt: Latest Travis CI build status\n\n.. image:: https://codecov.io/gh/mark-hoffmann/icd/branch/master/graph/badge.svg\n   :target: https://codecov.io/gh/mark-hoffmann/icd\n   :alt: Coverage\n\nTools for working with icd codes and comorbidities. 
This was inspired by the R package, `icd \u003chttps://cran.r-project.org/web/packages/icd/index.html\u003e`_, as a simple Python implementation of some of the base functionality. It has been benchmarked to handle large datasets (tens of millions of rows) for various icd code manipulation tasks.\n\nIf you are interested in contributing to this repository, feel free to `send me an email \u003cmarkkhoffmann@gmail.com\u003e`_.\n\nUsage\n-----\nBasic usage covers two very common tasks when working with icd code data.\n\n- Transforming datasets from a long to wide format\n- Processing icd codes for known comorbidity mappings\n\n|\n|\n\n**Transforming from long to wide**\n\n\nData is commonly in a long format, where a key for an individual such as *person_id* has many claims (*claim_id*) belonging to it.\n\nFor example:\n\n+------------+------------+-----------+------------+------------+\n| claim_id   | person_id  | icd_cd_1  |  icd_cd_2  |  icd_cd_3  |\n+============+============+===========+============+============+\n|    001     |    A       | code_6    |  code_2    |            |\n+------------+------------+-----------+------------+------------+\n|    002     |    A       | code_8    |            |            |\n+------------+------------+-----------+------------+------------+\n|    003     |    A       | code_3    |  code_2    |  code_6    |\n+------------+------------+-----------+------------+------------+\n|    004     |    B       | code_1    |            |            |\n+------------+------------+-----------+------------+------------+\n|    005     |    B       | code_2    |  code_3    |            |\n+------------+------------+-----------+------------+------------+\n|    006     |    C       | code_4    |  code_2    |  code_5    |\n+------------+------------+-----------+------------+------------+\n\nFor easier processing, we must transform the table into a more collapsed version. 
The number of *icd* columns then becomes the maximum number of unique codes for any given *person_id*.\n\nSuch as:\n\n+------------+-----------+------------+------------+------------+\n| person_id  | icd_cd_1  |  icd_cd_2  |  icd_cd_3  |   icd_cd_4 |\n+============+===========+============+============+============+\n|    A       |  code_6   | code_2     |  code_8    |    code_3  |\n+------------+-----------+------------+------------+------------+\n|    B       |  code_1   | code_2     |  code_3    |            |\n+------------+-----------+------------+------------+------------+\n|    C       |  code_4   | code_2     |  code_5    |            |\n+------------+-----------+------------+------------+------------+\n\nTo accomplish this task, use the function *long_to_short_transformation* as such:\n\n.. code-block:: python\n\n  import pandas as pd\n  import icd\n\n  data = {\"person_id\":[1,1,1,2,2,3],\n           \"dx_1\":[\"F11\",\"E40\",\"\",\"F32\",\"C77\",\"G10\"],\n           \"dx_2\":[\"F1P\",\"E400\",\"\",\"F322\",\"C737\",\"\"]}\n  df = pd.DataFrame.from_dict(data)\n  icd.long_to_short_transformation(df,\"person_id\",[\"dx_1\",\"dx_2\"])\n\nWhere *df* is your pandas dataframe, *\"person_id\"* is the column you want to roll up on, and *[\"dx_1\",\"dx_2\"]* is the array of columns that contain icd codes.\n\nIt is important to note that even if you only have one icd column, it **must still be an array**. Also, you must **replace NaN values** with an **empty string** such as \"\".\n\nThe function returns a new dataframe indexed by *person_id*, with a *person_id* column and as many code columns as needed, named *icd_0*, *icd_1*, ... , *icd_n*.\n\n|\n|\n\n**Processing icd codes to known comorbidities**\n\nThe second task is mapping these icd codes to comorbidities. For this, you can use the function *icd_to_comorbidities*. 
It converts a table of the format:\n\n+------------+-----------+------------+------------+------------+\n| person_id  | icd_cd_1  |  icd_cd_2  |  icd_cd_3  |   icd_cd_4 |\n+============+===========+============+============+============+\n|    A       |  code_6   | code_2     |  code_8    |    code_3  |\n+------------+-----------+------------+------------+------------+\n|    B       |  code_1   | code_2     |  code_3    |            |\n+------------+-----------+------------+------------+------------+\n|    C       |  code_4   | code_2     |  code_5    |            |\n+------------+-----------+------------+------------+------------+\n\nTo the format:\n\n+------------+-----------+------------+------------+------------+\n| person_id  | comorb_1  |  comorb_2  |  comorb_3  |   comorb_4 |\n+============+===========+============+============+============+\n|    A       |  True     | False      |  True      |    True    |\n+------------+-----------+------------+------------+------------+\n|    B       |  False    | True       |  False     |     False  |\n+------------+-----------+------------+------------+------------+\n|    C       |  False    | False      |  False     |   False    |\n+------------+-----------+------------+------------+------------+\n\nThe resulting comorbidity columns depend on the mapping used.\n\n|\n\nAn example of doing this is:\n\n.. 
code-block:: python\n\n  import pandas as pd\n  import icd\n\n  df = pd.DataFrame.from_dict({'icd_0': {1: 'F1P', 2: 'F322', 3: ''},\n                               'icd_1': {1: 'F11', 2: 'C77', 3: 'G10'},\n                               'icd_2': {1: '', 2: 'C737', 3: ''},\n                               'icd_3': {1: 'E400', 2: 'F32', 3: ''},\n                               'icd_4': {1: 'E40', 2: '', 3: ''},\n                               'person_id': {1: 1, 2: 2, 3: 3}})\n  icd.icd_to_comorbidities(df, \"person_id\", [\"icd_0\",\"icd_1\",\"icd_2\",\"icd_3\",\"icd_4\"])\n\n|\n\nThe default mapping is *quan_elixhauser10*, which is a transcription by Quan of the original Elixhauser icd 9 comorbidities in the `following paper \u003chttps://www.ncbi.nlm.nih.gov/pubmed/16224307\u003e`_.\n\nOptionally, you can provide a *mapping* keyword argument as such:\n\n.. code-block:: python\n\n  icd.icd_to_comorbidities(df, \"person_id\", [\"icd_0\",\"icd_1\",\"icd_2\",\"icd_3\",\"icd_4\"], mapping=\"quan_elixhauser10\")\n\nThe currently supported mappings are the default *\"quan_elixhauser10\"* as well as the *\"charlson10\"* mapping, as referenced in the same paper above. Additionally, you can find them laid out in SAS code `here \u003chttp://web.archive.org/web/20110225042437/http://www.chaps.ucalgary.ca/sas\u003e`_.\n\n\nIf you want to create a custom comorbidity mapping, pass in a dict for the *mapping* argument instead of a supported keyword string. The dict must have the following format:\n\n.. 
code-block:: python\n\n  custom_mapping = {\"paraplegia_and_hemiplegia\":['G81','G82','G041','G114','G801','G802','G830','G831','G832','G833','G834','G839'],\n                    \"renal_disease\":['N18','N19','N052','N053','N054','N055','N056','N057','N250','I120','I131','N032','N033','N034','N035','N036','N037','Z490','Z491','Z492','Z940','Z992'],\n                    \"cancer\":['C00','C01','C02','C03','C04','C05','C06','C07','C08','C09','C10','C11','C12','C13','C14','C15','C16','C17','C18','C19','C20','C21','C22','C23','C24','C25','C26','C30','C31','C32','C33','C34','C37','C38','C39','C40','C41','C43','C45','C46','C47','C48','C49','C50','C51','C52','C53','C54','C55','C56','C57','C58','C60','C61','C62','C63','C64','C65','C66','C67','C68','C69','C70','C71','C72','C73','C74','C75','C76','C81','C82','C83','C84','C85','C88','C90','C91','C92','C93','C94','C95','C96','C97'],\n                    \"moderate_or_severe_liver_disease\":['K704','K711','K721','K729','K765','K766','K767','I850','I859','I864','I982'],\n                    \"metastatic_carcinoma\":['C77','C78','C79','C80'],\n                    \"aids_hiv\":['B20','B21','B22','B24']\n                   }\n  icd.icd_to_comorbidities(df, \"person_id\", [\"icd_0\",\"icd_1\",\"icd_2\",\"icd_3\",\"icd_4\"], mapping=custom_mapping)\n\nThe function returns a new DataFrame indexed by the *person_id* values, with a column named after the \"person_id\" string that was passed in and one column per comorbidity, each populated with either **True** or **False**.\n\nInstallation\n------------\n\nicd can be installed from the PyPI package index via the following:\n\n.. 
code-block:: bash\n\n  pip install icd\n\n\n\nRequirements\n^^^^^^^^^^^^\n- `pandas \u003chttps://github.com/pandas-dev/pandas\u003e`_\n\nCompatibility\n-------------\n\nicd currently supports Python 3.4, 3.5, and 3.6.\n\nLicence\n-------\n\n`MIT \u003chttps://github.com/mark-hoffmann/icd/blob/master/LICENSE.txt\u003e`_\n\nAuthors\n-------\n\n`icd` was written by `Mark Hoffmann \u003cmarkkhoffmann@gmail.com\u003e`_.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmark-hoffmann%2Ficd","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmark-hoffmann%2Ficd","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmark-hoffmann%2Ficd/lists"}