{"id":14987886,"url":"https://github.com/apache/ctakes","last_synced_at":"2025-04-04T16:08:37.394Z","repository":{"id":65467503,"uuid":"572068762","full_name":"apache/ctakes","owner":"apache","description":"Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text.","archived":false,"fork":false,"pushed_at":"2025-03-18T15:11:31.000Z","size":134051,"stargazers_count":63,"open_issues_count":10,"forks_count":14,"subscribers_count":11,"default_branch":"main","last_synced_at":"2025-03-28T15:03:36.496Z","etag":null,"topics":["bioinformatics","clinical","nlp"],"latest_commit_sha":null,"homepage":"https://ctakes.apache.org","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apache.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-11-29T13:43:35.000Z","updated_at":"2025-03-27T17:50:19.000Z","dependencies_parsed_at":"2024-08-23T14:51:46.891Z","dependency_job_id":"1909c24e-aca7-4e71-bff8-547252a4a5d1","html_url":"https://github.com/apache/ctakes","commit_stats":{"total_commits":126,"total_committers":6,"mean_commits":21.0,"dds":0.2063492063492064,"last_synced_commit":"b2fdc425889d3bf47da1b34f8e4e4157294436f7"},"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fctakes","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fctakes/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fctakes/releases","manifests_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apache%2Fctakes/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apache","download_url":"https://codeload.github.com/apache/ctakes/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247208143,"owners_count":20901570,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bioinformatics","clinical","nlp"],"created_at":"2024-09-24T14:15:38.415Z","updated_at":"2025-04-04T16:08:37.356Z","avatar_url":"https://github.com/apache.png","language":"Java","funding_links":[],"categories":["人工智能"],"sub_categories":[],"readme":"# Apache cTAKES™\n\n## Introduction\n\n\nThe Apache™ clinical Text Analysis and Knowledge Extraction System (cTAKES™) focuses on extracting knowledge\nfrom clinical text through Natural Language Processing (NLP) techniques.\n\ncTAKES is engineered in a modular fashion and employs leading-edge rule-based and machine learning methods.\n\ncTAKES has standard features for biomedical text processing software,\nincluding the ability to extract concepts such as symptoms, procedures, diagnoses, medications and anatomy\nwith attributes and standard codes.\n\nMore powerful components can perform tasks as complex as identifying temporal events,\ndates and times – resulting in placement of events in a patient timeline.\n\nComponents are trained on gold standards from the biomedical as well as the general domain.\nThis affords usability across different types of clinical narrative (e.g. 
radiology reports,\nclinical notes, discharge summaries) in various institution formats as well as other types of\nhealth-related narrative (e.g. Twitter feeds), using multiple data standards (e.g. Health Level 7 (HL7),\nClinical Document Architecture (CDA), Fast Healthcare Interoperability Resources (FHIR), SNOMED-CT, RxNORM).\n\ncTAKES is the NLP platform for many initiatives across the world covering a variety of research purposes\nand large datasets.\nContributors include professionals at medical and commercial institutions, NLP and Machine Learning researchers,\nMedical Doctors, and students of many disciplines and levels.\nWe encourage people from all backgrounds to get involved! (link)\n\n\n\u003cbr\u003e\n\n## Supported Environments\n1. **Java 1.8** is required to run cTAKES versions 5.x and older. Versions 6+ require Java 17. Run this command to check your Java version:\n```\n$ java -version\n```\n2. **Maven 3** is required to build cTAKES. Run this command to check your Maven version:\n```\n$ mvn -version\n```\n3. A license for the [Unified Medical Language System (UMLS)](https://www.nlm.nih.gov/research/umls/index.html)\n   is required to use the named entity recognition module (dictionary lookup) with the default dictionary.\n4. **Python 3** is required to use the cTAKES [Python Bridge to Java (PBJ)](https://github.com/apache/ctakes/wiki/pbj_intro). \nRun this command to check your Python version:\n```\n$ python -V\n```\n\n\n\u003cbr/\u003e\n\n\n## Getting Started\n\n### New Users\n\nThe easiest way for new users to get a jump start running cTAKES is to use the [Standard Pipeline Installation Facility](artifacts).\nThe Standard Pipeline Installation Facility is a tool that can install cTAKES configured to run the most popular cTAKES pre-built pipelines. 
\nYou can then use the [Piper File Submitter](https://github.com/apache/ctakes/wiki/Piper+File+Submitter) GUI to submit jobs or submit them from the command line.\n\nFor access to all cTAKES capabilities, download a [zip]() or [tar.z]() file containing a fully-built installation of the most recent cTAKES [release](https://github.com/apache/ctakes/releases).\nThen, after obtaining a UMLS license, use the [UMLS Package Fetcher](https://github.com/apache/ctakes/wiki/cTAKES+UMLS+Package+Fetcher) GUI to install a copy of the \ndefault dictionary for Named Entity Recognition (NER) using cTAKES Fast Dictionary Lookup.\n\n### New Developers\n\n__Notice:__ cTAKES 7.0.0-SNAPSHOT requires JDK 17 to build and run.\n\nAll source code for cTAKES versions 5+ is available from the [cTAKES GitHub repository](https://github.com/apache/ctakes).\n1. Clone this repository.\n```\n$ git clone https://github.com/apache/ctakes.git\n```\n2. Open your local copy of the repository in an IDE of your choice.\n3. Run directly from the code (link).  \n   or\n4. Build a binary installation (link), and\n5. Run a binary installation (link). \n\n\n## More information\n\nMuch more information can be found on the [cTAKES wiki](https://github.com/apache/ctakes/wiki).\n\nYou can also write to the cTAKES user and developer mailing lists: user at ctakes.apache.org and dev at ctakes.apache.org\nand find answers to previously asked questions by searching the [user](https://lists.apache.org/list.html?user@ctakes.apache.org)\nand [developer](https://lists.apache.org/list.html?dev@ctakes.apache.org) mail archives.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fctakes","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapache%2Fctakes","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapache%2Fctakes/lists"}