{"id":21594087,"url":"https://github.com/ornl/icat","last_synced_at":"2025-04-10T23:41:05.914Z","repository":{"id":82736623,"uuid":"542278112","full_name":"ORNL/icat","owner":"ORNL","description":"Interactive machine learning dashboard for textual data exploration","archived":false,"fork":false,"pushed_at":"2025-02-24T13:34:40.000Z","size":2125,"stargazers_count":4,"open_issues_count":18,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-24T20:23:11.971Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://ornl.github.io/icat/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ORNL.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-27T20:31:39.000Z","updated_at":"2025-02-24T13:34:44.000Z","dependencies_parsed_at":"2023-11-29T16:29:52.058Z","dependency_job_id":"8afbea09-661a-40fb-b818-9c5be1dae588","html_url":"https://github.com/ORNL/icat","commit_stats":null,"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ORNL%2Ficat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ORNL%2Ficat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ORNL%2Ficat/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ORNL%2Ficat/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ORNL","download_url":"https://codeload.github.com/ORNL/icat/tar.gz/ref
s/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248119286,"owners_count":21050755,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-24T17:15:58.764Z","updated_at":"2025-04-10T23:41:05.906Z","avatar_url":"https://github.com/ORNL.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n    \u003cpicture\u003e\n        \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"sphinx/source/_static/icat_large_full_dark.svg\" /\u003e\n        \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"sphinx/source/_static/icat_large_full_light.svg\" /\u003e\n        \u003cimg alt='ICAT logo' src=\"https://raw.githubusercontent.com/ORNL/icat/main/sphinx/source/_static/icat_large_full_light.svg\" /\u003e\n    \u003c/picture\u003e\n\u003c/p\u003e\n\n# Interactive Corpus Analysis Tool\n\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n[![PyPI version](https://badge.fury.io/py/icat-iml.svg)](https://badge.fury.io/py/icat-iml)\n[![tests](https://github.com/ORNL/icat/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/ORNL/icat/actions/workflows/tests.yml)\n[![License](https://img.shields.io/pypi/l/curifactory)](https://github.com/ORNL/curifactory/blob/main/LICENSE)\n[![status](https://joss.theoj.org/papers/0528d60ff4f251069d15456fdb83bd0f/status.svg)](https://joss.theoj.org/papers/0528d60ff4f251069d15456fdb83bd0f)\n\n\n\nThe Interactive Corpus Analysis Tool 
(ICAT) is an interactive machine learning (IML) dashboard for unlabeled text datasets that allows a user to iteratively and visually define features, explore and label instances of their dataset, and train a logistic regression model on the fly as they do so to assist in filtering, searching, and labeling tasks.\n\n![ICAT Screenshot](https://raw.githubusercontent.com/ORNL/icat/main/sphinx/source/_static/screenshot1.png)\n\nICAT is implemented using HoloViz's [Panel](https://panel.holoviz.org/) library, so it can either be rendered directly like a widget in a JupyterLab instance or incorporated into a standalone Panel website.\n\n## Installation\n\nICAT can be installed via `pip` with:\n\n```\npip install icat-iml\n```\n\n\u003c!-- usage/examples --\u003e\n\n## Documentation\n\nThe user guide and API documentation can be found at [https://ornl.github.io/icat](https://ornl.github.io/icat).\n\n## Visualization\n\nThe primary ring visualization is called AnchorViz, a technique from IML literature. (See Chen, Nan-Chen, et al. \"[AnchorViz: Facilitating classifier error discovery through interactive semantic data exploration](https://dl.acm.org/doi/abs/10.1145/3172944.3172950)\")\n\nWe implemented an ipywidget version of AnchorViz and use it in this project; it can be found separately at [https://github.com/ORNL/ipyanchorviz](https://github.com/ORNL/ipyanchorviz).\n\n\u003c!-- documentation section --\u003e\n\n## Contributing\n\nContributions for improving ICAT are welcome! If you run into any problems, find\nbugs, or think of useful improvements and enhancements, feel free to open an\n[issue](https://github.com/ORNL/icat/issues).\n\nIf you add a feature or fix a bug yourself and want it considered for\nintegration, feel free to open a pull request with the changes. Please provide\na detailed description of what the pull request is doing and briefly list any\nsignificant changes made. 
If it relates to a specific issue, please include\nor link the issue number.\n\n## Citation\n\nTo cite usage of ICAT, please use the following BibTeX entry:\n\n```bibtex\n@misc{doecode_105653,\n    title = {Interactive Corpus Analysis Tool},\n    author = {Martindale, Nathan and Stewart, Scott},\n    abstractNote = {The Interactive Corpus Analysis Tool (ICAT) is an interactive machine learning dashboard for unlabeled text/natural language processing datasets that allows a user to iteratively and visually define features, explore and label instances of their dataset, and simultaneously train a logistic regression model. ICAT was created to allow subject matter experts in a specific domain to directly train their own models for unlabeled datasets visually, without needing to be a machine learning expert or needing to know how to code the models themselves. This approach allows users to directly leverage the power of machine learning, but critically, also involves the user in the development of the machine learning model.},\n    year = {2023},\n    month = {apr}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fornl%2Ficat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fornl%2Ficat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fornl%2Ficat/lists"}