{"id":21903117,"url":"https://github.com/mli0603/tatoo","last_synced_at":"2025-08-20T09:30:30.446Z","repository":{"id":174735052,"uuid":"641477146","full_name":"mli0603/TAToo","owner":"mli0603","description":"TAToo (\"Vision-based Joint Tracking of Anatomy and Tool for Skull-base Surgery\"), IPCAI 2023.","archived":false,"fork":false,"pushed_at":"2023-07-05T19:00:23.000Z","size":3905,"stargazers_count":8,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-29T01:06:48.177Z","etag":null,"topics":["computer-vision","deep-learning","object-tracking"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mli0603.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-16T14:48:21.000Z","updated_at":"2024-12-22T01:02:38.000Z","dependencies_parsed_at":null,"dependency_job_id":"30b8e833-b2a7-4ebb-8841-8f0eb51a6f51","html_url":"https://github.com/mli0603/TAToo","commit_stats":null,"previous_names":["mli0603/tatoo"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mli0603%2FTAToo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mli0603%2FTAToo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mli0603%2FTAToo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mli0603%2FTAToo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mli0603","download
_url":"https://codeload.github.com/mli0603/TAToo/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249149097,"owners_count":21220652,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","deep-learning","object-tracking"],"created_at":"2024-11-28T15:25:45.147Z","updated_at":"2025-04-15T20:37:12.532Z","avatar_url":"https://github.com/mli0603.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"assets/tatoo.png\"  width=\"100\"\u003e\n\u003c/div\u003e\n\n\u003chr\u003e\n\nThis is the official repo for our work [TAToo: Vision-based Joint Tracking of Anatomy and Tool for Skull-base Surgery](https://arxiv.org/abs/2212.14131). \n\n\u003cimg src=\"assets/tatoo.gif\"  width=\"100%\"\u003e\n\n# Abstract\n\u003cimg src=\"assets/tatoo.png\" width=\"40\"\u003e (*T*racker for *A*natomy and *Too*l) jointly tracks the rigid 3D motion of the patient skull and surgical drill from stereo microscopic videos. \n\u003cimg src=\"assets/tatoo.png\" width=\"40\"\u003e leverages low-level vision signals and estimates motion via an iterative optimization process in an end-to-end differentiable form. For robust tracking performance, \u003cimg src=\"assets/tatoo.png\" width=\"40\"\u003e adopts a probabilistic formulation and enforces geometric constraints on the object level.\n\n# Updates\n- June 2nd, 2023: We have added support of [CRESTereo](https://github.com/ibaiGorordo/CREStereo-Pytorch)! 
See [configs/models/tatoo.py](configs/models/tatoo.py) for more details.\n- May 20th, 2023: We have deprecated the virtual dense data + real optical tracking data training scheme described in the paper. Instead, we use the [digital twin paradigm](https://arxiv.org/abs/2211.11863) to generate real dense data.\n\n# Environment setup\nWe have provided a [docker file](Dockerfile) for building docker environments. You may need to use `sudo` if your docker is not set up for all users.\n\nBuild the docker image\n```\ncd PATH_TO/TAToo\ndocker build -t tatoo .\n```\n\nCreate a docker container\n```\ndocker run -it --name tatoo_container --gpus=all --ipc=host -v PATH_TO/TAToo:/workspace/PATH_TO/TAToo tatoo\n```\n\nStart an interactive shell\n```\ndocker exec -it tatoo_container bash\n```\n\n# Pretrained weights download\nTAToo pretrained on Twin-S can be downloaded from [this link](https://drive.google.com/file/d/1k6BrwTXxfk6RN9Rsm0myCxJNMRA38G3e/view?usp=sharing).\n\n# Inference\nUse the following to run inference on stereo images\n```bash\npython scripts/inference_on_images.py --left PATTERN_LEFT --right PATTERN_RIGHT --ckpt PATH_TO_CHECKPOINT\n```\n\nTo visualize the results, use the following:\n```bash\npython scripts/visualize_output.py --video_write_dir PATH_TO_VIDEO_DIR --data_dir PATH_TO_OUTPUT --seg --flow --disp\n```\n# Training\n## Data\nWe use [HDF5 files](https://docs.h5py.org/en/stable/) to store data. 
Each HDF5 file contains the following groups:\n```\nmetadata\n|__ README\n|__ T_cb_c: calibrated transformation from camera base (cb) to camera (c)\n|__ T_db_d: calibrated transformation from drill base (db) to drill (d)\n|__ T_pb_p: calibrated transformation from phantom base (pb) to phantom (p)\n|__ baseline: camera baseline\n|__ camera_extrinsic: extrinsics to convert to OpenCV convention\n|__ camera_intrinsic: 3x3 intrinsics matrix\n\ndata\n|__ l_img: left image\n|__ r_img: right image\n|__ time: time of acquisition\n|__ segm: segmentation\n|__ depth: depth\n|__ pose_camhand: camera base pose\n|__ pose_drill: drill base pose\n|__ pose_pan: phantom base pose\n```\n \n### Sample data\nThe data can be downloaded from Google Drive via [this link](https://drive.google.com/drive/folders/1NOAYeb9HPq3eaDrAgAlgbsfobmPXijIO?usp=sharing).\n\nTo visualize the sample data, you can use the following script\n```bash\npython scripts/visualize_hdf5.py --base_folder PATH_TO_DATA --hdf5_file PATH_TO_HDF5\n```\n\nAlternatively, you can use a split file\n```bash\npython scripts/visualize_hdf5.py --base_folder PATH_TO_DATA --split_file PATH_TO_SPLIT_FILE\n```\n\n### Split files\nSplit files can be generated using the following\n```bash\npython scripts/generate_split.py --base_folder PATH_TO_HDF5_FOLDER --split_file SPLIT_FILE_NAME\n```\n\nAdditionally, pose augmentation via sampling and reversing the video can be done with the `--resampling` and `--reverse_order` arguments.\n\n## Launch training\nModify `configs/train_config.py` for [schedule config](configs/schedules/).\n\nRun the following command\n  - Distributed\n      ```angular2html\n      ./scripts/train.sh configs/train_config.py --work-dir PATH_TO_LOG [optional arguments]\n      ```\n  - Single GPU\n    ```angular2html\n    python train.py configs/train_config.py --work-dir PATH_TO_LOG [optional arguments]\n    ```\n- To freeze individual models, use `--freeze_stereo`, `--freeze_segmentation` or `--freeze_motion`\n\n### 
Evaluation\nRun the following command\n  - Distributed\n      ```angular2html\n      ./scripts/inference.sh configs/inference_config.py CHECKPOINT_PATH --eval [optional arguments]\n      ```\n  - Single GPU\n      ```angular2html\n      python inference.py configs/inference_config.py CHECKPOINT_PATH --eval [optional arguments]\n      ```\n\nEvaluation can be done on partial data via the `--num-frames` argument, which specifies the number of frames to run inference on (`-1` for all frames).\n\n### Storing outputs for visualization\nRun the following command\n  - Distributed\n      ```angular2html\n      ./scripts/inference.sh configs/inference_config.py CHECKPOINT_PATH --show [optional arguments]\n      ```\n  - Single GPU\n      ```angular2html\n      python inference.py configs/inference_config.py CHECKPOINT_PATH --show [optional arguments]\n      ```\n\nTo visualize the results, use the following:\n```bash\npython scripts/visualize_output.py --video_write_dir PATH_TO_VIDEO_DIR --data_dir PATH_TO_OUTPUT --seg --flow --disp\n```\n\n## Acknowledgements\n\u003cspan style = 'font-family:Chiller; font-size: 24px'\u003eTAToo \u003c/span\u003e is built upon the following great open-source projects\n- [CODD](https://github.com/facebookresearch/CODD)\n- [lietorch](https://github.com/princeton-vl/lietorch)\n- [LinkNet](https://github.com/ternaus/robot-surgery-segmentation)\n- [CREStereo](https://github.com/ibaiGorordo/CREStereo-Pytorch)\n\n\n## Citation\n\nIf you find our work relevant, please cite\n```\n@article{li2023tatoo,\n  title={Tatoo: vision-based joint tracking of anatomy and tool for skull-base surgery},\n  author={Li, Zhaoshuo and Shu, Hongchao and Liang, Ruixing and Goodridge, Anna and Sahu, Manish and Creighton, Francis X and Taylor, Russell H and Unberath, Mathias},\n  journal={International Journal of Computer Assisted Radiology and Surgery},\n  pages={1--8},\n  year={2023},\n  
publisher={Springer}\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmli0603%2Ftatoo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmli0603%2Ftatoo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmli0603%2Ftatoo/lists"}