{"id":16305435,"url":"https://github.com/moznion/git-theseus","last_synced_at":"2025-07-18T05:06:48.730Z","repository":{"id":196996711,"uuid":"697772478","full_name":"moznion/git-theseus","owner":"moznion","description":"A git tool to restore the commit logs","archived":false,"fork":false,"pushed_at":"2024-03-31T17:45:03.000Z","size":26,"stargazers_count":12,"open_issues_count":2,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-10-18T00:05:20.297Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/moznion.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-09-28T12:53:54.000Z","updated_at":"2024-02-10T08:50:43.000Z","dependencies_parsed_at":"2024-02-26T06:32:41.275Z","dependency_job_id":"b890b603-139e-4c2d-a66d-50fb2f056267","html_url":"https://github.com/moznion/git-theseus","commit_stats":null,"previous_names":["moznion/git-theseus"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/moznion%2Fgit-theseus","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/moznion%2Fgit-theseus/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/moznion%2Fgit-theseus/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/moznion%2Fgit-theseus/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/moznion","download_url":"https://codeload.github.com/moznion/git-theseus/tar.gz/refs/hea
ds/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221839057,"owners_count":16889573,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-10T21:07:01.409Z","updated_at":"2024-10-28T14:16:14.574Z","avatar_url":"https://github.com/moznion.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# git-theseus [![.github/workflows/check.yml](https://github.com/moznion/git-theseus/actions/workflows/check.yml/badge.svg)](https://github.com/moznion/git-theseus/actions/workflows/check.yml)\n\nA git tool to reconstruct the commit logs.\n\n## Usage\n\n```\n$ git theseus -h\nUsage of /usr/local/bin/git-theseus:\n  -dryrun\n        a parameter to instruct it to run as dryrun mode (i.e. no destructive operation on git)\n  -input-file string\n        [mandatory] a file path to the JSON file\n```\n\nthe schema of the input JSON file is like the following:\n\n```\n{\n  \"${git-commit-id}\": {\n    \"${target/file/path}\": [ ${line-num} ],\n    ...\n  },\n  ...\n}\n```\n\nexample is here: [git-theseus.example.json](./git-theseus.example.json)\n\n## Motivation\n\nWhen using code transformation tools (e.g., [decaffeinate/decaffeinate](https://github.com/decaffeinate/decaffeinate)) and/or code formatters, a well-known issue arises: git commit logs for the modified code can become cluttered with messages like \"transformed!\" or \"formatted!\", obscuring the original reasons for changes. 
This makes it challenging to track the rationale behind each modification on a line-by-line basis after the code has been transformed, as the original commit messages are lost.\n\nThis tool is designed to address these problems by mapping git commits line-by-line from the original to the transformed files and reconstructing the git commit history using this mapped information.\n\n## Example\n\nThe example project is here: [git-theseus-test-repo](./git-theseus-test-repo)\n\nFor example, consider a code transformation.\n\nThe `foo` file has the following contents and line-by-line commit history:\n\n```\n$ git -P blame foo\n^b36384d (moznion 2023-09-27 18:56:49 +0900 1) 1\n^b36384d (moznion 2023-09-27 18:56:49 +0900 2) 2\n^b36384d (moznion 2023-09-27 18:56:49 +0900 3) 3\n7b052155 (moznion 2023-09-27 18:57:38 +0900 4) 4\n9c4fe1bc (dummy   2023-09-27 18:57:22 +0900 5) 5\n9c4fe1bc (dummy   2023-09-27 18:57:22 +0900 6) 6\n```\n\nAnd the git commit history is as follows:\n\n```\n...\n7b05215 Third commit\n9c4fe1b Second commit\nb36384d First commit\n```\n\nAfter the code transformation, the file `foo_new`, which was transformed from `foo`, is as follows:\n\n**foo_new:**\n\n```\n1-A # original file's line is 1\n1-B # original file's line is 1\n2-A # original file's line is 2\n3-A # original file's line is 3\n4-A # original file's line is 4\n5-A # original file's line is 5\n5-B # original file's line is 5\n6-A # original file's line is 6\n```\n\nIn this case, the JSON input file would appear as follows:\n\n```\n{\n  \"b36384d2da65869dce07f09c204d2e5407ee0dad\": {\n    \"foo_new\": [1, 2, 3, 4]\n  },\n  \"9c4fe1bc69832dd26f980c2c8530964d32d1e98b\": {\n    \"foo_new\": [6, 7, 8]\n  },\n  \"7b0521555ba48ccc561dada09b2baf7039f87234\": {\n    \"foo_new\": [5]\n  }\n}\n```\n\nand after running `git-theseus`, the git-blame output for `foo_new` appears as below:\n\n```\n538f95d5 (moznion 2023-09-27 18:56:49 +0900 1) 1-A # original file's line is 1\n538f95d5 (moznion 2023-09-27 18:56:49 
+0900 2) 1-B # original file's line is 1\n538f95d5 (moznion 2023-09-27 18:56:49 +0900 3) 2-A # original file's line is 2\n538f95d5 (moznion 2023-09-27 18:56:49 +0900 4) 3-A # original file's line is 3\n0a37f199 (moznion 2023-09-27 18:57:38 +0900 5) 4-A # original file's line is 4\na3099a67 (dummy   2023-09-27 18:57:22 +0900 6) 5-A # original file's line is 5\na3099a67 (dummy   2023-09-27 18:57:22 +0900 7) 5-B # original file's line is 5\na3099a67 (dummy   2023-09-27 18:57:22 +0900 8) 6-A # original file's line is 6\n```\n\nand the details of each commit are:\n\n```\ncommit 51b51c0e7cd51abce2520109288d63d554209aa9 (HEAD -\u003e main)\nAuthor:     moznion \u003cmoznion@mail.moznion.net\u003e\nAuthorDate: Wed Sep 27 18:57:38 2023 +0900\nCommit:     moznion \u003cmoznion@mail.moznion.net\u003e\nCommitDate: Sat Feb 3 20:04:21 2024 -0800\n\n    [git-theseus] Third commit\n\n    git-theseus does this migration commit.\n    The original commit is 7b0521555ba48ccc561dada09b2baf7039f87234\n\ncommit 52f2eb5ab13e50ef19a98cdfeb7398e65564ecc7\nAuthor:     dummy \u003cdummy@example.com\u003e\nAuthorDate: Wed Sep 27 18:57:22 2023 +0900\nCommit:     moznion \u003cmoznion@mail.moznion.net\u003e\nCommitDate: Sat Feb 3 20:04:21 2024 -0800\n\n    [git-theseus] Second commit\n\n    git-theseus does this migration commit.\n    The original commit is 9c4fe1bc69832dd26f980c2c8530964d32d1e98b\n\ncommit d278a77d2054c633c46b7fb2f474f9a96f4b9056\nAuthor:     moznion \u003cmoznion@mail.moznion.net\u003e\nAuthorDate: Wed Sep 27 18:56:49 2023 +0900\nCommit:     moznion \u003cmoznion@mail.moznion.net\u003e\nCommitDate: Sat Feb 3 20:04:21 2024 -0800\n\n    [git-theseus] First commit\n\n    git-theseus does this migration commit.\n    The original commit is b36384d2da65869dce07f09c204d2e5407ee0dad\n\n```\n\nAs you can see, it restored the commit logs associated with the original file's changes, line by line, along with the original author information.\n\n## How does it work\n\n1. 
Collect the commit hashes from an input JSON file and sort them by commit order, starting with the oldest.\n2. Load the contents of the files described in the input JSON file.\n3. Starting with the oldest commit, apply the following processing steps:\n   1. Look up the files and their related line numbers in the JSON using the commit hash.\n   2. For each file, perform the following:\n      1. Accumulate the file lines that are associated with the looked-up line numbers or the lines processed in the previous iteration.\n      2. Write the accumulated lines to the specified file path.\n      3. Execute \"git add\" for the file.\n   3. Extract the original commit log using the commit hash and interpolate it into the commit message template.\n   4. Execute \"git commit\" for the added files.\n4. Finally, as a precaution, restore the original contents to the files.\n\n\n## Author\n\nmoznion (\u003cmoznion@mail.moznion.net\u003e)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmoznion%2Fgit-theseus","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmoznion%2Fgit-theseus","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmoznion%2Fgit-theseus/lists"}