{"id":13562686,"url":"https://github.com/jmforsythe/Git-Heat-Map","last_synced_at":"2025-04-03T19:31:22.943Z","repository":{"id":65553339,"uuid":"579051600","full_name":"jmforsythe/Git-Heat-Map","owner":"jmforsythe","description":"Visualise a git repository by diff activity","archived":false,"fork":false,"pushed_at":"2023-12-02T01:16:53.000Z","size":13711,"stargazers_count":974,"open_issues_count":2,"forks_count":39,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-02-17T12:34:39.997Z","etag":null,"topics":["database","git","python","treemap"],"latest_commit_sha":null,"homepage":"http://heatmap.jonathanforsythe.co.uk","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jmforsythe.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-12-16T14:36:36.000Z","updated_at":"2024-05-30T06:29:21.080Z","dependencies_parsed_at":"2024-05-30T06:44:44.736Z","dependency_job_id":null,"html_url":"https://github.com/jmforsythe/Git-Heat-Map","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmforsythe%2FGit-Heat-Map","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmforsythe%2FGit-Heat-Map/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmforsythe%2FGit-Heat-Map/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jmforsythe%2FGit-Heat-Map/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/owners/jmforsythe","download_url":"https://codeload.github.com/jmforsythe/Git-Heat-Map/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223019754,"owners_count":17074674,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","git","python","treemap"],"created_at":"2024-08-01T13:01:11.192Z","updated_at":"2024-11-04T15:30:30.355Z","avatar_url":"https://github.com/jmforsythe.png","language":"JavaScript","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"readme":"# Git-Heat-Map\n\n![Map showing the files in cpython that Guido van Rossum changed the most](img/example_image.png)\n*Map showing the files in cpython that Guido van Rossum changed the most;\nfull SVG image available in repo*\n\n## Now with file extension based highlighting\n## Now with submodule support\n## Website now available\n\nA version of this program is now available for use at [heatmap.jonathanforsythe.co.uk](https://heatmap.jonathanforsythe.co.uk)\n\n## Basic use guide\n\n* Generate database with `python generate_db.py {path_to_repo_dir}`\n* Create virtual environment with `python -m venv .` and install required modules with `pip install -r requirements.txt`\n* Run web server with `python app.py` or `flask run` (`flask run --host=\u003cip\u003e` to run on that ip address, with `0.0.0.0` being used for all addresses on that machine)\n* Connect on `127.0.0.1:5000`\n* Available repos will be displayed, select the one you want to view\n* Add emails, commits, filenames, and date ranges you want to highlight\n  * The \"browse\" 
buttons allow the user to see a list of valid values\n  * Alternatively, valid [sqlite](https://www.sqlite.org/lang_expr.html#:~:text=The%20LIKE%20operator%20does%20a,more%20characters%20in%20the%20string.) patterns can be passed in\n* Clicking on any of these entries will cause the query to exclude results matching that entry\n* By default, highlight hue is determined by file extension, but this can be manually overridden\n* Options affecting performance are levels of text to render, and minimum size of boxes rendered\n* Press submit query to update which files are highlighted\n* Press refresh to update highlighting hue and redraw based on window size\n* Click on directories to zoom in, and the back button in the sidebar to zoom out\n\n## Project Structure\n\nThis project consists of two parts:\n\n1. Git log -\u003e database\n2. Database -\u003e treemap\n\n### Git log -\u003e database\n\nScans through an entire git history using `git log`, and creates a database using five tables:\n* *Files*, which just keeps track of filenames\n* *Commits*, which stores commit hash, author, committer\n* *CommitFile*, which stores an instance of a certain file being changed by a certain commit, and tracks how many lines were added/removed by that commit\n* *Author*, which stores an author name and email\n* *CommitAuthor*, which links commits and Author in order to support coauthors on commits\n\nUsing these we can keep track of which files/commits changed the repository the most, which in itself can provide useful insight.\n\n### Database -\u003e treemap\n\nTaking the database above, uses an SQL query to generate a JSON object with the following structure:\n```\ndirectory:\n  \"name\": \u003cDirectory name\u003e\n  \"val\": \u003cSum of sizes of children\u003e\n  \"children\": [\u003cdirectory or file\u003e, ...]\n\nfile:\n  \"name\": \u003cFile name\u003e\n  \"val\": \u003cTotal number of line changes for this file over all commits\u003e\n```\nthen uses this to generate an inline 
svg image representing a [treemap](https://en.wikipedia.org/wiki/Treemapping \"Wikipedia: Treemapping\") of the file system, with the size of each rectangle being the `val` described above.\n\nThen generates a second JSON object in a similar manner to above, but filtering for the things we want (only certain emails, date ranges, etc.), then uses this to highlight the rectangles in varying intensity based on the `val`s returned, e.g. highlighting the files changed most by a certain author.\n\n## Performance\nThese speeds were attained on my personal computer.\n### Database generation\n\n| Repo | Number of commits | Git log time | Git log size | Database time | Database size | **Total time** |\n| --- | --- | --- | --- | --- | --- | --- |\n| [linux](https://github.com/torvalds/linux) | 1,154,884 | 60 minutes | 444MB | 462.618 seconds | 733MB | **68 minutes** |\n| [cpython](https://github.com/python/cpython) | 115,874 | 4.6 minutes | 44.6MB | 36.607 seconds | 74.3MB | **5.2 minutes** |\n\nTime taken seems to scale linearly, going through approximately 300 commits/second, or requiring 0.0033 seconds/commit.\nStorage also scales linearly: the git log output comes to approximately 2600 commits/MB, or 384 B/commit, and the database to approximately 1570 commits/MB, or about 640 B/commit.\n\n### Querying database and displaying treemap\n\nFor this test I filtered each repo by its most prominent author:\n\n| Repo | Author filter | Drawing treemap time | Highlighting treemap time |\n| --- | --- | --- | --- |\n| linux | torvalds@linux-foundation.org | 19.7 s | 54.3 s |\n| cpython | guido@python.org | 842 ms | 1238 ms |\n\nThese times are with `minimum size drawn = 0`, on very large repositories, so the performance is not completely unreasonable. This does not include the time for the browser to actually render the svg, which can take longer.\n\n## Wanted features\n\n### Faster database generation\nCurrently done using git log, which can take a very long time for large repos. 
Will look into any other ways of getting needed information on files.\n\n### Multiple filters per query\nCurrently the user can submit only a single query for the highlighting. Ideally they could have a separate filter dictating which boxes to draw in the first place, and possibly multiple filters that could result in multiple colour highlighting on the same image.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmforsythe%2FGit-Heat-Map","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjmforsythe%2FGit-Heat-Map","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjmforsythe%2FGit-Heat-Map/lists"}