{"id":16391744,"url":"https://github.com/williamfzc/git-file-keyword","last_synced_at":"2025-03-23T04:31:43.626Z","repository":{"id":193631339,"uuid":"687063096","full_name":"williamfzc/git-file-keyword","owner":"williamfzc","description":"Extract keywords from git history for better understanding your code files. For human and LLM.","archived":false,"fork":false,"pushed_at":"2023-11-29T06:17:08.000Z","size":67,"stargazers_count":6,"open_issues_count":5,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-14T02:40:07.720Z","etag":null,"topics":["codebase","git","llm","openai"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/williamfzc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-04T14:22:51.000Z","updated_at":"2025-01-04T16:59:23.000Z","dependencies_parsed_at":"2023-10-15T05:00:27.688Z","dependency_job_id":"accca2ef-eea1-49aa-9472-8e84a28aa338","html_url":"https://github.com/williamfzc/git-file-keyword","commit_stats":null,"previous_names":["williamfzc/git-file-keyword"],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/williamfzc%2Fgit-file-keyword","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/williamfzc%2Fgit-file-keyword/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/williamfzc%2Fgit-file-keyword/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/williamfzc%2Fgit-file-keyword/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/williamfzc","download_url":"https://codeload.github.com/williamfzc/git-file-keyword/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245056889,"owners_count":20553855,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["codebase","git","llm","openai"],"created_at":"2024-10-11T04:47:13.920Z","updated_at":"2025-03-23T04:31:43.352Z","avatar_url":"https://github.com/williamfzc.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# git-file-keyword\n\n\u003e Auto-generate Code File Descriptions with Git History and LLM.\n\nExtract keywords from git history for better understanding your code files. For human and LLM.\n\n## What is it?\n\nWe use https://github.com/axios/axios.git for example.\n\nWith a simple command:\n\n```bash\ngfk --repo ./axios --include \"**/*.js\" --output_csv ./output.csv\n```\n\nYou can get a keywords list of all your code files, which is extracted from your git history:\n\n\u003cimg width=\"953\" alt=\"image\" src=\"https://github.com/williamfzc/git-file-keyword/assets/13421694/bdf3668d-f6bc-488f-b722-55ff33bccc78\"\u003e\n\nThese keywords can be used to guide developers/maintainers in understanding the potential functionality and history associated with these files.\n\nAnd, also LLM. 
If you provide an OpenAI key ...\n\n```bash\ngfk --repo ../axios --include \"**/*.js\" --output_csv ./output.csv --openai_key=\"sk-***\"\n```\n\n![image](https://github.com/williamfzc/git-file-keyword/assets/13421694/a63ac735-4ec5-48eb-94cb-d1ed70879c36)\n\nYou will see a human-readable description for every source file. For example:\n\n```text\nlib/core/Axios.js\n\nA core file that handles Axios configuration, interceptors, and request defaults.\n```\n\nWe use an LLM as a keyword parser to analyze, organize, and summarize the functionality of each file, and present it in a human-readable format.\n\n## Usage\n\n```commandline\npip3 install git-file-keyword\n```\n\n### In terminal\n\n```commandline\ngfk --repo ../axios --include \"**/*.js\" --output_csv ./output.csv --openai_key=\"sk-***\"\n```\n\nOf course, commit records contain a significant number of meaningless phrases.\nWhile we use extensive existing stop-word libraries to filter out some of them, the same words may carry different meanings in different repositories, and there is no universal solution.\n\nSo you can exclude them yourself by adding a `your_stopwords.txt` file:\n\n```text\nstop_word1\nstop_word2\nstop_word3\n```\n\nAnd add it to your command:\n\n```commandline\n--stopword_txt your_stopwords.txt\n```\n\n### As a lib\n\nWe provide some examples:\n\n- [example/diff.py](example/diff.py): Get diff files and extract what they actually mean\n- [example/stopword_extractor.py](example/stopword_extractor.py): Extract global keywords\n- [git_file_keyword/cli/__init__.py](git_file_keyword/cli/__init__.py): Our command-line client\n\n## Motivation\n\n- Automatic maintenance of an always up-to-date document.\n- By extracting sufficient business context from the git history, git-file-keyword allows developers and LLMs to quickly understand the meaning behind each code file at a lower cost.\n- Enable clear positive feedback loops within the team through the use of commit messages.\n\n## How it 
works?\n\ngfk consists of 3 layers:\n\n- Word extractor: extracts words from git history and related platforms like JIRA\n- Keyword finder: finds the keywords among the extracted words\n- LLM connector: builds prompts and communicates with the LLM\n\n## Contribution\n\nIssues and PRs are always welcome :)\n\n## License\n\n[Apache 2.0](LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwilliamfzc%2Fgit-file-keyword","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwilliamfzc%2Fgit-file-keyword","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwilliamfzc%2Fgit-file-keyword/lists"}