{"id":22735464,"url":"https://github.com/marcos-venicius/indexme","last_synced_at":"2025-09-08T23:43:32.177Z","repository":{"id":264447776,"uuid":"886335330","full_name":"marcos-venicius/indexme","owner":"marcos-venicius","description":"Trying to build a \"mini google\"","archived":false,"fork":false,"pushed_at":"2024-11-24T11:08:36.000Z","size":23,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-08T23:43:31.786Z","etag":null,"topics":["golang","google","tf-idf","tokenizer"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/marcos-venicius.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-11-10T18:54:46.000Z","updated_at":"2024-11-24T11:08:40.000Z","dependencies_parsed_at":"2024-11-24T12:21:29.560Z","dependency_job_id":null,"html_url":"https://github.com/marcos-venicius/indexme","commit_stats":null,"previous_names":["marcos-venicius/indexme"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/marcos-venicius/indexme","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcos-venicius%2Findexme","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcos-venicius%2Findexme/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcos-venicius%2Findexme/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcos-venicius%2Findexme/manifests","owner_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/owners/marcos-venicius","download_url":"https://codeload.github.com/marcos-venicius/indexme/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/marcos-venicius%2Findexme/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274231140,"owners_count":25245675,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-08T02:00:09.813Z","response_time":121,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["golang","google","tf-idf","tokenizer"],"created_at":"2024-12-10T21:10:23.267Z","updated_at":"2025-09-08T23:43:32.142Z","avatar_url":"https://github.com/marcos-venicius.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# IndexMe\n\n\u003e [!WARNING]\n\u003e This is a work in progress and a case study only\n\nPass a directory to the program and it indexes all files in its subdirectories.\n\nThe first implementation will use only the TF-IDF technique.\n\nEvery \"search\" will be saved in a local SQLite file after indexing, so the indexing does not have to run every time.\n\nThe user can then search a saved directory index for files with certain content.\n\nThe tool should keep a checksum for every file in the folder. This allows the user to pass the `-update` flag to walk\nall files in all subfolders and check if the hash matches, if 
yes, keep going; if not, re-index that file and update only it.\n\nSearch results can also be cached to speed up searching, but every time a file is updated we need to check whether any cached search includes that file; if so, we remove that cache entry\nto guarantee that the cached data is not outdated.\n\nFolders ignored by default: `node_modules`, `.git`, ...\nFiles ignored by default: `binary files`, `image files`, `pdf files`, `data files`, `zip files`, ...\n\nAllow the user to change these defaults by editing a configuration file (possibly JSON).\n\n## Examples\n\n**Indexing:**\n\n```bash\ngo run . -index /etc\n\n# output something like\n\nIndexing /etc/hosts...\nIndexing /etc/resolv.conf...\nIndexing /etc/foo...\nIndexing /etc/foo/bar...\nIndexing /etc/foo/bar/baz...\n\n/etc indexed successfully\n```\n\n**Viewing indexed folders**\n\n```bash\ngo run . -list\n\n# output something like\n\n/etc\n  234 documents indexed\n  20 ignored files (binary, images, any non readable file)\n  last update at 2024-10-10 13:34 PM\n\n/work/projects/todo-list\n  234 documents indexed\n  20 ignored files (binary, images, any non readable file)\n  last update at 2024-10-10 13:34 PM\n```\n\n**Updating indexes**\n\n```bash\ngo run . -update /etc\n\n# output something like\n\n/etc/resolv.conf already updated\n/etc/hosts updated successfully\n\n/etc was successfully updated\n```\n\n**Search for a term**\n\nIn a search, everything after the `-search` flag is treated as the query string.\n\n```bash\ngo run . -search openssl config\n\n# output something like\n\n/etc/openssl/somefile.conf\n/etc/other/file\n/etc/foo/file\n/etc/bar\n/home/tests\n```\n\nor search in a specific directory:\n\n\n```bash\ngo run . -dir /etc -search openssl config\n\n# output something like\n\n/etc/openssl/somefile.conf\n/etc/other/file\n/etc/foo/file\n/etc/bar\n```\n\n**Removing folder**\n\n```bash\ngo run . 
-remove /etc\n\n# output something like\n\n/etc folder removed successfully\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarcos-venicius%2Findexme","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarcos-venicius%2Findexme","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarcos-venicius%2Findexme/lists"}