{"id":13493398,"url":"https://github.com/kantord/SeaGOAT","last_synced_at":"2025-03-28T11:32:48.159Z","repository":{"id":180470716,"uuid":"656242004","full_name":"kantord/SeaGOAT","owner":"kantord","description":"local-first semantic code search engine","archived":false,"fork":false,"pushed_at":"2024-10-29T11:15:04.000Z","size":18958,"stargazers_count":984,"open_issues_count":37,"forks_count":65,"subscribers_count":8,"default_branch":"main","last_synced_at":"2024-10-29T15:48:30.765Z","etag":null,"topics":["ai","ai-project","code-search","code-search-engine","embeddings","grep","grep-like","hacktoberfest","hacktoberfest2023","llm","regular-expression","ripgrep","vector-database","vector-embeddings"],"latest_commit_sha":null,"homepage":"https://kantord.github.io/SeaGOAT/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kantord.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-20T14:38:16.000Z","updated_at":"2024-10-27T18:00:31.000Z","dependencies_parsed_at":"2023-10-02T17:55:49.370Z","dependency_job_id":"3a37e3b1-bf18-43f9-a9bf-fa331ed08b94","html_url":"https://github.com/kantord/SeaGOAT","commit_stats":null,"previous_names":["kantord/seagoat"],"tags_count":203,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kantord%2FSeaGOAT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kantord%2FSeaGOAT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/kantord%2FSeaGOAT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kantord%2FSeaGOAT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kantord","download_url":"https://codeload.github.com/kantord/SeaGOAT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246021441,"owners_count":20710939,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-project","code-search","code-search-engine","embeddings","grep","grep-like","hacktoberfest","hacktoberfest2023","llm","regular-expression","ripgrep","vector-database","vector-embeddings"],"created_at":"2024-07-31T19:01:14.817Z","updated_at":"2025-03-28T11:32:48.133Z","avatar_url":"https://github.com/kantord.png","language":"Python","funding_links":[],"categories":["Python","Code Analysis \u0026 Search","Coding \u0026 Development","ai"],"sub_categories":["Other IDEs"],"readme":"\u003c!-- markdownlint-disable MD033 --\u003e\n\n\u003e [!TIP]\n\u003e Check out [CodeGate](https://github.com/stacklok/codegate), the AI project I am currently working on. It's all about **security** in AI code generation.\n\n\u003ch1\u003e\n  \u003cp align=\"center\"\u003e\n    \u003cimg src=\"assets/logo-small.png\" alt=\"Logo\" width=\"200\"/\u003e\n    \u003cfont size=\"8\"\u003e\u003cb\u003eSeaGOAT\u003c/b\u003e\u003c/font\u003e\n  \u003c/p\u003e\n\u003c/h1\u003e\n\nA code search engine for the AI age. 
SeaGOAT is a local search tool that\nleverages vector embeddings to enable you to search your codebase semantically.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/demo-slideshow.gif\" alt=\"\" /\u003e\n\u003c/p\u003e\n\n## Getting started\n\n### Install SeaGOAT\n\nIn order to install SeaGOAT, you need to have the following\ndependencies already installed on your computer:\n\n- Python 3.11 or newer\n- ripgrep\n- [bat](https://github.com/sharkdp/bat) (**optional**, highly recommended)\n\nWhen `bat` is [installed](https://github.com/sharkdp/bat#on-ubuntu-using-apt),\nit is used to display results as long as color is enabled. When SeaGOAT is\nused as part of a pipeline, a grep-like output format is used. When color is\nenabled, but `bat` is not installed, SeaGOAT will highlight the output using\npygments. Using `bat` is recommended.\n\nTo install SeaGOAT using `pipx`, use the following command:\n\n```bash\npipx install seagoat\n```\n\n### System requirements\n\n#### Hardware\n\nShould work on any decent laptop.\n\n#### Operating system\n\nSeaGOAT is designed to work on Linux (*tested* ✅),\nmacOS ([partly tested, **help**](https://github.com/kantord/SeaGOAT/issues/178) 🙏)\nand Windows ([**help needed**](https://github.com/kantord/SeaGOAT/issues/179) 🙏).\n\n### Start SeaGOAT server\n\nIn order to use SeaGOAT in your project, you have to start the SeaGOAT server\nusing the following command:\n\n```bash\nseagoat-server start /path/to/your/repo\n```\n\n### Search your repository\n\nIf you have the server running, you can simply use the\n`gt` or `seagoat` command to query your repository. 
For example:\n\n```bash\ngt \"Where are the numbers rounded\"\n```\n\nYou can also use\n[Regular Expressions](https://en.wikipedia.org/wiki/Regular_expression)\nin your queries, for example\n\n```bash\ngt \"function calc_.* that deals with taxes\"\n```\n\n### Stopping the server\n\nYou can stop the running server using the following command:\n\n```bash\nseagoat-server stop /path/to/your/repo\n```\n\n### Configuring SeaGOAT\n\nSeaGOAT can be tailored to your needs through YAML configuration files,\neither globally or project-specifically with a `.seagoat.yml` file.\nFor instance:\n\n```yaml\n# .seagoat.yml\n\nserver:\n  port: 31134  # Specify server port\n```\n\n[Check out the documentation](https://kantord.github.io/SeaGOAT/latest/configuration/)\nfor more details!\n\n## Development\n\n**Requirements**:\n\n- [Poetry](https://python-poetry.org/)\n- Python 3.11 or newer\n- [ripgrep](https://github.com/BurntSushi/ripgrep)\n\n### Install dependencies\n\nAfter cloning the repository, install dependencies using the following command:\n\n```bash\npoetry install\n```\n\n### Running tests\n\n#### Watch mode (recommended)\n\n```bash\npoetry run ptw\n```\n\n#### Test changed files\n\n```bash\npoetry run pytest .  --testmon\n```\n\n#### Test all files\n\n```bash\npoetry run pytest .\n```\n\n### Manual testing\n\nYou can test any SeaGOAT command manually in your local development\nenvironment. For example to test the development version of the\n`seagoat-server` command, you can run:\n\n```bash\npoetry run seagoat-server start ~/path/an/example/repository\n```\n\n## FAQ\n\nThe points in this FAQ are indications of how SeaGOAT works, but are not\na legal contract. SeaGOAT is licensed under an open source license and if you\nare in doubt about the privacy/safety/etc implications of SeaGOAT, you are\nwelcome to examine the source code,\n[raise your concerns](https://github.com/kantord/SeaGOAT/issues/new),\nor create a pull request to fix a problem.\n\n### How does SeaGOAT work? 
Does it send my data to ChatGPT?\n\nSeaGOAT does not rely on 3rd party APIs or any remote APIs and executes all\nfunctionality locally using the SeaGOAT server that you are able to run on\nyour own machine.\n\nInstead of relying on APIs or \"connecting to ChatGPT\", it uses the vector\ndatabase called ChromaDB, with a local vector embedding engine and\ntelemetry disabled by default.\n\nApart from that, SeaGOAT also uses ripgrep, a regular-expression based code\nsearch engine in order to provide regular expression/keyword based matches\nin addition to the \"AI-based\" matches.\n\nWhile the current version of SeaGOAT does not send your data to remote\nservers, it might be possible that in the future there will be **optional**\nfeatures that do so, if any further improvement can be gained from that.\n\n### Why does SeaGOAT need a server?\n\nSeaGOAT needs a server in order to provide a speedy response. SeaGOAT heavily\nrelies on vector embeddings and vector databases, which at the moment cannot\nbe replaced with an architecture that processes files on the fly.\n\nIt's worth noting that *you are able to run SeaGOAT server entirely locally*,\nand it works even if you don't have an internet connection. This use case\ndoes not require you to share data with a remote server, you are able to use\nyour own SeaGOAT server locally, albeit it's also possible to run a SeaGOAT\nserver and allow other computers to connect to it, if you so wish.\n\n### Does SeaGOAT create AI-derived work? Is SeaGOAT ethical?\n\nIf you are concerned about the ethical implications of using AI tools keep in\nmind that SeaGOAT is not a code generator but a code search engine, therefore\nit does not create AI derived work.\n\nThat being said, a language model *is* being used to generate vector\nembeddings. 
At the moment SeaGOAT uses ChromaDB's default model for\ncalculating vector embeddings, and I am not aware of this being an ethical\nconcern.\n\n### What programming languages are supported?\n\nCurrently SeaGOAT is hard coded to only process files in the following\nformats:\n\n- **Text Files** (`*.txt`)\n- **Markdown** (`*.md`)\n- **Python** (`*.py`)\n- **C** (`*.c`, `*.h`)\n- **C++** (`*.cpp`, `*.cc`, `*.cxx`, `*.hpp`)\n- **TypeScript** (`*.ts`, `*.tsx`)\n- **JavaScript** (`*.js`, `*.jsx`)\n- **HTML** (`*.html`)\n- **Go** (`*.go`)\n- **Java** (`*.java`)\n- **PHP** (`*.php`)\n- **Ruby** (`*.rb`)\n\n### Why is SeaGOAT processing files so slowly while barely using my CPU?\n\nSince processing files for large repositories can take a long time, SeaGOAT\nis **designed to allow you to use your computer while processing files**. It is\nan intentional design choice to avoid blocking/slowing down your computer.\n\nThis design decision does not affect the performance of queries.\n\n**By the way, you are able to use SeaGOAT to query your repository while\nit's processing your files!** When you make a query, and the files are not\nprocessed yet, you will receive a warning with an estimation of the accuracy\nof your results. Also, regular expression/full text search based results\nwill be displayed from the very beginning!\n\n### What character encodings are supported?\n\nThe preferred character encoding is UTF-8. Most other character encodings\nshould also work. 
Only text files are supported; SeaGOAT ignores binary files.\n\n### Where does SeaGOAT store its database/cache?\n\nWhere SeaGOAT stores databases and cache depends on your operating system.\nFor your convenience, you can use the `seagoat-server server-info`\ncommand to find out where these files are stored on your system.\n\n### Can I host SeaGOAT server on a different computer?\n\nYes, if you would like to use SeaGOAT without having to run the server on\nthe same computer, you can simply self-host SeaGOAT server on a different\ncomputer or in the cloud, and\n[configure](https://kantord.github.io/SeaGOAT/latest/configuration/)\nthe `seagoat`/`gt` command to connect to this remote server through the\ninternet.\n\nKeep in mind that SeaGOAT itself does not enforce any security as it is\nprimarily designed to run locally. If you have private code that you do not\nwish to leak, you will have to make sure that only trusted people have\naccess to the SeaGOAT server. This could be done by making it only available\nthrough a VPN that only your teammates can access.\n\n### Can I ignore files/directories?\n\nSeaGOAT already ignores all files/directories ignored in your `.gitignore`.\nIf you wish to ignore additional files but keep them in git, you can use the\n`ignorePatterns` attribute from the server configuration.\n[Learn more](https://kantord.github.io/SeaGOAT/latest/configuration/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkantord%2FSeaGOAT","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkantord%2FSeaGOAT","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkantord%2FSeaGOAT/lists"}