{"id":20621450,"url":"https://github.com/emarifer/search-engine","last_synced_at":"2026-04-16T16:40:28.595Z","repository":{"id":247950072,"uuid":"823274230","full_name":"emarifer/search-engine","owner":"emarifer","description":"A mini Google. Custom web crawler \u0026 indexer written in Golang.","archived":false,"fork":false,"pushed_at":"2024-07-04T11:51:47.000Z","size":951,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-17T05:26:27.585Z","etag":null,"topics":["crawler","dashboard","deep-first-search","fiber-framework","full-text-search","golang","gorm-orm","htmx","htmx-go","hyperscript","indexer","inverted-index","response-caching","search-engine","templ","worker-pool"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/emarifer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-02T18:03:03.000Z","updated_at":"2024-07-28T10:35:25.000Z","dependencies_parsed_at":"2024-07-11T14:08:26.486Z","dependency_job_id":null,"html_url":"https://github.com/emarifer/search-engine","commit_stats":null,"previous_names":["emarifer/search-engine"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emarifer%2Fsearch-engine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emarifer%2Fsearch-engine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emarifer%2Fsearch-engine/releases"
,"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emarifer%2Fsearch-engine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/emarifer","download_url":"https://codeload.github.com/emarifer/search-engine/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242298976,"owners_count":20104922,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","dashboard","deep-first-search","fiber-framework","full-text-search","golang","gorm-orm","htmx","htmx-go","hyperscript","indexer","inverted-index","response-caching","search-engine","templ","worker-pool"],"created_at":"2024-11-16T12:17:54.091Z","updated_at":"2026-04-16T16:40:23.563Z","avatar_url":"https://github.com/emarifer.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \n\u003ch1 align=\"center\"\u003eSearch Engine\u003c/h1\u003e\n\n\u003cimg src=\"./assets/img/logo-doc.png\" width=\"55%\"\u003e\n\n\u003chr /\u003e\n\n\u003cp style=\"margin-bottom: 16px;\"\u003e\n    A mini Google. 
Custom web crawler \u0026 indexer written in Golang.\n\u003c/p\u003e\n\n\u003e 🚧 This is a work in progress, so expect that the\n\u003e application may not have all of its features yet.\n\n\u003cbr /\u003e\n  \n![GitHub License](https://img.shields.io/github/license/emarifer/go-echo-templ-htmx) ![Static Badge](https://img.shields.io/badge/Go-%3E=1.18-blue)\n\n\u003c/div\u003e\n\n\u003chr /\u003e\n\n## Features 🚀\n\n- [x] **Golang-Powered:** Leverage the performance and safety of one of the\n  best languages in the market for backend development.\n- [x] **Search engine based on the `Depth-first search (DFS)` algorithm:** [Depth-first search](https://www.geeksforgeeks.org/depth-first-search-or-dfs-for-a-graph/) is an algorithm for traversing or searching tree or graph data structures, as is the case with HTML documents. To avoid processing the same link more than once, a `unique` constraint is used when storing the URLs that will be crawled in subsequent cycles.\n- [x] **Indexing `full text search`:** Indexing is carried out by a parser/tokenizer that uses the [Snowball](https://github.com/kljensen/snowball) library and builds an [inverted index](https://www.geeksforgeeks.org/inverted-index/), which is stored in the database and allows search terms to be queried efficiently.\n- [x] **SQL Database Integration:** Crawled URLs and indexing results are stored in a `Postgres` DB, which allows greater scalability and efficiency in searches.\n- [x] **Caching of the responses (in `JSON` format) of the searches performed:** The `Fiber` framework provides [middleware](https://docs.gofiber.io/api/middleware/cache) for easy caching of server responses.\n- [x] **Using the `Fiber` framework, `A-H/Templ` and `Htmx` libraries:** The use of [Fiber](https://gofiber.io/), [Templ](https://templ.guide/) and [Htmx](https://htmx.org/) greatly speeds up the creation of a simple user interface for minimal search engine administration. 
Check out some of my other [repositories](https://github.com/emarifer/gofiber-templ-htmx) for more explanations.\n- [x] **Using interfaces in the `services` package:** The architecture follows a typical \"onion model\" where each layer doesn't know about the layer above it and is responsible for a specific thing. Here, that layer is the `services` package, which allows for better separation of responsibilities and `dependency injection`.\n- [ ] **Using concurrency in engine-built crawling functions:** Concurrency, one of the features in which the Go language shines most, is used to try to speed up the always heavy link-crawling tasks. 🚧 This is a work in progress!!\n\n\u003cbr /\u003e\n\n\u003chr /\u003e\n\n## 🖼️ Screenshots:\n\n\u003cdiv align=\"center\"\u003e\n\n###### Admin login screen and dashboard:\n\n\u003cimg src=\"assets/img/screenshot-01.png\" width=\"26%\" align=\"top\"\u003e\u0026nbsp;\u0026nbsp;\u003cimg src=\"assets/img/screenshot-02.png\" width=\"26%\" align=\"top\"\u003e\n\n\n###### Response to a search performed with cURL:\n\n\u003cimg src=\"assets/img/screenshot-03.png\" width=\"55%\" align=\"top\"\u003e\n\n\n\u003c/div\u003e\n\n\n---\n\n## 👨‍🚀 Installation and Usage\n\nBefore compiling the view templates, you'll need to regenerate the CSS. First, install the dependencies required by `Tailwind CSS` and `daisyUI` (you must have `Node.js` installed on your system) and then regenerate the `main.css` file. 
To do this, run the following commands:\n\n```\n$ cd tailwind \u0026\u0026 npm i\n$ npm run build-css-prod # `npm run watch-css` regenerates the CSS in watch mode for development\n```\n\nSince we use the PostgreSQL database from a Docker container, Docker must also be installed. Run this command in the project folder:\n\n```\n$ docker compose up -d\n```\n\nThese other commands will also be useful to manage the database from its container:\n\n```\n$ docker start search-engine # start container\n$ docker stop search-engine # stop container\n$ docker exec -it search-engine psql -U postgres # (user: postgres, without password)\n```\n\nBesides the obvious prerequisite of having Go on your machine, you must have [Air](https://github.com/air-verse/air) installed for hot reloading when editing code.\n\n\u003e[!TIP]\n\u003e***In order to have autocompletion and syntax highlighting in VS Code for the `Templ templating language`, you will have to install the [templ-vscode](https://marketplace.visualstudio.com/items?itemName=a-h.templ) extension (for vim/nvim install this [plugin](https://github.com/joerdav/templ.vim)). To generate the Go code corresponding to these templates you will have to download this [executable binary](https://github.com/a-h/templ/releases/tag/v0.2.476) from GitHub and place it in the PATH of your system. The command:***\n\n```\n$ templ generate # `templ generate --watch` to enable watch mode\n```\n\n\u003e[!TIP]\n\u003e***This command generates the Go code for the `.templ` templates and is therefore required before starting the application. With the `--watch` flag, it also monitors the `.templ` files and recompiles them whenever they are saved. 
Review the documentation on Templ [installation](https://templ.guide/quick-start/installation) and [support](https://templ.guide/commands-and-tools/ide-support/) for your IDE.***\n\nBuild for production:\n\n```\n$ go build -ldflags=\"-s -w\" -o ./bin/search-engine ./cmd/search-engine/main.go # ./bin/search-engine to run the application / Ctrl + C to stop the application\n```\n\nStart the app in development mode:\n\n```\n$ air # This compiles the view templates automatically / Ctrl + C to stop the application\n```\n\n---\n\n### Happy coding 😀!!","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femarifer%2Fsearch-engine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Femarifer%2Fsearch-engine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femarifer%2Fsearch-engine/lists"}