{"id":18687742,"url":"https://github.com/jorgechato/word-search-engine","last_synced_at":"2026-04-10T13:30:54.915Z","repository":{"id":84530411,"uuid":"177961302","full_name":"jorgechato/word-search-engine","owner":"jorgechato","description":"Word search engine based on scraping the html source code and integrated with CI/CD and k8s orchestration","archived":false,"fork":false,"pushed_at":"2019-04-02T22:24:20.000Z","size":19,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2026-01-03T14:35:25.895Z","etag":null,"topics":["docker","firebase","flask","k8s","public","swagger","travis"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jorgechato.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-03-27T09:30:11.000Z","updated_at":"2019-10-28T17:23:24.000Z","dependencies_parsed_at":"2024-04-18T21:51:14.843Z","dependency_job_id":null,"html_url":"https://github.com/jorgechato/word-search-engine","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jorgechato/word-search-engine","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jorgechato%2Fword-search-engine","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jorgechato%2Fword-search-engine/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jorgechato%2Fword-search-engine/releases","manifests_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/jorgechato%2Fword-search-engine/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jorgechato","download_url":"https://codeload.github.com/jorgechato/word-search-engine/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jorgechato%2Fword-search-engine/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31645154,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-10T07:40:12.752Z","status":"ssl_error","status_checked_at":"2026-04-10T07:40:11.664Z","response_time":98,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","firebase","flask","k8s","public","swagger","travis"],"created_at":"2024-11-07T10:33:59.424Z","updated_at":"2026-04-10T13:30:54.897Z","avatar_url":"https://github.com/jorgechato.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Word search engine\n[![Build Status](https://travis-ci.com/jorgechato/word-search-engine.svg?token=x3vLcsQVEzf1kfJyx1Uv\u0026branch=master)](https://travis-ci.com/jorgechato/word-search-engine)\n[![Docker](https://img.shields.io/badge/docker-image-blue.svg)](https://hub.docker.com/r/jorgechato/word-search-engine)\n\nInput:\n\n- Query to search for (a word).\n- Source where to search (webpage).\n\nOutput:\n\n- Number of times the word exists in the html 
source.\n\nConstraints:\n\n- Count only the given word, not other words that contain it.\n\n## Architecture\n\nTODO: architecture\n\n## API\n\nBase API contract is stored in the [doc](/doc/contract.json) folder.\nYou can see the UI at http://\\\u003cENDPOINT\\\u003e:\\\u003cPORT\\\u003e/ and the live documentation at\nhttp://\\\u003cENDPOINT\\\u003e:\\\u003cPORT\\\u003e/swagger.json.\n\nThe request body can be strict or use limiters.\n\nIf the query is strict, the search engine searches for an exact match.\n\nIf the strict value is `false`, you need to provide limiters. In this case\nthe word can be enclosed between these limiters.\n\nYou can provide more than one limiter.\n\n## Run\n\n```bash\n$ FLASK_APP=src/app flask run\n# or\n$ python src/app.py\n```\n\n```bash\n$ curl -X PUT http://\u003cENDPOINT\u003e:\u003cPORT\u003e/search \\\n      -H 'content-type: application/json' \\\n      -d @'base.template.json'\n```\n\n## Deploy\n\nThe deployment is automated by the CI/CD pipeline but you can always run it on\nyour local machine.\n\n```bash\n# Build docker\n$ docker build -t word-search-engine:latest .\n$ docker run -p 8000:8000 -e PORT=8000 --name word-search-engine word-search-engine:latest\n```\n\nPull the latest version from [hub.docker](https://hub.docker.com/r/jorgechato/word-search-engine) from any machine with docker installed on it.\nYou can automate the process with Terraform and a CI/CD pipeline if you are\nusing ECS, or create a deploy/rollback in K8s.\n\n```bash\n$ docker pull jorgechato/word-search-engine:latest\n# example with k8s\n# do not forget to export the ENV_VARIABLES for the DB connection first\n$ kubectl apply -f deploy/k8s.yml\n```\n\n---\n\n## Requirements\n\n### Must have\n\n- [python 3.x](https://www.python.org/downloads/)\n- pip3\n\n### Recommendation for development\n\n- [anaconda](https://anaconda.org/anaconda/python)\n\n#### Install dependencies\n\n```bash\n# with anaconda\n$ conda env create -f 
environment.yml # create virtual environment\n$ conda activate backend # enter VE\n# or\n$ source activate backend\n(backend) $ conda deactivate # exit VE\n```\n\n---\n\n## FAQ\n\n**Can the word be part of any HTML tag, CSS, or JS embedded in the source code of\nthe page?**\n\nSee also:\n\n* [business] MVP Questions ([#1][i1])\n\n**Does the scraping search take place across the whole sitemap of the domain?**\n\nNo, the search engine only searches the endpoint provided by the requester.\n\n**Why K8S and not AWS Lambda?**\n\n**P 1**: When using Serverless platforms, the first invocation of a function takes\nsome time since the code needs to be initialized. In this case we will need a fast\nresponse since this service will be integrated with a stack of microservices.\n\n**P 2**: Kubernetes might provide better scalability features than some Serverless\nplatforms, since Kubernetes is more mature and even provides HA (high availability)\nbetween different zones, which not all Serverless platforms provide yet.\nAnd we plan to expand our business to different zones.\n\n\n**P 3**: It might be easier to use Kubernetes for more complex applications because\nthe platform is more mature. And since we are planning to use a database to\nstore the outcome of the logic, that makes sense.\n\n\n**P 4**: Serverless doesn’t automatically mean lower costs, for example when your\napplications need to run 24/7. There can also be some hidden costs, like extra\ncosts for API management or the costs of function invocations for tests.\n\n**P 5**: The monitoring capabilities of Kubernetes applications are much more\nmature.\n\n\n[i1]: https://github.com/jorgechato/word-search-engine/issues/1\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjorgechato%2Fword-search-engine","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjorgechato%2Fword-search-engine","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjorgechato%2Fword-search-engine/lists"}