{"id":13679794,"url":"https://github.com/c-bata/pysearch","last_synced_at":"2025-04-29T19:31:54.695Z","repository":{"id":23066689,"uuid":"26420195","full_name":"c-bata/pysearch","owner":"c-bata","description":"Web crawler and Search engine in Python.","archived":true,"fork":false,"pushed_at":"2016-05-23T13:54:53.000Z","size":20,"stargazers_count":53,"open_issues_count":0,"forks_count":19,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-11-11T22:35:37.796Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://nwpct1.hatenablog.com/entry/python-search-engine","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/c-bata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-11-10T03:56:14.000Z","updated_at":"2024-08-12T19:15:18.000Z","dependencies_parsed_at":"2022-07-31T06:37:56.132Z","dependency_job_id":null,"html_url":"https://github.com/c-bata/pysearch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/c-bata%2Fpysearch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/c-bata%2Fpysearch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/c-bata%2Fpysearch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/c-bata%2Fpysearch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/c-bata","download_url":"https://codeload.github.com/c-bata/pysearch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_cou
nt":251569549,"owners_count":21610575,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T13:01:09.655Z","updated_at":"2025-04-29T19:31:49.685Z","avatar_url":"https://github.com/c-bata.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Search Engine and Web Crawler in Python\n\n![Screenshot](https://qiita-image-store.s3.amazonaws.com/0/29989/786c36ad-4de7-43a7-75a0-98c82e412fa3.png \"Screenshot\")\n\n- Implement a web crawler\n- Japanese morphological analysis using [janome](https://github.com/mocobeta/janome)\n- Implement search engine\n- Store in MongoDB\n- Web frontend using [Flask](http://flask.pocoo.org/)\n\nMore details are available on [My Tech Blog (Japanese)](http://nwpct1.hatenablog.com/entry/python-search-engine).\n\n## Requirements\n\n- Python 3.5\n\n## Setup\n\n1. Clone repository\n\n    ```\n    $ git clone git@github.com:mejiro/SearchEngine.git\n    ```\n    \n2. Install Python packages\n\n    ```\n    $ cd SearchEngine\n    $ pip install -r requirements.txt -c constraints.txt\n    ```\n\n3. MongoDB settings\n4. Run\n\n    ```\n    $ python manage.py crawler # build the index\n    $ python manage.py webpage # then open http://127.0.0.1:5000\n    ```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fc-bata%2Fpysearch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fc-bata%2Fpysearch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fc-bata%2Fpysearch/lists"}