https://github.com/typesense/showcase-linux-commits-search
Instantly search 1M Linux Kernel Commit Messages using Typesense Search (an open source alternative to Algolia / ElasticSearch) ⚡ 💻 🔍
- Host: GitHub
- URL: https://github.com/typesense/showcase-linux-commits-search
- Owner: typesense
- License: apache-2.0
- Created: 2020-11-09T17:33:24.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2023-02-06T09:46:16.000Z (almost 2 years ago)
- Last Synced: 2024-05-01T09:38:46.227Z (8 months ago)
- Topics: git, instantsearch, linux, linux-kernel, typesense, typesense-showcase
- Language: JavaScript
- Homepage: https://linux-commits-search.typesense.org/
- Size: 1.83 MB
- Stars: 38
- Watchers: 4
- Forks: 3
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Linux Commit History Search
This demo showcases some of Typesense's features using 1 million commit messages from the Linux kernel [repo](https://github.com/torvalds/linux).
View it live here: https://linux-commits-search.typesense.org/
## Tech Stack
This search experience is powered by Typesense, a fast, open source, typo-tolerant search engine. It is an open source alternative to Algolia and an easier-to-use alternative to ElasticSearch.
The dataset was extracted by running `git log` on the Linux Kernel git repo.
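As a rough illustration of that extraction step, the sketch below parses delimited `git log` output into JSON records. The `--pretty=format:` string, the unit-separator delimiter, and the field names are assumptions made for this example; the scripts in this repo may use a different format.

```javascript
// Hypothetical extraction format (not necessarily what scripts/ uses):
//   git log --no-merges --pretty=format:'%H%x1f%an%x1f%ae%x1f%at%x1f%s'
// %x1f emits the ASCII unit separator, which does not normally occur in commit subjects.
const FIELD_SEPARATOR = "\x1f";

// Parse one line of the output above into a record, or return null if the line is malformed.
function parseGitLogLine(line) {
  const parts = line.split(FIELD_SEPARATOR);
  if (parts.length < 5) return null;
  const [sha, authorName, authorEmail, authorTimestamp, ...subjectParts] = parts;
  return {
    sha,
    author_name: authorName,
    author_email: authorEmail,
    author_timestamp: Number(authorTimestamp), // Unix seconds, from %at
    subject: subjectParts.join(FIELD_SEPARATOR),
  };
}

// Convert the whole git log output into JSON Lines, one record per line.
function toJsonl(gitLogOutput) {
  return gitLogOutput
    .split("\n")
    .map(parseGitLogLine)
    .filter((record) => record !== null)
    .map((record) => JSON.stringify(record))
    .join("\n");
}
```

JSON Lines is a convenient target here because Typesense's bulk import endpoint accepts one JSON document per line.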
The dataset is ~950MB on disk, with ~1 million records. Indexing it took 45 minutes on a 3-node Typesense cluster with 4 vCPUs per node, and the index occupied ~3GB of RAM.
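For a sense of scale, the approximate figures above work out to the following per-record numbers. This is back-of-the-envelope arithmetic on rounded inputs (decimal megabytes and gigabytes assumed), not a benchmark.

```javascript
// Approximate figures quoted above, using decimal units.
const records = 1e6;             // ~1 million commits
const diskBytes = 950e6;         // ~950MB on disk
const ramBytes = 3e9;            // ~3GB index in RAM
const indexingSeconds = 45 * 60; // 45 minutes

const stats = {
  bytesPerRecordOnDisk: diskBytes / records,               // 950
  bytesPerRecordInRam: ramBytes / records,                 // 3000
  recordsPerSecond: Math.round(records / indexingSeconds), // 370
  ramToDiskRatio: (ramBytes / diskBytes).toFixed(1),       // "3.2"
};

console.log(stats);
```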
The app was built using the [Typesense Adapter for InstantSearch.js](https://github.com/typesense/typesense-instantsearch-adapter) and is hosted on S3, with CloudFront for a CDN.
The search backend is powered by a geo-distributed 3-node Typesense cluster running on [Typesense Cloud](https://cloud.typesense.org), with nodes in Oregon, Frankfurt and Mumbai.
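A minimal sketch of how a frontend might point the adapter at a multi-node cluster like the one described above. The host names, the `message` field, and the `buildAdapterOptions` helper are hypothetical; `server.nodes`, `server.nearestNode`, and `additionalSearchParameters.query_by` are options of the Typesense client and adapter, but this repo's actual configuration may differ.

```javascript
// Build the options object for the Typesense InstantSearch adapter.
// All hosts and field names passed in are placeholders for illustration.
function buildAdapterOptions({ apiKey, nodeHosts, nearestHost, queryBy }) {
  const toNode = (host) => ({ host, port: 443, protocol: "https" });
  return {
    server: {
      apiKey, // should be a search-only key, since it ships to the browser
      nodes: nodeHosts.map(toNode),
      // Optional geo load-balanced endpoint; the client tries it first
      // and falls back to the individual nodes.
      ...(nearestHost ? { nearestNode: toNode(nearestHost) } : {}),
    },
    additionalSearchParameters: { query_by: queryBy },
  };
}

const options = buildAdapterOptions({
  apiKey: "search-only-key",
  nodeHosts: ["oregon.example.com", "frankfurt.example.com", "mumbai.example.com"],
  nearestHost: "nearest.example.com",
  queryBy: "message",
});

// In the browser bundle, the options would then be used roughly like this:
//   import TypesenseInstantSearchAdapter from "typesense-instantsearch-adapter";
//   const adapter = new TypesenseInstantSearchAdapter(options);
//   const search = instantsearch({ indexName: "commits", searchClient: adapter.searchClient });
```

Listing every node lets the client retry against another region if one is unreachable.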
## Repo structure
- `src/` and `index.html` - contain the frontend UI components, built with Typesense Adapter for InstantSearch.js
- `scripts/` - contains the scripts to extract, transform and index the git log data into Typesense.
## Development
1. Create a `.env` file using `.env.example` as reference.
2. Extract commit history
```shell
mkdir -p data/linux
cd data/linux
# Clone the kernel repo into the current directory (data/linux)
git clone https://github.com/torvalds/linux .
yarn extractCommitHistory:merges
yarn extractCommitHistory:nonMerges
```
3. Transform and index the data
```shell
bundle install
gzip data/git-log-output
yarn transformDataset
yarn run typesenseServer
UPDATE_COLLECTION_ALIAS=true yarn index
```
4. Install dependencies and run the local server:
```shell
yarn
yarn start
```
Open http://localhost:3000 to see the app.
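Step 3 above sets `UPDATE_COLLECTION_ALIAS=true` when indexing. The flag name suggests the common Typesense pattern of indexing into a fresh, timestamped collection and then repointing an alias at it, so searches switch over only once the new index is complete. The sketch below illustrates that pattern in JavaScript; it is an assumption about the flag's behavior, not a port of this repo's indexing script, and the `indexWithAlias` helper and collection naming are hypothetical. The `client` argument is expected to look like a `typesense` JS client.

```javascript
// Index documents into a new collection, then optionally point an alias at it.
// `client` is assumed to expose the typesense-js methods used below:
// collections().create, collections(name).documents().import, aliases().upsert.
async function indexWithAlias(client, aliasName, schema, jsonlLines, updateAlias) {
  // Timestamped name so each indexing run gets its own collection (hypothetical convention).
  const collectionName = `${aliasName}_${Date.now()}`;

  await client.collections().create({ ...schema, name: collectionName });

  // Bulk import takes JSON Lines: one JSON document per line.
  await client.collections(collectionName).documents().import(jsonlLines.join("\n"));

  if (updateAlias) {
    // Searches that query the alias now hit the freshly built collection.
    await client.aliases().upsert(aliasName, { collection_name: collectionName });
  }
  return collectionName;
}
```

Because the frontend would query the alias rather than a concrete collection name, a failed or partial indexing run leaves the previously aliased collection untouched.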
## Deployment
The app is hosted on S3, with CloudFront for a CDN.
```shell
yarn build
yarn deploy
```