https://github.com/typesense/showcase-books-search
A site to instantly search 28M books from OpenLibrary using Typesense Search (an open source alternative to Algolia / ElasticSearch) ⚡ 📚 🔍
instantsearch typesense typesense-instantsearch-adapter typesense-showcase
- Host: GitHub
- URL: https://github.com/typesense/showcase-books-search
- Owner: typesense
- License: apache-2.0
- Created: 2020-12-13T23:05:45.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2025-02-05T02:38:23.000Z (4 months ago)
- Last Synced: 2025-03-28T22:14:36.425Z (2 months ago)
- Topics: instantsearch, typesense, typesense-instantsearch-adapter, typesense-showcase
- Language: JavaScript
- Homepage: https://books-search.typesense.org/
- Size: 676 KB
- Stars: 159
- Watchers: 11
- Forks: 18
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# 📚 Instant Books Search, powered by Typesense
This is a demo that showcases some of [Typesense's](https://github.com/typesense/typesense) features using a database of 28 million books from OpenLibrary (Internet Archive).
View it live here: [books-search.typesense.org](https://books-search.typesense.org/)
## Tech Stack
This search experience is powered by Typesense, which is a blazing-fast, open source, typo-tolerant search engine. It is an open source alternative to Algolia and an easier-to-use alternative to ElasticSearch.

The book dataset is from openlibrary.org. If you're able to contribute book metadata, please do 🙏

The app was built using the Typesense Adapter for InstantSearch.js and is hosted on S3, with CloudFront for a CDN.

The search backend is powered by a geo-distributed 3-node Typesense cluster running on Typesense Cloud, with nodes in Oregon, Frankfurt and Mumbai.

The dataset has ~28M records and takes up 6.8GB on disk and 14.3GB in RAM when indexed in Typesense.
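
For a rough sense of scale, the figures above can be turned into a per-record footprint. This is only back-of-the-envelope arithmetic: it assumes decimal units (1 GB = 1e9 bytes) and treats "~28M" as exactly 28 million, neither of which the README states.

```javascript
// Figures quoted in this README. Decimal gigabytes are an assumption.
const RECORDS = 28e6;
const DISK_BYTES = 6.8e9;
const RAM_BYTES = 14.3e9;

// Average bytes attributable to a single record.
function bytesPerRecord(totalBytes, records) {
  return totalBytes / records;
}

const diskPerRecord = bytesPerRecord(DISK_BYTES, RECORDS);
const ramPerRecord = bytesPerRecord(RAM_BYTES, RECORDS);

console.log(`disk: ~${Math.round(diskPerRecord)} bytes/record`);
console.log(`RAM:  ~${Math.round(ramPerRecord)} bytes/record`);
console.log(`RAM/disk ratio: ~${(RAM_BYTES / DISK_BYTES).toFixed(1)}x`);
```

Under those assumptions each book costs roughly a few hundred bytes on disk and about twice that in RAM once indexed.
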
Indexing these 28M records takes ~3 hours.

## Repo structure
- `src/` and `index.html` - contain the frontend UI components, built with Typesense Adapter for InstantSearch.js
- `scripts/indexer` - contains the script to index the book data into Typesense.
- `scripts/data` - contains a 1K sample subset of the books database. The full dataset can be downloaded from openlibrary.org.

## Development
To run this project locally, install the dependencies and run the local server:
```shell
yarn   # Install the JavaScript dependencies
bundle # JSON parsing takes a while to run using JS when indexing, so we're using Ruby just for indexing

yarn run typesenseServer # Start a local Typesense server

ln -s .env.development .env # Use the development environment settings

yarn run indexer:extractAuthors # This will output an authors.jsonl file
yarn run indexer:transformDataset # This will output a transformed_dataset.json file
BATCH_SIZE=100000 yarn run indexer:importToTypesense # This will import the JSONL file into Typesense

yarn start # Start the local dev server
```

Open http://localhost:3000 to see the app.
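
The `BATCH_SIZE` variable in the import step suggests the records are sent to Typesense in fixed-size chunks. The repo's indexer is written in Ruby and is not reproduced here; the JavaScript below is only a minimal sketch of the batching idea. The names `toBatches`, `importInBatches` and `importBatch` are hypothetical and do not come from the repo.

```javascript
// Hypothetical sketch: parse JSONL lines and group them into fixed-size batches.
function* toBatches(lines, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  let batch = [];
  for (const line of lines) {
    if (line.trim() === "") continue; // skip blank lines
    batch.push(JSON.parse(line));
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // final, possibly smaller batch
}

// Sends each batch through a caller-supplied async function and
// returns the total number of records handed over.
async function importInBatches(lines, batchSize, importBatch) {
  let imported = 0;
  for (const batch of toBatches(lines, batchSize)) {
    await importBatch(batch); // one request per batch, sent sequentially
    imported += batch.length;
  }
  return imported;
}
```

With the typesense-js client, `importBatch` could wrap something like `client.collections("books").documents().import(batch)`, where the collection name `books` is an assumption. Larger batches mean fewer requests at the cost of more memory held per request.
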
## Deployment
The app is hosted on S3, with CloudFront for a CDN.
```shell
yarn build
yarn deploy
```