Awesome Search - this is all about the (e-commerce, but not only) search and its awesomeness
https://github.com/frutik/awesome-search

List: awesome-search

autocomplete-suggestions ecommerce-search evaluating-search knowledge-graph learning-to-rank natural-language-processing query-understanding ranking relevance-algorithms relevant-search search search-engine search-engines search-intents search-ui search-ux semantic-search spelling-correction suggestions synonyms

# Awesome Search

Support the Ukrainian fight for freedom

[RUSSIAN WARSHIP, GO F*CK YOURSELF](https://en.wikipedia.org/wiki/Russian_warship,_go_fuck_yourself!)

I've been building e-commerce search applications for almost ten years. Below is a list of (some) publications, conferences, and books that inspire me, grouped by topic. If an article fits multiple topics, it appears in multiple sections.

:star: Star us on GitHub — it helps!

Also check out my other collections: [awesome e-commerce](https://github.com/frutik/awesome-e-commerce), [awesome knowledge graphs](https://github.com/frutik/awesome-knowledge-graphs), and [awesome cloud apps](https://github.com/frutik/awesome-cloud-apps).

### Topics
- [General, fun, philosophy](#general-fun-philosophy)
- [Types of search](#types-of-search)
- [Classic/Lexical search](#classiclexical-search)
- [Vectors/Semantic search](#vectorssemantic-search)
- [Embeddings](#embeddings)
- [Types of vectors](#types-of-vectors)
- [Dense vectors](#dense-vectors)
- [Matryoshka embeddings](#matryoshka-embeddings)
- [Sparse vectors](#sparse-vectors)
- [SPLADE](#splade)
- [Encoder models](#encoder-models)
- Main architectures
- [Query/Document tokens interaction](#querydocument-tokens-interaction)
- [No interactions - Two towers / Bi-encoders](#no-interactions---two-towers--bi-encoders)
- [Early interactions - Cross-encoders](#early-interactions---cross-encoders)
- [Late interactions - ColBERT](#late-interactions---colbert)
- [Handling high-dimension embeddings](#handling-high-dimension-embeddings)
- [Dimensionality reduction](#dimensionality-reduction)
- [Quantization](#quantization)
- [Scalar quantization](#scalar-quantization)
- [Binary quantization](#binary-quantization)
- [Symmetric and Asymmetric semantic search](#symmetric-and-asymmetric-semantic-search)
- [Hybrid search](#hybrid-search)
- [Reciprocal rank fusion - RRF](#reciprocal-rank-fusion-rrf)
- [Multimodal search](#multimodal-search)
- [Areas of application](#areas-of-application)
- [Enterprise search](#enterprise-search)
- [e-Commerce search](#e-commerce-search)
- [Conversational search](#conversational-search)
- [Geo-Spatial Search]()
- [Medical and Healthcare Search]()
- [Social Media and User-Generated Content Search]()
- [Question Answering Systems]()
- [Personal Information Management]()
- [Search Results](#search-results)
- [Retrieval](#retrieval)
- [Relevance](#relevance)
- [Relevance Algorithms](#relevance-algorithms)
- [Ranking](#ranking)
- [Multi stage ranking](#multi-stage-ranking)
- [Learning to Rank](#learning-to-rank)
- [Click models for search](#click-models-for-search)
- [Bias](#bias)
- [Diversification](#diversification)
- [Personalisation](#personalisation)
- [Zero search results](#zero-search-results)
- [Search UX](#search-ux)
- [Baymard Institute](#baymard-institute)
- [Nielsen Norman Group](#nielsen-norman-group)
- [Enterprise Knowledge LLC](#enterprise-knowledge-llc)
- [Facets](#facets)
- [Accidental Taxonomist](#accidental-taxonomist)
- [Other](#other)
- [Spelling correction](#spelling-correction)
- [Suggestions](#suggestions)
- [Synonyms](#synonyms)
- [Stopwords](#stopwords)
- [Graphs/Taxonomies/Knowledge Graph](#graphstaxonomiesknowledge-graph)
- Integrating Search and Knowledge Graphs (by Enterprise Knowledge)
- [Query expansion](#query-expansion)
- [Query understanding](#query-understanding)
- [Search Intent](#search-intent)
- [Query segmentation](#query-segmentation)
- [Algorithms](#algorithms)
- [BERT](#bert)
- [ColBERT](#colbert)
- [Collocations, common phrases](#collocations-common-phrases)
- [Other Algorithms](#other-algorithms)
- [Tracking, profiling, GDPR, Analysis](#tracking-profiling-gdpr-analysis)
- [Experiments](#experiments)
- A/B testing, MABs
- Evaluating search
- MRR
- [Testing, metrics, KPIs](#testing-metrics-kpis)
- Metrics
- KPIs
- Evaluating Search (by Daniel Tunkelang)
- Measuring Search (by James Rubinstein)
- Three Pillars of Search Relevancy (by Andreas Wagner)
- [Architecture](#architecture)
- [Vectors search](#vectors-search)
- [Education and networking](#education-and-networking)
- [Conferences](#conferences)
- [Trainings and courses](#trainings-and-courses)
- [Books](#books)
- [Blogs and Portals, News](#blogs-and-portals)
- [Papers](#papers)
- [Management, Search Team](#management-search-team)
- [Job Interviews](#job-interviews)
- [Engineering](#engineering)
- [Industry players](#industry-players)
- Personalities and influencers
- Search Engines
- Products and services
- Consulting companies
- [Blogposts series](#blogposts-series)
- Search Optimization 101 (by Charlie Hull)
- Query Understanding (by Daniel Tunkelang)
- Grid Dynamics
- Considering Search: Search Topics (by Derek Sisson)
- [Videos](#videos)
- Channels
- Featured
- [Case studies](#case-studies)
- [General search](#general-search)
- [Multisided markets](#multisided-markets)
- [E-commerce](#e-commerce)
- [Datasets](#datasets)
- [Tools](#tools)

## Unsorted

- [sandbox Jun 2021](https://github.com/frutik/awesome-search/issues/19)
- [sandbox May 2021](https://github.com/frutik/awesome-search/issues/18)
- [sandbox April 2021](https://github.com/frutik/awesome-search/issues/17)
- [sandbox Dec 2020](https://github.com/frutik/awesome-search/issues/10)
- [sandbox Jan 2020](https://github.com/frutik/awesome-search/issues/1)

## General, fun, philosophy

* [Falsehoods Programmers Believe About Search](https://opensourceconnections.com/blog/2019/05/29/falsehoods-programmers-believe-about-search/)
* [Ethical Search: Designing an irresistible journey with a positive impact](https://medium.com/empathyco/fooddiscovery-2-ethical-search-designing-an-irresistible-journey-with-a-positive-impact-cc921c07a5a8)
* [On Semantic Search](https://medium.com/modern-nlp/semantic-search-fuck-yeah-e371c0f639d)
* [Feedback debt: what the segway teaches search teams](https://opensourceconnections.com/blog/2020/03/19/feedback-debt/)
* [Supporting the Searcher’s Journey: When and How](https://medium.com/@dtunkelang/supporting-the-searchers-journey-when-and-how-568e9b68fe02)
* [Shopping is Hard, Let’s go Searching!](https://medium.com/@dtunkelang/shopping-is-hard-lets-go-searching-f61f3d5764d3)
* [An Introduction to Search Quality](https://opensourceconnections.com/blog/2018/11/19/an-introduction-to-search-quality/)
* [On-Site Search Design Patterns for E-Commerce: Schema Structure, Data Driven Ranking & More](https://project-a.github.io/on-site-search-design-patterns-for-e-commerce/)
* [In Search of Recall](https://www.linkedin.com/pulse/search-recall-daniel-tunkelang/)
* [Balance Your Search Budget!](https://www.linkedin.com/pulse/balance-your-search-budget-daniel-tunkelang/)

## Types of search

* [Evolution of Search Technology: A Look Ahead](https://medium.com/@Ratnaparkhi/how-the-search-technology-is-evolving-88607f5efb9e)

### Classic/Lexical Search

* Etsy. [Targeting Broad Queries in Search](https://codeascraft.com/2015/07/29/targeting-broad-queries-in-search/)
* [How Etsy Uses Thermodynamics to Help You Search for “Geeky”](https://codeascraft.com/2015/08/31/how-etsy-uses-thermodynamics-to-help-you-search-for-geeky/)
* [Broad and Ambiguous Search Queries](https://medium.com/@dtunkelang/broad-and-ambiguous-search-queries-1bbbe417dcc)
* [Deconstructing E-Commerce Search: The 12 Query Types](https://baymard.com/blog/ecommerce-search-query-types)

### Vectors/Semantic search

* [Migrating to Elasticsearch with dense vector for Carousell Spotlight search engine](https://medium.com/carousell-insider/migrating-to-elasticsearch-with-dense-vector-for-carousell-spotlight-search-engine-e328b16155fc)
* [From zero to semantic search embedding model](https://blog.metarank.ai/from-zero-to-semantic-search-embedding-model-592e16d94b61)
* [Guidelines to choose an index](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index)
* [Pinecone Series](#pinecone-series)
* [Nearest Neighbor Indexes for Similarity Search](https://www.pinecone.io/learn/series/faiss/vector-indexes/)
* [The Missing WHERE Clause in Vector Search](https://www.pinecone.io/learn/vector-search-filtering/)

#### Symmetric and Asymmetric semantic search

* [Symmetric vs. Asymmetric Semantic Search](https://www.sbert.net/examples/applications/semantic-search/README.html#symmetric-vs-asymmetric-semantic-search)

#### Embeddings
##### Types

* [Bi-encoder vs Cross encoder? When to use which one?](https://medium.com/@sujathamudadla1213/bi-encoder-vs-cross-encoder-when-to-use-which-one-4a20edbe6d37)
* [What is ColBERT and Late Interaction and Why They Matter in Search?](https://jina.ai/news/what-is-colbert-and-late-interaction-and-why-they-matter-in-search/)

#### Encoder models

##### Query/Document tokens interaction

###### No interactions - Two towers / Bi-encoders

###### Early interactions - Cross-encoders

###### Late interactions - ColBERT

- [Announcing the Vespa ColBERT embedder](https://blog.vespa.ai/announcing-colbert-embedder-in-vespa/)
- [What is ColBERT and Late Interaction and Why They Matter in Search?](https://jina.ai/news/what-is-colbert-and-late-interaction-and-why-they-matter-in-search/)

##### Dense Vectors

###### Matryoshka embeddings

* [Matryoshka embeddings: faster OpenAI vector search using Adaptive Retrieval](https://supabase.com/blog/matryoshka-embeddings)
* [Introduction to Matryoshka Embedding Models](https://huggingface.co/blog/matryoshka)
* [Matryoshka representations. A guide to faster semantic search](https://ujjwalm29.medium.com/matryoshka-representation-learning-a-guide-to-faster-semantic-search-1c9025543530)

##### Sparse Vectors

###### SPLADE

* [Hybrid Search: SPLADE (Sparse Encoder)](https://medium.com/@sowmiyajaganathan/hybrid-search-splade-sparse-encoder-neural-retrieval-models-d092e5f46913)
* [SPLADE for Sparse Vector Search Explained](https://www.pinecone.io/learn/splade/)

#### Handling high-dimension embeddings
##### Dimensionality reduction
##### Quantization
##### Binary embeddings

### Hybrid search

* [Hybrid search > sum of its parts?](https://pretalx.com/bbuzz22/talk/YEHRTE/)
* [On Hybrid Search](https://qdrant.tech/articles/hybrid-search/#)
* [Hybrid search with Re-ranking](https://medium.com/@sowmiyajaganathan/hybrid-search-with-re-ranking-ff120c8a426d)

#### Reciprocal rank fusion (RRF)

* [Hybrid search with Re-ranking](https://medium.com/@sowmiyajaganathan/hybrid-search-with-re-ranking-ff120c8a426d)
* [Reciprocal rank fusion](https://www.elastic.co/guide/en/elasticsearch/reference/current/rrf.html)
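
For orientation, here is a minimal sketch of the idea behind RRF: each document receives `1 / (k + rank)` from every ranked list it appears in, and the contributions are summed. The function name, the sample document ids, and the tie-breaking rule are illustrative assumptions; `k=60` is the commonly cited default constant, not something this list prescribes.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    `rankings` is a list of lists, each ordered best-first.
    Ranks are 1-based; `k` dampens the influence of top positions.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a lexical and a vector retriever.
lexical = ["a", "b", "c"]
vector = ["c", "a", "d"]
fused = reciprocal_rank_fusion([lexical, vector])
print(fused)
```

Because only ranks are used, the two retrievers do not need comparable score scales, which is the main reason RRF is popular for hybrid search.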

### Multimodal search

* [Muves: Multimodal & multilingual vector search w/ Hardware Acceleration](https://www.youtube.com/watch?v=9OS8cMf2rwY)
* [Model Selection for Multimodal Search](https://docs.marqo.ai/2.6/Cookbook/model_selection/multimodal_search/#convnext-models)

## Areas of application

### Enterprise search

* [GenAI Can Improve Enterprise Search, But Remains a Work In Progress](https://www.reworked.co/knowledge-findability/genai-can-improve-enterprise-search-but-remains-a-work-in-progress/)

### e-Commerce search

* [The influence of TF-IDF algorithms in eCommerce search](https://medium.com/empathyco/the-influence-of-tf-idf-algorithms-in-ecommerce-search-e7cb9ab8e662)

### Conversational search

* [Search as a Conversation](https://queryunderstanding.com/search-as-a-conversation-bafa7cd0c9a5)
* [Affordances for Conversational Search](https://dtunkelang.medium.com/affordances-for-conversational-search-2cc543eae83d)
* [Query Understanding and Chatbots](https://queryunderstanding.com/query-understanding-and-chatbots-5fa0c154f)

## Search Results

### Retrieval

#### Relevance

* [Humans Search for Things not for Strings](https://www.linkedin.com/pulse/humans-search-things-strings-andreas-wagner/)
* [What is a ‘Relevant’ Search Result?](https://opensourceconnections.com/blog/2019/12/11/what-is-a-relevant-search-result/)
* [How to Achieve Ecommerce Search Relevance](https://blog.searchhub.io/how-to-achieve-ecommerce-search-relevance?cn-reloaded=1&cn-reloaded=1)
* [Setting up a relevance evaluation program](https://medium.com/@jamesrubinstein/setting-up-a-relevance-evaluation-program-c955d32fba0e)

##### Relevance Algorithms

* Practical BM25: [How Shards Affect Relevance Scoring in Elasticsearch](https://www.elastic.co/blog/practical-bm25-part-1-how-shards-affect-relevance-scoring-in-elasticsearch), [The BM25 Algorithm and its Variables](https://www.elastic.co/blog/practical-bm25-part-2-the-bm25-algorithm-and-its-variables)
* [The influence of TF-IDF algorithms in eCommerce search](https://medium.com/empathyco/the-influence-of-tf-idf-algorithms-in-ecommerce-search-e7cb9ab8e662)
* [BM25 The Next Generation of Lucene Relevance](https://opensourceconnections.com/blog/2015/10/16/bm25-the-next-generation-of-lucene-relevation/)
* [Lucene Similarities (BM25, DFR, DFI, IB, LM) Explained](https://sematext.com/blog/search-relevance-solr-elasticsearch-similarity/)
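
As a companion to the articles above, a minimal sketch of the textbook BM25 formula over pre-tokenized documents. It is not Lucene's implementation (which differs in details such as length encoding); the function name and sample documents are illustrative assumptions, and `k1=1.2`, `b=0.75` are the commonly used defaults.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Rare terms get a higher idf; the "+ 1" keeps it positive.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates via k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "red running shoes".split(),
    "red red red dress with red belt and red shoes".split(),
    "blue jeans".split(),
]
scores = bm25_scores(["red"], docs)
print(scores)
```

The second document mentions "red" five times but scores only modestly higher than the first, which illustrates term-frequency saturation and length normalization.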

### Ranking

#### Multi-stage ranking

* [Multi stage ranking](https://medium.com/better-ml/multi-stage-ranking-e0dacd81ac4)

#### Learning to Rank

* [How is search different than other machine learning problems?](https://opensourceconnections.com/blog/2017/08/03/search-as-machine-learning-prob/)
* [Reinforcement learning assisted search ranking](https://medium.com/sajari/reinforcement-learning-assisted-search-ranking-a594cdc36c29)
* [E-commerce Search Re-Ranking as a Reinforcement Learning Problem](https://towardsdatascience.com/e-commerce-search-re-ranking-as-a-reinforcement-learning-problem-a9d1561edbd0)
* [When to use a machine learned vs. score-based search ranker](https://towardsdatascience.com/when-to-use-a-machine-learned-vs-score-based-search-ranker-aa8762cd9aa9)
* [What is Learning To Rank?](https://opensourceconnections.com/blog/2017/02/24/what-is-learning-to-rank/)
* [Using AI and Machine Learning to Overcome Position Bias within Adobe Stock Search](https://medium.com/adobetech/evaluating-addressing-position-bias-in-adobe-stock-search-9807b11ee268)
* [Train and Test Sets Split for Evaluating Learning To Rank Models](https://sease.io/2022/07/how-to-split-your-dataset-into-train-and-test-sets-for-evaluating-learning-to-rank-models.html)
* [How LambdaMART works - optimizing product ranking goals](https://softwaredoug.com/blog/2021/11/28/how-lammbamart-works.html)

##### Click models for search

* [Click models](https://github.com/filipecasal/knowledge-repo/blob/master/click_models.md)
* [Click Modeling for eCommerce](https://tech.ebayinc.com/engineering/click-modeling-for-ecommerce/)
* [Using Behavioral Data to Improve Search](https://tech.ebayinc.com/engineering/using-behavioral-data-to-improve-search/)

### Bias

* [What is Presentation Bias in search?](https://softwaredoug.com/blog/2022/07/16/what-is-presentation-bias-in-search.html)
* [Dealing with Position Bias in Recommendations and Search](https://www.kdnuggets.com/2023/03/dealing-position-bias-recommendations-search.html)

### Diversification

* [Search Result Diversification using Causal Language Models](https://arxiv.org/pdf/2108.04026.pdf)
* [Learning to Diversify for E-commerce Search with Multi-Armed Bandit](http://ceur-ws.org/Vol-2410/paper18.pdf)
* [Search Quality for Discovery & Inspiration](https://blog.searchhub.io/three-pillars-of-search-quality-in-ecommerce-part-2-discovery-inspiration)
* [How to measure Diversity of Search Results](https://2021.berlinbuzzwords.de/session/how-measure-diversity-search-results)
* [Searching for Goldilocks](https://dtunkelang.medium.com/searching-for-goldilocks-12cb21c7d036)
* [Broad and Ambiguous Search Queries - Recognizing When Search Results Need Diversification](https://dtunkelang.medium.com/broad-and-ambiguous-search-queries-1bbbe417dcc)
* [Thoughts on Search Result Diversity](https://dtunkelang.medium.com/thoughts-on-search-result-diversity-1df54cb5bf4a)

### Personalisation

* [Patterns for Personalization in Recommendations and Search](https://eugeneyan.com/writing/patterns-for-personalization/)
* Daniel Tunkelang [Personalization](https://queryunderstanding.com/personalization-3ed715e05ef)
* Airbnb - [Real-time personalization in search](https://medium.com/airbnb-engineering/listing-embeddings-for-similar-listing-recommendations-and-real-time-personalization-in-search-601172f7603e)
* [98 personal data points that facebook uses to target ads to you](https://www.washingtonpost.com/news/the-intersect/wp/2016/08/19/98-personal-data-points-that-facebook-uses-to-target-ads-to-you/)
* [Architecture of real world recommendation systems](https://fennel.ai/blog/real-world-recommendation-system/)
* [Feature engineering for personalized search](https://fennel.ai/blog/feature-engineering-for-personalized-search/)

### Zero search results

* [Strategies for using alternative queries to mitigate zero results and their application to online marketplaces](https://haystackconf.com/files/slides/haystackEU2023/Jean_Ren%C3%A9_Strategies_for_using_alternative_queries_to_mitigate_zero_results.pdf)
* [Semantic Equivalence of e-Commerce Queries](https://dtunkelang.medium.com/semantic-equivalence-of-e-commerce-queries-78630e5fab5d)

## Search UX

### Baymard Institute

* [Deconstructing E-Commerce Search: The 12 Query Types](https://baymard.com/blog/ecommerce-search-query-types)
* [Autodirect or Guide Users to Matching Category](https://baymard.com/blog/autodirect-searches-matching-category-scopes)
* [13 Design Patterns for Autocomplete Suggestions (27% Get it Wrong)](https://baymard.com/blog/autocomplete-design)
* [E-Commerce Search Needs to Support Users’ Non-Product Search Queries (15% Don’t)](https://baymard.com/blog/support-non-product-search)
* [Search UX: 6 Essential Elements for ‘No Results’ Pages](https://baymard.com/blog/no-results-page)
* [Product Thumbnails Should Dynamically Update to Match the Variation Searched For (54% Don’t)](https://baymard.com/blog/color-and-variation-searches)
* [Faceted Sorting - A New Method for Sorting Search Results](https://baymard.com/blog/faceted-sorting)
* [The Current State of E-Commerce Search](https://baymard.com/blog/external-article-state-of-ecommerce-search)
* [E-Commerce Sites Need Multiple of These 5 ‘Search Scope’ Features](https://baymard.com/blog/search-scope)
* [E-Commerce Search Field Design and Its Implications](https://baymard.com/blog/search-field-design)
* [E-Commerce Sites Should Include Contextual Search Snippets (96% Get it Wrong)](https://baymard.com/blog/search-snippets)
* [E-Commerce Search Usability: Report & Benchmark](https://baymard.com/blog/ecommerce-search-report-and-benchmark)
* [Six ‘COVID-19’ Related E-Commerce UX Improvements to Make](https://baymard.com/blog/covid-19-ux-improvements)

### Nielsen Norman Group

* [The Love-at-First-Sight Gaze Pattern on Search-Results Pages](https://www.nngroup.com/articles/love-at-first-sight-pattern/)
* [Good Abandonment on Search Results Pages](https://www.nngroup.com/articles/good-abandonment/)
* [Complex Search-Results Pages Change Search Behavior: The Pinball Pattern](https://www.nngroup.com/articles/pinball-pattern-search-behavior/)
* [Site Search Suggestions](https://www.nngroup.com/articles/site-search-suggestions/)
* [Search-Log Analysis: The Most Overlooked Opportunity in Web UX Research](https://www.nngroup.com/articles/search-log-analysis/)
* [Scoped Search: Dangerous, but Sometimes Useful](https://www.nngroup.com/articles/scoped-search/)
* [3 Guidelines for Search Engine "No Results" Pages](https://www.nngroup.com/articles/search-no-results-serp/)

### Enterprise Knowledge LLC

* [Optimizing Your Search Experience: A Human-Centered Approach to Search Design](https://enterprise-knowledge.com/optimizing-your-search-experience-a-human-centered-approach-to-search-design/)

### Facets

* [Facets of Faceted Search](https://medium.com/@dtunkelang/facets-of-faceted-search-38c3e1043592)
* [Coffee, Coffee, Coffee!](https://medium.com/@dtunkelang/coffee-coffee-coffee-de3121b797d1)
* [Faceted Search](https://queryunderstanding.com/faceted-search-7d053cc4fada) (start here!)
* [How to implement faceted search the right way](https://medium.com/empathyco/how-to-implement-faceted-search-the-right-way-4bfba2bd2adc)
* [Metadata and Faceted Search](https://medium.com/searchblox/metadata-and-faceted-search-62ec6e4de353)
* [Metacrap: Putting the torch to seven straw-men of the meta-utopia](https://people.well.com/user/doctorow/metacrap.htm)
* [7 Filtering Implementations That Make Macy’s Best-in-Class](https://baymard.com/blog/macys-filtering-experience)
* [Facet Search: The Most Comprehensive Guide. Best Practices, Design Patterns, Hidden Caveats, And Workarounds](https://hybrismart.com/2019/02/13/facet-search-the-most-comprehensible-guide-best-practices-design-patterns/#d5)
* [Facets: Constraints or Preferences?](https://dtunkelang.medium.com/facets-constraints-or-preferences-8b8689903652)
* [Facets, But Which Ones?](https://dtunkelang.medium.com/facets-but-which-ones-6589416ed4db)

#### Accidental Taxonomist

* [How Many Facets Should a Taxonomy Have](http://accidental-taxonomist.blogspot.com/2020/07/how-many-facets-in-taxonomy.html)
* [When a Taxonomy Should not be Hierarchical](https://accidental-taxonomist.blogspot.com/2020/06/when-taxonomy-should-not-be-hierarchical.html)
* [Customizing Taxonomy Facets](http://accidental-taxonomist.blogspot.com/2020/10/customizing-taxonomy-facets.html)

### Other

* [Learning from Friction to Improve the Search Experience](https://medium.com/@dtunkelang/learning-from-friction-to-improve-the-search-experience-8937c71ec97a)
* [Why is it so hard to sort by price?](https://medium.com/@dtunkelang/why-is-it-so-hard-to-sort-by-price-2a5e63899233)
* [Faceted Sorting](https://baymard.com/blog/faceted-sorting)
* [Google kills Instant Search](https://www.904labs.com/en/blog-google-kills-instant-search.html)

## Spelling correction

* Peter Norvig. ["How to Write a Spelling Corrector"](http://norvig.com/spell-correct.html). Classic publication.
* Daniel Tunkelang. ["Spelling Correction"](https://queryunderstanding.com/spelling-correction-471f71b19880)
* [A simple spell checker built from word vectors](https://blog.usejournal.com/a-simple-spell-checker-built-from-word-vectors-9f28452b6f26)
* A closer look into the spell correction problem: [1](https://medium.com/@searchhub.io/a-closer-look-into-the-spell-correction-problem-part-1-a6795bbf7112), [2](https://medium.com/@searchhub.io/a-closer-look-into-the-spell-correction-problem-part-2-introducing-predict-8993ecab7226), [3](https://medium.com/@searchhub.io/a-closer-look-into-the-spell-correction-problem-part-3-the-bells-and-whistles-19697a34011b), [preDict](https://github.com/searchhub/preDict)
* [Deep Spelling](https://machinelearnings.co/deep-spelling-9ffef96a24f6)
* [Modeling Spelling Correction for Search at Etsy](https://codeascraft.com/2017/05/01/modeling-spelling-correction-for-search-at-etsy/)
* Wolf Garbe. Author of [SymSpell](https://github.com/wolfgarbe/symspell). [1000x Faster Spelling Correction algorithm](https://medium.com/@wolfgarbe/1000x-faster-spelling-correction-algorithm-2012-8701fcd87a5f), [SymSpell vs. BK-tree: 100x faster fuzzy string search & spell checking](https://towardsdatascience.com/symspell-vs-bk-tree-100x-faster-fuzzy-string-search-spell-checking-c4f10d80a078), [Fast Word Segmentation of Noisy Text](https://towardsdatascience.com/fast-word-segmentation-for-noisy-text-2c2c41f9e8da)
* [Chars2vec: character-based language model for handling real world texts with spelling errors and](https://hackernoon.com/chars2vec-character-based-language-model-for-handling-real-world-texts-with-spelling-errors-and-a3e4053a147d)
* JamSpell, spelling correction taking into account surrounding context - [library](https://github.com/bakwc/JamSpell), (in Russian) [Correcting typos with context in mind](https://habr.com/ru/post/346618/)
* [Embedding for spelling correction](https://towardsdatascience.com/embedding-for-spelling-correction-92c93f835d79)
* [What are some algorithms of spelling correction that are used by search engines?](https://www.quora.com/String-Searching-Algorithms/What-are-some-algorithms-of-spelling-correction-that-are-used-by-search-engines-For-example-when-I-used-Google-to-search-Google-imeges-it-prompted-me-Did-you-mean-Google-images/answer/Wolf-Garbe)
* [Moman](https://github.com/jpbarrette/moman) - Lucene/Solr/Elasticsearch spell correction/autocorrect is (was?) actually powered by this library.
* [Query Segmentation and Spelling Correction](https://towardsdatascience.com/query-segmentation-and-spelling-correction-483173008981)
* [Applying Context Aware Spell Checking in Spark NLP](https://medium.com/spark-nlp/applying-context-aware-spell-checking-in-spark-nlp-3c29c46963bc)
* [Autocorrect in Google, Amazon and Pinterest and how to write your own one](https://towardsdatascience.com/autocorrect-in-google-amazon-and-pinterest-and-how-to-write-your-own-one-6d23bc927c81)

## Synonyms

* [Boosting the power of Elasticsearch with synonyms](https://www.elastic.co/blog/boosting-the-power-of-elasticsearch-with-synonyms)
* [Real Talk About Synonyms and Search](https://medium.com/@dtunkelang/real-talk-about-synonyms-and-search-bb5cf41a8741)
* [Synonyms in Solr I — The good, the bad and the ugly](https://medium.com/empathyco/synonyms-in-solr-i-the-good-the-bad-and-the-ugly-efe8e437a940)
* [Synonyms and Antonyms from WordNet](https://medium.com/@tameremil/synonyms-and-antonyms-from-wordnet-778f6274fb09)
* [Synonyms and Antonyms in Python](https://towardsdatascience.com/synonyms-and-antonyms-in-python-a865a5e14ce8)
* [Dive into WordNet with NLTK](https://medium.com/parrot-prediction/dive-into-wordnet-with-nltk-b313c480e788)
* [Creating Better Searches Through Automatic Synonym Detection](https://lucidworks.com/post/search-automatic-synonym-detection/)
* [Multiword synonyms in search using Querqy](https://sharing.luminis.eu/blog/multiword-synonyms-in-search-using-querqy/)
* [How to Build a Smart Synonyms Model](https://blog.kensho.com/how-to-build-a-smart-synonyms-model-1d525971a4ee)
* [The importance of Synonyms in eCommerce Search](https://blog.searchhub.io/the-importance-of-synonyms-in-ecommerce-search)

## Stopwords

- [Do all-stopword queries matter?](https://observer.wunderwood.org/2007/05/31/do-all-stopword-queries-matter/)

## Suggestions

Synonyms: autocomplete, search as you type, suggestions

* Giovanni Fernandez-Kincade. [Bootstrapping Autosuggest](https://medium.com/related-works-inc/bootstrapping-autosuggest-c1ca3edaf1eb), [Building an Autosuggest Corpus, Part 1](https://medium.com/related-works-inc/building-an-autosuggest-corpus-part-1-3acd26056708), [Building an Autosuggest Corpus, Part 2](https://medium.com/related-works-inc/building-an-autosuggest-corpus-nlp-d21b0f25c31b), [Autosuggest Retrieval Data Structures & Algorithms](https://medium.com/related-works-inc/autosuggest-retrieval-data-structures-algorithms-3a902c74ffc8), [Autosuggest Ranking](https://medium.com/related-works-inc/autosuggest-ranking-d8a3242c2837)
* [On two types of suggestions](https://web.archive.org/web/20181207194952/https://www.searchblox.com/autosuggest-search-query-based-vs-content-based)
* [Improving Search Suggestions for eCommerce](https://medium.com/empathyco/improving-search-suggestions-for-ecommerce-cb1bc2946021)
* [Autocomplete Search Best Practices to Increase Conversions](https://lucidworks.com/post/autocomplete-search-increase-conversions/)
* [Why we’ve developed the searchhub smartSuggest module and why it might matter to you](https://www.linkedin.com/pulse/why-weve-developed-searchhub-smartsuggest-module-might-andreas-wagner/)
* Nielsen Norman Group: [Site Search Suggestions](https://www.nngroup.com/articles/site-search-suggestions/)
* [13 Design Patterns for Autocomplete Suggestions](https://baymard.com/blog/autocomplete-design)
* [Autocomplete](https://queryunderstanding.com/autocomplete-69ed81bba245)
* [Autocomplete and User Experience](https://queryunderstanding.com/autocomplete-and-user-experience-421df6ab3000)
* [Implementing a LinkedIn-like search as you type with Elasticsearch](https://spinscale.de/posts/2020-05-29-implementing-a-linkedin-like-search-as-you-type-with-elasticsearch.html)
* [Smart autocomplete best practices: improve search relevance and sales](https://blog.griddynamics.com/smart-autocomplete-best-practices/)
* OLX: [Building Corpus for AutoSuggest (Part 1)](https://tech.olx.com/building-corpus-for-autosuggest-part-1-4f63512b1ea1), [AutoSuggest Retrieval & Ranking (Part 2)](https://tech.olx.com/autosuggest-retrieval-ranking-part-2-14a8f50fef34)
* [Autocomplete, Live Search Suggestions, and Autocorrection: Best Practice Design Patterns](https://hybrismart.com/2019/01/08/autocomplete-live-search-suggestions-autocorrection-best-practice-design-patterns/)
* [Mirror, Mirror, What Am I Typing Next? All About Search Suggestions](https://spinscale.de/posts/2023-01-18-mirror-mirror-what-am-i-typing-next.html)

## Graphs/Taxonomies/Knowledge Graph

* [Knowledge graphs applied in the retail industry](https://towardsdatascience.com/knowledge-graphs-applied-in-the-retail-industry-ecac4e7baf8)

Knowledge graphs are becoming increasingly popular in tech. We explore how they can be used in the retail industry to enrich data, widen search results and add value to a retail company.

* [Awesome Knowledge Graphs](https://github.com/frutik/awesome-knowledge-graphs)

### Integrating Search and Knowledge Graphs (by Enterprise Knowledge)

* [Part 1: Displaying Relationships](https://enterprise-knowledge.com/integrating-search-and-knowledge-graphs-series-part-1-displaying-relationships/)
* [Search query expansion with query embeddings](https://bytes.grubhub.com/search-query-embeddings-using-query2vec-f5931df27d79)

## Query expansion

- [Fundamentals of query rewriting (part 1): introduction to query expansion](https://opensourceconnections.com/blog/2021/10/19/fundamentals-of-query-rewriting-part-1-introduction-to-query-expansion/?utm_source=dlvr.it&utm_medium=linkedin)

## Query understanding

* Daniel Tunkelang [Query Understanding](https://queryunderstanding.com/introduction-c98740502103).
* [Query Understanding, Divided into Three Parts](https://medium.com/@dtunkelang/query-understanding-divided-into-three-parts-d9cbc81a5d09)
* [Search for Things not for Strings](https://blog.searchhub.io/humans-search-for-things-not-for-strings-2?cn-reloaded=1)
* Understanding the Search Query. [Part 1](https://towardsdatascience.com/understanding-the-search-query-part-i-632d1b323b50), [Part 2](https://medium.com/analytics-vidhya/understanding-the-search-query-part-ii-44d18892283f), [Part 3](https://medium.com/@sonusharma.mnnit/understanding-the-search-query-part-iii-a0c5637a639)
* [Food Discovery with Uber Eats: Building a Query Understanding Engine](https://eng.uber.com/uber-eats-query-understanding/)
* [AI for Query Understanding](https://www.linkedin.com/pulse/ai-query-understanding-daniel-tunkelang)

### Search Intent

* [Mapping Search Queries To Search Intents](https://medium.com/@dtunkelang/search-queries-and-search-intent-1dec79ad155f)
* [Search: Intent, Not Inventory](https://medium.com/@dtunkelang/search-intent-not-inventory-289386f28a21)

### Query segmentation

* Paper [Unsupervised Query Segmentation Using only Query Logs](https://www.microsoft.com/en-us/research/wp-content/uploads/2011/01/pp0295-mishra.pdf)
* Paper [Towards Semantic Query Segmentation](https://arxiv.org/pdf/1707.07835.pdf)

## Algorithms

### BERT

* [Understanding BERT and Search Relevance](https://opensourceconnections.com/blog/2019/11/05/understanding-bert-and-search-relevance/)
* [Google is improving web search with BERT – can we use it for enterprise search too?](https://www.linkedin.com/pulse/google-improving-web-search-bert-can-we-use-too-mickel-gr%C3%B6nroos/)

### ColBERT

* [Pretrained Transformer Language Models for Search - part 3](https://blog.vespa.ai/pretrained-transformer-language-models-for-search-part-3/#)

### Collocations, common phrases

* [Automatically detect common phrases – multi-word expressions / word n-grams – from a stream of sentences.](https://radimrehurek.com/gensim/models/phrases.html)
* [The Unreasonable Effectiveness of Collocations](https://opensourceconnections.com/blog/2019/05/16/unreasonable-effectiveness-of-collocations/)

### Other Algorithms

* [One hot encoding](https://medium.com/fintechexplained/nlp-text-data-to-numbers-d28d32294d2e)
* [Writing a full-text search engine using Bloom filters](https://www.stavros.io/posts/bloom-filter-search-engine/)

#### Hashing

* [Locality Sensitive Hashing](https://towardsdatascience.com/understanding-locality-sensitive-hashing-49f6d1f6134)
* [Locality Sensitive Hashing (LSH): The Practical and Illustrated Guide](https://www.pinecone.io/learn/locality-sensitive-hashing/)
* [Minhash](http://ekzhu.com/datasketch/minhash.html)

#### Sorting by average ratings

* [Better than Average: Sort by Best Rating](https://www.elastic.co/blog/better-than-average-sort-by-best-rating-with-elasticsearch)
* [How Not To Sort By Average Rating](https://www.evanmiller.org/how-not-to-sort-by-average-rating.html)
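
A common alternative to sorting by the raw average is to sort by the lower bound of the Wilson score confidence interval, so that items with few ratings are not ranked above well-established ones. The sketch below is a minimal illustration under stated assumptions: ratings are binary (positive or negative), and the function name `wilson_lower_bound` and the default `z=1.96` (roughly 95% confidence) are choices made for this example, not taken from any of the linked articles' code.

```python
import math


def wilson_lower_bound(positive: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a share of positive ratings.

    `positive` and `total` are counts of binary ratings; `z` is the normal
    quantile for the chosen confidence level (an assumption of this sketch).
    """
    if total == 0:
        return 0.0  # no ratings yet: rank at the bottom
    p = positive / total
    denominator = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z * z / (4 * total)) / total)
    return (centre - margin) / denominator


# Hypothetical items: (name, positive ratings, total ratings)
items = [("new item", 1, 1), ("popular item", 90, 100)]
ranked = sorted(items, key=lambda it: wilson_lower_bound(it[1], it[2]), reverse=True)
```

With these hypothetical counts, the item with 90 of 100 positive ratings sorts ahead of the item with a single positive rating, although the latter has the higher raw average.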

#### Keywords extraction

* [Keyword Extraction using RAKE](https://codelingo.wordpress.com/2017/05/26/keyword-extraction-using-rake/)
* [Yet Another Keyword Extractor (Yake)](https://github.com/LIAAD/yake)
* [Keyword Extraction with BERT](https://towardsdatascience.com/keyword-extraction-with-bert-724efca412e)

## Tracking, profiling, GDPR, Analysis

### Tools, platforms, helpers for search tracking

* [OpenSearch User Behavior Insights](https://github.com/opensearch-project/user-behavior-insights)
* [Site Search tracking with Google Analytics 4](https://opensourceconnections.com/blog/2023/04/06/site-search-tracking-with-google-analytics-4/)
* [Snowplow](https://snowplow.io/)
* [search-collector](https://github.com/searchhub/search-collector)
* [OpenTelemetry with search additions](https://gist.github.com/binarymax/16ef2ed12d0aa446a6240c5fbb95e2c3)

### Resources

* [Anonymisation: managing data protection risk (code of practice)](https://ico.org.uk/media/1061/anonymisation-code.pdf)
* [The Anonymisation Decision-Making Framework](https://ukanon.net/wp-content/uploads/2015/05/The-Anonymisation-Decision-making-Framework.pdf)
* [98 personal data points that facebook uses to target ads to you](https://www.washingtonpost.com/news/the-intersect/wp/2016/08/19/98-personal-data-points-that-facebook-uses-to-target-ads-to-you/)
* [Opportunity Analysis for Search](https://www.linkedin.com/pulse/opportunity-analysis-search-daniel-tunkelang/)
* [A Face Is Exposed for AOL Searcher No. 4417749](https://www.nytimes.com/2006/08/09/technology/09aol.html)
* [AOL search data leak](https://en.wikipedia.org/wiki/AOL_search_data_leak)
* [Personal data](https://en.wikipedia.org/wiki/Personal_data)

## Experiments

* [Common Pitfalls of Search Experimentation](https://www.searchhub.io/common-pitfalls-of-search-experimentation/)
* [Improving Search @scale with efficient query experimentation](https://youtu.be/5p9Ss2vn7t4?si=42TBiIpwO5IxO1SZ)

### A/B testing, MABs

* [A/B Testing for Search is Different](https://medium.com/@dtunkelang/a-b-testing-for-search-is-different-f6b0f6f4d0f5)
* [A/B Testing Search: thinking like a scientist](https://medium.com/@jamesrubinstein/a-b-testing-search-thinking-like-a-scientist-1cc34b88392e)

## Testing, metrics, KPIs

### Metrics

* [Discounted cumulative gain](https://en.wikipedia.org/wiki/Discounted_cumulative_gain)
* [Flavors of NDCG - normalized to what!?](https://softwaredoug.com/blog/2024/05/22/flavors-of-ndcg)
* [Mean reciprocal rank](https://en.wikipedia.org/wiki/Mean_reciprocal_rank)
* [P@k](https://en.wikipedia.org/wiki/Evaluation_measures_(information_retrieval)#Precision_at_K)
* [Demystifying nDCG and ERR](https://opensourceconnections.com/blog/2019/12/09/demystifying-ndcg-and-err/)
* [Choosing your search relevance evaluation metric](https://opensourceconnections.com/blog/2020/02/28/choosing-your-search-relevance-metric/)
* [How to Implement a Normalized Discounted Cumulative Gain (NDCG) Ranking Quality Scorer in Quepid](https://opensourceconnections.com/blog/2018/02/26/ndcg-scorer-in-quepid/)
* [Precision and recall](https://en.wikipedia.org/wiki/Precision_and_recall)
* [F1 score](https://en.wikipedia.org/wiki/F1_score)
* [Visualizing search metrics](https://nathanday.shinyapps.io/rank-algo-app/)
* [Compute Mean Reciprocal Rank (MRR) using Pandas](https://softwaredoug.com/blog/2021/04/21/compute-mrr-using-pandas.html)
* [Recommender Systems: Machine Learning Metrics and Business Metrics](https://neptune.ai/blog/recommender-systems-metrics)
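
For quick orientation, the metrics named in this list can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are made up for this example, the input is assumed to be a list of relevance grades in ranked order (0 means not relevant), DCG uses linear gain, and nDCG is normalized against the ideal ordering of the same grades, which is only one of several possible normalizations.

```python
import math


def precision_at_k(grades, k):
    """Share of the top-k results with a grade above 0."""
    return sum(1 for g in grades[:k] if g > 0) / k


def reciprocal_rank(grades):
    """1 / rank of the first relevant result, or 0.0 if there is none."""
    for rank, grade in enumerate(grades, start=1):
        if grade > 0:
            return 1.0 / rank
    return 0.0


def dcg_at_k(grades, k):
    """Discounted cumulative gain with linear gain and a log2 discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(grades[:k], start=1))


def ndcg_at_k(grades, k):
    """DCG divided by the DCG of the same grades sorted best-first."""
    ideal = dcg_at_k(sorted(grades, reverse=True), k)
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0


def mean_reciprocal_rank(queries):
    """Average reciprocal rank over a list of per-query grade lists."""
    return sum(reciprocal_rank(g) for g in queries) / len(queries)
```

Mean reciprocal rank is the only one of these that is defined over a set of queries; the others are computed per query and then usually averaged.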

### KPIs

* [5 Right Ways to Measure How Search Is Performing](https://opensourceconnections.com/blog/2020/05/11/5-right-ways-to-measure-search/)
* E-commerce Site-Search KPIs. [Part 1 – Customers](https://opensourceconnections.com/blog/2020/08/28/e-commerce-site-search-kpis/), [Part 2 – Products](https://opensourceconnections.com/blog/2020/09/10/e-commerce-site-search-kpis-part-2/), [Part 3 - Queries](https://opensourceconnections.com/blog/2020/09/24/e-commerce-site-search-kpis-part-3-queries/)
* [Learning from Friction to Improve the Search Experience](https://medium.com/@dtunkelang/learning-from-friction-to-improve-the-search-experience-8937c71ec97a)
* [Behind the Wizardry of a Seamless Search Experience](https://enterprise-knowledge.com/if-i-only-had-an-enterprise-search-brain-behind-the-wizardry-of-a-seamless-search-experience/)
* [Analyzing online search relevance metrics with the Elastic Stack](https://www.elastic.co/blog/analyzing-online-search-relevance-metrics-with-the-elastic-stack)
* [How to Gain Insight From Search Analytics](https://www.searchblox.com/how-to-gain-insight-from-search-analytics/)

### Evaluating Search (by Daniel Tunkelang)

* [Measure It](https://medium.com/@dtunkelang/evaluating-good-search-part-i-measure-it-5507b2dbf4f6)
* [Measuring Searcher Behavior](https://medium.com/@dtunkelang/evaluating-search-measuring-searcher-behavior-5f8347619eb0)
* [Using Human Judgement](https://medium.com/@dtunkelang/evaluating-search-using-human-judgement-fbb2eeba37d9)
* [When There’s No Conversion Rate](https://medium.com/@dtunkelang/when-theres-no-conversion-rate-67a372666fed)

### Measuring Search (by James Rubinstein)

* [Statistical and human-centered approaches to search engine improvement](https://medium.com/@jamesrubinstein/statistical-and-human-centered-approaches-to-search-engine-improvement-52af0e98f38f)
* [A Human Approach](https://medium.com/@jamesrubinstein/measuring-search-a-human-approach-acf54e2cf33d)
* [Setting up a relevance evaluation program](https://medium.com/@jamesrubinstein/setting-up-a-relevance-evaluation-program-c955d32fba0e)
* [Metrics Matter](https://medium.com/@jamesrubinstein/measuring-search-metrics-matter-de124c2f6f8c)
* [A/B Testing Search: thinking like a scientist](https://medium.com/@jamesrubinstein/a-b-testing-search-thinking-like-a-scientist-1cc34b88392e)
* [Query Triage: The Secret Weapon for Search Relevance](https://medium.com/@jamesrubinstein/query-triage-the-secret-weapon-for-search-relevance-1a02cdd297ed)
* [The Launch Review: bringing it all together…](https://medium.com/@jamesrubinstein/the-launch-review-bringing-it-all-together-2f7e4cfbf86e)

### Three Pillars of Search Relevancy (by Andreas Wagner)

* [Part 1: Findability](https://blog.searchhub.io/three-pillars-of-search-quality-in-ecommerce-part-1-findability)
* [Part 2: Search Quality For Discovery & Inspiration](https://blog.searchhub.io/three-pillars-of-search-quality-in-ecommerce-part-2-discovery-inspiration)

## Architecture

* [The Art Of Abstraction – Revisiting Webshop Architecture](https://blog.searchhub.io/the-art-of-abstraction-revisting-webshop-architecture)
* Canva - Search Pipeline
  * [Part One](https://canvatechblog.com/search-pipeline-part-i-faa6c543aef1): outline of the challenges faced
  * [Part Two](https://canvatechblog.com/search-pipeline-part-ii-3b43978607cd): new search architecture
* [Event-Driven Architecture for Efficient Search Indexing](https://sasarun.medium.com/event-driven-architecture-for-efficient-search-indexing-f7af27192e98)

## Education and networking

### Conferences

* [Activate](https://www.activate-conf.com/)
* [Berlin Buzzwords](https://berlinbuzzwords.de)
* [Haystack](https://haystackconf.com/)
* [Elastic{ON}](https://www.elastic.co/elasticon/)
* [MIX-CAMP E-COMMERCE SEARCH](http://www.mices.co)
* [SIGIR eCommerce](https://sigir-ecom.github.io/index.html)
  - [2019](https://sigir-ecom.github.io/ecom2019/index.html)
  - [2018](https://sigir-ecom.github.io/ecom2018/index.html)
  - [2017](http://sigir-ecom.weebly.com/)

### Trainings and courses

* [Machine Learning Powered Search. Doug Turnbull](https://www.getsphere.com/cohorts/machine-learning-powered-search) Next: Jan 24, 2023
* OpenSource Connections
  - [Elasticsearch "Think Like a Relevance Engineer"](https://opensourceconnections.com/training/elasticsearch-think-like-a-relevance-engineer-tlre/)
  - [Solr "Think Like a Relevance Engineer"](https://opensourceconnections.com/training/solr-think-like-a-relevance-engineer-tlre/)
  - [Beyond Search Relevance: Understanding and Measuring Search Result Quality](https://opensourceconnections.com/training/beyond-search-relevance-understanding-and-measuring-search-result-quality/)
  - [Hello LTR](https://opensourceconnections.com/training/hello-ltr-learning-to-rank/)
* [Sease's trainings](https://sease.io/training)
* [Search Fundamentals. Daniel Tunkelang, Grant Ingersoll](https://corise.com/course/search-fundamentals) Next: Feb 6, 2023
* [Search with Machine Learning. Daniel Tunkelang, Grant Ingersoll](https://corise.com/course/search-with-machine-learning) Next: Feb 27, 2023
* [Search for Product Managers. Daniel Tunkelang](https://corise.com/course/search-for-product-managers) Next: Apr 3, 2023
* [Sematext's Solr, Elasticsearch, and OpenSearch trainings](https://sematext.com/training/)

Fall 2023

* https://dtunkelang.medium.com/upcoming-search-classes-this-fall-58f877fe00ad

### Books

* [AI-powered search](https://www.manning.com/books/ai-powered-search)
* [Relevant Search](https://www.manning.com/books/relevant-search)
* [Deep Learning for search](https://www.manning.com/books/deep-learning-for-search)
* [Interactions with search systems](https://www.cambridge.org/core/books/interactions-with-search-systems/5B3CF5920355A8B09088F2C409FFABDC)
* [Embeddings in Natural Language Processing. Theory and Advances in Vector Representation of Meaning](http://josecamachocollados.com/book_embNLP_draft.pdf)
* [Search User Interfaces](http://www.searchuserinterfaces.com)
* [Search Patterns](https://searchpatterns.org/)
* [Search Analytics for Your Site: Conversations with Your Customers](https://www.amazon.com/Search-Analytics-Your-Site-Conversations/dp/1933820209)
* [Click Models for Web Search](https://www.amazon.com/Synthesis-Lectures-Information-Concepts-Retrieval/dp/1627056475/)
* [Optimization Algorithms](https://www.manning.com/books/optimization-algorithms)

### Blogs and Portals

* [Searchnews](http://searchnews.org/)


### Papers

* [List of papers](PAPERS.md)

## Management, Search Team

* [Search is a Team Sport](https://medium.com/search-in-21st-century/search-is-a-team-sport-400eecdfe736)
* [Thoughts about Managing Search Teams](https://medium.com/@dtunkelang/thoughts-about-managing-search-teams-f8d2f54fbed7)
* [On Search Leadership](https://dtunkelang.medium.com/on-search-leadership-815b36c15df1)
* [Building an Effective Search Team: the key to great search & relevancy](https://opensourceconnections.com/blog/2020/05/14/building-an-effective-search-team-the-key-to-great-search-relevancy/)
* [Query Triage: The Secret Weapon for Search Relevance](https://medium.com/@jamesrubinstein/query-triage-the-secret-weapon-for-search-relevance-1a02cdd297ed)
* [The Launch Review: bringing it all together](https://medium.com/@jamesrubinstein/the-launch-review-bringing-it-all-together-2f7e4cfbf86e)
* [The Role of Search Product Owners](https://enterprise-knowledge.com/the-role-of-search-product-owners/)
* [Search Product Management: The Most Misunderstood Role in Search?](https://jamesrubinstein.medium.com/search-product-management-the-most-misunderstood-role-in-search-2b7569058638)
* [Search relevance for understaffed teams](https://softwaredoug.com/blog/2023/05/29/guide-for-search-teams.html)

### Job Interviews

* [Interview Questions for Search Relevance Engineers, Data Scientists, and Product Managers](https://medium.com/@dtunkelang/interview-questions-for-search-relevance-engineers-and-product-managers-7a1b6b8cacea)
* [Data Science Interviews: Ranking and search](https://github.com/alexeygrigorev/data-science-interviews/blob/master/theory.md#ranking-andsearch)

### Engineering

* [Technical debt in search](https://twitter.com/gsingers/status/1655286486692970497?t=7HVu0Kc2vXT5NPHH_bB2uA&s=19)

## Blogposts series

### Search Optimization 101 (by Charlie Hull)

* [How do I know that my search is broken?](https://blog.supahands.com/2020/07/08/how-do-i-know-that-my-search-is-broken/)
* [What does it mean if my search is ‘broken’?](https://blog.supahands.com/2020/07/20/search-optimization-101-what-does-it-mean-if-my-search-is-broken/)
* [How do you fix a broken search?](https://blog.supahands.com/2020/08/04/search-optimization-101-how-do-you-fix-a-broken-search/)
* [Reducing business risk by optimizing search](https://blog.supahands.com/2020/09/02/reducing-business-risks-by-optimizing-search/)

### Query Understanding (by Daniel Tunkelang)
Better search through query understanding.

* [An Introduction](https://queryunderstanding.com/introduction-c98740502103)
* [Language Identification](https://queryunderstanding.com/language-identification-c1d2a072eda)
* [Character Filtering](https://queryunderstanding.com/character-filtering-76ede1cf1a97)
* [Tokenization](https://queryunderstanding.com/tokenization-c8cdd6aef7ff)
* [Spelling Correction](https://queryunderstanding.com/spelling-correction-471f71b19880)
* [Stemming and Lemmatization](https://queryunderstanding.com/stemming-and-lemmatization-6c086742fe45)
* [Query Rewriting: An Overview](https://queryunderstanding.com/query-rewriting-an-overview-d7916eb94b83)
* [Query Expansion](https://queryunderstanding.com/query-expansion-2d68d47cf9c8)
* [Query Relaxation](https://queryunderstanding.com/query-relaxation-342bc37ad425)
* [Query Segmentation](https://queryunderstanding.com/query-segmentation-2cf860ade503)
* [Query Scoping](https://queryunderstanding.com/query-scoping-ed61b5ec8753)
* [Entity Recognition](https://queryunderstanding.com/entity-recognition-763cae840a20)
* [Taxonomies and Ontologies](https://queryunderstanding.com/taxonomies-and-ontologies-8e4812a79cb2)
* [Autocomplete](https://queryunderstanding.com/autocomplete-69ed81bba245)
* [Autocomplete and User Experience](https://queryunderstanding.com/autocomplete-and-user-experience-421df6ab3000)
* [Contextual Query Understanding: An Overview](https://queryunderstanding.com/contextual-query-understanding-65c78d792dd8)
* [Session Context](https://queryunderstanding.com/session-context-4af0a355c94a)
* [Location as Context](https://queryunderstanding.com/geographical-context-77ce4c773dc7)
* [Seasonality](https://queryunderstanding.com/seasonality-5eef79d8bf1c)
* [Personalization](https://queryunderstanding.com/personalization-3ed715e05ef)
* [Search as a Conversation](https://queryunderstanding.com/search-as-a-conversation-bafa7cd0c9a5)
* [Clarification Dialogues](https://queryunderstanding.com/clarification-dialogues-69420432f451)
* [Relevance Feedback](https://queryunderstanding.com/relevance-feedback-c6999529b92c)
* [Faceted Search](https://queryunderstanding.com/faceted-search-7d053cc4fada)
* [Search Results Presentation](https://queryunderstanding.com/search-results-presentation-7d6c6c384ec1)
* [Search Result Snippets](https://queryunderstanding.com/search-result-snippets-e8c447950219)
* [Search Results Clustering](https://queryunderstanding.com/search-results-clustering-b2fa64c6c809)
* [Question Answering](https://queryunderstanding.com/question-answering-94984185c203)
* [Query Understanding and Voice Interfaces](https://queryunderstanding.com/query-understanding-and-voice-interfaces-6cd60d063fca)
* [Query Understanding and Chatbots](https://queryunderstanding.com/query-understanding-and-chatbots-5fa0c154f)

### Grid Dynamics

* [Not your father’s search engine: a brief history of retail search](https://blog.griddynamics.com/not-your-fathers-search-engine-a-brief-history-of-retail-search/)
* [Semantic vector search: the new frontier in product discovery](https://blog.griddynamics.com/semantic-vector-search-the-new-frontier-in-product-discovery/)
* [Boosting product discovery with semantic search](https://blog.griddynamics.com/boosting-product-discovery-with-semantic-search/)
* [Semantic query parsing blueprint](https://blog.griddynamics.com/semantic-query-parsing-blueprint/)

### Considering Search: Search Topics (by Derek Sisson)

* [Intro](https://www.philosophe.com/archived_content/search_topics/search_topics.html)
* [Assumptions About Search](https://www.philosophe.com/archived_content/search_topics/search_assumptions.html)
* [Assumptions About User Search Behavior](https://www.philosophe.com/archived_content/search_topics/user_behavior.html)
* [Types of Information Collections](https://www.philosophe.com/archived_content/search_topics/collections.html)
* [A Structural Look at Search](https://www.philosophe.com/archived_content/search_topics/structure.html)
* [Users and the Task of Information Retrieval](https://www.philosophe.com/archived_content/search_topics/search_tasks.html)
* [Testing Search](https://www.philosophe.com/archived_content/search_topics/search_tests.html)
* [Useful Search Links and References](https://www.philosophe.com/archived_content/search_topics/search_links.html)

## Industry players

### Personalities and influencers

* [Daniel Tunkelang (he is God of Search)](https://medium.com/@dtunkelang)
* [Max Irwin](https://twitter.com/binarymax)
* [Doug Turnbull](https://twitter.com/softwaredoug)
* [Baymard Institute](https://baymard.com/blog)

### Search Engines

* Google
* Bing
* Yandex
* Amazon
* eBay

### Products and services

* [Algolia](https://www.algolia.com/)
* Vespa
* Elastic
* Solr
* [Fess Enterprise Search Server](https://github.com/codelibs/fess)
* [Typesense](https://github.com/typesense/typesense) - an open source alternative to Algolia.
* [SearchHub.io](https://www.searchhub.io/)
* [Datafari](https://www.datafari.com/en/index.html) - an open source enterprise search solution.
* [Qdrant](https://qdrant.tech/) - an open source vector database.

### Consulting companies

* [OpenSource Connections](https://www.opensourceconnections.com)
* [Sease](https://sease.io/)
* [Sematext](https://sematext.com/)

## Case studies

* Airbnb - [Machine Learning-Powered Search Ranking of Airbnb Experiences](https://medium.com/airbnb-engineering/machine-learning-powered-search-ranking-of-airbnb-experiences-110b4b1a0789)
* Airbnb - [Listing Embeddings in Search Ranking](https://medium.com/airbnb-engineering/listing-embeddings-for-similar-listing-recommendations-and-real-time-personalization-in-search-601172f7603e)
* Algolia - [The Architecture Of Algolia’s Distributed Search Network](http://highscalability.com/blog/2015/3/9/the-architecture-of-algolias-distributed-search-network.html)
* Meituan - Exploration and practice of BERT in the core ranking of Meituan search (🇨🇳 [BERT在美团搜索核心排序的探索和实践](https://tech.meituan.com/2020/07/09/bert-in-meituan-search.html))
* Netflix - How Netflix Content Engineering makes a federated graph searchable ([Part 1](https://netflixtechblog.com/how-netflix-content-engineering-makes-a-federated-graph-searchable-5c0c1c7d7eaf), [Part 2](https://netflixtechblog.com/how-netflix-content-engineering-makes-a-federated-graph-searchable-part-2-49348511c06c))
* Netflix - [Elasticsearch Indexing Strategy in Asset Management Platform (AMP)](https://netflixtechblog.medium.com/elasticsearch-indexing-strategy-in-asset-management-platform-amp-99332231e541)
* Skyscanner - [Learning to Rank for Flight Itinerary Search](https://hackernoon.com/learning-to-rank-for-flight-itinerary-search-8594761eb867)
* Slack - [Search at Slack](https://slack.engineering/search-at-slack-431f8c80619e)
* Twitter - [Stability and scalability for search](https://blog.twitter.com/engineering/en_us/topics/infrastructure/2022/stability-and-scalability-for-search)
* [Amazon SEO Explained: How to Rank Your Products #1 in Amazon Search Results in 2020](https://crazylister.com/blog/amazon-seo-ultimate-guide/)
* [Building a Better Search Engine for Semantic Scholar](https://medium.com/ai2-blog/building-a-better-search-engine-for-semantic-scholar-ea23a0b661e7)

### General search

* [How Bing Ranks Search Results: Core Algorithm & Blue Links](https://www.searchenginejournal.com/how-bing-ranks-search-results/357804/)
* [How Google Search Ranking Works – Darwinism in Search](https://www.searchenginejournal.com/how-google-search-ranking-works/307591/)

### E-commerce

* [Searchandising](https://searchanise.io/blog/searchandising/)

### Multisided markets

* [Discover How Cassini (The eBay Search Engine) Works and Rank](https://crazylister.com/blog/ebay-search-engine-cassini/)

## Videos

* [Apache Solr Short Tips](https://www.youtube.com/watch?v=YFoPWgja89o&list=PLT_fd32OFYpe7xXxUYtV8upGcZtgPY3cU)

### Channels

* [Lucid Thoughts](https://www.youtube.com/c/LucidThoughts)
* [Lucidworks](https://www.youtube.com/user/LucidWorksSearch)
* [MIX-Camp E-commerce Search](https://www.youtube.com/channel/UCCxvMykUdtFFc1O_tIr9oxA)
* [OpenSource Connections](https://www.youtube.com/channel/UCiuXt-f2Faan4Es37nADUdQ)
* [SIGIR eCom](https://www.youtube.com/channel/UCd6PyC_9zrxgA7vmT05Mx4Q)

### Featured

* [Relevant Facets](https://www.youtube.com/watch?v=W8DJYfAKKLA)

## Datasets

* [Shopping Queries Dataset: A Large-Scale ESCI Benchmark for Improving Product Search](https://github.com/amazon-science/esci-data/tree/main)
* [ESCI-S: extended metadata for Amazon ESCI dataset](https://github.com/shuttie/esci-s)
* [Home Depot Product Search Relevance](https://www.kaggle.com/competitions/home-depot-product-search-relevance/data)
* [WANDS - Wayfair ANnotation Dataset](https://github.com/wayfair/WANDS)

## Tools

### Spacy

[Awesome Spacy](https://github.com/frutik/awesome-spacy) - Natural language understanding, content enrichment, etc.

### Word2Vec

* [Word2Vec For Phrases — Learning Embeddings For More Than One Word](https://towardsdatascience.com/word2vec-for-phrases-learning-embeddings-for-more-than-one-word-727b6cf723cf)
* [Gensim Word2Vec Tutorial](http://kavita-ganesan.com/gensim-word2vec-tutorial-starter-code/#.XV-wnJMzbUL)
* [How to incorporate phrases into Word2Vec – a text mining approach](http://kavita-ganesan.com/how-to-incorporate-phrases-into-word2vec-a-text-mining-approach/#.XV-wnJMzbUL)
* [Word2Vec — a baby step in Deep Learning but a giant leap towards Natural Language Processing](https://medium.com/explore-artificial-intelligence/word2vec-a-baby-step-in-deep-learning-but-a-giant-leap-towards-natural-language-processing-40fe4e8602ba)
* [How to Develop Word Embeddings in Python with Gensim](https://machinelearningmastery.com/develop-word-embeddings-python-gensim/)

### Libs

* [Query Segmenter](https://github.com/soumyaxyz/query-segmenter)
* [zentity](https://github.com/zentity-io/zentity)
* [symspellpy](https://github.com/mammothb/symspellpy)
* [search-collector](https://github.com/searchhub/search-collector)
* [Kiri](https://github.com/kiri-ai/kiri) - State-of-the-art semantic search made easy.
* [Haystack](https://github.com/deepset-ai/haystack) - End-to-end Python framework for building natural language search interfaces to data.
* [docTTTTTquery](https://github.com/castorini/docTTTTTquery)

### Other

* [Chorus](https://github.com/querqy/chorus), [Smui](https://github.com/querqy/smui), [Querqy](https://github.com/querqy/querqy)
* [Quepid](https://github.com/o19s/quepid)
* [Rated Ranking Evaluator](https://github.com/SeaseLtd/rated-ranking-evaluator)
* [Jina AI](https://github.com/jina-ai/jina) - A neural search framework

## Other awesome stuff

* [Awesome Knowledge Graphs](https://github.com/frutik/awesome-knowledge-graphs)
* [Awesome time series](https://github.com/frutik/awesome-timeseries)
* [Awesome Spacy](https://github.com/frutik/awesome-spacy)
* [Query-Understanding](https://github.com/sanazb/Query-Understanding)
* [Click models](https://github.com/filipecasal/knowledge-repo/blob/master/click_models.md)