{"id":21496607,"url":"https://github.com/mlibrary/relevant_search_ruby_solr","last_synced_at":"2025-03-17T12:16:16.054Z","repository":{"id":61988958,"uuid":"555492869","full_name":"mlibrary/relevant_search_ruby_solr","owner":"mlibrary","description":"Examples with Ruby and Solr for the book Relevant Search (Turnbull \u0026 Berryman, Manning, 2016)","archived":false,"fork":false,"pushed_at":"2023-01-17T18:54:32.000Z","size":34,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-23T21:53:31.951Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mlibrary.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-10-21T17:34:31.000Z","updated_at":"2022-10-21T17:34:45.000Z","dependencies_parsed_at":"2023-01-23T03:10:13.722Z","dependency_job_id":null,"html_url":"https://github.com/mlibrary/relevant_search_ruby_solr","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlibrary%2Frelevant_search_ruby_solr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlibrary%2Frelevant_search_ruby_solr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlibrary%2Frelevant_search_ruby_solr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlibrary%2Frelevant_search_ruby_solr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mlibrary","download_url":"https://codelo
ad.github.com/mlibrary/relevant_search_ruby_solr/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244031154,"owners_count":20386534,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-23T16:17:44.140Z","updated_at":"2025-03-17T12:16:16.028Z","avatar_url":"https://github.com/mlibrary.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"This repository allows the reader to use Ruby and Solr to follow along with the\nexamples in chapters 3 and 4 of [Relevant\nSearch](https://learning.oreilly.com/library/view/relevant-search-with/9781617292774/)\n\nIt contains:\n\n* a `docker-compose` setup to run Solr locally, and enough to get a core set up\n  with the default \"schemaless\" configuration\n\n* a Ruby script for configuring Solr field definitions (if needed) and loading documents (`reindex.rb`)\n\n* an example of using RSolr to query the core (`query.rb`)\n\n## Getting Started\n\n* Clone, including the submodule with the example data\n\n```bash\ngit clone --recurse-submodules https://github.com/mlibrary/relevant_search_ruby_solr/\n```\n\n* Run the setup script to set up Solr such that we can create a core and index documents\n\n```bash\n./setup.sh\n```\n\n## Indexing\n\nIndex documents with `reindex.rb`:\n\n```\ndocker-compose run --rm index\n```\n\n### Field Types and Field Definitions\n\nTo adjust field and field type definitions -- see the methods\n`configure_fields` and `configure_field_types` and add the field type\ndefinition there.\n\n`reindex.rb` 
already includes Solr versions of some of the example analysis chains from Chapter 4; try\nrunning `docker-compose run --rm index` to set up the field types, then [try\nanalyzing text using the phonetic\nanalysis](http://localhost:8983/solr/#/tmdb/analysis?analysis.fieldtype=text_dbl_metaphone).\n\nTry setting up your own for the other examples using the information on filters\nfrom [the Solr documentation on\nfilters](https://solr.apache.org/guide/solr/latest/indexing-guide/filters.html);\nin particular, experiment with:\n\n* [Word Delimiter Graph\nFilter](https://solr.apache.org/guide/solr/latest/indexing-guide/filters.html#word-delimiter-graph-filter)\n* [Pattern Replace Filter](https://solr.apache.org/guide/solr/latest/indexing-guide/filters.html#pattern-replace-filter)\n* [Synonym Graph Filter](https://solr.apache.org/guide/solr/latest/indexing-guide/filters.html#synonym-graph-filter)\n* [Path Hierarchy Tokenizer](https://solr.apache.org/guide/solr/latest/indexing-guide/tokenizers.html#path-hierarchy-tokenizer)\n\nand compare them to the ElasticSearch examples in the book.\n\nFor the most part, the XML configuration maps fairly cleanly to the schema API\nhere.\n\n### A Note on Nested Documents\n\nAs described in Chapter 5, the TMDB data set includes nested documents (e.g.\ncast, location).\n\nElasticSearch has automatic handling for nested documents, but Solr does not.\nThus, `reindex.rb` includes logic to extract the `name` and `characters` fields\nfrom the nested documents and then throw away the remaining fields.\n\nSolr [can index nested\ndocuments](https://solr.apache.org/guide/solr/latest/indexing-guide/indexing-nested-documents.html)\nin a way that preserves relationships between fields in nested documents, but\nit requires some more work. Nothing in Chapters 3-5 requires the data from those\nnested documents. 
See `reindex_nested_docs.rb` for an example that sets up the field\ndefinitions for nested fields, adds document IDs to index the nested documents,\nand adds a field to indicate whether a document is a parent or child document.\nTo retrieve nested documents, append `[child]` to the field list (`fl`)\nparameter when querying. There is also a way to query child documents alongside\nthe parent documents -- see the [Block Join Children Query\nParser](https://solr.apache.org/guide/solr/latest/query-guide/block-join-query-parser.html)\nand [Searching Nested\nDocuments](https://solr.apache.org/guide/solr/latest/query-guide/searching-nested-documents.html).\n\n## Querying\n\nTo start a pry session where you can query Solr:\n\n```\ndocker-compose run --rm query\n```\n\nThis will load `query.rb` and start pry.\n\nSome things you might try doing:\n\n```ruby\nparams = {q: \"basketball with cartoon aliens\", defType: \"edismax\", qf: \"title^10 overview\"}\nputs summary(search(params))\n```\n\nGet query debugging information\n\n```ruby\ndebug = search(params.merge({debugQuery: true}))[\"debug\"][\"parsedquery_toString\"]\n```\n\nExplaining results\n\n```ruby\nputs explain(search(params))\n```\n\n## Debugging Analysis\n\n```\nhttp://localhost:8983/solr/#/tmdb/analysis?analysis.fieldvalue=Fire%20with%20Fire\u0026analysis.fieldname=title\u0026verbose_output=1\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlibrary%2Frelevant_search_ruby_solr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmlibrary%2Frelevant_search_ruby_solr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlibrary%2Frelevant_search_ruby_solr/lists"}