{"id":21202114,"url":"https://github.com/tonellotto/terrier-micro","last_synced_at":"2026-07-02T22:34:10.243Z","repository":{"id":83197171,"uuid":"180596530","full_name":"tonellotto/terrier-micro","owner":"tonellotto","description":"An efficient layer to perform query processing on top of Terrier","archived":false,"fork":false,"pushed_at":"2023-06-14T22:29:32.000Z","size":2578,"stargazers_count":2,"open_issues_count":1,"forks_count":1,"subscribers_count":3,"default_branch":"1.5.3","last_synced_at":"2025-10-29T11:43:13.449Z","etag":null,"topics":["efficiency","information-retrieval","query-processing","terrier"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"lgpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tonellotto.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-10T14:13:13.000Z","updated_at":"2023-05-18T09:46:19.000Z","dependencies_parsed_at":null,"dependency_job_id":"400c47cd-ddeb-4c90-9c7b-5c8fae04ae35","html_url":"https://github.com/tonellotto/terrier-micro","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/tonellotto/terrier-micro","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tonellotto%2Fterrier-micro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tonellotto%2Fterrier-micro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tonellotto%2Fterrier-micro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tonellotto%2Fte
rrier-micro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tonellotto","download_url":"https://codeload.github.com/tonellotto/terrier-micro/tar.gz/refs/heads/1.5.3","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tonellotto%2Fterrier-micro/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":35065702,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-07-02T02:00:06.368Z","response_time":173,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["efficiency","information-retrieval","query-processing","terrier"],"created_at":"2024-11-20T20:13:12.685Z","updated_at":"2026-07-02T22:34:10.226Z","avatar_url":"https://github.com/tonellotto.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Terrier Micro\n\n[![Build Status](https://travis-ci.org/tonellotto/terrier-micro.svg?branch=1.5.1)](https://travis-ci.org/tonellotto/terrier-micro)\n[![Codacy Badge](https://api.codacy.com/project/badge/Grade/be331c1b98ca42b588db6115c548df07)](https://www.codacy.com?utm_source=github.com\u0026amp;utm_medium=referral\u0026amp;utm_content=tonellotto/terrier-micro\u0026amp;utm_campaign=Badge_Grade)\n[![License: LGPL v3](https://img.shields.io/badge/License-LGPL%20v3-blue.svg)](https://www.gnu.org/licenses/lgpl-3.0)\n\nThis project provides a lightweight 
implementation of some query processing strategies built on top of Terrier 5. It re-implements the query processing pipeline of the Terrier search engine, removing all unnecessary features such as document score modifiers, multiple weighting models, etc.\n\nIf you use this package to conduct search or experimentation, whether it be for a research paper, dissertation, article, poster, presentation, or documentation, please cite the following paper:\n\n    @article{fnt,\n        author = {Tonellotto, Nicola and Macdonald, Craig and Ounis, Iadh},\n        issn = {1554-0669},\n        journal = {Foundations and Trends in Information Retrieval},\n        number = {4--5},\n        pages = {319--492},\n        title = {Efficient Query Processing for Scalable Web Search},\n        volume = {12},\n        year = {2018}\n    }\n\nThis package is [free software](http://www.gnu.org/philosophy/free-sw.html) distributed under the [GNU Lesser General Public License](http://www.gnu.org/copyleft/lesser.html).\n\n## Pre-requisites\n\n[Elias-Fano compression for Terrier](https://github.com/tonellotto/terrier-ef) is needed only for testing purposes or if you plan to use it in your experiments; it is not required to use the Terrier Micro package itself.\n\nTo install the Elias-Fano compression for Terrier package (version 1.5.1) on your local machine, run the following commands:\n\n```bash\ngit clone https://github.com/tonellotto/terrier-ef\ncd terrier-ef\ngit checkout 1.5.1\nmvn install appassembler:assemble\n```\n\n## Usage\n\nIf not already available, e.g. 
from Maven Central, you should clone and install Terrier Micro (version 1.5.1):\n\n```bash\ngit clone https://github.com/tonellotto/terrier-micro\ncd terrier-micro\ngit checkout 1.5.1\nmvn install appassembler:assemble\n```\n\nThe main script to perform batch query processing is the [retrieve](./docs/retrieve.md) tool.\n\nIf you want to use all available processors on your machine to perform batch query processing, use the [parallel retrieve](./docs/parallel_retrieve.md) tool.\n\nTwo other scripts are provided to support advanced query processing strategies: the [ms-generate](./docs/ms-gen.md) and [bmw-generate](./docs/bmw-gen.md) tools.\n\n## Python\n\nThe [python](./python) folder holds static copies of notebooks for learning to use Terrier Micro (Java). The notebooks in this repo are synced by hand with the notebooks in Colab. For convenience, there is a small pre-built index available to download [here](https://drive.google.com/open?id=1si4B1McN4u7a_kF8fTexsnStc7qWrlBb).\n\n* _Terrier Micro demo on Robust 2004_: [local](./python/terrier_robust04_demo.ipynb) and [colab](https://colab.research.google.com/drive/1M2UkPA2dWrpFx4zZO7beNl5bG6BA9-mR) notebooks.\n\n## Credits\n\nDeveloped by Nicola Tonellotto, ISTI-CNR. Contributions by Craig Macdonald, University of Glasgow, and Matteo Catena, ISTI-CNR.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftonellotto%2Fterrier-micro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftonellotto%2Fterrier-micro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftonellotto%2Fterrier-micro/lists"}