{"id":13438163,"url":"https://github.com/sfikas/rusteval","last_synced_at":"2025-07-07T00:08:51.523Z","repository":{"id":114793620,"uuid":"67040939","full_name":"sfikas/rusteval","owner":"sfikas","description":"A tool used to evaluate the output of retrieval algorithms. Written in Rust. ","archived":false,"fork":false,"pushed_at":"2020-04-13T13:24:55.000Z","size":18735,"stargazers_count":17,"open_issues_count":6,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-11T23:05:15.025Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sfikas.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-08-31T13:41:40.000Z","updated_at":"2022-03-22T17:41:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"680bd471-1b33-46c9-a3ca-89d1e13aa1da","html_url":"https://github.com/sfikas/rusteval","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/sfikas/rusteval","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sfikas%2Frusteval","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sfikas%2Frusteval/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sfikas%2Frusteval/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sfikas%2Frusteval/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sfikas","download_url":"https://codeload.gith
ub.com/sfikas/rusteval/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sfikas%2Frusteval/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263991504,"owners_count":23540667,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T03:01:03.357Z","updated_at":"2025-07-07T00:08:51.486Z","avatar_url":"https://github.com/sfikas.png","language":"Rust","funding_links":[],"categories":["Development tools","开发工具 Development tools","开发工具"],"sub_categories":["Pattern recognition","模式识别 Pattern recognition","模式识别"],"readme":"# Rusteval\n\n[![Build Status](https://travis-ci.org/sfikas/rusteval.svg?branch=master)](https://travis-ci.org/sfikas/rusteval)\n\nA tool used to evaluate the output of retrieval algorithms. 
Written in [Rust].\n\n## Building\n\nInstall [Rust] with\n```\ncurl -sSf https://static.rust-lang.org/rustup.sh | sh\n```\nand build the release version with\n\n```\ngit clone \u003crusteval repo name\u003e\ncd rusteval\ncargo build --release\n```\n\n## Testing\n\nBefore testing, unzip the test result file found in the ```fixtures/``` folder with (unzipped this is \u003e100Mb):\n```\ngunzip fixtures/G1_TRACK_I_Bentham.xml.gz\n```\n\nRun the test suite with\n\n```\ncargo test --release\n```\n\nor, for a more verbose output\n\n```\ncargo test --release -- --nocapture\n```\n\n## Running\n\nAfter building and testing, run rusteval with\n```\ntarget/release/rusteval \u003crelevance file\u003e \u003cresult file\u003e\n```\n\nThe ```fixtures/``` folder contains some examples of relevance and result files (see below for an explanation of what these files are).\nFor example, in order to reproduce some of the results of the ICFHR'14 [keyword spotting competition], you can run\n```\ntarget/release/rusteval fixtures/TRACK_I_Bentham_ICFHR2014.RelevanceJudgements.xml fixtures/G1_TRACK_I_Bentham.xml\n```\nThis should produce the results of evaluation of method 'G1' for the 'Bentham' track of the competition. 
Results are reported for each of the selected queries, as well as averaged over all queries.\nThe last lines of the output should read something like\n```\nMEAN:  precAt5    precAt10   ap\n=======================================================================\n       0.73813    0.60268    0.52402\n```\nThis output means that mean precision at 5 is 73.8%, mean precision at 10 is 60.3%, and mean average precision (MAP) is 52.4% for the submitted method.\n\n## The retrieval paradigm, relevance and result files\n\nThe retrieval paradigm typically presupposes a finite set of queries, each associated with a finite set of matching tokens.\n\nA retrieval algorithm returns an ordered list for each query, representing all tokens from best to worst match.\n\nThis information is necessary for evaluation.\nInput to the tool is read from two distinct text files, the *relevance file* and the *result file*.\n\nThe *relevance file* tells us:\n* Which queries exist, and how many there are\n* Which tokens each query *actually* matches\n\nThe *result file* tells us:\n* The ordered list of matching tokens, from best to worst match, for each query\n\n## Supported input file formats\n\n### trec_eval format\n\nThis format was originally introduced for use with the [trec_eval] evaluation software.\n\n#### Relevance file\n\nRelevance files follow the format\n```\nqid  0  docno  rel\n```\nfor each text line.\n\nThe line above tells us that the query with id ```qid``` matches token ```docno```.\nThe degree to which the query and the token match is encoded as the floating-point value ```rel```, taking\nvalues in ```[0, 1]```. 
A perfect match has ```rel = 1```.\n\nSample relevance file:\n```\ncv1 0 tok1 1\ncv1 0 tok2 1\ncv1 0 tok3 0\ncv2 0 tok1 0\ncv2 0 tok2 0\ncv2 0 tok3 1\n```\n\nThis tells us that query ```cv1``` matches tokens ```tok1``` and ```tok2``` but not ```tok3```;\nquery ```cv2``` matches token ```tok3``` only.\n\n#### Results file\n\nResult files follow the format\n```\nqid 0 docno rank sim run_id\n```\nfor each text line.\n\n```rank``` is an integer that is ignored but required by the format, and has to be in the range ```[0, 1000]``` according to the documentation.\n```sim``` is a floating-point value. Higher ```sim``` corresponds to a better match.\n```run_id``` is also required but ignored.\n\nAccording to the docs, the file has to be sorted according to ```qid```.\n\nSample result file:\n```\ncv1 0 April_d06-086-09 0 -0.960748 hws\ncv1 0 April_d05-008-03 1 -1.307986 hws\ncv1 0 April_p03-181-00 2 -1.372011 hws\ncv1 0 April_d05-021-05 3 -1.394318 hws\ncv1 0 April_e06-053-07 4 -1.404273 hws\ncv1 0 April_g01-025-09 5 -1.447217 hws\ncv1 0 April_g01-027-03 6 -1.453828 hws\ncv1 0 April_p03-072-03 7 -1.556320 hws\ncv1 0 April_g01-008-03 8 -1.584332 hws\ncv1 0 April_n01-045-05 9 -1.682590 hws\n```\n\nThis shows results for matches with query ```cv1```. The best match is ```April_d06-086-09```,\nthe worst match is ```April_n01-045-05```.\nNote again that it is the ```sim``` value that encodes the order of the matches, i.e. 
the penultimate field in each line.\n\n### icfhr'14 keyword spotting format\n\nThis format is tailored to [keyword spotting], a form of image retrieval where the retrieved elements are word images, typically cropped from a containing document image.\nIt was used in the ICFHR'14 [keyword spotting competition].\n\nTokens are defined with an XML ```word``` tag that must contain the following fields *in this particular order*:\n* document\n* x\n* y\n* width\n* height\n* Text (optional)\n* Relevance (optional; default value = 1)\n\nNote also that rusteval requires that each line contain at most one XML tag.\n\n#### Relevance file\n\nSample relevance file:\n```xml\n\u003c?xml version=\"1.0\" encoding=\"utf-8\"?\u003e\n\u003cGroundTruthRelevanceJudgements xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\u003e\n  \u003cGTRel queryid=\"query1\"\u003e\n    \u003cword document=\"027_029_001\" x=\"159\" y=\"1775\" width=\"184\" height=\"89\" Text=\"possess\" Relevance=\"1\" /\u003e\n    \u003cword document=\"027_029_001\" x=\"860\" y=\"1774\" width=\"180\" height=\"89\" Relevance=\"1\" /\u003e\n  \u003c/GTRel\u003e\n  \u003cGTRel queryid=\"query2\"\u003e\n    \u003cword document=\"027_029_001\" x=\"1490\" y=\"1769\" width=\"176\" height=\"86\" Relevance=\"1\" /\u003e\n    \u003cword document=\"071_053_004\" x=\"354\" y=\"790\" width=\"319\" height=\"108\" Text=\"possesst\" Relevance=\"0.7\" /\u003e\n    \u003cword document=\"027_029_001\" x=\"1460\" y=\"178\" width=\"298\" height=\"98\" Relevance=\"0.6\" /\u003e\n  \u003c/GTRel\u003e\n\u003c/GroundTruthRelevanceJudgements\u003e\n```\n\n#### Result file\n\nThe quality of the match is encoded by the order in which the token appears in the file.\n\nSample result file:\n```xml\n\u003c?xml version=\"1.0\" encoding=\"utf-8\"?\u003e\u003cRelevanceListings xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\" 
xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"\u003e\n  \u003cRel queryid=\"query1\"\u003e\n    \u003cword document=\"027_029_001\" x=\"159\" y=\"1775\" width=\"184\" height=\"89\" /\u003e\n    \u003cword document=\"027_029_001\" x=\"860\" y=\"1774\" width=\"180\" height=\"89\" /\u003e\n    \u003cword document=\"027_029_001\" x=\"1490\" y=\"1769\" width=\"176\" height=\"86\" /\u003e\n    \u003cword document=\"027_029_001\" x=\"1015\" y=\"2182\" width=\"189\" height=\"87\" /\u003e\n    \u003cword document=\"071_053_004\" x=\"92\" y=\"607\" width=\"220\" height=\"138\" /\u003e\n  \u003c/Rel\u003e\n  \u003cRel queryid=\"query2\"\u003e\n    \u003cword document=\"027_029_001\" x=\"1015\" y=\"2182\" width=\"189\" height=\"87\" /\u003e\n    \u003cword document=\"071_053_004\" x=\"92\" y=\"607\" width=\"220\" height=\"138\" /\u003e\n    \u003cword document=\"027_029_001\" x=\"159\" y=\"1775\" width=\"184\" height=\"89\" /\u003e\n    \u003cword document=\"027_029_001\" x=\"860\" y=\"1774\" width=\"180\" height=\"89\" /\u003e\n    \u003cword document=\"027_029_001\" x=\"1490\" y=\"1769\" width=\"176\" height=\"86\" /\u003e\n  \u003c/Rel\u003e\n\u003c/RelevanceListings\u003e\n```\n\nIn this example, for ```query2``` the best match is ```document=\"027_029_001\" x=\"1015\" y=\"2182\" width=\"189\" height=\"87\"```,\nand the worst match is ```document=\"027_029_001\" x=\"1490\" y=\"1769\" width=\"176\" height=\"86\"```.\n\n## Metrics\n\n### Precision at 5\n\nPrecision at 5 is defined as the number of correctly retrieved instances among the k closest matches,\ndivided by k.\nFor Precision at 5, k equals 5, *or the total number of possible matches if this number is less than 5* (the software provided with the [keyword spotting competition] of ICFHR 2014 also uses this convention).\nPrecision at 10 is defined in an analogous manner.\n\n### Average Precision\n\nAverage precision is defined as the weighted average of 'Precisions at k' for all 
possible values of k.\nThe weight depends on k: it equals one if the k-th retrieved instance is a match, and zero otherwise.\n\nFor more details, see\n```\n@ARTICLE{Giotis17,\n    title = \"A survey of document image word spotting techniques\",\n    author = \"A. P. Giotis and G. Sfikas and B. Gatos and C. Nikou\",\n    journal = \"Pattern Recognition\",\n    volume = \"68\",\n    number = \"\",\n    pages = \"310 - 332\",\n    year = \"2017\",\n    publisher = \"Elsevier\"\n}\n```\n\n\n[trec_eval]: \u003chttp://faculty.washington.edu/levow/courses/ling573_SPR2011/hw/trec_eval_desc.htm\u003e\n[keyword spotting]: \u003chttp://www.cs.uoi.gr/~sfikas/16SfikasRetsinasGatos_ZAH.pdf\u003e\n[keyword spotting competition]: \u003chttp://vc.ee.duth.gr/H-KWS2014/\u003e\n[Rust]: \u003chttps://www.rust-lang.org/\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsfikas%2Frusteval","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsfikas%2Frusteval","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsfikas%2Frusteval/lists"}