{"id":18555404,"url":"https://github.com/bearloga/interleaved-python","last_synced_at":"2026-02-23T09:39:29.700Z","repository":{"id":146996631,"uuid":"377235146","full_name":"bearloga/interleaved-python","owner":"bearloga","description":"Library for analyzing interleaved search A/B tests to determine preference between competing ranking functions","archived":false,"fork":false,"pushed_at":"2021-06-21T14:37:07.000Z","size":141,"stargazers_count":4,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-24T06:33:51.649Z","etag":null,"topics":["ab-testing","information-retrieval","interleaved","search-ranking"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bearloga.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-06-15T16:58:19.000Z","updated_at":"2025-01-22T23:15:10.000Z","dependencies_parsed_at":null,"dependency_job_id":"257b74af-3cf2-457d-8ac4-4f893db0f79a","html_url":"https://github.com/bearloga/interleaved-python","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bearloga/interleaved-python","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bearloga%2Finterleaved-python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bearloga%2Finterleaved-python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bearloga%2Finterleaved-python/releases","manifests_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bearloga%2Finterleaved-python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bearloga","download_url":"https://codeload.github.com/bearloga/interleaved-python/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bearloga%2Finterleaved-python/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29741140,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-23T07:44:07.782Z","status":"ssl_error","status_checked_at":"2026-02-23T07:44:07.432Z","response_time":90,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ab-testing","information-retrieval","interleaved","search-ranking"],"created_at":"2024-11-06T21:26:32.302Z","updated_at":"2026-02-23T09:39:29.654Z","avatar_url":"https://github.com/bearloga.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# interleaved\n\nLibrary for analyzing interleaved search A/B tests to determine preference between competing [ranking functions](https://en.wikipedia.org/wiki/Ranking_(information_retrieval))\n\n## Installing\n\n```\npip install --upgrade git+https://github.com/bearloga/interleaved-python.git@main\n```\n\n## Usage\n\n```python\nfrom interleaved import load_example_data\n\ndata = load_example_data(preference='a') # alternatively: 'none' or 
'b'\n\ndata.head()\n```\n\n```\n                  timestamp   search_id  event  position ranking_function\n0 2018-08-01 00:01:31+00:00  p2tvgm3clu   serp       NaN              NaN\n1 2018-08-01 00:04:09+00:00  p2tvgm3clu  click      14.0                A\n2 2018-08-01 00:04:29+00:00  p2tvgm3clu  click       4.0                A\n3 2018-08-01 00:06:10+00:00  p2tvgm3clu  click       1.0                A\n4 2018-08-01 00:06:42+00:00  p2tvgm3clu  click       7.0                B\n```\n\n```python\nfrom interleaved import Experiment\n\nex = Experiment(\n    queries = data[data['event'] == 'click']['search_id'].to_numpy(),\n    clicks = data[data['event'] == 'click']['ranking_function'].to_numpy()\n)\nex.bootstrap(seed=42)\n\nprint(ex.summary(ranker_labels=['New Algorithm', 'Old Algorithm'], rescale=True))\n```\n\n```\n In this interleaved search experiment, 906 searches were used to determine whether the\nresults from ranker 'New Algorithm' or 'Old Algorithm' were preferred by users (based on\ntheir clicks to the results from those rankers interleaved into a single search result\nset).\n\n The preference statistic, as defined by Chapelle et al. 
(2012), was estimated to be 74.3%\nwith a 95% (bootstrapped) confidence interval of (70.0%, 77.9%) on [-100%, 100%] scale\nwith -100% indicating total preference for 'Old Algorithm', 100% indicating total\npreference for 'New Algorithm', and 0% indicating complete lack of preference between the\ntwo -- indicating that the users had preference for ranker 'New Algorithm'.\n```\n\nQuite a strong preference for that new algorithm!\n\n**Additional methods:**\n- `.distribution(rescale=False)` returns the bootstrapped distribution of the preference statistic (useful for visualization)\n- `.preference_statistic(rescale=False)` returns the estimated preference statistic\n- `.conf_int(conf_level=0.95, rescale=False)` returns the confidence interval based on the bootstrapped distribution\n\n**Note**: `rescale=True` rescales the preference statistic from a [-0.5, 0.5] scale to a [-1, 1] scale,\nwhich may help with interpretability of the results.\n\n## References\n\n- Chapelle, O., Joachims, T., Radlinski, F., \u0026 Yue, Y. (2012). Large-scale validation and analysis of interleaved search evaluation. *ACM Transactions on Information Systems*, **30**(1), 1-41. [doi:10.1145/2094072.2094078](https://doi.org/10.1145/2094072.2094078)\n- Radlinski, F., \u0026 Craswell, N. (2013). [Optimized interleaving for online retrieval evaluation](https://www.microsoft.com/en-us/research/publication/optimized-interleaving-for-online-retrieval-evaluation/). *ACM International Conference on Web Search and Data Mining (WSDM)*. [doi:10.1145/2433396.2433429](https://doi.org/10.1145/2433396.2433429)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbearloga%2Finterleaved-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbearloga%2Finterleaved-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbearloga%2Finterleaved-python/lists"}