{"id":40864897,"url":"https://github.com/upwork/model-rerank","last_synced_at":"2026-01-22T00:16:41.234Z","repository":{"id":45536615,"uuid":"90074226","full_name":"upwork/model-rerank","owner":"upwork","description":null,"archived":false,"fork":false,"pushed_at":"2023-02-27T23:16:51.000Z","size":78,"stargazers_count":3,"open_issues_count":6,"forks_count":0,"subscribers_count":6,"default_branch":"master","last_synced_at":"2023-08-05T09:11:08.473Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/upwork.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-05-02T20:17:20.000Z","updated_at":"2022-09-12T12:33:29.000Z","dependencies_parsed_at":"2023-01-21T19:30:52.909Z","dependency_job_id":null,"html_url":"https://github.com/upwork/model-rerank","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/upwork/model-rerank","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/upwork%2Fmodel-rerank","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/upwork%2Fmodel-rerank/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/upwork%2Fmodel-rerank/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/upwork%2Fmodel-rerank/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/upwork","download_url":"https://codeload.github.com/upwork/model-rerank/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/upwork%2Fmodel-rerank/sbom","sco
recard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28647922,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-21T21:29:11.980Z","status":"ssl_error","status_checked_at":"2026-01-21T21:24:31.872Z","response_time":86,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-22T00:16:41.157Z","updated_at":"2026-01-22T00:16:41.228Z","avatar_url":"https://github.com/upwork.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Model-Rerank : Reranking a set of records containing model features based on a given model\n\n__com.upwork.common.rerank__:__rerank__ is a library to rerank a set of instances containing features using a\npre-configured [Weka][weka] model. The library is designed to load and manage multiple [Weka][weka] models to enable comparative\nevaluation of two or more models.\n\n#### Where to use the library\nThis library can be used for reranking a set of records or data instances based on a machine learning model built using\n[Weka][weka]. \n\nA typical use case is reranking search results returned by a search engine. For such a use case,\nthe input to the library is a list of data instances, one for each search result, and a class label\nwhose score the results are reranked on. 
Each data instance contains the set of features required by the\nmodel. The output is an object composed of the reranked instance ids and useful debugging information.\n\nFor details on how to use the library, see the \"How to use\" section below.\n\n#### Overview\n\nThe library loads the model configuration files and the model binary files, optionally caches the model instances in \nmemory, and can execute a model on a set of records to rerank them. It also provides debug information for a \ndeeper dive into the performance of the model. \n\nThe model binaries must have the extension _.model_, and the configuration for a model should be a _.json_ file \ncontaining the specification for the model and its features. The library provides a utility class,\n[JsonConfigGenerator][configgen], to convert the model specification from the standard Weka _.ARFF_ format into the\ncustom _.json_ config this library takes. The library expects all the files related to a model to have a name which\nfollows the convention `modelName.extension`. For example, if the model name is `modelZ`, then the\nmodel binary must be named `modelZ.model`, the _.arff_ file `modelZ.arff`, and the generated _.json_ file\n`modelZ.json`. The library doesn't support the `date` or [`relation`][multiInstance] Weka attribute [types][wekaTypes]\nat present.\n\nThe library uses a custom format rather than the standard _.ARFF_ format so as to be extensible enough for real-time\nuse cases where each feature may have additional configuration for its data source or for functions to compute\nit on the fly. That is how we use it at [Upwork][upwork]; however, the current version of the library doesn't\nexpose that extended functionality. We may decide to do so in the future.\n\nYou can read more about [Weka][weka] and [ARFF][arff] by following the links.\n\n#### Prerequisites\n1. Java version 1.8 or greater\n2. 
Maven version 3.2.1 or greater\n\n#### How to use\n\nFollow these steps to use the library in an application.\n\n1. **Add Dependency** : Include the following Maven dependency in your pom.xml\n\n    ```\n    \u003cdependency\u003e\n        \u003cgroupId\u003ecom.upwork.common.rerank\u003c/groupId\u003e\n        \u003cartifactId\u003ererank\u003c/artifactId\u003e\n        \u003cversion\u003e{version}\u003c/version\u003e\n    \u003c/dependency\u003e\n    ```\n\n2. **Configure** : Set the following properties in a file named __config.properties__ and make it available on the \nclasspath of the application, or set the properties as system properties through the program.\n\n    ```\n    ## Relative or absolute path of the repository where model binaries and their configs are stored.\n    rerank.models.repo=../sample_models  \n    \n    ## Names of the supported models\n    rerank.models.supported=rerank_model \n    ```\n\n    The library uses [Archaius][archaius] for configuration management. The default __config.properties__ file is \n    available at [Default Config][defaultConfig].\n\n3. **Convert ARFF to JSON** : [JsonConfigGenerator][configgen] can be used for this purpose. The usage is as follows.\n\n    ```\n    java com.upwork.rerank.apputils.JsonConfigGenerator \u003cmodelName\u003e\n    ```\n    e.g. for a model named modelZ\n\n    ```\n    java com.upwork.rerank.apputils.JsonConfigGenerator modelZ\n    ```\n\n    The above command expects a file `modelZ.arff` in the model repository directory set by the property\n    `rerank.models.repo` and generates a file named `modelZ.json` in the same directory. Open `modelZ.json` and make \n    sure the `name` and `shortName` of the model are as expected. They default to the `@relation` name in \n    the Weka _.arff_ file and should be modified to reflect the name of the model. The \n    _.json_ file can optionally be formatted for easier readability.\n\n4. 
**Create Instances and Rerank** : Create instances of features and use the library to rerank them. Typical \nusage is as follows.\n\n    ```\n    // This is the name of the model which is used for scoring each of the instances\n    String modelName = \"modelZ\"; \n    \n    // This is the class label on which the instances are scored\n    String classLabelToRerank = \"1\"; \n    \n    // Convert the domain specific data instances to a list of TInstance\n    List\u003cTInstance\u003e instances = getData(); \n    \n    // Get an instance of the rerank lib\n    RerankLib lib = RerankLibFactory.getInstance(2, 10).getRerankLib(modelName); \n    \n    // Rerank using the lib\n    RerankResultSet rerankResultSet = lib.rerank(instances, classLabelToRerank); \n    ```\n\n    The project includes a sample application [SampleRerankApplication][sampleApp] which uses a sample model named \n    _rerank_model_ to rerank a set of instances. While in an actual application the data, i.e. the list of TInstance,\n    would come from the application at runtime, the sample application uses the data already available in the model \n    _.arff_ file to illustrate the usage. Otherwise, the _.arff_ file is not needed once it has been converted into the _.json_ config \n    and can be discarded.\n\n#### _.json_ config file format\n\nThe JSON format is illustrated below through the config for a sample model _rerank_model_ included in the\nproject. Every feature can optionally have another attribute named `customConfig` to include application-specific \ncustom configuration, e.g. 
data source specifications.\n\n    ```\n    {\n      \"features\": [\n        {\n          \"name\": \"a\",\n          \"dataType\": \"numeric\"\n        },\n        {\n          \"name\": \"b\",\n          \"dataType\": \"numeric\"\n        },\n        {\n          \"name\": \"c\",\n          \"dataType\": \"numeric\"\n        },\n        {\n          \"name\": \"d\",\n          \"dataType\": \"numeric\"\n        },\n        {\n          \"name\": \"e\",\n          \"dataType\": \"numeric\"\n        },\n        {\n          \"name\": \"f\",\n          \"dataType\": \"nominal\",\n          \"values\": [\n            \"0\",\n            \"1\"\n          ],\n          \"class\": true\n        }\n      ],\n      \"name\": \"rerank_model\",\n      \"shortName\": \"rerank_model\"\n    }\n    ```\n    \n#### Known Issues and Limitations\n1. RerankLib can load multiple models and caches them according to the arguments supplied when creating it. It uses an LRU cache internally.\n2. If the class/target data type is \"numeric\", a classLabel does not need to be supplied to the rerank call.\n\n#### License\nMIT\n\n[configgen]:./rerank/src/main/java/com/upwork/rerank/apputils/JsonConfigGenerator.java\n[defaultConfig]:./rerank/src/main/resources/config.properties\n[sampleApp]:./rerank-example/src/main/java/com/upwork/rerankexample/SampleRerankApplication.java\n[weka]:http://www.cs.waikato.ac.nz/ml/weka/\n[arff]:http://weka.wikispaces.com/ARFF\n[upwork]:http://upwork.com\n[archaius]:https://github.com/Netflix/archaius\n[multiInstance]:https://weka.wikispaces.com/Multi-instance+classifications\n[wekaTypes]:http://www.cs.waikato.ac.nz/ml/weka/arff.html\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fupwork%2Fmodel-rerank","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fupwork%2Fmodel-rerank","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fupwork%2Fmodel-rerank/lists"}