{"id":13676867,"url":"https://github.com/allegro/allRank","last_synced_at":"2025-04-29T07:33:42.460Z","repository":{"id":37502391,"uuid":"214192712","full_name":"allegro/allRank","owner":"allegro","description":"allRank is a framework for training learning-to-rank neural models based on PyTorch.","archived":false,"fork":false,"pushed_at":"2024-08-06T19:11:59.000Z","size":129,"stargazers_count":925,"open_issues_count":18,"forks_count":123,"subscribers_count":27,"default_branch":"master","last_synced_at":"2025-04-11T22:37:27.452Z","etag":null,"topics":["click-model","deep-learning","information-retrieval","learning-to-rank","machine-learning","ndcg","python","pytorch","ranking","transformer"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/allegro.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-10-10T13:39:33.000Z","updated_at":"2025-04-09T14:12:49.000Z","dependencies_parsed_at":"2024-11-13T02:00:28.859Z","dependency_job_id":"7b3f6e92-426d-41f9-8f40-f80fabac6d1f","html_url":"https://github.com/allegro/allRank","commit_stats":{"total_commits":62,"total_committers":7,"mean_commits":8.857142857142858,"dds":0.5806451612903225,"last_synced_commit":"2923985135c8afcdae6392e5b810b62da7851276"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allegro%2FallRank","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allegro%2FallRank/tags","releases_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allegro%2FallRank/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/allegro%2FallRank/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/allegro","download_url":"https://codeload.github.com/allegro/allRank/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251456067,"owners_count":21592287,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["click-model","deep-learning","information-retrieval","learning-to-rank","machine-learning","ndcg","python","pytorch","ranking","transformer"],"created_at":"2024-08-02T13:00:34.086Z","updated_at":"2025-04-29T07:33:37.435Z","avatar_url":"https://github.com/allegro.png","language":"Python","readme":"# allRank : Learning to Rank in PyTorch\n\n## About\n\nallRank is a PyTorch-based framework for training neural Learning-to-Rank (LTR) models, featuring implementations of:\n* common pointwise, pairwise and listwise loss functions\n* fully connected and Transformer-like scoring functions\n* commonly used evaluation metrics like Normalized Discounted Cumulative Gain (NDCG) and Mean Reciprocal Rank (MRR)\n* click-models for experiments on simulated click-through data\n\n### Motivation\n\nallRank provides an easy and flexible way to experiment with various LTR neural network models and loss functions.\nIt is easy to add a custom loss, and to configure the model and the training procedure. 
\nWe hope that allRank will facilitate both research in neural LTR and its industrial applications.\n\n## Features\n\n### Implemented loss functions:\n 1. ListNet (for binary and graded relevance)\n 2. ListMLE\n 3. RankNet\n 4. Ordinal loss\n 5. LambdaRank\n 6. LambdaLoss\n 7. ApproxNDCG\n 8. RMSE\n 9. NeuralNDCG (introduced in https://arxiv.org/pdf/2102.07831)\n\n### Getting started guide\n\nTo help you get started, we provide a ```run_example.sh``` script which generates dummy ranking data in libsvm format and trains\n a Transformer model on the data using the provided example ```config.json``` config file. Once you run the script, the dummy data can be found in the `dummy_data` directory\n and the results of the experiment in the `test_run` directory. To run the example, Docker is required.\n\n### Getting the right architecture version (GPU vs CPU-only)\n\nSince torch binaries are different for GPU and CPU, and the GPU version does not work on CPU-only machines, you must select and build the appropriate Docker image version.\n\nTo do so, pass `gpu` or `cpu` as the `arch_version` build-arg in \n\n```docker build --build-arg arch_version=${ARCH_VERSION}```\n\nWhen calling `run_example.sh` you can select the proper version as the first command-line argument, e.g. \n\n```run_example.sh gpu ...```\n\nwith `cpu` being the default if not specified.\n\n### Configuring your model \u0026 training\n\nTo train your own model, configure your experiment in the ```config.json``` file and run  \n\n```python allrank/main.py --config_file_name allrank/config.json --run_id \u003cthe_name_of_your_experiment\u003e --job_dir \u003cthe_place_to_save_results\u003e```\n\nAll the hyperparameters of the training procedure, i.e. model definition, data location, loss and metrics used, training hyperparameters etc., are controlled\nby the ```config.json``` file. 
We provide a template file ```config_template.json``` where supported attributes, their meaning and possible values are explained.\n Note that following the MSLR-WEB30K convention, your libsvm file with training data should be named `train.txt`. You can specify the name of the validation dataset \n (e.g. valid or test) in the config. Results will be saved under the path ```\u003cjob_dir\u003e/results/\u003crun_id\u003e```.\n \nGoogle Cloud Storage is supported in allRank as a place for data and job results.\n\n\n### Implementing custom loss functions\n\nTo experiment with your own custom loss, you need to implement a function that takes two tensors (model prediction and ground truth) as input\n and put it in the `losses` package, making sure it is exposed at the package level.\nTo use it in training, simply pass the name (and args, if your loss method has some hyperparameters) of your function in the correct place in the config file:\n\n```\n\"loss\": {\n    \"name\": \"yourLoss\",\n    \"args\": {\n        \"arg1\": val1,\n        \"arg2\": val2\n    }\n  }\n```\n\n### Applying click-model\n\nTo apply a click model, you first need a trained allRank model.\nNext, run:\n\n```python allrank/rank_and_click.py --input-model-path \u003cpath_to_the_model_weights_file\u003e --roles \u003ccomma_separated_list_of_ds_roles_to_process e.g. train,valid\u003e --config_file_name allrank/config.json --run_id \u003cthe_name_of_your_experiment\u003e --job_dir \u003cthe_place_to_save_results\u003e``` \n\nThe model will be used to rank all slates from the dataset specified in the config. 
Next, the click model configured in the config will be applied and the resulting click-through dataset will be written under ```\u003cjob_dir\u003e/results/\u003crun_id\u003e``` in libSVM format.\nThe path to the results directory may then be used as input for training another allRank model.\n\n## Continuous integration\n\nYou should run `scripts/ci.sh` to verify that the code passes style guidelines and unit tests.\n\n## Research\n\nThis framework was developed to support the research project [Context-Aware Learning to Rank with Self-Attention](https://arxiv.org/abs/2005.10084). If you use allRank in your research, please cite:\n```\n@article{Pobrotyn2020ContextAwareLT,\n  title={Context-Aware Learning to Rank with Self-Attention},\n  author={Przemyslaw Pobrotyn and Tomasz Bartczak and Mikolaj Synowiec and Radoslaw Bialobrzeski and Jaroslaw Bojar},\n  journal={ArXiv},\n  year={2020},\n  volume={abs/2005.10084}\n}\n```\nAdditionally, if you use the NeuralNDCG loss function, please cite the corresponding work, [NeuralNDCG: Direct Optimisation of a Ranking Metric via Differentiable Relaxation of Sorting](https://arxiv.org/abs/2102.07831):\n```\n@article{Pobrotyn2021NeuralNDCG,\n  title={NeuralNDCG: Direct Optimisation of a Ranking Metric via Differentiable Relaxation of Sorting},\n  author={Przemyslaw Pobrotyn and Radoslaw Bialobrzeski},\n  journal={ArXiv},\n  year={2021},\n  volume={abs/2102.07831}\n}\n```\n\n## License\n\nApache 2 License\n","funding_links":[],"categories":["Learning-to-Rank \u0026 Recommender Systems","Python"],"sub_categories":["Others"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fallegro%2FallRank","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fallegro%2FallRank","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fallegro%2FallRank/lists"}