{"id":15425142,"url":"https://github.com/eisber/recommenders","last_synced_at":"2025-07-01T05:04:26.942Z","repository":{"id":95102448,"uuid":"149432698","full_name":"eisber/Recommenders","owner":"eisber","description":"Recommender Systems","archived":false,"fork":false,"pushed_at":"2018-12-12T12:12:32.000Z","size":1318,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-04-08T21:37:05.350Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eisber.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-09-19T10:21:43.000Z","updated_at":"2018-12-12T13:16:10.000Z","dependencies_parsed_at":"2023-03-09T11:00:19.079Z","dependency_job_id":null,"html_url":"https://github.com/eisber/Recommenders","commit_stats":{"total_commits":548,"total_committers":25,"mean_commits":21.92,"dds":0.7390510948905109,"last_synced_commit":"8389b63e412520af0cea8e1cefbdf7b6cce727b3"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/eisber/Recommenders","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eisber%2FRecommenders","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eisber%2FRecommenders/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eisber%2FRecommenders/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repos
itories/eisber%2FRecommenders/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/eisber","download_url":"https://codeload.github.com/eisber/Recommenders/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eisber%2FRecommenders/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262900084,"owners_count":23381657,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-01T17:50:00.317Z","updated_at":"2025-07-01T05:04:26.915Z","avatar_url":"https://github.com/eisber.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Recommenders \n\nThis repository provides examples and best practices for building recommendation systems, provided as Jupyter notebooks. The examples detail our learnings on four key tasks: \n1. [Prepare Data](notebooks/01_prepare_data/README.md): Preparing and loading data for each recommender algorithm\n2. [Model](notebooks/02_model/README.md): Building models using various recommender algorithms such as Alternating Least Squares ([ALS](https://spark.apache.org/docs/latest/api/python/_modules/pyspark/ml/recommendation.html#ALS)), Singular Value Decomposition ([SVD](https://surprise.readthedocs.io/en/stable/matrix_factorization.html#surprise.prediction_algorithms.matrix_factorization.SVD)), etc.\n3. [Evaluate](notebooks/03_evaluate/README.md): Evaluating algorithms with offline metrics\n4. 
[Operationalize](notebooks/04_operationalize/README.md): Operationalizing models in a production environment on Azure\n\nSeveral utilities are provided in [reco_utils](reco_utils) to support common tasks such as loading datasets in the format expected by different algorithms, evaluating model outputs, and splitting train/test data. Implementations of several state-of-the-art algorithms are provided for self-study and customization in your own applications.\n\n## Getting Started\nPlease see the [setup guide](SETUP.md) for more details on setting up your machine locally, on Spark, or on [Azure Databricks](/SETUP.md#setup-guide-for-azure-databricks). \n\nTo set up on your local machine:\n1. Install Anaconda with Python \u003e= 3.6. [Miniconda](https://conda.io/miniconda.html) is a quick way to get started.\n2. Clone the repository:\n    ```\n    git clone https://github.com/Microsoft/Recommenders\n    ```\n3. Run the script that generates the conda file, then create a conda environment:   \n    ```\n    cd Recommenders\n    ./scripts/generate_conda_file.sh\n    conda env create -n reco -f conda_bare.yaml  \n    ```\n4. Activate the conda environment and register it with Jupyter:\n    ```\n    conda activate reco\n    python -m ipykernel install --user --name reco --display-name \"Python (reco)\"\n    ```\n5. Start the Jupyter notebook server:\n    ```\n    cd notebooks\n    jupyter notebook\n    ```\n6. Run the [SAR Python CPU MovieLens](notebooks/00_quick_start/sar_python_cpu_movielens.ipynb) notebook under the 00_quick_start folder. 
Make sure to change the kernel to \"Python (reco)\".\n\n## Notebooks\n\nWe provide several notebooks to show how recommendation algorithms can be designed, evaluated and operationalized.\n\n- The [Quick-Start Notebooks](notebooks/00_quick_start) detail how you can quickly get up and running with state-of-the-art algorithms such as the Smart Adaptive Recommendation ([SAR](https://github.com/Microsoft/Product-Recommendations/blob/master/doc/sar.md)) algorithm and the ALS algorithm. \n\n- The [Data Preparation Notebook](notebooks/01_prepare_data) shows how to prepare and split data properly for recommendation systems.\n\n- The [Modeling Notebooks](notebooks/02_model) provide a deep dive into implementations of different recommender algorithms.\n\n- The [Evaluation Notebooks](notebooks/03_evaluate) show how to evaluate recommender algorithms for different ranking and rating metrics.\n\n- The [Operationalization Notebook](notebooks/04_operationalize) demonstrates how to deploy models in production systems.\n\nIn addition, we provide a [comparison notebook](notebooks/03_evaluate/comparison.ipynb) to illustrate how different algorithms can be evaluated and compared. In this notebook, data (MovieLens 1M) is randomly split into train/test sets at a 75/25 ratio. A recommendation model is trained using each of the collaborative filtering algorithms below. We use empirical parameter values reported in the literature [here](http://mymedialite.net/examples/datasets.html). For ranking metrics we use k = 10 (top 10 results). We run the comparison on a Standard NC6s_v2 [Azure DSVM](https://azure.microsoft.com/en-us/services/virtual-machines/data-science-virtual-machines/) (6 vCPUs, 112 GB memory and 1 K80 GPU). Spark ALS is run in local standalone mode. 
\n\n**Preliminary Comparison**\n\n| Algorithm | MAP | nDCG@k | Precision@k | Recall@k | RMSE | MAE | R\u003csup\u003e2\u003c/sup\u003e | Explained Variance | \n| --- | --- | --- | --- | --- | --- | --- | --- | --- | \n| ALS | 0.002020 | 0.024313 | 0.030677 | 0.009649 | 0.860502 | 0.680608 | 0.406014 | 0.411603 | \n| SAR | 0.064013 | 0.308012 | 0.277215 | 0.109292 | N/A | N/A | N/A | N/A | \n| SVD | 0.010915 | 0.102398 | 0.092996 | 0.025362 | 0.888991 | 0.696781 | 0.364178 | 0.364178 | \n\n\n## Contributing\nThis project welcomes contributions and suggestions. Before contributing, please see our [contribution guidelines](CONTRIBUTING.md).\n\n\n## Build Status\n| Build Type | Branch | Status |  | Branch | Status | \n| --- | --- | --- | --- | --- | --- | \n| **Linux CPU** |  master | [![Status](https://msdata.visualstudio.com/AlgorithmsAndDataScience/_apis/build/status/nightly?branchName=master)](https://msdata.visualstudio.com/AlgorithmsAndDataScience/_build/latest?definitionId=4792)  | | staging | [![Status](https://msdata.visualstudio.com/AlgorithmsAndDataScience/_apis/build/status/nightly_staging?branchName=staging)](https://msdata.visualstudio.com/AlgorithmsAndDataScience/_build/latest?definitionId=4594) | \n| **Linux Spark** | master | [![Status](https://msdata.visualstudio.com/AlgorithmsAndDataScience/_apis/build/status/nightly_spark?branchName=master)](https://msdata.visualstudio.com/AlgorithmsAndDataScience/_build/latest?definitionId=4804) | | staging | [![Status](https://msdata.visualstudio.com/AlgorithmsAndDataScience/_apis/build/status/nightly_spark_staging?branchName=staging)](https://msdata.visualstudio.com/AlgorithmsAndDataScience/_build/latest?definitionId=4805)|\n\n**NOTE** - the tests are executed every night. We use `pytest` for testing Python utilities in [reco_utils](reco_utils) and 
[notebooks](notebooks).\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feisber%2Frecommenders","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feisber%2Frecommenders","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feisber%2Frecommenders/lists"}