{"id":49623869,"url":"https://github.com/iamsikun/recsysk","last_synced_at":"2026-05-05T04:41:14.805Z","repository":{"id":354293113,"uuid":"882462915","full_name":"iamsikun/recsysk","owner":"iamsikun","description":"Framework-agnostic recsys benchmarking harness — pinned benchmarks, pluggable algorithms, reproducible runs (PyPI: recsysk)","archived":false,"fork":false,"pushed_at":"2026-04-27T23:26:45.000Z","size":28443,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-04-28T01:22:08.025Z","etag":null,"topics":["benchmark","ctr-prediction","deep-learning","python","pytorch","pytorch-lightning","recommender-systems"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/recsysk/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/iamsikun.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-11-02T20:54:36.000Z","updated_at":"2026-04-27T23:26:49.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/iamsikun/recsysk","commit_stats":null,"previous_names":["iamsikun/recsysk"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/iamsikun/recsysk","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iamsikun%2Frecsysk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iamsikun%2Frecsysk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iamsikun%2Frecsysk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iamsikun%2Frecsysk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/iamsikun","download_url":"https://codeload.github.com/iamsikun/recsysk/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/iamsikun%2Frecsysk/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32636096,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-04T10:08:07.713Z","status":"online","status_checked_at":"2026-05-05T02:00:06.033Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","ctr-prediction","deep-learning","python","pytorch","pytorch-lightning","recommender-systems"],"created_at":"2026-05-05T04:41:11.387Z","updated_at":"2026-05-05T04:41:14.798Z","avatar_url":"https://github.com/iamsikun.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RecSysK\n\n[![PyPI version](https://img.shields.io/pypi/v/recsysk.svg)](https://pypi.org/project/recsysk/)\n[![Python versions](https://img.shields.io/pypi/pyversions/recsysk.svg)](https://pypi.org/project/recsysk/)\n[![License](https://img.shields.io/pypi/l/recsysk.svg)](https://github.com/iamsikun/recsysk/blob/main/LICENSE)\n\nA framework-agnostic benchmarking harness for recommendation algorithms. Benchmarks are pinned contracts (dataset + split + eval protocol + metrics), algorithms plug in behind a narrow protocol, and every run is persisted to a parquet store so multi-seed comparisons are one command away.\n\nInstall from PyPI:\n\n```bash\npip install recsysk\n# or with uv\nuv add recsysk\n```\n\n## Quickstart\n\nOne-time setup:\n\n```bash\nuv sync\n```\n\n**Datasets.**\n\n- **MovieLens 20m** must be extracted manually to `./datasets/ml-20m/` (the directory containing `ratings.csv`, `movies.csv`, `tags.csv`, ...). There is no auto-download for MovieLens.\n- **KuaiRec and KuaiRand** auto-download from Zenodo on first use and cache under `./datasets/`. The loaders (`src/recsys/data/kuairec.py`, `kuairand.py`) expose `download(dataset_root=...)` and `load(dataset_root=...)` with a default that resolves to the repo-root `datasets/` directory regardless of CWD. Partial downloads are rejected (Content-Length verified) so a stalled transfer cannot poison the cache. *Known caveat:* Zenodo is currently throttling the 432 MB `KuaiRec.zip` at ~0-22 KB/s, so the first KuaiRec run may be slow; KuaiRand-Pure (47 MB) finishes in minutes.\n\nRun an experiment (benchmark x algorithm x seeds):\n\n```bash\nuv run recsys bench --experiment conf/experiments/deepfm_on_movielens_ctr.yaml     --seeds 1,2,3\nuv run recsys bench --experiment conf/experiments/din_on_movielens_seq.yaml        --seeds 1,2,3\nuv run recsys bench --experiment conf/experiments/popularity_on_movielens_ctr.yaml --seeds 1,2,3\nuv run recsys bench --experiment conf/experiments/popularity_on_kuairand_ctr.yaml  --seeds 1,2,3\nuv run recsys bench --experiment conf/experiments/popularity_on_kuairec_ctr.yaml   --seeds 1,2,3\n```\n\nView aggregated results (mean +/- std across seeds):\n\n```bash\nuv run recsys report --benchmark movielens_ctr\nuv run recsys report --benchmark movielens_seq\nuv run recsys report --benchmark kuairand_ctr\nuv run recsys report --benchmark kuairec_ctr\n```\n\nList what's registered:\n\n```bash\nuv run recsys list benchmarks\nuv run recsys list algorithms\n```\n\nResults land in `results/\u003cbenchmark\u003e.parquet` (git-ignored). The default experiment YAMLs bake in a fast \"smoke\" trainer profile (`limit_train_batches: 4`, `limit_val_batches: 4`, `max_epochs: 1`) so the gate runs in seconds; remove those overrides for a real training run.\n\n## Adding a new algorithm\n\n1. **Pick a backend.**\n   - Classical / non-neural: subclass `recsys.algorithms.base.Algorithm` under `src/recsys/algorithms/classical/`. Do **not** import `torch` at module scope. The runner branches on `isinstance(algo, Algorithm)` and calls `fit(train, val)` directly — no Lightning `Trainer`, no `nn.Module` shim.\n   - Lightning-backed neural: add a plain `nn.Module` under `src/recsys/algorithms/torch/` (see `deepfm.py`, `din.py`). The runner wraps it in `engine.CTRTask` and runs `L.Trainer.fit` against the benchmark's datamodule.\n2. **Declare compatibility.** Set class attributes `supported_tasks: set[TaskType]` and `required_roles: set[str]` (role names from `FeatureRole`: `user`, `item`, `context`, `sequence`, `group`, `label`).\n3. **Register.** Decorate the class with `@ALGO_REGISTRY.register(\"my_algo\")` from `recsys.utils`.\n4. **Side-effect import.** Add the module to `src/recsys/algorithms/classical/__init__.py` or `src/recsys/algorithms/torch/__init__.py` so the decorator fires at import time. *This step is mandatory.* If you skip it, `recsys list algorithms` will not show your algo.\n5. **Write a config.** Add `conf/algorithms/my_algo.yaml` with the constructor kwargs.\n6. **Write an experiment.** Add `conf/experiments/my_algo_on_\u003cbenchmark\u003e.yaml` that points at the benchmark and algo YAMLs and sets `seeds:` + optional `trainer:` overrides.\n\n## Adding a new benchmark\n\n1. **Builder.** Add a data builder under `src/recsys/data/builders/` that produces a `DatasetBundle` (train/val/test + `feature_map` + `feature_specs`). Reuse the split modules under `data/splits/` and negative samplers under `data/negatives/`; write new ones there if needed.\n2. **Datamodule (optional).** If the benchmark feeds a Lightning-backed algo, wrap the builder in a `BuilderDataModule` under `src/recsys/data/datamodules/`.\n3. **Benchmark class.** Create `src/recsys/benchmarks/my_bench.py` that subclasses `recsys.benchmarks.base.Benchmark`, pins its `Task` (CTR / retrieval / sequential) and its `metric_names` list, and implements `build() -\u003e BenchmarkData` and `version()`.\n4. **Register.** `@BENCHMARK_REGISTRY.register(\"my_bench\")` and import the module from `src/recsys/benchmarks/__init__.py`.\n5. **Config.** Add `conf/benchmarks/my_bench.yaml` with the data/eval blocks the benchmark class consumes.\n\nMetrics and splits are **pinned by the benchmark class**, not by config. If you need a different metric set or a different split, that is a new benchmark, not a new config — this is what makes benchmark comparisons meaningful.\n\n## Architecture at a glance\n\n**`FeatureSpec` + `FeatureRole`** (`src/recsys/schemas/features.py`). Every column in a benchmark is annotated with a role (`user`, `item`, `context`, `sequence`, `group`, `label`). Builders partition columns by role and algos declare which roles they need; the runner can reject incompatible combinations before `fit` is called.\n\n**`Algorithm`** (`src/recsys/algorithms/base.py`). A framework-agnostic protocol: `fit(train, val)`, `predict_scores(batch)`, `predict_topk(users, k, candidates)`, `save/load`. Classical baselines like `popularity` live under `algorithms/classical/`, inherit directly from `Algorithm`, and never import torch at module scope — the runner bypasses Lightning for them entirely. Neural models live under `algorithms/torch/` as plain `nn.Module` subclasses; the runner wraps them in an `engine.CTRTask` Lightning module at fit time.\n\n**`Task`** (`src/recsys/tasks/base.py`). Declares the I/O contract between an algo and the evaluator: which roles are required, which prediction method is called, and how predictions turn into metric values. v1 ships `CTRTask`, `RetrievalTask`, and `SequentialTask`.\n\n**`Benchmark`** (`src/recsys/benchmarks/base.py`). Immutable bundle of dataset + task + split + eval protocol + metric set. `build()` returns a `BenchmarkData` with train/val/test, feature map, feature specs, and candidate item list. `version()` is a hash of `(dataset, split, eval_protocol)` so result rows are traceable.\n\n**Runtime flow.** `recsys bench` loads an experiment YAML and calls `recsys.runner.run_experiment(algo_cfg, benchmark_cfg, seed, trainer_overrides, results_dir, store)`. That function builds the benchmark, builds the algo, and then branches: classical algos (`isinstance(algo, Algorithm)`) go through `algo.fit(train, val)` directly with no Lightning involvement; neural algos get wrapped in `engine.CTRTask` and trained via `L.Trainer.fit` against the benchmark's datamodule. The fitted algo (or Lightning task) is handed to `benchmark.task.evaluate(...)`, and a `RunResult` row is written to `results/\u003cbenchmark\u003e.parquet`. `recsys report` reads the same parquet and prints a mean +/- std table across seeds.\n\n## Scope (v1 + landed v2.0 work)\n\n- **Benchmarks:** `movielens_ctr`, `movielens_seq`, `kuairec_ctr`, `kuairand_ctr`. KuaiRec/KuaiRand loaders auto-download the archives from Zenodo to `./datasets/` on first use; subsequent runs hit the cache.\n- **Algorithms:** `deepfm` (CTR), `din` (sequential, with working ranking metrics), `popularity` (classical baseline — bypasses Lightning entirely via the framework-agnostic fit path).\n- **Metrics:** AUC, LogLoss for CTR; NDCG@{10,50}, Recall@{10,50}, HR@{10,50}, MRR for ranking. Ranking metrics now compute correctly for sequential dict-batch algos like DIN.\n- **Infrastructure:** parquet result store keyed by `(benchmark, algo, config_hash, seed, timestamp)`, argparse CLI (`bench`, `report`, `list`), multi-seed runs as the default.\n\n## Deferred to v2\n\nSession / conversational / cold-start tasks, beyond-accuracy metrics (coverage, diversity, novelty, fairness), statistical significance tests, experiment-tracker hooks (MLflow / W\u0026B), hyperparameter sweeps, a typer-based CLI, additional classical / neural baselines (item-KNN, BPR-MF, SASRec, BERT4Rec, ...), and additional benchmarks (Amazon Reviews, Yoochoose / Diginetica / RetailRocket, MovieLens 1M, Netflix Prize, Yelp, Last.fm, Criteo / Avazu). See `docs/dev.md` for the full v1 plan and v2 roadmap.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiamsikun%2Frecsysk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fiamsikun%2Frecsysk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fiamsikun%2Frecsysk/lists"}