{"id":50323026,"url":"https://github.com/arnim/spear-algorithm-methodshub","last_synced_at":"2026-05-29T04:01:42.686Z","repository":{"id":357132943,"uuid":"1235507716","full_name":"arnim/spear-algorithm-methodshub","owner":"arnim","description":null,"archived":false,"fork":false,"pushed_at":"2026-05-11T12:57:20.000Z","size":59,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-11T14:30:24.398Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/arnim.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-11T11:47:26.000Z","updated_at":"2026-05-11T12:57:30.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/arnim/spear-algorithm-methodshub","commit_stats":null,"previous_names":["arnim/spear-algorithm-methodshub"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/arnim/spear-algorithm-methodshub","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arnim%2Fspear-algorithm-methodshub","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arnim%2Fspear-algorithm-methodshub/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arnim%2Fspear-algorithm-methodshub/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arnim%2Fspear-algorithm-methodshub/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/arnim","download_url":"https://codeload.github.com/arnim/spear-algorithm-methodshub/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/arnim%2Fspear-algorithm-methodshub/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33635961,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-29T02:00:06.066Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-29T04:01:41.779Z","updated_at":"2026-05-29T04:01:42.676Z","avatar_url":"https://github.com/arnim.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SPEAR: Ranking User Expertise and Resource Quality from Time-Ordered Interactions\n\n[![Launch with Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/arnim/spear-algorithm-methodshub/HEAD?urlpath=lab/tree/spear_algorithm.ipynb)\n[![Launch with Jupyter4NFDI](https://img.shields.io/badge/Launch-Jupyter4NFDI-orange)](https://hub.nfdi-jupyter.de/v2/gh/arnim/spear-algorithm-methodshub/HEAD?urlpath=lab/tree/spear_algorithm.ipynb\u0026localstoragepath=%2Fhome%2Fjovyan%2Fwork)\n\nOpen the notebook directly in a temporary executable environment with [mybinder.org](https://mybinder.org/) or [Jupyter4NFDI](https://nfdi-jupyter.de/users/jupyterlab/repo2docker).\n\n## Description\n\nSPEAR estimates which users are likely to be experts and which resources are likely to be high quality from a chronological list of user-resource interactions. It is useful when the timing of an interaction matters: users who discover resources before other users do can receive more credit than later users. The method returns two ranked tables, one for user expertise and one for resource quality.\n\nSPEAR takes tabular activity data as input. Each row should identify a timestamp, a user, and a resource. The implementation in [`spear.py`](spear.py) first creates a weighted user-resource matrix and then iteratively updates user expertise and resource quality scores. The accompanying notebook [`spear_algorithm.ipynb`](spear_algorithm.ipynb) demonstrates the complete workflow on a small public example dataset.\n\nThe algorithm follows Noll et al. (2009), who introduced SPEAR for telling experts from spammers in social bookmarking and folksonomy data. It is related to Kleinberg's (1999) HITS algorithm, but adds time-aware credit scoring for early user-resource interactions. This repository uses a small synthetic dataset in [`data/social_bookmarks.csv`](data/social_bookmarks.csv), so all examples are reproducible without credentials or external data access. The Python implementation was checked against the public project documentation (Noll, n.d.) and the Julia implementation by Bleier (2013).\n\nImportant parameters are the credit scoring function and the number of iterations. The default `sqrt_credit` function gives earlier users more credit while dampening very large differences. The `constant_credit` option ignores timing and can be used as a robustness comparison similar to a HITS-style ranking.\n\n## Use Cases\n\n- Identifying knowledgeable curators in social bookmarking data. One can use SPEAR to rank users who repeatedly find resources that later become popular among other users.\n- Ranking resources in online communities. One can use SPEAR to prioritize links, posts, documents, or datasets that are connected to high-expertise users.\n- Comparing expert and spam-like behavior. One can inspect whether accounts that mostly promote low-quality or isolated resources receive low expertise scores.\n- Studying early adoption in digital behavioral data. One can apply SPEAR to timestamped interactions such as bookmarks, likes, citations, reposts, or hyperlink creation.\n\nExample publication:\n\n- Noll, M. G., Au Yeung, C.-m., Gibbins, N., Meinel, C., \u0026 Shadbolt, N. (2009). Telling experts from spammers. In *Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval* (pp. 612–619). https://doi.org/10.1145/1571941.1572046\n\n## Input Data\n\nSPEAR takes a CSV table with one interaction per row. The example input file is [`data/social_bookmarks.csv`](data/social_bookmarks.csv):\n\n```csv\ntimestamp,user,resource,tag\n2026-01-01 09:00,Alice,open-data-portal,data\n2026-01-01 09:05,Bob,open-data-portal,statistics\n2026-01-01 09:10,Chandra,miracle-cure-shop,health\n```\n\nRequired fields:\n\n- `timestamp`. Time at which the activity happened. This field is required because SPEAR uses chronological order.\n- `user`. Actor identifier, such as an account, author, or participant ID. Required.\n- `resource`. Item identifier, such as a URL, document, post, paper, or dataset. Required.\n\nOptional fields such as `tag` can be present but are ignored by the basic implementation.\n\n## Output Data\n\nThe method returns three pandas data frames:\n\n- `expertise`: users ranked by normalized expertise score;\n- `quality`: resources ranked by normalized quality score;\n- `adjacency`: the weighted user-resource matrix used by the algorithm.\n\nExample output shape:\n\n| user | expertise |\n| --- | ---: |\n| Alice | 0.31 |\n| Bob | 0.28 |\n\n| resource | quality |\n| --- | ---: |\n| open-data-portal | 0.37 |\n| validated-election-data | 0.34 |\n\nScores are relative within the analyzed dataset and sum to one.\n\n## Hardware Requirements\n\nThe example notebook runs on standard Binder hardware with one CPU and less than 1 GB of memory. Larger datasets may require more memory because the simple implementation stores the user-resource matrix in memory.\n\n## Environment Setup\n\nInstall Python 3.10 or newer and the packages in [`requirements.txt`](requirements.txt):\n\n```bash\npip install -r requirements.txt\n```\n\nBinder also uses [`postBuild`](postBuild) to install Quarto support for Methods Hub rendering.\n\n## How to Use\n\nOpen and run [`spear_algorithm.ipynb`](spear_algorithm.ipynb) in JupyterLab, or execute the method from Python:\n\n```python\nimport pandas as pd\nfrom spear import run_spear, sqrt_credit\n\nactivities = pd.read_csv(\"data/social_bookmarks.csv\", parse_dates=[\"timestamp\"])\nresult = run_spear(activities, credit=sqrt_credit, iterations=20)\n\nprint(result.expertise)\nprint(result.quality)\n```\n\nTo compare with a non-temporal baseline, replace `sqrt_credit` with `constant_credit`.\n\n## Example Commands and Parameters\n\nRun the notebook locally:\n\n```bash\njupyter nbconvert --to notebook --execute spear_algorithm.ipynb --output /tmp/spear_algorithm_executed.ipynb\n```\n\nTest the Binder build locally with repo2docker:\n\n```bash\nrepo2docker --no-run .\n```\n\nMain parameters in [`spear.py`](spear.py):\n\n- `credit`: credit scoring function. Use `sqrt_credit` for time-aware SPEAR or `constant_credit` for a non-temporal comparison.\n- `iterations`: maximum number of expertise/quality update steps.\n- `tolerance`: convergence threshold.\n- `user_col`, `resource_col`, `time_col`: input column names.\n\n## AI Use Acknowledgement\n\nThis submission was prepared with assistance from an AI coding assistant. The assistant helped draft explanatory text, create the example notebook structure, implement and test the Python code, and check Binder readiness. The author reviewed, edited, and takes responsibility for the final content, code, and citations.\n\nA shared log of the AI-assisted preparation process is available here: [AI assistance log](https://pi.dev/session/#7935c5947f3c94520397f1abbb777216).\n\n## References\n\nThe references follow a consistent APA style, in line with the Methods Hub guidelines.\n\n- Bleier, A. (2013). *SpearAlgorithm.jl* [Computer software]. GitHub. Retrieved May 11, 2026, from https://github.com/arnim/SpearAlgorithm.jl\n- Kleinberg, J. M. (1999). Authoritative sources in a hyperlinked environment. *Journal of the ACM, 46*(5), 604–632. https://doi.org/10.1145/324133.324140\n- Noll, M. G. (n.d.). *SPEAR algorithm*. Retrieved May 11, 2026, from https://www.michael-noll.com/projects/spear-algorithm/\n- Noll, M. G., Au Yeung, C.-m., Gibbins, N., Meinel, C., \u0026 Shadbolt, N. (2009). Telling experts from spammers. In *Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval* (pp. 612–619). Association for Computing Machinery. https://doi.org/10.1145/1571941.1572046\n\n## Contact Details\n\nFor questions about this Methods Hub submission, contact Arnim Bleier via the repository issue tracker.\n\n## Funding Acknowledgement\n\nThis work was supported by Jupyter4NFDI, funded through Base4NFDI by the German Research Foundation (DFG), project number 521453681.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farnim%2Fspear-algorithm-methodshub","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farnim%2Fspear-algorithm-methodshub","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farnim%2Fspear-algorithm-methodshub/lists"}