{"id":20129584,"url":"https://github.com/yilinjuang/github-repo-recommender","last_synced_at":"2026-04-15T21:32:08.681Z","repository":{"id":81586358,"uuid":"95487692","full_name":"yilinjuang/GitHub-Repo-Recommender","owner":"yilinjuang","description":"Github Repo Recommender System. 2017 Network Science Final Project.","archived":false,"fork":false,"pushed_at":"2017-09-03T07:39:08.000Z","size":145,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-02T21:32:38.371Z","etag":null,"topics":["github","network-science","recommendation-system","recommender-system","repository"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yilinjuang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-06-26T20:41:10.000Z","updated_at":"2021-06-13T08:05:18.000Z","dependencies_parsed_at":null,"dependency_job_id":"409be867-c15d-48e6-9880-ea8e027b7164","html_url":"https://github.com/yilinjuang/GitHub-Repo-Recommender","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/yilinjuang/GitHub-Repo-Recommender","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yilinjuang%2FGitHub-Repo-Recommender","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yilinjuang%2FGitHub-Repo-Recommender/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yilinjuang%2FGitHub-Repo-Recommender/releases","manifest
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yilinjuang%2FGitHub-Repo-Recommender/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yilinjuang","download_url":"https://codeload.github.com/yilinjuang/GitHub-Repo-Recommender/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yilinjuang%2FGitHub-Repo-Recommender/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31861353,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T15:24:51.572Z","status":"ssl_error","status_checked_at":"2026-04-15T15:24:39.138Z","response_time":63,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["github","network-science","recommendation-system","recommender-system","repository"],"created_at":"2024-11-13T20:35:08.954Z","updated_at":"2026-04-15T21:32:08.665Z","avatar_url":"https://github.com/yilinjuang.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Octomender\n```\nOctomender = Octopus (GitHub) + Recommender\n```\nGithub Repo Recommender System.\n\n2017 Network Science Final Project with J. C. 
Liang.\n\n## Requirement\n- python3\n- [NetworkX](https://github.com/networkx/networkx): High-productivity software for complex networks.\n- [NumPy](https://github.com/numpy/numpy)\n- [SciPy](https://github.com/scipy/scipy)\n- [OpenMP\u003e=4.0](http://www.openmp.org/): C/C++ API that supports multi-platform shared memory multiprocessing programming.\n\n## Dataset\n[GitHub Archive](https://www.githubarchive.org/)\n\n## Preprocessing\n### [parse.py](preprocessing/parse.py)\nParse raw JSON data files into three pickle data files.\n- output-data-basename.user: map of user id (str) to user name (str)\n- output-data-basename.repo: map of repo id (int) to repo name (str)\n- output-data-basename.edge: list of tuples of user-repo edge (str, int)\n```\nUsage: parse.py {-m|--member|-w|--watch} {\u003cinput-json-directory\u003e|\u003cinput-json-file\u003e} \u003coutput-data-basename\u003e\n  -m, --member      parse MemberEvent.\n  -w, --watch       parse WatchEvent.\nEx:    parse.py -m 2017-06-01-0.json data\nEx:    parse.py --watch json/2017-05/ data/2017-05\n```\nFor the raw JSON data format, refer to [GitHub API v3](https://developer.github.com/v3/activity/events/types/).\n\n### [parse_mp.py](preprocessing/parse_mp.py)\nSame as parse.py, but runs with multiprocessing. 
The default number of processes is 16.\n```\nUsage: parse_mp.py {-m|--member|-w|--watch} {\u003cinput-json-directory\u003e|\u003cinput-json-file\u003e} \u003coutput-data-basename\u003e [n-process]\n  -m, --member      parse MemberEvent.\n  -w, --watch       parse WatchEvent.\n  n-process         number of processes when multiprocessing.\nEx:    parse_mp.py -m 2017-06-01-0.json data\nEx:    parse_mp.py --watch json/2017-05/ data/2017-05 32\n```\n\n### [mergedata.py](preprocessing/mergedata.py)\nMerge multiple pickle data files into one.\n```\nUsage: mergedata.py \u003cinput-data-dir\u003e \u003coutput-data-basename\u003e\nEx:    mergedata.py data/2016-010203/ data/2016-Q1\n```\n\n### [generate.py](preprocessing/generate.py)\nGenerate a bipartite graph and optionally project it to a unipartite graph.\n```\nUsage: generate.py \u003cinput-data-basename\u003e \u003coutput-graph-basename\u003e [-p|--project]\n  -p, --project     project to unipartite graph (multigraph).\nEx:    generate.py data/2017-05 graph/2017-05\nEx:    generate.py data/2016-Q1 graph/2016-Q1 -p\n```\nFor the bipartite graph implementation, refer to [algorithms.bipartite](https://networkx.readthedocs.io/en/stable/reference/algorithms.bipartite.html) in NetworkX.\n\n### [filter.py](preprocessing/filter.py)\nFilter a multigraph down to a simple graph using one of three modes.\n```\nUsage: filter.py {-m|-t|-p} \u003cinput-unipartite-nxgraph\u003e \u003coutput-filtered-nxgraph\u003e\n  -m                filtering mode: Multiplicity \u003e 1.\n  -t                filtering mode: Top % of multiplicity.\n  -p                filtering mode: Multiplicity proportion \u003e threshold.\nEx:    filter.py -m graph/2017-05_user.nxgraph graph/2017-05_user_m.nxgraph\nEx:    filter.py -t graph/2016-Q1_repo.nxgraph graph/2016-Q1_repo_t.nxgraph\n```\n\n### [nxgraph2edgelist.py](preprocessing/nxgraph2edgelist.py)\nConvert a NetworkX `Graph` object (`.nxgraph`) to an edge list.\n```\nUsage: nxgraph2edgelist.py \u003cinput-nxgraph\u003e 
\u003coutput-edgelist-basename\u003e\nEx:    nxgraph2edgelist.py graph/2017-05_bi.nxgraph graph/2017-05_bi\n```\n\n## SVD Predictor\n### [bi_eval.py](svd_predictor/bi_eval.py)\n### [bi_eval_sep.py](svd_predictor/bi_eval_sep.py)\n### [bi_pred.py](svd_predictor/bi_pred.py)\n\n## Octomender\n### Build\n```\nmake\n```\n\n### Run\n```\nUsage: ./octomender \u003cinput-edgelist\u003e\nEx:    ./octomender graph/2017-05_bi.edgelist\n```\nOr redirect the output to a file.\n```\nUsage: ./octomender \u003cinput-edgelist\u003e \u003e output.log\nEx:    ./octomender graph/2017-05_bi.edgelist \u003e log/2017-05.log\n```\n\n### [whatsthisrepoid.py](octomender/whatsthisrepoid.py)\nConvert a log file to a readable format, translating each repo id to its repo name.\n```\nUsage: whatsthisrepoid.py \u003cinput-log-file\u003e \u003cinput-repo-data-file\u003e\nEx:    whatsthisrepoid.py log/2017-05.log data/2017-05.repo\n```\n\n### [lookup.py](octomender/lookup.py)\nLook up a user or repo: given an id, return the name; given a name, return the id.\n```\nUsage: lookup.py \u003cinput-data-file\u003e \u003cquery\u003e\nEx:    lookup.py data/2017-05.user frankyjuang\nEx:    lookup.py data/2017-05.user 6175880\nEx:    lookup.py data/2017-05.repo tensorflow/tensorflow\nEx:    lookup.py data/2017-05.repo 45717250\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyilinjuang%2Fgithub-repo-recommender","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyilinjuang%2Fgithub-repo-recommender","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyilinjuang%2Fgithub-repo-recommender/lists"}