{"id":31154802,"url":"https://github.com/bk-scoss/scoss","last_synced_at":"2025-09-18T19:42:02.364Z","repository":{"id":43175487,"uuid":"354218756","full_name":"BK-SCOSS/scoss","owner":"BK-SCOSS","description":"A Source Code Similarity System - SCOSS","archived":false,"fork":false,"pushed_at":"2022-03-15T10:48:47.000Z","size":32366,"stargazers_count":8,"open_issues_count":1,"forks_count":1,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-08-16T23:17:17.029Z","etag":null,"topics":["code-similarity","moss","plagiarism-detection","scoss"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BK-SCOSS.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-04-03T06:31:40.000Z","updated_at":"2024-06-05T15:18:21.000Z","dependencies_parsed_at":"2022-09-13T21:50:46.264Z","dependency_job_id":null,"html_url":"https://github.com/BK-SCOSS/scoss","commit_stats":null,"previous_names":["ngocjr7/scoss"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/BK-SCOSS/scoss","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BK-SCOSS%2Fscoss","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BK-SCOSS%2Fscoss/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BK-SCOSS%2Fscoss/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BK-SCOSS%2Fscoss/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BK-SCOSS","download_url":"https://codeload.github.com/BK-SCOSS/scoss/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosys
te.ms/api/v1/hosts/GitHub/repositories/BK-SCOSS%2Fscoss/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":275821047,"owners_count":25534826,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-18T02:00:09.552Z","response_time":77,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["code-similarity","moss","plagiarism-detection","scoss"],"created_at":"2025-09-18T19:41:59.518Z","updated_at":"2025-09-18T19:42:02.355Z","avatar_url":"https://github.com/BK-SCOSS.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# scoss\nA Source Code Similarity System - SCOSS\n\nThere are four supported metrics:\n\n* `count_operator`: A metric that counts operators in the source code to calculate a similarity score.\n* `set_operator`: A metric that checks the presence of operators in the source code to calculate a similarity score.\n* `hash_operator`: A metric that uses combinations of adjacent operators to calculate a similarity score.\n* `SMoss`: A wrapper of [MOSS](http://theory.stanford.edu/~aiken/moss/) (the same as `mosspy`).\n\n## Installation\nThis package requires `python 3.6` or later.\n```sh\npip install scoss\n```\n\n## Usage\nYou can use SCOSS as a command-line interface, as a library in your project, or through a web-app interface.\n\n### Command Line Interface (CLI)\nSee the documentation by passing the ```--help``` argument.\n```\nscoss 
--help\nUsage: scoss [OPTIONS]\n\nOptions:\n  -i, --input-dir TEXT      Input directory.  [required]\n  -o, --output-dir TEXT           Output directory.\n  -tc, --threshold-combination [AND|OR]\n                                  AND: All metrics are greater than threshold.\n                                  OR: At least 1 metric is greater than\n                                  threshold.\n\n  -mo, --moss FLOAT RANGE         Use moss metric and set up moss threshold.\n  -co, --count-operator FLOAT RANGE\n                                  Use count operator metric and set up count\n                                  operator threshold.\n\n  -so, --set-operator FLOAT RANGE\n                                  Use set operator metric and set up set\n                                  operator threshold.\n\n  -ho, --hash-operator FLOAT RANGE\n                                  Use hash operator metric and set up hash\n                                  operator threshold.\n\n  --help                          Show this message and exit.\n```\nTo get a plagiarism report for a directory containing source code files, pass the ```-i/--input-dir``` option. Add at least one similarity metric from [```-mo/--moss```, ```-co/--count-operator```, ```-so/--set-operator```, ```-ho/--hash-operator```] together with its threshold (in the range [0,1]). If you use two or more metrics, define how they are combined with ```-tc/--threshold-combination``` (```AND``` is used by default).\n\nBasic command: ```scoss -i path/to/source_code_dir/ -tc OR -co 0.1 -ho 0.1 -mo 0.1 -o another_path/to/plagiarism_report/```\n\n### Using as a library\n\n1. Define a `Scoss` object and register some metrics:\n```python\nfrom scoss import Scoss\nsc = Scoss(lang='cpp')\n# only show pairs whose similarity score \u003e threshold\nsc.add_metric('count_operator', threshold=0.7)\nsc.add_metric('set_operator', threshold=0.5)\n```\n\n2. 
Register source files with the `Scoss` object defined above:\n```python\nsc.add_file('./tests/data/a.cpp')\nsc.add_file('./tests/data/b.cpp')\nsc.add_file('./tests/data/c.cpp')\n# or add files matching a wildcard pattern\nsc.add_file_by_wildcard('./tests/data/problem_A_*.cpp')\n```\n\n3. Run `Scoss` and get results:\n```python\nsc.run()\n# filter results by combining the thresholds of the different metrics (and_thresholds)\nprint(sc.get_matches(and_thresholds=True))\n```\n\nThe same behaviour is defined for `SMoss`: you can create an `SMoss` object to use the MOSS system.\n\n### Web-app interface\nPlease check our web-app interface [here](https://github.com/ngocjr7/scoss_webapp).\n\n\n## Issues\nThis project is in development. If you find any issues, please create an issue [here](https://github.com/ngocjr7/scoss/issues).\n\n## Contributors\n[Ngoc Bui](https://github.com/ngocjr7), [Thai Do](https://github.com/Dec1mo), [Tran Vien](https://github.com/tranvien98).\n\n## Acknowledgements\nThis project is sponsored and led by Prof. Do Phan Thuan, Hanoi University of Science and Technology.\n\nPart of this code adapts the source code at https://github.com/soachishti/moss.py as the baseline for `SMoss`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbk-scoss%2Fscoss","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbk-scoss%2Fscoss","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbk-scoss%2Fscoss/lists"}