{"id":21229362,"url":"https://github.com/shibam120302/plagiarism_checker_python","last_synced_at":"2025-03-15T02:14:17.604Z","repository":{"id":156954093,"uuid":"589858238","full_name":"shibam120302/Plagiarism_checker_Python","owner":"shibam120302","description":"A python project for checking plagiarism of documents based on cosine similarity","archived":false,"fork":false,"pushed_at":"2023-01-20T03:43:58.000Z","size":55,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-21T17:50:41.333Z","etag":null,"topics":["ml","nlp","python","python-library","python-programming","python-projects"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shibam120302.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-01-17T05:18:30.000Z","updated_at":"2023-08-15T17:22:04.000Z","dependencies_parsed_at":null,"dependency_job_id":"b9b4403b-995c-4471-8d25-78857dcb972a","html_url":"https://github.com/shibam120302/Plagiarism_checker_Python","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shibam120302%2FPlagiarism_checker_Python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shibam120302%2FPlagiarism_checker_Python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shibam120302%2FPlagiarism_checker_Python/releases","manifests_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/shibam120302%2FPlagiarism_checker_Python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shibam120302","download_url":"https://codeload.github.com/shibam120302/Plagiarism_checker_Python/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243672484,"owners_count":20328767,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ml","nlp","python","python-library","python-programming","python-projects"],"created_at":"2024-11-20T23:26:59.667Z","updated_at":"2025-03-15T02:14:17.578Z","avatar_url":"https://github.com/shibam120302.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Plagiarism Checker Python\n\nThis repo contains the source code of a Python script that detects plagiarism in text documents using **cosine similarity**.\n\n![PL](https://i.ytimg.com/vi/lRzC3w2NDg0/maxresdefault.jpg)\n\n## How is it done?\n\nYou might be wondering how plagiarism detection on textual data is done. It isn't as complicated as you may think.\n\nComputers are good with numbers, so to compute the similarity between two text documents, the raw text is first transformed into vectors (arrays of numbers). Basic vector math, here the cosine of the angle between two vectors, then gives the similarity between the documents.\n\nThis repo contains a basic example of how to do that.\n\n\n## Getting started\n\nTo get started with the code in this repo, you need to either *clone* or *download* this repo 
into your machine, as shown below:\n\n```bash\ngit clone https://github.com/shibam120302/Plagiarism_checker_Python\n```\n\n## Dependencies\n\nBefore you start working with the source code, install the dependencies as shown below:\n\n```bash\npip3 install -r requirements.txt\n```\n\n## Running the App\n\nTo run this code, place your text documents with the **.txt** extension in the project directory. When you run the script, it automatically loads all documents with that extension and computes the similarity between each pair, as shown below:\n\n```bash\n$-\u003e cd Plagiarism_checker_Python\n$ Plagiarism_checker_Python-\u003e python3 app.py\n('john.txt', 'juma.txt', 0.5465972177348937)\n('fatma.txt', 'john.txt', 0.14806887549598566)\n('fatma.txt', 'juma.txt', 0.18643448370323362)\n\n```\n\n## A Python library?\n\nWould you rather use a Python library to compare strings and documents without spending time writing the vectorizers yourself? Then take a look at [Pysimilar](https://github.com/Kalebu/pysimilar).\n\n## Explore it\n\nExplore it and adapt it to your own use case. If you have any questions, feel free to reach out to me directly at *isaackeinstein(at)gmail.com*.\n\n## Issues\n\nIf you have any difficulties or issues while trying to run the script,\nyou can raise them in the repository's issues. 
\n\n## Pull Requests\n\nIf you have something to add, I welcome pull requests with improvements. Your helpful contributions will be merged as soon as possible.\n\n## Give it a Star\n\nIf you find this repo useful, give it a star so that more people can get to know it.\n\n## Credits\n\nAll the credits to [Shibam](https://github.com/shibam120302)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshibam120302%2Fplagiarism_checker_python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshibam120302%2Fplagiarism_checker_python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshibam120302%2Fplagiarism_checker_python/lists"}