https://github.com/sandeepkundalwal/automated-plagiarism-detector
An automated plagiarism detector that handles unzipping, generates plagiarism reports, and scrapes the reports for plagiarism above a threshold.
gradle html-scraper java jsoup maven plagiarism-detector python3 teaching-assistant unzipping-files
- Host: GitHub
- URL: https://github.com/sandeepkundalwal/automated-plagiarism-detector
- Owner: SandeepKundalwal
- License: mit
- Created: 2023-06-12T10:49:22.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2023-10-07T06:59:11.000Z (about 2 years ago)
- Last Synced: 2025-06-08T08:03:11.346Z (5 months ago)
- Topics: gradle, html-scraper, java, jsoup, maven, plagiarism-detector, python3, teaching-assistant, unzipping-files
- Language: Java
- Homepage:
- Size: 101 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Automated Plagiarism Detector
An automated plagiarism detector that handles unzipping, generates plagiarism reports, and scrapes the reports for plagiarism above a minimum threshold.
### The project consists of three modules:
- Staging Files: All the zip files are unzipped, and the files in the unzipped folder are segregated by name, which follows the format {rollno.}_{questionno.}. Multiple directories are created based on the number of questions.
- Plagiarism Script: Checks each file for plagiarism against all the other files in the same directory and reports the percentage of plagiarism for each file. Generates an HTML file containing the plagiarism report.
  - run script: `python3 plag.py {Assignment file location} {Report generation location}`
- Scraping Plagiarism Report: Scrapes the percentage of plagiarism from each report generated by the Plagiarism Script and produces a .txt file that lists the roll numbers of all students whose plagiarism is above the minimum allowed threshold.
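
The staging and threshold steps described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names `stage_files` and `rolls_above_threshold`, the `Q<questionno>` directory naming, and the `{file name: percentage}` input are all assumptions made for the example. Only the `{rollno.}_{questionno.}` file-name format comes from the description.

```python
import re
import shutil
from pathlib import Path

# File stems are expected to look like "<rollno>_<questionno>", e.g. "101_1".
NAME_PATTERN = re.compile(r"^(?P<roll>[^_]+)_(?P<question>[^_]+)$")


def stage_files(unzipped_dir, staging_dir):
    """Copy each matching file into staging_dir/Q<questionno>/ (hypothetical layout)."""
    staged = []
    # sorted() materializes the listing first, so new copies are never re-visited.
    for path in sorted(Path(unzipped_dir).rglob("*")):
        if not path.is_file():
            continue
        match = NAME_PATTERN.match(path.stem)
        if match is None:
            continue  # skip files that do not follow the naming format
        target_dir = Path(staging_dir) / f"Q{match['question']}"
        target_dir.mkdir(parents=True, exist_ok=True)
        staged.append(Path(shutil.copy2(path, target_dir / path.name)))
    return staged


def rolls_above_threshold(scores, threshold):
    """Return sorted roll numbers whose plagiarism percentage exceeds threshold.

    `scores` maps a file name to its plagiarism percentage; how those
    percentages are read from the HTML report is outside this sketch.
    """
    flagged = set()
    for file_name, percent in scores.items():
        match = NAME_PATTERN.match(Path(file_name).stem)
        if match and percent > threshold:
            flagged.add(match["roll"])
    return sorted(flagged)
```

For example, `rolls_above_threshold({"101_1.py": 80.0, "102_1.py": 10.0}, 50)` would flag only roll number `101`. Whether the project itself treats a score equal to the threshold as a violation is not stated, so the strict comparison here is a guess.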