https://github.com/qw3ry/sigclone
A tool to detect source clones based on method signatures
clone-detection code-quality kotlin research-project static-analysis
- Host: GitHub
- URL: https://github.com/qw3ry/sigclone
- Owner: qw3ry
- License: mpl-2.0
- Created: 2020-02-24T11:41:09.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2021-02-05T11:23:44.000Z (almost 4 years ago)
- Last Synced: 2024-11-07T23:43:02.564Z (3 months ago)
- Topics: clone-detection, code-quality, kotlin, research-project, static-analysis
- Language: Java
- Homepage:
- Size: 128 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# SigClone - the signature based clone detector
SigClone is a clone detector comparing only method signatures.
This allows the detection of truly semantic clones.

## Building
SigClone is built using Gradle.
You need to build two targets: `jar` and `librariesJar`.
The former builds only the source code; the latter creates a jar containing its dependencies.

## Usage
To execute SigClone, run `java -jar sigclone.jar --help`.
The libraries jar created during the build should be included in the classpath automatically.
When called with `--help`, SigClone displays a help message that should guide you through further usage.

## How it works
SigClone is the result of my master's thesis, which describes the details.
It is not currently published, but if you are interested, reach out to me and I can email you a copy.

A short overview:
SigClone extracts the method signatures, consisting of the return type, the method identifier, and the parameters, each consisting of a type and a name.
The implicit `this`-parameter is also considered, if applicable.
The method signatures are then vectorized and finally compared to each other.
If two signatures are similar enough, according to some similarity measure and a threshold, they are considered clones.
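
The extraction and vectorization steps above can be sketched in a few lines of Java. This is a minimal illustration, not SigClone's actual implementation: the `Signature` class, the camelCase word splitting, the fixed vocabulary, and the bag-of-words count vector are all assumptions made for this example, and the implicit `this` parameter is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical model of a method signature; not SigClone's actual data structure.
final class Signature {
    final String returnType;
    final String name;
    final List<String> parameters; // alternating type and name, e.g. "int", "userId"

    Signature(String returnType, String name, String... parameters) {
        this.returnType = returnType;
        this.name = name;
        this.parameters = Arrays.asList(parameters);
    }

    // Splits every identifier of the signature into lower-case words.
    List<String> words() {
        List<String> identifiers = new ArrayList<>();
        identifiers.add(returnType);
        identifiers.add(name);
        identifiers.addAll(parameters);
        List<String> words = new ArrayList<>();
        for (String identifier : identifiers) {
            // Break camelCase at each lower-to-upper transition (assumed tokenization).
            for (String word : identifier.split("(?<=[a-z0-9])(?=[A-Z])")) {
                words.add(word.toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }
}

final class SignatureVectorizer {
    // Counts how often each vocabulary word occurs in the signature.
    // Words that are not in the vocabulary are ignored.
    static double[] vectorize(Signature signature, List<String> vocabulary) {
        double[] vector = new double[vocabulary.size()];
        for (String word : signature.words()) {
            int index = vocabulary.indexOf(word);
            if (index >= 0) {
                vector[index] += 1.0;
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        // Example vocabulary chosen by hand for this sketch.
        List<String> vocabulary =
                Arrays.asList("string", "find", "get", "user", "name", "int", "id");
        Signature a = new Signature("String", "findUserName", "int", "userId");
        Signature b = new Signature("String", "getUserName", "int", "id");
        // prints [1.0, 1.0, 0.0, 2.0, 1.0, 1.0, 1.0]
        System.out.println(Arrays.toString(vectorize(a, vocabulary)));
        // prints [1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
        System.out.println(Arrays.toString(vectorize(b, vocabulary)));
    }
}
```

The two example signatures differ in their method names but produce vectors that are close to each other, which is the property a signature-based comparison relies on.
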
SigClone supports different similarity measures, one of which uses an AI model trained for natural language processing.
The different approaches are described and evaluated in my master's thesis.

## FAQ
**What languages does SigClone support?** - Only Java, currently. Feel free to add a parser for your favorite language and file a PR.
**Does my code need to compile?** - No, compilation is not required. You can easily supply subsets of your codebase to SigClone, or even files from different projects.
**What similarity measures are supported?** - In short: Euclidean distance, relative word distance (not an official term), and cosine distance.
More detail is available in the code (look [here](https://github.com/qw3ry/sigclone/tree/master/src/main/kotlin/main/runners)), or in my thesis.
Feel free to code another similarity measure and file a PR.
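
To illustrate two of the measures named above, here is a minimal Java sketch of Euclidean distance and cosine distance on plain `double[]` vectors, combined with a threshold check. The class name `SimilarityMeasures`, the `isClone` helper, and the threshold values are hypothetical and chosen for this example only; SigClone's own measures live in the linked runners directory. The relative word distance is not shown, because its definition is only given in the code and the thesis.

```java
final class SimilarityMeasures {
    // Euclidean distance: length of the difference vector.
    static double euclideanDistance(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Cosine distance: 1 minus the cosine of the angle between a and b.
    // It depends only on the direction of the vectors, not on their length.
    static double cosineDistance(double[] a, double[] b) {
        requireSameLength(a, b);
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 1.0; // convention for this sketch: a zero vector matches nothing
        }
        return 1.0 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Two signatures count as clones when their distance does not exceed the threshold.
    static boolean isClone(double distance, double threshold) {
        return distance <= threshold;
    }

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
    }

    public static void main(String[] args) {
        // Two example count vectors for similar signatures.
        double[] a = {1, 1, 0, 2, 1, 1, 1};
        double[] b = {1, 0, 1, 1, 1, 1, 1};
        double euclidean = euclideanDistance(a, b);
        double cosine = cosineDistance(a, b);
        // The thresholds 2.0 and 0.2 are arbitrary example values.
        System.out.println("euclidean=" + euclidean + " clone=" + isClone(euclidean, 2.0));
        System.out.println("cosine=" + cosine + " clone=" + isClone(cosine, 0.2));
    }
}
```

Because the two measures live on different scales, a threshold that suits one of them generally cannot be reused for the other.
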