https://github.com/chriniko13/eresearch-author-matcher
- Host: GitHub
- URL: https://github.com/chriniko13/eresearch-author-matcher
- Owner: chriniko13
- Created: 2019-04-09T23:35:06.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2022-08-18T19:06:18.000Z (about 2 years ago)
- Last Synced: 2023-03-04T17:47:11.620Z (over 1 year ago)
- Topics: dropwizard-metrics, guava, jsonassert, lombok, spring-boot, spring-boot-test, spring-boot-undertow, spring-boot-web
- Language: Java
- Size: 77.1 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
# Eresearch Author Matcher Service #
### Description
This service accepts two author names (name = firstname + initials + surname) and compares them for similarity using several algorithms, such as Levenshtein and Simon White. The response contains the similarity score that each algorithm produces for the two names.

This service makes heavy use of the Simmetrics library: https://github.com/Simmetrics/simmetrics

Our service uses the following algorithms:
* SIMON_WHITE
* LEVENSHTEIN
* JARO
* JARO_WINKLER
* COSINE_SIMILARITY
* EUCLIDEAN_DISTANCE
* BLOCK_DISTANCE
* DAMERAU_LEVENSHTEIN
* DICE
* GENERALIZED_JACCARD
* IDENTITY
* JACCARD
* LONGEST_COMMON_SUBSEQUENCE
* LONGEST_COMMON_SUBSTRING
* MONGE_ELKAN
* NEEDLEMAN_WUNCH
* OVERLAP_COEFFICIENT
* Q_GRAMS_DISTANCE
* SMITH_WATERMAN
* SMITH_WATERMAN_GOTOH

### Integration Tests
* Execute: `mvn clean verify`
### Create Docker Image
* Execute: `mvn clean install -DskipITs=true`
* Execute: `docker build -t chriniko/eresearch-author-matcher:1.0 .` in order to build the docker image.
* Fast: `mvn clean install -DskipITs=true && docker build -t chriniko/eresearch-author-matcher:1.0 .`
### How to run service (not dockerized)
* Execute: `docker-compose up`
* Two options:
    * Execute:
        * `mvn clean install -DskipITs=true`
        * `java -jar -Dspring.profiles.active=dev target/eresearch-author-matcher-1.0-boot.jar`
    * Execute:
        * `mvn spring-boot:run -Dspring.profiles.active=dev`
* (Optional) When you finish: `docker-compose down`
### How to run service (dockerized)
* Uncomment the section in the `docker-compose.yml` file for service: `eresearch-author-matcher:`
* Execute: `mvn clean install -DskipITs=true`
* Execute: `docker-compose build`
* Execute: `docker-compose up`
* (Optional) When you finish: `docker-compose down`
### Example Request
```json
{
  "first-author-name": {
    "firstname": "Dimitris",
    "initials": "",
    "surname": "Piliotis"
  },
  "second-author-name": {
    "firstname": "D.",
    "initials": "",
    "surname": "Piliotis"
  }
}
```
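
As a rough illustration of how a client might send the request above, the sketch below builds the JSON body and a `java.net.http.HttpRequest` (Java 11+). The endpoint URL and the use of POST are assumptions, since this README does not state the service's host, port, path, or HTTP method. `MatchRequestSketch` and its helper methods are hypothetical names.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class MatchRequestSketch {

    // ASSUMPTION: placeholder endpoint; the README does not give host, port, or path.
    static final String ENDPOINT = "http://localhost:8080/author-matcher";

    // Minimal escaping of backslashes and double quotes for JSON string values.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Serializes one author name using the field names from the example request.
    static String nameJson(String firstname, String initials, String surname) {
        return String.format(
                "{\"firstname\":\"%s\",\"initials\":\"%s\",\"surname\":\"%s\"}",
                escape(firstname), escape(initials), escape(surname));
    }

    // Wraps two serialized names in the top-level request object.
    static String requestBody(String firstNameJson, String secondNameJson) {
        return "{\"first-author-name\":" + firstNameJson
                + ",\"second-author-name\":" + secondNameJson + "}";
    }

    // ASSUMPTION: the service accepts a JSON POST; the README does not name the method.
    static HttpRequest build(String json) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }
}
```

To actually send it, the built request could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` once the real endpoint is known.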
### Example Response
```json
{
"operation-result": true,
"process-finished-date": 1487182289.869,
"comparison-results": {
"DAMERAU_LEVENSHTEIN": {
"comparison-result": 0.5625,
"comparison-result-floor": 0.56,
"comparison-result-ceil": 0.57
},
"COSINE_SIMILARITY": {
"comparison-result": 0,
"comparison-result-floor": 0,
"comparison-result-ceil": 0
},
"EUCLIDEAN_DISTANCE": {
"comparison-result": 0,
"comparison-result-floor": 0,
"comparison-result-ceil": 0
},
"MONGE_ELKAN": {
"comparison-result": 0.800000011920929,
"comparison-result-floor": 0.8,
"comparison-result-ceil": 0.81
},
"LONGEST_COMMON_SUBSTRING": {
"comparison-result": 0.5,
"comparison-result-floor": 0.5,
"comparison-result-ceil": 0.5
},
"SMITH_WATERMAN": {
"comparison-result": 0.800000011920929,
"comparison-result-floor": 0.8,
"comparison-result-ceil": 0.81
},
"OVERLAP_COEFFICIENT": {
"comparison-result": 0,
"comparison-result-floor": 0,
"comparison-result-ceil": 0
},
"JARO": {
"comparison-result": 0.7097222208976746,
"comparison-result-floor": 0.7,
"comparison-result-ceil": 0.71
},
"NEEDLEMAN_WUNCH": {
"comparison-result": 0.75,
"comparison-result-floor": 0.75,
"comparison-result-ceil": 0.75
},
"BLOCK_DISTANCE": {
"comparison-result": 0,
"comparison-result-floor": 0,
"comparison-result-ceil": 0
},
"LEVENSHTEIN": {
"comparison-result": 0.5625,
"comparison-result-floor": 0.56,
"comparison-result-ceil": 0.57
},
"GENERALIZED_JACCARD": {
"comparison-result": 0,
"comparison-result-floor": 0,
"comparison-result-ceil": 0
},
"JACCARD": {
"comparison-result": 0,
"comparison-result-floor": 0,
"comparison-result-ceil": 0
},
"IDENTITY": {
"comparison-result": 0,
"comparison-result-floor": 0,
"comparison-result-ceil": 0
},
"LONGEST_COMMON_SUBSEQUENCE": {
"comparison-result": 0.5625,
"comparison-result-floor": 0.56,
"comparison-result-ceil": 0.57
},
"JARO_WINKLER": {
"comparison-result": 0.7387499809265137,
"comparison-result-floor": 0.73,
"comparison-result-ceil": 0.74
},
"SMITH_WATERMAN_GOTOH": {
"comparison-result": 0.800000011920929,
"comparison-result-floor": 0.8,
"comparison-result-ceil": 0.81
},
"DICE": {
"comparison-result": 0,
"comparison-result-floor": 0,
"comparison-result-ceil": 0
},
"Q_GRAMS_DISTANCE": {
"comparison-result": 0.6000000238418579,
"comparison-result-floor": 0.6,
"comparison-result-ceil": 0.61
},
"SIMON_WHITE": {
"comparison-result": 0.5833333134651184,
"comparison-result-floor": 0.58,
"comparison-result-ceil": 0.59
}
},
"comparison-input": {
"first-author-name": {
"firstname": "Dimitris",
"initials": "",
"surname": "Piliotis"
},
"second-author-name": {
"firstname": "D.",
"initials": "",
"surname": "Piliotis"
}
}
}
```
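
The `LEVENSHTEIN` entry above can be reproduced by hand, which may help when interpreting the scores. The sketch below is plain Java, not the Simmetrics implementation. It assumes that each name is concatenated without separators (`DimitrisPiliotis`, `D.Piliotis`), that similarity is `1 - distance / max(length)`, and that the floor and ceil values are the score rounded down and up to two decimals. These assumptions are inferred from the sample values and are not confirmed by the service code. `LevenshteinSketch` is a hypothetical name.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class LevenshteinSketch {

    // Classic dynamic-programming edit distance: insert, delete, substitute, each cost 1.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty prefix of a into b[0..j) takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // ASSUMPTION: normalization by the longer string's length.
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) distance(a, b) / max;
    }

    // ASSUMPTION: floor and ceil are taken at two decimal places.
    static double floor2(double v) {
        return BigDecimal.valueOf(v).setScale(2, RoundingMode.FLOOR).doubleValue();
    }

    static double ceil2(double v) {
        return BigDecimal.valueOf(v).setScale(2, RoundingMode.CEILING).doubleValue();
    }

    public static void main(String[] args) {
        double s = similarity("DimitrisPiliotis", "D.Piliotis");
        System.out.println(s + " " + floor2(s) + " " + ceil2(s));
    }
}
```

Under these assumptions the edit distance is 7 over a maximum length of 16, giving 1 - 7/16 = 0.5625, with floor 0.56 and ceil 0.57, which matches the `LEVENSHTEIN` values in the sample response.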