{"id":13791252,"url":"https://github.com/rosette-api/custom-processor-sample","last_synced_at":"2025-03-01T06:44:45.165Z","repository":{"id":146184496,"uuid":"290315092","full_name":"rosette-api/custom-processor-sample","owner":"rosette-api","description":"Custom processor sample for Babel Street Analytics Entity Extractor","archived":false,"fork":false,"pushed_at":"2020-09-28T18:28:01.000Z","size":10,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-01-11T20:46:27.484Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rosette-api.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-08-25T20:20:38.000Z","updated_at":"2024-11-04T18:46:42.000Z","dependencies_parsed_at":null,"dependency_job_id":"ff1d35ad-8d00-445c-aefc-0b9b73c126c9","html_url":"https://github.com/rosette-api/custom-processor-sample","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosette-api%2Fcustom-processor-sample","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosette-api%2Fcustom-processor-sample/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosette-api%2Fcustom-processor-sample/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rosette-api%2Fcustom-processor-sample/manifests","owner_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rosette-api","download_url":"https://codeload.github.com/rosette-api/custom-processor-sample/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241329413,"owners_count":19944984,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T22:00:57.915Z","updated_at":"2025-03-01T06:44:45.142Z","avatar_url":"https://github.com/rosette-api.png","language":"Java","readme":"# Rosette Entity Extractor Custom Processor Sample\n\nThe Rosette Entity Extractor supports custom processors that allow you to preprocess text at the pre-extraction phase or modify the output at the pre-redaction phase. This sample will show how a custom processor can be used to correct an entity type.\n\nRequirements:\n- Rosette Enterprise 1.17.0\n- Rosette Enterprise license for Entity Extraction and English\n- [OPTIONAL] If you start Rosette Server via docker-compose, Docker must be installed and running on your machine\n\n## Observe default behavior\n\nStart Rosette Server\n```\nexport ROSAPI_HOME=/path-to-rosette-install/server directory\n$ROSAPI_HOME/bin/launch.sh console\n```\nCall the /entities endpoint\n```\ncurl -s --request POST 'http://localhost:8181/rest/v1/entities' \\\n--header 'Content-Type: application/json' \\\n--header 'Accept: application/json' \\\n--data '{\"content\":\"My name is Karin\", \"language\":\"eng\"}' \\\n| jq .\n```\nYou will see that \"Karin\" is extracted with the incorrect type ORGANIZATION. 
The custom processor sample code corrects the type of any entity that follows the string \"My name is\" to PERSON.\n\n## Build\n\nTo build the custom processor sample, run\n```\nexport CP_SAMPLE=/path-to-this-directory\ncd $CP_SAMPLE\nmvn -Drosapi.home=$ROSAPI_HOME -P extract-rex-core-jar\nmvn -Drosapi.home=$ROSAPI_HOME -P build-custom-processor\n```\n\nThis builds `custom-processor-sample-1.0.jar` in the `target` directory.\n\n## Integrate with Rosette Server\n\nCopy `custom-processor-sample-1.0.jar` into Rosette Server\n```\ncp target/custom-processor-sample-1.0.jar $ROSAPI_HOME/launcher/bundles\n```\n\nEdit `$ROSAPI_HOME/launcher/config/rosapi/rex-factory-config.yaml` and add the following lines\n\n```\n#Custom processors to add to annotators. See AppDev guide for more details on custom processor.\ncustomProcessors:\n    - personContextAnnotator\n    - boundaryAdjustAnnotator\n    - metadataAnnotator\n\n#Register a custom processor class. See AppDev guide for more details on custom processor.\ncustomProcessorClasses:\n    - sample.SampleCustomProcessor\n```\n\n## Run\n\nStart Rosette Server\n```\n$ROSAPI_HOME/bin/launch.sh console\n```\nExecute the same curl command as before:\n```\ncurl -s --request POST 'http://localhost:8181/rest/v1/entities' \\\n--header 'Content-Type: application/json' \\\n--header 'Accept: application/json' \\\n--data '{\"content\":\"My name is Karin\", \"language\":\"eng\"}' \\\n| jq .\n```\n\nThe output is\n```\n{\n  \"entities\": [\n    {\n      \"type\": \"PERSON\",\n      \"mention\": \"Karin\",\n      \"normalized\": \"Karin\",\n      \"count\": 1,\n      \"mentionOffsets\": [\n        {\n          \"startOffset\": 11,\n          \"endOffset\": 16\n        }\n      ],\n      \"entityId\": \"T0\",\n      \"confidence\": 0.03511071\n    }\n  ]\n}\n```\nNotice that the entity type is now correctly reported as PERSON.\n\n\n## The Docker way\n\n1. 
Edit the `docker-compose.yaml` file, adding the following files to the volumes section.\n```\nvolumes:\n  - rosette-roots-vol:/rosette/server/roots:ro\n  - ${ROSAPI_LICENSE_PATH}:/rosette/server/launcher/config/rosapi/rosette-license.xml:ro\n  - ${CP_SAMPLE}/config/rex-factory-config.yaml:/rosette/server/launcher/config/rosapi/rex-factory-config.yaml:ro\n  - ${CP_SAMPLE}/target/custom-processor-sample-1.0.jar:/rosette/server/launcher/bundles/custom-processor-sample-1.0.jar:ro\n```\n\n2. Start the Rosette Server Docker container\n```\nROSAPI_LICENSE_PATH=\u003cpath-to-license\u003e/rosette-license.xml docker-compose up\n```\n\nCall the /entities endpoint with the same command as shown above. You should see \"Karin\" extracted\nas the correct entity type, PERSON.\n","funding_links":[],"categories":["Sample code"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frosette-api%2Fcustom-processor-sample","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frosette-api%2Fcustom-processor-sample","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frosette-api%2Fcustom-processor-sample/lists"}