https://github.com/philterd/phinder-pii-opensearch-plugin
An OpenSearch plugin to redact PII from search results.
https://github.com/philterd/phinder-pii-opensearch-plugin
opensearch pii plugin redaction
Last synced: 8 months ago
JSON representation
An OpenSearch plugin to redact PII from search results.
- Host: GitHub
- URL: https://github.com/philterd/phinder-pii-opensearch-plugin
- Owner: philterd
- License: apache-2.0
- Created: 2024-11-13T19:46:56.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-02-03T21:02:58.000Z (8 months ago)
- Last Synced: 2025-02-03T22:19:23.363Z (8 months ago)
- Topics: opensearch, pii, plugin, redaction
- Language: Java
- Homepage: https://www.philterd.ai
- Size: 87.9 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
- Codeowners: .github/CODEOWNERS
Awesome Lists containing this project
README
# Phinder PII Plugin for OpenSearch
This repository is a plugin for Amazon OpenSearch that redacts PII from search results. It uses the [phileas](https://github.com/philterd/phileas/) library for redaction.
There is also a [version](https://github.com/philterd/phinder-pii-elasticsearch-plugin) available for Elasticsearch.
## Related Topics
If you are here, you may also be interested in:
* [Field-level security in OpenSearch](https://opensearch.org/docs/latest/security/access-control/field-level-security/)
* [Field masking in OpenSearch](https://opensearch.org/docs/latest/security/access-control/field-masking/)## Build and Install
To build the plugin:
```
./gradlew build
```To install the plugin:
```
/usr/share/opensearch/bin/opensearch-plugin install --batch file:/path/to/phinder-1.0.0-SNAPSHOT.zip
```## Using Docker
To quickly run OpenSearch and the plugin for development or testing:
```
docker compose build
docker compose up
```## Usage
To use the plugin, create an index and then index some documents:
```
curl -s -X PUT "http://localhost:9200/sample_index" -H 'Content-Type: application/json'
``````
curl -s -X POST "http://localhost:9200/sample_index/_doc" -H 'Content-Type: application/json' -d'
{
"name": "Another Example",
"description": "My email is something@something.com ok!"
}'curl -s -X POST "http://localhost:9200/sample_index/_doc" -H 'Content-Type: application/json' -d'
{
"name": "Yet Another Example",
"description": "No email addresses in this one"
}'curl -s -X POST "http://localhost:9200/sample_index/_doc" -H 'Content-Type: application/json' -d'
{
"name": "A Third Example",
"description": "tom@tom.com"
}'```
Next, do a search providing a filter policy and specifying which field you want to redact. (For more on policies, see Phileas' documentation on [Policies](https://philterd.github.io/phileas/filter_policies/filter_policies/).) In this example,
we are going to redact email addresses that appear in the `description` field:```
curl -s http://localhost:9200/sample_index/_search -H "Content-Type: application/json" -d'
{
"ext": {
"phinder": {
"field": "description",
"policy": "{\"identifiers\": {\"emailAddress\":{\"emailAddressFilterStrategies\":[{\"strategy\":\"REDACT\",\"redactionFormat\":\"{{{REDACTED-%t}}}\"}]}}}"
}
},
"query": {
"match_all": {}
}
}'
```The value of `field` in the request can be a single field, or a comma-separated list of fields to redact.
In the response, you will see the email address in the indexed document has been redacted:
```
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "sample_index",
"_id": "3sZ8cJQBeC9RICk83_tV",
"_score": 1,
"_source": {
"name": "Another Example",
"description": "My email is {{{REDACTED-email-address}}} ok!"
}
},
{
"_index": "sample_index",
"_id": "38Z8cJQBeC9RICk83_t1",
"_score": 1,
"_source": {
"name": "Yet Another Example",
"description": "No email addresses in this one"
}
},
{
"_index": "sample_index",
"_id": "4MZ8cJQBeC9RICk83_uI",
"_score": 1,
"_source": {
"name": "A Third Example",
"description": "{{{REDACTED-email-address}}}"
}
}
]
}
}
```## License
This code is licensed under the Apache 2.0 License. See [LICENSE.txt](LICENSE.txt).## Copyright
Copyright 2025 Philterd, LLC. See [NOTICE](NOTICE.txt) for details.