{"id":34114209,"url":"https://github.com/privateai/pai-thin-client","last_synced_at":"2026-04-07T04:32:09.547Z","repository":{"id":154990817,"uuid":"623981411","full_name":"privateai/pai-thin-client","owner":"privateai","description":"A python client used to interact with the Private AI's API","archived":false,"fork":false,"pushed_at":"2026-01-16T13:39:59.000Z","size":640,"stargazers_count":22,"open_issues_count":0,"forks_count":3,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-01-17T04:06:23.515Z","etag":null,"topics":["anonymization","de-identification","deidentification","dlp","gdpr","hippa","redact","redaction","synthetic-data"],"latest_commit_sha":null,"homepage":"https://docs.private-ai.com/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/privateai.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-04-05T13:49:16.000Z","updated_at":"2026-01-16T13:38:11.000Z","dependencies_parsed_at":"2024-02-01T19:42:06.023Z","dependency_job_id":"a6093e91-4d7b-442d-afd4-a35971e1a349","html_url":"https://github.com/privateai/pai-thin-client","commit_stats":null,"previous_names":[],"tags_count":18,"template":false,"template_full_name":null,"purl":"pkg:github/privateai/pai-thin-client","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/privateai%2Fpai-thin-client","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/privateai%2Fpai-thin-c
lient/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/privateai%2Fpai-thin-client/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/privateai%2Fpai-thin-client/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/privateai","download_url":"https://codeload.github.com/privateai/pai-thin-client/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/privateai%2Fpai-thin-client/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31500397,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-07T03:10:19.677Z","status":"ssl_error","status_checked_at":"2026-04-07T03:10:13.982Z","response_time":105,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anonymization","de-identification","deidentification","dlp","gdpr","hippa","redact","redaction","synthetic-data"],"created_at":"2025-12-14T19:28:05.332Z","updated_at":"2026-04-07T04:32:09.540Z","avatar_url":"https://github.com/privateai.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Private AI Python Client\n\nA Python client library for communicating with the Private AI API. This document provides information about how to best use the client. 
For more information, see Private AI's [API Documentation.][1]\n\n### Quick Links\n\n1. [Installation](#installation)\n1. [Quick Start](#quick-start)\n1. [Running the tests](#testing)\n1. [Working with the Client](#client)\n1. [Request Objects](#request-objects)\n1. [Sample Use](#sample-use)\n\n### Installation \u003ca name=installation\u003e\u003c/a\u003e\n\n```\npip install privateai_client\n```\n\n### Quick Start \u003ca name=quick-start\u003e\u003c/a\u003e\n\n```python\n\nfrom privateai_client import PAIClient\nfrom privateai_client import request_objects\n\nclient = PAIClient(url=\"http://localhost:8080\")\ntext_request = request_objects.process_text_obj(text=[\"My sample name is John Smith\"])\nresponse = client.process_text(text_request)\n\nprint(text_request.text)\nprint(response.processed_text)\n\n\n```\n\nOutput:\n\n```\n['My sample name is John Smith']\n['My sample name is [NAME_1]']\n```\n\n### Running the tests \u003ca name=testing\u003e\u003c/a\u003e\n\nWe use [pytest](https://docs.pytest.org/) to run our tests in the tests folder.\n\nTo run from the command line, ensure you have pytest installed, and then run `pytest` from the main project folder.\n\n```shell\npip install -r requirements.dev.txt\npytest\n```\n\nAlternatively, you can automatically run all tests from the Testing window in Visual Studio Code.\n\n### Working With The Client \u003ca name=client\u003e\u003c/a\u003e\n\n#### Initializing the Client\n\nThe PAI client requires a scheme, host, and optional port to initialize. 
\nAlternatively, a full url can be used.\nOnce created, the connection can be tested with the client's `ping` function\n\n```python\nscheme = 'http'\nhost = 'localhost'\nport= '8080'\nclient = PAIClient(scheme, host, port)\n\nclient.ping()\n\n\nurl = \"http://localhost:8080\"\nclient = PAIClient(url=url)\n\nclient.ping()\n```\n\nOutput:\n\n```\nTrue\nTrue\n```\n\n#### Adding Authorization to the Client\n\n```python\nfrom privateai_client import PAIClient\n# On initialization\nclient = PAIClient(url=\"http://localhost:8080\", api_key='testkey')\n\n# After initialization\nclient = PAIClient(url=\"http://localhost:8080\")\nclient.ping()\nclient.add_api_key(\"testkey\")\nclient.ping()\n```\nOutput:\n\n```\nThe request returned with a 401 Unauthorized\nTrue\n```\n\n\n#### Making Requests\n\nOnce initialized the client can be used to make any request listed in the [Private-AI documentation][1]\n\nAvailable requests:\n\n| Client Function          | Endpoint                |\n| ------------------------ | ----------------------- |\n| `get_version()`          | `/`                     |\n| `ping()`                 | `/healthz`              |\n| `get_metrics()`          | `/metrics`              |\n| `get_diagnostics()`      | `/diagnostics`          |\n| `process_text()`         | `/process/text`         |\n| `process_files_uri()`    | `/process/files/uri`    |\n| `process_files_base64()` | `/process/files/base64` |\n| `bleep()`                | `/bleep`                |\n\nRequests can be made using dictionaries:\n\n```python\nsample_text = [\"This is John Smith's sample dictionary request\"]\ntext_dict_request = {\"text\": sample_text}\n\nresponse = client.process_text(text_dict_request)\nprint(response.processed_text)\n```\n\nOutput:\n\n```\n[\"This is [NAME_1]'s sample dictionary request\"]\n```\n\nor using built-in request objects:\n\n```python\nfrom privateai_client import request_objects\n\nsample_text = \"This is John Smith's sample process text object 
request\"\ntext_request_object = request_objects.process_text_obj(text=[sample_text])\n\nresponse = client.process_text(text_request_object)\nprint(response.processed_text)\n```\n\nOutput:\n\n```\n[\"This is [NAME_1]'s sample process text object request\"]\n```\n\n### Request Objects \u003ca name=request-objects\u003e\u003c/a\u003e\n\nRequest objects are a simple way of creating request bodies without the tediousness of writing dictionaries. Every POST request (as listed in the [Private-AI documentation][1]) has its own request object.\n\n```python\nfrom privateai_client import request_objects\n\nsample_obj = request_objects.file_uri_obj(uri='path/to/file.jpg')\nsample_obj.uri\n```\n\nOutput:\n\n```\n'path/to/file.jpg'\n```\n\nAdditionally, there are request objects for each nested dictionary of a request:\n\n```python\nfrom privateai_client import request_objects\n\nsample_text = \"This is John Smith from Sample Company to show a sample process text object request where organizations won't be removed, but John will be recognized as the same entity\"\n\n# sub-dictionary of entity_detection\nsample_entity_type_selector = request_objects.entity_type_selector_obj(type=\"DISABLE\", value=[\"ORGANIZATION\"])\n\n# sub-dictionary of a process text request\nsample_entity_detection = request_objects.entity_detection_obj(entity_types=[sample_entity_type_selector])\n\n# sub-dictionary of a process text request\nsample_processed_text = request_objects.processed_text_obj(type=\"MARKER\", pattern=\"[UNIQUE_NUMBERED_ENTITY_TYPE]\", coreference_resolution=\"model_prediction\")\n\n# request object created using the sub-dictionaries\nsample_request = request_objects.process_text_obj(text=[sample_text], entity_detection=sample_entity_detection, processed_text=sample_processed_text)\nresponse = client.process_text(sample_request)\nprint(response.processed_text)\n```\n\nOutput:\n\n```\n[\"This is [NAME_1] from Sample Company to show a sample process text object request 
where organizations won't be removed, but [NAME_1] will be recognized as the same entity\"]\n```\n\n#### Building Request Objects\n\nRequest objects can be initialized by passing the values required for the request as arguments, or from a dictionary using the object's `fromdict` function. Any object can be created as per the [Private AI documentation][1].\n\n```python\n# Passing arguments\nsample_data = \"JVBERi0xLjQKJdPr6eEKMSAwIG9iago8PC9UaXRsZSAoc2FtcGxlKQovUHJvZHVj...\"\nsample_content_type = \"application/pdf\"\n\nsample_file_obj = request_objects.file_obj(data=sample_data, content_type=sample_content_type)\n\n# Passing a dictionary using .fromdict()\nsample_dict = {\"data\": \"JVBERi0xLjQKJdPr6eEKMSAwIG9iago8PC9UaXRsZSAoc2FtcGxlKQovUHJvZHVj...\",\n               \"content_type\": \"application/pdf\"}\n\nsample_file_obj2 = request_objects.file_obj.fromdict(sample_dict)\n```\n\nRequest objects can also be formatted as dictionaries, using the request object's `to_dict()` function:\n\n```python\nfrom privateai_client import request_objects\n\nsample_text = \"Sample text.\"\nsample_accuracy = \"standard\"\n\n# Create the nested request objects\nsample_entity_type_selector = request_objects.entity_type_selector_obj(type=\"DISABLE\", value=['HIPAA_SAFE_HARBOR'])\nsample_entity_detection = request_objects.entity_detection_obj(\n    entity_types=[sample_entity_type_selector],\n    accuracy=sample_accuracy\n)\n\n# Create the request object\nsample_request = request_objects.process_text_obj(text=[sample_text], entity_detection=sample_entity_detection)\n\n# All nested request objects are also formatted\nprint(sample_request.to_dict())\n```\n\nOutput:\n\n```\n{\n 'text': ['Sample text.'],\n 'link_batch': False,\n 'entity_detection': {'accuracy': 'standard', 'entity_types': [{'type': 'DISABLE', 'value': ['HIPAA_SAFE_HARBOR']}], 'filter': [], 'return_entity': True},\n 'processed_text': {'type': 'MARKER', 'pattern': '[UNIQUE_NUMBERED_ENTITY_TYPE]'}\n}\n```\n\n### 
Sample Use \u003ca name=sample-use\u003e\u003c/a\u003e\n\n#### Processing a directory of files\n\n```python\nfrom privateai_client import PAIClient\nfrom privateai_client.objects import request_objects\nimport os\nimport logging\n\nfile_dir = \"/path/to/file/directory\"\nclient = PAIClient(url=\"http://localhost:8080\")\nfor file_name in os.listdir(file_dir):\n    filepath = os.path.join(file_dir, file_name)\n    if not os.path.isfile(filepath):\n        continue\n    req_obj = request_objects.file_uri_obj(uri=filepath)\n    # NOTE: this method of file processing requires the container to have the input and output directories mounted\n    resp = client.process_files_uri(req_obj)\n```\n\n#### Processing a Base64 file\n\n```python\nfrom privateai_client import PAIClient\nfrom privateai_client.objects import request_objects\nimport base64\nimport os\nimport logging\n\nfile_dir = \"/path/to/your/file\"\nfile_name = 'sample_file.pdf'\nfilepath = os.path.join(file_dir,file_name)\nfile_type= \"type/of_file\" #eg. 
application/pdf\nclient = PAIClient(url=\"http://localhost:8080\")\n\n# Read from file\nwith open(filepath, \"rb\") as b64_file:\n    file_data = base64.b64encode(b64_file.read())\n    file_data = file_data.decode(\"ascii\")\n\n# Make the request\nfile_obj = request_objects.file_obj(data=file_data, content_type=file_type)\nrequest_obj = request_objects.file_base64_obj(file=file_obj)\nresp = client.process_files_base64(request_object=request_obj)\n\n# Write to file\nwith open(os.path.join(file_dir,f\"redacted-{file_name}\"), 'wb') as redacted_file:\n    processed_file = resp.processed_file.encode(\"ascii\")\n    processed_file = base64.b64decode(processed_file, validate=True)\n    redacted_file.write(processed_file)\n```\n\n#### Bleep an audio file\n\n```python\nfrom privateai_client import PAIClient\nfrom privateai_client.objects import request_objects\nimport base64\nimport os\nimport logging\n\nfile_dir = \"/path/to/your/file\"\nfile_name = 'sample_file.mp3'\nfilepath = os.path.join(file_dir,file_name)\nfile_type= \"type/of_file\" #eg. 
audio/mp3 or audio/wav\nclient = PAIClient(url=\"http://localhost:8080\")\n\n# Read from file\nwith open(filepath, \"rb\") as b64_file:\n    file_data = base64.b64encode(b64_file.read())\n    file_data = file_data.decode(\"ascii\")\n\n# Make the request\nfile_obj = request_objects.file_obj(data=file_data, content_type=file_type)\ntimestamp = request_objects.timestamp_obj(start=1.12, end=2.14)\nrequest_obj = request_objects.bleep_obj(file=file_obj, timestamps=[timestamp])\n\nresp = client.bleep(request_object=request_obj)\n\n# Write to file\nwith open(os.path.join(file_dir,f\"redacted-{file_name}\"), 'wb') as redacted_file:\n    processed_file = resp.bleeped_file.encode(\"ascii\")\n    processed_file = base64.b64decode(processed_file, validate=True)\n    redacted_file.write(processed_file)\n```\n\n#### Working with structured data\n\nRedacting a data frame column by column\n\n##### NOTE: When de-identifying smaller strings of structured data, more accurate results can be achieved by passing in the whole column as a string (including the header) and a delimiter. For example, making a request row by row for a column named SSN will return data identified as PHONE_NUMBER, even when the header is included.\n\n```python\n# Working with data frames\nimport pandas as pd\nfrom privateai_client import PAIClient\nfrom privateai_client.objects import request_objects\n\nclient = PAIClient(url=\"http://localhost:8080\")\ndata_frame = pd.DataFrame(\n    {\n        \"Name\": [\n            \"Braund, Mr. Owen Harris\",\n            \"Allen, Mr. William Henry\",\n            \"Bonnell, Miss. 
Elizabeth\",\n        ],\n        \"Age\": [22, 35, 58],\n        \"Sex\": [\"male\", \"male\", \"female\"],\n    }\n)\nprint(data_frame)\ntext_req = request_objects.process_text_obj(text=[])\nfor column in data_frame.columns:\n    text_req.text.append(f\"{column}:{' | '.join([str(row) for row in data_frame[column]])}\")\n\nresp = client.process_text(text_req)\nredacted_data = dict()\nfor row in resp.processed_text:\n    data = row.split(':',1)\n    redacted_data[data[0]] = data[1].split(' | ')\nredacted_data_frame = pd.DataFrame(redacted_data)\nprint(redacted_data_frame)\n```\n\nRedacting cell by cell for columns with large text content\n\n```python\n# Working with data frames\nimport pandas as pd\nfrom privateai_client import PAIClient\nfrom privateai_client.objects import request_objects\n\nclient = PAIClient(url=\"http://localhost:8080\")\ndata_frame = pd.DataFrame(\n    {\n        \"Book\": [\n            \"Treasure Island\",\n            \"Moby Dick\",\n        ],\n        \"chapter\": [1,1],\n        \"paragraph\": [1,1],\n        \"text\": [\"The Old Sea-dog at the Admiral Benbow\\nSquire Trelawney, Dr. Livesey, and the rest of...\",\n                 \"Call me Ishmael. Some years ago—never mind how long precisely—having little or no money in my purse...\"\n                 ]\n    }\n)\nobj = request_objects.process_text_obj\nfunc = client.process_text\ndata_frame['text'] = [(lambda x: func(obj(text=[x])).processed_text[0])(row) for row in data_frame['text']]\n```\n\nReidentifying Text\n```python\nfrom privateai_client import PAIClient\nfrom privateai_client import request_objects\n\nclient = PAIClient(url=\"http://localhost:8080\")\n\n# Deidentify the text\ninitial_text = 'My name is John. 
I work for Private AI'\nrequest_obj = request_objects.process_text_obj(text=[initial_text])\nresponse_obj = client.process_text(request_obj)\n\n# Build reidentify request object from the deidentified response\nnew_request_obj = response_obj.get_reidentify_request()\n# Call the reidentify Route\nnew_response_obj = client.reidentify_text(new_request_obj)\nprint(new_response_obj.body)\n```\n\nBlocking Misspelled Sensitive Words\n\n```python\n\"\"\"This example demonstrates using fuzzy matching as a post-processing step to block potentially misspelled sensitive words.\nIn this case, we allow OCCUPATION to remain unredacted, except for identifiable roles like CEO. Even though the user mistakenly typed COE, the entity is still redacted.\n\"\"\"\n\nfrom privateai_client import PAIClient, request_objects\nfrom privateai_client.post_processing import (\n    FuzzyMatchEntityProcessor,\n    MarkerEntityProcessor,\n    deidentify_text,\n)\nclient = PAIClient(url=\"http://localhost:8080\")\n\n\ndefault_marker_processor = MarkerEntityProcessor()\n\nfuzzy_processor = FuzzyMatchEntityProcessor(\n    known_words_list=[\"CFO\", \"CTO\", \"CEO\"],\n    threshold=2,\n    strategy=\"BLOCK\",\n    process_type=\"MARKER\",\n    ignore_casing=True,\n)\n\ntext_in = [\"John is our COE. This is Peter, he is a Software Engineer.\"]\nrequest_object = request_objects.analyze_text_obj(\n    text=text_in,\n    locale=\"en\",\n)\nanalyze_text_rsp = client.analyze_text(request_object)\n\ntext_out = deidentify_text(\n    text=text_in,\n    response=analyze_text_rsp,\n    entity_processors={\"OCCUPATION\": fuzzy_processor},\n    default_processor=default_marker_processor,\n)\nprint(text_out)\n```\n\nOutput:\n```\n['[NAME_GIVEN_1] is our [OCCUPATION_1]. 
This is [NAME_GIVEN_2], he is a Software Engineer.']\n```\n\n[1]: https://docs.private-ai.com/reference/latest/operation/process_text_process_text_post/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprivateai%2Fpai-thin-client","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprivateai%2Fpai-thin-client","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprivateai%2Fpai-thin-client/lists"}