Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with pii

A curated list of projects in awesome lists tagged with pii.

https://github.com/catchthetornado/pdf-extract-api

Document (PDF) extraction and parsing API using state-of-the-art OCR engines and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.

anonymization api extract json llm ocr ocr-python pdf pii

Last synced: 20 Dec 2024

https://github.com/tokern/piicatcher

Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub

aws-athena aws-glue aws-redshift catalog data data-catalog database phi pii python snowflake

Last synced: 18 Dec 2024

https://github.com/microsoft/presidio-research

This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.

deep-learning flair machine-learning named-entity-recognition natural-language-processing ner nlp pii privacy spacy transformers

Last synced: 21 Dec 2024

https://github.com/googlecloudplatform/dlp-dataflow-deidentification

Multi-cloud data tokenization solution using Dataflow and Cloud DLP

beam bigquery data dataflow dlp pii tokenization

Last synced: 15 Dec 2024

https://github.com/deliciousinsights/mongoose-pii

A Mongoose plugin that lets you transparently cipher stored PII and use securely-hashed passwords

bcrypt mongodb mongoose mongoose-plugin password passwords pii pii-ciphering security

Last synced: 11 Oct 2024

https://github.com/edwardcooper/piidetect

A package to build an end-to-end pipeline for detecting personally identifiable information from text.

nlp pii pii-detection word2vec

Last synced: 11 Nov 2024

https://github.com/PovertyAction/PII_detection

Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets.

deidentification field-survey natural-language-processing pii

Last synced: 11 Nov 2024

https://github.com/kylemclaren/scrub

A Python package to scrub PII

pii python sanitization

Last synced: 10 Nov 2024

https://github.com/dotfurther/OpenDiscoverSDK

.NET 6 API for document file format identification, text/metadata/attachment/embedded object/sensitive item (PII/PHI)/entity extraction.

archive csharp dotnet email embedded-objects entity-extraction extraction file-deduplication file-format-detection file-identification indexing metadata microsoft-office phi pii pii-detection pst sdk text text-extraction

Last synced: 07 Nov 2024

https://github.com/solodynamo/custom-log-marshaler

Attempt to R.I.P PII or unnecessary info in logs and reduce log ingestion costs in the process.

cost-optimization golang logging pii piidata zap zerolog

Last synced: 18 Nov 2024

https://github.com/sajacy/fcc-ecfs-scrape

Extract FCC ECFS filings to BigQuery

fcc-api pii python vulnerability

Last synced: 12 Nov 2024

https://github.com/HabaneroCake/pii-filter

A personally identifiable information (PII) filter.

dutch pii pii-detection

Last synced: 11 Nov 2024

https://github.com/virgilsecurity/virgil-purekit-java

PureKit SDK allows developers to protect users' passwords and sensitive personal information in a database from data breaches and from both online and offline attacks, and makes stolen passwords useless even if the database is breached.

cryptography encryption gdpr hipaa passw0rd password password-hardened-encryption phe pii piidata sdk

Last synced: 09 Nov 2024

https://github.com/nitzano/databye

CLI Database & File Anonymizer

anonymizer cli database pii

Last synced: 09 Oct 2024

https://github.com/jacobpstein/pii

Repo for the pii package, an easy way to identify personally identifiable information in your data

pii r

Last synced: 20 Dec 2024

https://github.com/mlukman/keycloak-pii-data-encryption-provider

A Keycloak provider that automatically encrypts user attributes containing PII when they are stored in the database and decrypts them when they are loaded from the database.

keycloak-provider pii pii-anonymization

Last synced: 03 Dec 2024

https://github.com/virgilsecurity/virgil-purekit-php

PureKit PHP SDK allows developers to protect users' passwords and sensitive personal information in a database from data breaches and from both online and offline attacks, and makes stolen passwords useless even if the database is breached.

cryptography encryption gdpr hipaa passw0rd password password-hardened-encryption phe pii piidata sdk

Last synced: 09 Nov 2024

https://github.com/insightsengineering/presidio-cli

CLI tool that analyzes text for PII entities with the Microsoft Presidio framework.

pii presidio python

Last synced: 07 Nov 2024

https://github.com/dwisiswant0/leakz-passive-workflow

Caido's passive workflow to find potential leaked secrets, PII, and sensitive fields.

caido caido-passive-workflow caido-workflow leaks leaks-scanner pii secrets sensitive-data sensitive-data-discovery

Last synced: 28 Oct 2024

https://github.com/virgilsecurity/virgil-purekit-net

PureKit SDK allows developers to protect users' passwords and sensitive personal information in a database from data breaches and from both online and offline attacks, and makes stolen passwords useless even if the database is breached.

cryptography encryption gdpr hipaa passw0rd password password-hardened-encryption phe pii piidata sdk

Last synced: 09 Nov 2024

https://github.com/stefen-taime/kafka-pipeline

In the following post, we will learn how to build a data pipeline using a combination of open-source software (OSS), including Debezium, Apache Kafka, and Kafka Connect.

bash data docker elasticsearch etl-pipeline k kafka kafka-connect kafka-streams kafka-topic kibana ksqldb masking mongodb mysql pii pipeline postgresql

Last synced: 16 Nov 2024

https://github.com/Acceis/exploit-CVE-2022-0482

Easy!Appointments < 1.4.3 - Unauthenticated PII (events) disclosure

cve cve-2022-0482 disclosure exploit pii

Last synced: 23 Oct 2024

https://github.com/acceis/exploit-cve-2022-0482

Easy!Appointments < 1.4.3 - Unauthenticated PII (events) disclosure

cve cve-2022-0482 disclosure exploit pii

Last synced: 06 Nov 2024

https://github.com/berislavlopac/sanitary

Utility to remove or replace sensitive data from complex structures.

logging pii

Last synced: 07 Nov 2024

https://github.com/jwinman91/ai-ner

An AI-powered but model-agnostic named-entity recognition toolkit.

anonymization de-identification name-entity-recognition ner nlp-machine-learning pii pii-anonymization python

Last synced: 19 Nov 2024

https://github.com/insightsengineering/presidio-action

GitHub Action that analyzes text for PII entities with the Microsoft Presidio framework.

actions pii presidio python

Last synced: 07 Nov 2024

https://github.com/kibae/typeorm-pii-compliance

TypeORM PII Compliance Service: Cascading Personally Identifiable Information Disposal

compliance nodejs orm pii pii-anonymization typeorm typescript

Last synced: 21 Nov 2024

https://github.com/kangaroos-are-cool/directoryscanner

A Go module for scanning directories for sensitive information (or anything else you'd like).

gathering go golang information infosec pii regex scanning sensitive

Last synced: 14 Nov 2024

https://github.com/michael-ortiz/terraform-aws-s3-audio-pii-guardian

🕵️‍♂️ Personally Identifiable Information (PII) Detection and Redaction for Voice Audio Files Stored in S3 and AWS Transcribe

audio-to-text aws aws-transcribe ffmpeg lambda personal-identifiable-information pii pii-detection pii-detector redaction s3 terraform transcribe typescript

Last synced: 22 Dec 2024

https://github.com/omers/pii-anonymizer-api

PII anonymizer service built with Python and FastAPI

anonymization fastapi healthdata phi pii

Last synced: 10 Nov 2024

https://github.com/xeger/pipeclean

Parallel, streaming data sanitizer. Fast multi-core execution with no file size limits.

masking pii sanitization

Last synced: 15 Nov 2024

https://github.com/ianluites/resource_id

REST endpoints without PII in URLs.

elixir pii plug

Last synced: 08 Nov 2024

https://github.com/parthapray/pii_scrubbing_llm

This repo contains code for heuristic PII scrubbing applied before calling an LLM (local and remote).

chatgpt-api claude-api cloud edge fastapi hybrid llm ner-spacy ollama-api pii pii-detection scrubbing spacy sqlalchemy uvicorn

Last synced: 20 Dec 2024

https://github.com/gordonmurray/debezium_exclude_columns

A working example of using Debezium for CDC while excluding some columns to prevent consumption of personally identifiable information (PII)

cdc debezium kafka pii

Last synced: 04 Dec 2024

https://github.com/adamdecaf/xmlq

Pretty-print and mask XML

masking pii xml

Last synced: 24 Nov 2024

https://github.com/datafog/datafog

Python library to redact PII and business information before it enters semantic data pipelines (RAG, 'chat on your data')

ai embeddings llm ml mlops pii preprocessing preprocessing-data privacy privacy-protection privacy-tools rag semantic-analysis

Last synced: 10 Nov 2024