Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with pii
A curated list of projects in awesome lists tagged with pii.
https://github.com/microsoft/presidio
Context aware, pluggable and customizable data protection and de-identification SDK for text and images
anonymization anonymization-service data-anonymization data-loss-prevention data-masking data-protection data-scrubbing de-identification dlp microsoft pii pii-anonymization pii-anonymization-service pii-detection presidio privacy privacy-protection python text-anonymization transformers
Last synced: 16 Dec 2024
https://github.com/capitalone/dataprofiler
What's in your data? Extract schema, statistics and entities from datasets
avro csv data-analysis data-labels data-science dataprofiling dataset gdpr graph-data machine-learning network-data nlp npi pandas pii privacy python security sensitive-data tabular-data
Last synced: 19 Dec 2024
https://github.com/catchthetornado/pdf-extract-api
Document (PDF) extraction and parsing API using state-of-the-art OCRs + Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown
anonymization api extract json llm ocr ocr-python pdf pii
Last synced: 20 Dec 2024
https://github.com/securitybunker/databunker
Secure Vault for Customer PII/PHI/PCI/KYC Records
anonymization application-server ccpa compliance data-anonymization data-protection database encryption gdpr legaltech passportjs pii piidata privacy privacy-by-design secure-storage security tokenization user-consent vault
Last synced: 19 Dec 2024
https://github.com/redhuntlabs/octopii
An AI-powered Personally Identifiable Information (PII) scanner.
blackhat cloud cybersecurity image-processing machine-learning nlp ocr optical-character-recognition pii pii-detection python
Last synced: 21 Dec 2024
https://github.com/tokern/piicatcher
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub
aws-athena aws-glue aws-redshift catalog data data-catalog database phi pii python snowflake
Last synced: 18 Dec 2024
https://github.com/microsoft/presidio-research
This package features data-science related tasks for developing new recognizers for Presidio. It is used for the evaluation of the entire system, as well as for evaluating specific PII recognizers or PII detection models.
deep-learning flair machine-learning named-entity-recognition natural-language-processing ner nlp pii privacy spacy transformers
Last synced: 21 Dec 2024
https://github.com/samber/slog-formatter
🚨 slog: Attribute formatting
anonymization error formatter formatting go golang handler log-level logger logging middleware pii slog structured-logging
Last synced: 16 Dec 2024
https://github.com/googlecloudplatform/dlp-dataflow-deidentification
Multi Cloud Data Tokenization Solution By Using Dataflow and Cloud DLP
beam bigquery data dataflow dlp pii tokenization
Last synced: 15 Dec 2024
https://github.com/deliciousinsights/mongoose-pii
A Mongoose plugin that lets you transparently cipher stored PII and use securely-hashed passwords
bcrypt mongodb mongoose mongoose-plugin password passwords pii pii-ciphering security
Last synced: 11 Oct 2024
https://github.com/edwardcooper/piidetect
A package to build an end-to-end pipeline for detecting personally identifiable information from text.
nlp pii pii-detection word2vec
Last synced: 11 Nov 2024
https://github.com/PovertyAction/PII_detection
Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets.
deidentification field-survey natural-language-processing pii
Last synced: 11 Nov 2024
https://github.com/poogles/piiregex
Search for PII in Python
personally-identifiable-information pii piiregex python3
Last synced: 08 Nov 2024
https://github.com/dotfurther/OpenDiscoverSDK
.NET 6 API for document file format identification, text/metadata/attachment/embedded object/sensitive item (PII/PHI)/entity extraction.
archive csharp dotnet email embedded-objects entity-extraction extraction file-deduplication file-format-detection file-identification indexing metadata microsoft-office phi pii pii-detection pst sdk text text-extraction
Last synced: 07 Nov 2024
https://github.com/solodynamo/custom-log-marshaler
Attempt to R.I.P PII or unnecessary info in logs and reduce log ingestion costs in the process.
cost-optimization golang logging pii piidata zap zerolog
Last synced: 18 Nov 2024
https://github.com/dotfurther/OpenDiscoverPlatformCaseStudy
Case study using dotfurther's Open Discover Platform with the RavenDB document store to rapidly create a full-text search/eDiscovery/information governance capable demonstration application.
archive-extractor data-breach document-ingestion ediscovery file-deduplication file-format-detection file-identification full-text full-text-extraction full-text-search indexing-engine information-governance information-governance-catalog metadata personally-identifiable-information pii pii-detection ravendb text-extraction
Last synced: 07 Nov 2024
https://github.com/sajacy/fcc-ecfs-scrape
Extract FCC ECFS filings to BigQuery
fcc-api pii python vulnerability
Last synced: 12 Nov 2024
https://github.com/HabaneroCake/pii-filter
A personally identifiable information (PII) filter.
Last synced: 11 Nov 2024
https://github.com/virgilsecurity/virgil-purekit-java
PureKit SDK allows developers to protect users' passwords and sensitive personal information in a database from data breaches and both online and offline attacks and make stolen passwords useless even if a database is breached.
cryptography encryption gdpr hipaa passw0rd password password-hardened-encryption phe pii piidata sdk
Last synced: 09 Nov 2024
https://github.com/jacobpstein/pii
Repo for the pii package, an easy way to identify personally identifiable information in your data
Last synced: 20 Dec 2024
https://github.com/mlukman/keycloak-pii-data-encryption-provider
A Keycloak provider that automatically encrypts user attributes containing PII when they are stored in the database and decrypts them when they are loaded from the database
keycloak-provider pii pii-anonymization
Last synced: 03 Dec 2024
https://github.com/virgilsecurity/virgil-purekit-php
PureKit PHP SDK allows developers to protect users' passwords and sensitive personal information in a database from data breaches and both online and offline attacks and make stolen passwords useless even if a database is breached.
cryptography encryption gdpr hipaa passw0rd password password-hardened-encryption phe pii piidata sdk
Last synced: 09 Nov 2024
https://github.com/insightsengineering/presidio-cli
CLI tool that analyzes text for PII entities with the Microsoft Presidio framework.
Last synced: 07 Nov 2024
https://github.com/datafog/datafog-python
Privacy Engineering for the Generative AI era
ai data-anonymization data-preprocessing data-science devsecaiops devsecops-ai llm-privacy machine-learning observability open-source pii pii-detection privacy privacy-protection python rag stream-processing
Last synced: 10 Nov 2024
https://github.com/dwisiswant0/leakz-passive-workflow
Caido's passive workflow to find potential leaked secrets, PII, and sensitive fields.
caido caido-passive-workflow caido-workflow leaks leaks-scanner pii secrets sensitive-data sensitive-data-discovery
Last synced: 28 Oct 2024
https://github.com/virgilsecurity/virgil-purekit-net
PureKit SDK allows developers to protect users' passwords and sensitive personal information in a database from data breaches and both online and offline attacks and make stolen passwords useless even if a database is breached.
cryptography encryption gdpr hipaa passw0rd password password-hardened-encryption phe pii piidata sdk
Last synced: 09 Nov 2024
https://github.com/stefen-taime/kafka-pipeline
In the following post, we will learn how to build a data pipeline using a combination of open-source software (OSS), including Debezium, Apache Kafka, and Kafka Connect.
bash data docker elasticsearch etl-pipeline k kafka kafka-connect kafka-streams kafka-topic kibana ksqldb masking mongodb mysql pii pipeline postgresql
Last synced: 16 Nov 2024
https://github.com/Acceis/exploit-CVE-2022-0482
Easy!Appointments < 1.4.3 - Unauthenticated PII (events) disclosure
cve cve-2022-0482 disclosure exploit pii
Last synced: 23 Oct 2024
https://github.com/berislavlopac/sanitary
Utility to remove or replace sensitive data from complex structures.
Last synced: 07 Nov 2024
https://github.com/jwinman91/ai-ner
An AI-powered but model-agnostic named-entity recognition toolkit.
anonymization de-identification name-entity-recognition ner nlp-machine-learning pii pii-anonymization python
Last synced: 19 Nov 2024
https://github.com/insightsengineering/presidio-action
GitHub Action that analyzes text for PII entities with the Microsoft Presidio framework.
Last synced: 07 Nov 2024
https://github.com/kibae/typeorm-pii-compliance
TypeORM PII Compliance Service: Cascading Personally Identifiable Information Disposal
compliance nodejs orm pii pii-anonymization typeorm typescript
Last synced: 21 Nov 2024
https://github.com/kangaroos-are-cool/directoryscanner
A go module for scanning directories for sensitive information (or anything you'd like really)
gathering go golang information infosec pii regex scanning sensitive
Last synced: 14 Nov 2024
https://github.com/michael-ortiz/terraform-aws-s3-audio-pii-guardian
🕵️‍♂️ Personally Identifiable Information (PII) Detection and Redaction for Voice Audio Files Stored in S3 and AWS Transcribe
audio-to-text aws aws-transcribe ffmpeg lambda personal-identifiable-information pii pii-detection pii-detector redaction s3 terraform transcribe typescript
Last synced: 22 Dec 2024
https://github.com/omers/pii-anonymizer-api
PII Anonymizer service based on python with FastAPI
anonymization fastapi healthdata phi pii
Last synced: 10 Nov 2024
https://github.com/xeger/pipeclean
Parallel, streaming data sanitizer. Fast multi-core execution with no file size limits.
Last synced: 15 Nov 2024
https://github.com/ianluites/resource_id
REST endpoints without PII in URLs.
Last synced: 08 Nov 2024
https://github.com/parthapray/pii_scrubbing_llm
This repo contains code for PII scrubbing via heuristic search before calling an LLM (local and remote)
chatgpt-api claude-api cloud edge fastapi hybrid llm ner-spacy ollama-api pii pii-detection scrubbing spacy sqlalchemy uvicorn
Last synced: 20 Dec 2024
https://github.com/gordonmurray/debezium_exclude_columns
A working example of using Debezium for CDC while excluding some columns to prevent consumption of personally identifiable information (PII)
Last synced: 04 Dec 2024
https://github.com/datafog/datafog
Python library to redact PII/business information from entering semantic data pipelines (RAG, 'chat on your data')
ai embeddings llm ml mlops pii preprocessing preprocessing-data privacy privacy-protection privacy-tools rag semantic-analysis
Last synced: 10 Nov 2024