Topic: "pii-detection"
microsoft/presidio
An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.
Language: Python - Size: 222 MB - Last synced at: about 10 hours ago - Pushed at: 1 day ago - Stars: 4,470 - Forks: 634

redhuntlabs/Octopii
An AI-powered Personally Identifiable Information (PII) scanner.
Language: Python - Size: 4.34 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 670 - Forks: 58

google/magritte 📦
Mediapipe-based library to redact faces from videos and images
Language: C++ - Size: 322 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 440 - Forks: 16

awslabs/sensitive-data-protection-on-aws
The Sensitive Data Protection on AWS solution allows enterprise customers to create data catalogs and to discover, protect, and visualize sensitive data across multiple AWS accounts. The solution eliminates the need for manual tagging to track sensitive data such as Personally Identifiable Information (PII) and classified information.
Language: TypeScript - Size: 43.2 MB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 118 - Forks: 10

databrickslabs/discoverx
A Swiss-Army-knife for your Data Intelligence platform administration.
Language: Python - Size: 495 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 115 - Forks: 13

EdyVision/pii-codex
A research Python package for detecting, categorizing, and assessing the severity of personally identifiable information (PII)
Language: Python - Size: 659 KB - Last synced at: about 13 hours ago - Pushed at: almost 2 years ago - Stars: 85 - Forks: 10

edwardcooper/piidetect
A package to build an end-to-end pipeline for detecting personally identifiable information from text.
Language: Python - Size: 21.5 KB - Last synced at: 8 days ago - Pushed at: almost 6 years ago - Stars: 44 - Forks: 9

apicrafter/metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules.
Language: Python - Size: 81.7 MB - Last synced at: 9 months ago - Pushed at: 10 months ago - Stars: 43 - Forks: 6

arcjet/example-nextjs
An example Next.js application protected by Arcjet.
Language: TypeScript - Size: 761 KB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 32 - Forks: 5

mddunlap924/PII-Detection
Personally Identifiable Information (PII) entity detection and performance enhancement with synthetic data generation
Language: Python - Size: 548 KB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 25 - Forks: 3

Akshay7591/Web-Scanner
A web scanner written in Python that scans a given URL and returns its domain name, IP address, nmap scan results, and the contents of the URL's robots.txt.
Language: Python - Size: 59.6 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 18 - Forks: 3

fvaleye/metadata-guardian
Provide an easy way with Python to protect your data sources by searching its metadata. 🛡️
Language: Python - Size: 16.5 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 18 - Forks: 1

seanpedrick-case/doc_redaction
Redact PDF/image-based documents or CSV/XLSX files using a Gradio-based GUI
Language: Python - Size: 1.02 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 17 - Forks: 4

dotfurther/OpenDiscoverSDK
.NET 8 API for document file format identification, text/metadata/attachment/embedded object/sensitive item (PII/PHI)/entity extraction.
Language: C# - Size: 170 MB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 16 - Forks: 0

apicrafter/metacrafter-registry
Registry of metadata identifier entities such as UUID, GUID, person full name, address, and so on. Linked with other sources.
Language: Python - Size: 1.04 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 0

edwardcooper/data-sentry
A project to build a machine learning pipeline to detect personally identifiable information (PII)
Language: Jupyter Notebook - Size: 8.43 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 7

dotfurther/OpenDiscoverPlatformCaseStudy
Case study using dotfurther's Open Discover Platform with the RavenDB document store to rapidly create a full-text search/eDiscovery/information governance capable demonstration application.
Size: 5.93 MB - Last synced at: 11 days ago - Pushed at: 11 months ago - Stars: 11 - Forks: 0

akazah/prompt-anonymizer
Anonymize / mask personal information before sending prompts to chat AI (like ChatGPT provided by OpenAI)
Language: Python - Size: 3.15 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 11 - Forks: 1

HabaneroCake/pii-filter 📦
A personally identifiable information (PII) filter.
Language: TypeScript - Size: 18.7 MB - Last synced at: 9 days ago - Pushed at: almost 4 years ago - Stars: 10 - Forks: 1

DataFog/codexify 📦
An open-source API that identifies, masks, and replaces Personally Identifying Information (PII)
Language: Python - Size: 59.6 KB - Last synced at: 12 months ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 0

gretelai/multi-table 📦
Notebook and code to synthesize relational databases such as Postgres and MySQL.
Language: Jupyter Notebook - Size: 2.78 MB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 1

oxytis/oxidize
Discover sensitive PII data. Find the most common personally identifiable information in your environment, such as financial information. Quickly determine exposure after a breach.
Language: Go - Size: 13.8 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 2

DataFog/datafog-python
Privacy Engineering for the Generative AI era
Language: Python - Size: 78.1 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 5 - Forks: 2

bballamudi/data-sentry Fork of edwardcooper/data-sentry
A project to build a machine learning pipeline to detect personally identifiable information (PII)
Size: 8.41 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 0

arcjet/example-remix
An example Remix application protected by Arcjet.
Language: TypeScript - Size: 713 KB - Last synced at: 7 days ago - Pushed at: 11 days ago - Stars: 3 - Forks: 0

Srujanrana07/pii-protection Fork of Rudra8984/pii-protection
Cyprus: PII Protection and Verification System. A web-based solution using Python, Django, Tesseract OCR, and AES-256 encryption to extract, mask, and securely verify PII from government documents. Improved efficiency by 60% and accuracy by 70%, replacing traditional methods with a scalable digital process.
Language: CSS - Size: 84.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

michael-ortiz/terraform-aws-s3-audio-pii-guardian
🕵️‍♂️ Personally Identifiable Information (PII) Detection and Redaction for Voice Audio Files Stored in S3 and AWS Transcribe
Language: TypeScript - Size: 71.4 MB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

hperer02/PII-data-detection
This project was developed for a Kaggle competition focused on detecting Personally Identifiable Information (PII) in student writing. The primary objective was to build a robust model capable of identifying PII with high recall. The DeBERTa v3 transformer model was chosen for this task after comparing its performance with other transformer models.
Language: Jupyter Notebook - Size: 162 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

Lizhecheng02/Kaggle-PII_Data_Detection
Implement named entity recognition (NER) using regex and a fine-tuned LLM, with a total of 15 categories. The ultimate goal is to apply the model to detect personally identifiable information (PII) in student writing.
Language: Jupyter Notebook - Size: 20.8 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

aws-samples/aws-appconfig-pii-extn
Sample AWS AppConfig Extension integrating with Amazon Comprehend for PII detection
Language: Python - Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

chchench/pii-detect
Objective-C sample code for detecting PII such as SSN and credit card numbers
Language: Objective-C - Size: 26.4 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

juanmill4/RansomDBAlert
An advanced tool to extract PII from ransomware leaks
Language: Python - Size: 13.7 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

BoHarris/DataOps-Hub
ALEX – PII Sentinel API. A real-time, machine learning-powered privacy scanner to detect and redact Personally Identifiable Information (PII) from structured datasets.
Language: Python - Size: 15 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

Ricky-saha/PII_DETECTION_AND_PROTECTION_SYSTEM
We have developed a comprehensive PII Detection and Protection System using Python and the MERN (MongoDB, Express.js, React.js, Node.js) stack.
Language: JavaScript - Size: 2.56 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

burgonet-eu/piiserver
Piiranha Server - PII Detection and Masking Service
Language: Python - Size: 127 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ParthaPRay/PII_Scrubbing_LLM
This repo contains code for a heuristic PII-scrubbing search run before calling an LLM (local or remote)
Language: Python - Size: 56.6 KB - Last synced at: 16 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ausdfrost/anonymizePy
🌱 anonymizePy helps you anonymize your data with ease
Language: Python - Size: 8.71 MB - Last synced at: about 15 hours ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

CogNetSys/Sonarum
Sonarum revolutionizes human-machine communication by securing real-time text, audio, and video streams while remaining fast, secure, and lightweight. It detects and controls sensitive and secure data on-the-fly, ensuring privacy and security without compromising quality.
Size: 1.95 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

david-acker/redact-pii
Redact PII from images with Azure, OpenAI, and SkiaSharp
Language: C# - Size: 578 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

BhavyaMPatel/BroadbandHack_Prototype
A PII Masker application in which users can mask their PDFs and then make use of them
Language: JavaScript - Size: 23.1 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

caesarw0/sanityze Fork of UBC-MDS/sanityze
Spot & Redact PII from Pandas data frames
Size: 268 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

caesarw0/sanityzeR Fork of UBC-MDS/sanityzeR
Spot & Redact PII from R data frames/Tibbles
Size: 505 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

nachiketdhamankar/playstore-scraper-moniotr
Scraping the Play Store and determining categories of apps to be analyzed for PII. (As RA for MonIOTr lab under Prof. Choffness)
Language: Jupyter Notebook - Size: 3.32 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

EmediongFrancis/alx-backend-user-data
Repository of projects involving user data.
Language: Python - Size: 43 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 1

stevelange17/AzureDevOpsPIIScan.CLI
Bare bones code meant for sample/education only.
Language: C# - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

CruelMoney/-moderation-api-node
Size: 0 Bytes - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

bballamudi/privapi Fork of Veridax/privapi
Detect Sensitive REST API communication using Deep Neural Networks
Size: 18.9 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1
