An open API service providing repository metadata for many open source software ecosystems.

Topic: "pii-detection"

microsoft/presidio

An open-source framework for detecting, redacting, masking, and anonymizing sensitive data (PII) across text, images, and structured data. Supports NLP, pattern matching, and customizable pipelines.

Language: Python - Size: 222 MB - Last synced at: about 10 hours ago - Pushed at: 1 day ago - Stars: 4,470 - Forks: 634

redhuntlabs/Octopii

An AI-powered Personal Identifiable Information (PII) scanner.

Language: Python - Size: 4.34 MB - Last synced at: 8 days ago - Pushed at: 3 months ago - Stars: 670 - Forks: 58

google/magritte 📦

Mediapipe-based library to redact faces from videos and images

Language: C++ - Size: 322 KB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 440 - Forks: 16

awslabs/sensitive-data-protection-on-aws

The Sensitive Data Protection on AWS solution allows enterprise customers to create data catalogs, discover, protect, and visualize sensitive data across multiple AWS accounts. The solution eliminates the need for manual tagging to track sensitive data such as Personal Identifiable Information (PII) and classified information.

Language: TypeScript - Size: 43.2 MB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 118 - Forks: 10

databrickslabs/discoverx

A Swiss-Army-knife for your Data Intelligence platform administration.

Language: Python - Size: 495 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 115 - Forks: 13

EdyVision/pii-codex

A research Python package for detecting, categorizing, and assessing the severity of personal identifiable information (PII)

Language: Python - Size: 659 KB - Last synced at: about 13 hours ago - Pushed at: almost 2 years ago - Stars: 85 - Forks: 10

edwardcooper/piidetect

A package to build an end-to-end pipeline for detecting personally identifiable information from text.

Language: Python - Size: 21.5 KB - Last synced at: 8 days ago - Pushed at: almost 6 years ago - Stars: 44 - Forks: 9

apicrafter/metacrafter

Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules.

Language: Python - Size: 81.7 MB - Last synced at: 9 months ago - Pushed at: 10 months ago - Stars: 43 - Forks: 6

arcjet/example-nextjs

An example Next.js application protected by Arcjet.

Language: TypeScript - Size: 761 KB - Last synced at: 2 days ago - Pushed at: 5 days ago - Stars: 32 - Forks: 5

mddunlap924/PII-Detection

Personal Identifiable Information (PII) entity detection and performance enhancement with synthetic data generation

Language: Python - Size: 548 KB - Last synced at: 3 days ago - Pushed at: 8 months ago - Stars: 25 - Forks: 3

Akshay7591/Web-Scanner

Web scanner written in Python that scans a given URL and returns its domain name, IP address, nmap scan results, and the contents of the URL's robots.txt.

Language: Python - Size: 59.6 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 18 - Forks: 3

fvaleye/metadata-guardian

Provide an easy way with Python to protect your data sources by searching its metadata. 🛡️

Language: Python - Size: 16.5 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 18 - Forks: 1

seanpedrick-case/doc_redaction

Redact PDF/image-based documents, or CSV/XLSX files using a Gradio-based GUI interface

Language: Python - Size: 1.02 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 17 - Forks: 4

dotfurther/OpenDiscoverSDK

.NET 8 API for document file format identification, text/metadata/attachment/embedded object/sensitive item (PII/PHI)/entity extraction.

Language: C# - Size: 170 MB - Last synced at: 11 days ago - Pushed at: 4 months ago - Stars: 16 - Forks: 0

apicrafter/metacrafter-registry

Registry of metadata identifier entities such as UUID, GUID, person full name, address, and so on. Linked with other sources.

Language: Python - Size: 1.04 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 0

edwardcooper/data-sentry

A project to build a machine learning pipeline to detect personal identifiable information (PII)

Language: Jupyter Notebook - Size: 8.43 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 7

dotfurther/OpenDiscoverPlatformCaseStudy

Case study using dotfurther's Open Discover Platform with the RavenDB document store to rapidly create a full-text search/eDiscovery/information governance capable demonstration application.

Size: 5.93 MB - Last synced at: 11 days ago - Pushed at: 11 months ago - Stars: 11 - Forks: 0

akazah/prompt-anonymizer

Anonymize / mask personal information before sending prompts to chat AI (like ChatGPT provided by OpenAI)

Language: Python - Size: 3.15 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 11 - Forks: 1

HabaneroCake/pii-filter 📦

A personally identifiable information (PII) filter.

Language: TypeScript - Size: 18.7 MB - Last synced at: 9 days ago - Pushed at: almost 4 years ago - Stars: 10 - Forks: 1

DataFog/codexify 📦

An open-source API that identifies, masks, and replaces Personally Identifying Information (PII)

Language: Python - Size: 59.6 KB - Last synced at: 12 months ago - Pushed at: almost 2 years ago - Stars: 8 - Forks: 0

gretelai/multi-table 📦

Notebook and code to synthesize relational databases such as Postgres and MySQL.

Language: Jupyter Notebook - Size: 2.78 MB - Last synced at: 18 days ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 1

oxytis/oxidize

Discover sensitive PII data. Find the most common personally identifiable information in your environment, such as financial information. Quickly determine exposure after a breach.

Language: Go - Size: 13.8 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 2

DataFog/datafog-python

Privacy Engineering for the Generative AI era

Language: Python - Size: 78.1 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 5 - Forks: 2

bballamudi/data-sentry Fork of edwardcooper/data-sentry

A project to build a machine learning pipeline to detect personal identifiable information (PII)

Size: 8.41 MB - Last synced at: almost 2 years ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 0

arcjet/example-remix

An example Remix application protected by Arcjet.

Language: TypeScript - Size: 713 KB - Last synced at: 7 days ago - Pushed at: 11 days ago - Stars: 3 - Forks: 0

Srujanrana07/pii-protection Fork of Rudra8984/pii-protection

Cyprus: PII Protection and Verification System. A web-based solution using Python, Django, Tesseract OCR, and AES-256 encryption to extract, mask, and securely verify PII from government documents. Improved efficiency by 60% and accuracy by 70%, replacing traditional methods with a scalable digital process.

Language: CSS - Size: 84.4 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

michael-ortiz/terraform-aws-s3-audio-pii-guardian

🕵️‍♂️ Personally Identifiable Information (PII) Detection and Redaction for Voice Audio Files Stored in S3 and AWS Transcribe

Language: TypeScript - Size: 71.4 MB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

hperer02/PII-data-detection

This project was developed for a Kaggle competition focused on detecting Personally Identifiable Information (PII) in student writing. The primary objective was to build a robust model capable of identifying PII with high recall. The DeBERTa v3 transformer model was chosen for this task after comparing its performance with other transformer models.

Language: Jupyter Notebook - Size: 162 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

Lizhecheng02/Kaggle-PII_Data_Detection

Implement named entity recognition (NER) using regex and a fine-tuned LLM, with a total of 15 categories. The ultimate goal is to apply the model to detect personally identifiable information (PII) in student writing.

Language: Jupyter Notebook - Size: 20.8 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

aws-samples/aws-appconfig-pii-extn

Sample AWS AppConfig Extension integrating with Amazon Comprehend for PII detection

Language: Python - Size: 6.84 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

chchench/pii-detect

Objective-C sample code for detecting PII such as SSN and credit card numbers

Language: Objective-C - Size: 26.4 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

juanmill4/RansomDBAlert

An advanced tool to extract PII from ransomware leaks

Language: Python - Size: 13.7 KB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

BoHarris/DataOps-Hub

ALEX – PII Sentinel API. A real-time, machine learning-powered privacy scanner to detect and redact Personally Identifiable Information (PII) from structured datasets.

Language: Python - Size: 15 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 0 - Forks: 0

Ricky-saha/PII_DETECTION_AND_PROTECTION_SYSTEM

We have developed a comprehensive PII Detection and Protection System using Python and the MERN (MongoDB, Express.js, React.js, Node.js) stack.

Language: JavaScript - Size: 2.56 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

burgonet-eu/piiserver

Piiranha Server - PII Detection and Masking Service

Language: Python - Size: 127 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ParthaPRay/PII_Scrubbing_LLM

This repo contains code for heuristic PII scrubbing before calling an LLM (local or remote)

Language: Python - Size: 56.6 KB - Last synced at: 16 days ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

ausdfrost/anonymizePy

🌱 anonymizePy helps you anonymize your data with ease

Language: Python - Size: 8.71 MB - Last synced at: about 15 hours ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

CogNetSys/Sonarum

Sonarum revolutionizes human-machine communication by securing real-time text, audio, and video streams while remaining fast, secure, and lightweight. It detects and controls sensitive and secure data on-the-fly, ensuring privacy and security without compromising quality.

Size: 1.95 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

david-acker/redact-pii

Redact PII from images with Azure, OpenAI, and SkiaSharp

Language: C# - Size: 578 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

BhavyaMPatel/BroadbandHack_Prototype

A PII masker application that lets users mask PII in their PDFs before using them

Language: JavaScript - Size: 23.1 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

caesarw0/sanityze Fork of UBC-MDS/sanityze

Spot & Redact PII from Pandas data frames

Size: 268 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

caesarw0/sanityzeR Fork of UBC-MDS/sanityzeR

Spot & Redact PII from R data frames/Tibbles

Size: 505 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

nachiketdhamankar/playstore-scraper-moniotr

Scraping the Play Store and determining categories of apps to be analyzed for PII. (As RA for MonIOTr lab under Prof. Choffnes)

Language: Jupyter Notebook - Size: 3.32 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

EmediongFrancis/alx-backend-user-data

Repository of projects involving user data.

Language: Python - Size: 43 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 1

stevelange17/AzureDevOpsPIIScan.CLI

Bare bones code meant for sample/education only.

Language: C# - Size: 10.7 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

CruelMoney/-moderation-api-node

Size: 0 Bytes - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

bballamudi/privapi Fork of Veridax/privapi

Detect Sensitive REST API communication using Deep Neural Networks

Size: 18.9 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 1