mattwagstaff/mcp-pii-guard-au
MCP server for Australian PII detection — TFN, Medicare, ABN with checksum validation. Compliance audit logging for Australian Privacy Act, GDPR, HIPAA. Built on Presidio + spaCy.
Platform-specific configuration:
{
  "mcpServers": {
    "mcp-pii-guard-au": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-pii-guard-au"
      ]
    }
  }
}
Add the config above to .claude/settings.json under the mcpServers key.
Australian PII detection and sanitisation for AI agents. An MCP server that finds and redacts Tax File Numbers (TFN), Medicare card numbers, ABNs, ACNs, BSB and bank account numbers, drivers licence numbers, passport numbers, Centrelink CRNs, Australian addresses, and 13 standard entity types — before text reaches an LLM or gets stored. Built on Microsoft Presidio with 10 custom Australian recognisers that use real checksum validation and context-word boosting, not just regex.
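To illustrate what checksum validation adds over a bare regex, here is a minimal sketch of the widely published 9-digit TFN check (weighted digit sum divisible by 11). It is not taken from this project's source: the function name `is_valid_tfn` is hypothetical, and the sketch ignores the older 8-digit TFN format.

```python
import re

# Weights commonly published for the 9-digit TFN checksum.
TFN_WEIGHTS = (1, 4, 3, 7, 5, 8, 6, 9, 10)


def is_valid_tfn(candidate: str) -> bool:
    """Return True if `candidate` is a 9-digit string passing the TFN checksum.

    Hypothetical helper for illustration; not this server's API.
    """
    # Tolerate common formatting such as "123 456 782" or "123-456-782".
    digits = re.sub(r"[\s-]", "", candidate)
    if not re.fullmatch(r"[0-9]{9}", digits):
        return False
    # Multiply each digit by its weight; a valid TFN sums to a multiple of 11.
    total = sum(int(d) * w for d, w in zip(digits, TFN_WEIGHTS))
    return total % 11 == 0


if __name__ == "__main__":
    print(is_valid_tfn("123 456 782"))  # well-known test number
    print(is_valid_tfn("123 456 789"))  # same shape, fails the checksum
```

A regex alone would accept both inputs above, since each is nine digits; the checksum is what separates a plausible TFN from an arbitrary number of the same shape.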
Model Context Protocol (MCP) is an open standard that lets AI assistants — Claude, Cursor, Copilot, custom agents — call external tools over a standardised interface. This server exposes PII detection and sanitisation as MCP tools. Any MCP-compatible client can call them without custom integration code.
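As a rough picture of what "calling a tool" means on the wire, the sketch below builds the JSON-RPC 2.0 `tools/call` request that an MCP client sends. The tool name `detect_pii` and its `text` argument are assumptions made for illustration; the real tool names and schemas are whatever this server reports in its tool listing.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "detect_pii" and the "text" argument are hypothetical placeholders.
request = build_tool_call(1, "detect_pii", {"text": "My TFN is 123 456 782"})

if __name__ == "__main__":
    # MCP clients normally send this for you; printing it just shows the shape.
    print(json.dumps(request, indent=2))
```

In practice an MCP-compatible client constructs and sends this message itself, which is why no custom integration code is needed.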
Built for teams in regulated Australian industries — financial services, government, healthcare — who need to prove PII was scrubbed before data left a trust boundary. Compliance-ready for the Australian Privacy Act (APPs), GDPR, HIPAA, SOX, and PCI-DSS.
---
Ten Australian entity types that don't exist elsewhere. TFN, Medicare, ABN, ACN, BSB, bank account, drivers licence, passport, Centrelink CRN, and Australian address recognisers — with real checksum validation where algorithms exist, and context-word boosting throughout. No other Presidio wrapper or PII tool on GitHub handles these. If you're building AI tooling for AU/NZ enterprise, government, or health, this is the gap.
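The combination of checksum validation and context-word boosting can be sketched in plain Python. This is a simplified stand-in, not the project's recognisers or Presidio's own context mechanism: the entity label, context words, scores, and window size are all hypothetical values. The ABN check itself follows the published algorithm (subtract 1 from the first digit, apply the weights, sum divisible by 89).

```python
import re
from dataclasses import dataclass

# Published ABN checksum weights.
ABN_WEIGHTS = (10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19)
ABN_PATTERN = re.compile(r"\b[0-9]{2}\s?[0-9]{3}\s?[0-9]{3}\s?[0-9]{3}\b")

# Everything below is an illustrative assumption, not the project's config.
CONTEXT_WORDS = ("abn", "australian business number")
BASE_SCORE, BOOST, WINDOW = 0.6, 0.3, 40


@dataclass
class Finding:
    entity_type: str
    start: int
    end: int
    score: float


def is_valid_abn(candidate: str) -> bool:
    """Apply the ABN checksum to the digits found in `candidate`."""
    digits = [int(c) for c in candidate if c.isdigit()]
    if len(digits) != 11:
        return False
    digits[0] -= 1  # the algorithm subtracts 1 from the leading digit
    return sum(d * w for d, w in zip(digits, ABN_WEIGHTS)) % 89 == 0


def find_abns(text: str) -> list[Finding]:
    """Return checksum-valid ABN candidates, boosted when context words precede them."""
    findings = []
    for match in ABN_PATTERN.finditer(text):
        if not is_valid_abn(match.group()):
            continue  # right shape, wrong checksum: discard
        # Naive substring check over the characters just before the match.
        before = text[max(0, match.start() - WINDOW):match.start()].lower()
        boosted = any(word in before for word in CONTEXT_WORDS)
        score = min(BASE_SCORE + (BOOST if boosted else 0.0), 1.0)
        findings.append(Finding("AU_ABN", match.start(), match.end(), score))
    return findings
```

With this sketch, `find_abns("Supplier ABN: 51 824 753 556")` scores higher than the same number after an unrelated word, and a number that fails the checksum is dropped regardless of context.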
Audit logging that a compliance officer can actually use. Every scan writes structured JSON to an append-only log file. It records *what types of PII were found*, *how many*, *what tool was called*, and *what confidence threshold was used*. It never records the original text. It never records the detected values. This is the difference between an audit trail and a liability.
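A minimal sketch of an audit record of this kind follows, assuming one JSON object per line. The field names and both function names are hypothetical rather than the server's actual log schema. The point is that the record is built only from entity type labels, so neither the scanned text nor the detected values can reach the log.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def build_audit_record(tool: str, threshold: float, entity_types: list[str]) -> dict:
    """Summarise a scan using entity type labels only.

    The function never receives the scanned text or the matched values,
    so they cannot leak into the record. Field names are hypothetical.
    """
    counts = Counter(entity_types)
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "confidence_threshold": threshold,
        "entity_counts": dict(counts),
        "total_findings": sum(counts.values()),
    }


def append_audit_record(path: str, record: dict) -> None:
    """Append one JSON line; mode "a" only ever adds to the end of the file."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record, sort_keys=True) + "\n")


if __name__ == "__main__":
    record = build_audit_record("detect_pii", 0.5, ["AU_TFN", "AU_TFN", "AU_ABN"])
    append_audit_record("audit.jsonl", record)
```

Opening the file in append mode is only an application-level convention; making a log tamper-resistant would additionally need file-system or storage-level controls.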