Redacta — Pseudonymise Medical Text for AI | PharmaTools.AI

Privacy · Skill & MCP server

Pseudonymise patient data before AI ever sees it.

Redacta replaces names, NHS numbers, dates of birth and more with labelled tokens — so clinical text can be safely processed by AI, with the meaning intact.

1,000+ installs · v1.1.0 · MIT-0 license · by PharmaTools.AI
$ openclaw skills install redacta

Two layers of detection

Patterns catch the structured identifiers. Reasoning catches the ones that don't follow a pattern — the names, addresses and ages that regex alone reliably misses.

LAYER 01 · PATTERNS

Deterministic matching

Fixed-format identifiers are matched exactly, every time — including a Modulus 11 checksum to confirm real NHS numbers.

  • NHS numbers (checksum-validated)
  • Dates of birth
  • UK postcodes & phone numbers
  • Email addresses & hospital numbers (MRNs)
  • National Insurance numbers
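
The Modulus 11 check mentioned above is a published algorithm, so it can be sketched independently of Redacta. The snippet below is a minimal illustration, not Redacta's own code; the function name is hypothetical. It weights the first nine digits from 10 down to 2, sums the products, and compares the derived check digit with the tenth digit.

```python
import re

def is_valid_nhs_number(candidate: str) -> bool:
    """Return True if candidate is a 10-digit number that passes Modulus 11.

    Hypothetical helper for illustration; not part of any published API.
    """
    digits = re.sub(r"[\s-]", "", candidate)  # tolerate "943 476 5919" spacing
    if not re.fullmatch(r"[0-9]{10}", digits):
        return False
    # Weight the first nine digits 10, 9, ..., 2 and sum the products.
    total = sum(int(d) * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check digit of 10 means the number can never be valid.
    return check != 10 and check == int(digits[9])
```

A checksum like this is what lets a pattern layer tell a real NHS number apart from any other ten-digit string, such as a phone extension or an order reference.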

LAYER 02 · REASONING

Agent judgement

For everything a pattern can't pin down, Redacta reads context — and tells a patient apart from the clinician treating them.

  • Patient names (not clinician names)
  • Relatives & carers named in the text
  • Postal addresses
  • Identifying ages
  • Self-checks the output for anything missed

Identifiers become labelled tokens

[PATIENT_NAME] [RELATIVE_NAME] [DATE_OF_BIRTH] [AGE] [NHS_NUMBER] [POSTCODE] [PHONE] [EMAIL] [MRN] [NI_NUMBER] [ADDRESS]

The clinical meaning stays. The patient behind it doesn't travel with the text.
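
To make the token idea concrete, here is a minimal sketch of label-for-match substitution with a change report. It assumes two simplified, illustrative patterns and a hypothetical `pseudonymise` function; it is not Redacta's engine or its API, and real detection needs far more care than these regexes provide.

```python
import re

# Illustrative patterns only (assumptions, deliberately simplified).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "POSTCODE": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b"),
}

def pseudonymise(text: str):
    """Replace each match with its [LABEL] token and record what changed."""
    report = []
    for label, pattern in PATTERNS.items():
        def replace(match, label=label):
            # The report holds original values, so treat it as sensitive too.
            report.append((label, match.group(0)))
            return f"[{label}]"
        text = pattern.sub(replace, text)
    return text, report

clean, changes = pseudonymise("Lives at SW1A 1AA, contact jane.doe@example.com.")
print(clean)  # Lives at [POSTCODE], contact [EMAIL].
```

Because the token names the kind of identifier it replaced, a downstream model can still tell that a postcode or an email address was present, without seeing the value.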

The judgement calls, made well

Redaction is full of decisions a blunt tool gets wrong. Redacta makes the ones that matter — protecting the person behind the text without gutting the clinical record.

DATES

Dates of birth, not appointments

A patient's date of birth is removed; the date of their next appointment stays. Stripping every date would erase the timeline a clinician actually needs — unless you ask for Safe Harbor mode, which removes them all.
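
One simple way to separate a date of birth from other dates is to look for a cue such as "DOB" or "born" immediately before the date. The sketch below assumes that heuristic, a single dd/mm/yyyy format, and a hypothetical function name; Redacta's actual approach is not documented here and may differ.

```python
import re

DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")
# A birth cue followed by at most 15 non-digit characters, ending at the date.
DOB_CUE = re.compile(r"(?:\bDOB\b|\bdate of birth\b|\bborn\b)\D{0,15}$",
                     re.IGNORECASE)

def redact_dates_of_birth(text: str, window: int = 30) -> str:
    """Replace only those dates that are preceded by a date-of-birth cue."""
    def replace(match):
        preceding = text[max(0, match.start() - window):match.start()]
        if DOB_CUE.search(preceding):
            return "[DATE_OF_BIRTH]"
        return match.group(0)  # appointment and other dates are left alone
    return DATE.sub(replace, text)

print(redact_dates_of_birth("DOB: 04/07/1952. Next appointment 12/03/2025."))
# DOB: [DATE_OF_BIRTH]. Next appointment 12/03/2025.
```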

NAMES

The patient, not their clinician

Redacta protects the person being treated — it redacts the patient and keeps the GP, consultant and hospital named in the letter, so the text still reads sensibly. Need a fuller scrub? It can remove those too.

UNCERTAINTY

When in doubt, redact

Faced with something that might identify a patient, Redacta errs toward removing it — and lists every change in a report, so nothing happens silently.

Stricter, for US HIPAA

Ask for HIPAA Safe Harbor (or "US de-identification") and Redacta switches to its strictest pass, aligned with the Safe Harbor method in 45 CFR §164.514(b)(2).

On top of everything above, it removes every date tied to the individual: not just the date of birth, but appointment, admission and discharge dates too. It also removes all specific ages, plus the remaining HIPAA identifiers that a clinical letter rarely shows but sometimes does: fax numbers, certificate/licence numbers, device serial numbers, vehicle identification numbers (VINs) and health-plan beneficiary numbers. It errs on the safe side, removing a little more than the letter of the standard requires; the standard itself, for example, lets the year of a date and ages of 89 or under stay.

Added in Safe Harbor mode

[DATE] [AGE] [FAX] [LICENSE] [DEVICE_ID] [VIN] [HEALTH_PLAN_NUMBER]
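
The difference between the default pass and the stricter mode can be pictured as a flag that switches on extra patterns. This is a sketch under assumptions: the `redact` function, its `safe_harbor` parameter and the three regexes are hypothetical, and only the token names come from the lists above.

```python
import re

# Default pass: only a date that directly follows "DOB: " is replaced.
BASE = {
    "DATE_OF_BIRTH": re.compile(r"(?<=DOB: )\d{1,2}/\d{1,2}/\d{4}\b",
                                re.IGNORECASE),
}
# Extra patterns switched on in the stricter mode (illustrative only).
SAFE_HARBOR_EXTRA = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "AGE": re.compile(r"\b\d{1,3}[- ]years?[- ]old\b", re.IGNORECASE),
}

def redact(text: str, safe_harbor: bool = False) -> str:
    patterns = dict(BASE)
    if safe_harbor:
        # BASE runs first, so a date of birth keeps its more specific label.
        patterns.update(SAFE_HARBOR_EXTRA)
    for label, pattern in patterns.items():
        text = pattern.sub(f"[{label}]", text)
    return text

sample = "DOB: 04/07/1952. A 72-year-old man admitted 01/02/2024."
print(redact(sample))
# DOB: [DATE_OF_BIRTH]. A 72-year-old man admitted 01/02/2024.
print(redact(sample, safe_harbor=True))
# DOB: [DATE_OF_BIRTH]. A [AGE] man admitted [DATE].
```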

Anyone putting medical text into AI

If you've ever pasted a letter, note or report into a chatbot to summarise or rewrite it, Redacta is the step that should come first.

Clinicians & care teams

Summarise or rewrite a letter with AI without exposing the patient behind it.

Researchers & medical writers

Work with real case text in AI tools while keeping identifiers out of the prompt.

Builders & AI agents

Drop a pseudonymisation step into any agent workflow that touches clinical text — as a skill or an MCP server.

Install in one line

Redacta is free and open source under the MIT-0 license. Use it as an agent skill or an MCP server, or build with it as a Python or TypeScript library — same engine throughout.

01

Install the skill

Add Redacta from ClawHub with a single command.

02

Point it at your text

A letter, note, discharge summary or report — paste it in.

03

Get clean output

A pseudonymised document plus a report of every identifier replaced.

Use it — Agent skill · OpenClaw / ClawHub

$ openclaw skills install redacta

Use it — MCP server · Claude Desktop, Cursor & more

$ npx -y redacta-mcp

Build with it — Python library · pip

$ pip install redacta

Build with it — TypeScript library · npm

$ npm install @pharmatools/redacta

An honest note on limits. Redacta is a strong first line of defence, not a guarantee. It won't catch every possible identifier and isn't a substitute for formal data-protection processes. Always review the redaction report before sharing text.

Need this at scale?

Redacta runs as an agent skill and an MCP server today. If you want pseudonymisation as an API, an on-prem deployment, or built into your own clinical workflow, let's talk.

Discuss Integration