PDFE
Status: Active Development
A cross-platform PDF editor with true content-level redaction. Most PDF redaction tools just draw black boxes over sensitive text, so the data is still there, extractable by anyone who knows to look. PDFE actually removes the content from the PDF structure.
Overview
PDFE addresses a critical flaw in most PDF redaction tools: they don't actually redact. Drawing a black rectangle over text is not redaction: the text remains in the PDF structure and can be extracted with basic tools. This has led to security incidents in which "redacted" documents leaked sensitive information.
PDFE performs true content-level redaction by removing glyphs from the PDF content streams themselves.
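To make the flaw concrete, here is a minimal Python sketch (not PDFE code). It assumes a toy, uncompressed content stream with a single literal string; the stream contents and the helper name `shown_strings` are hypothetical. The black rectangle is only an extra drawing command, so the text operand is still there to be read.

```python
import re

# Toy content stream: one line of text, then a filled black rectangle on top.
masked = (
    b"BT /F1 12 Tf 72 700 Td (SSN: 123-45-6789) Tj ET\n"
    b"0 0 0 rg 70 696 200 16 re f"  # black box drawn over the text
)

def shown_strings(stream: bytes) -> list[bytes]:
    """Return the literal string operands of every Tj (show text) operator."""
    # Toy assumption: no escaped or nested parentheses inside the string.
    return re.findall(rb"\(([^()\\]*)\)\s*Tj", stream)

print(shown_strings(masked))  # [b'SSN: 123-45-6789']
```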
Features
- Glyph-level text removal from PDF content streams, not visual masking
- Verified with pdftotext, PdfPig, and the pdfer CLI: redacted text cannot be extracted
- CLI tool (pdfer) for batch redaction, search, and verification with regex support
- 1600+ automated tests verify redaction integrity across real-world documents
- OCR support for scanned documents via Tesseract, with automatic download of language data
- Digital signature verification to validate certificates and detect tampering
- GUI automation via C# scripting (Roslyn) for testing and workflows
- Built with .NET 8 and AvaloniaUI for cross-platform support
How It Works
PDFs store text as a series of glyph positioning commands within content streams. PDFE:
1. Parses the PDF structure to locate text content streams
2. Identifies the specific glyphs that render the target text
3. Removes those glyphs from the content stream
4. Optionally draws a visual indicator (black box) where text was removed
5. Rewrites the PDF with the modified content streams
The result is a PDF where the redacted text simply doesn't exist—not hidden, not obscured, but gone.
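The glyph-removal step can be illustrated with a deliberately simplified Python sketch. It is not PDFE's implementation: it assumes an uncompressed stream, simple `Tj` operators with literal strings, no escaped parentheses, and a font whose character codes match ASCII. The function name `redact_stream` and the sample stream are hypothetical.

```python
import re

# Matches a literal string operand followed by the Tj (show text) operator.
# Toy assumption: no escaped or nested parentheses inside the string.
TJ_PATTERN = re.compile(rb"\(([^()\\]*)\)\s*Tj")

def redact_stream(stream: bytes, target: bytes) -> bytes:
    """Remove every occurrence of `target` from Tj string operands."""
    def strip_target(match: re.Match) -> bytes:
        remaining = match.group(1).replace(target, b"")
        return b"(" + remaining + b") Tj"
    return TJ_PATTERN.sub(strip_target, stream)

stream = b"BT /F1 12 Tf 72 700 Td (SSN: 123-45-6789) Tj ET"
redacted = redact_stream(stream, b"123-45-6789")
print(redacted.decode("ascii"))  # BT /F1 12 Tf 72 700 Td (SSN: ) Tj ET
```

A real redactor also has to handle `TJ` arrays, hex strings, font encodings, and compressed streams, and must compensate for the widths of the removed glyphs so that the remaining text does not shift.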
Verification
Trust but verify. PDFE includes verification tools that confirm redaction worked:
    pdfer verify document.pdf --redacted "secret text"
This uses multiple extraction methods to confirm the text is truly gone from the document.
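The idea of checking several extractors can be sketched in Python as follows. This is an assumption about the general approach, not a description of how `pdfer verify` works internally. `extract_with_pdftotext` shells out to Poppler's `pdftotext` (which must be installed); `normalize` and `is_redacted` are hypothetical helper names.

```python
import re
import subprocess

def extract_with_pdftotext(pdf_path: str) -> str:
    """Extract text with Poppler's pdftotext; '-' sends output to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def normalize(text: str) -> str:
    """Collapse whitespace and case so line wrapping cannot hide a match."""
    return re.sub(r"\s+", " ", text).strip().casefold()

def is_redacted(extractions: list[str], target: str) -> bool:
    """True only if no extraction still contains the target text."""
    needle = normalize(target)
    return all(needle not in normalize(text) for text in extractions)

# Usage with text already extracted by two hypothetical methods:
samples = ["Name: [removed]\nCase 42", "Name:\n[removed] Case 42"]
print(is_redacted(samples, "secret text"))                # True
print(is_redacted(["top secret\ntext"], "secret text"))   # False
```

In practice the list passed to `is_redacted` would hold the output of each extractor, for example `[extract_with_pdftotext("document.pdf"), ...]`, so a single extractor that still sees the text is enough to fail the check.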