OmniPDF

MCP Tool

lucientong/OmniPDF

Powerful PDF parser

Install

$ npx loaditout add lucientong/OmniPDF

Platform-specific configuration:

.claude/settings.json

{
  "mcpServers": {
    "OmniPDF": {
      "command": "npx",
      "args": [
        "-y",
        "OmniPDF"
      ]
    }
  }
}

Add the config above to .claude/settings.json under the mcpServers key.
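
If `.claude/settings.json` already holds other settings, pasting the block by hand risks overwriting them. The following is a minimal sketch of merging the entry programmatically. The helper name `add_mcp_server` and the constant `OMNIPDF_ENTRY` are hypothetical and are not part of OmniPDF or of any client tooling; only the JSON values come from the config shown above.

```python
import json
from pathlib import Path

# Server entry copied from the config shown above.
OMNIPDF_ENTRY = {"command": "npx", "args": ["-y", "OmniPDF"]}


def add_mcp_server(settings_path: Path, name: str, entry: dict) -> dict:
    """Merge one server entry under "mcpServers", keeping existing settings.

    Hypothetical helper for illustration only.
    """
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        # Treat an empty file the same as a missing one.
        settings = json.loads(text) if text.strip() else {}
    # Create the "mcpServers" key if absent, then add or replace this server.
    settings.setdefault("mcpServers", {})[name] = entry
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling `add_mcp_server(Path(".claude/settings.json"), "OmniPDF", OMNIPDF_ENTRY)` from the project root would add the entry while leaving other keys and other servers untouched.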

About

OmniPDF

AI-powered PDF parsing tool with OCR, watermark removal, and table extraction. Outputs structured Markdown.

Works as both an MCP Server (for AI assistants like Claude, Cursor, CodeBuddy) and a CLI tool for direct use.

[Chinese documentation (中文文档)](README_zh.md)

✨ Features

🔍 Smart Page Detection — Automatically identifies text / scanned / mixed pages and applies the optimal strategy
🌐 Multi-Engine OCR — PaddleOCR (Chinese-first) + Tesseract (English), auto-selected based on language
💧 Watermark Removal — Text watermark filtering and image watermark reduction
📊 Table Extraction — Auto-detect and convert to Markdown tables
📄 Structured Markdown — Preserves headings, paragraphs, lists, inline formatting
🔒 Encrypted PDF — Password-protected PDF support
📖 Page-Range Parsing — Process large files in batches
🖥️ Dual Mode — MCP Server for AI clients + CLI for manual use
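
To make the page-detection and engine-selection features above more concrete, here is a rough sketch of how such routing could work. It is not OmniPDF's actual implementation: the `PageStats` fields, the threshold values, and the function names `classify_page` and `pick_ocr_engine` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    """Hypothetical per-page measurements; not OmniPDF's real data model."""
    text_chars: int        # characters found in the embedded text layer
    image_coverage: float  # fraction of the page area covered by images (0.0-1.0)


def classify_page(stats: PageStats, min_chars: int = 50, min_image: float = 0.5) -> str:
    """Return "text", "scanned", or "mixed" using illustrative thresholds."""
    has_text = stats.text_chars >= min_chars
    has_image = stats.image_coverage >= min_image
    if has_text and has_image:
        return "mixed"
    if has_text:
        return "text"
    # Little or no text layer (including blank pages): treat as needing OCR.
    return "scanned"


def pick_ocr_engine(language: str) -> str:
    """Assumed mapping: Chinese goes to PaddleOCR, everything else to Tesseract."""
    return "paddleocr" if language.lower().startswith("zh") else "tesseract"
```

Under these assumptions, a page with a rich text layer and no large images is parsed directly, while a page that is mostly one large image with little extractable text is routed to OCR.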

📦 Quick Start

One-Line Install

# Install with pip
pip install -e ".[all]"

# Or with uv (recommended for speed)
uv pip install -e ".[all]"

Minimal Install (No OCR)

pip install -e .

Install Specific OCR Engine

# PaddleOCR — best for Chinese / mixed Chinese-English
pip install -e ".[paddle]"

# Tesseract — lightweight, good for English
pip install -e ".[tesseract]"

System Dependencies

Tesseract (optional, only if using Tesseract OCR):

# macOS
brew install tesseract tesseract-lang

# Ubuntu / Debian
sudo apt-get install tesseract-ocr tesseract-ocr-chi-sim

# Windows — download from https://github.com/UB-Mannheim/tesseract/wiki

> 💡 PaddleOCR requires no system dependencies — pip install is all you need.
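
After installing, it can help to confirm which OCR pieces are actually visible to Python. The sketch below uses only the standard library. The module names `paddleocr` and `pytesseract` are assumptions about what the `[paddle]` and `[tesseract]` extras install, and `available_ocr_backends` is a hypothetical helper, not an OmniPDF function.

```python
import importlib.util
import shutil


def available_ocr_backends() -> dict:
    """Report which OCR components appear to be installed in this environment."""
    return {
        # Python packages (assumed names; the extras may pull in different ones).
        "paddleocr": importlib.util.find_spec("paddleocr") is not None,
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # The Tesseract system binary installed via brew, apt-get, or the Windows installer.
        "tesseract_binary": shutil.which("tesseract") is not None,
    }


if __name__ == "__main__":
    for name, found in available_ocr_backends().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If `tesseract_binary` is reported missing while `pytesseract` is found, the system dependency step above is the likely gap.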

🔧 MCP Server Setup

Add this to your AI client's MCP configuration:

{
  "mcpServers":

Quality Signals

Last updated: 1 day ago

Security: A
