lucientong/OmniPDF
Powerful PDF parser
Platform-specific configuration:
{
  "mcpServers": {
    "OmniPDF": {
      "command": "npx",
      "args": [
        "-y",
        "OmniPDF"
      ]
    }
  }
}

Add the config above to .claude/settings.json under the mcpServers key.
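If the settings file already holds other entries, hand-editing the JSON risks clobbering them. The following is a minimal sketch, not part of OmniPDF, of merging the entry above into an existing file. The helper names `add_mcp_server` and `update_settings_file` are hypothetical, and the sketch assumes the settings file is plain JSON with an optional top-level `mcpServers` object.

```python
import json
from pathlib import Path

# Server entry copied from the configuration shown above.
OMNIPDF_ENTRY = {"command": "npx", "args": ["-y", "OmniPDF"]}


def add_mcp_server(settings: dict, name: str, entry: dict) -> dict:
    """Return a copy of settings with entry registered under mcpServers[name]."""
    merged = dict(settings)
    # Copy the existing server map so other registered servers are preserved.
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Read the settings file if it exists, merge the entry, and write it back."""
    current = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    updated = add_mcp_server(current, "OmniPDF", OMNIPDF_ENTRY)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(updated, indent=2) + "\n", encoding="utf-8")
```

Calling `update_settings_file(Path(".claude/settings.json"))` from the project root would apply the merge; keys other than `mcpServers` are left as they were.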
AI-powered PDF parsing tool with OCR, watermark removal, and table extraction. Outputs structured Markdown.
Works as both an MCP Server (for AI assistants like Claude, Cursor, CodeBuddy) and a CLI tool for direct use.
[Chinese documentation](README_zh.md)
# Install with pip
pip install -e ".[all]"
# Or with uv (recommended for speed)
uv pip install -e ".[all]"

pip install -e .

# PaddleOCR: best for Chinese / mixed Chinese-English
pip install -e ".[paddle]"
# Tesseract — lightweight, good for English
pip install -e ".[tesseract]"

Tesseract (optional, only if using Tesseract OCR):
# macOS
brew install tesseract tesseract-lang
# Ubuntu / Debian
sudo apt-get install tesseract-ocr tesseract-ocr-chi-sim
# Windows: download from https://github.com/UB-Mannheim/tesseract/wiki

> 💡 PaddleOCR requires no system dependencies; pip install is all you need.
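After installing, it can help to confirm which OCR backend is actually reachable. The sketch below is an independent environment check, not OmniPDF's own detection logic (which this README does not describe). It assumes PaddleOCR is importable as `paddleocr` and that Tesseract is usable once the `tesseract` binary is on `PATH`; the function name `available_ocr_backends` is hypothetical.

```python
import importlib.util
import shutil


def available_ocr_backends() -> dict:
    """Report which OCR backends look usable in the current environment."""
    return {
        # PaddleOCR is pip-only; assumed import name is "paddleocr".
        "paddle": importlib.util.find_spec("paddleocr") is not None,
        # Tesseract depends on the system binary being on PATH.
        "tesseract": shutil.which("tesseract") is not None,
    }


if __name__ == "__main__":
    for name, found in available_ocr_backends().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If `tesseract` shows as missing after a pip install, the system package from the commands above is the likely gap.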
Add this to your AI client's MCP configuration:
{
"mcpServers":