🔍 VLMs OCR Playground

Compare and use multiple state-of-the-art Vision-Language OCR models:

DeepSeek-OCR-2: Document-to-Markdown conversion with layout detection
GLM-OCR: Specialized recognition for text, formulas, and tables
PaddleOCR-VL-1.5: Full-page document parsing with layout detection and element-level recognition

Select a model below to get started!

⚠️ Current Deployment Status:

🚀 DeepSeek-OCR-2: Requires a GPU; temporarily unavailable on this CPU-only deployment

🔮 GLM-OCR: Available ✅

📄 PaddleOCR-VL-1.5: Available ✅ (requires API key)
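
The status list above can be read as a small availability rule: a model is usable only if the deployment meets its requirements. The following is a minimal sketch of that rule, not the playground's actual code. The names `ModelStatus`, `MODELS`, and `available_models` are hypothetical, and treating "requires API key" as "not selectable without a key" is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStatus:
    """Hypothetical record of what a model needs in order to run."""
    name: str
    needs_gpu: bool = False
    needs_api_key: bool = False


# Requirements taken from the status list; the structure itself is illustrative.
MODELS = [
    ModelStatus("DeepSeek-OCR-2", needs_gpu=True),
    ModelStatus("GLM-OCR"),
    ModelStatus("PaddleOCR-VL-1.5", needs_api_key=True),
]


def available_models(has_gpu: bool, has_api_key: bool) -> list[str]:
    """Return the names of models whose requirements the deployment satisfies."""
    return [
        m.name
        for m in MODELS
        if (has_gpu or not m.needs_gpu) and (has_api_key or not m.needs_api_key)
    ]


# A CPU-only deployment where the user has supplied an API key.
print(available_models(has_gpu=False, has_api_key=True))
# → ['GLM-OCR', 'PaddleOCR-VL-1.5']
```

Without an API key, the same CPU-only deployment would offer only GLM-OCR under this sketch.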
