HVACvisor (Beta)

Upload an image to extract text. Supports various document types and handwriting recognition.

Model Sizes:

Tiny: Fastest, lower accuracy (512×512)
Small: Fast, good accuracy (640×640)
Base: Balanced performance (1024×1024)
Large: Best accuracy, slower (1280×1280)
Gundam (Recommended): Optimized for documents (base size 1024, image size 640, crop mode enabled)
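
As a rough illustration, the presets above could be held in a small lookup table. This is a minimal sketch, not the app's actual code: the names `SizePreset`, `SIZE_PRESETS`, and `get_preset` are hypothetical, and so is the assumption that the single-resolution presets use the same value for both base and image size with crop mode off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizePreset:
    base_size: int   # side length of the base view, in pixels
    image_size: int  # side length of the image input, in pixels
    crop_mode: bool  # whether crop mode is enabled


# Resolutions are taken from the list above. Field names are illustrative,
# and base_size == image_size for the non-Gundam presets is an assumption.
SIZE_PRESETS = {
    "Tiny": SizePreset(512, 512, False),
    "Small": SizePreset(640, 640, False),
    "Base": SizePreset(1024, 1024, False),
    "Large": SizePreset(1280, 1280, False),
    "Gundam": SizePreset(1024, 640, True),
}

DEFAULT_SIZE = "Gundam"  # the list marks Gundam as recommended


def get_preset(name: str = DEFAULT_SIZE) -> SizePreset:
    """Return the preset for a model size, or raise ValueError if unknown."""
    try:
        return SIZE_PRESETS[name]
    except KeyError:
        valid = ", ".join(SIZE_PRESETS)
        raise ValueError(f"Unknown model size {name!r}; expected one of: {valid}") from None
```

Calling `get_preset()` with no argument would return the Gundam settings, which matches the recommendation in the list.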

Task Types:

Convert to Markdown: Converts the document to structured Markdown, preserving layout
OCR this image: Standard OCR with grounding
Free OCR: Simple text extraction without layout
Parse the figure: Specialized for charts, diagrams, and figures
Describe this image in detail: Generates detailed image description
Locate item: Finds a specific referenced item in the image (requires custom input)
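
Only "Locate item" needs extra text from the user, so a front end might check the request before running it. The sketch below shows one way to do that. The names `TASKS` and `validate_request` are hypothetical, and ignoring custom input for the other tasks is an assumption rather than documented behavior.

```python
from typing import Optional

# Task labels from the list above, mapped to whether custom input is required.
TASKS = {
    "Convert to Markdown": False,
    "OCR this image": False,
    "Free OCR": False,
    "Parse the figure": False,
    "Describe this image in detail": False,
    "Locate item": True,
}


def validate_request(task: str, custom_input: Optional[str] = None) -> Optional[str]:
    """Return the cleaned custom input for a task, or None if the task takes none.

    Raises ValueError for an unknown task, or when a task that requires
    custom input is given empty or missing text.
    """
    if task not in TASKS:
        raise ValueError(f"Unknown task type: {task!r}")
    if not TASKS[task]:
        # Assumption: tasks that do not need custom input simply ignore it.
        return None
    text = (custom_input or "").strip()
    if not text:
        raise ValueError(f"Task {task!r} requires custom input")
    return text
```

For example, `validate_request("Locate item", "thermostat")` returns the text to look for, while `validate_request("Locate item")` raises an error because the item to locate is missing.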

Upload Image

Model Size

Task Type: Select the type of processing to perform

Enable Evaluation Mode: When enabled, returns plain text (may be faster). Disable to also get an annotated image and Markdown.
