Agentic routing

Docira does not pick one model and hope it works. For every page, the router reads the page, decides which tier and provider should handle it, verifies the output against an independent OCR baseline, and records the entire decision so you can audit it after the fact.

The six-stage pipeline

Each request passes through six stages. The router covers stages 2–4 and is what people usually mean when they say “Docira”.

  1. Ingestion. Document bytes are normalized and split into pages. PDFs are rasterised to image tiles. Office docs round-trip through a headless converter.
  2. Classification. Each page is scored on a complexity vector: table density, column count, handwriting, equations, scan quality, language, and document-type hint. The output is a single 0–1 complexity score plus the feature map.
  3. Tier selection. The score maps to one of three tiers: Fast (≤0.30), Pro (0.30–0.80), Expert (>0.80). Thresholds are tunable via the PW_FAST_TIER_MAX_SCORE and PW_PRO_TIER_MAX_SCORE settings.
  4. Provider selection. Within the tier, the router picks the provider with the highest verified accuracy on this feature mix, subject to the per-provider circuit-breaker state.
  5. Verification. The VLM output is compared to a Tesseract OCR baseline. If the overlap or structural score is below threshold, the page is rerouted to a higher tier and the original output is kept as a fallback.
  6. Delivery. Pages are stitched back into a single response with Markdown, JSON, bounding boxes, the routing trace, and the metering record.

The page classifier

The classifier runs locally on a small SmolDocling-class model (with a heuristic fallback for cold-start). It does not call any upstream provider. Its job is to be fast (typically <50 ms per page) and to surface the features the router actually decides on.

Features the classifier surfaces

Tier selection

Tiers exist because clean text on a single-column PDF does not need a frontier-class model — and equations on a scanned chemistry exam should not be answered by a small one. The mapping is deterministic and tunable.

Tier definitions and routing

| Tier | Score range | Typical pages | Cost band |
|---|---|---|---|
| fast | ≤ 0.30 | Single-column typed text, simple lists | $0.001 / page |
| pro | 0.30 – 0.80 | Multi-column, tables, charts, scanned but readable | $0.005 – $0.012 / page |
| expert | > 0.80 | Equations, dense tables, handwriting, RTL, low-DPI scans | $0.025 – $0.060 / page |
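The score-to-tier mapping can be sketched in a few lines of Python. This is an illustration of the documented thresholds, not Docira's implementation: the function names are hypothetical, and reading the two settings from environment variables is an assumption (the docs only say they are settings).

```python
import os

# Defaults mirror the documented thresholds.
DEFAULT_FAST_MAX = 0.30
DEFAULT_PRO_MAX = 0.80


def load_thresholds(env=None):
    """Read both thresholds, falling back to the documented defaults.

    Assumption: the PW_* settings are exposed as environment variables.
    """
    env = os.environ if env is None else env
    fast_max = float(env.get("PW_FAST_TIER_MAX_SCORE", DEFAULT_FAST_MAX))
    pro_max = float(env.get("PW_PRO_TIER_MAX_SCORE", DEFAULT_PRO_MAX))
    if not 0.0 <= fast_max <= pro_max <= 1.0:
        raise ValueError("expected 0 <= fast_max <= pro_max <= 1")
    return fast_max, pro_max


def select_tier(score, fast_max=DEFAULT_FAST_MAX, pro_max=DEFAULT_PRO_MAX):
    """Map a 0-1 complexity score to a tier name."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("complexity score must be in [0, 1]")
    if score <= fast_max:
        return "fast"
    if score <= pro_max:
        return "pro"
    return "expert"


if __name__ == "__main__":
    fast_max, pro_max = load_thresholds()
    print(select_tier(0.71, fast_max, pro_max))
```

Because the boundaries are inclusive on the lower tier, a score of exactly 0.30 stays on fast and exactly 0.80 stays on pro.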

Provider selection within a tier

Each tier has a pool of qualified providers. Within the pool the router picks the one with the highest verified accuracy on the page’s feature mix — that is, the model that has historically performed best on this combination of layout, language, and content type. Two things can override that pick:
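As a rough sketch of the base rule (highest verified accuracy among providers whose circuit breaker permits traffic), the selection might look like the following. The `Provider` type, its fields, and the accuracy figures are hypothetical, and treating only a "closed" breaker as eligible is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    verified_accuracy: float        # hypothetical: accuracy on this page's feature mix
    breaker_state: str = "closed"   # "closed" = healthy, "open" = tripped


def pick_provider(pool):
    """Return the most accurate provider with a closed breaker, or None."""
    # Assumption: only providers with a closed circuit breaker are eligible.
    available = [p for p in pool if p.breaker_state == "closed"]
    if not available:
        return None
    return max(available, key=lambda p: p.verified_accuracy)


if __name__ == "__main__":
    # Illustrative numbers only, not measured accuracies.
    pool = [
        Provider("provider-a", 0.95, breaker_state="open"),
        Provider("provider-b", 0.93),
        Provider("provider-c", 0.90),
    ]
    print(pick_provider(pool).name)  # provider-b: provider-a is tripped
```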

Verification and re-routing

Before returning a page’s output, Docira runs an independent Tesseract OCR pass and compares it to the VLM’s text. The comparison produces three signals:

If any score falls below its configured threshold (PW_GRADE_*_MIN), the page is rerouted to the next tier up and the second result is returned. The original result is kept in the trace as reroute_history so you can compare the two.
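The accept-or-reroute decision can be illustrated with a small sketch. Everything here is an assumption made for the example: the token-overlap metric is one plausible way to compare two texts rather than Docira's actual scoring, the score names are borrowed from the routing trace, the minimum values are made up, and accepting a failing page at the top tier is a guess about behaviour the docs do not describe.

```python
import re
from collections import Counter

TIER_ORDER = ["fast", "pro", "expert"]


def tokenize(text):
    """Lowercase word tokens; punctuation and whitespace are ignored."""
    return re.findall(r"\w+", text.lower())


def ocr_overlap(vlm_text, ocr_text):
    """Fraction of OCR baseline tokens that also occur in the VLM output."""
    ocr_tokens = Counter(tokenize(ocr_text))
    if not ocr_tokens:
        return 1.0  # empty baseline: nothing to disagree with
    vlm_tokens = Counter(tokenize(vlm_text))
    shared = sum((ocr_tokens & vlm_tokens).values())  # multiset intersection
    return shared / sum(ocr_tokens.values())


def verdict(scores, minimums, tier):
    """Return ("accept", tier) or ("reroute", next_tier_up)."""
    failed = [name for name, floor in minimums.items()
              if scores.get(name, 0.0) < floor]
    idx = TIER_ORDER.index(tier)
    if failed and idx < len(TIER_ORDER) - 1:
        return "reroute", TIER_ORDER[idx + 1]
    # Assumption: at the top tier there is nowhere higher to go, so accept.
    return "accept", tier


if __name__ == "__main__":
    minimums = {"ocr_baseline_overlap": 0.80, "table_structure_score": 0.80}
    scores = {"ocr_baseline_overlap": 0.91, "table_structure_score": 0.94}
    print(verdict(scores, minimums, "pro"))
```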

The routing trace

Every page in the response carries a complete routing trace. This is the differentiator: nothing the router does is hidden. You can audit per page why a tier was picked, which provider answered, what the verification score was, and whether the result was rerouted.

{
  "trace_id": "rt_01HZX7K8M2",
  "page_index": 4,
  "classification": {
    "complexity_score": 0.71,
    "features": {
      "has_tables": true,
      "has_equations": false,
      "has_handwriting": false,
      "is_scanned": false,
      "column_count": 2,
      "language": "en",
      "doc_type_hint": "academic_paper"
    },
    "duration_ms": 38
  },
  "tier_selection": {
    "tier": "pro",
    "reason": "complexity_score in (0.30, 0.80]",
    "thresholds": { "fast_max": 0.30, "pro_max": 0.80 }
  },
  "provider_selection": {
    "candidates": ["anthropic/claude-sonnet-4-6", "openai/gpt-4-1-mini", "google/gemini-2-5-pro"],
    "chosen": "anthropic/claude-sonnet-4-6",
    "reason": "highest verified accuracy on 'tables + 2-column' benchmark",
    "circuit_breaker_state": "closed"
  },
  "vlm_call": { "duration_ms": 1240, "tokens_in": 1842, "tokens_out": 967, "cost_usd": 0.0073 },
  "verification": {
    "ocr_baseline_overlap": 0.91,
    "table_structure_score": 0.94,
    "verdict": "accept",
    "reroute_attempted": false
  },
  "total_duration_ms": 1314
}
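To audit a page programmatically, you can walk the trace fields shown above. The helper below is a hypothetical sketch that works on a single trace object already extracted from the response; where traces sit inside the full response is not specified here, so that step is left out. The sample is a trimmed copy of the trace above.

```python
import json

# Trimmed copy of the example trace; only the fields the helper reads.
SAMPLE_TRACE = """
{
  "page_index": 4,
  "classification": {"complexity_score": 0.71},
  "tier_selection": {"tier": "pro"},
  "provider_selection": {"chosen": "anthropic/claude-sonnet-4-6"},
  "verification": {
    "ocr_baseline_overlap": 0.91,
    "verdict": "accept",
    "reroute_attempted": false
  }
}
"""


def summarize_trace(trace):
    """Build a one-line audit summary from a routing trace dict."""
    tier = trace["tier_selection"]["tier"]
    score = trace["classification"]["complexity_score"]
    provider = trace["provider_selection"]["chosen"]
    check = trace["verification"]
    rerouted = "rerouted" if check["reroute_attempted"] else "not rerouted"
    return (
        f"page {trace['page_index']}: score {score:.2f} -> {tier} tier, "
        f"{provider}, verdict {check['verdict']} "
        f"(overlap {check['ocr_baseline_overlap']:.2f}), {rerouted}"
    )


if __name__ == "__main__":
    print(summarize_trace(json.loads(SAMPLE_TRACE)))
    # page 4: score 0.71 -> pro tier, anthropic/claude-sonnet-4-6,
    # verdict accept (overlap 0.91), not rerouted   (printed on one line)
```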
