Docira

HTTP recipes and SDK roadmap

First-party Python and Node SDKs are planned for Q3 2026. Until then, the HTTP API is the full interface: any client that can POST multipart form data or JSON can use Docira.

SDK status

Client         Status     Available now
docira-python  Q3 2026    httpx recipe below
docira-node    Q3 2026    fetch recipe below
cURL           Available  All examples below

Until the SDKs ship, the recipes below are copy-paste ready. The full endpoint specification is in the API reference.

File upload

POST /v1/parse/upload — multipart. Accepts PDF, PNG, JPEG, TIFF, HEIC, WEBP, DOCX, PPTX, XLSX up to 50 MB.

curl -X POST https://api.docira.io/v1/parse/upload \
  -H "X-API-Key: $DOCIRA_API_KEY" \
  -F "file=@report.pdf" \
  -F "operation_mode=ocr"
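
The format list and size limit above can be checked locally before any bytes are sent. The following is a minimal sketch, not part of the Docira API: the helper names `upload_problem` and `check_file` are hypothetical, the extension spellings are assumed from the format names, and "50 MB" is assumed to mean 50 * 1024 * 1024 bytes.

```python
from pathlib import Path
from typing import Optional

# Assumption: extension spellings for the formats listed for /v1/parse/upload.
ALLOWED_EXTENSIONS = {
    ".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff",
    ".heic", ".webp", ".docx", ".pptx", ".xlsx",
}
# Assumption: "50 MB" is read as 50 * 1024 * 1024 bytes; the server may count differently.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def upload_problem(filename: str, size_bytes: int) -> Optional[str]:
    """Return a reason the upload would likely be rejected, or None if it looks fine."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        return f"unsupported file type: {suffix or '(no extension)'}"
    if size_bytes > MAX_UPLOAD_BYTES:
        return f"file is {size_bytes} bytes, over the {MAX_UPLOAD_BYTES}-byte limit"
    return None


def check_file(path: str) -> Optional[str]:
    """Run the same check against a file on disk."""
    p = Path(path)
    return upload_problem(p.name, p.stat().st_size)
```

A check like this only saves a round trip; the server remains the authority, so a 413 response should still be handled.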

URL parse

POST /v1/parse — JSON body. Docira fetches the document from the URL. The URL must be publicly reachable over HTTPS.

curl -X POST https://api.docira.io/v1/parse \
  -H "X-API-Key: $DOCIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file_url": "https://example.com/report.pdf",
    "operation_mode": "ocr"
  }'
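
For Python callers, the same JSON body can be built with a small helper that rejects non-HTTPS URLs before the request is made. This is a sketch under stated assumptions: `build_parse_payload` is a hypothetical name, and the check covers only the URL scheme and host, since public reachability cannot be verified from the client.

```python
from urllib.parse import urlparse


def build_parse_payload(file_url: str, operation_mode: str = "ocr") -> dict:
    """Build the JSON body for POST /v1/parse, rejecting non-HTTPS URLs early."""
    parts = urlparse(file_url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"file_url must be an absolute HTTPS URL, got {file_url!r}")
    return {"file_url": file_url, "operation_mode": operation_mode}
```

The returned dict can be passed as `json=` to `httpx.post("https://api.docira.io/v1/parse", ...)` together with the `X-API-Key` header; httpx sets the `Content-Type: application/json` header itself in that case.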

Batch submission and polling

POST /v1/parse/batch accepts up to 100 documents and returns a batch_id immediately. Poll GET /v1/batch/{batch_id} or pass a webhook_url to get notified on completion.

# Submit
curl -X POST https://api.docira.io/v1/parse/batch \
  -H "X-API-Key: $DOCIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [
      {"file_url": "https://example.com/doc1.pdf", "operation_mode": "ocr"},
      {"file_url": "https://example.com/doc2.pdf", "operation_mode": "ocr"}
    ],
    "webhook_url": "https://your.app/webhook"
  }'

# Poll status (use batch_id from the response above)
curl https://api.docira.io/v1/batch/batch_01JEXAMPLE \
  -H "X-API-Key: $DOCIRA_API_KEY"
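
Two pieces of client logic follow from the batch description: splitting a long document list into requests of at most 100, and polling until the batch finishes. The sketch below is illustrative only. The helper names are hypothetical, and because the shape of the batch status response is not shown on this page, the completion test is supplied by the caller rather than hard-coded.

```python
import time
from typing import Callable, Iterator, List

MAX_BATCH_SIZE = 100  # limit stated for POST /v1/parse/batch


def chunk_documents(documents: List[dict], size: int = MAX_BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` documents, one per batch request."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(documents), size):
        yield documents[start:start + size]


def poll_until(
    fetch: Callable[[], dict],
    is_done: Callable[[dict], bool],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() every `interval` seconds until is_done(result) is true or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        # Give up if the next wait would run past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("batch did not finish before the timeout")
        sleep(interval)
```

In practice `fetch` would wrap a GET to /v1/batch/{batch_id} and return the decoded JSON, and `is_done` would inspect whichever field the API reference defines for batch state. Passing `sleep` as a parameter keeps the loop testable without real delays.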

Webhook signatures and retry behaviour are documented on the Webhooks page.

Streaming SSE

POST /v1/parse/stream takes the same multipart upload as /v1/parse/upload, but Docira streams pipeline events back as Server-Sent Events. Each event is a single line of JSON prefixed with data:. The final event is result.final.

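import json
import os

import httpx

with open("report.pdf", "rb") as f, httpx.stream(
    "POST",
    "https://api.docira.io/v1/parse/stream",
    headers={"X-API-Key": os.environ["DOCIRA_API_KEY"]},
    files={"file": ("report.pdf", f, "application/pdf")},
    timeout=None,  # no client-side timeout: the stream stays open while the document is processed
) as r:
    r.raise_for_status()  # fail on 4xx/5xx instead of silently skipping an error body
    for line in r.iter_lines():
        # Only "data:" lines carry events; skip blank separators and other SSE fields.
        if not line.startswith("data:"):
            continue
        evt = json.loads(line[len("data:"):].strip())
        print(evt.get("type"), evt)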
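
The line handling in the recipe above can be factored into a generator that works on any iterable of lines, which makes it testable without a network connection. This is a sketch, not the planned SDK interface: `iter_events` is a hypothetical name, and it assumes, as the recipe does, that the event name is carried in a "type" field.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded events from SSE lines, stopping after result.final."""
    for line in lines:
        if not line.startswith("data:"):
            continue  # blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        event = json.loads(payload)
        yield event
        # Assumption: the final event is identified by type == "result.final".
        if event.get("type") == "result.final":
            return
```

With httpx, `iter_events(r.iter_lines())` can replace the inner loop of the recipe; stopping at result.final lets the caller leave the `with` block without waiting for the server to close the connection.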

Error handling

All errors return JSON {"detail": "..."}. Key status codes:

Code  When                          Action
401   Missing or invalid X-API-Key  Check the header and key prefix
402   Plan limit reached            Upgrade or wait for quota reset
413   File > 50 MB                  Compress or split the file
429   Rate limit hit                Wait for Retry-After header value (seconds)
500   Server error                  Check request_id in detail, contact support
503   All providers circuit-broken  Retry with exponential backoff

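import os
import time

import httpx


def parse_with_retry(path: str, max_attempts: int = 3) -> dict:
    """Upload a file for OCR, retrying on 429 and 5xx responses."""
    for attempt in range(max_attempts):
        try:
            with open(path, "rb") as f:
                r = httpx.post(
                    "https://api.docira.io/v1/parse/upload",
                    headers={"X-API-Key": os.environ["DOCIRA_API_KEY"]},
                    # Send only the file name, not the local path; adjust the
                    # content type when uploading something other than a PDF.
                    files={"file": (os.path.basename(path), f, "application/pdf")},
                    data={"operation_mode": "ocr"},
                    timeout=120.0,
                )
            if r.status_code == 429:
                # Rate limited: wait as long as the server asks, unless no attempts remain.
                if attempt < max_attempts - 1:
                    time.sleep(int(r.headers.get("Retry-After", 10)))
                continue
            r.raise_for_status()
            return r.json()
        except httpx.HTTPStatusError as exc:
            # Retry 5xx with exponential backoff (1 s, 2 s, ...); re-raise 4xx and the final failure.
            if exc.response.status_code >= 500 and attempt < max_attempts - 1:
                time.sleep(2 ** attempt)
                continue
            raise
    # Reached only when every attempt was rate limited.
    raise RuntimeError("Max retries exceeded")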
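
The retry decision itself can be separated from the HTTP call so it is easy to test and reuse. The sketch below follows the table literally: 429 honours Retry-After, 503 backs off exponentially, and every other status is treated as not retryable (unlike the recipe above, which retries all 5xx responses). The function name, the `base` and `cap` parameters, and the use of full jitter are illustrative choices, not documented Docira behaviour.

```python
import random
from typing import Mapping, Optional


def retry_delay(
    status: int,
    headers: Mapping[str, str],
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
) -> Optional[float]:
    """Return the seconds to wait before retrying, or None if the status is not retryable."""
    if status == 429:
        value = headers.get("Retry-After", "").strip()
        if value.isdigit():
            return float(value)  # the server said exactly how long to wait
        # Missing or non-numeric header: fall through to backoff.
    elif status != 503:
        return None
    # Exponential backoff with full jitter: a random wait up to min(cap, base * 2**attempt).
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

A plain dict lookup is case-sensitive, so this expects the key spelled "Retry-After"; `httpx.Response.headers` is case-insensitive and can be passed directly.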

When SDKs ship

The first-party SDKs (Python and Node) planned for Q3 2026 will wrap:

  • Auth — API key injection, environment variable detection
  • Batch — automatic polling loop with configurable interval and timeout
  • Streaming — async iterator over SSE events with typed event objects
  • Webhooks — HMAC signature verification helper
  • Retries — exponential backoff with jitter, respects Retry-After
  • Pydantic v2 models for ParseResponse, VLMResult, UsageSummary, and all sub-objects
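
Until the SDKs exist, the auth item in the list above is simple to approximate by hand. The following sketch shows one way to resolve the key from an explicit argument or the DOCIRA_API_KEY environment variable used throughout the recipes; `auth_headers` is a hypothetical helper and says nothing about how the eventual SDK will expose this.

```python
import os
from typing import Mapping, Optional


def auth_headers(
    api_key: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Return the X-API-Key header, preferring an explicit key over DOCIRA_API_KEY."""
    key = api_key or env.get("DOCIRA_API_KEY")
    if not key:
        raise RuntimeError("no API key: pass api_key or set DOCIRA_API_KEY")
    return {"X-API-Key": key}
```

The result can be passed as `headers=` to any of the httpx calls on this page. Taking `env` as a parameter keeps the lookup testable without modifying the real environment.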

Follow github.com/kimhons/Docira for release announcements.
