API reference

Document Intelligence API

A local Python pipeline that turns PDFs, Word documents and images into structured JSON. Six endpoints, server-sent events for progress, nothing leaves your machine.

Endpoints

POST /api/process-document

Upload a file as multipart/form-data in the form field file. Optional form field htr_mode: auto (default), force or off. Returns {"task_id": "...", "status": "queued"}.

GET /api/status/<task_id>

Current task state. Includes htr_attempted and htr_used diagnostics; error and hint on failure.
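Clients that cannot consume the event stream can poll this endpoint instead. The following is a minimal sketch, not part of the pipeline itself. It assumes the state object carries a "status" key (as the upload response does) and that the terminal values are spelled completed, failed and cancelled. The names wait_for_task and fetch_status are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # host and port as used in the Quickstart
TERMINAL_STATES = {"completed", "failed", "cancelled"}  # assumed spellings


def fetch_status(task_id, base_url=BASE_URL):
    """GET /api/status/<task_id> and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/status/{task_id}") as resp:
        return json.load(resp)


def wait_for_task(task_id, fetch=fetch_status, interval=1.0, timeout=300.0):
    """Poll until the task reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        state = fetch(task_id)
        # Assumption: the state object has a "status" key.
        if state.get("status") in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {state.get('status')!r}")
        time.sleep(interval)
```

On failure, the returned state should include the error and hint fields described above. The fetch parameter exists so the loop can be exercised without a running server.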

GET /api/stream/<task_id>

Server-sent events, emitted every 500 ms until the task is completed, failed or cancelled.
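A rough sketch of reading such a stream with the standard library only. The framing (data: lines, blank line between events) follows the general server-sent events format. That each payload is a JSON object is an assumption about this API, and iter_sse_data and iter_progress are hypothetical helper names.

```python
import json


def iter_sse_data(lines):
    """Yield the data payload of each server-sent event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (":") and other fields (event, id, retry) are ignored here.


def iter_progress(lines):
    """Decode each event payload as JSON (an assumption about this API's events)."""
    for payload in iter_sse_data(lines):
        yield json.loads(payload)
```

To use it against a live task, open /api/stream/<task_id> with urllib.request.urlopen and pass the response lines, decoded as UTF-8, to iter_progress.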

GET /api/results/<task_id>

Full JSON result. metadata.timings holds per-stage durations in seconds; metadata.extraction.pages[] gives per-page OCR confidence and HTR flags.
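As an illustration of working with the result, the sketch below totals the per-stage durations and names the slowest stage. It assumes metadata.timings is a flat mapping of stage name to seconds. The exact shape is not specified here, and summarize_timings is a hypothetical helper.

```python
def summarize_timings(result):
    """Return (total_seconds, slowest_stage) from metadata.timings."""
    # Assumption: timings is a flat mapping of stage name to seconds.
    timings = result.get("metadata", {}).get("timings", {})
    if not timings:
        return 0.0, None
    total = sum(timings.values())
    slowest = max(timings, key=timings.get)
    return total, slowest
```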

GET /api/samples

Lists sample files served from static/samples/.

GET /api/admin/htr-status

Reports installed HTR backends (Kraken / Calamari) and whether configured model files exist on disk.

Quickstart

curl -F "file=@/path/to/doc.pdf" \
     -F "htr_mode=auto" \
     http://localhost:8000/api/process-document
# → {"task_id": "abcd-1234", "status": "queued"}

curl http://localhost:8000/api/results/abcd-1234
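The same upload can be sketched in Python using only the standard library. This is one possible client, not an official one: build_multipart and submit_document are hypothetical names, and the only details taken from this reference are the file and htr_mode form fields, the endpoint path and the host and port.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path

BASE_URL = "http://localhost:8000"


def build_multipart(fields, file_field, filename, content):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: {mime}\r\n\r\n'.encode()
    )
    parts.append(content)
    parts.append(f"\r\n--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def submit_document(path, htr_mode="auto", base_url=BASE_URL):
    """POST a file to /api/process-document and return the decoded JSON response."""
    path = Path(path)
    body, content_type = build_multipart(
        {"htr_mode": htr_mode}, "file", path.name, path.read_bytes()
    )
    request = urllib.request.Request(
        f"{base_url}/api/process-document",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling submit_document("/path/to/doc.pdf") should return the same task_id and status object as the curl call above.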

Handwriting recognition

Pages with low OCR confidence (default threshold 65) are retried through a local HTR backend.

auto: OCR first, HTR fallback per page when confidence is low. Recommended for mixed documents.
force: Try HTR first on every page. Use when you know the document is handwritten.
off: Disable HTR entirely. Faster, useful when no models are installed.
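The three modes can be summarised as a per-page decision. The sketch below illustrates the documented behaviour and is not the pipeline's actual code. It assumes "low" means strictly below the threshold, and should_run_htr is a hypothetical name.

```python
HTR_MODES = ("auto", "force", "off")


def should_run_htr(mode, ocr_confidence, threshold=65):
    """Decide whether a page goes through HTR under the documented modes."""
    if mode not in HTR_MODES:
        raise ValueError(f"unknown htr_mode: {mode!r}")
    if mode == "off":
        return False
    if mode == "force":
        return True
    # auto: fall back to HTR only when OCR confidence is low.
    # Assumption: "low" means strictly below the threshold.
    return ocr_confidence < threshold
```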

Enable HTR locally:

export HTR_KRAKEN_MODEL=/path/to/kraken/model.checkpoint
# or
export HTR_CALAMARI_MODEL=/path/to/calamari/model.zip
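Before starting the server, it can help to confirm that these variables point at real files. The sketch below performs a check in the spirit of /api/admin/htr-status, but it is a standalone illustration. configured_models is a hypothetical helper and its report format is not the endpoint's.

```python
import os
from pathlib import Path

MODEL_ENV_VARS = {"kraken": "HTR_KRAKEN_MODEL", "calamari": "HTR_CALAMARI_MODEL"}


def configured_models(env=os.environ):
    """Map each backend to its configured model path and whether that file exists."""
    report = {}
    for backend, var in MODEL_ENV_VARS.items():
        value = env.get(var)
        report[backend] = {
            "path": value,
            # An unset or empty variable counts as "not configured".
            "exists": bool(value) and Path(value).is_file(),
        }
    return report
```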

Limits

· demo rate limit: 1 document per hour per IP
· max file size: MAX_FILE_SIZE_MB (default 10 MB)
· processing is local — no external AI services are called
· uploads & results are auto-deleted after 1 h
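A client can check the size limit locally before uploading. This is a small sketch that assumes 1 MB is counted as 1024 * 1024 bytes (the server may count differently). check_upload_size is a hypothetical helper, and max_mb should match the server's MAX_FILE_SIZE_MB setting.

```python
import os


def check_upload_size(path, max_mb=10):
    """Return the file size in bytes, or raise ValueError if it exceeds max_mb."""
    # Assumption: 1 MB is counted as 1024 * 1024 bytes.
    size = os.path.getsize(path)
    limit = max_mb * 1024 * 1024
    if size > limit:
        raise ValueError(f"{path} is {size} bytes, above the {max_mb} MB limit")
    return size
```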

Contact

Interested in running this internally or deploying it in production? Reach out at guch79@gmail.com.