Architecture Overview
This document provides a high-level overview of the IRIS IDP (Intelligent Document Processing) platform architecture and its 6-phase processing pipeline. OCR is a core component used in phases 5-6 for specialized text extraction.
System Architecture
6-Phase Pipeline
- Phase 1: Preprocessing (Image Processor)
  - Unwarping, denoising, rotation, cropping
  - OpenCV + FastAPI
- Phase 2: Embeddings & Clustering (ML Embeddings)
  - Vision Transformer embeddings, K-means clustering with silhouette scoring
  - PyTorch + sklearn
- Phases 3-4: Classification (ML Classifier)
  - Fine-tuned EfficientNet trained on supervised labels
  - PyTorch + FastAPI
- Phases 5-6: OCR & Extraction (OCR Extractor)
  - PaddleOCR models, regex-based field extraction
  - PaddleOCR + FastAPI
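The phase-to-service layout above can be written down as a small lookup table. This is only an illustrative sketch: the `Stage` dataclass, the `PIPELINE` tuple, and the `service_for_phase` helper are hypothetical names introduced here, not part of the IRIS codebase. The service identifiers reuse the keys from the service map in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One pipeline stage: the phases it covers and the service that runs it."""
    phases: tuple[int, ...]
    service: str
    purpose: str


# Hypothetical table mirroring the 6-phase list in this document.
PIPELINE = (
    Stage((1,), "image_processor", "preprocessing"),
    Stage((2,), "ml_embeddings", "embeddings and clustering"),
    Stage((3, 4), "ml_classifier", "classification"),
    Stage((5, 6), "ocr_extractor", "OCR and field extraction"),
)


def service_for_phase(phase: int) -> str:
    """Return the name of the service responsible for a given phase number."""
    for stage in PIPELINE:
        if phase in stage.phases:
            return stage.service
    raise ValueError(f"unknown phase: {phase}")
```

For example, `service_for_phase(4)` resolves to the classifier service, and any phase outside 1-6 raises `ValueError`.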
Service Communication
SERVICES = {
    "image_processor": "http://localhost:8001",
    "ml_embeddings": "http://localhost:8002",
    "ml_classifier": "http://localhost:8003",
    "ocr_extractor": "http://localhost:8004",
}
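As a rough illustration of how a caller might use this service map, the sketch below sends a JSON payload through each service in pipeline order. Several things here are assumptions rather than documented behavior: the `/process` endpoint path, the JSON request and response shape, and the idea that each service's response is fed directly to the next one. The `http_post_json` and `run_pipeline` helpers are hypothetical names. The transport is injectable so the flow can be exercised without running services.

```python
import json
import urllib.request
from typing import Callable

SERVICES = {
    "image_processor": "http://localhost:8001",
    "ml_embeddings": "http://localhost:8002",
    "ml_classifier": "http://localhost:8003",
    "ocr_extractor": "http://localhost:8004",
}

# Order follows the pipeline description; the endpoint path is an assumption.
PIPELINE_ORDER = ["image_processor", "ml_embeddings", "ml_classifier", "ocr_extractor"]
ENDPOINT = "/process"


def http_post_json(url: str, payload: dict, timeout: float = 10.0) -> dict:
    """POST a JSON body and decode the JSON response, using only the stdlib."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def run_pipeline(
    payload: dict,
    post: Callable[[str, dict], dict] = http_post_json,
) -> dict:
    """Send the payload through each service in order, feeding each response forward."""
    for name in PIPELINE_ORDER:
        payload = post(SERVICES[name] + ENDPOINT, payload)
    return payload
```

Passing a stub for `post` makes it possible to check the call order in a unit test. In a real deployment the base URLs would more likely come from configuration than from a hard-coded dictionary.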
Non-Functional Concerns
- Scalability: Horizontal scaling per microservice, GPU optional
- Performance: Cached OCR instances, async IO, batching
- Reliability: Health checks, structured logging, timeouts
- Security: JWT, CORS, private registry, secrets management
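The "cached OCR instances" item can be illustrated with a memoized factory, so an expensive model is loaded once per configuration and then reused. This is a minimal sketch under stated assumptions: `FakeOCREngine` is a stand-in class invented for the example and does not reflect PaddleOCR's actual API, and `get_ocr_engine` is a hypothetical helper name.

```python
from functools import lru_cache


class FakeOCREngine:
    """Stand-in for an OCR model that is expensive to load (hypothetical)."""

    instances_created = 0

    def __init__(self, lang: str) -> None:
        # Count constructions so the effect of caching is observable.
        FakeOCREngine.instances_created += 1
        self.lang = lang


@lru_cache(maxsize=4)
def get_ocr_engine(lang: str) -> FakeOCREngine:
    """Return a cached engine for the language, creating it on first use."""
    return FakeOCREngine(lang)
```

Repeated calls with the same `lang` return the same object, so only the first request for a language pays the load cost. The `maxsize` bound keeps the number of resident models limited.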
For installation and requirements, see "What is IRIS" and "Requirements".