Architecture Overview

This document provides a high-level overview of the IRIS IDP (Intelligent Document Processing) platform architecture and its 6-phase processing pipeline. OCR is a core component used in phases 5-6 for specialized text extraction.

System Architecture

6-Phase Pipeline

  • Phase 1: Preprocessing (Image Processor)
    • Unwarping, denoising, rotation, cropping
    • OpenCV + FastAPI
  • Phase 2: Embeddings & Clustering (ML Embeddings)
    • Vision Transformer embeddings, K-means + silhouette
    • PyTorch + sklearn
  • Phases 3-4: Classification (ML Classifier)
    • Fine-tuned EfficientNet, supervised labels
    • PyTorch + FastAPI
  • Phases 5-6: OCR & Extraction (OCR Extractor)
    • PaddleOCR models, regex-based field extraction
    • PaddleOCR + FastAPI
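
The regex-based field extraction mentioned for phases 5-6 could look roughly like the sketch below. The field names and patterns are hypothetical, since this overview does not list the ones the OCR Extractor actually uses; the sketch only shows the general idea of running one compiled pattern per field over the OCR output.

```python
import re

# Hypothetical field patterns; the real IRIS patterns are not given in this overview.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:No\.?|#)\s*[:\-]?\s*([A-Z0-9\-]+)", re.IGNORECASE
    ),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
    "total": re.compile(r"Total\s*[:\-]?\s*\$?\s*(\d+(?:\.\d{2})?)", re.IGNORECASE),
}


def extract_fields(ocr_text: str) -> dict:
    """Return the first match for each field, or None when a field is absent."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        fields[name] = match.group(1) if match else None
    return fields


# Assumed sample of recognized text, as phase 5 might produce it.
sample = "Invoice No: INV-2024-001\nDate: 2024-03-15\nTotal: $1250.00"
print(extract_fields(sample))
```

Returning None for missing fields, rather than raising, lets a caller decide whether an incomplete document should be rejected or passed on for manual review.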

Service Communication

# Base URL of each microservice, keyed by service name
SERVICES = {
    "image_processor": "http://localhost:8001",
    "ml_embeddings": "http://localhost:8002",
    "ml_classifier": "http://localhost:8003",
    "ocr_extractor": "http://localhost:8004",
}
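
As a minimal sketch of how a client might use this registry, the snippet below builds endpoint URLs and probes each service with a timeout. The `/health` route, the `PIPELINE_ORDER` list, and both helper functions are assumptions made for illustration; the overview only states that health checks and timeouts exist, not how they are exposed.

```python
import urllib.error
import urllib.request

SERVICES = {
    "image_processor": "http://localhost:8001",
    "ml_embeddings": "http://localhost:8002",
    "ml_classifier": "http://localhost:8003",
    "ocr_extractor": "http://localhost:8004",
}

# Order follows the phase list; HEALTH_PATH is an assumed route, not a documented one.
PIPELINE_ORDER = ["image_processor", "ml_embeddings", "ml_classifier", "ocr_extractor"]
HEALTH_PATH = "/health"


def build_url(service: str, path: str) -> str:
    """Join a service's base URL with a path, avoiding doubled slashes."""
    base = SERVICES[service].rstrip("/")
    return f"{base}/{path.lstrip('/')}"


def is_healthy(service: str, timeout: float = 2.0) -> bool:
    """Return True only if the assumed health endpoint answers with HTTP 200."""
    try:
        url = build_url(service, HEALTH_PATH)
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Refused connections, timeouts, and HTTP error statuses count as unhealthy.
        return False
```

Calling `is_healthy(name)` for each entry in `PIPELINE_ORDER` before submitting a document would show which stage is unavailable without waiting on a stalled request.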

Non-Functional Concerns

  • Scalability: Horizontal scaling per microservice, GPU optional
  • Performance: Cached OCR instances, async IO, batching
  • Reliability: Health checks, structured logging, timeouts
  • Security: JWT, CORS, private registry, secrets management
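
The "cached OCR instances" point can be illustrated with a small sketch. `StubOcrEngine` and `get_engine` are hypothetical names standing in for an OCR model that is expensive to load; the platform's actual caching mechanism is not described here, and `functools.lru_cache` is just one simple way to get the same effect.

```python
from functools import lru_cache


class StubOcrEngine:
    """Hypothetical stand-in for an OCR model that is expensive to construct."""

    def __init__(self, lang: str) -> None:
        self.lang = lang  # a real engine would load model weights here


@lru_cache(maxsize=4)
def get_engine(lang: str) -> StubOcrEngine:
    """Build an engine once per language; later calls reuse the cached instance."""
    return StubOcrEngine(lang)


first = get_engine("en")
second = get_engine("en")
print(first is second)                 # True: the same object is reused
print(get_engine.cache_info().misses)  # 1: the engine was constructed only once
```

Bounding the cache with `maxsize` keeps memory use predictable when many languages are requested, at the cost of reloading an engine that has been evicted.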

For installation and requirements, see "What is IRIS" and "Requirements".