Amazon Web Services (AWS) has unveiled a two-model pipeline on Amazon Bedrock designed to cut the cost of digitizing scanned documents. The solution pairs Amazon Nova 2 Lite with Anthropic's Claude Sonnet 4.6 to extract and link information from complex, unstructured documents. In a test on 336 scanned yearbook pages, the pipeline produced 3,122 name-to-face associations, 93 percent of which had a confidence score of 0.95 or higher. The two-model approach costs about two-thirds less per page than a single-model alternative that handles the entire task with one vision-language model.
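To put the reported figures in perspective, the short calculation below works out roughly how many associations cleared the 0.95 confidence mark and what "two-thirds less" means as a relative cost. It is only a back-of-the-envelope sketch: the function names are illustrative, the 93 percent share is a rounded figure, and no absolute per-page price is assumed because none is given.

```python
# Figures as reported for the yearbook test.
PAGES = 336
ASSOCIATIONS = 3122
HIGH_CONF_SHARE = 0.93      # share with confidence >= 0.95 (rounded figure)
SAVINGS_FRACTION = 2 / 3    # "about two-thirds less per page"


def high_confidence_count(total: int, share: float) -> int:
    """Approximate number of associations at or above the threshold."""
    return round(total * share)


def relative_cost(savings_fraction: float) -> float:
    """Cost of the two-model pipeline as a fraction of the single-model cost."""
    return 1.0 - savings_fraction


if __name__ == "__main__":
    print("High-confidence associations (approx.):",
          high_confidence_count(ASSOCIATIONS, HIGH_CONF_SHARE))
    print("Associations per page (average):",
          round(ASSOCIATIONS / PAGES, 1))
    print("Relative per-page cost:",
          round(relative_cost(SAVINGS_FRACTION), 2))
```

In other words, roughly 2,900 of the 3,122 associations met the high-confidence mark, and the two-model pipeline costs about one-third as much per page as the single-model alternative.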
The pipeline addresses a common challenge in document digitization: processing pages where text and images are interleaved and no machine-readable structure exists. Traditional optical character recognition (OCR) often struggles with the spatial reasoning needed to link separate elements, such as matching names to faces based on page layout. The AWS solution divides the work according to each model's strengths. Amazon Nova 2 Lite performs native multimodal extraction in a single call, detecting photos, extracting visible names with their coordinates, and returning page-level metadata. Claude Sonnet 4.6 then applies spatial reasoning to match names to faces. Notably, testing showed that setting Nova 2 Lite's reasoning level to "LOW" for this structured extraction task produced no meaningful difference in accuracy compared with higher settings, making it the most cost-effective choice.
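The two-stage division of labor described above can be sketched as follows. This is a minimal illustration under stated assumptions, not AWS's implementation: the model identifiers are placeholders, the prompts and JSON field names are invented for the example, the second stage is assumed to receive only the first stage's JSON output, and the 0.95 cutoff is used here as an illustrative filter. The model call is passed in as a function so the flow can be read and tested without AWS credentials.

```python
import json
from typing import Callable, Dict, List, Optional

# Hypothetical placeholders; substitute the real Bedrock model IDs.
EXTRACTION_MODEL = "nova-2-lite-placeholder"
MATCHING_MODEL = "claude-sonnet-4-6-placeholder"

# call_model(model_id, prompt, image_bytes_or_None) -> JSON text from the model.
ModelCall = Callable[[str, str, Optional[bytes]], str]


def extract_page(call_model: ModelCall, page_image: bytes) -> Dict:
    """Stage 1: one multimodal call that returns photos, names and metadata."""
    prompt = (
        "Detect every photo and every visible name on this page. "
        "Return JSON with keys 'photos', 'names' (each with a bounding box) "
        "and 'metadata'."
    )
    # In a real call, the extraction model's reasoning level would be set to
    # its lowest option here; the exact parameter is not shown in this sketch.
    return json.loads(call_model(EXTRACTION_MODEL, prompt, page_image))


def match_names(call_model: ModelCall, extraction: Dict) -> List[Dict]:
    """Stage 2: spatial reasoning over the extracted coordinates."""
    prompt = (
        "Using the page layout, match each name to a photo. Return a JSON "
        "list of objects with 'name', 'photo_id' and 'confidence'.\n"
        + json.dumps(extraction)
    )
    return json.loads(call_model(MATCHING_MODEL, prompt, None))


def process_page(call_model: ModelCall, page_image: bytes,
                 min_confidence: float = 0.95) -> List[Dict]:
    """Run both stages and keep only matches at or above the threshold."""
    extraction = extract_page(call_model, page_image)
    matches = match_names(call_model, extraction)
    return [m for m in matches if m["confidence"] >= min_confidence]
```

In practice, `call_model` would wrap a request to Amazon Bedrock, for example through the Bedrock Runtime Converse API. Keeping it as a parameter makes it straightforward to swap models or to compare this flow against a single-model baseline.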
For enterprises building intelligent document processing systems, the pipeline offers a concrete pattern: assign each stage to the model best suited to it rather than routing the whole task through a single model. The same modular approach applies to other hard-to-digitize material, such as historical archives, legal records, or medical charts, and provides a blueprint for a wide range of complex data extraction tasks. Lower per-page cost combined with high accuracy makes AI document processing more accessible and practical for organizations handling large volumes of unstructured information, which could in turn encourage broader adoption.