Why genAI-powered intelligent document processing is a big deal

Tuesday, March 4, 2025, 10:00 AM, from InfoWorld

One of the earliest digital transformation drivers was digitalization, in which organizations converted paper-based processes to digitized workflows. In ideal circumstances, paper documents became web or mobile forms, and workflows replaced handoffs. However, many paper-based documents were converted to digitalized versions in PDF, Microsoft Word, or some other document format. Organizations then invested in document processing technologies to help extract structured information from them.

Pre-AI document processing technologies were often primitive, relying on rules and patterns to identify and extract information. These processors were reasonably accurate in extracting basic information from invoices, contracts, and other structured documents, but they often required human intervention to work through exceptions and manually extract missing information.

These pre-AI document processing technologies were no match for certain complex document formats. For example, a system based on rules and pattern matching might be able to extract the names and dates from a non-disclosure agreement, but it would not be able to summarize the information in the contract or ensure it met the organization’s NDA requirements.
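To make that contrast concrete, here is a minimal sketch of the rules-and-patterns approach, assuming a hypothetical NDA snippet and two hand-written regular expressions. The sample text, pattern names, and `extract_fields` helper are invented for illustration and do not come from any particular product.

```python
import re

# Hypothetical NDA text, invented for illustration.
NDA_TEXT = (
    "This Non-Disclosure Agreement is entered into on January 15, 2025 "
    "between Acme Corp and Jane Doe. It expires on January 15, 2027."
)

# Matches dates written as "Month D, YYYY".
DATE_PATTERN = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|September|"
    r"October|November|December) \d{1,2}, \d{4}\b"
)

# Matches the two parties in a "between X and Y." clause.
PARTY_PATTERN = re.compile(r"between (.+?) and (.+?)\.")


def extract_fields(text):
    """Pull dates and party names out of text using fixed patterns."""
    dates = DATE_PATTERN.findall(text)
    match = PARTY_PATTERN.search(text)
    parties = list(match.groups()) if match else []
    return {"dates": dates, "parties": parties}


print(extract_fields(NDA_TEXT))
```

The patterns find the names and dates only when the wording matches exactly. A differently phrased agreement returns empty results, and nothing here can summarize the contract or judge whether it meets policy.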

Now we have an emerging class of genAI-enabled intelligent document processing (IDP) technologies, sometimes called document mining and analytics platforms. These are a game changer for organizations whose document repositories previously offered only basic search capabilities, or whose workflows required document review by subject matter experts. Document processing may be a relatively boring use case for genAI, but it can have significant business impacts. Examples include claims processing in insurance, conducting clinical trials in life sciences, and regulatory reporting in financial services.

“We can finally fulfill the dream of the paperless office, which will transform government, banking, insurance, and life sciences,” says Michael Beckley, CTO and founder of Appian. “Wherever regulations are strongest, the paper’s the heaviest, and that’s where the impact is exponentially greater.”

Understanding intelligent document processing

Before diving into business requirements, it helps to understand what IDP technologies (IDPs) can do and which capabilities differentiate vendors. Vendors to consider include:

ABBYY

Appian

Automation Anywhere

AWS

Datamatics

expert.ai

Google

Hyperscience

IBM

Indico Data

Instabase

Microsoft

OpenText

Rossum

UiPath

UST

WorkFusion

It is also worth reviewing IDC Perspective, Forrester Wave, and QKS Spark Matrix for industry insight and platform reviews.

“Generative AI is redefining document processing by transforming unstructured data into actionable insights at scale,” says Srikumar Ramanathan, chief solutions officer at Mphasis. “Tasks like entity extraction, taxonomy classification, and quantitative analysis are automated with greater precision.”

Entity extraction refers to pulling out names, places, organizations, dates, and currency along with their associated contexts.

Taxonomy classifications can be supervised when there is a known taxonomy of categories or other hierarchies, or they can be unsupervised when the LLM extracts a list of topics with little upfront guidance.

Quantitative analysis examples include extracting invoice amounts and calculating totals, extracting insurance claim amounts and coverage limits, and quantifying fees and penalties in regulatory requirements.
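As an illustration of the quantitative analysis step, the sketch below sums extracted invoice line items and compares the result with the total printed on the invoice. It assumes the amounts were already extracted as strings; the field names, the `check_invoice_total` helper, and the one-cent tolerance are hypothetical choices, not part of any vendor's API.

```python
from decimal import Decimal


def check_invoice_total(line_items, stated_total, tolerance=Decimal("0.01")):
    """Sum extracted line-item amounts and compare with the stated total."""
    # Decimal avoids the rounding drift of binary floats on currency values.
    computed = sum(
        (Decimal(item["amount"]) for item in line_items), Decimal("0")
    )
    difference = abs(computed - Decimal(stated_total))
    return {"computed_total": computed, "matches": difference <= tolerance}


# Hypothetical extraction output for a single invoice.
items = [
    {"description": "Licenses", "amount": "1200.00"},
    {"description": "Support", "amount": "349.50"},
    {"description": "Shipping", "amount": "50.25"},
]
print(check_invoice_total(items, "1599.75"))
```

A mismatch between the computed and stated totals is a useful signal that an amount was misread or a line item was missed.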

IDPs that are integrated with large language models can take information extraction to higher levels by providing summaries, capturing context, and mining other semantic information.

“GenAI has significantly enhanced intelligent document processing by introducing advanced capabilities beyond traditional tasks like data extraction, classification, and redaction,” says Varun Goswami, VP of product management at Newgen. “It now excels in summarization and content generation, enabling more efficient and precise handling of documents. GenAI has revolutionized search and retrieval processes by enabling more intelligent, context-aware interactions with content, making document management more intuitive and effective than ever before.”

Automating workflows with genAI-enabled IDP

The adoption path to IDP begins with establishing objectives, defining requirements, understanding compliance factors, and specifying minimal quality metrics.

“Organizations can unlock greater AI automation through genAI-powered IDP, which recognizes document types, extracts key fields, and quickly converts unstructured data into usable data,” says Rich Waldron, CEO and co-founder of Tray.ai. “It goes beyond basic extraction to automate workflows and provide confidence scoring to indicate model accuracy and data relevance.”

Here are some preliminary steps for organizations transitioning from traditional document processing to genAI-powered IDP:

Identify document types, file formats, document quantities, data volumes, and storage locations.

Establish requirements for all confidential data or compliance factors around privacy, security, and end-use entitlements.

Review workflow requirements by identifying which people, business processes, and systems use the document types for information extraction, automation, and decision-making.

Analyze a sample of the documents to understand how information is structured, note where formats are consistent and where they diverge, and capture the data complexities that may require implementation and testing considerations.

Establish information extraction goals and set quality targets.
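One way to make the quality targets in the last step measurable is to compare extraction output against a small hand-labeled sample. The sketch below assumes each document is represented as a dictionary of field values; the function names, fields, and target thresholds are hypothetical.

```python
def field_accuracy(extracted, labeled):
    """Return, per field, the share of documents whose value matches the label."""
    counts = {}
    for predicted, expected in zip(extracted, labeled):
        for field, expected_value in expected.items():
            hits, total = counts.get(field, (0, 0))
            is_hit = predicted.get(field) == expected_value
            counts[field] = (hits + int(is_hit), total + 1)
    return {field: hits / total for field, (hits, total) in counts.items()}


def fields_below_target(accuracy, targets):
    """List the fields whose measured accuracy falls short of its target."""
    return sorted(
        field for field, target in targets.items()
        if accuracy.get(field, 0.0) < target
    )


# Hypothetical labeled sample and extraction output for four invoices.
labeled = [
    {"invoice_number": "A1", "total": "10.00"},
    {"invoice_number": "A2", "total": "20.00"},
    {"invoice_number": "A3", "total": "30.00"},
    {"invoice_number": "A4", "total": "40.00"},
]
extracted = [
    {"invoice_number": "A1", "total": "10.00"},
    {"invoice_number": "A2", "total": "20.00"},
    {"invoice_number": "A3", "total": "35.00"},  # misread amount
    {"invoice_number": "A4", "total": "40.00"},
]

accuracy = field_accuracy(extracted, labeled)
print(accuracy)
print(fields_below_target(accuracy, {"invoice_number": 0.95, "total": 0.9}))
```

Exact string matching is the simplest possible comparison. Real evaluations may need normalization for dates, currency formats, and whitespace before comparing.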

Preparing unstructured data for IDP

Except for the simplest document types, the full document process will likely have preprocessing steps to capture document structure and metadata, along with postprocessing steps where quality validations and quantitative analysis are implemented.

Clemens Mewald, head of product at Instabase, says, “In our experience, the overall performance of our AI for IDP tasks is mostly attributable to how we digitize, parse, and represent complex documents—all the analysis of unstructured data that takes place before the LLM is even involved.”

Mewald recommends using machine learning and AI to detect the semantic structure of a document, annotate it with metadata about the style and location of the content, and detect relevant entities like tables, checkboxes, signatures, and logos. “The resulting representation helps LLMs better answer complex extraction and reasoning questions, as well as provide critical metadata for computing confidence scores and exact provenance, which helps avoid hallucination,” says Mewald.

Ramanathan of Mphasis agrees and says successful IDP depends on robust preprocessing, including image de-noising, alignment correction, and content standardization to ensure clean, structured inputs for LLMs. “This seamless integration enhances both accuracy and efficiency, empowering organizations to extract meaningful value from their data,” Ramanathan says.
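Image de-noising and alignment correction operate on scanned pages, but content standardization can be sketched on text alone. The example below is a minimal illustration, assuming OCR output that contains ligatures, non-breaking spaces, and words hyphenated across line breaks. The `standardize_text` helper is hypothetical, and the dehyphenation rule is a simplifying assumption that would wrongly join genuinely hyphenated words that happen to break at a line end.

```python
import re
import unicodedata


def standardize_text(raw):
    """Normalize OCR text before it is passed to an LLM."""
    # NFKC folds compatibility characters (ligatures, non-breaking spaces)
    # into their plain equivalents.
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words split by a hyphen at a line break (assumption: the
    # hyphen was introduced by line wrapping, not part of the word).
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim each line, then trim the whole document.
    lines = [line.strip() for line in text.split("\n")]
    return "\n".join(lines).strip()


sample = "The \ufb01nal in-\nvoice   total:\u00a0100 \n"
print(standardize_text(sample))
```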

Fine-tuning LLM prompts for IDP accuracy

Pre-LLM IDPs were often run once on a document and rerun only after introducing new information extraction rules and patterns. Search engines were then used to handle any ad-hoc querying.

With LLMs, the processing can be more dynamic. First, prompts and examples can steer LLMs toward the information extraction goals and help them work around document complexities. Second, the same LLMs can be used for ad hoc querying, and feedback mechanisms can be instrumented to improve the information extractions based on end-user prompts.

“The advancement of genAI and LLMs is allowing us to use natural language to describe a desired program, expression, or result, and they are particularly good at extracting data from unstructured and multimodal sources,” says Greg Benson, professor of computer science at the University of San Francisco and chief scientist at SnapLogic. “Accurate information extraction from documents, like PDFs, has been notoriously difficult to write as code. We are realizing the power of prompt engineering and how sharing a few examples of desired extracted data helps the LLM ‘learn’ how to apply the pattern to future input documents.”
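The few-example approach Benson describes can be sketched as a prompt builder. The code below only assembles the prompt text and does not call any model; the `build_extraction_prompt` helper, the layout of the prompt, and the sample invoices are all hypothetical, and a real deployment would adapt the format to the LLM in use.

```python
import json


def build_extraction_prompt(instruction, examples, document_text):
    """Assemble a few-shot extraction prompt from example pairs."""
    parts = [instruction.strip(), ""]
    for index, (text, fields) in enumerate(examples, start=1):
        parts.append(f"Example {index} document:")
        parts.append(text.strip())
        parts.append(f"Example {index} output:")
        # Each example shows the exact JSON shape expected back.
        parts.append(json.dumps(fields, sort_keys=True))
        parts.append("")
    parts.append("Document:")
    parts.append(document_text.strip())
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical examples of desired extracted data.
examples = [
    ("Invoice INV-001. Amount due: $150.00",
     {"invoice_number": "INV-001", "total": "150.00"}),
    ("Invoice INV-002. Amount due: $75.10",
     {"invoice_number": "INV-002", "total": "75.10"}),
]

prompt = build_extraction_prompt(
    "Extract the invoice number and total as JSON.",
    examples,
    "Invoice INV-003. Amount due: $920.00",
)
print(prompt)
```

Keeping the examples as data rather than hard-coding them into the prompt makes it easier to add new cases when end users report extraction errors.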

Integrate IDP for smarter workflows

IDP is a fan-in, fan-out process where documents are stored in multiple locations, and many downstream platforms, workflows, and analytics can leverage the extracted information. Enterprises with significant document repositories and many enterprise applications should consider iPaaS (integration platforms as a service), data fabrics, and data pipelines to manage the integrations.

“Rather than adding complexity, integrating these capabilities directly into an AI-ready iPaaS means IT and development teams can extract and validate the information in real-time,” says Waldron of Tray.ai. “Organizations can now maximize their AI automation potential by effectively processing unstructured data across their current systems.”

Current limitations of IDP

While LLMs can enhance the ability to extract entities and values from documents for AI and analytics, there are limitations.

“Even the most advanced RAG systems today are primarily informational in nature and can’t perform calculations,” says Edward Calvesbert, VP of the watsonx platform at IBM. “Improvements in accuracy and data governance of genAI systems are necessary to advance to more agentic, operational systems that will combine vector embeddings with the entities and values stored in data lakehouse table formats and enable access across diverse data platforms.”

IT teams should set realistic expectations during the requirements phase, especially for complex document structures, when accuracy requirements are high, or when multipart quantitative analytics is needed. In these situations, it may be important to build IDP workflows in which people review critical information, analytics, and exceptions.
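A human-in-the-loop workflow of this kind can be sketched as a routing rule over confidence scores. The example assumes each extracted field arrives with a value and a confidence between 0 and 1; the `route_extraction` helper, the route names, and the 0.9 threshold are hypothetical and would need tuning against real data.

```python
def route_extraction(fields, threshold=0.9):
    """Decide whether a document can skip human review.

    `fields` maps a field name to a (value, confidence) pair.
    """
    reasons = []
    for name, (value, confidence) in fields.items():
        if value is None:
            reasons.append(f"{name}: missing value")
        elif confidence < threshold:
            reasons.append(
                f"{name}: confidence {confidence:.2f} below {threshold:.2f}"
            )
    # Any flagged field sends the whole document to a reviewer.
    route = "human_review" if reasons else "straight_through"
    return route, reasons


# Hypothetical extraction results for two insurance claims.
clean = {"claim_amount": ("1250.00", 0.98), "policy_id": ("P-100", 0.95)}
risky = {"claim_amount": ("1250.00", 0.62), "policy_id": (None, 0.0)}

print(route_extraction(clean))
print(route_extraction(risky))
```

Returning the reasons alongside the route lets reviewers go straight to the fields that need attention.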

That said, IDPs with genAI have dramatically improved accuracy and have fewer exceptions than their predecessors. “With traditional OCR and AI models, you might get 60% straight-through processing, 70% if you’re lucky, but now generative AI solves all the edge cases, and your processing rates go up to 99%,” Beckley of Appian says.

The future evolution of IDP

In addition to workflow benefits, organizations should research using intelligent document processing to improve RAG, LLM, and agentic AI accuracy. IDP technologies could transform document repositories into more structured information sources supportive of genAI capabilities.

“This year at NeurIPS, the leading AI/ML research conference, researchers demonstrated how genAI can read a football manual and learn how to win games,” says Nikolaos Vasiloglou, VP of Research ML at RelationalAI. “Soon, genAI will be able to read the HR handbook and serve employees.”

The Policy Learning from Books (PLfB) study shows that rule-based AIs outperform genAI methodologies in medium-level football games, but genAI does slightly better in hard-level games. IDPs with genAI have tremendous potential, but organizations should set appropriate expectations and validate the quality of output. Expect IDP capabilities to improve alongside LLMs.