azure-ai-formrecognizer-java

Rebranding: O Azure AI Form Recognizer agora é Azure AI Document Intelligence. Novos projetos devem usar com.azure:azure-ai-documentintelligence. O pacote legado azure-ai-formrecognizer tem como alvo apenas a versão de API 2023-07-31. Consulte o Guia de Migração.

npx skills add https://github.com/microsoft/skills --skill azure-ai-formrecognizer-java

Azure AI Document Intelligence SDK for Java

Rebranding: Azure AI Form Recognizer is now Azure AI Document Intelligence. New projects should use com.azure:azure-ai-documentintelligence. The legacy azure-ai-formrecognizer package targets API version 2023-07-31 only. See Migration Guide.

Before Implementation

Search microsoft-docs MCP for current API patterns:

  • Query: "azure-ai-documentintelligence Java SDK"
  • Verify: Parameters match installed SDK version (latest GA: 1.0.7)

Installation

<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-ai-documentintelligence</artifactId>
    <version>1.0.0</version>
</dependency>

<!-- For DefaultAzureCredential -->
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-identity</artifactId>
    <version>1.14.2</version>
</dependency>

Environment Variables

DOCUMENT_INTELLIGENCE_ENDPOINT=https://<resource>.cognitiveservices.azure.com/ # Required for all auth methods
AZURE_TOKEN_CREDENTIALS=prod  # Required only if DefaultAzureCredential is used in production

Authentication

DefaultAzureCredential (Recommended)

import com.azure.ai.documentintelligence.DocumentIntelligenceClient;
import com.azure.ai.documentintelligence.DocumentIntelligenceClientBuilder;
import com.azure.core.credential.TokenCredential;
import com.azure.identity.AzureIdentityEnvVars;
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.azure.identity.ManagedIdentityCredentialBuilder;

TokenCredential credential = new DefaultAzureCredentialBuilder()
    .requireEnvVars(AzureIdentityEnvVars.AZURE_TOKEN_CREDENTIALS)
    .build();
// Or use a specific credential directly in production:
// See https://learn.microsoft.com/java/api/overview/azure/identity-readme?view=azure-java-stable#credential-classes
// TokenCredential credential = new ManagedIdentityCredentialBuilder().build();

DocumentIntelligenceClient client = new DocumentIntelligenceClientBuilder()
    .endpoint(System.getenv("DOCUMENT_INTELLIGENCE_ENDPOINT"))
    .credential(credential)
    .buildClient();

API Key

import com.azure.core.credential.AzureKeyCredential;

DocumentIntelligenceClient client = new DocumentIntelligenceClientBuilder()
    .endpoint(System.getenv("DOCUMENT_INTELLIGENCE_ENDPOINT"))
    .credential(new AzureKeyCredential(System.getenv("DOCUMENT_INTELLIGENCE_KEY")))
    .buildClient();

Administration Client

import com.azure.ai.documentintelligence.DocumentIntelligenceAdministrationClient;
import com.azure.ai.documentintelligence.DocumentIntelligenceAdministrationClientBuilder;
import com.azure.core.credential.TokenCredential;
import com.azure.identity.AzureIdentityEnvVars;
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.azure.identity.ManagedIdentityCredentialBuilder;

TokenCredential credential = new DefaultAzureCredentialBuilder()
    .requireEnvVars(AzureIdentityEnvVars.AZURE_TOKEN_CREDENTIALS)
    .build();
// Or use a specific credential directly in production:
// See https://learn.microsoft.com/java/api/overview/azure/identity-readme?view=azure-java-stable#credential-classes
// TokenCredential credential = new ManagedIdentityCredentialBuilder().build();

DocumentIntelligenceAdministrationClient adminClient = new DocumentIntelligenceAdministrationClientBuilder()
    .endpoint(System.getenv("DOCUMENT_INTELLIGENCE_ENDPOINT"))
    .credential(credential)
    .buildClient();

Async Client

import com.azure.ai.documentintelligence.DocumentIntelligenceAsyncClient;
import com.azure.core.credential.TokenCredential;
import com.azure.identity.AzureIdentityEnvVars;
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.azure.identity.ManagedIdentityCredentialBuilder;

TokenCredential credential = new DefaultAzureCredentialBuilder()
    .requireEnvVars(AzureIdentityEnvVars.AZURE_TOKEN_CREDENTIALS)
    .build();
// Or use a specific credential directly in production:
// See https://learn.microsoft.com/java/api/overview/azure/identity-readme?view=azure-java-stable#credential-classes
// TokenCredential credential = new ManagedIdentityCredentialBuilder().build();

DocumentIntelligenceAsyncClient asyncClient = new DocumentIntelligenceClientBuilder()
    .endpoint(System.getenv("DOCUMENT_INTELLIGENCE_ENDPOINT"))
    .credential(credential)
    .buildAsyncClient();

Prebuilt Models

Model IDPurpose
prebuilt-readExtract text, lines, words, languages
prebuilt-layoutText, tables, selection marks, structure
prebuilt-receiptReceipt data extraction
prebuilt-invoiceInvoice field extraction
prebuilt-idDocumentID documents (passport, license)
prebuilt-tax.us.w2US W2 tax forms
prebuilt-healthInsuranceCard.usUS health insurance cards
prebuilt-contractContract field extraction

Retired models: prebuilt-businessCard and prebuilt-document are retired in API version 2024-11-30. Use the legacy azure-ai-formrecognizer package for these.

Core Patterns

Analyze from File

import com.azure.ai.documentintelligence.models.*;
import com.azure.core.util.BinaryData;
import com.azure.core.util.polling.SyncPoller;
import java.io.File;

File document = new File("document.pdf");
BinaryData documentData = BinaryData.fromFile(document.toPath(), (int) document.length());

SyncPoller<AnalyzeOperationDetails, AnalyzeResult> poller =
    client.beginAnalyzeDocument("prebuilt-layout",
        new AnalyzeDocumentOptions(documentData));

AnalyzeResult result = poller.getFinalResult();

Analyze from URL

String documentUrl = "https://example.com/invoice.pdf";

SyncPoller<AnalyzeOperationDetails, AnalyzeResult> poller =
    client.beginAnalyzeDocument("prebuilt-invoice",
        new AnalyzeDocumentOptions(documentUrl));

AnalyzeResult result = poller.getFinalResult();

Extract Layout

AnalyzeResult result = poller.getFinalResult();

for (DocumentPage page : result.getPages()) {
    System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
        page.getWidth(), page.getHeight(), page.getUnit());

    // Lines
    for (DocumentLine line : page.getLines()) {
        System.out.printf("Line '%s' is within bounding box %s.%n",
            line.getContent(), line.getPolygon());
    }

    // Selection marks
    for (DocumentSelectionMark mark : page.getSelectionMarks()) {
        System.out.printf("Selection mark is '%s' with confidence %.2f.%n",
            mark.getState(), mark.getConfidence());
    }
}

// Tables
for (DocumentTable table : result.getTables()) {
    System.out.printf("Table: %d rows x %d columns%n",
        table.getRowCount(), table.getColumnCount());
    for (DocumentTableCell cell : table.getCells()) {
        System.out.printf("Cell[%d,%d]: %s%n",
            cell.getRowIndex(), cell.getColumnIndex(), cell.getContent());
    }
}

Extract Document Fields

SyncPoller<AnalyzeOperationDetails, AnalyzeResult> poller =
    client.beginAnalyzeDocument("prebuilt-receipt",
        new AnalyzeDocumentOptions(receiptUrl));

AnalyzeResult result = poller.getFinalResult();

for (AnalyzedDocument doc : result.getDocuments()) {
    Map<String, DocumentField> fields = doc.getFields();

    DocumentField merchantName = fields.get("MerchantName");
    if (merchantName != null && merchantName.getType() == DocumentFieldType.STRING) {
        System.out.printf("Merchant: %s (confidence: %.2f)%n",
            merchantName.getValueString(), merchantName.getConfidence());
    }

    DocumentField transactionDate = fields.get("TransactionDate");
    if (transactionDate != null && transactionDate.getType() == DocumentFieldType.DATE) {
        System.out.printf("Date: %s%n", transactionDate.getValueDate());
    }
}

Analyze with Options

SyncPoller<AnalyzeOperationDetails, AnalyzeResult> poller =
    client.beginAnalyzeDocument("my-custom-model",
        new AnalyzeDocumentOptions(documentUrl)
            .setPages(Collections.singletonList("1-3"))
            .setLocale("en-US")
            .setDocumentAnalysisFeatures(Arrays.asList(DocumentAnalysisFeature.LANGUAGES))
            .setOutputContentFormat(DocumentContentFormat.TEXT));

Custom Models

Build Custom Model

String blobContainerUrl = "{SAS_URL_of_training_data}";

SyncPoller<DocumentModelBuildOperationDetails, DocumentModelDetails> poller =
    adminClient.beginBuildDocumentModel(
        new BuildDocumentModelOptions("my-custom-model", DocumentBuildMode.TEMPLATE)
            .setAzureBlobSource(new AzureBlobContentSource(blobContainerUrl)));

DocumentModelDetails model = poller.getFinalResult();
System.out.printf("Model ID: %s%n", model.getModelId());
System.out.printf("Created: %s%n", model.getCreatedOn());

model.getDocumentTypes().forEach((docType, details) -> {
    details.getFieldSchema().forEach((field, schema) -> {
        System.out.printf("Field: %s (%s)%n", field, schema.getType());
    });
});

Manage Models

// Resource limits
DocumentIntelligenceResourceDetails resourceDetails = adminClient.getResourceDetails();
System.out.printf("Models: %d / %d%n",
    resourceDetails.getCustomDocumentModels().getCount(),
    resourceDetails.getCustomDocumentModels().getLimit());

// List models
PagedIterable<DocumentModelDetails> models = adminClient.listModels();
for (DocumentModelDetails model : models) {
    System.out.printf("Model: %s, Created: %s%n",
        model.getModelId(), model.getCreatedOn());
}

// Get model
DocumentModelDetails model = adminClient.getModel("model-id");

// Delete model
adminClient.deleteModel("model-id");

Document Classification

Build Classifier

Map<String, ClassifierDocumentTypeDetails> docTypes = new HashMap<>();
docTypes.put("invoice", new ClassifierDocumentTypeDetails()
    .setAzureBlobSource(new AzureBlobContentSource(containerUrl).setPrefix("invoices/")));
docTypes.put("receipt", new ClassifierDocumentTypeDetails()
    .setAzureBlobSource(new AzureBlobContentSource(containerUrl).setPrefix("receipts/")));

SyncPoller<DocumentClassifierBuildOperationDetails, DocumentClassifierDetails> poller =
    adminClient.beginBuildClassifier(
        new BuildDocumentClassifierOptions("my-classifier", docTypes));

DocumentClassifierDetails classifier = poller.getFinalResult();

Classify Document

SyncPoller<AnalyzeOperationDetails, AnalyzeResult> poller =
    client.beginClassifyDocument("my-classifier",
        new ClassifyDocumentOptions(documentUrl));

AnalyzeResult result = poller.getFinalResult();
for (AnalyzedDocument doc : result.getDocuments()) {
    System.out.printf("Classified as: %s (confidence: %.2f)%n",
        doc.getDocumentType(), doc.getConfidence());
}

Error Handling

import com.azure.core.exception.HttpResponseException;

try {
    client.beginAnalyzeDocument("prebuilt-receipt",
        new AnalyzeDocumentOptions("invalid-url"));
} catch (HttpResponseException e) {
    System.out.printf("Status: %d, Error: %s%n",
        e.getResponse().getStatusCode(), e.getMessage());
}

Migration from azure-ai-formrecognizer

Old (formrecognizer v4.x)New (documentintelligence v1.x)
DocumentAnalysisClientDocumentIntelligenceClient
DocumentAnalysisClientBuilderDocumentIntelligenceClientBuilder
DocumentModelAdministrationClientDocumentIntelligenceAdministrationClient
beginAnalyzeDocumentFromUrl(modelId, url)beginAnalyzeDocument(modelId, new AnalyzeDocumentOptions(url))
beginAnalyzeDocument(modelId, data)beginAnalyzeDocument(modelId, new AnalyzeDocumentOptions(data))
SyncPoller<OperationResult, AnalyzeResult>SyncPoller<AnalyzeOperationDetails, AnalyzeResult>
field.getValueAsString()field.getValueString()
field.getValueAsDate()field.getValueDate()
field.getValueAsDouble()field.getValueNumber()
field.getValueAsList()field.getValueList()
field.getValueAsMap()field.getValueObject()
mark.getSelectionMarkState()mark.getState()
adminClient.beginBuildDocumentModel(url, mode, prefix, options, ctx)adminClient.beginBuildDocumentModel(new BuildDocumentModelOptions(id, mode).setAzureBlobSource(...))
adminClient.getResourceDetails().getCustomDocumentModelCount()adminClient.getResourceDetails().getCustomDocumentModels().getCount()
FORM_RECOGNIZER_ENDPOINTDOCUMENT_INTELLIGENCE_ENDPOINT

Reference Files

FileContents
references/examples.mdComplete code examples for all scenarios

Mais skills de microsoft

oss-growth
microsoft
Persona de growth hacker OSS
official
microsoft-foundry
microsoft
Implantar, avaliar e gerenciar agentes Foundry de ponta a ponta: build Docker, push ACR, criação de agente hospedado/prompt, inicialização de contêiner, avaliação em lote, avaliação contínua, fluxos de trabalho do otimizador de prompt, agent.yaml, curadoria de conjunto de dados a partir de rastros. USE PARA: implantar agente no Foundry, agente hospedado, criar agente, invocar agente, avaliar agente, executar avaliação em lote, avaliação contínua, monitoramento contínuo, status da avaliação contínua, otimizar prompt, melhorar prompt, otimizador de prompt, otimizar instruções do agente, melhorar agente...
officialdevelopmentdevops
azure-ai
microsoft
Use para Azure AI: Search, Speech, OpenAI, Document Intelligence. Ajuda com pesquisa, busca vetorial/híbrida, fala para texto, texto para fala, transcrição, OCR. QUANDO: AI Search, pesquisa de consulta, busca vetorial, busca híbrida, busca semântica, fala para texto, texto para fala, transcrever, OCR, converter texto em fala.
officialdevelopmentapi
azure-deploy
microsoft
Execute implantações do Azure para aplicativos JÁ PREPARADOS que possuem arquivos .azure/deployment-plan.md e de infraestrutura existentes. NÃO use esta skill quando o usuário pedir para CRIAR um novo aplicativo — use azure-prepare. Esta skill executa comandos azd up, azd deploy, terraform apply e az deployment com recuperação de erros integrada. Requer .azure/deployment-plan.md do azure-prepare e status validado do azure-validate. QUANDO: "executar azd up", "executar azd deploy", "executar implantação",...
officialdevopsaws
azure-storage
microsoft
Serviços de Armazenamento do Azure, incluindo Blob Storage, File Shares, Queue Storage, Table Storage e Data Lake. Responde a perguntas sobre camadas de acesso ao armazenamento (hot, cool, cold, archive), quando usar cada camada e comparação entre elas. Oferece armazenamento de objetos, compartilhamentos de arquivos SMB, mensagens assíncronas, NoSQL chave-valor e análise de big data. Inclui gerenciamento de ciclo de vida. USE PARA: blob storage, file shares, queue storage, table storage, data lake, upload de arquivos, download de blobs, contas de armazenamento, camadas de acesso,...
officialdevelopmentdatabase
azure-diagnostics
microsoft
Depure problemas de produção no Azure usando AppLens, Azure Monitor, integridade de recursos e triagem segura. QUANDO: depurar problemas de produção, solucionar problemas do Serviço de Aplicativo, alto uso de CPU no Serviço de Aplicativo, falha de implantação do Serviço de Aplicativo, solucionar problemas de aplicativos em contêineres, solucionar problemas de funções, solucionar problemas do AKS, kubectl não consegue conectar, falhas do kube-system/CoreDNS, pod pendente, crashloop, nó não pronto, falhas de atualização, analisar logs, KQL, insights, falhas ao puxar imagem, problemas de inicialização a frio, falhas de sonda de integridade,...
officialdevopsdevelopment
azure-prepare
microsoft
Prepare aplicativos do Azure para implantação (infra Bicep/Terraform, azure.yaml, Dockerfiles). Use para criar/modernizar ou criar+implantar; não para migração entre nuvens (use azure-cloud-migrate). NÃO USE PARA: aplicativos copilot-sdk (use azure-hosted-copilot-sdk). QUANDO: "criar aplicativo", "construir aplicativo web", "criar API", "criar API HTTP serverless", "criar frontend", "criar backend", "construir um serviço", "modernizar aplicativo", "atualizar aplicativo", "adicionar autenticação", "adicionar cache", "hospedar no Azure", "criar e...
officialdevelopmentdevops
azure-validate
microsoft
Validação pré-implantação para prontidão do Azure. Execute verificações aprofundadas de configuração, infraestrutura (Bicep ou Terraform), atribuições de função RBAC, permissões de identidade gerenciada e pré-requisitos antes de implantar. QUANDO: validar meu aplicativo, verificar prontidão para implantação, executar verificações de pré-voo, verificar configuração, verificar se está pronto para implantar, validar azure.yaml, validar Bicep, testar antes de implantar, solucionar erros de implantação, validar Azure Functions, validar function app, validar serverless...
officialdevopstesting