azure-eventhub-py
作者: microsoft
用于高吞吐量事件摄取的大数据流式处理平台。
npx skills add https://github.com/microsoft/skills --skill azure-eventhub-pyAzure Event Hubs SDK for Python
Big data streaming platform for high-throughput event ingestion.
Installation
pip install azure-eventhub azure-identity
# For checkpointing with blob storage
pip install azure-eventhub-checkpointstoreblob-aio
Environment Variables
EVENT_HUB_FULLY_QUALIFIED_NAMESPACE=<namespace>.servicebus.windows.net # Required for all auth methods
EVENT_HUB_NAME=my-eventhub # Required for all auth methods
STORAGE_ACCOUNT_URL=https://<account>.blob.core.windows.net # Required for checkpoint storage
CHECKPOINT_CONTAINER=checkpoints # Required for checkpoint storage
AZURE_TOKEN_CREDENTIALS=prod # Required only if DefaultAzureCredential is used in production
Authentication & Lifecycle
🔑 Two rules apply to every code sample below:
- Prefer
DefaultAzureCredential. It works locally (Azure CLI / VS Code / Developer CLI) and in Azure (managed identity, workload identity) with no code change. Avoid connection strings, account/API keys — they bypass Entra audit and rotation.
- Local dev:
DefaultAzureCredentialworks as-is.- Production: set
AZURE_TOKEN_CREDENTIALS=prod(orAZURE_TOKEN_CREDENTIALS=<specific_credential>) to constrain the credential chain to production-safe credentials.- Wrap every client in a context manager so HTTP transports, sockets, and token caches are released deterministically:
- Sync:
with <Client>(...) as client:- Async:
async with <Client>(...) as client:andasync with DefaultAzureCredential() as credential:(fromazure.identity.aio)Snippets may abbreviate this setup, but production code should always follow both rules.
from azure.identity import DefaultAzureCredential, ManagedIdentityCredential
from azure.eventhub import EventHubProducerClient, EventHubConsumerClient
# Local dev: DefaultAzureCredential. Production: set AZURE_TOKEN_CREDENTIALS=prod or AZURE_TOKEN_CREDENTIALS=<specific_credential>
credential = DefaultAzureCredential(require_envvar=True)
# Or use a specific credential directly in production:
# See https://learn.microsoft.com/python/api/overview/azure/identity-readme?view=azure-python#credential-classes
# credential = ManagedIdentityCredential()
namespace = "<namespace>.servicebus.windows.net"
eventhub_name = "my-eventhub"
# Producer
with EventHubProducerClient(
fully_qualified_namespace=namespace,
eventhub_name=eventhub_name,
credential=credential
) as producer:
# Use producer here (see following sections for operations)
...
# Consumer
with EventHubConsumerClient(
fully_qualified_namespace=namespace,
eventhub_name=eventhub_name,
consumer_group="$Default",
credential=credential
) as consumer:
# Use consumer here (see following sections for operations)
...
Client Types
| Client | Purpose |
|---|---|
EventHubProducerClient | Send events to Event Hub |
EventHubConsumerClient | Receive events from Event Hub |
BlobCheckpointStore | Track consumer progress |
Send Events
from azure.eventhub import EventHubProducerClient, EventData
from azure.identity import DefaultAzureCredential
with EventHubProducerClient(
fully_qualified_namespace="<namespace>.servicebus.windows.net",
eventhub_name="my-eventhub",
credential=DefaultAzureCredential()
) as producer:
# Create batch (handles size limits)
event_data_batch = producer.create_batch()
for i in range(10):
try:
event_data_batch.add(EventData(f"Event {i}"))
except ValueError:
# Batch is full, send and create new one
producer.send_batch(event_data_batch)
event_data_batch = producer.create_batch()
event_data_batch.add(EventData(f"Event {i}"))
# Send remaining
producer.send_batch(event_data_batch)
Send to Specific Partition
# By partition ID
event_data_batch = producer.create_batch(partition_id="0")
# By partition key (consistent hashing)
event_data_batch = producer.create_batch(partition_key="user-123")
Receive Events
Simple Receive
from azure.eventhub import EventHubConsumerClient
def on_event(partition_context, event):
print(f"Partition: {partition_context.partition_id}")
print(f"Data: {event.body_as_str()}")
partition_context.update_checkpoint(event)
with EventHubConsumerClient(
fully_qualified_namespace="<namespace>.servicebus.windows.net",
eventhub_name="my-eventhub",
consumer_group="$Default",
credential=DefaultAzureCredential()
) as consumer:
consumer.receive(
on_event=on_event,
starting_position="-1", # Beginning of stream
)
With Blob Checkpoint Store (Production)
from azure.eventhub import EventHubConsumerClient
from azure.eventhub.extensions.checkpointstoreblob import BlobCheckpointStore
from azure.identity import DefaultAzureCredential
checkpoint_store = BlobCheckpointStore(
blob_account_url="https://<account>.blob.core.windows.net",
container_name="checkpoints",
credential=DefaultAzureCredential()
)
with EventHubConsumerClient(
fully_qualified_namespace="<namespace>.servicebus.windows.net",
eventhub_name="my-eventhub",
consumer_group="$Default",
credential=DefaultAzureCredential(),
checkpoint_store=checkpoint_store
) as consumer:
def on_event(partition_context, event):
print(f"Received: {event.body_as_str()}")
# Checkpoint after processing
partition_context.update_checkpoint(event)
consumer.receive(on_event=on_event)
Async Client
from azure.eventhub.aio import EventHubProducerClient, EventHubConsumerClient
from azure.identity.aio import DefaultAzureCredential
import asyncio
async def send_events():
credential = DefaultAzureCredential()
async with EventHubProducerClient(
fully_qualified_namespace="<namespace>.servicebus.windows.net",
eventhub_name="my-eventhub",
credential=credential
) as producer:
batch = await producer.create_batch()
batch.add(EventData("Async event"))
await producer.send_batch(batch)
async def receive_events():
async def on_event(partition_context, event):
print(event.body_as_str())
await partition_context.update_checkpoint(event)
async with EventHubConsumerClient(
fully_qualified_namespace="<namespace>.servicebus.windows.net",
eventhub_name="my-eventhub",
consumer_group="$Default",
credential=DefaultAzureCredential()
) as consumer:
await consumer.receive(on_event=on_event)
asyncio.run(send_events())
Event Properties
event = EventData("My event body")
# Set properties
event.properties = {"custom_property": "value"}
event.content_type = "application/json"
# Read properties (on receive)
print(event.body_as_str())
print(event.sequence_number)
print(event.offset)
print(event.enqueued_time)
print(event.partition_key)
Get Event Hub Info
with producer:
info = producer.get_eventhub_properties()
print(f"Name: {info['name']}")
print(f"Partitions: {info['partition_ids']}")
for partition_id in info['partition_ids']:
partition_info = producer.get_partition_properties(partition_id)
print(f"Partition {partition_id}: {partition_info['last_enqueued_sequence_number']}")
Best Practices
- Pick sync OR async and stay consistent. Do not mix
azure.xxxsync clients withazure.xxx.aioasync clients in the same call path. Choose one mode per module. - Always use context managers for clients and async credentials. Wrap every client in
with Client(...) as client:(sync) orasync with Client(...) as client:(async) for proper cleanup. For asyncDefaultAzureCredentialfromazure.identity.aio, also useasync with credential:so tokens and transports are cleaned up. - Use
DefaultAzureCredentialfor portable auth across local dev and Azure (avoid connection strings / API keys when possible). - Use batches for sending multiple events
- Use checkpoint store in production for reliable processing
- Use async client for high-throughput scenarios
- Use partition keys for ordered delivery within a partition
- Handle batch size limits — catch ValueError when batch is full
- Set appropriate consumer groups for different applications
Reference Files
| File | Contents |
|---|---|
| references/checkpointing.md | Checkpoint store patterns, blob checkpointing, checkpoint strategies |
| references/partitions.md | Partition management, load balancing, starting positions |
| scripts/setup_consumer.py | CLI for Event Hub info, consumer setup, and event sending/receiving |
来自 microsoft 的更多技能
oss-growth
microsoft
OSS增长黑客角色
official
microsoft-foundry
microsoft
端到端部署、评估和管理Foundry代理:Docker构建、ACR推送、托管/提示代理创建、容器启动、批量评估、持续评估、提示优化工作流、agent.yaml、从追踪中整理数据集。用途:将代理部署到Foundry、托管代理、创建代理、调用代理、评估代理、运行批量评估、持续评估、持续监控、持续评估状态、优化提示、改进提示、提示优化器、优化代理指令、改进代理...
officialdevelopmentdevops
azure-ai
microsoft
用于Azure AI:搜索、语音、OpenAI、文档智能。支持搜索、向量/混合搜索、语音转文字、文字转语音、转录、OCR。适用场景:AI搜索、查询搜索、向量搜索、混合搜索、语义搜索、语音转文字、文字转语音、转录、OCR、文字转语音。
officialdevelopmentapi
azure-deploy
microsoft
对已准备好的应用程序执行Azure部署,这些程序需包含现有的.azure/deployment-plan.md和基础设施文件。当用户要求创建新应用程序时,请勿使用此技能——应改用azure-prepare。此技能运行azd up、azd deploy、terraform apply和az deployment命令,并内置错误恢复机制。需要来自azure-prepare的.azure/deployment-plan.md以及来自azure-validate的已验证状态。适用场景:"运行azd up"、"运行azd deploy"、"执行部署"...
officialdevopsaws
azure-storage
microsoft
Azure存储服务,包括Blob存储、文件共享、队列存储、表存储和Data Lake。解答关于存储访问层(热、冷、冷、归档)的问题,说明各层的使用场景及对比。提供对象存储、SMB文件共享、异步消息传递、NoSQL键值存储和大数据分析。包含生命周期管理。用途:Blob存储、文件共享、队列存储、表存储、Data Lake、上传文件、下载Blob、存储账户、访问层等。
officialdevelopmentdatabase
azure-diagnostics
microsoft
使用AppLens、Azure Monitor、资源健康和安全分类调试Azure生产问题。适用场景:调试生产问题、排查应用服务、应用服务CPU过高、应用服务部署失败、排查容器应用、排查函数、排查AKS、kubectl无法连接、kube-system/CoreDNS故障、Pod挂起、CrashLoop、节点未就绪、升级失败、分析日志、KQL、洞察、镜像拉取失败、冷启动问题、健康探测失败……
officialdevopsdevelopment
azure-prepare
microsoft
为Azure应用准备部署(基础设施Bicep/Terraform、azure.yaml、Dockerfile)。用于创建/现代化或创建+部署;不用于跨云迁移(使用azure-cloud-migrate)。请勿用于:copilot-sdk应用(使用azure-hosted-copilot-sdk)。适用场景:"创建应用"、"构建Web应用"、"创建API"、"创建无服务器HTTP API"、"创建前端"、"创建后端"、"构建服务"、"现代化应用"、"更新应用"、"添加身份验证"、"添加缓存"、"托管在Azure上"、"创建并...
officialdevelopmentdevops
azure-validate
microsoft
部署前对Azure就绪状态进行验证。对配置、基础设施(Bicep或Terraform)、RBAC角色分配、托管标识权限及先决条件进行深度检查,然后再部署。适用场景:验证我的应用、检查部署就绪状态、运行预检、验证配置、检查是否可部署、验证azure.yaml、验证Bicep、部署前测试、排查部署错误、验证Azure Functions、验证函数应用、验证无服务器...
officialdevopstesting