| 1 | Overview | 2 | Architecture Overview |
| 3 | Azure Service Topology | 4 | Implementation Guide |
| 5 | Decision Criteria | 6 | Cost Model |
| 7 | Anti-Patterns to Avoid | 8 | References |
Overview
When batch processing is too slow — IoT telemetry that must trigger device alerts within seconds, financial transactions that must be scored before a user sees a response, clickstreams that must personalise an experience before the next page load — streaming is the architecture. Azure offers a unified track: Event Hubs for ingestion with native Kafka-protocol surface, Stream Analytics for managed SQL-based windowed processing, and Data Lake Gen2 for durable storage that outlives retention windows. Teams migrating from on-premises Kafka can repoint brokers to port 9093 with no application changes.
Architecture Overview
%%{init:{'theme':'base','themeVariables':{'fontSize':'14px','fontFamily':'Inter, system-ui, sans-serif','primaryColor':'#DBEAFE','primaryTextColor':'#1e3a5f','primaryBorderColor':'#2563EB','lineColor':'#374151','clusterBkg':'#F9FAFB','clusterBorder':'#D1D5DB','edgeLabelBackground':'#FFFFFF'},'flowchart':{'curve':'orthogonal','padding':30,'nodeSpacing':65,'rankSpacing':75,'useMaxWidth':true}}}%% flowchart TD START([High-velocity Data Source]) START --> INGEST{Ingestion Layer} INGEST -->|High-throughput events| EH[Event Hubs Premium\nKafka surface port 9093\n32 partitions, 7-day retention] INGEST -->|Raw archive required| CAP[Event Hubs Capture\nAvro to Data Lake\nRetention independent of hub] EH --> PROCESS{Processing Layer} PROCESS -->|Windowed analytics| ASA[Stream Analytics\nSAQL tumbling and hopping windows\nManaged scaling in SUs] PROCESS -->|Threshold alerting| FN_A[Azure Functions\nPer-event trigger\nAlert routing] CAP --> DL[Data Lake Gen2\nHierarchical namespace\nRaw and curated zones] ASA & FN_A --> STORE{Storage Layer} STORE -->|Document and state| COSMOS[Cosmos DB\nPre-aggregated results\nLow-latency reads] STORE -->|Async fan-out| SB[Service Bus\nTopic subscriptions\nDownstream consumers] DL --> ADF[Data Factory\nBatch ELT 90+ connectors\nOrchestrated pipelines] ADF --> SYNAPSE[Synapse Analytics\nSQL pools\nBI and reporting] COSMOS & SB & SYNAPSE --> DONE([Insights Available — Real Time and Historical]) style START fill:#4f8ef7,color:#fff style DONE fill:#10b981,color:#fff style ASA fill:#e0f2fe style DL fill:#e0f2fe
Azure Service Topology
%%{init:{'theme':'base','themeVariables':{'fontSize':'14px','fontFamily':'Inter, system-ui, sans-serif','primaryColor':'#DBEAFE','primaryTextColor':'#1e3a5f','primaryBorderColor':'#2563EB','lineColor':'#374151','clusterBkg':'#F9FAFB','clusterBorder':'#D1D5DB','edgeLabelBackground':'#FFFFFF'},'flowchart':{'curve':'orthogonal','padding':30,'nodeSpacing':65,'rankSpacing':75,'useMaxWidth':true}}}%% flowchart TD SOURCES[Data Sources\nIoT devices, Applications\nMicroservices via Kafka SDK] subgraph EH_NS["Event Hubs Premium Namespace"] HUB[device-telemetry Hub\n32 partitions, 7-day retention\nKafka surface on port 9093] CAPTURE[Event Hubs Capture\nAvro format, 300s or 300 MB\nArchive to raw-telemetry container] CG1[Consumer Group\nstream-analytics] CG2[Consumer Group\ndata-lake-writer] CG3[Consumer Group\nreal-time-alerting] end subgraph PROCESS_BLOCK["Real-Time Processing"] ASA_T[Stream Analytics StandardV2\n5-min tumbling window AVG and MAX\n3 SUs, PARTITION BY deviceId] FN_T[Alert Function\nTemperature threshold trigger\nService Bus output] end subgraph DESTINATIONS["Data Destinations"] DL_T[Data Lake Gen2\nRaw and curated zones\nHierarchical namespace] COSMOS_T[Cosmos DB\nAggregated telemetry\nPer-device real-time state] SB_T[Service Bus\nAlert topic\nDownstream subscribers] end ADF_T[Data Factory\nBatch ELT pipelines\nData Lake to Synapse] SYNAPSE_T[Synapse Analytics\nSQL pools and Spark\nBI and historical queries] SOURCES --> HUB HUB --> CAPTURE HUB --> CG1 & CG2 & CG3 CG1 --> ASA_T CG3 --> FN_T CG2 --> DL_T CAPTURE --> DL_T ASA_T --> COSMOS_T FN_T --> SB_T DL_T --> ADF_T ADF_T --> SYNAPSE_T style SOURCES fill:#4f8ef7,color:#fff style CAPTURE fill:#e0f2fe style ASA_T fill:#e0f2fe
Implementation Guide
Bicep — Event Hubs Premium Namespace with Capture
param location string = resourceGroup().location
param namespaceName string = 'evhns-telemetry-prod'
param dataLakeAccountName string
param dataLakeContainerName string = 'raw-telemetry'
resource eventHubNamespace 'Microsoft.EventHub/namespaces@2022-10-01-preview' = {
name: namespaceName
location: location
sku: {
name: 'Premium'
tier: 'Premium'
capacity: 1
}
properties: {
zoneRedundant: true
disableLocalAuth: true
minimumTlsVersion: '1.2'
}
}
resource eventHub 'Microsoft.EventHub/namespaces/eventhubs@2022-10-01-preview' = {
parent: eventHubNamespace
name: 'device-telemetry'
properties: {
partitionCount: 32 // permanent — size >= 2x peak consumer throughput
messageRetentionInDays: 7
captureDescription: {
enabled: true
encoding: 'Avro'
intervalInSeconds: 300
sizeLimitInBytes: 314572800 // 300 MB
destination: {
name: 'EventHubArchive.AzureBlockBlob'
properties: {
storageAccountResourceId: resourceId(
'Microsoft.Storage/storageAccounts', dataLakeAccountName)
blobContainer: dataLakeContainerName
archiveNameFormat: (
'{Namespace}/{EventHub}/{PartitionId}/'
+ '{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}')
}
}
}
}
}
resource cgStreamAnalytics 'Microsoft.EventHub/namespaces/eventhubs/consumergroups@2022-10-01-preview' = {
parent: eventHub
name: 'stream-analytics'
}
resource cgDataLakeWriter 'Microsoft.EventHub/namespaces/eventhubs/consumergroups@2022-10-01-preview' = {
parent: eventHub
name: 'data-lake-writer'
}
resource cgRealTimeAlerting 'Microsoft.EventHub/namespaces/eventhubs/consumergroups@2022-10-01-preview' = {
parent: eventHub
name: 'real-time-alerting'
}
Terraform equivalent: Use
azurerm_eventhub_namespace(sku ="Premium", zone_redundant = true, local_authentication_enabled = false),azurerm_eventhubwithcapture_descriptionblock (encoding ="Avro", interval_in_seconds = 300), andazurerm_eventhub_consumer_groupfor each of the three consumer groups.
Bicep — Stream Analytics Job with Windowed Query
resource streamAnalyticsJob 'Microsoft.StreamAnalytics/streamingjobs@2021-10-01-preview' = {
name: 'asa-telemetry-prod'
location: location
identity: {
type: 'SystemAssigned'
}
properties: {
sku: {
name: 'StandardV2'
}
eventsLateArrivalMaxDelayInSeconds: 60
eventsOutOfOrderPolicy: 'Adjust'
outputErrorPolicy: 'Drop'
transformation: {
name: 'MainTransformation'
properties: {
streamingUnits: 3 // start here; load test and scale on SU% metric
query: '''
-- 5-minute tumbling window aggregation to Cosmos DB
SELECT
deviceId,
System.Timestamp() AS windowEnd,
AVG(Temperature) AS avgTemperature,
MAX(Temperature) AS maxTemperature,
COUNT(*) AS eventCount
INTO [cosmos-output]
FROM [event-hubs-input] PARTITION BY deviceId
TIMESTAMP BY EventEnqueuedUtcTime
GROUP BY deviceId, TumblingWindow(minute, 5)
-- Per-event alert threshold to Service Bus
SELECT
deviceId,
Temperature,
EventEnqueuedUtcTime AS alertTime
INTO [alerts-output]
FROM [event-hubs-input]
WHERE Temperature > 85
'''
}
}
}
}
Terraform equivalent: Use
azurerm_stream_analytics_jobwithstreaming_units = 3,events_late_arrival_max_delay_in_seconds = 60,events_out_of_order_policy = "Adjust". Define the query in thetransformation_queryfield. Wire inputs and outputs viaazurerm_stream_analytics_stream_input_eventhuband the appropriate output resources.
Schema Registry on Event Hubs Premium
Event Hubs Premium includes a Schema Registry. Register schemas once per namespace; producers and consumers reference by schema ID. Avro, JSON Schema, and Protobuf are supported. Schema evolution uses compatibility modes (BACKWARD, FORWARD, FULL) identical to Confluent Schema Registry semantics — existing Kafka Schema Registry client code requires only a URL change.
Decision Criteria
| Factor | Event Hubs | Stream Analytics | Data Factory |
|---|---|---|---|
| Latency | Sub-second ingest | Sub-second to minutes (window size) | Minutes to hours |
| Protocol | AMQP, Kafka (port 9093), HTTPS | SQL-like SAQL | REST, 90+ connectors |
| Administration | Minimal (namespace + hub) | None (fully managed) | Low (pipeline authoring) |
| Replay | Up to 90-day retention | Not applicable — stateless processing | Full re-run from source |
| Ordering | Per-partition ordered | Preserved within partition | Not guaranteed |
| Use when | High-throughput event ingestion, Kafka migration | Windowed aggregations, real-time routing | Batch ELT to Synapse, scheduled pipelines |
Cost Model
Event Hubs Premium is billed per Processing Unit (PU) at approximately $0.928/PU/hour in East US. One PU handles roughly 1 MB/s ingest and 2 MB/s egress. A single-PU namespace running continuously costs approximately $667/month. Event Hubs Capture adds $0.10 per million events archived. Stream Analytics is billed per Streaming Unit (SU) at approximately $0.081/SU/hour — three SUs costs around $175/month. Data Factory charges per pipeline activity run and data movement volume; a moderate batch workload (1,000 runs/day) costs approximately $30–60/month.
Cost optimisation levers
- Right-size PUs at namespace level, not event hub level — one PU supports multiple event hubs within the namespace.
- Enable Event Hubs Capture only on hubs where durable storage is required; egress charges apply to captured bytes leaving the namespace.
- Stream Analytics SU scaling is opaque to consumers — monitor the SU% utilisation metric; scale up when sustained SU% exceeds 80%.
- Use Dedicated Event Hubs clusters (20 CUs minimum, ~$14,600/month) only when namespace isolation, BYOK, or >10 PUs are required.
- Data Factory costs are dominated by activity runs — batch small payloads into fewer large runs rather than triggering per-event.
Anti-Patterns to Avoid
Two independent services (Stream Analytics and a custom processor) reading from the same consumer group. Only one active reader per consumer group is supported; the second reader will be disconnected or will receive inconsistent offsets, causing event loss or duplication.
Provision one consumer group per consuming application. Consumer groups are free on Event Hubs — create stream-analytics, data-lake-writer, and real-time-alerting as separate groups, each maintaining its own offset independently.
Treating the event hub's retention window (7 days) as the data archive. When the retention window expires, events are gone. Downstream reprocessing, audit, and backfill jobs have no source.
Enable Event Hubs Capture on every production hub. Capture writes Avro files to Data Lake Gen2 at a 300-second or 300 MB interval — whichever is reached first. Retention on the Data Lake is independent of the hub retention window and can be managed via lifecycle policies.
Creating an event hub with the default 4 partitions for a workload that grows to 20 concurrent consumers or 10 MB/s ingest. Partition count is permanent on Event Hubs — it cannot be increased after creation without recreating the hub and its consumer group offsets.
Size partitions at hub creation to at least twice the peak expected consumer parallelism. The rule is partitionCount ≥ 2 × peak consumer throughput in MB/s. For most production workloads, 32 partitions is the safe default.
A 1-SU job processes all data on a single thread. PARTITION BY clauses, which unlock true parallelism, require at least 6 SUs. Under sustained load, a 1-SU job will lag, buffer, and eventually drop late events.
Start production jobs at 3 SUs minimum. Enable PARTITION BY deviceId (or equivalent high-cardinality key) in the query and scale to 6+ SUs when SU% exceeds 80% under load testing. Use the Stream Analytics job diagram in the portal to identify bottleneck steps.
Flowchart
References
- Microsoft — Azure Event Hubs scalability and partition count. https://learn.microsoft.com/azure/event-hubs/event-hubs-scalability
- Microsoft — Event Hubs Capture overview. https://learn.microsoft.com/azure/event-hubs/event-hubs-capture-overview
- Microsoft — Azure Stream Analytics windowing functions. https://learn.microsoft.com/azure/stream-analytics/stream-analytics-window-functions
- Microsoft — Azure Event Hubs for Apache Kafka. https://learn.microsoft.com/azure/event-hubs/azure-event-hubs-kafka-overview
- Microsoft — Azure Stream Analytics scaling with Streaming Units. https://learn.microsoft.com/azure/stream-analytics/stream-analytics-streaming-unit-consumption
- Microsoft — Azure Data Lake Storage Gen2 hierarchical namespace. https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-namespace
- Portal: AWS data streaming comparison