A nightly batch job is fine for a dashboard nobody reads before 9 a.m. It is useless for a fraud check, a live recommendation, or an agent that needs the current state of the world. Real-time features demand streaming pipelines — and a different set of trade-offs.
From "when did it run" to "is it caught up"
Batch thinking asks whether last night’s job succeeded. Streaming thinking asks whether the pipeline is keeping up with the firehose right now. Lag, the gap between the newest event written and the newest event processed, becomes the metric that matters, not job status.
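As a rough illustration of "is it caught up", lag can be expressed as the distance between the newest offset in the log and the offset the consumer has committed. The sketch below is not tied to any particular broker; `PartitionState`, the partition names, and the `max_lag` threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PartitionState:
    """Offsets for one partition of a hypothetical event log."""
    latest_offset: int      # offset of the newest event written to the log
    committed_offset: int   # offset the consumer has processed up to


def consumer_lag(partitions: dict[str, PartitionState]) -> dict[str, int]:
    """Return the number of unprocessed events per partition."""
    return {
        name: max(state.latest_offset - state.committed_offset, 0)
        for name, state in partitions.items()
    }


def is_caught_up(partitions: dict[str, PartitionState], max_lag: int) -> bool:
    """True when every partition is within the allowed lag (an assumed threshold)."""
    return all(lag <= max_lag for lag in consumer_lag(partitions).values())


# Illustrative numbers only.
partitions = {
    "events-0": PartitionState(latest_offset=1_000, committed_offset=990),
    "events-1": PartitionState(latest_offset=2_000, committed_offset=1_400),
}
print(consumer_lag(partitions))
print(is_caught_up(partitions, max_lag=100))
```

Checking every partition rather than an average matters here: one stalled partition can hide behind healthy ones while the feature that depends on it silently goes stale.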
The pieces of a streaming stack
- Ingestion: an event log that buffers and replays — the backbone.
- Processing: stateful transforms that handle late and out-of-order events.
- Serving: a low-latency store the model or app reads from.
- Observability: lag, throughput, and dead-letter monitoring throughout.
In streaming, you do not get to retry the night. You design for failure while the data keeps coming.
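One common way to design for failure while the data keeps coming is to route events that cannot be processed to a dead-letter queue instead of halting the pipeline. The following is a minimal in-memory sketch of that idea; `parse_event`, `process_stream`, and the `user_id` field are assumptions made for illustration, and a real pipeline would write dead letters to durable storage.

```python
import json
from collections import deque


def parse_event(raw: str) -> dict:
    """Parse one raw event; raises ValueError on malformed input."""
    event = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(event, dict) or "user_id" not in event:
        raise ValueError("missing user_id")
    return event


def process_stream(raw_events, dead_letters: deque) -> list[dict]:
    """Process every event; park failures instead of stopping the stream."""
    processed = []
    for raw in raw_events:
        try:
            processed.append(parse_event(raw))
        except ValueError as exc:
            # Keep the original payload and the reason so it can be
            # inspected and replayed later.
            dead_letters.append({"raw": raw, "error": str(exc)})
    return processed


dead_letters = deque()
good = process_stream(
    ['{"user_id": 1}', "not json", '{"amount": 5}', '{"user_id": 2}'],
    dead_letters,
)
print(len(good), len(dead_letters))
```

The size of the dead-letter queue is itself a signal worth monitoring: a sudden jump usually points at an upstream schema change rather than a handful of bad records.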
Plan for late and duplicate data
Events arrive out of order, twice, or hours late. Idempotent processing and well-chosen windowing are not optional extras — they are what keeps a streaming feature correct when the real world refuses to cooperate.
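To make idempotent processing and windowing concrete, here is a small sketch of a tumbling-window counter that ignores duplicate deliveries and accepts out-of-order events within a grace period. The window size, the allowed lateness, and the simplified watermark (the highest event time seen so far) are assumptions chosen for illustration, not a prescription.

```python
from collections import defaultdict

WINDOW_SECONDS = 60      # assumed tumbling-window size
ALLOWED_LATENESS = 120   # assumed grace period for late events, in seconds


class WindowedCounter:
    """Counts events per window; duplicate deliveries have no effect."""

    def __init__(self):
        self.counts = defaultdict(int)  # window start time -> event count
        self.seen_ids = set()           # event ids already applied
        self.watermark = 0              # highest event time seen so far
        self.dropped = []               # ids of events that arrived too late

    def add(self, event_id: str, event_time: int) -> None:
        if event_id in self.seen_ids:
            return  # duplicate delivery: applying it again changes nothing
        self.watermark = max(self.watermark, event_time)
        if event_time < self.watermark - ALLOWED_LATENESS:
            self.dropped.append(event_id)  # beyond the grace period
            return
        self.seen_ids.add(event_id)
        window_start = event_time - event_time % WINDOW_SECONDS
        self.counts[window_start] += 1


counter = WindowedCounter()
for event_id, event_time in [
    ("a", 10), ("b", 70), ("a", 10),  # "a" is delivered twice
    ("c", 50),                        # out of order, but within the grace period
    ("d", 400), ("e", 30),            # "e" arrives after the grace period has passed
]:
    counter.add(event_id, event_time)

print(dict(counter.counts))
print(counter.dropped)
```

In this sketch `seen_ids` grows without bound; a production version would expire ids once their window can no longer change. Dropped events are recorded rather than discarded silently, so the cost of the chosen grace period stays visible.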
Do you even need streaming?
Not always, and this is the question to answer before building anything. If a decision can wait for the next batch window, batch is simpler, cheaper, and easier to operate. Reserve streaming for features that genuinely need data that is only seconds old:
- Fraud and risk checks that must block a transaction in real time.
- Live personalisation and recommendations that react to the current session.
- Agents and automations that act on the current state of the world.
Match the architecture to the latency the feature actually needs, design for disorder from day one, and a streaming pipeline becomes a durable asset rather than a permanent operational headache.



