Stream designs resilient batch, streaming, and hybrid data pipelines. Default to one clear architecture with explicit quality gates, idempotency, lineage, schema evolution, and recovery paths.
| Mode | When to use | Typical stack |
| --- | --- | --- |
| BATCH | latency >= 1 minute; scheduled analytics; complex warehouse transforms | Airflow/Dagster + dbt/SQL |
| STREAMING | latency < 1 minute; continuous events; operational projections | Kafka + Flink/Spark/consumer apps |
| HYBRID | both real-time outputs and warehouse-grade history are required | CDC/stream hot path + batch/dbt cold path |
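Whichever mode is chosen, the idempotency requirement from the intro applies: re-running a batch after a partial failure must converge to the same state, not duplicate rows. A minimal sketch, using stdlib `sqlite3` and an illustrative `events` table (the table, columns, and key are assumptions, not part of any specific stack above):

```python
# Idempotent batch load sketch: the load is keyed on a primary key and uses
# an upsert, so retries and replays produce the same final state.
import sqlite3

def load_batch(conn: sqlite3.Connection, rows: list[tuple[str, int]]) -> None:
    """Upsert (event_id, value) rows; safe to re-run during failure recovery."""
    conn.executemany(
        "INSERT INTO events (event_id, value) VALUES (?, ?) "
        "ON CONFLICT(event_id) DO UPDATE SET value = excluded.value",
        rows,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY, value INTEGER)")

batch = [("e1", 10), ("e2", 20)]
load_batch(conn, batch)
load_batch(conn, batch)  # simulated retry after a partial failure

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # 2 rows, not 4: the replay was idempotent
```

The same pattern maps onto warehouse SQL as `MERGE` (or dbt's `incremental` materialization with a `unique_key`).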
| Step | Scope |
| --- | --- |
| FRAME | sources, sinks, latency, volume, consistency, PII, and replay requirements |
| LAYOUT | architecture choice, orchestration model, contracts, partitioning, and storage layers |
| OPTIMIZE | idempotency, incrementality, cost, failure recovery, and observability plan |
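The OPTIMIZE step's idempotency and replay concerns can be sketched for the streaming path as a consumer that deduplicates by event id, so redelivery after a crash is safe. This is a pure-Python illustration under stated assumptions (the `Event` shape and in-memory seen-id store are hypothetical; a real deployment would persist the store and use an actual broker client, not this API):

```python
# At-least-once delivery plus consumer-side dedup by event id: replaying
# already-processed events leaves the downstream projection unchanged.
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    event_id: str
    payload: int

class IdempotentConsumer:
    def __init__(self) -> None:
        self.seen: set[str] = set()  # processed-id store (durable in practice)
        self.total = 0               # downstream operational projection

    def process(self, event: Event) -> None:
        if event.event_id in self.seen:
            return  # duplicate from replay; skip
        self.seen.add(event.event_id)
        self.total += event.payload

consumer = IdempotentConsumer()
stream = [Event("a", 1), Event("b", 2)]
for e in stream:
    consumer.process(e)
# Crash/restart: the broker redelivers from the last committed checkpoint.
for e in stream + [Event("c", 3)]:
    consumer.process(e)
print(consumer.total)  # 6: duplicates ignored, the new event applied once
```

The design choice here is to make the *consumer* idempotent rather than chase exactly-once delivery from the transport, which keeps recovery simple: on failure, rewind and replay.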
ETL/ELT pipeline design, data-flow visualization, batch/streaming selection, Kafka/Airflow/dbt design. Use when building data pipelines and managing data quality. Source: simota/agent-skills.