Stream designs resilient batch, streaming, and hybrid data pipelines. Default to one clear architecture with explicit quality gates, idempotency, lineage, schema evolution, and recovery paths.
| BATCH | latency >= 1 minute, scheduled analytics, complex warehouse transforms | Airflow/Dagster + dbt/SQL |
| STREAMING | latency < 1 minute, continuous events, operational projections | Kafka + Flink/Spark/consumer apps |
| HYBRID | both real-time outputs and warehouse-grade history are required | CDC/stream hot path + batch/dbt cold path |
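The mode selection above can be sketched as a small decision function. This is a minimal illustration, not part of Stream itself: the names `Mode`, `Requirements`, and `choose_mode` are hypothetical, and the only rules encoded are the 1-minute latency threshold and the "real-time plus warehouse-grade history" condition from the table.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    BATCH = "batch"
    STREAMING = "streaming"
    HYBRID = "hybrid"


@dataclass(frozen=True)
class Requirements:
    # Hypothetical requirement fields, introduced for this sketch only.
    max_latency_seconds: float
    needs_warehouse_history: bool = False


def choose_mode(req: Requirements) -> Mode:
    """Pick a pipeline mode from the latency and history requirements."""
    realtime = req.max_latency_seconds < 60  # table threshold: 1 minute
    if realtime and req.needs_warehouse_history:
        return Mode.HYBRID
    if realtime:
        return Mode.STREAMING
    return Mode.BATCH
```

For example, `choose_mode(Requirements(max_latency_seconds=5, needs_warehouse_history=True))` selects `Mode.HYBRID`. A real selection would also weigh volume, consistency, and cost, which this sketch leaves out.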
| FRAME | sources, sinks, latency, volume, consistency, PII, and replay requirements |
| LAYOUT | architecture choice, orchestration model, contracts, partitioning, and storage layers |
| OPTIMIZE | idempotency, incrementality, cost, failure recovery, and observability plan |
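The idempotency goal in the OPTIMIZE phase can be illustrated with a keyed write: replaying the same batch after a failure should leave the sink unchanged. The sketch below is one possible approach, assuming an in-memory SQLite sink and a hypothetical `events` table keyed by `event_id`; the helper names `make_sink`, `load_batch`, and `row_count` are invented for the example.

```python
import sqlite3


def make_sink() -> sqlite3.Connection:
    """Create an in-memory sink with a primary key on the event identifier."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY, amount REAL)")
    return conn


def load_batch(conn: sqlite3.Connection, rows: list[tuple[str, float]]) -> None:
    """Write rows keyed by event_id; re-running the same batch adds no duplicates."""
    conn.executemany(
        # INSERT OR REPLACE overwrites any existing row with the same primary key.
        "INSERT OR REPLACE INTO events (event_id, amount) VALUES (?, ?)",
        rows,
    )
    conn.commit()


def row_count(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Calling `load_batch` twice with the same rows leaves `row_count` unchanged, which is the property a retry or replay path relies on. Warehouse targets would typically express the same idea with a `MERGE` statement or an incremental model rather than SQLite.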
ETL/ELT pipeline design, data flow visualization, batch/streaming selection, and Kafka/Airflow/dbt design. Use when you need to build data pipelines and manage data quality. Source: simota/agent-skills.