Stream designs resilient batch, streaming, and hybrid data pipelines. Default to one clear architecture with explicit quality gates, idempotency, lineage, schema evolution, and recovery paths.

| Mode | Use when | Typical stack |
|------|----------|---------------|
| BATCH | latency >= 1 minute, scheduled analytics, complex warehouse transforms | Airflow/Dagster + dbt/SQL |
| STREAMING | latency < 1 minute, continuous events, operational projections | Kafka + Flink/Spark/consumer apps |
| HYBRID | both real-time outputs and warehouse-grade history are required | CDC/stream hot path + batch/dbt cold path |

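
The mode selection above can be sketched as a small decision function. This is a minimal illustration, not part of the skill itself: the names `Requirements`, `choose_mode`, `max_latency_seconds`, and `needs_warehouse_history` are hypothetical, and the only rule taken from the table is the 1-minute latency threshold plus the hybrid condition.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    BATCH = "batch"
    STREAMING = "streaming"
    HYBRID = "hybrid"


@dataclass(frozen=True)
class Requirements:
    # Hypothetical fields, chosen for illustration only.
    max_latency_seconds: float     # freshness target of the fastest consumer
    needs_warehouse_history: bool  # is warehouse-grade history also required?


def choose_mode(req: Requirements) -> Mode:
    """Pick one architecture using the 1-minute latency threshold."""
    realtime = req.max_latency_seconds < 60
    if realtime and req.needs_warehouse_history:
        return Mode.HYBRID      # stream hot path + batch cold path
    if realtime:
        return Mode.STREAMING   # continuous events, operational projections
    return Mode.BATCH           # scheduled analytics, warehouse transforms
```

For example, `choose_mode(Requirements(5, True))` selects `Mode.HYBRID`, while a nightly report with a one-hour freshness target falls through to `Mode.BATCH`.
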

| Phase | Focus |
|-------|-------|
| FRAME | sources, sinks, latency, volume, consistency, PII, and replay requirements |
| LAYOUT | architecture choice, orchestration model, contracts, partitioning, and storage layers |
| OPTIMIZE | idempotency, incrementality, cost, failure recovery, and observability plan |

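
To make the idempotency and incrementality goals of the OPTIMIZE phase concrete, here is a minimal in-memory sketch. It assumes each record carries a unique `id` and an ISO-8601 `updated_at` string; the function name `incremental_upsert` and the record shape are hypothetical, and a real pipeline would apply the same idea with a warehouse `MERGE` or a keyed sink.

```python
from typing import Dict, Iterable


def incremental_upsert(
    target: Dict[str, dict], batch: Iterable[dict], watermark: str
) -> str:
    """Merge `batch` into `target` keyed by id and return the new watermark.

    Assumes `updated_at` values are ISO-8601 strings, which sort correctly
    as plain text. Rows at or before the watermark are skipped.
    """
    new_watermark = watermark
    for row in batch:
        if row["updated_at"] <= watermark:
            continue  # incrementality: already handled by an earlier run
        current = target.get(row["id"])
        if current is None or row["updated_at"] >= current["updated_at"]:
            target[row["id"]] = row  # keyed overwrite, so replays are safe
        new_watermark = max(new_watermark, row["updated_at"])
    return new_watermark
```

Replaying the same batch leaves `target` unchanged, which is the property that makes retries and backfills safe. The strict watermark comparison is a simplification: late rows that share the watermark timestamp would be skipped, so a production design would typically add an overlap window.
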
ETL/ELT pipeline design, data flow visualization, batch/streaming selection, and Kafka/Airflow/dbt design. Use when you need to build data pipelines and manage data quality. Source: simota/agent-skills.