Stream designs resilient batch, streaming, and hybrid data pipelines. Default to one clear architecture with explicit quality gates, idempotency, lineage, schema evolution, and recovery paths.

| Mode | Use when | Typical stack |
| --- | --- | --- |
| BATCH | latency >= 1 minute, scheduled analytics, complex warehouse transforms | Airflow/Dagster + dbt/SQL |
| STREAMING | latency < 1 minute, continuous events, operational projections | Kafka + Flink/Spark/consumer apps |
| HYBRID | both real-time outputs and warehouse-grade history are required | CDC/stream hot path + batch/dbt cold path |
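
The mode choice above can be sketched as a small decision function. This is only an illustration of the one-minute latency threshold and the hybrid rule; the `Requirements` class, its field names, and `choose_mode` are hypothetical and not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hypothetical summary of what a pipeline must deliver."""
    max_latency_seconds: float      # freshness target for the fastest output
    needs_warehouse_history: bool   # is warehouse-grade history also required?


def choose_mode(req: Requirements) -> str:
    """Return BATCH, STREAMING, or HYBRID using the one-minute threshold."""
    realtime = req.max_latency_seconds < 60
    if realtime and req.needs_warehouse_history:
        return "HYBRID"      # stream hot path plus batch/dbt cold path
    if realtime:
        return "STREAMING"   # continuous events, operational projections
    return "BATCH"           # scheduled analytics, warehouse transforms


nightly = choose_mode(Requirements(max_latency_seconds=3600, needs_warehouse_history=True))
alerts = choose_mode(Requirements(max_latency_seconds=5, needs_warehouse_history=False))
orders = choose_mode(Requirements(max_latency_seconds=5, needs_warehouse_history=True))
```

Real decisions would also weigh volume, consistency, and cost, so treat the function as a starting heuristic rather than a rule.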

| Phase | What to settle |
| --- | --- |
| FRAME | sources, sinks, latency, volume, consistency, PII, and replay requirements |
| LAYOUT | architecture choice, orchestration model, contracts, partitioning, and storage layers |
| OPTIMIZE | idempotency, incrementality, cost, failure recovery, and observability plan |
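
As one illustration of the idempotency item in the OPTIMIZE phase, the sketch below loads rows keyed by a primary key so that replaying the same batch leaves the table unchanged. It uses Python's standard `sqlite3` module as a stand-in for a real sink; the `events` table, its columns, and `load_batch` are assumptions made for the example.

```python
import sqlite3


def load_batch(conn: sqlite3.Connection, rows: list[tuple[str, float]]) -> None:
    """Write (event_id, amount) rows so that a replayed batch adds no duplicates."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            # The primary key on event_id makes a repeated row overwrite itself.
            "INSERT OR REPLACE INTO events (event_id, amount) VALUES (?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_id TEXT PRIMARY KEY, amount REAL)")

batch = [("e1", 10.0), ("e2", 5.5)]
load_batch(conn, batch)
load_batch(conn, batch)  # simulated retry after a failure

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

After both calls `count` is 2, since the retry rewrites the same two keys. A warehouse sink would typically use its own merge or upsert statement, but the design point is the same: a stable key plus overwrite semantics makes retries and replays safe.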
ETL/ELT pipeline design, data flow visualization, batch versus streaming selection, and Kafka/Airflow/dbt design. Use when you need to build data pipelines and manage data quality. Source: simota/agent-skills.