voice-agents
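Voice agents represent the frontier of AI interaction: humans conversing naturally with AI systems. The challenge is not only speech recognition and synthesis but also achieving a natural conversational flow at under 800 ms of latency while handling interruptions, background noise, and emotional nuance. This skill covers two architectures: speech-to-speech (OpenAI Realtime API; lowest latency, most natural) and pipeline (STT→LLM→TTS; more control, easier to debug). Key insight: latency is the constraint. …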
SKILL.md
You are a voice AI architect who has shipped production voice agents handling millions of calls. You understand the physics of latency - every component adds milliseconds, and the sum determines whether conversations feel natural or awkward.
Your core insight: Two architectures exist. Speech-to-speech (S2S) models like OpenAI Realtime API preserve emotion and achieve lowest latency but are less controllable. Pipeline architectures (STT→LLM→TTS) give you control at each step but add latency. Mos
| Issue | Severity | Solution |
|-------|----------|----------|
| Issue | critical | # Measure and budget latency for each component: |
| Issue | high | # Target jitter metrics: |
| Issue | high | # Use semantic VAD: |
| Issue | high | # Implement barge-in detection: |
| Issue | medium | # Constrain response length in prompts: |
| Issue | medium | # Prompt for spoken format: |
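The first item in the table, measuring and budgeting latency per component, can be sketched roughly as follows. This is a minimal illustration, not the skill's own implementation: the stage names and millisecond values are hypothetical placeholders, and the only figure taken from the text is the 800 ms target.

```python
from dataclasses import dataclass

# The sub-800 ms target comes from the skill description above.
BUDGET_MS = 800.0


@dataclass
class Stage:
    """One component of a voice turn and its measured latency."""
    name: str
    latency_ms: float


def total_latency(stages):
    """Sum per-stage latencies for a single conversational turn."""
    return sum(s.latency_ms for s in stages)


def over_budget(stages, budget_ms=BUDGET_MS):
    """Milliseconds by which the turn exceeds the budget (0 if within it)."""
    return max(0.0, total_latency(stages) - budget_ms)


def slowest(stages):
    """The stage contributing the most latency, i.e. the first one to optimize."""
    return max(stages, key=lambda s: s.latency_ms)


# Hypothetical measurements for an STT -> LLM -> TTS pipeline (illustrative only).
pipeline = [
    Stage("vad_endpointing", 200),
    Stage("stt", 150),
    Stage("llm_first_token", 300),
    Stage("tts_first_audio", 120),
    Stage("network", 80),
]

if __name__ == "__main__":
    print(f"total: {total_latency(pipeline)} ms")
    print(f"over budget by: {over_budget(pipeline)} ms")
    print(f"slowest stage: {slowest(pipeline).name}")
```

In practice the numbers would come from timestamps recorded at each stage boundary; the point of the sketch is only that the budget is a sum, so every component has to be measured.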
Citable information
Stable fields and commands prepared for search and AI citation.
- Install command
npx skills add https://github.com/sebas-aikon-intelligence/antigravity-awesome-skills --skill voice-agents
- Category
- Developer tools
- Verification
- —
- Listed on
- 2026-02-01
- Updated on
- 2026-02-18
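Quick answers
What is voice-agents?
Voice agents represent the frontier of AI interaction: humans conversing naturally with AI systems. The challenge is not only speech recognition and synthesis but also achieving a natural conversational flow at under 800 ms of latency while handling interruptions, background noise, and emotional nuance. This skill covers two architectures: speech-to-speech (OpenAI Realtime API; lowest latency, most natural) and pipeline (STT→LLM→TTS; more control, easier to debug). Key insight: latency is the constraint. … Source: sebas-aikon-intelligence/antigravity-awesome-skills.
How do I install voice-agents?
Open your terminal or command-line tool (Terminal, iTerm, Windows Terminal, etc.), then copy and run the following command: npx skills add https://github.com/sebas-aikon-intelligence/antigravity-awesome-skills --skill voice-agents. Once installation finishes, the skill is configured automatically in your AI coding environment and can be used in Claude Code or Cursor.
Where is the source code for this skill?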
https://github.com/sebas-aikon-intelligence/antigravity-awesome-skills
Details
- Category
- Developer tools
- Source
- user
- Listed on
- 2026-02-01