
voice-agents

hainamchung/agent-assistant

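Voice agents represent the frontier of AI interaction: humans speaking naturally with AI systems. The challenge is not just speech recognition and synthesis, but achieving natural conversational flow at under 800 ms of latency while handling interruptions, background noise, and emotional nuance. This skill covers two architectures: speech-to-speech (OpenAI Realtime API; lowest latency, most natural) and pipeline (STT→LLM→TTS; more control, easier to debug). Key insight: latency is the constraint. …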

2 installs · 0 popularity · @hainamchung

Install

$ npx skills add https://github.com/hainamchung/agent-assistant --skill voice-agents

SKILL.md

You are a voice AI architect who has shipped production voice agents handling millions of calls. You understand the physics of latency - every component adds milliseconds, and the sum determines whether conversations feel natural or awkward.
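The point that every component adds milliseconds and the sum decides how the conversation feels can be sketched as a simple budget check. This is a minimal illustration, not part of the skill itself: the stage names and per-stage numbers are hypothetical placeholders, and the 800 ms target is the figure quoted in the skill description.

```python
from dataclasses import dataclass

TARGET_MS = 800  # end-to-end target quoted in the skill description


@dataclass
class Stage:
    name: str
    latency_ms: float  # measured or estimated value; numbers below are placeholders


def total_latency(stages):
    """Sum per-stage latencies, assuming the stages run strictly in series."""
    return sum(s.latency_ms for s in stages)


def over_budget(stages, target_ms=TARGET_MS):
    """Return how many milliseconds the pipeline exceeds the target (0.0 if within)."""
    return max(0.0, total_latency(stages) - target_ms)


# Hypothetical figures for an STT -> LLM -> TTS pipeline, for illustration only.
pipeline = [
    Stage("stt", 200),
    Stage("llm_first_token", 350),
    Stage("tts_first_audio", 150),
    Stage("network", 80),
]

if __name__ == "__main__":
    print(f"total: {total_latency(pipeline)} ms, over by: {over_budget(pipeline)} ms")
```

In a real system the stages often overlap (for example, streaming TTS can start before the LLM finishes), so a serial sum like this is best read as an upper bound.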

Your core insight: Two architectures exist. Speech-to-speech (S2S) models like OpenAI Realtime API preserve emotion and achieve lowest latency but are less controllable. Pipeline architectures (STT→LLM→TTS) give you control at each step but add latency. Mos

| Issue | critical | # Measure and budget latency for each component: |
| Issue | high | # Target jitter metrics: |
| Issue | high | # Use semantic VAD: |
| Issue | high | # Implement barge-in detection: |
| Issue | medium | # Constrain response length in prompts: |
| Issue | medium | # Prompt for spoken format: |
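The excerpt lists jitter metrics as a target but does not say which ones. One common way to look at latency variability, offered here as an assumption rather than the skill's own method, is to compare a typical latency (p50) with a tail latency (p95) and report the spread. The sample values are made up for illustration.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def jitter_report(latencies_ms):
    """Summarize response-latency variability for a batch of conversation turns."""
    p50 = percentile(latencies_ms, 50)
    p95 = percentile(latencies_ms, 95)
    return {
        "p50_ms": p50,
        "p95_ms": p95,
        "spread_ms": p95 - p50,  # gap between typical and tail latency
        "stdev_ms": statistics.pstdev(latencies_ms),
    }


# Hypothetical per-turn latencies in milliseconds.
samples = [620, 640, 650, 660, 700, 710, 720, 750, 900, 1400]

if __name__ == "__main__":
    print(jitter_report(samples))
```

A large spread with an acceptable median suggests that most turns feel fine while a minority stall, which an average alone would hide.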


Citable information

Stable fields and commands prepared for search and AI citation.

Install command
npx skills add https://github.com/hainamchung/agent-assistant --skill voice-agents
Category
Developer tools
Certification
Indexed
2026-02-01
Updated
2026-02-18

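Quick answers

What is voice-agents?

Voice agents represent the frontier of AI interaction: humans speaking naturally with AI systems. The challenge is not just speech recognition and synthesis, but achieving natural conversational flow at under 800 ms of latency while handling interruptions, background noise, and emotional nuance. This skill covers two architectures: speech-to-speech (OpenAI Realtime API; lowest latency, most natural) and pipeline (STT→LLM→TTS; more control, easier to debug). Key insight: latency is the constraint. … Source: hainamchung/agent-assistant.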

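How do I install voice-agents?

Open your terminal or command-line tool (such as Terminal, iTerm, or Windows Terminal), then copy and run the following command: npx skills add https://github.com/hainamchung/agent-assistant --skill voice-agents. Once installation completes, the skill is automatically configured in your AI coding environment and can be used in Claude Code or Cursor.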

Where is the source code for this skill?

https://github.com/hainamchung/agent-assistant