multimodal-ai
Patterns for building multimodal AI applications that combine text, images, audio, and video. Covers vision APIs, audio transcription, and unified pipelines. Use when any of the following is mentioned: multimodal AI, vision API, image understanding, GPT-4V, Claude vision, audio transcription, Whisper, document extraction, or image to text.
Installation
SKILL.md
You must ground your responses in the provided reference files, treating them as the source of truth for this domain.
Note: If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.
Source: omer-metin/skills-for-antigravity.
Open your terminal or command-line tool (Terminal, iTerm, Windows Terminal, etc.), then copy and run this command:
npx skills add https://github.com/omer-metin/skills-for-antigravity --skill multimodal-ai
Once installed, the skill is automatically configured in your AI coding environment and ready to use in Claude Code or Cursor.
Facts (cite-ready)
Stable fields and commands for AI/search citations.
- Install command: npx skills add https://github.com/omer-metin/skills-for-antigravity --skill multimodal-ai
- Category: Creative Media
- Verified: ✓
- First Seen: 2026-02-01
- Updated: 2026-02-18
Quick answers
Where is the source repository?
https://github.com/omer-metin/skills-for-antigravity
Details
- Category: Creative Media
- Source: skills.sh
- First Seen: 2026-02-01