·multimodal-ai

Patterns for building multimodal AI applications that combine text, images, audio, and video. Covers vision APIs, audio transcription, and unified pipelines. Use when "multimodal AI, vision API, image understanding, GPT-4V, Claude vision, audio transcription, Whisper, document extraction, image to text, " mentioned.

9Installs·0Trend·@omer-metin

Installation

$npx skills add https://github.com/omer-metin/skills-for-antigravity --skill multimodal-ai

SKILL.md

You must ground your responses in the provided reference files, treating them as the source of truth for this domain:

Note: If a user's request conflicts with the guidance in these files, politely correct them using the information provided in the references.

Patterns for building multimodal AI applications that combine text, images, audio, and video. Covers vision APIs, audio transcription, and unified pipelines. Use when "multimodal AI, vision API, image understanding, GPT-4V, Claude vision, audio transcription, Whisper, document extraction, image to text, " mentioned. Source: omer-metin/skills-for-antigravity.

Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.) Copy and run this command: npx skills add https://github.com/omer-metin/skills-for-antigravity --skill multimodal-ai Once installed, the skill will be automatically configured in your AI coding environment and ready to use in Claude Code or Cursor

View raw

Facts (cite-ready)

Stable fields and commands for AI/search citations.

Install command
npx skills add https://github.com/omer-metin/skills-for-antigravity --skill multimodal-ai
Category
*Creative Media
Verified
First Seen
2026-02-01
Updated
2026-02-18

Quick answers

What is multimodal-ai?

Patterns for building multimodal AI applications that combine text, images, audio, and video. Covers vision APIs, audio transcription, and unified pipelines. Use when "multimodal AI, vision API, image understanding, GPT-4V, Claude vision, audio transcription, Whisper, document extraction, image to text, " mentioned. Source: omer-metin/skills-for-antigravity.

How do I install multimodal-ai?

Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.) Copy and run this command: npx skills add https://github.com/omer-metin/skills-for-antigravity --skill multimodal-ai Once installed, the skill will be automatically configured in your AI coding environment and ready to use in Claude Code or Cursor

Where is the source repository?

https://github.com/omer-metin/skills-for-antigravity