optimizing-attention-flash

Name: optimizing-attention-flash
Author: ovachiever


ovachiever/droid-tings

Optimizes transformer attention with Flash Attention for a 2-4x speedup and a 10-20x memory reduction. Use it when training or running transformers on long sequences (> 512 tokens), when you hit GPU memory limits, or when you need faster inference. Supports PyTorch native SDPA, the flash-attn library, H100 FP8, and sliding window attention.
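The memory reduction mentioned above comes from never materializing the full attention score matrix. The following is a minimal pure-Python sketch of that idea (blockwise attention with a running softmax), not the skill's code and not the flash-attn or PyTorch API. The function names `naive_attention` and `tiled_attention` and the `block_size` parameter are assumptions made for illustration.

```python
import math


def naive_attention(q, k, v):
    """Reference version: computes every score for a query before the softmax."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * vj[d] for w, vj in zip(weights, v)) / total
            for d in range(len(v[0]))
        ])
    return out


def tiled_attention(q, k, v, block_size=2):
    """Illustrative blockwise version (hypothetical helper, not a library call).

    Keys and values are visited one block at a time. Only a running maximum,
    a running softmax denominator, and a running weighted sum are kept per query.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    dim_v = len(v[0])
    out = []
    for qi in q:
        running_max = float("-inf")
        running_sum = 0.0
        acc = [0.0] * dim_v
        for start in range(0, len(k), block_size):
            k_blk = k[start:start + block_size]
            v_blk = v[start:start + block_size]
            scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k_blk]
            new_max = max(running_max, max(scores))
            # Rescale what was accumulated under the old maximum.
            correction = math.exp(running_max - new_max)
            weights = [math.exp(s - new_max) for s in scores]
            running_sum = running_sum * correction + sum(weights)
            acc = [
                a * correction + sum(w * vj[d] for w, vj in zip(weights, v_blk))
                for d, a in enumerate(acc)
            ]
            running_max = new_max
        out.append([a / running_sum for a in acc])
    return out
```

Both functions should agree up to floating-point rounding for any `block_size`. Real implementations perform the same rescaling on GPU tiles held in on-chip memory, which is where the speedup comes from.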


27 installs · 0 trending · @ovachiever

Installation

$ npx skills add https://github.com/ovachiever/droid-tings --skill optimizing-attention-flash

Details

Category: Development
Source: skills.sh
First seen: 2026-03-03
