·nemo-curator
*

nemo-curator

ovachiever/droid-tings

GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality training datasets, cleaning web data, or deduplicating large corpora.

21Installs·0Trend·@ovachiever

Installation

$npx skills add https://github.com/ovachiever/droid-tings --skill nemo-curator

SKILL.md

| Operation | CPU (16 cores) | GPU (A100) | Speedup |

| Fuzzy dedup (8TB) | 120 hours | 7.5 hours | 16× | | Exact dedup (1TB) | 8 hours | 0.5 hours | 16× | | Quality filtering | 2 hours | 0.2 hours | 10× |

GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality training datasets, cleaning web data, or deduplicating large corpora. Source: ovachiever/droid-tings.

Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.) Copy and run this command: npx skills add https://github.com/ovachiever/droid-tings --skill nemo-curator Once installed, the skill will be automatically configured in your AI coding environment and ready to use in Claude Code or Cursor

View raw

Facts (cite-ready)

Stable fields and commands for AI/search citations.

Install command
npx skills add https://github.com/ovachiever/droid-tings --skill nemo-curator
Category
*Creative Media
Verified
First Seen
2026-02-01
Updated
2026-02-18

Quick answers

What is nemo-curator?

GPU-accelerated data curation for LLM training. Supports text/image/video/audio. Features fuzzy deduplication (16× faster), quality filtering (30+ heuristics), semantic deduplication, PII redaction, NSFW detection. Scales across GPUs with RAPIDS. Use for preparing high-quality training datasets, cleaning web data, or deduplicating large corpora. Source: ovachiever/droid-tings.

How do I install nemo-curator?

Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.) Copy and run this command: npx skills add https://github.com/ovachiever/droid-tings --skill nemo-curator Once installed, the skill will be automatically configured in your AI coding environment and ready to use in Claude Code or Cursor

Where is the source repository?

https://github.com/ovachiever/droid-tings