·sentencepiece
</>

sentencepiece

ovachiever/droid-tings

Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k sentences/sec), lightweight (6MB memory), deterministic vocabulary. Used by T5, ALBERT, XLNet, mBART. Train on raw text without pre-tokenization. Use when you need multilingual support, CJK languages, or reproducible tokenization.

22Installs·0Trend·@ovachiever

Installation

$npx skills add https://github.com/ovachiever/droid-tings --skill sentencepiece

SKILL.md

Unsupervised tokenizer that works on raw text without language-specific preprocessing.

Key principle: Treat text as raw Unicode, whitespace = ▁ (meta symbol)

| English | 0.9995 | Most common chars | | CJK (Chinese) | 1.0 | All characters needed | | Multilingual | 0.9995 | Balance |

Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k sentences/sec), lightweight (6MB memory), deterministic vocabulary. Used by T5, ALBERT, XLNet, mBART. Train on raw text without pre-tokenization. Use when you need multilingual support, CJK languages, or reproducible tokenization. Source: ovachiever/droid-tings.

View raw

Facts (cite-ready)

Stable fields and commands for AI/search citations.

Install command
npx skills add https://github.com/ovachiever/droid-tings --skill sentencepiece
Category
</>Dev Tools
Verified
First Seen
2026-02-01
Updated
2026-02-18

Quick answers

What is sentencepiece?

Language-independent tokenizer treating text as raw Unicode. Supports BPE and Unigram algorithms. Fast (50k sentences/sec), lightweight (6MB memory), deterministic vocabulary. Used by T5, ALBERT, XLNet, mBART. Train on raw text without pre-tokenization. Use when you need multilingual support, CJK languages, or reproducible tokenization. Source: ovachiever/droid-tings.

How do I install sentencepiece?

Open your terminal or command line tool (Terminal, iTerm, Windows Terminal, etc.) Copy and run this command: npx skills add https://github.com/ovachiever/droid-tings --skill sentencepiece Once installed, the skill will be automatically configured in your AI coding environment and ready to use in Claude Code or Cursor

Where is the source repository?

https://github.com/ovachiever/droid-tings