Expert guidance for optimizing machine learning models for Apple's CoreML framework on iOS and macOS devices.
Many "slow" models are accidentally CPU-bound: they run on the CPU even though the GPU or Neural Engine is available. Choose where a model runs via `MLModelConfiguration.computeUnits`:

| Option | Behavior |
| --- | --- |
| `.all` | Uses all available compute units including Neural Engine (default, recommended) |
| `.cpuAndNeuralEngine` | CPU + Neural Engine, excludes GPU |
| `.cpuAndGPU` | CPU + GPU, excludes Neural Engine |
| `.cpuOnly` | Forces CPU-only execution (for debugging/consistency) |

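
As a minimal sketch of how the options above might be applied in Swift: the helper name `loadModel(at:computeUnits:)` and the model URL are hypothetical and used only for illustration, while `MLModelConfiguration`, `MLComputeUnits`, and `MLModel(contentsOf:configuration:)` are the Core ML calls assumed here. Note that `.cpuAndNeuralEngine` is only available on iOS 16 / macOS 13 and later.

```swift
import CoreML

/// Hypothetical helper (not from the original text): loads a compiled
/// Core ML model (.mlmodelc) restricted to the requested compute units.
func loadModel(at url: URL, computeUnits: MLComputeUnits = .all) throws -> MLModel {
    let config = MLModelConfiguration()
    // .all lets Core ML choose among the CPU, GPU, and Neural Engine.
    // .cpuOnly forces CPU execution, which is useful as a debugging baseline.
    config.computeUnits = computeUnits
    return try MLModel(contentsOf: url, configuration: config)
}
```

One way to check whether a model is effectively CPU-bound is to load it twice, once with `.cpuOnly` and once with `.all`, and compare prediction latency on a device. If the two timings are close, the model is probably not being dispatched to the GPU or Neural Engine.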
Optimize CoreML models for iOS and macOS deployment. Covers quantization, palettization, pruning, Neural Engine targeting, compute unit selection, and performance profiling. Use when converting ML models to CoreML, optimizing model size/latency, debugging Neural Engine issues, or benchmarking on-device inference. Source: ckorhonen/claude-skills.