What Your GPU Actually Does (and Doesn't Do) in Traditional Editors
In Premiere Pro or DaVinci Resolve, your GPU handles:
- Playback acceleration: Decoding and displaying video on the timeline
- Effects rendering: Lumetri color, transitions, and GPU-accelerated effects
- Export encoding: NVENC hardware encoding (if enabled) during final render
What your GPU does not do in traditional editors:
- Understand your footage: No AI inference — your GPU doesn't analyze what's in your clips
- Make editing decisions: Clip selection, ordering, and timing are 100% manual
- Search your library: Finding specific clips requires manual scrubbing or filename guessing
In other words, traditional editing leaves most of your GPU's capability untapped. The CUDA cores that could run transformer models sit idle while you manually scrub through footage.
What Cloud AI Services Are Selling You
When you use RunwayML, Descript, or CapCut's AI features, here's what's actually happening:
- Your footage is uploaded to their servers
- Their GPUs (often the same consumer CUDA silicon you already own) run AI inference on your footage
- They charge you $0.05–$0.50 per minute of processed video
- Results are sent back over the network
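That per-minute pricing compounds quickly. Here is a rough cost sketch in Python; the monthly footage volume and the mid-range rate are illustrative assumptions, not published figures:

```python
def cloud_cost(minutes_processed: float, rate_per_min: float) -> float:
    """Recurring cost of cloud AI processing at a per-minute rate."""
    return minutes_processed * rate_per_min

# Assumed workload: 200 minutes of footage per month at a mid-range $0.25/min.
monthly = cloud_cost(200, 0.25)
print(monthly)       # 50.0 per month
print(monthly * 12)  # 600.0 per year, recurring
```

At that assumed volume, a single year of cloud processing costs several times a one-time local license.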
You're paying a monthly fee for remote access to the same CUDA architecture sitting under your desk. The only difference is the software layer.
The Shortcut: Full GPU Utilization with Onset Engine
Onset Engine is built ground-up for local GPU execution. Here's what each component of your GPU actually does:
- CUDA cores → AI inference: OpenCLIP ViT-L/14 runs on your CUDA cores via PyTorch. Every clip gets a 768-dimensional semantic embedding. Your GPU understands your footage
- NVENC encoder → hardware rendering: Final video output is encoded by NVIDIA's dedicated hardware encoder — not the CPU. A 3-minute 4K video renders in ~90 seconds
- VRAM → model hosting: The CLIP model loads once into VRAM (~2GB) and stays resident. Inference runs at GPU memory bandwidth, not disk speed
- Tensor cores → mixed precision: FP16 inference on RTX Tensor Cores roughly doubles throughput on supported cards
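The payoff of those per-clip embeddings is nearest-neighbor search: a text query is embedded into the same 768-dimensional space, and the clip whose embedding sits closest wins. A minimal sketch in plain Python; the filenames and random vectors below are stand-ins (real embeddings come from OpenCLIP ViT-L/14 on the GPU), but the matching logic is the same:

```python
import math
import random

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Stand-ins for CLIP embeddings: in the real pipeline these 768-d vectors
# are produced once at ingest and cached alongside each clip.
random.seed(0)
library = {
    name: [random.gauss(0, 1) for _ in range(768)]
    for name in ("sunset.mp4", "interview.mp4")
}

# Pretend the embedded text query lands near one clip's embedding.
query = library["sunset.mp4"][:]
best = max(library, key=lambda name: cosine(query, library[name]))
print(best)  # sunset.mp4
```

Because similarity is computed against cached vectors, search after ingest is effectively instant; the GPU is only needed once per clip.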
GPU utilization during an Onset Engine ingest: 60–85% sustained, which is what that hardware was designed for. Cloud AI charges up to $0.50/min for the same work. Onset Engine does it for $0/min after a one-time $119 purchase.
Hardware Requirements
- Minimum: NVIDIA GTX 1060 6GB (compute capability 6.1) — functional but slow
- Recommended: NVIDIA RTX 3060 12GB — fast inference, hardware NVENC
- Optimal: NVIDIA RTX 4070+ 12GB — 4th-gen Tensor Cores, newest NVENC generation
More VRAM means larger inference batches stay resident; more CUDA cores mean faster batch processing. But even a mid-range card runs the full pipeline — the same pipeline that cloud services charge monthly for.
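The VRAM side of these tiers reduces to headroom arithmetic: the resident model plus inference buffers must fit on the card. A sketch using the ~2 GB resident-model figure from above; the working-buffer allowance is an assumption for illustration:

```python
def fits_in_vram(card_vram_gb: float, model_gb: float = 2.0,
                 working_gb: float = 2.0) -> bool:
    """True if the card can hold the resident model plus inference buffers."""
    return card_vram_gb >= model_gb + working_gb

# The three hardware tiers above:
for card, vram in [("GTX 1060", 6), ("RTX 3060", 12), ("RTX 4070", 12)]:
    print(card, fits_in_vram(vram))  # all three fit the ~2 GB model
```

All three tiers clear the bar, which is why the differentiator between them is throughput (cores, Tensor Core generation, NVENC generation) rather than whether the pipeline runs at all.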