Your RTX Is at 2%. Let's Fix That.
Your RTX card is sitting at 2% utilization right now. That GPU can run the same transformer architectures that billion-dollar AI services bill for per image. Onset Engine turns your local CUDA cores and NVENC encoder into a full video intelligence pipeline: zero cloud overhead, zero upload wait, zero recurring fees.
The Cloud Tax
Cloud AI video tools rent you compute time on the same hardware you already own. RunwayML charges $0.50/minute of generated video. Descript renders on shared GPU instances with unpredictable queue times. CapCut uploads your footage to remote servers for processing.
Meanwhile, your RTX 3070 sits nearly idle, perfectly capable of running the same inference workloads, if only the software were designed to use it. You're paying a cloud company for slower results on hardware no better than your own.
The Full Hardware Stack
Onset Engine is designed from the ground up for local hardware. Every stage of the pipeline, from ingest to render, runs on your own GPU and CPU.
- ✓ CUDA — AI Inference: OpenCLIP ViT-L/14 runs on CUDA. 768-dim embeddings computed at high speed for every clip
- ✓ NVENC — Hardware Encoding: Final renders use NVIDIA's hardware encoder. p6/CQ17 for maximum quality, p4/CQ21 for fast turnaround
- ✓ Decord 224px: Video frames are decoded directly to 224×224 during ingest, so there is no separate 4K→224px resize bottleneck
- ✓ Multithreaded Rendering: MoviePy writes with threads=max(4, cpu_count-2), a 3–4× speedup over single-threaded rendering
- ✓ SQLite Vector Search: CLIP embeddings stored locally. Cosine similarity search over 100k clips in milliseconds
- ✓ MPV Hardware Decode: DJ Mode uses MPV's hardware-accelerated video decode with keyframe seeking for instant transitions
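To make the SQLite vector search item concrete, here is a minimal sketch of storing embeddings as BLOBs and ranking them by cosine similarity. The table name, column names, and helper functions are hypothetical, not Onset Engine's actual schema. It uses only the Python standard library so it runs anywhere. A pure-Python loop like this will not reach millisecond latency at 100k clips, so a real implementation would presumably use a vectorized library.

```python
import math
import sqlite3
from array import array


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def open_index(path=":memory:"):
    """Open (or create) a local SQLite file holding one embedding per clip."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS clips ("
        "path TEXT PRIMARY KEY, embedding BLOB NOT NULL)"
    )
    return conn


def add_clip(conn, path, embedding):
    """Store a unit-length float32 copy of the embedding as a BLOB."""
    blob = array("f", normalize(embedding)).tobytes()
    conn.execute(
        "INSERT OR REPLACE INTO clips (path, embedding) VALUES (?, ?)",
        (path, blob),
    )


def search(conn, query, k=5):
    """Return the k clips most similar to the query as (path, score) pairs."""
    q = normalize(query)
    scored = []
    for path, blob in conn.execute("SELECT path, embedding FROM clips"):
        vec = array("f")
        vec.frombytes(blob)
        scored.append((sum(a * b for a, b in zip(q, vec)), path))
    scored.sort(reverse=True)
    return [(path, score) for score, path in scored[:k]]
```

Normalizing once at insert time means each query needs only a dot product per stored clip, which is the usual trick for keeping cosine search cheap.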
Performance Benchmarks
Ingest 1 Hour of 4K
CLIP analysis at 224px native decode via Decord. Embedding-based scene detection eliminates the second full-resolution read.
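One plausible reading of "embedding-based scene detection" is to reuse the per-frame CLIP embeddings and mark a cut wherever two consecutive frames stop looking alike. The sketch below shows that idea only. The function names and the 0.85 threshold are assumptions for illustration, not values taken from Onset Engine.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def scene_boundaries(embeddings, threshold=0.85):
    """Return indices i where frame i starts a new scene.

    A boundary is declared when the similarity between frame i-1 and
    frame i falls below the (assumed) threshold.
    """
    return [
        i
        for i in range(1, len(embeddings))
        if cosine(embeddings[i - 1], embeddings[i]) < threshold
    ]
```

Because the embeddings already exist from the CLIP pass, this kind of detection needs no second read of the source video.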
Final Render (1080p)
Full VFX pipeline: zoom pulses, transitions, color grading, letterboxing. NVENC p6 hardware encoding. Chunked for memory safety.
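As a rough illustration of how the two encoder profiles and the thread formula could be expressed, the sketch below builds keyword arguments shaped for MoviePy's `write_videofile` (`codec`, `preset`, `threads`, `ffmpeg_params`). The choice of `h264_nvenc`, the `-rc vbr` rate-control mode, and the profile names "quality" and "fast" are assumptions. Only the p6/CQ17 and p4/CQ21 pairs and the thread formula come from the text above.

```python
import os

# Preset/CQ pairs as described above; the dictionary keys are hypothetical names.
NVENC_PROFILES = {
    "quality": ("p6", 17),
    "fast": ("p4", 21),
}


def render_threads(cpu_count=None):
    """threads = max(4, cpu_count - 2), leaving two cores for the rest of the system."""
    n = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(4, n - 2)


def write_kwargs(profile="quality", cpu_count=None):
    """Build keyword arguments intended for MoviePy's write_videofile."""
    preset, cq = NVENC_PROFILES[profile]
    return {
        "codec": "h264_nvenc",  # assumption: H.264 rather than HEVC
        "preset": preset,
        "threads": render_threads(cpu_count),
        # Constant-quality VBR: -cq sets the target quality, -b:v 0 lifts the bitrate cap.
        "ffmpeg_params": ["-rc", "vbr", "-cq", str(cq), "-b:v", "0"],
    }
```

With a MoviePy clip in hand, usage would look like `clip.write_videofile("out.mp4", **write_kwargs("fast"))`, assuming an FFmpeg build with NVENC support.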
Upload Wait Time
Nothing leaves your machine. No upload progress bars. No cloud queue. No "estimated time remaining: 47 minutes."
Memory-Safe by Design
Rendering a 30-minute video with full VFX can exhaust memory in tools that hold every clip in RAM at once. Onset Engine uses an EDL-based two-phase architecture:
Phase 1 (fast): The selection loop builds an edit decision list (EDL) of clip paths, start/end times, and energy assignments. No video is loaded into memory. Takes ~30 seconds.
Phase 2 (chunked): The renderer loads clips in chunks of 10–20, applies VFX, encodes with NVENC, and flushes RAM between batches. The final stitch is a single FFmpeg concat. No OOM crashes. No 32GB RAM requirements.
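The two phases can be sketched as a plain data structure plus a chunking loop. This is a minimal outline under stated assumptions: the `EdlEntry` fields mirror the description above, but the class name, the helper names, the chunk size of 16, and the chunk file names are hypothetical. The actual rendering step (VFX and NVENC encoding) is left out.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class EdlEntry:
    """One line of the edit decision list: metadata only, no pixels."""
    clip_path: str
    start: float  # seconds
    end: float    # seconds
    energy: str   # e.g. "low" / "high"; the labels are an assumption


def chunked(edl: List[EdlEntry], size: int = 16) -> Iterator[List[EdlEntry]]:
    """Yield the EDL in batches so only `size` clips are in RAM at a time."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for i in range(0, len(edl), size):
        yield edl[i:i + size]


def concat_list(chunk_paths: List[str]) -> str:
    """Build the text of an FFmpeg concat-demuxer list file."""
    lines = []
    for path in chunk_paths:
        escaped = path.replace("'", "'\\''")  # concat-list quoting for single quotes
        lines.append(f"file '{escaped}'\n")
    return "".join(lines)


def concat_command(list_file: str, output: str) -> List[str]:
    """FFmpeg command that stitches rendered chunks without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_file, "-c", "copy", output]
```

Each batch from `chunked` would be rendered to its own file and released before the next batch loads. The final stitch uses stream copy, so it adds no second encode.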
Ready to Try It?
Download the free demo and see the results on your own footage. One-time purchase, no subscriptions.