Use Case

Your RTX Is at 2%. Let's Fix That.

Your RTX card is sitting at 2% utilization right now. That GPU runs the same transformer architectures that billion-dollar AI services charge per-image for. Onset Engine turns your local CUDA cores and NVENC encoder into a full video intelligence pipeline — zero cloud overhead, zero upload wait, zero recurring fees.

The Cloud Tax

Cloud AI video tools rent you compute time on the same hardware you already own. RunwayML charges $0.50/minute of generated video. Descript renders on shared GPU instances with unpredictable queue times. CapCut uploads your footage to remote servers for processing.

Meanwhile, your RTX 3070 sits at 5% utilization, perfectly capable of running the same inference workloads, if only the software were designed to use it. You're paying a cloud company to do the same work on hardware you already own, only slower.

Studio Mode UI illustrating a comparison chart of GPU utilization between cloud-based rendering and efficient local processing

The Full Hardware Stack

Onset Engine is designed from the ground up for local hardware. Every stage of the pipeline, from ingest to render, runs on your own GPU and CPU.

  • CUDA — AI Inference: OpenCLIP ViT-L/14 runs on CUDA. 768-dim embeddings computed at high speed for every clip
  • NVENC — Hardware Encoding: Final renders use NVIDIA's hardware encoder. p6/CQ17 for maximum quality, p4/CQ21 for fast turnaround
  • Decord 224px: Video frames are decoded directly at 224×224 during ingest, so there is no separate 4K→224px resize bottleneck
  • Multithreaded Rendering: MoviePy write with threads=max(4, cpu_count-2) — 3–4× speedup over single-threaded
  • SQLite Vector Search: CLIP embeddings stored locally. Cosine similarity search over 100k clips in milliseconds
  • MPV Hardware Decode: DJ Mode uses MPV's hardware-accelerated video decode with keyframe seeking for instant transitions

The hardware acceleration pipeline utilizing CUDA processing, NVENC encoding, and fast NVMe storage
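
To make the local vector search idea concrete, here is a minimal sketch of cosine-similarity search over embeddings stored in SQLite. It is not Onset Engine's actual schema or code: the table name, column names, and helper functions are assumptions made for illustration, and 3-dimensional toy vectors stand in for the 768-dim CLIP embeddings.

```python
import math
import sqlite3
from array import array


def to_blob(vec):
    """L2-normalise a vector and pack it as a float32 BLOB."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return array("f", (x / norm for x in vec)).tobytes()


def from_blob(blob):
    """Unpack a float32 BLOB back into an array of floats."""
    out = array("f")
    out.frombytes(blob)
    return out


def create_index(conn):
    # Hypothetical schema: one row per clip, embedding stored as raw bytes.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS clips ("
        "path TEXT PRIMARY KEY, embedding BLOB NOT NULL)"
    )


def add_clip(conn, path, embedding):
    conn.execute(
        "INSERT OR REPLACE INTO clips VALUES (?, ?)", (path, to_blob(embedding))
    )


def search(conn, query, top_k=5):
    """Return the top_k (path, cosine similarity) pairs, best match first."""
    q = from_blob(to_blob(query))  # normalise the query the same way
    scored = []
    for path, blob in conn.execute("SELECT path, embedding FROM clips"):
        vec = from_blob(blob)
        # Both vectors are unit length, so the dot product is the cosine.
        scored.append((path, sum(a * b for a, b in zip(q, vec))))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


conn = sqlite3.connect(":memory:")
create_index(conn)
add_clip(conn, "drums.mp4", [1.0, 0.0, 0.0])
add_clip(conn, "crowd.mp4", [0.0, 1.0, 0.0])
add_clip(conn, "lights.mp4", [0.7, 0.7, 0.0])
results = search(conn, [1.0, 0.1, 0.0], top_k=2)
print([path for path, _ in results])
```

The pure-Python scoring loop is written for readability only. Reaching millisecond latency over 100k clips would need vectorised math over the unpacked embeddings, which this sketch leaves out.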

Performance Benchmarks

~3 min

Ingest 1 Hour of 4K

CLIP analysis at 224px native decode via Decord. Embedding-based scene detection eliminates the second full-resolution read.
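
One plausible way to do embedding-based scene detection is to compare each frame's embedding with the previous one and flag a cut when the similarity drops. The sketch below assumes that approach; the 0.85 threshold, the function names, and the 2-dimensional toy embeddings are illustrative, not values taken from Onset Engine.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def detect_cuts(frame_embeddings, threshold=0.85):
    """Return indices i where frame i starts a new scene.

    A cut is flagged when similarity to the previous frame falls below
    `threshold` (an assumed value; a real pipeline would tune it).
    """
    return [
        i
        for i in range(1, len(frame_embeddings))
        if cosine(frame_embeddings[i - 1], frame_embeddings[i]) < threshold
    ]


frames = [
    [1.0, 0.0], [0.99, 0.05], [0.98, 0.1],  # scene A: nearly identical frames
    [0.1, 1.0], [0.05, 0.99],               # scene B: a very different direction
]
cuts = detect_cuts(frames)
print(cuts)  # → [3]
```

Because the embeddings are already computed for CLIP analysis, a pass like this only touches small vectors, which is why no second full-resolution read of the footage is needed.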

~5 min

Final Render (1080p)

Full VFX pipeline: zoom pulses, transitions, color grading, letterboxing. NVENC p6 hardware encoding. Chunked for memory safety.
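
The render settings quoted on this page (p6/CQ17, p4/CQ21, and threads=max(4, cpu_count-2)) can be sketched as a small helper. The profile names, function names, the choice of the H.264 NVENC encoder, and the VBR rate-control mode are assumptions for illustration; only the preset, CQ, and thread formula come from the text above.

```python
import os

# Preset/CQ pairs quoted in the text; the dictionary keys are hypothetical names.
NVENC_PROFILES = {
    "quality": {"preset": "p6", "cq": 17},
    "fast": {"preset": "p4", "cq": 21},
}


def render_threads(cpu_count=None):
    """threads = max(4, cpu_count - 2): never fewer than four threads."""
    if cpu_count is None:
        cpu_count = os.cpu_count() or 4
    return max(4, cpu_count - 2)


def nvenc_args(profile="quality"):
    """Build FFmpeg output arguments for constant-quality NVENC encoding.

    Assumes H.264 (h264_nvenc) with VBR rate control and no bitrate cap,
    so the CQ value drives quality.
    """
    p = NVENC_PROFILES[profile]
    return [
        "-c:v", "h264_nvenc",
        "-preset", p["preset"],
        "-rc", "vbr",
        "-cq", str(p["cq"]),
        "-b:v", "0",
    ]


print(render_threads(16))   # → 14
print(nvenc_args("fast"))
```

On a 16-core machine this gives 14 threads, while a quad-core machine stays at the floor of 4. The argument list could be spliced into an FFmpeg command line on a system whose FFmpeg build includes NVENC support.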

0 ms

Upload Wait Time

Nothing leaves your machine. No upload progress bars. No cloud queue. No "estimated time remaining: 47 minutes."

Memory-Safe by Design

Rendering a 30-minute video with full VFX can exhaust memory in tools that hold every clip in RAM at once. Onset Engine uses an EDL-based two-phase architecture:

Phase 1 (fast): The selection loop builds an edit decision list (EDL) — clip paths, start/end times, energy assignments. No video memory used. Takes ~30 seconds.

Phase 2 (chunked): The renderer loads clips in chunks of 10–20, applies VFX, encodes with NVENC, and flushes RAM between batches. The final stitch is a single FFmpeg concat. No OOM crashes. No 32GB RAM requirements.
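
The two phases can be sketched roughly as follows. This is an illustration under stated assumptions, not Onset Engine's implementation: the `EdlEntry` fields mirror the description above, while `render_chunk`, the part-file naming, and the chunk size of 10 are hypothetical. The real work of loading clips, applying VFX, and encoding is left to a caller-supplied function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdlEntry:
    """One planned cut: metadata only, no frames are loaded."""
    clip_path: str
    start: float  # seconds
    end: float    # seconds
    energy: str   # e.g. "low" or "high"


def chunked(entries, size=10):
    """Yield successive slices of at most `size` EDL entries."""
    for i in range(0, len(entries), size):
        yield entries[i:i + size]


def concat_list(part_paths):
    """Text for FFmpeg's concat demuxer: one `file '...'` line per part."""
    return "".join(f"file '{p}'\n" for p in part_paths)


def render(edl, render_chunk, chunk_size=10):
    """Phase 2: render each chunk to its own part file, then list the parts.

    `render_chunk(entries, out_path)` is a hypothetical callback that loads
    the clips, applies VFX, encodes, and releases memory before returning.
    """
    parts = []
    for n, chunk in enumerate(chunked(edl, chunk_size)):
        out_path = f"part_{n:04d}.mp4"
        render_chunk(chunk, out_path)
        parts.append(out_path)
    return concat_list(parts)


# Phase 1: build the EDL (here, 25 dummy two-second cuts).
edl = [
    EdlEntry(f"clip_{i}.mp4", 0.0, 2.0, "high" if i % 2 else "low")
    for i in range(25)
]

# Stand-in renderer that only records how many clips each chunk held.
sizes = []
listing = render(edl, lambda entries, out_path: sizes.append(len(entries)))
print(sizes)  # → [10, 10, 5]
print(listing)
```

Only one chunk's clips are ever in memory at a time, so peak RAM depends on the chunk size rather than the length of the final video. The returned listing is in the format FFmpeg's concat demuxer reads, for example with `ffmpeg -f concat -safe 0 -i parts.txt -c copy output.mp4`.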

A progress indicator showing a successful chunked rendering process operating smoothly on local hardware without crashes

Ready to Try It?

Download the free demo and see the results on your own footage. One-time purchase, no subscriptions.
