We introduce ASTRAEA, an automatic framework that searches for near-optimal configurations for vDiT-based video generation. At its core, ASTRAEA proposes a lightweight token selection mechanism and a memory-efficient, GPU-parallel sparse attention strategy, enabling linear reductions in execution time with minimal impact on generation quality. To determine the token reduction for each timestep, we further design a search framework that uses a classic evolutionary algorithm to automatically distribute the token budget across timesteps. Together, ASTRAEA achieves up to 2.4x inference speedup on a single GPU and scales well to multiple GPUs (up to 13.2x speedup on 8 GPUs), while retaining better video quality than state-of-the-art methods (<0.5% loss on the VBench score compared to the baseline vDiT models).
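As a rough illustration of how an evolutionary algorithm can distribute a fixed token budget across timesteps, the sketch below evolves a list of per-timestep token counts whose sum stays constant. It is a generic toy, not ASTRAEA's actual search: the function names (`mutate`, `evolve`, `toy_fitness`), the mutation rule, the bounds, and the fitness proxy are all assumptions made for this example. A real search would presumably score candidates by the quality of the generated video.

```python
import math
import random


def mutate(budget, rng, step, lo, hi):
    """Move up to `step` tokens from one timestep to another (total unchanged)."""
    child = list(budget)
    src, dst = rng.sample(range(len(child)), 2)
    # Respect the per-timestep bounds [lo, hi] on both ends of the transfer.
    amount = min(step, child[src] - lo, hi - child[dst])
    if amount > 0:
        child[src] -= amount
        child[dst] += amount
    return child


def evolve(fitness, num_steps, total, lo, hi,
           pop_size=16, generations=50, step=8, seed=0):
    """Toy elitist evolutionary search over per-timestep token budgets."""
    rng = random.Random(seed)
    base = total // num_steps
    start = [base] * num_steps
    start[0] += total - base * num_steps  # give any remainder to the first step
    population = [start] + [mutate(start, rng, step, lo, hi)
                            for _ in range(pop_size - 1)]
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng, step, lo, hi)
                    for _ in range(pop_size)]
        # Keep the best pop_size candidates among parents and children.
        population = sorted(population + children,
                            key=fitness, reverse=True)[:pop_size]
    return population[0]


def toy_fitness(budget):
    # Hypothetical quality proxy (an assumption for this sketch): diminishing
    # returns per timestep, with earlier timesteps weighted more heavily.
    return sum(math.sqrt(k) / (t + 1) for t, k in enumerate(budget))


best = evolve(toy_fitness, num_steps=10, total=1000, lo=20, hi=400)
print(best, sum(best))
```

Because each mutation only transfers tokens between two timesteps, every candidate meets the total budget by construction, so the search never needs a separate repair or penalty step.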
[Figure: sample generations for eight prompts. Each prompt is followed by the PSNR and speedup of its five panels, in the order shown.]

Prompt: a dog running happily
PSNR / Speedup: 29.21 / 1.36x, 22.45 / 1.86x, 22.35 / 2.29x, 18.21 / 1.37x, 19.77 / 1.60x

Prompt: A jellyfish floating through the ocean, with bioluminescent tentacles
PSNR / Speedup: 29.37 / 1.36x, 24.95 / 1.86x, 24.29 / 2.29x, 18.55 / 1.37x, 17.26 / 1.60x

Prompt: A robot DJ is playing the turntable, in heavy raining futuristic tokyo rooftop cyberpunk night, sci-fi, fantasy
PSNR / Speedup: 27.58 / 1.36x, 23.51 / 1.86x, 22.33 / 2.29x, 16.37 / 1.37x, 18.87 / 1.60x

Prompt: A raccoon dressed in suit playing the trumpet, stage background
PSNR / Speedup: 35.00 / 1.36x, 25.76 / 1.86x, 25.28 / 2.29x, 18.28 / 1.37x, 24.08 / 1.60x

Prompt: A couple in formal evening wear going home get caught in a heavy downpour with umbrellas by Hokusai, in the style of Ukiyo
PSNR / Speedup: 24.11 / 1.46x, 23.12 / 1.91x, 18.14 / 2.35x, 18.57 / 1.23x, 16.18 / 1.69x

Prompt: an elephant running to join a herd of its kind
PSNR / Speedup: 24.03 / 1.46x, 22.46 / 1.91x, 20.74 / 2.35x, 19.48 / 1.23x, 15.69 / 1.69x

Prompt: A happy fuzzy panda playing guitar nearby a campfire, snow mountain in the background
PSNR / Speedup: 22.14 / 1.46x, 20.74 / 1.91x, 18.89 / 2.35x, 18.64 / 1.23x, 12.37 / 1.69x

Prompt: botanical garden
PSNR / Speedup: 22.89 / 1.46x, 21.67 / 1.91x, 19.77 / 2.35x, 18.24 / 1.23x, 17.56 / 1.69x
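For readers unfamiliar with the PSNR values reported above, the following minimal sketch shows the standard definition, PSNR = 10 log10(MAX^2 / MSE), where MSE is the mean squared error between a reference and a test signal. The function name `psnr`, the flat-list input format, and the 8-bit peak value of 255 are assumptions for illustration. This section does not state which reference video the reported PSNR values were measured against.

```python
import math


def psnr(reference, test, max_value=255.0):
    """PSNR in dB between two equally sized, non-empty sequences of pixel values."""
    if not reference or len(reference) != len(test):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel positions.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# A uniform error of 10 gray levels on 8-bit pixels gives an MSE of 100.
value = psnr([100, 100, 100, 100], [110, 110, 110, 110])
print(round(value, 2))
```

Higher PSNR means the test output is numerically closer to the reference, and every 10 dB corresponds to a tenfold reduction in mean squared error.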