ECCV 2026

Event-Driven Video Generation

Event-Driven Video Generation (EVD) turns frame-first video generation into event-grounded state transitions, improving causal interactions without sacrificing appearance.

Chika Maduabuchi, Jindong Wang

Representative EVD generations across four interaction settings.

Core Idea

Events should decide when state is allowed to change.

Causal Initiation

Suppresses motion before contact or interaction evidence appears.

Event Realization

Concentrates updates where the interaction should actually happen.

Stable Postconditions

Reduces late drift after objects settle, stop, close, or land.

Contact Stability

Improves support, placement, and physically plausible interaction outcomes.

Curated Comparisons

EVD vs. strong video generation baselines.

Each row uses the same prompt across Sora, Movie Gen, DiT, and EVD. EVD is highlighted to make the method comparison easy to scan.

More Samples

More Event-Driven Samples.

Browse EVD generations grouped by the EVD-Bench taxonomy: state persistence, spatial accuracy, support relations, and contact stability. These samples focus on the final EVD output so the breadth of event-grounded behavior is easy to scan.

Method Overview

A lightweight event pathway gates video DiT updates.

EVD predicts token-aligned event activity, forms a stable event gate, and applies that gate to the denoising update so only event-supported regions are allowed to change state.
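The gating step described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `stable_event_gate` and `gated_update`, the causal moving average used to stabilize the gate, the sigmoid threshold, and all parameter values are assumptions chosen only to show the idea of letting event activity decide where a denoising update may change state.

```python
import numpy as np


def stable_event_gate(activity, momentum=0.8, threshold=0.5, sharpness=10.0):
    """Turn per-token event activity into a smooth gate in (0, 1).

    activity: array of shape (T, N), hypothetical event scores in [0, 1]
              for T frames and N tokens per frame.
    The causal moving average and sigmoid are illustrative assumptions,
    not the stabilization used by EVD.
    """
    activity = np.asarray(activity, dtype=float)
    smoothed = np.empty_like(activity)
    running = activity[0].copy()
    for t in range(activity.shape[0]):
        # Causal smoothing: frame t only sees frames 0..t.
        running = momentum * running + (1.0 - momentum) * activity[t]
        smoothed[t] = running
    # Soft threshold: near 0 without event evidence, near 1 with it.
    return 1.0 / (1.0 + np.exp(-sharpness * (smoothed - threshold)))


def gated_update(latents, update, gate):
    """Apply an update only where the gate allows state to change.

    latents, update: arrays of shape (T, N, D).
    gate: array of shape (T, N), broadcast over the feature axis D.
    """
    return latents + gate[..., None] * update


if __name__ == "__main__":
    T, N, D = 12, 2, 4
    activity = np.zeros((T, N))
    activity[3:, 0] = 1.0  # token 0 sees an event from frame 3 onward
    gate = stable_event_gate(activity)
    latents = np.zeros((T, N, D))
    update = np.ones((T, N, D))
    new_latents = gated_update(latents, update, gate)
    print(np.round(gate[:, 0], 3))  # gate for the token with an event
    print(np.round(gate[:, 1], 3))  # gate for the token without one
```

In this toy setup, the token with no event evidence receives almost none of the update, while the token with sustained evidence receives nearly all of it after the gate ramps up. The ramp itself is a consequence of the assumed smoothing and would differ under another stabilization scheme.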

Overview diagram of Event-Driven Video Generation.

Citation

BibTeX

@inproceedings{maduabuchi2026eventdriven,
  title     = {Event-Driven Video Generation},
  author    = {Maduabuchi, Chika and Wang, Jindong},
  booktitle = {European Conference on Computer Vision (ECCV)},
  year      = {2026}
}