Flow Matching vs DDPM: Why ODE Beats SDE in Diffusion Models
DDPM needs 1000 steps, Flow Matching needs 10. The mathematics of straight-line generation.
TL;DR
- DDPM: Remove noise gradually via stochastic process. Random perturbations at each step create curved paths
- Flow Matching: Move directly toward data via deterministic process. Straight paths enable fast generation
- Key Difference: DDPM predicts "noise", Flow Matching predicts "velocity field"
1. Problem Setup: The Path from Noise to Data
The goal of generative models is simple: transform pure noise into data.

$$\epsilon \sim \mathcal{N}(0, I) \quad \longrightarrow \quad x_0 \sim p_{\text{data}}$$

How do we achieve this transformation? Two paradigms emerge.
DDPM's Approach: "Slowly and Stochastically"
DDPM defines a Markov chain that gradually adds Gaussian noise:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{\alpha_t}\, x_{t-1},\ \beta_t I\right), \qquad q(x_t \mid x_0) = \mathcal{N}\left(x_t;\ \sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I\right)$$

where $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$ and $\alpha_t = 1 - \beta_t$.
As time progresses ($t \to T$), information about data $x_0$ vanishes, leaving only pure noise $\epsilon$.
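A quick numerical check (a minimal sketch; the linear $\beta$ schedule from $10^{-4}$ to $0.02$ is a common choice, assumed here) shows how $\bar{\alpha}_t$ collapses toward zero:

```python
import torch

# Sketch: with an assumed linear beta schedule, alpha_bar_t decays toward 0,
# so x_t carries almost no information about x_0 by t = T.
T = 1000
beta = torch.linspace(1e-4, 0.02, T)
alpha_bar = torch.cumprod(1.0 - beta, dim=0)
print(alpha_bar[0].item())   # ~0.9999: x_1 is essentially the data
print(alpha_bar[-1].item())  # ~4e-5:  x_T is essentially pure noise
```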
Flow Matching's Approach: "Straight and Deterministic"
Flow Matching uses simple linear interpolation between data and noise:

$$x_t = (1 - t)\, x_0 + t\, \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)$$

At $t=0$, we have $x_0$ (data). At $t=1$, we have $\epsilon$ (noise). Everything in between is a straight line: at $t = 0.5$, for instance, $x_{0.5} = 0.5\, x_0 + 0.5\, \epsilon$ sits exactly halfway between the two.
2. Different Training Objectives
DDPM: Noise Prediction
DDPM trains a neural network $\epsilon_\theta$ to predict the added noise:

$$\mathcal{L}_{\text{DDPM}} = \mathbb{E}_{x_0, \epsilon, t}\left[\left\| \epsilon - \epsilon_\theta(x_t, t) \right\|^2\right]$$
Why predict noise? To recover $x_{t-1}$ in the reverse process:

$$x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}}\, \epsilon_\theta(x_t, t)\right) + \sigma_t z$$

where $z \sim \mathcal{N}(0, I)$ is fresh randomness added at each step. This is what curves the path.
Flow Matching: Velocity Prediction
Flow Matching trains a neural network $v_\theta$ to predict the velocity field:

$$\mathcal{L}_{\text{FM}} = \mathbb{E}_{x_0, \epsilon, t}\left[\left\| v_\theta(x_t, t) - (\epsilon - x_0) \right\|^2\right]$$

The target velocity field is the time derivative of the conditional path:

$$v_t = \frac{d x_t}{dt} = \frac{d}{dt}\left[(1 - t)\, x_0 + t\, \epsilon\right] = \epsilon - x_0$$

This velocity is constant! Regardless of time $t$, we always move in the $\epsilon - x_0$ direction at constant speed.
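A two-line check (a minimal sketch; `x0` and `eps` are arbitrary sample tensors) makes the constancy concrete: finite differences of the path give the same vector at every $t$:

```python
import torch

# Sketch: finite differences of x_t = (1 - t) * x0 + t * eps are identical
# at every t, confirming the constant target velocity eps - x0.
x0, eps = torch.randn(4), torch.randn(4)
xt = lambda t: (1 - t) * x0 + t * eps
print((xt(0.3) - xt(0.2)) / 0.1)  # equals eps - x0
print((xt(0.8) - xt(0.7)) / 0.1)  # the same vector again
```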
3. Sampling: SDE vs ODE
DDPM Sampling: SDE-Based
DDPM's reverse process follows a Stochastic Differential Equation (SDE):

$$dx = \left[f(x, t) - g(t)^2\, \nabla_x \log p_t(x)\right] dt + g(t)\, d\bar{w}$$
where:
- $f(x, t)$: drift coefficient
- $g(t)$: diffusion coefficient (noise magnitude)
- $d\bar{w}$: reverse-time Brownian motion
The Problem: The $g(t) d\bar{w}$ term adds randomness at every step. The path meanders like Brownian motion, requiring many small steps to reach the target.
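For contrast with the ODE sampler shown in the next subsection, here is a minimal sketch of DDPM ancestral sampling (the discretized reverse SDE); `model`, `beta`, `alpha`, `alpha_bar`, and `sigma` are assumed to come from a trained model and its noise schedule:

```python
import torch

# Sketch of DDPM ancestral sampling: the fresh noise z drawn at every step
# is what makes the trajectory a random walk rather than a straight line.
x = torch.randn(batch_size, dim)      # start from pure noise at t = T
for t in reversed(range(T)):
    eps_pred = model(x, t)
    mean = (x - beta[t] / torch.sqrt(1 - alpha_bar[t]) * eps_pred) \
           / torch.sqrt(alpha[t])
    z = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
    x = mean + sigma[t] * z           # stochastic kick curves the path
```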
Flow Matching Sampling: ODE-Based
Flow Matching follows an Ordinary Differential Equation (ODE):

$$\frac{dx}{dt} = v_\theta(x, t)$$
No stochastic term. We move deterministically along the learned velocity field.
Sampling:

```python
import torch

# Euler method: integrate the ODE from t = 1 (noise) down to t = 0 (data)
x = torch.randn(batch_size, dim)  # start: pure noise
dt = 1.0 / num_steps
for i in range(num_steps):
    t = 1.0 - i * dt              # current time in (0, 1]
    v = model(x, t)               # predict velocity
    x = x - v * dt                # move along the straight line
```

4. Why is Flow Matching Faster?
Mathematical Intuition
Consider the expected path length for DDPM. Due to Brownian motion characteristics, each of the $T$ steps contributes an increment of magnitude $\sim \sqrt{1/T}$:

$$\mathbb{E}[\text{path length}] \sim T \cdot \sqrt{\tfrac{1}{T}} = \sqrt{T}$$

where $T$ is the number of steps. More steps mean longer paths.
For Flow Matching's straight-line path:

$$\text{path length} = \left\| \epsilon - x_0 \right\|$$

Independent of step count. The shortest possible distance.
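A quick simulation (a minimal 1-D sketch, not from the original text) illustrates the scaling: the random walk's traveled length grows like $\sqrt{T}$, while the straight line's does not depend on $T$ at all:

```python
import torch

# Sketch: the total variation of a 1-D Brownian path over [0, 1],
# discretized into T steps, grows like sqrt(T).
for T in [10, 100, 1000]:
    increments = torch.randn(10_000, T) * (1.0 / T) ** 0.5  # dW ~ N(0, dt)
    length = increments.abs().sum(dim=1).mean().item()
    print(f"T={T:5d}  E[path length] ≈ {length:.2f}")       # ≈ 0.8 * sqrt(T)
```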
Empirical Evidence
| Method | Steps | FID (CIFAR-10) |
|---|---|---|
| DDPM | 1000 | 3.17 |
| DDPM | 100 | 13.51 |
| DDIM | 50 | 4.67 |
| Flow Matching | 10 | 3.42 |
DDPM requires 1000 steps for quality results; Flow Matching achieves comparable quality with just 10.
5. Rectified Flow: Evolution of Flow Matching
Rectified Flow advances Flow Matching further.
Core Idea: Reflow
The learned flow may not be perfectly straight. Reflow "straightens" it (see the sketch after this list):
- Generate $(z, x_0)$ pairs using the learned model
- Train a new straight-line path on these pairs
- Repeat to progressively straighten the trajectory
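A minimal sketch of one Reflow round, assuming hypothetical helpers `ode_solve` (the multi-step sampler from Section 3) and `train_velocity_model` (the training loop from Section 6):

```python
import torch

# Sketch of one Reflow round: re-pair noise with the model's own outputs,
# then retrain a straight-line flow between those deterministic pairs.
z = torch.randn(num_pairs, dim)             # fresh noise samples
with torch.no_grad():
    x = ode_solve(model, z)                 # data generated by the current flow
# (x, z) is now a deterministic coupling; fit a straight path on it:
model = train_velocity_model(pairs=(x, z))  # target velocity: z - x
```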
Combined with Distillation
For 1-step generation, apply distillation:

$$\mathcal{L}_{\text{distill}} = \mathbb{E}_{z}\left[\left\| G_\theta(z) - \mathrm{ODESolve}(z) \right\|^2\right]$$

where $G_\theta(z)$ generates data in a single forward pass.
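A distillation step might look like this sketch (`ode_solve`, the student `G`, and `optimizer` are assumed placeholders; the multi-step sampler acts as the teacher):

```python
import torch
import torch.nn.functional as F

# Sketch: distill the multi-step ODE solution into a one-step generator G.
z = torch.randn(batch_size, dim)
with torch.no_grad():
    target = ode_solve(model, z)  # teacher: multi-step Rectified Flow sample
pred = G(z)                       # student: a single forward pass
loss = F.mse_loss(pred, target)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```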
6. Implementation Comparison
DDPM Forward Process
```python
def ddpm_forward(x0, t, noise_schedule):
    """
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * epsilon
    """
    alpha_bar = noise_schedule.alpha_bar[t]
    alpha_bar = alpha_bar.view(-1, 1, 1, 1)  # reshape for image broadcasting
    epsilon = torch.randn_like(x0)
    x_t = torch.sqrt(alpha_bar) * x0 + torch.sqrt(1 - alpha_bar) * epsilon
    return x_t, epsilon
```

Flow Matching Forward Process
```python
def flow_matching_forward(x0, t):
    """
    x_t = (1 - t) * x0 + t * epsilon
    """
    epsilon = torch.randn_like(x0)
    t = t.view(-1, 1, 1, 1)           # reshape t for image broadcasting
    x_t = (1 - t) * x0 + t * epsilon
    velocity = epsilon - x0           # target velocity
    return x_t, velocity
```

Training Loop Comparison
```python
import torch
import torch.nn.functional as F

# DDPM
for x0 in dataloader:
    t = torch.randint(0, T, (batch_size,))
    x_t, epsilon = ddpm_forward(x0, t, noise_schedule)
    epsilon_pred = model(x_t, t)
    loss = F.mse_loss(epsilon_pred, epsilon)

# Flow Matching
for x0 in dataloader:
    t = torch.rand(batch_size)        # uniform [0, 1]
    x_t, velocity = flow_matching_forward(x0, t)
    velocity_pred = model(x_t, t)
    loss = F.mse_loss(velocity_pred, velocity)
```

7. When to Use What?
Choose DDPM/DDIM When:
- Leveraging existing pretrained models (Stable Diffusion, etc.)
- High diversity is critical
- Stochastic sampling is required
Choose Flow Matching When:
- Fast inference is the priority
- Training from scratch
- Simple, intuitive implementation is valued
Choose Rectified Flow When:
- 1-step or few-step generation is the goal
- Real-time applications
- Mobile/edge device deployment
8. Mathematical Connection: Score and Velocity
DDPM's score function and Flow Matching's velocity are closely related.
The score function is the gradient of the log probability density:

$$s(x, t) = \nabla_x \log p_t(x)$$

Relationship between score and noise prediction in DDPM:

$$\nabla_x \log p_t(x_t) \approx -\frac{\epsilon_\theta(x_t, t)}{\sqrt{1 - \bar{\alpha}_t}}$$

Relationship between velocity and score via the probability flow ODE:

$$v(x, t) = f(x, t) - \frac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x)$$
Therefore, a well-trained DDPM model can be converted to a Flow Matching model. This is one reason Stable Diffusion 3 transitioned to Rectified Flow.
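As a concrete sketch (under the VP SDE, where $f(x,t) = -\tfrac{1}{2}\beta_t x$ and $g(t)^2 = \beta_t$; `eps_model`, `beta`, and `alpha_bar` are assumed callables from a trained DDPM), a noise predictor can be read off as a velocity field:

```python
import torch

def velocity_from_ddpm(x, t, eps_model, beta, alpha_bar):
    # Probability-flow-ODE velocity derived from a trained noise predictor:
    #   score   = -eps_model(x, t) / sqrt(1 - alpha_bar(t))
    #   v(x, t) = f(x, t) - 0.5 * g(t)^2 * score
    eps = eps_model(x, t)
    score = -eps / torch.sqrt(1.0 - alpha_bar(t))
    return -0.5 * beta(t) * x - 0.5 * beta(t) * score
```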
Conclusion
| Property | DDPM (SDE) | Flow Matching (ODE) |
|---|---|---|
| Path | Curved (Brownian) | Straight |
| Prediction Target | Noise $\epsilon$ | Velocity $v$ |
| Sampling | Stochastic | Deterministic |
| Required Steps | 100-1000 | 5-50 |
| Implementation | Moderate | Simple |
Flow Matching emerged from asking "Why take the long way?" The simple insight that straight lines are shortest has dramatically improved generation efficiency.
References
- Ho, J., et al. "Denoising Diffusion Probabilistic Models" (NeurIPS 2020)
- Lipman, Y., et al. "Flow Matching for Generative Modeling" (ICLR 2023)
- Liu, X., et al. "Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow" (ICLR 2023)
- Song, Y., et al. "Score-Based Generative Modeling through Stochastic Differential Equations" (ICLR 2021)