TL;DR: RadarGen generates sparse radar point clouds from multi-view camera images.
Given multi-view camera images, RadarGen generates radar point clouds that align with real-world statistics and can be consumed by downstream perception models. The generated point clouds preserve scene geometry and handle occlusions. For example, modifying the input scene with an off-the-shelf image editing tool (e.g., replacing a distant car with a closer truck) updates the radar response, removing returns from newly occluded regions and reflecting the new object geometry.
We present RadarGen, a diffusion model for synthesizing realistic automotive radar point clouds from multi-view camera imagery. RadarGen adapts efficient image-latent diffusion to the radar domain by representing radar measurements in a bird’s-eye-view (BEV) form that encodes spatial structure together with radar cross section (RCS) and Doppler attributes. A lightweight recovery step reconstructs point clouds from the generated maps. To better align generation with the visual scene, RadarGen incorporates BEV-aligned depth, semantic, and motion cues extracted from pretrained foundation models, which guide the stochastic generation process toward physically plausible radar patterns. Conditioning on images makes the approach broadly compatible, in principle, with existing visual datasets and simulation frameworks, offering a scalable direction for multimodal generative simulation. Evaluations on large-scale driving data show that RadarGen captures characteristic radar measurement distributions and narrows the gap to perception models trained on real data, marking a step toward unified generative simulation across sensing modalities.
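The paper’s exact BEV encoding and recovery procedure aren’t spelled out here, but the round trip is easy to picture. Below is a minimal NumPy sketch, assuming a fixed metric grid with occupancy, RCS, and Doppler channels; all function names, grid parameters, and the last-write-wins collision handling are illustrative assumptions, not RadarGen’s actual formulation.

```python
import numpy as np

def radar_to_bev(points, rcs, doppler, extent=50.0, res=0.5):
    """Rasterize a radar point cloud into a 3-channel BEV map.

    points: (N, 2) ego-frame x/y in meters; rcs, doppler: (N,) attributes.
    Channels are occupancy, RCS, and Doppler; points outside +/- extent
    are dropped, and colliding points in one cell overwrite each other.
    (Hypothetical encoding, not the paper's exact formulation.)
    """
    size = int(2 * extent / res)
    bev = np.zeros((3, size, size), dtype=np.float32)
    ij = np.floor((points + extent) / res).astype(int)
    keep = ((ij >= 0) & (ij < size)).all(axis=1)
    i, j = ij[keep, 0], ij[keep, 1]
    bev[0, i, j] = 1.0            # occupancy channel
    bev[1, i, j] = rcs[keep]      # RCS channel
    bev[2, i, j] = doppler[keep]  # Doppler channel
    return bev

def bev_to_points(bev, extent=50.0, res=0.5, thresh=0.5):
    """Lightweight recovery: read a point cloud back from occupied cells."""
    i, j = np.nonzero(bev[0] > thresh)
    xy = np.stack([i, j], axis=1) * res - extent + res / 2  # cell centers
    return xy, bev[1, i, j], bev[2, i, j]
```

Recovery quantizes points to cell centers, so the grid resolution bounds the positional error of the reconstructed cloud.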
**Entire Area**

| Method | CD Loc. (↓) | CD Full (↓) | IoU@1m (↑) | DA Rec. (↑) | DA Prec. (↑) | DA F1 (↑) | MMD Loc. (↓) | MMD RCS (↓) | MMD Dopp. (↓) |
|---|---|---|---|---|---|---|---|---|---|
| Baseline | 1.84 ± 0.48 | 0.038 ± 0.009 | 0.23 ± 0.10 | 0.15 ± 0.10 | 0.14 ± 0.10 | 0.14 ± 0.09 | 0.368 ± 0.151 | 0.36 ± 0.25 | 0.65 ± 0.64 |
| RadarGen | 1.68 ± 0.39 | 0.040 ± 0.008 | 0.31 ± 0.11 | 0.23 ± 0.12 | 0.26 ± 0.12 | 0.24 ± 0.12 | 0.056 ± 0.062 | 0.09 ± 0.15 | 0.31 ± 0.74 |
**Foreground**

| Method | CD Loc. (↓) | CD Full (↓) | Dens. Sim. (↑) | Hit Rate (↑) | MMD Car Loc. (↓) | MMD Car RCS (↓) | MMD Car Dopp. (↓) | MMD Truck Loc. (↓) | MMD Truck RCS (↓) | MMD Truck Dopp. (↓) | MMD Trailer Loc. (↓) | MMD Trailer RCS (↓) | MMD Trailer Dopp. (↓) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | 1.32 ± 0.79 | 0.075 ± 0.049 | 0.35 ± 0.43 | 0.37 | 0.035 | 0.753 | 0.549 | 0.167 | 0.202 | 0.485 | 0.0459 | 0.064 | 0.607 |
| RadarGen | 0.95 ± 0.65 | 0.069 ± 0.049 | 0.51 ± 0.41 | 0.66 | 0.037 | 0.006 | 0.014 | 0.024 | 0.031 | 0.060 | 0.0069 | 0.022 | 0.046 |
RadarGen broadly outperforms the baseline on geometric fidelity (Chamfer distance, IoU, density similarity, hit rate), radar attribute fidelity (DA recall, precision, and F1), and distribution similarity (MMD over location, RCS, and Doppler).
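Here CD denotes Chamfer distance and MMD denotes maximum mean discrepancy. As a rough reference for how such metrics are typically computed, a minimal NumPy sketch follows; the kernel choice, bandwidth, and normalization are assumptions, and the paper’s exact variants may differ.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, d) and b (M, d).

    Mean nearest-neighbor distance in both directions. The paper's exact
    variant (squared vs. unsquared distances, normalization) may differ.
    """
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def mmd_rbf(x, y, sigma=1.0):
    """Squared MMD between samples x and y under an RBF kernel (biased estimator)."""
    def k(p, q):
        d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()
```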
RadarGen's generated point clouds closely match the ground truth in shape, distribution, and point count, a clear advantage over the baseline. RadarGen uses input frames at times t and t + Δt, while the baseline uses only the frame at t. Ground-truth bounding boxes are highlighted in color.
Modifying the input images using an off-the-shelf image editing tool updates the radar response, demonstrating object removal (left) and insertion (right).
```bibtex
@article{borreda2025radargen,
  title={RadarGen: Automotive Radar Point Cloud Generation from Cameras},
  author={Borreda, Tomer and Ding, Fangqiang and Fidler, Sanja and Huang, Shengyu and Litany, Or},
  journal={arXiv preprint arXiv:2512.17897},
  year={2025}
}
```