Generative Detail Enhancement for Physically Based Materials

¹University of Maryland, College Park
²NVIDIA
SIGGRAPH 2025 Conference Proceedings

Supplementary video

Abstract

We present a tool for enhancing the detail of physically based materials using an off-the-shelf diffusion model and inverse rendering. Our goal is to enhance the visual fidelity of materials with detail that is often tedious to author, by adding signs of wear, aging, weathering, etc. As these appearance details are often rooted in real-world processes, we leverage a generative image model trained on a large dataset of natural images with corresponding visuals in context. Starting with a given geometry, UV mapping, and basic appearance, we render multiple views of the object. We use these views, together with an appearance-defining text prompt, to condition a diffusion model. The details it generates are then backpropagated from the enhanced images to the material parameters via inverse differentiable rendering. For inverse rendering to be successful, the generated appearance has to be consistent across all the images. We propose two priors to address the multi-view consistency of the diffusion model. First, we ensure that the initial noise that seeds the diffusion process is itself consistent across views by integrating it from a view-independent UV space. Second, we enforce geometric consistency by biasing the attention mechanism via a projective constraint so that pixels attend strongly to their corresponding pixel locations in other views. Our approach does not require any training or finetuning of the diffusion model, is agnostic of the material model used, and the enhanced material properties, i.e., 2D PBR textures, can be further edited by artists.
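As a concrete illustration of the conditioning step, the sketch below enhances a single rendered view with an off-the-shelf Stable Diffusion img2img pipeline and a tile ControlNet (ControlNet tile also appears in the ablations below). The checkpoints, prompt, and parameter values are illustrative assumptions rather than the paper's exact configuration, and the multi-view consistency priors described below are omitted here.

    # Minimal single-view sketch, assuming specific off-the-shelf checkpoints.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline
    from PIL import Image

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/control_v11f1e_sd15_tile", torch_dtype=torch.float16)
    pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
        torch_dtype=torch.float16).to("cuda")

    render = Image.open("view_000.png").convert("RGB")   # basic render of the object
    enhanced = pipe(
        prompt="old bronze statue, weathered, scratched, dusty",  # appearance-defining prompt
        image=render,              # img2img start image: the plain render
        control_image=render,      # tile ControlNet keeps generated detail aligned with the render
        strength=0.4,              # how far the diffusion may drift from the render (assumed value)
        guidance_scale=7.5,
    ).images[0]
    enhanced.save("view_000_enhanced.png")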

Teaser Image

(Left) We embed the input noise within the UV space of the object to obtain a multi-view correlated noise field that can be rendered from any view. (Right) We bias the attention based on projections in 3D: we project every latent pixel in view B to the corresponding point in view A to obtain correspondences, which are then used to bias the self-attention module.
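A minimal sketch of the view-correlated noise, assuming per-view UV G-buffers: a single Gaussian noise texture is defined in UV space and looked up through each view's rasterized UVs, so all views are seeded with the same object-space noise. The paper integrates the noise from UV space to preserve its statistics; the nearest-neighbour lookup below is a simplification, and all shapes and names are illustrative.

    import torch
    import torch.nn.functional as F

    def view_correlated_noise(noise_uv, uv_buffer, mask):
        # noise_uv:  (C, H, W) Gaussian noise defined once in UV space (C = latent channels)
        # uv_buffer: (h, w, 2) per-pixel UV coordinates rasterized for this view, in [0, 1]
        # mask:      (h, w) bool, True where the object covers the pixel
        grid = uv_buffer[None] * 2.0 - 1.0                        # grid_sample expects [-1, 1]
        warped = F.grid_sample(noise_uv[None], grid,
                               mode="nearest", align_corners=False)[0]   # (C, h, w)
        background = torch.randn_like(warped)                     # uncorrelated noise off-object
        return torch.where(mask[None], warped, background)

    noise_uv = torch.randn(4, 512, 512)          # shared UV-space noise (4 SD latent channels)
    uv_buffer = torch.rand(64, 64, 2)            # stand-in UV G-buffer for one view
    mask = torch.ones(64, 64, dtype=torch.bool)
    init_latent = view_correlated_noise(noise_uv, uv_buffer, mask)   # seeds this view's diffusion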




36-view generation

Interactive per-view comparison: diffusion images (left) vs. UV-warped views (right). Panels: Original View, UV Warped View.

Attention Biasing Strength (Figure 4)

Strength slider (for this example, values between 1.2 and 1.8 are recommended).

Panels: initial, Image 0, Image 1, Image 2, Image 3.
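The strength controls how much extra attention corresponding pixels receive. Below is a minimal sketch, assuming precomputed projective correspondences between the concatenated multi-view latents; the paper's exact bias term (Figure 4) differs, and here it is approximated by up-weighting and renormalizing the attention weights at corresponding token pairs.

    import torch

    def biased_self_attention(q, k, v, corr, strength=1.5):
        # q, k, v: (tokens, dim) features of the concatenated multi-view latent sequence
        # corr:    (tokens,) long tensor; corr[i] indexes the token in another view that
        #          pixel i projects onto, or -1 where there is no correspondence
        # strength: boost for corresponding pairs (roughly 1.2-1.8 in the demo above)
        d = q.shape[-1]
        attn = (q @ k.t() / d ** 0.5).softmax(dim=-1)        # (tokens, tokens)
        boost = torch.ones_like(attn)
        rows = torch.arange(corr.shape[0], device=q.device)[corr >= 0]
        boost[rows, corr[corr >= 0]] = strength              # up-weight projective matches
        attn = attn * boost
        attn = attn / attn.sum(dim=-1, keepdim=True)         # renormalize each row
        return attn @ v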

Results

Interactive per-scene galleries. Panels: Initial, Diffused, Recovered, Relit, Albedo (initial / recovered), Roughness (initial / recovered), Normals (initial / recovered).

Baseline Comparison

Interactive per-scene comparison. Panels: Ours (diffused / recovered), TexPainter (diffused / recovered), RGB↔X (diffused), Paint-it, DreamMat.

Additional hyper-parameters (Figures 21 and 22)

Interactive sliders for additional hyper-parameters. Panels: View 001, View 006.

Ablation (Figures 8 and 23)

Interactive crop selector (Crop 0 to Crop 4). Panels: conditioning images; baseline (multi-view prompting); + ControlNet tile; + view-correlated noise; + attention bias (full model).

Effect of consistency on inverse rendering (Figure 9)

Interactive comparison. Panels: Baseline 1 (ControlNet tile + normal), Baseline 2 (+ multi-view prompting), Ours (+ attention bias & view-correlated noise).
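To illustrate the recovery step, here is a toy differentiable inverse-rendering loop, assuming per-view G-buffers (UV coordinates and normals), a single directional light, and Lambertian shading. It optimizes only an albedo texture against the diffusion-enhanced views; the paper recovers full PBR textures (albedo, roughness, normals) through a differentiable renderer. All shapes, names, and data below are stand-ins.

    import torch
    import torch.nn.functional as F

    def sample_texture(tex, uv):
        # tex: (C, H, W) texture, uv: (h, w, 2) per-pixel UVs in [0, 1] for one view
        grid = uv[None] * 2.0 - 1.0                             # grid_sample expects [-1, 1]
        return F.grid_sample(tex[None], grid, align_corners=False)[0]   # (C, h, w)

    # Stand-in per-view data: (uv, normals, light_dir, enhanced target image).
    views = [(torch.rand(64, 64, 2),
              F.normalize(torch.rand(3, 64, 64), dim=0),
              F.normalize(torch.rand(3, 1, 1), dim=0),
              torch.rand(3, 64, 64)) for _ in range(4)]

    albedo = torch.full((3, 256, 256), 0.5, requires_grad=True)   # texture being recovered
    opt = torch.optim.Adam([albedo], lr=1e-2)

    for step in range(500):
        opt.zero_grad()
        loss = 0.0
        for uv, normals, light_dir, target in views:
            shading = (normals * light_dir).sum(0).clamp(min=0.0)     # Lambertian n.l term
            rendered = sample_texture(albedo, uv) * shading           # re-render this view
            loss = loss + F.mse_loss(rendered, target)                # match the enhanced image
        loss.backward()
        opt.step()
        albedo.data.clamp_(0.0, 1.0)                                  # keep albedo in [0, 1]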

BibTeX


        @misc{hadadan2025generativeenhancementphysicallybased,
          title         = {Generative Detail Enhancement for Physically Based Materials},
          author        = {Saeed Hadadan and Benedikt Bitterli and Tizian Zeltner and Jan Novák and Fabrice Rousselle and Jacob Munkberg and Jon Hasselgren and Bartlomiej Wronski and Matthias Zwicker},
          year          = {2025},
          eprint        = {2502.13994},
          archivePrefix = {arXiv},
          primaryClass  = {cs.GR},
          url           = {https://arxiv.org/abs/2502.13994},
        }