The contributions of the paper are...
1.
First to introduce an adversarial denoising method for Monte Carlo (MC) denoising using a generative adversarial network (GAN)
2.
Auxiliary feature modulation
The intuition for using a GAN is that the goals of GANs and rendering are similar: to produce realistic images from data-specific inputs. Also, the auxiliary buffers used by previous denoising methods are well suited as training inputs.
The paper points out the problems of the state-of-the-art methods as...
1.
Objectives are handcrafted → not perceptually realistic, leading to a lack of detail
Current loss functions like MSE and MAPE do not fully reflect the perceptual quality of the denoised image. Metrics like L1 and L2 lead to over-smoothed images, so the paper expects to solve this problem by using a GAN.
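A toy example of why per-pixel L2 over-smooths: when the data has two equally likely sharp values, the MSE-optimal prediction is their mean, which is neither of them. (The values and grid below are illustrative, not from the paper.)

```python
import numpy as np

# Two equally likely "sharp" pixel values an ambiguous input could map to.
samples = np.array([0.0, 1.0])

# Search over candidate predictions for the one minimizing per-pixel MSE.
candidates = np.linspace(0.0, 1.0, 101)
mse = [np.mean((samples - c) ** 2) for c in candidates]
best = candidates[int(np.argmin(mse))]

# best == 0.5: the blurry average of the two modes, not a sharp value.
```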
2.
Auxiliary features are not used wisely.
Previous methods simply concatenate all auxiliary features as input channels, which amounts to point-wise biasing and fails to handle both high- and low-frequency effects. Using a GAN-based framework allows a better use of these features.
The paper proposes Adversarial Monte Carlo Denoising.
The framework is similar to KPCN → it handles the diffuse and specular components separately, since the rendering equation and the BRDF factorization allow it. It uses auxiliary feature buffers of normals (3 channels), depth (1 channel), and albedo (3 channels); others are optional.
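The KPCN-style diffuse/specular split described above can be sketched as follows. This is a minimal NumPy sketch, assuming the standard KPCN preprocessing (albedo is divided out of the diffuse buffer; the HDR specular buffer is log-compressed); the epsilon value and function names are illustrative, not from the paper.

```python
import numpy as np

def preprocess(diffuse, specular, albedo, eps=1e-2):
    """Prepare diffuse/specular buffers for separate denoising networks."""
    diffuse_in = diffuse / (albedo + eps)  # untangle texture from irradiance
    specular_in = np.log1p(specular)       # compress the HDR specular range
    return diffuse_in, specular_in

def postprocess(diffuse_out, specular_out, albedo, eps=1e-2):
    """Invert both transforms and recombine into the final image."""
    return diffuse_out * (albedo + eps) + np.expm1(specular_out)
```

Round-tripping the untouched buffers through `preprocess` and `postprocess` recovers the original radiance, which is what makes the separate treatment lossless in principle.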
Recent learning-based models just concatenate the auxiliary features with the noisy input, which limits the influence of these features on the deeper layers of the network.
The paper introduces conditioned feature modulation (CFM), inspired by conditional normalization methods. For instance, conditional batch normalization uses two trainable parameters, one for shifting and one for scaling; a small MLP predicts the parameters that yield a zero-mean, small-variance normalization result. Similarly, CFM introduces learnable parameters that scale and shift the feature map according to the auxiliary features.
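The modulation idea can be sketched as below: per-pixel linear maps (the NumPy equivalent of 1×1 convolutions) predict a scale and a shift from the auxiliary buffers, then modulate an intermediate feature map. All shapes and weight names here are hypothetical, not taken from the paper's architecture.

```python
import numpy as np

def cfm(features, aux, w_gamma, w_beta):
    """Conditioned feature modulation sketch.

    features: (C, H, W) intermediate feature map
    aux:      (A, H, W) auxiliary buffers (normal, depth, albedo, ...)
    w_gamma, w_beta: (C, A) per-pixel linear maps (1x1-conv analogues)
    """
    gamma = np.einsum('ca,ahw->chw', w_gamma, aux)  # predicted scale map
    beta = np.einsum('ca,ahw->chw', w_beta, aux)    # predicted shift map
    return gamma * features + beta                  # modulate the features
```

Unlike plain concatenation at the input layer, the auxiliary buffers here directly rescale every channel of the feature map, so their influence reaches deep features by construction.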
The introduced adversarial loss plays a huge role: per-pixel metrics like L1 and L2 losses do not capture long-range relations among pixels, while the adversarial loss can.
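A minimal sketch of how a per-pixel term and an adversarial term are typically combined for the generator (the weight `lam` and the non-saturating `-log D` form are common GAN practice, assumed here rather than quoted from the paper):

```python
import numpy as np

def generator_loss(denoised, reference, d_fake_prob, lam=0.005):
    """Combine per-pixel fidelity with an adversarial term.

    denoised, reference: image arrays
    d_fake_prob: discriminator's probability that the denoised image is real
    lam: hypothetical weight balancing the two terms
    """
    l1 = np.mean(np.abs(denoised - reference))      # local, per-pixel fidelity
    adv = -np.log(np.clip(d_fake_prob, 1e-8, 1.0))  # non-saturating GAN term
    return l1 + lam * adv
```

The L1 term anchors the output to the reference pixel by pixel, while the adversarial term judges the image as a whole, which is what lets it penalize over-smoothing that L1 alone tolerates.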
From an ablation study on the auxiliary feature buffers, the paper concludes that albedo is the most significant feature, normal less so, and depth the least significant. Also, with more texture and contour information the model gets more guidance for reconstructing edges. But non-geometric effects such as shading still remain unknown to the model.
Conditioned modulation helps the network find the relationship between the auxiliary features and the input image.
Limitations are...
1.
Common black-box limitation
2.
Feature buffers might not capture fine details (hair, blurry specular textures)
3.
Limited samples lead to over-fitting for some complex effects