Dynamic Graph CNN for Learning on Point Clouds [ToG19]


My Thoughts

The most interesting contribution is interpreting their model as a generalization of previous architectures for point-cloud tasks.
It was also interesting to view the design of EdgeConv as balancing the effects of local and global information of the vertices.
Rebuilding the graph with k-NN at every layer might be expensive. Still, these steps are (or can be) done efficiently on CUDA with standard Torch functions. But how could they achieve a much faster runtime than PointNet++ and PCNN?
While the graph's dynamicity can bring rich context for various tasks, it might harm the convergence of training in the early steps. More extensive experiments could have been run to find the optimal training setup.

Applicability to PBR

The graph's dynamicity could be extended with path graphs for temporal rendering. The model could still handle new vertices and edges from the next sample or the next frame.
Their final model (Eq. (8)) resembles post-processing models for MC denoising. The difference is that here the inputs (features) are optimized, whereas in post-processing the inputs (pixel radiances) are not. Still, such a method could have contributed to a more optimal result by reducing bias.

Problems to Solve

Recent PointNet-based methods treat points independently → they fail to capture local features,
even though some variations incorporate locality (e.g., using neighboring points for embedding).
Grid-like representations, used to overcome the irregular distribution of point clouds, require a lot of memory and lead to artifacts (aliasing).

Constructing Graph

1.
Construct a graph with the points as vertices $\mathbf{x}_i$.
2.
Draw edges $\mathbf{e}_{ij}$ from each vertex to its k nearest neighbors based on the pairwise distance (see the sketch below).
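As a concrete illustration, here is a minimal PyTorch sketch of the k-NN graph construction (the helper name knn_graph and the tensor shapes are my assumptions, not the authors' released code):

```python
import torch

def knn_graph(x, k):
    """Indices of the k nearest neighbors for every point.

    x: (B, N, C) batch of point coordinates (or, in later layers, features).
    Returns idx: (B, N, k). Hypothetical helper, not the authors' code.
    """
    dist = torch.cdist(x, x)  # (B, N, N) pairwise Euclidean distances
    # Each point's nearest match is itself (distance 0), so take the k+1
    # smallest distances and drop the first column.
    idx = dist.topk(k + 1, dim=-1, largest=False).indices[..., 1:]
    return idx
```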

Edge Convolution

(Figure: overall architecture of DGCNN for classification & segmentation.)
1.
Calculate the edge features as $\mathbf{e}_{ij}=h_\Theta(\mathbf{x}_i, \mathbf{x}_j)$, where $h_\Theta(\cdot)$ is a trainable function:
$e'_{ijm}=h_{\Theta_m}(\mathbf{x}_i, \mathbf{x}_j)=\text{ReLU}\big(\boldsymbol{\theta}_m\cdot(\mathbf{x}_j-\mathbf{x}_i)+\boldsymbol{\phi}_m\cdot \mathbf{x}_i\big)$
This asymmetric form balances the local information $(\mathbf{x}_j-\mathbf{x}_i)$ against the global information $\mathbf{x}_i$ during training.
$h_\Theta$ is implemented as an MLP shared across all edges.
2.
Update the vertices as $x'_{im}=\max_{j:(i,j)\in\mathcal{E}} e'_{ijm}$.
3.
Update the edge features (the k-NN graph is recomputed from the updated vertex features, which is what makes the graph dynamic).
4.
And so on for the remaining layers… (see the EdgeConv sketch below)
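Putting steps 1–4 together, a minimal EdgeConv sketch in PyTorch might look as follows. The layer sizes, the [x_i, x_j − x_i] concatenation order, and the class name are my assumptions rather than the paper's exact implementation; it reuses the knn_graph helper sketched above.

```python
import torch
import torch.nn as nn

class EdgeConv(nn.Module):
    """Minimal EdgeConv sketch (names and sizes are assumptions).

    Realizes e'_ijm = ReLU(theta_m . (x_j - x_i) + phi_m . x_i) as a
    shared MLP over [x_i, x_j - x_i], followed by a channel-wise max
    over the k neighbors.
    """
    def __init__(self, in_dim, out_dim, k=20):
        super().__init__()
        self.k = k
        # A 1x1 convolution applies the same MLP to every edge (shared MLP).
        self.mlp = nn.Sequential(
            nn.Conv2d(2 * in_dim, out_dim, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_dim),
            nn.ReLU(),
        )

    def forward(self, x):
        # x: (B, N, C). The graph is rebuilt from the *current* features,
        # which is what makes the graph dynamic across layers.
        B, N, C = x.shape
        idx = knn_graph(x, self.k)                                # (B, N, k)
        neighbors = torch.gather(
            x.unsqueeze(1).expand(B, N, N, C), 2,
            idx.unsqueeze(-1).expand(B, N, self.k, C))            # (B, N, k, C)
        center = x.unsqueeze(2).expand_as(neighbors)              # (B, N, k, C)
        # Edge features [x_i, x_j - x_i], reshaped to (B, 2C, N, k) for Conv2d.
        edge = torch.cat([center, neighbors - center], dim=-1).permute(0, 3, 1, 2)
        e = self.mlp(edge)                                        # (B, out_dim, N, k)
        # Vertex update: max over the k neighbors, back to (B, N, out_dim).
        return e.max(dim=-1).values.permute(0, 2, 1)
```

A full network would stack several such layers, concatenating their outputs and pooling globally, as in the architecture figure above.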