2022, arXiv. DOI: 10.48550/arXiv.2203.15216. arXiv ID: 2203.15216

Affine Medical Image Registration with Coarse-to-Fine Vision Transformer

Tony C. W. Mok, Albert C. S. Chung

Abstract:

Affine registration is indispensable in a comprehensive medical image registration pipeline. However, only a few studies focus on fast and robust affine registration algorithms. Most of these studies utilize convolutional neural networks (CNNs) to learn joint affine and non-parametric registration, while the standalone performance of the affine subnetwork is less explored. Moreover, existing CNN-based affine registration approaches focus either on the local misalignment or on the global orientation and position of the input to predict the affine transformation matrix; such approaches are sensitive to spatial initialization and exhibit limited generalizability beyond the training dataset. In this paper, we present a fast and robust learning-based algorithm, Coarse-to-Fine Vision Transformer (C2FViT), for 3D affine medical image registration. Our method naturally leverages the global connectivity and locality of the convolutional vision transformer and the multi-resolution strategy to learn the global affine registration. We evaluate our method on 3D brain atlas registration and template-matching normalization. Comprehensive results demonstrate that our method is superior to the existing CNN-based affine registration methods in terms of registration accuracy, robustness and generalizability while preserving the runtime advantage of the learning-based methods. The source code is available at this https URL.

2022-09-29 12:12:00

前进:

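#paper Affine Medical Image Registration with Coarse-to-Fine Vision Transformer. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 20835-20844. Affine registration is an indispensable part of a comprehensive medical image registration pipeline. However, few studies address fast and robust affine registration algorithms. Most of them are CNN models for joint affine and deformable registration, and the standalone performance of the affine subnetwork has received little study. Moreover, existing CNN-based affine registration methods attend either to the local misalignment of the input or to its global orientation and position when predicting the affine transformation matrix; such methods are sensitive to spatial initialization and generalize poorly. This paper proposes C2FViT, a fast and robust learning-based algorithm for 3D affine medical image registration. The method naturally exploits the global connectivity of the Transformer, the locality of CNNs, and a multi-resolution strategy to learn global affine registration, and it is evaluated on 3D brain atlas registration. The results show that the method performs well in registration accuracy, robustness, registration speed, and generalizability.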
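
To make two terms from the abstract concrete, "multi-resolution strategy" and "affine transformation matrix", here is a minimal NumPy sketch. It is not the authors' implementation and contains no Transformer. The function names (`downsample2`, `pyramid`, `apply_affine`), the 2x2x2 average pooling, and the 4x4 homogeneous matrix convention are all assumptions chosen for illustration.

```python
import numpy as np


def downsample2(vol):
    """Halve each dimension of a 3D volume by averaging 2x2x2 blocks."""
    # Crop to even sizes so the volume splits cleanly into 2x2x2 blocks.
    d, h, w = (s // 2 * 2 for s in vol.shape)
    v = vol[:d, :h, :w]
    return v.reshape(d // 2, 2, h // 2, 2, w // 2, 2).mean(axis=(1, 3, 5))


def pyramid(vol, levels=3):
    """Return volumes ordered coarse to fine; the last entry is the input."""
    out = [vol]
    for _ in range(levels - 1):
        out.append(downsample2(out[-1]))
    return out[::-1]


def apply_affine(matrix, points):
    """Apply a 4x4 homogeneous affine matrix to an (N, 3) array of points."""
    homog = np.hstack([points, np.ones((len(points), 1))])
    return (homog @ matrix.T)[:, :3]


# Hypothetical toy data: a 4x4x4 volume and a pure translation.
vol = np.arange(64, dtype=float).reshape(4, 4, 4)
levels = pyramid(vol, levels=3)

translate = np.eye(4)
translate[:3, 3] = [1.0, 2.0, 3.0]
moved = apply_affine(translate, np.zeros((1, 3)))
```

In a coarse-to-fine scheme, an estimate made on the coarsest volume would typically be refined on each finer one. How C2FViT combines its stages is not described in the text above, so that step is left out here.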