Papers shared by user 前进.
22 shared papers found in total; this page shows items 1 - 20.
1.
前进 (2024-10-31 15:09):
#paper arXiv:2408.05839v2 Deep Learning in Medical Image Registration: Magic or Mirage? 38th Conference on Neural Information Processing Systems (NeurIPS 2024). This paper takes a close look at how deep-learning-based image registration (DLIR) compares with classical optimization methods in medical image registration. It contrasts the two paradigms in deformable image registration (DIR), noting that classical methods offer cross-modality generalizability and robust performance, while learning-based methods promise superior performance through weak supervision. Through a series of experiments, the paper shows that in the unsupervised setting, learning-based methods do not significantly outperform classical ones in label-matching performance, and puts forward the hypothesis that architectural design in learning-based methods is unlikely to affect the mutual information between per-pixel intensity distributions and labels, and therefore unlikely to significantly improve learning-based performance. It further shows that under weak supervision, learning-based methods reach a registration accuracy that classical methods cannot. However, learning-based methods are sensitive to shifts in the data distribution and show no robustness to such shifts. The paper concludes that, in the absence of large labeled datasets, classical optimization methods remain the better choice.
arXiv, 2024-08-11T18:20:08Z. DOI: 10.48550/arXiv.2408.05839
Abstract:
Classical optimization and learning-based methods are the two reigning paradigms in deformable image registration. While optimization-based methods boast generalizability across modalities and robust performance, learning-based methods promise peak performance, incorporating weak supervision and amortized optimization. However, the exact conditions for either paradigm to perform well over the other are shrouded and not explicitly outlined in the existing literature. In this paper, we make an explicit correspondence between the mutual information of the distribution of per-pixel intensity and labels, and the performance of classical registration methods. This strong correlation hints to the fact that architectural designs in learning-based methods is unlikely to affect this correlation, and therefore, the performance of learning-based methods. This hypothesis is thoroughly validated with state-of-the-art classical and learning-based methods. However, learning-based methods with weak supervision can perform high-fidelity intensity and label registration, which is not possible with classical methods. Next, we show that this high-fidelity feature learning does not translate to invariance to domain shift, and learning-based methods are sensitive to such changes in the data distribution. Finally, we propose a general recipe to choose the best paradigm for a given registration problem, based on these observations.
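The paper's central quantity, the mutual information between per-pixel intensity and labels, can be estimated from a joint histogram. A minimal sketch, not the authors' code; the bin count and the synthetic data below are arbitrary illustrative choices:

```python
import numpy as np

def mutual_information(intensity, labels, bins=32):
    """MI between per-pixel intensity and integer labels, via a joint histogram."""
    joint, _, _ = np.histogram2d(intensity, labels, bins=(bins, int(labels.max()) + 1))
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal over intensity bins
    py = p.sum(axis=0, keepdims=True)   # marginal over labels
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
intensity = rng.random(100_000)
dependent = (intensity > 0.5).astype(int)   # label determined by intensity: high MI
shuffled = rng.permutation(dependent)       # label independent of intensity: MI near 0

assert mutual_information(intensity, dependent) > mutual_information(intensity, shuffled)
```

Under the paper's hypothesis, a registration problem with low intensity-label MI is hard for any architecture, which is what makes this scalar a useful diagnostic before training.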
2.
前进 (2024-09-30 16:31):
#paper DOI 10.1186/1471-2105-12-451 Frazer Meacham, Dario Boffelli, Joseph, Identification and correction of systematic error in high-throughput sequence data. This paper studies systematic errors in high-throughput sequencing data: errors that accumulate at specific genome (or transcriptome) locations in a statistically unlikely way. Using overlapping paired reads from high-coverage data, the authors characterize these errors, finding that they occur at roughly 1 in 1000 base pairs and are highly replicable across experiments. They identify sequences that occur frequently at systematic-error sites and design a classifier that distinguishes heterozygous sites from systematic errors; the classifier accommodates experiments in which the allele frequency at heterozygous sites is not necessarily 0.5, and works with single-end datasets. The paper concludes that systematic errors are easily mistaken for heterozygous sites in individuals, or for SNPs in population analyses. Building on this characterization, the authors developed a program called SysCall to identify and correct such errors, and conclude that correcting systematic errors is important to consider when designing and interpreting high-throughput sequencing experiments.
Abstract:
Background: A feature common to all DNA sequencing technologies is the presence of base-call errors in the sequenced reads. The implications of such errors are application specific, ranging from minor informatics nuisances to major problems affecting biological inferences. Recently developed "next-gen" sequencing technologies have greatly reduced the cost of sequencing, but have been shown to be more error prone than previous technologies. Both position specific (depending on the location in the read) and sequence specific (depending on the sequence in the read) errors have been identified in Illumina and Life Technology sequencing platforms. We describe a new type of systematic error that manifests as statistically unlikely accumulations of errors at specific genome (or transcriptome) locations.
Results: We characterize and describe systematic errors using overlapping paired reads from high-coverage data. We show that such errors occur in approximately 1 in 1000 base pairs, and that they are highly replicable across experiments. We identify motifs that are frequent at systematic error sites, and describe a classifier that distinguishes heterozygous sites from systematic error. Our classifier is designed to accommodate data from experiments in which the allele frequencies at heterozygous sites are not necessarily 0.5 (such as in the case of RNA-Seq), and can be used with single-end datasets.
Conclusions: Systematic errors can easily be mistaken for heterozygous sites in individuals, or for SNPs in population analyses. Systematic errors are particularly problematic in low coverage experiments, or in estimates of allele-specific expression from RNA-Seq data. Our characterization of systematic error has allowed us to develop a program, called SysCall, for identifying and correcting such errors. We conclude that correction of systematic errors is important to consider in the design and interpretation of high-throughput sequencing experiments.
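The core statistical decision, heterozygous site versus recurrent error, can be illustrated with a binomial likelihood-ratio test. This is only a toy stand-in for the SysCall classifier (which also exploits error motifs and read context); the error rate and allele frequency below are illustrative assumptions:

```python
from math import comb, log

def log_lr_het_vs_error(alt_count, depth, err_rate=0.01, het_af=0.5):
    """Log-likelihood ratio of 'true heterozygote' vs. 'recurrent base-call error'.
    Positive values favor a heterozygous site."""
    def binom_logpmf(k, n, p):
        return log(comb(n, k)) + k * log(p) + (n - k) * log(1 - p)
    return (binom_logpmf(alt_count, depth, het_af)
            - binom_logpmf(alt_count, depth, err_rate))

# 15 alternate reads out of 30 strongly favors a heterozygote...
assert log_lr_het_vs_error(15, 30) > 0
# ...while 1 out of 30 is better explained as sequencing error
assert log_lr_het_vs_error(1, 30) < 0
```

A systematic error breaks the independence assumption behind such a test, recurring at the same site across reads, which is exactly why it mimics a heterozygous signal and needs the extra features SysCall uses.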
3.
前进 (2024-08-31 14:29):
#paper https://doi.org/10.15326/jcopdf.2023.0399 Chen, J., Xu, Z., Sun, L., Yu, K., Hersh, C. P., Boueiz, A., ... Batmanghelich, K. (2023). Deep learning integration of chest computed tomography and gene expression identifies novel aspects of COPD. Chronic Obstructive Pulmonary Diseases: Journal of the COPD Foundation, 10(4), 355-368. This paper uses deep learning to jointly analyze chest CT scans and blood RNA-seq data from patients with chronic obstructive pulmonary disease (COPD), to uncover novel relationships between lung structural changes and blood transcriptome patterns. The study identifies two image-expression axes (IEAs), associated with emphysema and airway disease respectively, and relates them to distinct COPD clinical measurements and health outcomes. Bioinformatic analysis further identifies the biological pathways associated with the two IEAs. The work offers a new perspective on COPD heterogeneity and may help in developing targeted treatments.
Abstract:
Rationale: Chronic obstructive pulmonary disease (COPD) is characterized by pathologic changes in the airways, lung parenchyma, and persistent inflammation, but the links between lung structural changes and blood transcriptome patterns have not been fully described.
Objectives: The objective of this study was to identify novel relationships between lung structural changes measured by chest computed tomography (CT) and blood transcriptome patterns measured by blood RNA sequencing (RNA-seq).
Methods: CT scan images and blood RNA-seq gene expression from 1223 participants in the COPD Genetic Epidemiology (COPDGene®) study were jointly analyzed using deep learning to identify shared aspects of inflammation and lung structural changes that we labeled image-expression axes (IEAs). We related IEAs to COPD-related measurements and prospective health outcomes through regression and Cox proportional hazards models and tested them for biological pathway enrichment.
Results: We identified 2 distinct IEAs: IEAemph, which captures an emphysema-predominant process with a strong positive correlation to CT emphysema and a negative correlation to forced expiratory volume in 1 second and body mass index (BMI); and IEAairway, which captures an airway-predominant process with a positive correlation to BMI and airway wall thickness and a negative correlation to emphysema. Pathway enrichment analysis identified 29 and 13 pathways significantly associated with IEAemph and IEAairway, respectively (adjusted p<0.001).
Conclusions: Integration of CT scans and blood RNA-seq data identified 2 IEAs that capture distinct inflammatory processes associated with emphysema and airway-predominant COPD.
4.
前进 (2024-07-31 11:35):
#paper DOI: https://doi.org/10.48550/arXiv.2006.16236 Katharopoulos A, Vyas A, Pappas N, et al. Transformers are RNNs: Fast autoregressive transformers with linear attention[C]//International Conference on Machine Learning. PMLR, 2020: 5156-5165. This paper proposes a linear Transformer that expresses self-attention as a linear dot-product of kernel feature maps and exploits the associativity of matrix multiplication, reducing the computational complexity of the traditional Transformer on long sequences from O(N^2) to O(N). The authors show that the new model matches the performance of the standard Transformer while being up to 4000x faster at autoregressive prediction of long sequences. The paper also examines the relationship between Transformers and recurrent neural networks (RNNs), showing that with a suitable reformulation a Transformer can perform autoregressive prediction as efficiently as an RNN.
arXiv, 2020-06-29T17:55:38Z. DOI: 10.48550/arXiv.2006.16236
Abstract:
Transformers achieve remarkable performance in several tasks but due to their quadratic complexity, with respect to the input's length, they are prohibitively slow for very long sequences. To address this limitation, we express the self-attention as a linear dot-product of kernel feature maps and make use of the associativity property of matrix products to reduce the complexity from $\mathcal{O}\left(N^2\right)$ to $\mathcal{O}\left(N\right)$, where $N$ is the sequence length. We show that this formulation permits an iterative implementation that dramatically accelerates autoregressive transformers and reveals their relationship to recurrent neural networks. Our linear transformers achieve similar performance to vanilla transformers and they are up to 4000x faster on autoregressive prediction of very long sequences.
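The associativity trick is easy to demonstrate: with a positive feature map φ, softmax(QKᵀ)V is replaced by φ(Q)(φ(K)ᵀV) normalized by φ(Q)·Σφ(K), so the N×N attention matrix is never formed. A numpy sketch using the paper's elu(x)+1 feature map (single head, no masking):

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a simple positive kernel feature map, as used in the paper
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    q, k = feature_map(Q), feature_map(K)
    kv = k.T @ V                     # (d, d) summary, cost O(N d^2)
    z = q @ k.sum(axis=0)            # (N,) normalizer
    return (q @ kv) / z[:, None]     # the N x N attention matrix is never built

rng = np.random.default_rng(0)
N, d = 256, 16
Q, K = rng.standard_normal((N, d)), rng.standard_normal((N, d))

# each output row is a weighted average of the rows of V (weights sum to 1),
# so a constant V must be reproduced exactly
out = linear_attention(Q, K, np.ones((N, d)))
assert np.allclose(out, 1.0)
```

For autoregressive decoding, `kv` and `z` become running sums updated one token at a time, which is the RNN view the title refers to.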
5.
前进 (2024-06-30 22:29):
#paper Liu R, Li Z, Fan X, et al. Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond[J]. 2020. DOI: 10.48550/arXiv.2004.14557. The paper proposes a new deep-learning framework that optimizes a diffeomorphic model via multi-scale propagation, aiming to combine the strengths of classical deformable registration and learning-based methods while avoiding their limitations. Specifically, the authors introduce a generic optimization model for diffeomorphic registration and develop a series of learnable architectures that complete registration from coarse-to-fine image features. They further propose a novel bilevel self-tuned training strategy that efficiently searches task-specific hyperparameters, increasing flexibility across data types while reducing computational and human effort. Registration experiments on multiple datasets, including image-to-atlas registration on brain MRI and image-to-image registration on liver CT, show state-of-the-art performance while preserving diffeomorphism. The authors also apply the framework to multi-modal image registration and study how the registration supports downstream medical image analysis tasks, including multi-modal fusion and image segmentation.
Abstract:
Conventional deformable registration methods aim at solving an optimization model carefully designed on image pairs and their computational costs are exceptionally high. In contrast, recent deep learning based approaches can provide fast deformation estimation. These heuristic network architectures are fully data-driven and thus lack explicit geometric constraints, e.g., topology-preserving, which are indispensable to generate plausible deformations. We design a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation in order to integrate advantages and avoid limitations of these two categories of approaches. Specifically, we introduce a generic optimization model to formulate diffeomorphic registration and develop a series of learnable architectures to obtain propagative updating in the coarse-to-fine feature space. Moreover, we propose a novel bilevel self-tuned training strategy, allowing efficient search of task-specific hyper-parameters. This training strategy increases the flexibility to various types of data while reducing computational and human burdens. We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data. Extensive results demonstrate the state-of-the-art performance of the proposed method with diffeomorphic guarantee and extreme efficiency. We also apply our framework to challenging multi-modal image registration, and investigate how our registration supports the down-streaming tasks for medical image analysis including multi-modal fusion and image segmentation.
6.
前进 (2024-05-30 13:53):
#paper Luo S, Xie Z, Chen G, et al. Hierarchical DNN with Heterogeneous Computing Enabled High-Performance DNA Sequencing[C]//2022 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS). IEEE, 2022: 35-40. This paper applies deep learning to second-generation (next-generation) base calling. Among base-calling algorithms, AYB is the most accurate, but it handles the fluorescence signal poorly as it fades over sequencing cycles, and it also struggles with the phasing effect in DNA. A deep-learning approach addresses both problems well: it first detects cluster positions from the fluorescence images of the first five cycles, extracts cluster intensities for subsequent cycles, corrects intensity chromatic aberration with a conventional channel-correction algorithm, and then feeds the corrected intensities into a DNN to classify the bases. Experiments show that, compared with the conventional algorithm, the deep-learning pipeline detects 12.18% more reads and reduces the base classification error rate from 0.1432% to 0.0175%.
7.
前进 (2024-04-30 11:44):
#paper Han D, Pan X, Han Y, et al. Flatten Transformer: Vision transformer using focused linear attention[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023: 5961-5971. The main obstacle to applying self-attention in computer vision is its quadratic computational complexity, which makes vision tasks very expensive. As an alternative to softmax attention, linear attention approximates the softmax operation with carefully designed mapping functions, reducing the complexity from quadratic to linear. Although linear attention is more efficient in theory, existing linear-attention methods either suffer a significant performance drop or require extra computational overhead, limiting their practical use. To overcome these limitations, the paper proposes the focused linear attention (FLA) module, which improves efficiency and expressiveness through two main changes. (1) Focus ability: a simple mapping function sharpens self-attention's focus on the most informative features. (2) Feature diversity: an efficient rank-restoration module based on depthwise convolution (DWC) restores the rank of the attention matrix, increasing feature diversity. Extensive experiments on several advanced vision Transformer models show consistent performance gains across multiple benchmarks.
8.
前进 (2024-03-31 12:44):
#paper [1] Hu X, Kang M, Huang W, et al. Dual-Stream Pyramid Registration Network[J]. Springer, Cham, 2019. DOI: 10.1007/978-3-030-32245-8_43. This paper targets unsupervised 3D brain medical image registration. Unlike earlier CNN-based registration methods such as VoxelMorph, Dual-PRNet uses a dual-stream architecture that sequentially estimates multi-level registration fields from a pair of 3D volumes. Main contributions: (1) a dual-stream 3D encoder-decoder network that computes two convolutional feature pyramids, one from each input volume; (2) sequential pyramid registration, in which a series of pyramid registration (PR) modules predict multi-level registration fields directly from the decoded feature pyramids; sequential warping refines the fields coarse-to-fine, giving the model a strong ability to handle large deformations; (3) the PR modules can be further enhanced by computing local 3D correlations between the feature pyramids, yielding the improved Dual-PRNet++, which aggregates rich, detailed anatomical structure; (4) Dual-PRNet++ is integrated into a 3D segmentation framework for joint registration and segmentation by accurately warping voxel-level annotations. The paper also reviews related deep-learning registration work and evaluates the method: on the Mindboggle101 dataset, Dual-PRNet++ raises the Dice score from 0.511 to 0.748, substantially outperforming the existing state of the art. The paper further shows how, in a joint-learning framework with limited annotations, the method greatly aids the segmentation task.
9.
前进 (2024-02-28 10:57):
#paper Mckenzie E M, Santhanam A, Ruan D, et al. Multimodality image registration in the head-and-neck using a deep learning-derived synthetic CT as a bridge[J]. Medical Physics, 2020, 47(3). DOI: 10.1002/mp.13976. This paper proposes and validates a head-and-neck multimodality image registration method built on deep-learning-based cross-modality synthesis. A CycleGAN converts MRI into synthetic CT (sCT), turning MRI-CT multimodal registration of the head and neck into sCT-CT monomodal registration; the registration itself uses the conventional B-spline method. Experiments show that sCT→CT registration is more accurate than MRI→CT, with the average registration error dropping from 9.8 mm to 6.0 mm.
IF: 3.200, Q1. Medical Physics, 2020-Mar. DOI: 10.1002/mp.13976 PMID: 31853975
Abstract:
PURPOSE: To develop and demonstrate the efficacy of a novel head-and-neck multimodality image registration technique using deep-learning-based cross-modality synthesis.
METHODS AND MATERIALS: Twenty-five head-and-neck patients received magnetic resonance (MR) and computed tomography (CT) (CT_aligned) scans on the same day with the same immobilization. Fivefold cross validation was used with all of the MR-CT pairs to train a neural network to generate synthetic CTs from MR images. Twenty-four of 25 patients also had a separate CT without immobilization (CT_non-aligned) and were used for testing. CT_non-aligned's were deformed to the synthetic CT, and compared to CT_non-aligned registered to MR. The same registrations were performed from MR to CT_non-aligned and from synthetic CT to CT_non-aligned. All registrations used B-splines for modeling the deformation, and mutual information for the objective. Results were evaluated using the 95% Hausdorff distance among spinal cord contours, landmark error, inverse consistency, and Jacobian determinant of the estimated deformation fields.
RESULTS: When large initial rigid misalignment is present, registering CT to MRI-derived synthetic CT aligns the cord better than a direct registration. The average landmark error decreased from 9.8 ± 3.1 mm in MR→CT_non-aligned to 6.0 ± 2.1 mm in CT_synth→CT_non-aligned deformable registrations. In the CT to MR direction, the landmark error decreased from 10.0 ± 4.3 mm in CT_non-aligned→MR deformable registrations to 6.6 ± 2.0 mm in CT_non-aligned→CT_synth deformable registrations. The Jacobian determinant had an average value of 0.98. The proposed method also demonstrated improved inverse consistency over the direct method.
CONCLUSIONS: We showed that using a deep learning-derived synthetic CT in lieu of an MR for MR→CT and CT→MR deformable registration offers superior results to direct multimodal registration.
10.
前进 (2024-01-31 22:50):
#paper arxiv.org//pdf/2311.026 2023 Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection. The large multimodal model (LMM) GPT-4V(ision) endows GPT-4 with visual grounding, making it possible to handle certain tasks through the visual question answering (VQA) paradigm. This paper explores the potential of VQA-oriented GPT-4V for the recently popular task of visual anomaly detection (AD), and is the first to run qualitative and quantitative evaluations on the popular MVTec AD and VisA datasets. Since the task requires both image- and pixel-level evaluation, the proposed GPT-4V-AD framework has three components: (1) granular region division, (2) prompt design, and (3) Text2Segmentation for easy quantitative evaluation; several variants are also tried for comparative analysis. The results show that GPT-4V can achieve nontrivial zero-shot AD results through the VQA paradigm, e.g., image-level AU-ROCs of 77.1/88.0 and pixel-level AU-ROCs of 68.0/76.6 on the MVTec AD and VisA datasets, respectively. However, a gap remains to state-of-the-art zero-shot methods such as WinCLIP and CLIP-AD, and further research is needed. The study provides a baseline reference for research on VQA-oriented LMMs in zero-shot AD.
Abstract:
Large Multimodal Model (LMM) GPT-4V(ision) endows GPT-4 with visual grounding capabilities, making it possible to handle certain tasks through the Visual Question Answering (VQA) paradigm. This paper explores the potential of VQA-oriented GPT-4V in the recently popular visual Anomaly Detection (AD) and is the first to conduct qualitative and quantitative evaluations on the popular MVTec AD and VisA datasets. Considering that this task requires both image-/pixel-level evaluations, the proposed GPT-4V-AD framework contains three components: 1) Granular Region Division, 2) Prompt Designing, 3) Text2Segmentation for easy quantitative evaluation, and have made some different attempts for comparative analysis. The results show that GPT-4V can achieve certain results in the zero-shot AD task through a VQA paradigm, such as achieving image-level 77.1/88.0 and pixel-level 68.0/76.6 AU-ROCs on MVTec AD and VisA datasets, respectively. However, its performance still has a certain gap compared to the state-of-the-art zero-shot methods, e.g., WinCLIP and CLIP-AD, and further research is needed. This study provides a baseline reference for the research of VQA-oriented LMM in the zero-shot AD task, and we also post several possible future works. Code is available at https://github.com/zhangzjn/GPT-4V-AD.
11.
前进 (2023-12-27 15:11):
#paper arXiv:2312.11514v1, 2023, LLM in a flash: Efficient Large Language Model Inference with Limited Memory. Large language models (LLMs) are central to modern natural language processing, but their heavy compute and memory demands are a challenge for memory-limited devices. To run LLMs that exceed the available DRAM capacity efficiently, the paper stores model parameters on flash memory and brings them into DRAM on demand. The approach builds an inference model that harmonizes with flash behavior and optimizes in two key areas: reducing the volume of data transferred from flash, and reading data in larger, more contiguous chunks. Within this framework, two main techniques are introduced: a "windowing" strategy cuts data transfer by reusing previously activated neurons, and "row-column bundling" exploits flash memory's strength at sequential data access to increase the size of the chunks read from flash. These methods make it possible to run models up to twice the size of the available DRAM, with inference speedups of 4-5x on CPU and 20-25x on GPU over naive loading.
Abstract:
Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their intensive computational and memory requirements present challenges, especially for devices with limited DRAM capacity. This paper tackles the challenge of efficiently running LLMs that exceed the available DRAM capacity by storing the model parameters on flash memory but bringing them on demand to DRAM. Our method involves constructing an inference cost model that harmonizes with the flash memory behavior, guiding us to optimize in two critical areas: reducing the volume of data transferred from flash and reading data in larger, more contiguous chunks. Within this flash memory-informed framework, we introduce two principal techniques. First, "windowing" strategically reduces data transfer by reusing previously activated neurons, and second, "row-column bundling", tailored to the sequential data access strengths of flash memory, increases the size of data chunks read from flash memory. These methods collectively enable running models up to twice the size of the available DRAM, with a 4-5x and 20-25x increase in inference speed compared to naive loading approaches in CPU and GPU, respectively. Our integration of sparsity awareness, context-adaptive loading, and a hardware-oriented design paves the way for effective inference of LLMs on devices with limited memory.
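The "windowing" idea can be sketched as a cache of neurons active in the last few tokens: only neurons not already resident in DRAM need to be fetched from flash. A toy accounting sketch; the window size and activation sets are invented for illustration, not taken from the paper:

```python
from collections import deque

def flash_loads(active_per_token, window=4):
    """Compare flash->DRAM neuron loads: naive (reload everything each token)
    vs. windowing (reuse neurons active within the last `window` tokens)."""
    history = deque(maxlen=window)   # activation sets of the recent tokens
    naive = windowed = 0
    for active in active_per_token:
        active = set(active)
        resident = set().union(*history) if history else set()
        naive += len(active)                 # reload every active neuron
        windowed += len(active - resident)   # load only what is not resident
        history.append(active)
    return naive, windowed

# overlapping activation sets token-to-token -> windowing transfers far less
tokens = [{0, 1, 2, 3}, {1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}]
naive, windowed = flash_loads(tokens)
assert naive == 16 and windowed == 7
```

The real system tracks this at the level of feed-forward neurons selected by a sparsity predictor, but the bookkeeping is the same: transfer volume scales with the change in the active set, not its size.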
12.
前进 (2023-11-30 10:22):
#paper GraformerDIR: Graph convolution transformer for deformable image registration. Computers in Biology and Medicine, 30 June 2022. https://doi.org/10.1016/j.compbiomed.2022.105799 This paper applies graph convolution to image registration: by placing graph convolution Transformer (Graformer) layers at the bottom of the feature-extraction network, it proposes a Graformer-based DIR framework named GraformerDIR. The Graformer layer consists of a Graformer module and a Chebyshev graph convolution module; the Graformer module is designed to capture high-quality long-range dependencies, and the Chebyshev graph convolution module further enlarges the receptive field. GraformerDIR was evaluated on publicly available brain datasets, including OASIS, LPBA40, and MGH10. Compared with VoxelMorph, it improves the Dice similarity coefficient (DSC) by 4.6% and the average symmetric surface distance by 0.055 mm, with a lower folding rate.
Abstract:
PURPOSE: Deformable image registration (DIR) plays an important role in assisting disease diagnosis. The emergence of the Transformer enables the DIR framework to extract long-range dependencies, which relieves the limitations of intrinsic locality caused by convolution operation. However, suffering from the interference of missing or spurious connections, it is a challenging task for Transformer-based methods to capture the high-quality long-range dependencies.
METHODS: In this paper, by stacking the graph convolution Transformer (Graformer) layer at the bottom of the feature extraction network, we propose a Graformer-based DIR framework, named GraformerDIR. The Graformer layer consists of the Graformer module and the Chebyshev graph convolution module. Among them, the Graformer module is designed to capture high-quality long-range dependencies. The Chebyshev graph convolution module is employed to further enlarge the receptive field.
RESULTS: The performance and generalizability of GraformerDIR have been evaluated on publicly available brain datasets including the OASIS, LPBA40, and MGH10 datasets. Compared with VoxelMorph, the GraformerDIR has obtained performance improvements of 4.6% in Dice similarity coefficient (DSC) and 0.055 mm in the average symmetric surface distance (ASD) while reducing the non-positive rate of Jacobian determinant (Npr.Jac) index about 60 times on the publicly available OASIS dataset. On the unseen dataset MGH10, the GraformerDIR has obtained performance improvements of 4.1% in DSC and 0.084 mm in ASD compared with VoxelMorph, which demonstrates the GraformerDIR with better generalizability. The promising performance on the clinical cardiac dataset ACDC indicates the GraformerDIR is practicable.
CONCLUSION: With the advantages of the Transformer and graph convolution, the GraformerDIR has obtained comparable performance with the state-of-the-art method VoxelMorph.
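The Chebyshev graph-convolution module follows the standard Chebyshev recurrence T0(L̃)X = X, T1(L̃)X = L̃X, Tk = 2L̃Tk-1 - Tk-2 applied to node features. A minimal sketch, not the paper's implementation; it assumes a normalized Laplacian with λmax ≈ 2 for the rescaling:

```python
import numpy as np

def cheb_graph_conv(L_norm, X, theta):
    """Chebyshev graph convolution: sum_k theta[k] * T_k(L~) @ X.
    L_norm: normalized graph Laplacian; rescaled as L~ = L_norm - I
    (assuming lambda_max ~= 2, so eigenvalues land in [-1, 1])."""
    L_t = L_norm - np.eye(L_norm.shape[0])
    Tx = [X]                                  # T_0 X = X
    if len(theta) > 1:
        Tx.append(L_t @ X)                    # T_1 X = L~ X
    for _ in range(2, len(theta)):
        Tx.append(2.0 * (L_t @ Tx[-1]) - Tx[-2])   # Chebyshev recurrence
    return sum(t * T for t, T in zip(theta, Tx))

# 2-node graph with one edge: normalized Laplacian [[1,-1],[-1,1]]
L = np.array([[1.0, -1.0], [-1.0, 1.0]])
X = np.array([[2.0], [0.0]])
# an order-0 filter with theta = [1] is the identity on node features
assert np.allclose(cheb_graph_conv(L, X, [1.0]), X)
```

Each extra Chebyshev order mixes in neighbors one hop further away, which is how the module enlarges the receptive field without stacking more layers.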
13.
前进 (2023-10-30 13:57):
#paper https://doi.org/10.1088/1361-6560/ac5f70 Training low dose CT denoising network without high quality reference data. The field of low-dose CT (LDCT) denoising is dominated by supervised methods, which require perfectly registered LDCT images paired with their clean reference images (normal-dose CT). Training without clean labels is of more practical significance, since it is clinically impossible to acquire large numbers of such paired samples. This paper proposes a self-supervised denoising method for LDCT imaging that requires no clean images. During denoising, a perceptual loss enforces data consistency in the feature domain, and attention blocks in the decoding stage further improve image quality. The experiments compare against three methods and include six ablation studies, validating the proposed self-supervised framework as well as the effectiveness of the self-attention module and the perceptual loss.
Abstract:
Currently, the field of low-dose CT (LDCT) denoising is dominated by supervised learning based methods, which need perfectly registered pairs of LDCT and its corresponding clean reference image (normal-dose CT). However, training without clean labels is more practically feasible and significant, since it is clinically impossible to acquire a large amount of these paired samples. In this paper, a self-supervised denoising method is proposed for LDCT imaging. The proposed method does not require any clean images. In addition, the perceptual loss is used to achieve data consistency in feature domain during the denoising process. Attention blocks used in decoding phase can help further improve the image quality. In the experiments, we validate the effectiveness of our proposed self-supervised framework and compare our method with several state-of-the-art supervised and unsupervised methods. The results show that our proposed model achieves competitive performance in both qualitative and quantitative aspects to other methods. Our framework can be directly applied to most denoising scenarios without collecting pairs of training data, which is more flexible for real clinical scenario.
14.
前进 (2023-09-27 10:56):
#paper doi:10.1109/cvpr.2019.00223 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Noise2Void - Learning Denoising From Single Noisy Images. Deep-learning image denoising is usually trained on pairs of clean and noisy images. Some approaches (N2N) dispense with clean images and train on multiple noisy images instead; this paper goes further and proposes denoising from a single noisy image. The patch-based view of denoising holds that, because of the limited receptive field, each pixel of the output depends only on a region of the input image; many denoising methods, including Noise2Noise (which no longer needs clean targets), build on this view. The authors observe that if, for a single image, a patch is used as the network input and the pixel at the patch center as the target, the network will simply learn the identity map from the input patch center to the output. They therefore design a special receptive field that "erases" the center pixel and asks the network to predict it from its surroundings. This rests on two assumptions: (1) noise at different pixel positions is statistically independent, and (2) the noise has zero mean. Under these assumptions, the predicted center pixel is more likely to be signal than noise.
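The blind-spot trick can be sketched as data preparation: each training input is a patch whose center pixel has been replaced by a random neighbor from the same patch, while the target is the original center value, so the identity map can no longer achieve zero loss. A sketch of this masking; the patch size and the neighbor-sampling scheme are illustrative, not the paper's exact recipe:

```python
import numpy as np

def blind_spot_pairs(noisy, n_samples, patch=9, seed=0):
    """Build (masked patch, original centre pixel) training pairs from ONE noisy image."""
    rng = np.random.default_rng(seed)
    r = patch // 2
    H, W = noisy.shape
    inputs, targets = [], []
    for _ in range(n_samples):
        y = rng.integers(r, H - r)
        x = rng.integers(r, W - r)
        p = noisy[y - r:y + r + 1, x - r:x + r + 1].copy()
        targets.append(p[r, r])              # target: the original (noisy) centre value
        dy, dx = 0, 0
        while dy == 0 and dx == 0:           # pick a neighbour other than the centre
            dy, dx = rng.integers(-r, r + 1, size=2)
        p[r, r] = p[r + dy, r + dx]          # blind spot: centre replaced
        inputs.append(p)
    return np.stack(inputs), np.array(targets)

noisy = np.random.default_rng(1).random((64, 64))
X, y = blind_spot_pairs(noisy, n_samples=100)
assert X.shape == (100, 9, 9) and y.shape == (100,)
```

Because the noise is pixel-independent with zero mean, the masked center is unpredictable from its neighbors except through the underlying signal, which is exactly what the network ends up regressing.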
15.
前进 (2023-01-31 23:30):
#paper Rethinking 1x1 Convolutions: Can we train CNNs with Frozen Random Filters? arXiv:2301.11360 This paper introduces a new convolution block that computes learnable linear combinations (LC) of (frozen random) filters, builds LCResNets from it, and also proposes a new weight-sharing mechanism that dramatically reduces the number of weights. Even in the extreme case where spatial filters are only randomly initialized and never updated, certain CNN architectures can be trained to surpass the accuracy of standard training. By reinterpreting pointwise (1x1) convolution as an operator that learns linear combinations (LC) of frozen (random) spatial filters, the approach not only reaches high test accuracy on CIFAR and ImageNet, but also has favorable properties in terms of model robustness, generalization, sparsity, and the total number of required weights. In addition, the proposed weight-sharing mechanism allows a single weight tensor to be shared across all spatial convolution layers, massively reducing the number of weights.
arXiv, 2023.
Abstract:
Modern CNNs are learning the weights of vast numbers of convolutional operators. In this paper, we raise the fundamental question if this is actually necessary. We show that even in the extreme case of only randomly initializing and never updating spatial filters, certain CNN architectures can be trained to surpass the accuracy of standard training. By reinterpreting the notion of pointwise (1×1) convolutions as an operator to learn linear combinations (LC) of frozen (random) spatial filters, we are able to analyze these effects and propose a generic LC convolution block that allows tuning of the linear combination rate. Empirically, we show that this approach not only allows us to reach high test accuracies on CIFAR and ImageNet but also has favorable properties regarding model robustness, generalization, sparsity, and the total number of necessary weights. Additionally, we propose a novel weight sharing mechanism, which allows sharing of a single weight tensor between all spatial convolution layers to massively reduce the number of weights.
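The reinterpretation is a direct consequence of linearity: a learnable 1x1 combination of the responses of frozen spatial filters equals the response of one combined filter, so training only the combination weights still explores the span of the random filter bank. A numpy check of this identity (the filter bank and sizes below are arbitrary illustrative choices):

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid' 2D cross-correlation."""
    kh, kw = k.shape
    H, W = x.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
bank = rng.standard_normal((4, 3, 3))   # frozen random spatial filters (never trained)
w = rng.standard_normal(4)              # learnable 1x1 (linear-combination) weights

# combining the responses of the frozen filters ...
lc_of_responses = sum(w[k] * conv2d_valid(x, bank[k]) for k in range(4))
# ... is identical to convolving once with the combined filter
combined = np.tensordot(w, bank, axes=1)
assert np.allclose(lc_of_responses, conv2d_valid(x, combined))
```

This is why only the 1x1 weights need gradients: the effective spatial filter is re-shaped by training even though the bank itself stays random.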
16.
前进 (2022-12-31 11:39):
#paper Liu Y, Chen J, Wei S, et al. On Finite Difference Jacobian Computation in Deformable Image Registration[J]. arXiv preprint arXiv:2212.06060, 2022. Producing diffeomorphic spatial transformations has been a central problem in deformable image registration. A diffeomorphic transformation should have a positive Jacobian determinant |J| everywhere, so the number of voxels with |J| < 0 has been used both to test for diffeomorphism and to measure the irregularity of a transformation. For digital transformations, |J| is commonly approximated with central differences, but this strategy can yield positive |J| for transformations that are clearly not diffeomorphic, even at the voxel resolution level. To show this, the paper first investigates the geometric meaning of different finite-difference approximations of |J| and argues that no single finite-difference approximation of |J| suffices to determine diffeomorphism for digital images. It proves that for a 2D transformation, four unique finite-difference approximations of |J| must all be positive to ensure the entire domain is invertible and free of folding at the pixel level; in 3D, ten unique finite-difference approximations of |J| are required to be positive. The proposed digital diffeomorphism criterion resolves several errors inherent in the central-difference approximation of |J| and accurately detects non-diffeomorphic digital transformations.
Abstract:
Producing spatial transformations that are diffeomorphic has been a central problem in deformable image registration. As a diffeomorphic transformation should have positive Jacobian determinant |J| everywhere, the number of voxels with |J|<0 has been used to test for diffeomorphism and also to measure the irregularity of the transformation. For digital transformations, |J| is commonly approximated using central difference, but this strategy can yield positive |J|'s for transformations that are clearly not diffeomorphic -- even at the voxel resolution level. To show this, we first investigate the geometric meaning of different finite difference approximations of |J|. We show that to determine diffeomorphism for digital images, use of any individual finite difference approximations of |J| is insufficient. We show that for a 2D transformation, four unique finite difference approximations of |J|'s must be positive to ensure the entire domain is invertible and free of folding at the pixel level. We also show that in 3D, ten unique finite differences approximations of |J|'s are required to be positive. Our proposed digital diffeomorphism criteria solves several errors inherent in the central difference approximation of |J| and accurately detects non-diffeomorphic digital transformations.
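The failure case of central differences is easy to reproduce: a transformation that folds adjacent pixel columns onto each other has negative forward-difference |J| at half the pixels, yet central differences (which skip the immediate neighbor) report |J| = 1 everywhere. A 2D numpy sketch:

```python
import numpy as np

def jacobian_det_2d(phi, scheme="forward"):
    """|J| of a 2D transformation phi (shape (H, W, 2)) via finite differences."""
    if scheme == "forward":
        dy = phi[1:, :-1] - phi[:-1, :-1]
        dx = phi[:-1, 1:] - phi[:-1, :-1]
    else:  # central
        dy = (phi[2:, 1:-1] - phi[:-2, 1:-1]) / 2.0
        dx = (phi[1:-1, 2:] - phi[1:-1, :-2]) / 2.0
    return dy[..., 0] * dx[..., 1] - dy[..., 1] * dx[..., 0]

# identity grid plus a displacement that folds adjacent columns onto each other
H, W = 16, 16
ys, xs = np.mgrid[0:H, 0:W].astype(float)
phi = np.stack([ys, xs + 0.75 * (-1.0) ** xs], axis=-1)

assert jacobian_det_2d(phi, "forward").min() < 0   # folding is detected
assert jacobian_det_2d(phi, "central").min() > 0   # central difference is blind to it
```

Forward differences in each direction correspond to one of the paper's "unique" approximations; checking all of them (four in 2D, ten in 3D) is what rules out such pixel-level folds.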
17.
前进 (2022-11-28 10:25):
#paper Zhu Y, Lu S. Swin-VoxelMorph: A Symmetric Unsupervised Learning Model for Deformable Medical Image Registration Using Swin Transformer[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2022. Deformable medical image registration, which seeks an invertible one-to-one mapping between images, is widely used in medical image processing. While state-of-the-art registration methods are based on convolutional neural networks, few attempts have been made with Transformers. Existing models neglect to use attention mechanisms to handle long-range cross-image relevance in embedding learning, limiting their ability to identify semantically meaningful correspondences between anatomical structures; although they register quickly, they also ignore the topology preservation and invertibility of the transformation. This paper proposes a novel symmetric unsupervised learning network based on the Swin Transformer that minimizes the dissimilarity between images while estimating both the forward and inverse transformations. Specifically, it proposes a 3D Swin-UNet that uses a hierarchical Swin Transformer with shifted windows as the encoder to extract context features, and designs a symmetric Swin Transformer decoder based on patch expanding to perform up-sampling and estimate the registration fields. In addition, the objective loss functions guarantee substantial diffeomorphic properties of the predicted transformations. The method is validated on two datasets, ADNI and PPMI, achieving state-of-the-art registration accuracy while maintaining desirable diffeomorphic properties.
Abstract:
Deformable medical image registration is widely used in medical image processing with the invertible and one-to-one mapping between images. While state-of-the-art image registration methods are based on convolutional neural networks, few attempts have been made with Transformers which show impressive performance on computer vision tasks. Existing models neglect to employ attention mechanisms to handle the long-range cross-image relevance in embedding learning, limiting such approaches to identify the semantically meaningful correspondence of anatomical structures. These methods also ignore the topology preservation and invertibility of the transformation although they achieve fast image registration. In this paper, we propose a novel, symmetric unsupervised learning network Swin-VoxelMorph based on the Swin Transformer which minimizes the dissimilarity between images and estimates both forward and inverse transformations simultaneously. Specifically, we propose 3D Swin-UNet, which applies hierarchical Swin Transformer with shifted windows as the encoder to extract context features. And a symmetric Swin Transformer-based decoder with patch expanding layer is designed to perform the up-sampling operation to estimate the registration fields. Besides, our objective loss functions can guarantee substantial diffeomorphic properties of the predicted transformations. We verify our method on two datasets including ADNI and PPMI, and it achieves state-of-the-art registration accuracy while maintaining desirable diffeomorphic properties.
18.
前进 (2022-10-30 21:26):
#paper Shi J, He Y, Kong Y, et al. XMorpher: Full Transformer for Deformable Medical Image Registration via Cross Attention[C]// International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2022: 217-226. Existing deep networks focus on feature extraction from a single image and are limited in registration tasks, which are performed on paired images. This paper therefore proposes a novel backbone network, XMorpher, for effective representation of paired features in deformable registration. 1) It proposes a full Transformer architecture with dual parallel feature-extraction networks that exchange information through cross attention, discovering multi-level semantic correspondences while gradually extracting their respective features for effective final registration. 2) It proposes Cross Attention Transformer (CAT) blocks to establish an attention mechanism between images that finds correspondences automatically and prompts features to fuse efficiently in the network. 3) It constrains the attention computation between base windows and search windows of different sizes, thereby focusing on the local transformations of deformable registration while improving computational efficiency. XMorpher gives VoxelMorph a 2.8% improvement on DSC, demonstrating its effective representation of paired-image features in deformable registration.
Abstract:
An effective backbone network is important to deep learning-based Deformable Medical Image Registration (DMIR), because it extracts and matches the features between two images to discover the mutual correspondence for fine registration. However, the existing deep networks focus on single image situation and are limited in registration task which is performed on paired images. Therefore, we advance a novel backbone network, XMorpher, for the effective corresponding feature representation in DMIR. 1) It proposes a novel full transformer architecture including dual parallel feature extraction networks which exchange information through cross attention, thus discovering multi-level semantic correspondence while extracting respective features gradually for final effective registration. 2) It advances the Cross Attention Transformer (CAT) blocks to establish the attention mechanism between images which is able to find the correspondence automatically and prompts the features to fuse efficiently in the network. 3) It constrains the attention computation between base windows and searching windows with different sizes, and thus focuses on the local transformation of deformable registration and enhances the computing efficiency at the same time. Without any bells and whistles, our XMorpher gives Voxelmorph 2.8% improvement on DSC, demonstrating its effective representation of the features from the paired images in DMIR. We believe that our XMorpher has great application potential in more paired medical images. Our XMorpher is open on https://github.com/Solemoon/XMorpher
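The core of the CAT block is cross attention: queries come from one image's (small) base window and keys/values from the other image's (larger) search window, so each token looks for its correspondence in the other image. A minimal single-head sketch under my own assumed shapes and projection matrices (not the paper's implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(feat_a, feat_b, wq, wk, wv):
    """Queries from image A's base window, keys/values from image B's
    search window: each A token attends over candidate matches in B."""
    q, k, v = feat_a @ wq, feat_b @ wk, feat_b @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_base, n_search)
    return softmax(scores, axis=-1) @ v       # fused features for A

rng = np.random.default_rng(0)
d = 8
base = rng.standard_normal((4, d))     # tokens from image A's base window
search = rng.standard_normal((9, d))   # tokens from image B's larger search window
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
out = cross_attention(base, search, wq, wk, wv)  # shape (4, d)
```

Restricting the search window to a local neighborhood is what keeps the computation cheap while matching the locality of deformable registration.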
19.
前进 (2022-09-29 12:12):
#paper Affine Medical Image Registration with Coarse-to-Fine Vision Transformer. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 20835-20844. Affine registration is an indispensable part of a comprehensive medical image registration pipeline, yet few studies focus on fast and robust affine registration algorithms. Most existing work uses CNN models for joint affine and deformable registration, leaving the standalone performance of the affine subnetwork under-explored. Moreover, existing CNN-based affine registration methods focus either on the local misalignment or on the global orientation and position of the inputs to predict the affine transformation matrix, making them sensitive to spatial initialization and limiting their generalizability. This paper proposes C2FViT, a fast and robust learning-based algorithm for 3D affine medical image registration. The method naturally leverages the global connectivity of Transformers, the locality of CNNs, and a multi-resolution strategy to learn global affine registration, and is evaluated on 3D brain atlas registration. Results show that the method performs well in registration accuracy, robustness, registration speed, and generalizability.
Abstract:
Affine registration is indispensable in a comprehensive medical image registration pipeline. However, only a few studies focus on fast and robust affine registration algorithms. Most of these studies utilize convolutional neural networks (CNNs) to learn joint affine and non-parametric registration, while the standalone performance of the affine subnetwork is less explored. Moreover, existing CNN-based affine registration approaches focus either on the local misalignment or the global orientation and position of the input to predict the affine transformation matrix, which are sensitive to spatial initialization and exhibit limited generalizability apart from the training dataset. In this paper, we present a fast and robust learning-based algorithm, Coarse-to-Fine Vision Transformer (C2FViT), for 3D affine medical image registration. Our method naturally leverages the global connectivity and locality of the convolutional vision transformer and the multi-resolution strategy to learn the global affine registration. We evaluate our method on 3D brain atlas registration and template-matching normalization. Comprehensive results demonstrate that our method is superior to the existing CNNs-based affine registration methods in terms of registration accuracy, robustness and generalizability while preserving the runtime advantage of the learning-based methods. The source code is available at this https URL. <<<
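The coarse-to-fine idea can be pictured as each resolution level predicting an affine transform that refines the coarser level's estimate, with the final transform being their composition. A small sketch in homogeneous coordinates (the per-stage matrices here are made-up illustrative values, not the paper's outputs):

```python
import numpy as np

def affine_matrix(A, t):
    """Build a 4x4 homogeneous matrix from a 3x3 linear part and a translation."""
    M = np.eye(4)
    M[:3, :3] = A
    M[:3, 3] = t
    return M

# Hypothetical per-stage predictions, coarse to fine.
coarse = affine_matrix(np.eye(3) * 0.9, [10.0, 0.0, 0.0])   # rough scale + shift
mid    = affine_matrix(np.eye(3),       [-1.0, 2.0, 0.0])   # translation refinement
fine   = affine_matrix(np.eye(3) * 1.02, [0.1, -0.2, 0.3])  # small final correction

# Compose coarse-to-fine: acting on column vectors, the coarse
# alignment is applied first, then each successive refinement.
final = fine @ mid @ coarse
point = np.array([1.0, 2.0, 3.0, 1.0])   # a 3D point in homogeneous form
warped = final @ point
```

Because each stage only has to correct the residual left by the previous one, the fine stages can stay local while the coarse stage handles large global misalignment.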
20.
前进 (2022-08-24 22:22):
#paper arXiv:2208.04939v1, 2022, U-Net vs Transformer: Is U-Net Outdated in Medical Image Registration? Transformer-based networks have become increasingly popular in deformable image registration because of their long-range modeling capability. This paper argues, however, that the receptive field of a 5-layer convolutional U-Net is sufficient to capture accurate deformations without relying on long-range modeling. The study asks whether U-Net is outdated for medical image registration compared with modern Transformer-based methods. To this end, the authors propose a large-kernel U-Net (LKU-Net), which enlarges the effective receptive field by embedding parallel convolutional blocks into a vanilla U-Net. In atlas-based registration experiments on the public 3D IXI brain dataset, LKU-Net performs on par with or better than today's state-of-the-art Transformer-based methods while using only 1.12% of TransMorph's parameters and 10.8% of its mult-adds. The authors further applied the method in the MICCAI 2021 registration challenge, where it likewise outperformed TransMorph and currently ranks first. With only simple modifications to U-Net, a U-Net-based registration algorithm still achieves state-of-the-art results, showing that U-Net-based registration networks are not outdated.
Abstract:
Due to their extreme long-range modeling capability, vision transformer-based networks have become increasingly popular in deformable image registration. We believe, however, that the receptive field of a 5-layer convolutional U-Net is sufficient to capture accurate deformations without needing long-range dependencies. The purpose of this study is therefore to investigate whether U-Net-based methods are outdated compared to modern transformer-based approaches when applied to medical image registration. For this, we propose a large kernel U-Net (LKU-Net) by embedding a parallel convolutional block to a vanilla U-Net in order to enhance the effective receptive field. On the public 3D IXI brain dataset for atlas-based registration, we show that the performance of the vanilla U-Net is already comparable with that of state-of-the-art transformer-based networks (such as TransMorph), and that the proposed LKU-Net outperforms TransMorph by using only 1.12% of its parameters and 10.8% of its mult-adds operations. We further evaluate LKU-Net on a MICCAI Learn2Reg 2021 challenge dataset for inter-subject registration, our LKU-Net also outperforms TransMorph on this dataset and ranks first on the public leaderboard as of the submission of this work. With only modest modifications to the vanilla U-Net, we show that U-Net can outperform transformer-based architectures on inter-subject and atlas-based 3D medical image registration. Code is available at this https URL.
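The receptive-field argument is easy to check with standard receptive-field arithmetic. A back-of-the-envelope sketch (my own; the kernel/stride layout is illustrative, not the paper's exact architecture):

```python
def receptive_field(layers):
    """Effective receptive field after a stack of layers, each given as
    (kernel_size, stride), using r <- r + (k - 1) * j and jump j <- j * s."""
    r, j = 1, 1
    for k, s in layers:
        r += (k - 1) * j
        j *= s
    return r

# Hypothetical 5-level encoders: one conv per level followed by a
# stride-2 downsampling step.
plain = [(3, 1), (2, 2)] * 5          # 3x3 convs only
large_kernel = [(5, 1), (2, 2)] * 5   # LKU-Net-style larger 5x5 kernels
rf_plain = receptive_field(plain)      # 94 input voxels across
rf_lk = receptive_field(large_kernel)  # 156 input voxels across
```

Even the plain stack already spans roughly a hundred voxels, and the large-kernel variant widens it substantially more, which supports the claim that long-range attention is not a prerequisite for capturing typical deformations.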