响马读paper

An academic exchange community whose members are required to read and log at least one paper per month

2023, arXiv. arXiv ID: 2301.11360
Rethinking 1x1 Convolutions: Can we train CNNs with Frozen Random Filters?
Paul Gavrikov, Janis Keuper
Abstract:
Modern CNNs are learning the weights of vast numbers of convolutional operators. In this paper, we raise the fundamental question if this is actually necessary. We show that even in the extreme case of only randomly initializing and never updating spatial filters, certain CNN architectures can be trained to surpass the accuracy of standard training. By reinterpreting the notion of pointwise (1×1) convolutions as an operator to learn linear combinations (LC) of frozen (random) spatial filters, we are able to analyze these effects and propose a generic LC convolution block that allows tuning of the linear combination rate. Empirically, we show that this approach not only allows us to reach high test accuracies on CIFAR and ImageNet but also has favorable properties regarding model robustness, generalization, sparsity, and the total number of necessary weights. Additionally, we propose a novel weight sharing mechanism, which allows sharing of a single weight tensor between all spatial convolution layers to massively reduce the number of weights.
2023-01-31 23:30:00
#paper Rethinking 1x1 Convolutions: Can we train CNNs with Frozen Random Filters? arXiv:2301.11360 This paper introduces a new convolutional block that computes learnable linear combinations (LC) of frozen random filters, builds LCResNets on top of it, and also proposes a new weight sharing mechanism that greatly reduces the number of weights. The authors show that even in the extreme case where spatial filters are only randomly initialized and never updated, certain CNN architectures can be trained to surpass the accuracy of standard training. By reinterpreting the pointwise (1x1) convolution as an operator that learns linear combinations (LC) of frozen (random) spatial filters, this approach not only reaches high test accuracies on CIFAR and ImageNet, but also has favorable properties in terms of model robustness, generalization, sparsity, and the total number of required weights. In addition, the paper proposes a novel weight sharing mechanism that allows a single weight tensor to be shared across all spatial convolution layers, massively reducing the number of weights.
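To make the LC idea concrete, here is a minimal PyTorch sketch (not the authors' code; the names LCBlock, n_filters, and shared_weight are illustrative assumptions): a frozen, randomly initialized spatial convolution followed by a trainable 1x1 convolution that learns linear combinations of the frozen filter responses, with an optional hook for sharing one frozen weight tensor across blocks.

```python
import torch
import torch.nn as nn

class LCBlock(nn.Module):
    """Sketch of a linear-combination (LC) convolution block:
    frozen random spatial filters, mixed by a learnable 1x1 conv."""

    def __init__(self, in_channels, out_channels, n_filters=None,
                 kernel_size=3, shared_weight=None):
        super().__init__()
        # n_filters controls how many frozen spatial responses are
        # available for mixing (a stand-in for the paper's LC rate).
        n_filters = n_filters or out_channels
        self.spatial = nn.Conv2d(in_channels, n_filters, kernel_size,
                                 padding=kernel_size // 2, bias=False)
        if shared_weight is not None:
            # Hypothetical weight sharing: reuse a single frozen tensor
            # (an nn.Parameter of shape [n_filters, in_channels, k, k])
            # across all spatial conv layers with matching shapes.
            self.spatial.weight = shared_weight
        # Freeze the random spatial filters; they are never updated.
        self.spatial.weight.requires_grad_(False)
        # Only the pointwise (1x1) conv is trained: it learns linear
        # combinations of the frozen spatial filter outputs.
        self.pointwise = nn.Conv2d(n_filters, out_channels,
                                   kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.spatial(x))

# Basic usage:
x = torch.randn(1, 16, 32, 32)
y = LCBlock(16, 32)(x)
print(y.shape)  # torch.Size([1, 32, 32, 32])

# Sharing one frozen tensor between two blocks (shapes must match):
shared = nn.Parameter(torch.randn(64, 64, 3, 3), requires_grad=False)
b1 = LCBlock(64, 64, shared_weight=shared)
b2 = LCBlock(64, 64, shared_weight=shared)
assert b1.spatial.weight is b2.spatial.weight
```

Under this reading, only the 1x1 weights (and the single shared spatial tensor, if used) count toward the trainable parameters, which is what drives the paper's reported weight reduction.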