Feature Transformation and Metric Networks for Few-shot Learning
doi: 10.16383/j.aas.c210903
1. University of Science and Technology of China, Hefei 230026
2. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190
3. Alibaba (Beijing) Group, Beijing 100016
Abstract: For few-shot classification, training samples for each class are highly limited. Consequently, samples from the same class distribute sparsely and boundaries between different classes are indistinct in the feature space. To address this, a novel algorithm based on feature transformation and metric networks (FTMN) is proposed for few-shot classification. The algorithm maps samples to the feature space through an embedding function and calculates the residual between each input feature and its class center. A feature transformation function is then constructed to learn from this residual, enabling input features to move closer to their class centers after transformation. The transformed features are used to update the class centers, increasing the distance between centers of different classes. Furthermore, the algorithm introduces a novel metric function that jointly expresses the metric distances of all local feature points within the sample features, optimizing the angle (cosine similarity) and the Euclidean distance between features simultaneously. The performance of the algorithm on commonly used few-shot classification datasets validates its effectiveness and generalization ability.
Key words:
- Feature transformation /
- metric learning /
- few-shot learning /
- residual learning
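
The abstract outlines the core pipeline: embed samples, compute each sample's residual to its class center, transform features toward that center, and re-estimate the centers from the transformed features. Below is a minimal NumPy sketch of this flow; the transformation g (here a fixed 0.5 scaling), the shapes, and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def prototypes(features, labels, n_way):
    """Class centers: per-class mean of embedded support features."""
    return np.stack([features[labels == c].mean(axis=0) for c in range(n_way)])

def transform_toward_centers(features, labels, centers, g):
    """Compute each feature's residual to its own class center and
    subtract the correction g(residual), pulling the feature toward
    the center; g stands in for the paper's learned transformation."""
    residual = features - centers[labels]
    return features - g(residual)

# Toy 5-way 5-shot episode with random "embedded" features.
rng = np.random.default_rng(0)
n_way, k_shot, dim = 5, 5, 64
feats = rng.normal(size=(n_way * k_shot, dim))
labels = np.repeat(np.arange(n_way), k_shot)

centers = prototypes(feats, labels, n_way)
g = lambda r: 0.5 * r                       # placeholder for the learned function
feats_t = transform_toward_centers(feats, labels, centers, g)
centers_t = prototypes(feats_t, labels, n_way)   # updated class centers
```

With this placeholder g, every feature moves halfway toward its own center, so within-class scatter shrinks while the recomputed centers stay put; in the paper the transformation is learned end-to-end so that the updated centers also move farther apart.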
Table 1 Embedding function and important structures of networks
| Model | Embedding function | Key structures |
|---|---|---|
| MN | 4-layer ConvNet | attention-based LSTM |
| ProtoNet[12] | 4-layer ConvNet | "prototype" concept; Euclidean distance metric |
| RN | 4-layer ConvNet | CNN as the metric function |
| EGNN | 4-layer ConvNet | edge labels for node classification |
| EGNN + Transduction[22] | ResNet-12 | edge labels for node classification; transduction and label propagation |
| DN4[24] | ResNet-12 | local descriptors; image-to-class similarity measure |
| DC[25] | 4-layer ConvNet | dense classification |
| DC + IMP[25] | 4-layer ConvNet | dense classification; network implanting |
| FTMN | 4-layer ConvNet | feature transformation module; feature metric module |
| FTMN-R12 | ResNet-12 | feature transformation module; feature metric module |
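
Most models in Table 1 share the standard 4-layer convolutional embedding (often called Conv-4 or Conv-64F). A sketch of that backbone is given below, assuming PyTorch and the common 64-filters-per-layer configuration; details such as padding and pooling vary between papers.

```python
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """One Conv-BN-ReLU-MaxPool stage of the Conv-4 embedding."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class Conv4Embedding(nn.Module):
    """4-layer convolutional embedding mapping an RGB image to a map of
    local descriptors (each spatial position is one local feature point)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            conv_block(3, 64), conv_block(64, 64),
            conv_block(64, 64), conv_block(64, 64),
        )

    def forward(self, x):          # x: (batch, 3, H, W)
        return self.encoder(x)     # (batch, 64, H/16, W/16)
```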
Table 2 Few-shot classification performance on Omniglot dataset (%)
| Model | 5-way 1-shot | 5-way 5-shot | 20-way 1-shot | 20-way 5-shot |
|---|---|---|---|---|
| MN | 98.1 | 98.9 | 93.8 | 98.5 |
| ProtoNet[12] | 98.8 | 99.7 | 96.0 | 98.9 |
| SN | 97.3 | 98.4 | 88.2 | 97.0 |
| RN | 99.6 ± 0.2 | 99.8 ± 0.1 | 97.6 ± 0.2 | 99.1 ± 0.1 |
| SM[15] | 98.4 | 99.6 | 95.0 | 98.6 |
| MetaNet[16] | 98.95 | — | 97.00 | — |
| MANN[17] | 82.8 | 94.9 | — | — |
| MAML[18] | 98.7 ± 0.4 | 99.9 ± 0.1 | 95.8 ± 0.3 | 98.9 ± 0.2 |
| MMNet[26] | 99.28 ± 0.08 | 99.77 ± 0.04 | 97.16 ± 0.10 | 98.93 ± 0.05 |
| FTMN | 99.7 ± 0.1 | 99.9 ± 0.1 | 98.3 ± 0.1 | 99.5 ± 0.1 |
Table 3 Few-shot classification performance on miniImageNet dataset (%)
| Model | 5-way 1-shot | 5-way 5-shot |
|---|---|---|
| MN | 43.40 ± 0.78 | 51.09 ± 0.71 |
| ML-LSTM[11] | 43.56 ± 0.84 | 55.31 ± 0.73 |
| ProtoNet[12] | 49.42 ± 0.78 | 68.20 ± 0.66 |
| RN | 50.44 ± 0.82 | 65.32 ± 0.70 |
| MetaNet[16] | 49.21 ± 0.96 | — |
| MAML[18] | 48.70 ± 1.84 | 63.11 ± 0.92 |
| EGNN | — | 66.85 |
| EGNN + Transduction[22] | — | 76.37 |
| DN4[24] | 51.24 ± 0.74 | 71.02 ± 0.64 |
| DC[25] | 62.53 ± 0.19 | 78.95 ± 0.13 |
| DC + IMP[25] | — | 79.77 ± 0.19 |
| MMNet[26] | 53.37 ± 0.08 | 66.97 ± 0.09 |
| PredictNet[27] | 54.53 ± 0.40 | 67.87 ± 0.20 |
| DynamicNet[28] | 56.20 ± 0.86 | 72.81 ± 0.62 |
| MN-FCE[29] | 43.44 ± 0.77 | 60.60 ± 0.71 |
| MetaOptNet[30] | 60.64 ± 0.61 | 78.63 ± 0.46 |
| FTMN | 59.86 ± 0.91 | 75.96 ± 0.82 |
| FTMN-R12 | 61.33 ± 0.21 | 79.59 ± 0.47 |
Table 4 Few-shot classification performance on CUB-200, CIFAR-FS and tieredImageNet datasets (%)
| Model | CUB-200 5-way 1-shot | CUB-200 5-way 5-shot | CIFAR-FS 5-way 1-shot | CIFAR-FS 5-way 5-shot | tieredImageNet 5-way 1-shot | tieredImageNet 5-way 5-shot |
|---|---|---|---|---|---|---|
| MN | 61.16 ± 0.89 | 72.86 ± 0.70 | — | — | — | — |
| ProtoNet[12] | 51.31 ± 0.91 | 70.77 ± 0.69 | 55.5 ± 0.7 | 72.0 ± 0.6 | 53.31 ± 0.89 | 72.69 ± 0.74 |
| RN | 62.45 ± 0.98 | 76.11 ± 0.69 | 55.0 ± 1.0 | 69.3 ± 0.8 | 54.48 ± 0.93 | 71.32 ± 0.78 |
| MAML[18] | 55.92 ± 0.95 | 72.09 ± 0.76 | 58.9 ± 1.9 | 71.5 ± 1.0 | 51.67 ± 1.81 | 70.30 ± 1.75 |
| EGNN | — | — | — | — | 63.52 ± 0.52 | 80.24 ± 0.49 |
| DN4[24] | 53.15 ± 0.84 | 81.90 ± 0.60 | — | — | — | — |
| MetaOptNet[30] | — | — | 72.0 ± 0.7 | 84.2 ± 0.5 | 65.99 ± 0.72 | 81.56 ± 0.53 |
| FTMN-R12 | 69.58 ± 0.36 | 85.46 ± 0.79 | 70.3 ± 0.5 | 82.6 ± 0.3 | 62.14 ± 0.63 | 81.74 ± 0.33 |
Table 5 Results of ablation study (%)
| Model | 5-way 1-shot | 5-way 5-shot |
|---|---|---|
| ProtoNet-4C | 49.42 ± 0.78 | 68.20 ± 0.66 |
| ProtoNet-8C | 51.18 ± 0.73 | 70.23 ± 0.46 |
| ProtoNet-Trans-4C | 53.47 ± 0.46 | 71.33 ± 0.23 |
| ProtoNet-M-4C | 56.54 ± 0.57 | 73.46 ± 0.53 |
| ProtoNet-VLAD-4C | 52.46 ± 0.67 | 70.83 ± 0.62 |
| Trans*-M-4C | 59.86 ± 0.91 | 67.86 ± 0.56 |
| Cosine similarity only | 54.62 ± 0.57 | 72.58 ± 0.38 |
| Euclidean distance only | 55.66 ± 0.67 | 73.34 ± 0.74 |
| FTMN | 59.86 ± 0.91 | 75.96 ± 0.82 |
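
The last three rows of Table 5 compare cosine-only and Euclidean-only variants against the full metric module. The sketch below shows one simple way to couple the two signals over local feature points, assuming a fixed weight alpha; the paper's actual joint metric function is learned and is not reproduced here.

```python
import numpy as np

def joint_metric(query, center, alpha=0.5, eps=1e-8):
    """Combine an angular term and a Euclidean term per local feature
    point; query and center are (n_points, dim) local feature maps."""
    dot = np.sum(query * center, axis=1)
    norms = np.linalg.norm(query, axis=1) * np.linalg.norm(center, axis=1)
    cos_dist = 1.0 - dot / (norms + eps)                # angle between features
    euc_dist = np.linalg.norm(query - center, axis=1)   # Euclidean distance
    return float(np.mean(alpha * cos_dist + (1.0 - alpha) * euc_dist))

# Classify a query by the nearest class center under the joint metric.
rng = np.random.default_rng(1)
centers = rng.normal(size=(5, 36, 64))   # 5 classes, 36 local points, 64-d
query = rng.normal(size=(36, 64))
pred = int(np.argmin([joint_metric(query, c) for c in centers]))
```

Because both terms are averaged over every local feature point, the score jointly expresses per-point distances, consistent with the ablation's finding that neither signal alone matches the combined metric.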
[1] Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, et al. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015. 1–9
[2] He K M, Zhang X Y, Ren S Q, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 770–778
[3] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 26th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: NIPS, 2012. 1106–1114
[4] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In: Proceedings of the 3rd International Conference on Learning Representations. San Diego, USA: ICLR, 2015.
[5] Liu Ying, Lei Yan-Bo, Fan Jiu-Lun, Wang Fu-Ping, Gong Yan-Chao, Tian Qi. Survey on image classification technology based on small sample learning. Acta Automatica Sinica, 2021, 47(2): 297–315. doi: 10.16383/j.aas.c190720
[6] Miller E G, Matsakis N E, Viola P A. Learning from one example through shared densities on transforms. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Hilton Head Island, USA: IEEE, 2000. 464–471
[7] Li F F, Fergus R, Perona P. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(4): 594–611
[8] Lake B M, Salakhutdinov R, Gross J, Tenenbaum J B. One shot learning of simple visual concepts. In: Proceedings of the 33rd Annual Meeting of the Cognitive Science Society. Boston, USA: CogSci, 2011. 2568–2573
[9] Lake B M, Salakhutdinov R, Tenenbaum J B. Human-level concept learning through probabilistic program induction. Science, 2015, 350(6266): 1332–1338
[10] Edwards H, Storkey A J. Towards a neural statistician. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
[11] Vinyals O, Blundell C, Lillicrap T, Kavukcuoglu K, Wierstra D. Matching networks for one shot learning. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: NIPS, 2016. 3637–3645
[12] Snell J, Swersky K, Zemel R. Prototypical networks for few-shot learning. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: NIPS, 2017. 4080–4090
[13] Koch G, Zemel R, Salakhutdinov R. Siamese neural networks for one-shot image recognition. In: Proceedings of the 32nd International Conference on Machine Learning. Lille, France: JMLR, 2015.
[14] Sung F, Yang Y X, Zhang L, Xiang T, Torr P H S, Hospedales T M. Learning to compare: Relation network for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 1199–1208
[15] Kaiser L, Nachum O, Roy A, Bengio S. Learning to remember rare events. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
[16] Munkhdalai T, Yu H. Meta networks. In: Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: JMLR.org, 2017. 2554–2563
[17] Santoro A, Bartunov S, Botvinick M, Wierstra D, Lillicrap T. Meta-learning with memory-augmented neural networks. In: Proceedings of the 33rd International Conference on Machine Learning. New York, USA: PMLR, 2016. 1842–1850
[18] Finn C, Abbeel P, Levine S. Model-agnostic meta-learning for fast adaptation of deep networks. In: Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: JMLR.org, 2017. 1126–1135
[19] Arandjelovic R, Gronat P, Torii A, Pajdla T, Sivic J. NetVLAD: CNN architecture for weakly supervised place recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 5297–5307
[20] Jégou H, Douze M, Schmid C, Pérez P. Aggregating local descriptors into a compact image representation. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE, 2010. 3304–3311
[21] Bertinetto L, Henriques J F, Torr P H, Vedaldi A. Meta-learning with differentiable closed-form solvers. In: Proceedings of the 7th International Conference on Learning Representations. New Orleans, USA: ICLR, 2019.
[22] Kim J, Kim T, Kim S, Yoo C D. Edge-labeling graph neural network for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 11–20
[23] Yue Z Q, Zhang H W, Sun Q R, Hua X S. Interventional few-shot learning. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Incorporated, 2020. Article No. 230
[24] Li W B, Wang L, Xu J L, Huo J, Gao Y, Luo J B. Revisiting local descriptor based image-to-class measure for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 7253–7260
[25] Lifchitz Y, Avrithis Y, Picard S, Bursuc A. Dense classification and implanting for few-shot learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 9250–9259
[26] Cai Q, Pan Y W, Yao T, Yan C G, Mei T. Memory matching networks for one-shot image recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 4080–4088
[27] Qiao S Y, Liu C X, Shen W, Yuille A L. Few-shot image recognition by predicting parameters from activations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 7229–7238
[28] Gidaris S, Komodakis N. Dynamic few-shot visual learning without forgetting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 4367–4375
[29] Ravi S, Larochelle H. Optimization as a model for few-shot learning. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: ICLR, 2017.
[30] Lee K, Maji S, Ravichandran A, Soatto S. Meta-learning with differentiable convex optimization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 10649–10657