Multi-stage Attention-based Capsule Networks for Image Classification

SONG Yan, WANG Yong

Citation: Song Yan, Wang Yong. Multi-stage attention-based capsule networks for image classification. Acta Automatica Sinica, 2024, 50(9): 1804–1817 doi: 10.16383/j.aas.c210012

doi: 10.16383/j.aas.c210012 cstr: 32138.14.j.aas.c210012

Funds: Supported by National Natural Science Foundation of China (62073223), Natural Science Foundation of Shanghai (22ZR1443400), and Open Project of Key Laboratory of Aerospace Flight Dynamics and National Defense Science and Technology (6142210200304)

Author Bio:

SONG Yan  Professor at University of Shanghai for Science and Technology. She received her bachelor's degree from Jilin University in 2001, her master's degree from University of Electronic Science and Technology of China in 2005, and her Ph.D. degree from Shanghai Jiao Tong University in 2013. Her research interests cover pattern recognition, data analysis, and predictive control. Corresponding author of this paper. E-mail: sonya@usst.edu.cn

WANG Yong  Master's student at University of Shanghai for Science and Technology. He received his bachelor's degree from Western Anhui University in 2019. His main research interest is image processing. E-mail: 18856496454@163.com

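Abstract: To address the insufficient feature extraction of the traditional capsule network (CapsNet), a multi-stage attention-based capsule network model for image classification is proposed. First, in the convolutional layers, spatial attention (SA) and channel attention (CA) are applied to the low-level and high-level features, respectively, to extract effective features. Then, a vector attention (VA) mechanism is proposed for the dynamic routing layer; it increases the attention paid to important capsules and thereby improves the accuracy with which low-level capsules predict high-level capsules. Finally, comparative image classification experiments are carried out on five public datasets. The results show that the proposed CapsNet model outperforms other capsule network models in classification accuracy and robustness, and that it also performs well in reconstructing affine-transformed images.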
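The abstract names spatial and channel attention but does not give their formulas on this page. As a rough illustration only, the sketch below follows the general CBAM-style recipe of reference [28], which is an assumption and not necessarily the authors' formulation. All function names, shapes, and the reduction ratio are hypothetical, and the spatial branch replaces CBAM's convolution with a per-pixel weighted sum to stay short. The vector attention (VA) mechanism is not sketched because its definition is not available here.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def spatial_attention(feat, w_avg=1.0, w_max=1.0, bias=0.0):
    """Reweight each spatial position of a (C, H, W) feature map.

    Simplification (assumption): CBAM applies a convolution to the pooled
    maps; here a per-pixel weighted sum (w_avg, w_max, bias) stands in.
    """
    avg_map = feat.mean(axis=0)            # (H, W): average over channels
    max_map = feat.max(axis=0)             # (H, W): maximum over channels
    mask = sigmoid(w_avg * avg_map + w_max * max_map + bias)
    return feat * mask[None, :, :]         # broadcast the mask over channels


def channel_attention(feat, w1, w2):
    """Reweight each channel of a (C, H, W) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); together they
    form a shared two-layer bottleneck MLP.
    """
    avg_vec = feat.mean(axis=(1, 2))       # (C,): global average pooling
    max_vec = feat.max(axis=(1, 2))        # (C,): global max pooling

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)    # ReLU bottleneck

    weights = sigmoid(mlp(avg_vec) + mlp(max_vec))   # (C,), each in (0, 1)
    return feat * weights[:, None, None]


rng = np.random.default_rng(0)
low_level = rng.standard_normal((8, 6, 6))      # hypothetical early-layer features
high_level = rng.standard_normal((16, 3, 3))    # hypothetical late-layer features
w1 = 0.1 * rng.standard_normal((4, 16))         # reduction ratio r = 4 (assumed)
w2 = 0.1 * rng.standard_normal((16, 4))

low_att = spatial_attention(low_level)          # SA on low-level features
high_att = channel_attention(high_level, w1, w2)  # CA on high-level features
print(low_att.shape, high_att.shape)
```

Both functions keep the input shape and only scale activations by factors between 0 and 1, so they can be dropped between convolutional stages without changing the layers that follow.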
Fig. 1  The structure of CapsNet

Fig. 2  A capsule network model of multi-stage attention

Fig. 3  CA mechanism and SA mechanism

Fig. 4  Vector attention mechanism

Fig. 5  Image reconstruction

Fig. 6  Iteration curves of different improvement modules on five datasets

Fig. 7  Original image and affine-transformed image

Fig. 8  Comparison of robustness of different models

Fig. 9  Comparison of the real images from the MNIST dataset, the reconstructions from a conventional capsule network, and the reconstructions from our model

Fig. 10  Comparison of the real images from the Fashion-MNIST dataset, the reconstructions from a conventional capsule network, and the reconstructions from our model

Fig. 11  Comparison of the real images from the CIFAR-10 dataset, the reconstructions from a conventional capsule network, and the reconstructions from our model

Fig. 12  Comparison of the real images from the SVHN dataset, the reconstructions from a conventional capsule network, and the reconstructions from our model

Fig. 13  Comparison of the real images from the smallNORB dataset, the reconstructions from a conventional capsule network, and the reconstructions from our model

Fig. 14  Original image and affine-transformed images of the MNIST dataset

Fig. 15  Comparison of reconstructions of Fig. 14(b)

Fig. 16  Comparison of reconstructions of Fig. 14(c)

Fig. 17  Comparison of reconstruction loss curves between our model and the CapsNet in [10]

Table 1  Classification error rates of different improvement modules on five datasets (%)

| Model | MNIST | Fashion-MNIST | CIFAR-10 | SVHN | smallNORB |
|---|---|---|---|---|---|
| Baseline | 0.38 | 7.11 | 21.21 | 5.12 | 5.62 |
| Baseline + (SA + CA) | 0.32 | 5.54 | 11.69 | 4.61 | 5.07 |
| Baseline + VA | 0.28 | 5.53 | 14.65 | 4.99 | 5.21 |
| Baseline + (SA + CA + VA) | 0.22 | 4.63 | 9.99 | 4.08 | 4.89 |
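To make the ablation in Table 1 easier to read, the following sketch computes how much of the baseline error each variant removes on each dataset. The numbers are copied from Table 1; the helper names are hypothetical, and relative error reduction is just one convenient way to summarize the table, not a metric reported by the authors.

```python
# Error rates (%) copied from Table 1, in the column order listed in DATASETS.
DATASETS = ["MNIST", "Fashion-MNIST", "CIFAR-10", "SVHN", "smallNORB"]
ERROR_RATES = {
    "Baseline": [0.38, 7.11, 21.21, 5.12, 5.62],
    "Baseline + (SA + CA)": [0.32, 5.54, 11.69, 4.61, 5.07],
    "Baseline + VA": [0.28, 5.53, 14.65, 4.99, 5.21],
    "Baseline + (SA + CA + VA)": [0.22, 4.63, 9.99, 4.08, 4.89],
}


def relative_reduction(baseline, improved):
    """Fraction of the baseline error removed by the improved model."""
    return (baseline - improved) / baseline


def reductions_vs_baseline(rates):
    """Map each non-baseline variant to its per-dataset relative reductions."""
    base = rates["Baseline"]
    return {
        name: [relative_reduction(b, v) for b, v in zip(base, values)]
        for name, values in rates.items()
        if name != "Baseline"
    }


for name, fracs in reductions_vs_baseline(ERROR_RATES).items():
    cells = ", ".join(f"{d}: {100 * f:.1f}%" for d, f in zip(DATASETS, fracs))
    print(f"{name}: {cells}")
```

Read this way, the full model removes roughly half of the baseline error on CIFAR-10 (21.21% down to 9.99%), while the gain on smallNORB is much smaller (5.62% down to 4.89%).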

Table 2  Classification error rates of different models on five datasets (%)

| Model | MNIST | Fashion-MNIST | CIFAR-10 | SVHN | smallNORB |
|---|---|---|---|---|---|
| Nair et al.'s CapsNet [5] | 0.50 | 10.20 | 31.47 | 8.94 | – |
| HitNet [7] | 0.32 | 7.70 | 26.70 | 5.50 | – |
| Matrix Capsule EM-routing [9] | 0.70 | 5.97 | 16.79 | 9.64 | 5.20 |
| SACN [10] | 0.50 | 5.98 | 16.65 | 5.01 | 7.79 |
| AR-CapsNet [11] | 0.54 | – | 12.71 | – | – |
| DCNet [30] | 0.25 | 5.36 | 17.37 | 4.42 | 5.57 |
| MS-CapsNet [31] | – | 6.01 | 18.81 | – | – |
| VB-routing [32] | – | 5.20 | 11.20 | 4.75 | 1.60 |
| Aff-CapsNets [33] | 0.46 | 7.47 | 23.72 | 7.85 | – |
| Proposed model | 0.22 | 4.63 | 9.99 | 4.08 | 4.89 |

Table 3  Robustness comparison of different models (error rate, %)

| Model | MNIST | MNIST-rotation |
|---|---|---|
| CNN | 0.74 | 5.52 |
| CapsNet [6] | 0.38 | 2.11 |
| EM-routing [9] | 0.43 | 2.65 |
| Proposed model | 0.22 | 0.63 |
[1] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the Conference on Neural Information Processing Systems. Lake Tahoe, USA: NIPS, 2012. 1097–1105
[2] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In: Proceedings of the International Conference on Learning Representations. San Diego, USA: ICLR, 2015. 1–14
[3] Howard A G, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv: 1704.04861, 2017
[4] Huang G, Liu Z, Van Der Maaten L, Weinberger K Q. Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 2261–2269
[5] Nair P, Doshi R, Keselj S. Pushing the limits of capsule networks. arXiv preprint arXiv: 2103.08074, 2021
[6] Sabour S, Frosst N, Hinton G E. Dynamic routing between capsules. In: Proceedings of the Conference on Neural Information Processing Systems. Long Beach, USA: NIPS, 2017. 3856–3866
[7] Deliege A, Cioppa A, Van Droogenbroeck M. HitNet: A neural network with capsules embedded in a hit-or-miss layer, extended with hybrid data augmentation and ghost capsules. arXiv preprint arXiv: 1806.06519, 2018
[8] Xi E, Bing S, Jin Y. Capsule network performance on complex data. arXiv preprint arXiv: 1712.03480, 2017
[9] Hinton G E, Sabour S, Frosst N. Matrix capsules with EM routing. In: Proceedings of the International Conference on Learning Representations. Vancouver, Canada: ICLR, 2018. 1–15
[10] Hoogi A, Wilcox B, Gupta Y, Rubin D L. Self-attention capsule networks for object classification. arXiv preprint arXiv: 1904.12483, 2019
[11] Choi J, Seo H, Im S, Kang M. Attention routing between capsules. In: Proceedings of the IEEE International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 1981–1989
[12] Wang X, Tu Z, Zhang M. Incorporating statistical machine translation word knowledge into neural machine translation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018, 26(12): 2255–2266 doi: 10.1109/TASLP.2018.2860287
[13] Zhang B, Xiong D, Su J. Neural machine translation with deep attention. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 42(1): 154–163
[14] Zhang B, Xiong D, Xie J, Su J. Neural machine translation with GRU-gated attention model. IEEE Transactions on Neural Networks and Learning Systems, 2020, 31(11): 4688–4698 doi: 10.1109/TNNLS.2019.2957276
[15] Wang Jin-Jia, Ji Shao-Nan, Cui Lin, Xia Jing, Yang Qian. Identification of family activities based on attention capsule network. Acta Automatica Sinica, 2019, 45(11): 2199–2204
[16] Xu K, Ba J, Kiros R, Cho K, Courville A, Salakhutdinov R, et al. Show, attend and tell: Neural image caption generation with visual attention. In: Proceedings of the International Conference on Machine Learning. Lugano, Switzerland: ICML, 2015. 2048–2057
[17] Gao L, Li X, Song J, Shen H T. Hierarchical LSTMs with adaptive attention for visual captioning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 42(5): 1112–1131
[18] Lu X, Wang B, Zheng X. Sound active attention framework for remote sensing image captioning. IEEE Transactions on Geoscience and Remote Sensing, 2019, 58(3): 1985–2000
[19] Wang X, Duan H. Hierarchical visual attention model for saliency detection inspired by avian pathways. IEEE/CAA Journal of Automatica Sinica, 2017, 6(2): 540–552
[20] Xu H, Saenko K. Ask, attend and answer: Exploring question-guided spatial attention for visual question answering. In: Proceedings of the European Conference on Computer Vision. Amsterdam, The Netherlands: ECCV, 2016. 451–466
[21] Liang J, Jiang L, Cao L, Kalantidis Y, Li L J, Hauptmann A G. Focal visual-text attention for memex question answering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(8): 1893–1908 doi: 10.1109/TPAMI.2018.2890628
[22] Xiao Jin-Sheng, Shen Meng-Yao, Jiang Ming-Jun, Lei Jun-Feng, Bao Zhen-Yu. Abnormal behavior detection algorithm with video-bag attention mechanism in surveillance video. Acta Automatica Sinica, 2022, 48(12): 2951–2959
[23] Zhao X, Chen Y, Guo J, Zhao D. A spatial-temporal attention model for human trajectory prediction. IEEE/CAA Journal of Automatica Sinica, 2020, 7(4): 965–974 doi: 10.1109/JAS.2020.1003228
[24] Wang Ya-Shen, Huang He-Yan, Feng Chong, Zhou Qiang. A study of conceptual sentence embedding based on attentional mechanism. Acta Automatica Sinica, 2020, 46(7): 1390–1400
[25] Feng Jian-Zhou, Ma Xiang-Cong. Research on fine-grained entity classification method based on transfer learning. Acta Automatica Sinica, 2020, 46(8): 1759–1766
[26] Wang Xian-Xian, Yu Long, Tian Sheng-Wei, Wang Rui-Jin. Missing element filling of Uyghur events with independent RNN and capsule network. Acta Automatica Sinica, 2021, 47(4): 903–912
[27] Wang X, Girshick R, Gupta A, He K. Non-local neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 7794–7803
[28] Woo S, Park J, Lee J Y, Kweon I S. CBAM: Convolutional block attention module. In: Proceedings of the European Conference on Computer Vision. Munich, Germany: ECCV, 2018. 3–19
[29] Hu J, Shen L, Sun G, Wu E. Squeeze-and-excitation networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011–2023 doi: 10.1109/TPAMI.2019.2913372
[30] Phaye S S R, Sikka A, Dhall A, Bathula D. Dense and diverse capsule networks: Making the capsules learn better. arXiv preprint arXiv: 1805.04001, 2018
[31] Xiang C, Zhang L, Tang Y, Zou W, Xu C. MS-CapsNet: A novel multi-scale capsule network. IEEE Signal Processing Letters, 2018, 25(12): 1850–1854 doi: 10.1109/LSP.2018.2873892
[32] Ribeiro F D S, Leontidis G, Kollias S. Capsule routing via variational Bayes. In: Proceedings of the AAAI Conference on Artificial Intelligence. New York, USA: AAAI, 2020. 3749–3756
[33] Gu J, Tresp V. Improving the robustness of capsule networks to image affine transformation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020. 7283–7291
Figures (17) / Tables (3)

Metrics
• Article views: 1755
• HTML full-text views: 1168
• PDF downloads: 215
• Cited by: 0

Publication history
• Received: 2021-01-05
• Accepted: 2021-05-12
• Published online: 2021-06-20
• Issue date: 2024-09-19
