Self-distillation via Entropy Transfer for Scene Text Detection

Chen Jian-Wei, Yang Fan, Lai Yong-Xuan

Citation: Chen Jian-Wei, Yang Fan, Lai Yong-Xuan. Self-distillation via entropy transfer for scene text detection. Acta Automatica Sinica, 2024, 50(11): 2128−2139 doi: 10.16383/j.aas.c210598

doi: 10.16383/j.aas.c210598 cstr: 32138.14.j.aas.c210598

              Funds: Supported by National Key Research and Development Program of China (2021ZD0112600), National Natural Science Foundation of China (62173282, 61872154), Natural Science Foundation of Guangdong Province (2021A1515011578), and Shenzhen Fundamental Research Program (JCYJ20190809161603551)
              More Information
                Author Bio:

CHEN Jian-Wei  Master student at the School of Aerospace Engineering, Xiamen University. His research interest covers computer vision and image processing. E-mail: jianweichen@stu.xmu.edu.cn

YANG Fan  Associate professor at the School of Aerospace Engineering, Xiamen University. His research interest covers machine learning, data mining, and bio-informatics. Corresponding author of this paper. E-mail: yang@xmu.edu.cn

LAI Yong-Xuan  Professor at the School of Informatics, Xiamen University. His research interest covers big data analysis and management, intelligent transportation systems, deep learning, and vehicular networks. E-mail: laiyx@xmu.edu.cn

• Abstract: Most state-of-the-art natural scene text detection methods are built on fully convolutional semantic segmentation networks, whose pixel-level classification results effectively detect text of arbitrary shape. Their main drawbacks are large models, long inference time, and high memory usage, which limit deployment in practical applications. This paper proposes a self-distillation training method via entropy transfer (SDET): the information entropy of the segmentation map (SM) output by the deep layers of a text detection network is taken as the knowledge to be transferred, and an auxiliary network feeds it back to the shallow layers. Unlike knowledge distillation (KD), which relies on a teacher network, SDET adds an auxiliary network only in the training stage, achieving teacher-free self-distillation (SD) at a small extra training cost. Experimental results on several standard natural scene text detection datasets show that SDET significantly outperforms other distillation methods in the recall and F1-score of baseline text detection networks.
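To make the transfer mechanism concrete, below is a minimal PyTorch sketch of the idea the abstract describes: compute the pixel-wise entropy of the deep branch's segmentation map and train an auxiliary shallow branch to reproduce it. This is an illustration under stated assumptions, not the authors' released code; the function names (`entropy_map`, `entropy_transfer_loss`) and the choice of mean squared error as the matching distance are hypothetical.

```python
import torch
import torch.nn.functional as F

def entropy_map(seg_map: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Pixel-wise Shannon entropy of a binary segmentation map whose values
    # are probabilities in [0, 1]; the output lies in [0, ln 2].
    p = seg_map.clamp(eps, 1.0 - eps)
    return -(p * p.log() + (1.0 - p) * (1.0 - p).log())

def entropy_transfer_loss(shallow_seg_map: torch.Tensor,
                          deep_seg_map: torch.Tensor) -> torch.Tensor:
    # The deep branch's entropy map is the knowledge to transfer; detaching
    # it stops gradients so knowledge flows only from deep to shallow layers.
    # MSE as the distance is an assumption, not necessarily the paper's loss.
    target = entropy_map(deep_seg_map).detach()
    return F.mse_loss(entropy_map(shallow_seg_map), target)

# Training-time usage with stand-in tensors (N, 1, H, W); in the real model
# shallow_seg_map would come from the auxiliary head on shallow features.
deep_seg_map = torch.rand(2, 1, 128, 128)
shallow_seg_map = torch.rand(2, 1, 128, 128)
loss = entropy_transfer_loss(shallow_seg_map, deep_seg_map)
```

In a complete training loop this term would be added, with a weighting factor, to the ordinary detection loss; per the abstract, the auxiliary branch exists only during training, so inference cost is unchanged.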
Fig. 1  Segmentation map and entropy map visualization of the differentiable binarization text detection network

Fig. 2  Comparison of different knowledge distillation methods

Fig. 3  SDET training framework

Fig. 4  The three types of auxiliary networks

Fig. 5  Comparison of detection results between SDET and the baseline models ((a) Ground truth; (b) Detection results of the baseline models; (c) Detection results of models trained with SDET)

Table 1  The impact of different auxiliary classifiers on SDET (%)

Model      Method      ICDAR2013            ICDAR2015
                       P     R     F        P     R     F
MV3-EAST   Baseline    81.7  64.4  72.0     80.9  75.4  78.0
           Type A      78.8  65.9  71.8     78.8  76.3  77.5
           Type B      84.4  66.5  74.4     81.3  77.0  79.1
           Type C      81.4  67.4  73.7     78.9  77.7  78.3
MV3-DB     Baseline    83.7  66.0  73.8     87.1  71.8  78.7
           Type A      84.1  68.8  75.7     86.5  73.9  79.7
           Type B      81.1  67.3  73.6     87.8  71.7  78.9
           Type C      84.9  67.9  75.4     87.8  73.0  79.7
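For reference when reading Tables 1−6: P, R, and F denote the standard precision, recall, and F-measure (F1) over detected text instances,

$$P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F = \frac{2PR}{P + R},$$

where TP, FP, and FN count correct, spurious, and missed detections, respectively.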

Table 2  The impact of different feature pyramid positions on type B (%)

Method     Feature map size (pixels)    P     R     F
Baseline   —                            80.9  75.4  78.0
P0         16 × 16                      79.1  75.8  77.4
P1         32 × 32                      79.5  76.5  78.0
P2         64 × 64                      80.7  77.4  79.0
P3         128 × 128                    81.3  77.0  79.1

Table 3  Experimental results of knowledge distillation of MV3-DB on different datasets (%)

Method     ICDAR2013         TD500             TD-TR             ICDAR2015         Total-Text        CASIA-10K
           P    R    F       P    R    F       P    R    F       P    R    F       P    R    F       P    R    F
Baseline   83.7 66.0 73.8    78.7 71.4 74.9    83.6 74.4 78.7    87.1 71.8 78.7    87.2 66.9 75.7    88.1 51.9 65.3
ST         82.5 65.8 73.2    77.0 73.0 74.9    84.6 73.5 78.7    85.4 72.2 78.2    87.4 65.3 74.8    88.8 49.4 63.5
KA         82.5 66.8 73.8    79.5 71.3 75.2    86.3 72.5 78.8    85.0 73.3 78.7    85.9 66.8 75.2    87.8 51.4 64.8
FitNets    84.7 65.4 73.8    78.6 73.3 75.8    85.3 74.0 79.2    85.3 73.3 78.8    87.4 67.5 76.2    88.0 52.3 65.6
SKD        82.4 68.8 75.0    81.2 70.6 75.5    84.8 74.5 79.3    87.4 71.6 78.7    87.4 67.0 75.9    88.6 51.6 65.2
SD         83.5 67.8 74.8    79.4 72.2 75.6    85.0 74.0 79.1    85.1 73.0 78.6    87.0 67.6 76.1    87.1 52.0 65.1
SAD        82.8 66.7 73.9    78.7 72.3 75.4    87.3 72.0 78.9    86.7 72.7 79.1    86.5 67.1 75.6    88.4 50.7 64.4
Ours       84.1 68.8 75.7    80.6 72.2 76.2    85.6 74.6 79.7    86.5 73.9 79.7    87.5 68.4 76.8    87.4 53.4 66.3

Table 4  Experimental results of knowledge distillation of MV3-EAST on different datasets (%)

Method     ICDAR2013         ICDAR2015         CASIA-10K
           P    R    F       P    R    F       P    R    F
Baseline   81.7 64.4 72.0    80.9 75.4 78.0    66.1 64.9 65.5
ST         77.8 64.9 70.8    80.9 75.1 77.9    64.7 65.1 64.9
KA         78.6 64.0 70.5    78.2 76.4 77.3    67.7 63.0 65.3
FitNets    82.4 65.8 73.2    78.0 77.8 77.9    65.4 64.2 64.8
SKD        79.5 66.3 72.3    81.9 75.6 78.6    66.6 64.7 65.6
SD         80.2 63.8 71.1    79.6 74.7 77.1    66.2 63.5 64.8
SAD        81.4 65.6 72.6    80.2 76.5 78.3    65.7 64.1 64.9
Ours       84.4 66.5 74.4    81.3 77.0 79.1    70.8 63.0 66.7

Table 5  Comparison of SDET and DSN on different datasets (%)

Method     ICDAR2013         TD500             TD-TR             ICDAR2015         Total-Text        CASIA-10K
           P    R    F       P    R    F       P    R    F       P    R    F       P    R    F       P    R    F
Baseline   83.7 66.0 73.8    78.7 71.4 74.9    83.6 74.4 78.7    87.1 71.8 78.7    87.2 66.9 75.7    88.1 51.9 65.3
DSN        84.4 68.0 75.3    79.7 71.5 75.4    86.4 72.2 78.7    85.8 73.4 79.1    86.1 67.9 75.9    87.9 52.3 65.6
Ours       84.1 68.8 75.7    80.6 72.2 76.2    85.6 74.6 79.7    86.5 73.9 79.7    87.5 68.4 76.8    87.4 53.4 66.3

Table 6  The effect of SDET on improving ResNet50-DB on different datasets (%)

Method     ICDAR2013         TD500             TD-TR             ICDAR2015         Total-Text        CASIA-10K
           P    R    F       P    R    F       P    R    F       P    R    F       P    R    F       P    R    F
Baseline   86.3 72.9 79.0    84.1 75.9 79.8    87.3 80.4 83.7    90.3 80.1 84.9    87.7 79.4 83.3    90.1 64.7 75.3
Ours       82.7 77.2 79.9    79.9 81.5 80.7    87.2 83.0 85.0    90.3 82.1 86.0    87.4 81.8 84.5    86.0 68.7 76.4
References

[1] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015. 3431−3440
[2] Yuan Y H, Chen X L, Wang J D. Object-contextual representations for semantic segmentation. arXiv preprint arXiv:1909.11065, 2019.
[3] Lv P Y, Liao M H, Yao C, Wu W H, Bai X. Mask TextSpotter: An end-to-end trainable neural network for spotting text with arbitrary shapes. In: Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer, 2018. 67−83
[4] He K M, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 2961−2969
[5] Ye J, Chen Z, Liu J H, Du B. TextFuseNet: Scene text detection with richer fused features. In: Proceedings of the 29th International Joint Conference on Artificial Intelligence. Yokohama, Japan: 2020. 516−522
[6] He K M, Zhang X Y, Ren S Q, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 770−778
[7] Hinton G E, Vinyals O, Dean J. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.
[8] Lai Xuan, Qu Yan-Yun, Xie Yuan, Pei Yu-Long. Topology-guided adversarial deep mutual learning for knowledge distillation. Acta Automatica Sinica, 2023, 49(1): 102−110 doi: 10.16383/j.aas.200665
[9] Romero A, Ballas N, Kahou S E, Chassang A, Gatta C, Bengio Y. FitNets: Hints for thin deep nets. arXiv preprint arXiv:1412.6550, 2014.
[10] Zagoruyko S, Komodakis N. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. arXiv preprint arXiv:1612.03928, 2016.
[11] Karatzas D, Gomez-Bigorda L, Nicolaou A, Ghosh S, Bagdanov A, Iwamura M, et al. ICDAR2015 competition on robust reading. In: Proceedings of the 13th International Conference on Document Analysis and Recognition. Nancy, France: IEEE, 2015. 1156−1160
[12] Chng C K, Chan C S. Total-Text: A comprehensive dataset for scene text detection and recognition. In: Proceedings of the 14th International Conference on Document Analysis and Recognition. Kyoto, Japan: IEEE, 2017. 935−942
[13] Cho J H, Hariharan B. On the efficacy of knowledge distillation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 4794−4802
[14] Yang P, Yang G W, Gong X, Wu P P, Han X, Wu J S, et al. Instance segmentation network with self-distillation for scene text detection. IEEE Access, 2020, 8: 45825−45836 doi: 10.1109/ACCESS.2020.2978225
[15] Vu T H, Jain H, Bucher M, Cord M, Pérez P. ADVENT: Adversarial entropy minimization for domain adaptation in semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 2517−2526
[16] Lee C Y, Xie S N, Gallagher P, Zhang Z Y, Tu Z W. Deeply-supervised nets. In: Proceedings of the 18th International Conference on Artificial Intelligence and Statistics. San Diego, USA: PMLR, 2015. 562−570
[17] Hou Y N, Ma Z, Liu C X, Loy C C. Learning lightweight lane detection CNNs by self attention distillation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 1013−1021
[18] Wang Run-Min, Sang Nong, Ding Ding, Chen Jie, Ye Qi-Xiang, Gao Chang-Xin, et al. Text detection in natural scene image: A survey. Acta Automatica Sinica, 2018, 44(12): 2113−2141
[19] Ren S Q, He K M, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. arXiv preprint arXiv:1506.01497, 2015.
[20] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y, et al. SSD: Single shot multibox detector. In: Proceedings of the European Conference on Computer Vision. Amsterdam, Netherlands: Springer, 2016. 21−37
[21] Liao M H, Shi B G, Bai X, Wang X G, Liu W Y. TextBoxes: A fast text detector with a single deep neural network. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco, USA: AAAI, 2017. 4161−4167
[22] Tian Z, Huang W L, He T, He P, Qiao Y. Detecting text in natural image with connectionist text proposal network. In: Proceedings of the European Conference on Computer Vision. Amsterdam, Netherlands: Springer, 2016. 56−72
[23] Zhou X Y, Yao C, Wen H, Wang Y Z, Zhou S C, He W R, et al. EAST: An efficient and accurate scene text detector. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 5551−5560
[24] Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation. In: Proceedings of the Medical Image Computing and Computer Assisted Intervention. Munich, Germany: Springer, 2015. 234−241
[25] Liao M H, Wan Z Y, Yao C, Chen K, Bai X. Real-time scene text detection with differentiable binarization. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI, 2020. 11474−11481
[26] Wang W H, Xie E Z, Li X, Hou W B, Lu T, Yu G, et al. Shape robust text detection with progressive scale expansion network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 9336−9345
[27] Wang W H, Xie E Z, Song X G, Zang Y H, Wang W J, Lu T, et al. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 8440−8449
[28] Xu Y C, Wang Y K, Zhou W, Wang Y P, Yang Z B, Bai X. TextField: Learning a deep direction field for irregular scene text detection. IEEE Transactions on Image Processing, 2019, 28(11): 5566−5579 doi: 10.1109/TIP.2019.2900589
[29] He T, Shen C H, Tian Z, Gong D, Sun C M, Yan Y L. Knowledge adaptation for efficient semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 578−587
[30] Liu Y F, Chen K, Liu C, Qin Z C, Luo Z B, Wang J D. Structured knowledge distillation for semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 2604−2613
[31] Wang Y K, Zhou W, Jiang T, Bai X, Xu Y C. Intra-class feature variation distillation for semantic segmentation. In: Proceedings of the European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 346−362
[32] Zhang L F, Song J B, Gao A, Chen J W, Bao C L, Ma K S. Be your own teacher: Improve the performance of convolutional neural networks via self distillation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 3713−3722
[33] Howard A, Sandler M, Chu G, Chen L C, Chen B, Tan M X, et al. Searching for MobileNetV3. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 1314−1324
[34] Lin T Y, Dollár P, Girshick R, He K M, Hariharan B, Belongie S. Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 2117−2125
[35] Chen Z Y, Xu Q Q, Cong R M, Huang Q M. Global context-aware progressive aggregation network for salient object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence. New York, USA: AAAI, 2020. 10599−10606
[36] Karatzas D, Shafait F, Uchida S, Iwamura M, Bigorda L G, Mestre S R, et al. ICDAR2013 robust reading competition. In: Proceedings of the 12th International Conference on Document Analysis and Recognition. Washington DC, USA: IEEE, 2013. 1484−1493
[37] Yao C, Bai X, Liu W Y, Ma Y, Tu Z W. Detecting texts of arbitrary orientations in natural images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE, 2012. 1083−1090
[38] Xue C H, Lu S J, Zhan F N. Accurate scene text detection through border semantics awareness and bootstrapping. In: Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer, 2018. 355−372
[39] He W H, Zhang X Y, Yin F, Liu C L. Multi-oriented and multi-lingual scene text detection with direct regression. IEEE Transactions on Image Processing, 2018, 27(11): 5406−5419 doi: 10.1109/TIP.2018.2855399
Publication history
• Received: 2021-06-29
• Accepted: 2022-02-10
• Available online: 2023-10-12
• Issue published: 2024-11-26
