

Self-distillation via Entropy Transfer for Scene Text Detection

Chen Jian-Wei, Yang Fan, Lai Yong-Xuan

Citation: Chen Jian-Wei, Yang Fan, Lai Yong-Xuan. A self-distillation approach via entropy transfer for scene text detection. Acta Automatica Sinica, 2023, 49(11): 1−12. doi: 10.16383/j.aas.c210598

doi: 10.16383/j.aas.c210598

Funds: Supported by National Key Research and Development Program of China (2021ZD0112600), National Natural Science Foundation of China (62173282, 61872154), Natural Science Foundation of Guangdong Province (2021A1515011578), and Shenzhen Fundamental Research Program (JCYJ20190809161603551)
              More Information
                Author Bio:

CHEN Jian-Wei Master student at the School of Aerospace Engineering, Xiamen University. His research interests cover computer vision and image processing. E-mail: jianweichen@stu.xmu.edu.cn

YANG Fan Associate professor at the School of Aerospace Engineering, Xiamen University. His research interests cover machine learning, data mining, and bioinformatics. Corresponding author of this paper. E-mail: yang@xmu.edu.cn

LAI Yong-Xuan Professor at the School of Informatics, Xiamen University. His research interests cover big data analysis and management, intelligent transportation systems, deep learning, and vehicular networks. E-mail: laiyx@xmu.edu.cn

Abstract: Most state-of-the-art natural scene text detection methods are based on fully convolutional semantic segmentation networks, which use pixel-level classification results to detect text of arbitrary shape effectively; their drawbacks are large model size, long inference time, and high memory footprint, which limit deployment in practical applications. This paper proposes a self-distillation training method via entropy transfer (SDET), which takes the information entropy of the segmentation map (SM) output by the deep layers of a text detection network as the knowledge to be transferred, and feeds it back to the shallow layers through an auxiliary network. Unlike knowledge distillation (KD), which relies on a teacher network, this self-distillation (SD) training method only adds an auxiliary network during the training phase, achieving teacher-free self-distillation at a small extra training cost. Experimental results on several standard natural scene text detection datasets show that SDET significantly outperforms other distillation methods in the recall and F1-score of baseline text detection networks.
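To make the entropy-transfer idea concrete, here is a minimal sketch in Python. It computes the pixel-wise binary entropy H(p) = −p ln p − (1 − p) ln(1 − p) of a segmentation probability map and matches the auxiliary (shallow) branch's entropy map to the deep branch's. The mean-squared matching loss and all names are editorial assumptions, not the paper's implementation.

```python
# Minimal sketch of the entropy-transfer idea from the abstract.
# Assumptions (not taken from the paper): per-pixel binary text/non-text
# probabilities, and a plain MSE between entropy maps as the transfer loss.
import numpy as np

def entropy_map(seg_map: np.ndarray, eps: float = 1e-7) -> np.ndarray:
    """Pixel-wise binary entropy of a segmentation probability map in [0, 1]."""
    p = np.clip(seg_map, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def entropy_transfer_loss(deep_sm: np.ndarray, shallow_sm: np.ndarray) -> float:
    """Match the shallow (auxiliary) branch's entropy map to the deep branch's.

    The deep branch acts as its own 'teacher': in a real framework its
    entropy map would be detached so gradients only reach the shallow branch.
    """
    target = entropy_map(deep_sm)    # knowledge to be transferred
    pred = entropy_map(shallow_sm)   # auxiliary-branch prediction
    return float(np.mean((pred - target) ** 2))

# Toy usage: a 4 x 4 "segmentation map" from each branch.
rng = np.random.default_rng(0)
deep = rng.uniform(size=(4, 4))      # deep-layer text probabilities
shallow = rng.uniform(size=(4, 4))   # shallow-layer text probabilities
print(entropy_transfer_loss(deep, shallow))
```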
Fig. 1 Segmentation map and entropy map visualization of the DB text detection network

Fig. 2 Comparison of different knowledge distillation methods

Fig. 3 SDET training framework

Fig. 4 The three types of auxiliary networks

Fig. 5 Comparison of detection results between SDET and baseline models ((a) Ground truth; (b) Detection results of baseline models; (c) Detection results of models trained with SDET)

Table 1 The impact of different auxiliary classifiers on SDET (%)

Model      Method     ICDAR 2013          ICDAR 2015
                      P     R     F       P     R     F
MV3-EAST   Baseline   81.7  64.4  72.0    80.9  75.4  78.0
           SDET-A     78.8  65.9  71.8    78.8  76.3  77.5
           SDET-B     84.4  66.5  74.4    81.3  77.0  79.1
           SDET-C     81.4  67.4  73.7    78.9  77.7  78.3
MV3-DB     Baseline   83.7  66.0  73.8    87.1  71.8  78.7
           SDET-A     84.1  68.8  75.7    86.5  73.9  79.7
           SDET-B     81.1  67.3  73.6    87.8  71.7  78.9
           SDET-C     84.9  67.9  75.4    87.8  73.0  79.7
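As a sanity check on how the P/R/F columns relate, the snippet below recomputes F from P and R for the MV3-EAST baseline row, assuming F is the standard F1-measure F = 2PR/(P + R); 0.1-level discrepancies can appear because the tabulated P and R are themselves rounded.

```python
# Recompute the F column of Table 1 from P and R, assuming the standard
# F1-measure F = 2PR / (P + R). Values are the MV3-EAST baseline row.
rows = {
    "ICDAR 2013": (81.7, 64.4),  # (precision %, recall %)
    "ICDAR 2015": (80.9, 75.4),
}
for dataset, (p, r) in rows.items():
    f1 = 2 * p * r / (p + r)
    # Prints 72.0 and 78.1; the table lists 78.0, the gap coming only
    # from P and R being rounded to one decimal place.
    print(f"{dataset}: F = {f1:.1f}")
```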

Table 2 The impact of different feature pyramid positions on SDET-B (%)

Method     Feature map size (pixels)   P     R     F
Baseline   –                           80.9  75.4  78.0
SDET-P0    16 × 16                     79.1  75.8  77.4
SDET-P1    32 × 32                     79.5  76.5  78.0
SDET-P2    64 × 64                     80.7  77.4  79.0
SDET-P3    128 × 128                   81.3  77.0  79.1

Table 3 Experimental results of knowledge distillation of MV3-DB on different datasets (%)

Method    ICDAR 2013         TD500              TD-TR              ICDAR 2015         Total-text         CASIA-10K
          P     R     F      P     R     F      P     R     F      P     R     F      P     R     F      P     R     F
Baseline  83.7  66.0  73.8   78.7  71.4  74.9   83.6  74.4  78.7   87.1  71.8  78.7   87.2  66.9  75.7   88.1  51.9  65.3
ST        82.5  65.8  73.2   77.0  73.0  74.9   84.6  73.5  78.7   85.4  72.2  78.2   87.4  65.3  74.8   88.8  49.4  63.5
KA        82.5  66.8  73.8   79.5  71.3  75.2   86.3  72.5  78.8   85.0  73.3  78.7   85.9  66.8  75.2   87.8  51.4  64.8
FitNets   84.7  65.4  73.8   78.6  73.3  75.8   85.3  74.0  79.2   85.3  73.3  78.8   87.4  67.5  76.2   88.0  52.3  65.6
SKD       82.4  68.8  75.0   81.2  70.6  75.5   84.8  74.5  79.3   87.4  71.6  78.7   87.4  67.0  75.9   88.6  51.6  65.2
SD        83.5  67.8  74.8   79.4  72.2  75.6   85.0  74.0  79.1   85.1  73.0  78.6   87.0  67.6  76.1   87.1  52.0  65.1
SAD       82.8  66.7  73.9   78.7  72.3  75.4   87.3  72.0  78.9   86.7  72.7  79.1   86.5  67.1  75.6   88.4  50.7  64.4
Ours      84.1  68.8  75.7   80.6  72.2  76.2   85.6  74.6  79.7   86.5  73.9  79.7   87.5  68.4  76.8   87.4  53.4  66.3

Table 4 Experimental results of knowledge distillation of MV3-EAST on different datasets (%)

Method    ICDAR 2013         ICDAR 2015         CASIA-10K
          P     R     F      P     R     F      P     R     F
Baseline  81.7  64.4  72.0   80.9  75.4  78.0   66.1  64.9  65.5
ST        77.8  64.9  70.8   80.9  75.1  77.9   64.7  65.1  64.9
KA        78.6  64.0  70.5   78.2  76.4  77.3   67.7  63.0  65.3
FitNets   82.4  65.8  73.2   78.0  77.8  77.9   65.4  64.2  64.8
SKD       79.5  66.3  72.3   81.9  75.6  78.6   66.6  64.7  65.6
SD        80.2  63.8  71.1   79.6  74.7  77.1   66.2  63.5  64.8
SAD       81.4  65.6  72.6   80.2  76.5  78.3   65.7  64.1  64.9
Ours      84.4  66.5  74.4   81.3  77.0  79.1   70.8  63.0  66.7

Table 5 Comparison of SDET and DSN on different datasets (%)

Method    ICDAR 2013         TD500              TD-TR              ICDAR 2015         Total-text         CASIA-10K
          P     R     F      P     R     F      P     R     F      P     R     F      P     R     F      P     R     F
Baseline  83.7  66.0  73.8   78.7  71.4  74.9   83.6  74.4  78.7   87.1  71.8  78.7   87.2  66.9  75.7   88.1  51.9  65.3
DSN [16]  84.4  68.0  75.3   79.7  71.5  75.4   86.4  72.2  78.7   85.8  73.4  79.1   86.1  67.9  75.9   87.9  52.3  65.6
Ours      84.1  68.8  75.7   80.6  72.2  76.2   85.6  74.6  79.7   86.5  73.9  79.7   87.5  68.4  76.8   87.4  53.4  66.3

Table 6 The effect of SDET on improving ResNet50-DB on different datasets (%)

Method    ICDAR 2013         TD500              TD-TR              ICDAR 2015         Total-text         CASIA-10K
          P     R     F      P     R     F      P     R     F      P     R     F      P     R     F      P     R     F
Baseline  86.3  72.9  79.0   84.1  75.9  79.8   87.3  80.4  83.7   90.3  80.1  84.9   87.7  79.4  83.3   90.1  64.7  75.3
Ours      82.7  77.2  79.9   79.9  81.5  80.7   87.2  83.0  85.0   90.3  82.1  86.0   87.4  81.8  84.5   86.0  68.7  76.4
References

[1] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015. 3431−3440
[2] Yuan Y H, Chen X L, Wang J D. Object-contextual representations for semantic segmentation. arXiv preprint arXiv:1909.11065, 2019.
[3] Lyu P Y, Liao M H, Yao C, Wu W H, Bai X. Mask TextSpotter: An end-to-end trainable neural network for spotting text with arbitrary shapes. In: Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer, 2018. 67−83
[4] He K M, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 2961−2969
[5] Ye J, Chen Z, Liu J H, Du B. TextFuseNet: Scene text detection with richer fused features. In: Proceedings of the 29th International Joint Conference on Artificial Intelligence. Yokohama, Japan: 2020. 516−522
[6] He K M, Zhang X Y, Ren S Q, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 770−778
[7] Hinton G E, Vinyals O, Dean J. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.
[8] Lai Xuan, Qu Yan-Yun, Xie Yuan, Pei Yu-Long. Topology-guided adversarial deep mutual learning for knowledge distillation. Acta Automatica Sinica, 2021, x(x): 1−9 doi: 10.16383/j.aas.200665
[9] Romero A, Ballas N, Kahou S E, Chassang A, Gatta C, Bengio Y. FitNets: Hints for thin deep nets. arXiv preprint arXiv:1412.6550, 2014.
[10] Zagoruyko S, Komodakis N. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. arXiv preprint arXiv:1612.03928, 2016.
[11] Karatzas D, Gomez-Bigorda L, Nicolaou A, Ghosh S, Bagdanov A, Iwamura M, et al. ICDAR 2015 competition on robust reading. In: Proceedings of the 13th International Conference on Document Analysis and Recognition. Nancy, France: IEEE, 2015. 1156−1160
[12] Ch'ng C K, Chan C S. Total-Text: A comprehensive dataset for scene text detection and recognition. In: Proceedings of the 14th International Conference on Document Analysis and Recognition. Kyoto, Japan: IEEE, 2017. 935−942
[13] Cho J H, Hariharan B. On the efficacy of knowledge distillation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 4794−4802
[14] Yang P, Yang G W, Gong X, Wu P P, Han X, Wu J S, Chen C S. Instance segmentation network with self-distillation for scene text detection. IEEE Access, 2020, 8: 45825−45836 doi: 10.1109/ACCESS.2020.2978225
[15] Vu T H, Jain H, Bucher M, Cord M, Pérez P. ADVENT: Adversarial entropy minimization for domain adaptation in semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 2517−2526
[16] Lee C Y, Xie S N, Gallagher P, Zhang Z Y, Tu Z W. Deeply-supervised nets. In: Proceedings of the 18th International Conference on Artificial Intelligence and Statistics. San Diego, USA: PMLR, 2015. 562−570
[17] Hou Y N, Ma Z, Liu C X, Loy C C. Learning lightweight lane detection CNNs by self attention distillation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 1013−1021
[18] Wang Run-Min, Sang Nong, Ding Ding, Chen Jie, Ye Qi-Xiang, Gao Chang-Xin, Liu Li. Text detection in natural scene image: A survey. Acta Automatica Sinica, 2018, 44(12): 2113−2141
[19] Ren S Q, He K M, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. arXiv preprint arXiv:1506.01497, 2015.
[20] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y, et al. SSD: Single shot multibox detector. In: Proceedings of the European Conference on Computer Vision. Amsterdam, Netherlands: Springer, 2016. 21−37
[21] Liao M H, Shi B G, Bai X, Wang X G, Liu W Y. TextBoxes: A fast text detector with a single deep neural network. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco, USA: AAAI, 2017. 4161−4167
[22] Tian Z, Huang W L, He T, He P, Qiao Y. Detecting text in natural image with connectionist text proposal network. In: Proceedings of the European Conference on Computer Vision. Amsterdam, Netherlands: Springer, 2016. 56−72
[23] Zhou X Y, Yao C, Wen H, Wang Y Z, Zhou S C, He W R, et al. EAST: An efficient and accurate scene text detector. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 5551−5560
[24] Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation. In: Proceedings of Medical Image Computing and Computer-Assisted Intervention. Munich, Germany: Springer, 2015. 234−241
[25] Liao M H, Wan Z Y, Yao C, Chen K, Bai X. Real-time scene text detection with differentiable binarization. In: Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI, 2020. 11474−11481
[26] Wang W H, Xie E Z, Li X, Hou W B, Lu T, Yu G, et al. Shape robust text detection with progressive scale expansion network. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 9336−9345
[27] Wang W H, Xie E Z, Song X G, Zang Y H, Wang W J, Lu T, et al. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 8440−8449
[28] Xu Y C, Wang Y K, Zhou W, Wang Y P, Yang Z B, Bai X. TextField: Learning a deep direction field for irregular scene text detection. IEEE Transactions on Image Processing, 2019, 28(11): 5566−5579 doi: 10.1109/TIP.2019.2900589
[29] He T, Shen C H, Tian Z, Gong D, Sun C M, Yan Y L. Knowledge adaptation for efficient semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 578−587
[30] Liu Y F, Chen K, Liu C, Qin Z C, Luo Z B, Wang J D. Structured knowledge distillation for semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 2604−2613
[31] Wang Y K, Zhou W, Jiang T, Bai X, Xu Y C. Intra-class feature variation distillation for semantic segmentation. In: Proceedings of the European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 346−362
[32] Zhang L F, Song J B, Gao A, Chen J W, Bao C L, Ma K S. Be your own teacher: Improve the performance of convolutional neural networks via self distillation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 3713−3722
[33] Howard A, Sandler M, Chu G, Chen L C, Chen B, Tan M X, et al. Searching for MobileNetV3. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 1314−1324
[34] Lin T Y, Dollár P, Girshick R, He K M, Hariharan B, Belongie S. Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 2117−2125
[35] Chen Z Y, Xu Q Q, Cong R M, Huang Q M. Global context-aware progressive aggregation network for salient object detection. In: Proceedings of the AAAI Conference on Artificial Intelligence. New York, USA: AAAI, 2020. 10599−10606
[36] Karatzas D, Shafait F, Uchida S, Iwamura M, I Bigorda L G, Mestre S R, et al. ICDAR 2013 robust reading competition. In: Proceedings of the 12th International Conference on Document Analysis and Recognition. Washington DC, USA: IEEE, 2013. 1484−1493
[37] Yao C, Bai X, Liu W Y, Ma Y, Tu Z W. Detecting texts of arbitrary orientations in natural images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE, 2012. 1083−1090
[38] Xue C H, Lu S J, Zhan F N. Accurate scene text detection through border semantics awareness and bootstrapping. In: Proceedings of the European Conference on Computer Vision. Munich, Germany: Springer, 2018. 355−372
[39] He W H, Zhang X Y, Yin F, Liu C L. Multi-oriented and multi-lingual scene text detection with direct regression. IEEE Transactions on Image Processing, 2018, 27(11): 5406−5419 doi: 10.1109/TIP.2018.2855399
Metrics
• Article views: 182
• Full-text HTML views: 73
• Citations: 0

Publication history
• Received: 2021-06-29
• Accepted: 2022-02-10
• Published online: 2023-10-12