基于被动声呐音频信号的水中目标识别综述
doi: 10.16383/j.aas.c230153
1. 国防科技大学计算机学院 长沙 410073
2. 并行与分布处理国防科技重点实验室 长沙 410073
A Review of Underwater Target Recognition Based on Passive Sonar Acoustic Signals
1. School of Computer Science, National University of Defense Technology, Changsha 410073
2. National Key Laboratory of Parallel and Distributed Processing, Changsha 410073
摘要: 基于被动声呐音频信号的水中目标识别是当前水下无人探测领域的重要技术难题, 在军事和民用领域都应用广泛. 本文从数据处理和识别方法两个层面系统阐述基于被动声呐信号进行水中目标识别的方法和流程. 在数据处理方面, 从基于被动声呐信号的水中目标识别基本流程、被动声呐音频信号分析的数理基础及其特征提取三个方面概述被动声呐信号处理的基本原理. 在识别方法层面, 全面分析基于机器学习算法的水中目标识别方法, 并聚焦以深度学习算法为核心的水中目标识别研究. 本文从有监督学习、无监督学习、自监督学习等多种学习范式对当前研究进展进行系统性的总结分析, 并从算法的标签数据需求、鲁棒性、可扩展性与适应性等多个维度分析这些方法的优缺点. 同时, 还总结该领域中较为广泛使用的公开数据集, 并分析公开数据集应具备的基本要素. 最后, 通过对水中目标识别过程的论述, 总结目前基于被动声呐音频信号的水中目标自动识别算法存在的困难与挑战, 并对该领域未来的发展方向进行展望.
關(guān)鍵詞:
- 被動(dòng)聲吶信號 /
- 水中目標自動(dòng)識別 /
- 深度學(xué)習 /
- 有監督學(xué)習 /
- 自監督學(xué)習
Abstract: Underwater target recognition based on passive sonar acoustic signals is a significant technical challenge in the field of underwater unmanned detection, with broad applications in both military and civilian domains. This paper systematically describes the methodology and workflow of underwater target recognition using passive sonar acoustic signals at two levels: data processing and recognition methods. Regarding data processing, the paper presents the fundamental principles of passive sonar signal processing from three key aspects: the basic workflow of underwater target recognition based on passive sonar signals, the mathematical foundations of passive sonar acoustic signal analysis, and feature extraction. At the level of recognition methods, the paper offers a comprehensive analysis of underwater target recognition techniques based on machine learning algorithms and focuses on research centered on deep learning algorithms. It systematically summarizes and analyzes current research progress across learning paradigms, including supervised learning, unsupervised learning and self-supervised learning, and weighs the advantages and disadvantages of these methods in terms of labeled-data requirements, robustness, scalability and adaptability. Additionally, the paper reviews the widely used public datasets in this field and outlines the essential elements such datasets should possess. Finally, by examining the underwater target recognition process, the paper summarizes the current difficulties and challenges of automatic underwater target recognition algorithms based on passive sonar acoustic signals and offers an outlook on future development directions of this field.
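To make the pipeline summarized above concrete (raw passive sonar audio, then a time-frequency feature, then a classifier), the following minimal Python sketch shows one common way to convert a recording into a normalized log-Mel spectrogram. It assumes the librosa library and a hypothetical file name `example_ship.wav`; it illustrates the general preprocessing step only, not the exact procedure of any method surveyed here.

```python
import numpy as np
import librosa

def logmel_feature(wav_path, sr=32000, n_fft=1024, hop=512, n_mels=64):
    """Load a (hypothetical) passive sonar recording and return a log-Mel spectrogram."""
    y, sr = librosa.load(wav_path, sr=sr, mono=True)          # resample to a common rate
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=n_fft,
                                         hop_length=hop, n_mels=n_mels)
    logmel = librosa.power_to_db(mel, ref=np.max)              # compress dynamic range
    # Per-feature normalization, a common step before feeding a classifier.
    return (logmel - logmel.mean()) / (logmel.std() + 1e-8)

if __name__ == "__main__":
    feat = logmel_feature("example_ship.wav")  # hypothetical file
    print(feat.shape)                          # (n_mels, n_frames)
```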
图 5 基于深度学习的水中音频信号特征提取范式
Fig. 5 Deep learning-based paradigm for underwater acoustic signals feature extraction
图 6 基于深度学习的水声目标识别主流算法模型发展时间轴线
Fig. 6 Timeline: Evolution of mainstream deep learning algorithms for UATR
图 9 基于CNN与Bi-LSTM融合的水声目标识别方法网络架构
Fig. 9 Network framework of UATR methods based on the fusion of CNN and Bi-LSTM
图 11 基于RBM自编码器重构的水声目标识别方法架构
Fig. 11 The framework of RBM autoencoder-based reconstruction methods for UATR
图 13 不同深度学习方法在水声目标识别领域的性能对比
Fig. 13 Performance comparison of various deep learning methods for UATR
表 1 典型传统机器学习的水声目标识别算法
Table 1 Typical traditional machine learning algorithms for UATR
| 年份 | 机器学习算法 | 音频特征 | 数据集 |
| 1992 | Naive Bayes[54] | 目标固有物理机理特征 | 自建数据集 |
| 2016 | DT[55] | 时域、频域特征 | 仿真数据集 |
| 2014 | SVM[56-62] | 基于小波变换的时频特征 | 自建数据集 |
| 2016 | | MFCC | 真实数据集 |
| 2017 | | GFCC | 历史数据集 |
| 2017 | | 改进的GFCC | 舰船数据集 |
| 2018 | | 过零率 | 鱼类数据集 |
| 2019 | | 融合表征 | 自建数据集 |
| 2022 | | LOFAR谱 | ShipsEar |
| 2018 | KNN[60, 62] | MFCC | 鱼类数据集 |
| 2022 | | LOFAR谱 | ShipsEar |
| 2011 | SVDD[63] | — | 舰船数据集 |
| 2014 | HMM[56, 64] | MFCC | — |
| 2018 | | | 机器音频数据集 |
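As an illustration of the traditional machine learning pipeline that most entries in Table 1 follow (a hand-crafted feature such as MFCC, followed by a shallow classifier such as SVM), here is a minimal sketch using librosa and scikit-learn; the waveforms and labels are random placeholders rather than data from any dataset listed above.

```python
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def clip_mfcc(y, sr, n_mfcc=20):
    """Average MFCCs over time to obtain one fixed-length vector per clip."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Placeholder data: in practice each clip would come from a labeled sonar dataset.
sr = 32000
rng = np.random.default_rng(0)
clips = [rng.standard_normal(sr * 2) for _ in range(40)]   # forty 2-second clips
labels = rng.integers(0, 3, size=40)                        # three dummy classes

X = np.stack([clip_mfcc(y, sr) for y in clips])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
clf.fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```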
表 2 基于卷积神经网络的水声目标识别方法
Table 2 Convolutional neural network-based methods for UATR
| 年份 | 技术特点 | 模型优劣分析 | 数据集来源 | 类别数 |
| 2017 | 卷积神经网络[66] | 自动提取音频表征, 提高了模型的精度 | Historic Naval Sound and Video database | 16类 |
| 2018 | 卷积神经网络[71] | 使用极限学习机代替全连接层, 提高了模型的识别精度 | 私有数据集 | 3类 |
| 2019 | 卷积神经网络[70] | 使用二阶池化策略, 更好地保留了信号分量的差异性 | 中国南海 | 5类 |
| | 一种基于声音生成感知机制的卷积神经网络[41] | 模拟听觉系统实现多尺度音频表征学习, 使得表征更具判别性 | Ocean Networks Canada | 4类 |
| 2020 | 基于ResNet的声音生成感知模型[42] | 使用全局平均池化代替全连接层, 极大地减少了参数, 提高了模型的训练效率 | Ocean Networks Canada | 4类 |
| | 一种稠密卷积神经网络DCNN[43] | 使用DCNN自动提取音频特征, 降低了人工干预对性能的影响 | 私有数据集 | 12类 |
| 2021 | 一种具有稀疏结构的GoogleNet[72] | 稀疏结构的网络设计减少了参数量, 提升模型的训练效率 | 仿真数据集 | 3类 |
| | 一种基于可分离卷积自编码器的SCAE模型[73] | 使用音频的融合表征进行分析, 证明了方法的鲁棒性 | DeepShip | 5类 |
| | 残差神经网络[76] | 融合表征使得学习到的音频表征更具判别性, 提升了模型的性能 | ShipsEar | 5类 |
| | 基于注意力机制的深度神经网络[46] | 使用注意力机制抑制了海洋环境噪声和其他舰船信号的干扰, 提升模型的识别能力 | 中国南海 | 4类 |
| | 基于双注意力机制和多分辨率卷积神经网络架构[81] | 多分辨率卷积网络使得音频表征更具判别性, 双注意力机制有利于同时关注局部信息与全局信息 | ShipsEar | 5类 |
| | 基于多分辨率的时频特征提取与数据增强的水中目标识别方法[85] | 多分辨率卷积网络使得音频表征更具判别性, 数据增强增大了模型的训练样本规模, 从而提升了模型的识别性能 | ShipsEar | 5类 |
| 2022 | 基于通道注意力机制的残差网络[82] | 通道注意力机制的使用使得学习到的音频表征更具判别性和鲁棒性, 提升了模型的性能 | 私有数据集 | 4类 |
| | 一种基于融合表征与通道注意力机制的残差网络[83] | 融合表征与通道注意力机制的使用使得学习到的音频表征更具判别性和鲁棒性, 提升了模型的性能 | DeepShip, ShipsEar | 5类 |
| 2023 | 基于注意力机制的多分支CNN[74] | 注意力机制用以捕捉特征图中的重要信息, 多分支策略的使用提升了模型的训练效率 | ShipsEar | 5类 |
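The CNN-based methods in Table 2 share a common pattern: stacked convolutions extract local time-frequency structure from a spectrogram, and global average pooling plus a small linear head replaces large fully connected layers (as in the ResNet-style entry [42]). The PyTorch sketch below illustrates this pattern only; the layer sizes and the 5-class output are illustrative assumptions, not the configuration of any cited model.

```python
import torch
import torch.nn as nn

class SpectrogramCNN(nn.Module):
    """Toy CNN classifier over log-Mel spectrograms of shape (1, n_mels, n_frames)."""
    def __init__(self, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)   # global average pooling instead of a large FC layer
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.head(x)

if __name__ == "__main__":
    model = SpectrogramCNN()
    dummy = torch.randn(8, 1, 64, 128)        # batch of 8 fake spectrograms
    print(model(dummy).shape)                  # torch.Size([8, 5])
```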
表 3 基于时延神经网络、循环神经网络和Transformer的水声目标识别方法
Table 3 Time delay neural network-based, recurrent neural network-based and Transformer-based methods for UATR
| 年份 | 技术特点 | 模型优劣分析 | 数据集来源 | 类别数 |
| 2019 | 基于时延神经网络的UATR[87] | 时延神经网络能够学习音频时序信息, 从而提高模型的识别能力 | 私有数据集 | 2类 |
| 2022 | 一种可学习前端[88] | 可学习的一维卷积滤波器可以实现更具判别性的音频特征提取, 表现出比传统手工提取的特征更好的性能 | QLED, ShipsEar, DeepShip | QLED 2类, ShipsEar 5类, DeepShip 4类 |
| 2020 | 采用Bi-LSTM同时考虑过去与未来信息的UATR[89] | 使用双向注意力机制能够同时学习到历史和后续的时序信息, 从而使得音频表征蕴含信息更丰富以及判别性更高, 然而该方法复杂度较高 | Sea Trial | 2类 |
| | 基于Bi-GRU的混合时序网络[90] | 混合时序网络从多个维度关注时序信息, 从而学习到更具判别性的音频特征, 提高模型的识别能力 | 私有数据集 | 3类 |
| 2021 | 采用LSTM融合音频表征[91] | 该方法能够同时学习音频的相位和频谱特征, 并加以融合, 从而提升模型的识别性能 | 私有数据集 | 2类 |
| | CNN与Bi-LSTM组合的UATR[92] | CNN与Bi-LSTM组合可以提取出同时关注局部特性和时序上下文依赖的音频特征, 提高了模型的识别能力 | 私有数据集 | 3类 |
| 2022 | 一维卷积与LSTM组合的UATR[93] | 首次采用一维卷积和LSTM的组合网络提取音频表征, 能够在提高音频识别率的同时降低模型的参数量, 然而该方法稳定性有待提高 | ShipsEar | 5类 |
| 2022 | Transformer[94-95] | 增强了模型的泛化性和学习能力, 提高了模型的识别准确率 | ShipsEar | 5类 |
| | | 加入逐层聚合的Token机制, 同时兼顾全局信息和局部特性, 提高了模型的识别准确率 | ShipsEar, DeepShip | 5类 |
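Several Table 3 entries (and the architecture of Fig. 9) combine a convolutional front end with a bidirectional recurrent network so that local spectral patterns and temporal context are modeled jointly. The sketch below shows one plausible way to wire a CNN into a Bi-LSTM in PyTorch; the tensor shapes, hidden size and 5-class head are assumptions for illustration, not the settings of the cited works.

```python
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    """Sketch of a CNN + Bi-LSTM classifier: convolutions capture local spectral
    patterns, the Bi-LSTM models temporal context in both directions."""
    def __init__(self, n_mels=64, n_classes=5, hidden=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),              # pool frequency, keep time resolution
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d((2, 1)),
        )
        self.lstm = nn.LSTM(input_size=64 * (n_mels // 4), hidden_size=hidden,
                            batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                      # x: (batch, 1, n_mels, n_frames)
        f = self.conv(x)                       # (batch, 64, n_mels // 4, n_frames)
        b, c, m, t = f.shape
        seq = f.permute(0, 3, 1, 2).reshape(b, t, c * m)   # one vector per frame
        out, _ = self.lstm(seq)
        return self.head(out[:, -1])           # classify from the last time step

if __name__ == "__main__":
    model = CNNBiLSTM()
    print(model(torch.randn(4, 1, 64, 100)).shape)   # torch.Size([4, 5])
```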
表 4 基于迁移学习的水声目标识别方法
Table 4 Transfer learning-based methods for UATR
| 年份 | 技术特点 | 模型优劣分析 | 数据集来源 | 类别数 |
| 2019 | 基于ResNet的迁移学习[98] | 在保证较高性能的同时减少对标签样本的需求, 但不同领域任务的数据特征分布存在固有偏差 | Whale FM website | 16类 |
| 2020 | 基于ResNet的迁移学习[99] | 在预训练模型的基础上设计模型集成机制, 提升识别性能的同时减少了对标签样本的需求, 但不同领域任务的数据特征分布存在固有偏差 | 私有数据集 | 2类 |
| | 基于CNN的迁移学习[102] | 使用AudioSet音频数据集进行预训练, 减轻了不同领域任务的数据特征分布所存在的固有偏差 | — | — |
| 2022 | 基于VGGish的迁移学习[103] | 除了使用AudioSet数据集进行预训练, 还设计基于时频分析与注意力机制结合的特征提取模块, 提高了模型的泛化能力 | ShipsEar | 5类 |
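The transfer learning entries in Table 4 reuse a backbone pretrained on a large out-of-domain corpus (e.g., AudioSet or ImageNet) and fine-tune only a small part of it on limited sonar data. The sketch below follows that idea with a torchvision ResNet18 pretrained on ImageNet; the choice of backbone, the single-channel input adaptation and the 5-class head are assumptions, not the setup of the cited works.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_transfer_model(n_classes=5, freeze_backbone=True):
    """ResNet18 backbone with pretrained weights; when freeze_backbone is True,
    only the newly added layers are trained, which reduces the need for labeled
    sonar samples."""
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    if freeze_backbone:
        for p in model.parameters():
            p.requires_grad = False
    # Spectrograms are single-channel, so replace the first conv layer
    # (its pretrained RGB weights are discarded) and the classification head.
    model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
    model.fc = nn.Linear(model.fc.in_features, n_classes)
    return model

if __name__ == "__main__":
    net = build_transfer_model()
    print(net(torch.randn(2, 1, 224, 224)).shape)    # torch.Size([2, 5])
```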
表 5 基于无监督学习和自监督学习的水声目标识别方法
Table 5 Unsupervised and self-supervised learning-based methods for UATR
| 年份 | 技术特点 | 模型优劣分析 | 数据集来源 | 类别数 |
| 2013 | 深度置信网络[104-108] | 对标注数据集的需求小, 但由于训练数据少, 容易出现过拟合的风险 | 私有数据集 | 40类 |
| 2017 | | 加入混合正则化策略, 增强了所学到音频表征的判别性, 提高了模型的识别准确率 | | 3类 |
| 2018 | | 加入竞争机制, 增强了所学到音频表征的判别性, 提高了模型的识别准确率 | | 2类 |
| 2018 | | 加入压缩机制, 减少了模型的冗余参数, 提升了模型的识别准确率 | 中国南海 | 2类 |
| 2021 | 基于RBM自编码器与重构的SSL[111] | 降低了模型对标签数据的需求, 增强了模型的泛化性和可扩展性 | ShipsEar | 5类 |
| 2022 | 基于掩码建模与重构的SSL[113] | 使用掩码建模与多表征重构策略, 提升了模型对特征的学习能力, 从而提升了识别性能 | DeepShip | 5类 |
| 2023 | 基于自监督对比学习的SSL[112] | 降低了模型对标签数据的需求, 增强了学习到的音频特征的泛化性和对数据的适应能力 | ShipsEar, DeepShip | 5类 |
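The masked-modeling entry in Table 5 learns representations without labels by hiding part of the input spectrogram and training the network to reconstruct it. The following toy PyTorch sketch illustrates that training signal only; the tiny encoder/decoder and the patch-masking scheme are illustrative assumptions and far simpler than the cited masked-modeling work [113].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMaskedAutoencoder(nn.Module):
    """Minimal masked-reconstruction setup: random time-frequency patches of a
    spectrogram are zeroed out and the network is trained, without any class
    labels, to predict the original values in those patches."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Conv2d(32, 1, 3, padding=1)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def random_mask(x, patch=8, ratio=0.5):
    """Return a {0,1} mask over spectrogram patches; 0 marks masked regions."""
    b, _, f, t = x.shape
    coarse = (torch.rand(b, 1, f // patch, t // patch) > ratio).float()
    return F.interpolate(coarse, size=(f, t), mode="nearest")

if __name__ == "__main__":
    model = TinyMaskedAutoencoder()
    spec = torch.randn(4, 1, 64, 128)                  # fake log-Mel spectrograms
    mask = random_mask(spec)
    recon = model(spec * mask)                         # reconstruct from the masked input
    loss = F.mse_loss(recon * (1 - mask), spec * (1 - mask))  # loss only on masked patches
    loss.backward()
    print(float(loss))
```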
表 6 常用的公开水声数据集总结
Table 6 Summary of commonly used public underwater acoustic signal datasets
| 数据集名称 | 类别 | 样本数 (个) | 持续时间 (样本) | 总时间 | 数据类型 | 采样频率 (kHz) | 获取地址 |
| DeepShip | Cargo Ship | 110 | 180 ~ 610 s | 10 h 40 min | 音频 | 32 | DeepShip |
| | Tug | 70 | 180 ~ 1140 s | 11 h 17 min | | | |
| | Passenger Ship | 70 | 6 ~ 1530 s | 12 h 22 min | | | |
| | Tanker | 70 | 6 ~ 700 s | 12 h 45 min | | | |
| ShipsEar | A | 16 | — | 1729 s | 音频 | 32 | ShipsEar |
| | B | 17 | | 1435 s | | | |
| | C | 27 | | 4054 s | | | |
| | D | 9 | | 2041 s | | | |
| | E | 12 | | 923 s | | | |
| Ocean Networks Canada (ONC) | Background noise | 17000 | 3 s | 8 h 30 min | 音频 | — | ONC |
| | Cargo | 17000 | | 8 h 30 min | | | |
| | Passenger Ship | 17000 | | 8 h 30 min | | | |
| | Pleasure craft | 17000 | | 8 h 30 min | | | |
| | Tug | 17000 | | 8 h 30 min | | | |
| Five-element acoustic dataset | 9个类别 | 360 | 0.5 s | 180 s | 音频 | — | Five-element··· |
| Historic Naval Sound and Video | 16个类别 | — | 2.5 s | — | 音视频 | 10.24 | Historic Naval··· |
| DIDSON | 8个类别 | 524 | — | — | — | — | DIDSON |
| Whale FM | Pilot whale | 10858 | 1 ~ 8 s | 5 h 35 min | 音频 | — | Whale FM |
| | Killer whale | 4673 | | | | | |
注: 1. 官方给出的DeepShip数据集只包含Cargo Ship、Tug、Passenger Ship和Tanker这4个类别. 在实际研究中, 学者通常会自定义一个新的类别“Background noise”作为第5类.
2. 获取地址的访问时间为2023-07-20.
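When working with the recording-level datasets in Table 6 (e.g., ShipsEar or DeepShip), a common first step is to cut each long recording into fixed-length clips before feature extraction. The sketch below shows one simple way to do this with librosa; the directory layout and clip length are hypothetical and not prescribed by the datasets themselves.

```python
import numpy as np
import librosa

def segment_recording(wav_path, sr=32000, clip_seconds=3.0):
    """Split one long recording into fixed-length, non-overlapping clips,
    a common preprocessing step for ShipsEar/DeepShip-style datasets."""
    y, sr = librosa.load(wav_path, sr=sr, mono=True)
    clip_len = int(sr * clip_seconds)
    n_clips = len(y) // clip_len
    if n_clips == 0:                                   # recording shorter than one clip
        return np.empty((0, clip_len), dtype=y.dtype)
    return np.stack([y[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)])

if __name__ == "__main__":
    clips = segment_recording("shipsear/classA/example.wav")  # hypothetical path
    print(clips.shape)                                 # (n_clips, sr * clip_seconds)
```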
[1] Cho H, Gu J, Yu S C. Robust sonar-based underwater object recognition against angle-of-view variation. IEEE Sensors Journal, 2016, 16(4): 1013?1025 doi: 10.1109/JSEN.2015.2496945 [2] D?stner K, Roseneckh-K?hler B V H Z, Opitz F, Rottmaier M, Schmid E. Machine learning techniques for enhancing maritime surveillance based on GMTI radar and AIS. In: Proceedings of the 19th International Radar Symposium (IRS). Bonn, Germany: IEEE, 2018. 1?10 [3] Terayama K, Shin K, Mizuno K, Tsuda K. Integration of sonar and optical camera images using deep neural network for fish monitoring. Aquacultural Engineering, 2019, 86: Article No. 102000 doi: 10.1016/j.aquaeng.2019.102000 [4] Choi J, Choo Y, Lee K. Acoustic classification of surface and underwater vessels in the ocean using supervised machine learning. Sensors, 2019, 19(16): Article No. 3492 doi: 10.3390/s19163492 [5] Marszal J, Salamon R. Detection range of intercept sonar for CWFM signals. Archives of Acoustics, 2014, 39: 215?230 [6] 孟慶昕. 海上目標被動(dòng)識別方法研究 [博士學(xué)位論文], 哈爾濱工程大學(xué), 中國, 2016.Meng Qing-Xin. Research on Passive Recognition Methods of Marine Targets [Ph.D. dissertation], Harbin Engineering University, China, 2016. [7] Yang H, Lee K, Choo Y, Kim K. Underwater acoustic research trends with machine learning: General background. Journal of Ocean Engineering and Technology, 2020, 34(2): 147?154 doi: 10.26748/KSOE.2020.015 [8] 方世良, 杜栓平, 羅昕煒, 韓寧, 徐曉男. 水聲目標特征分析與識別技術(shù). 中國科學(xué)院院刊, 2019, 34(3): 297?305Fang Shi-Liang, Du Shuan-Ping, Luo Xi-Wei, Han Ning, Xu Xiao-Nan. Development of underwater acoustic target feature analysis and recognition technology. Bulletin of Chinese Academy of Sciences, 2019, 34(3): 297?305 [9] Domingos L C, Santos P E, Skelton P S, Brinkworth R S, Sammut K. A survey of underwater acoustic data classification methods using deep learning for shoreline surveillance. Sensors, 2022, 22(6): Article No. 2181 doi: 10.3390/s22062181 [10] Luo X W, Chen L, Zhou H L, Cao H L. A survey of underwater acoustic target recognition methods based on machine learning. Journal of Marine Science and Engineering, 2023, 11(2): Article No. 384 doi: 10.3390/jmse11020384 [11] Kumar S, Phadikar S, Majumder K. Modified segmentation algorithm based on short term energy & zero crossing rate for Maithili speech signal. In: Proceedings of the International Conference on Accessibility to Digital World (ICADW). Guwahati, India: IEEE, 2016. 169?172 [12] Boashash B. Time-frequency Signal Analysis and Processing: A Comprehensive Reference. Pittsburgh: Academic Press, 2015. [13] Gopalan K. Robust watermarking of music signals by cepstrum modification. In: Proceedings of the IEEE International Symposium on Circuits and Systems. Kobe, Japan: IEEE, 2005. 4413?4416 [14] Lee C H, Shih J L, Yu K M, Lin H S. Automatic music genre classification based on modulation spectral analysis of spectral and cepstral features. IEEE Transactions on Multimedia, 2009, 11(4): 670?682 doi: 10.1109/TMM.2009.2017635 [15] Tsai W H, Lin H P. Background music removal based on cepstrum transformation for popular singer identification. IEEE Transactions on Audio, Speech, and Language Processing, 2011, 19(5): 1196?1205 doi: 10.1109/TASL.2010.2087752 [16] Oppenheim A V, Schafer R W. From frequency to quefrency: A history of the cepstrum. IEEE Signal Processing Magazine, 2004, 21(5): 95?106 doi: 10.1109/MSP.2004.1328092 [17] Stevens S S, Volkmann J, Newman E B. A scale for the measurement of the psychological magnitude pitch. 
The Journal of the Acoustical Society of America, 1937, 8(3): 185?190 doi: 10.1121/1.1915893 [18] Oxenham A J. How we hear: The perception and neural coding of sound. Annual Review of Psychology, 2018, 69: 27?50 doi: 10.1146/annurev-psych-122216-011635 [19] Donald D A, Everingham Y L, McKinna L W, Coomans D. Feature selection in the wavelet domain: Adaptive wavelets. Comprehensive Chemometrics, 2009: 647?679 [20] Souli S, Lachiri Z. Environmental sound classification using log-Gabor filter. In: Proceedings of the IEEE 11th International Conference on Signal Processing. Beijing, China: IEEE, 2012. 144?147 [21] Costa Y, Oliveira L, Koerich A, Gouyon F. Music genre recognition using Gabor filters and LPQ texture descriptors. In: Proceedings of the Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. Havana, Cuba: Springer, 2013. 67?74 [22] Ezzat T, Bouvrie J V, Poggio T A. Spectro-temporal analysis of speech using 2-D Gabor filters. In: Proceedings of 8th Annual Conference of the International Speech Communication Association. Antwerp, Belgium: ISCA, 2007. 506?509 [23] He L, Lech M, Maddage N, Allen N. Stress and emotion recognition using log-Gabor filter analysis of speech spectrograms. In: Proceedings of the 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops. Amsterdam, Netherlands: IEEE, 2009. 1?6 [24] 蔡悅斌, 張明之, 史習智, 林良驥. 艦船噪聲波形結構特征提取及分類(lèi)研究. 電子學(xué)報, 1999, 27(6): 129?130 doi: 10.3321/j.issn:0372-2112.1999.06.030Cai Yue-Bin, Zhang Ming-Zhi, Shi Xi-Zhi, Lin Liang-Ji. The feature extraction and classification of ocean acoustic signals based on wave structure. Acta Electronica Sinica, 1999, 27(6): 129?130 doi: 10.3321/j.issn:0372-2112.1999.06.030 [25] Meng Q X, Yang S. A wave structure based method for recognition of marine acoustic target signals. The Journal of the Acoustical Society of America, 2015, 137(4): 2242 [26] Meng Q X, Yang S, Piao S C. The classification of underwater acoustic target signals based on wave structure and support vector machine. The Journal of the Acoustical Society of America, 2014, 136(4_Supplement): 2265 [27] Rajagopal R, Sankaranarayanan B, Rao P R. Target classification in a passive sonar——An expert system approach. In: Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. Albuquerque, USA: IEEE, 1990. 2911?2914 [28] Lourens J G. Classification of ships using underwater radiated noise. In: Proceedings of the Southern African Conference on Communications and Signal Processing (COMSIG). Pretoria, South Africa: IEEE, 1988. 130?134 [29] Liu Q Y, Fang S L, Cheng Q, Cao J, An L, Luo X W. Intrinsic mode characteristic analysis and extraction in underwater cylindrical shell acoustic radiation. Science China Physics, Mechanics and Astronomy, 2013, 56: 1339?1345 doi: 10.1007/s11433-013-5116-3 [30] Das A, Kumar A, Bahl R. Marine vessel classification based on passive sonar data: The cepstrum-based approach. IET Radar, Sonar & Navigation, 2013, 7(1): 87?93 [31] Wang S G, Zeng X Y. Robust underwater noise targets classification using auditory inspired time-frequency analysis. Applied Acoustics, 2014, 78: 68?76 doi: 10.1016/j.apacoust.2013.11.003 [32] Kim K I, Pak M I, Chon B P, Ri C H. A method for underwater acoustic signal classification using convolutional neural network combined with discrete wavelet transform. International Journal of Wavelets, Multiresolution and Information Processing, 2021, 19(4): Article No. 
2050092 doi: 10.1142/S0219691320500927 [33] Qiao W B, Khishe M, Ravakhah S. Underwater targets classification using local wavelet acoustic pattern and multi-layer perceptron neural network optimized by modified whale optimization algorithm. Ocean Engineering, 2021, 219: Article No. 108415 doi: 10.1016/j.oceaneng.2020.108415 [34] Wei X, Li G H, Wang Z Q. Underwater target recognition based on wavelet packet and principal component analysis. Computer Simulation, 2011, 28(8): 8?290 [35] Xu K L, You K, Feng M, Zhu B Q. Trust-worth multi-representation learning for audio classification with uncertainty estimation. The Journal of the Acoustical Society of America, 2023, 153(3_supplement): Article No. 125 [36] 徐新洲, 羅昕煒, 方世良, 趙力. 基于聽(tīng)覺(jué)感知機理的水下目標識別研究進(jìn)展. 聲學(xué)技術(shù), 2013, 32(2): 151?158Xu Xin-Zhou, Luo Xi-Wei, Fang Shi-Liang, Zhao Li. Research process of underwater target recognition based on auditory perceiption mechanism. Technical Acoustics, 2013, 32(2): 151?158 [37] Békésy G V. On the elasticity of the cochlear partition. The Journal of the Acoustical Society of America, 1948, 20(3): 227?241 doi: 10.1121/1.1906367 [38] Johnstone B M, Yates G K. Basilar membrane tuning curves in the guinea pig. The Journal of the Acoustical Society of America, 1974, 55(3): 584?587 doi: 10.1121/1.1914568 [39] Zwislocki J J. Five decades of research on cochlear mechanics. The Journal of the Acoustical Society of America, 1980, 67(5): 1679?1685 [40] 費鴻博, 吳偉官, 李平, 曹毅. 基于梅爾頻譜分離和LSCNet的聲學(xué)場(chǎng)景分類(lèi)方法. 哈爾濱工業(yè)大學(xué)學(xué)報, 2022, 54(5): 124?130Fei Hong-Bo, Wu Wei-Guan, Li Ping, Cao Yi. Acoustic scene classification method based on Mel-spectrogram separation and LSCNet. Journal of Harbin Institute of Technology, 2022, 54(5): 124?130 [41] Yang H H, Li J H, Shen S, Xu G H. A deep convolutional neural network inspired by auditory perception for underwater acoustic target recognition. Sensors, 2019, 19(5): Article No. 1104 doi: 10.3390/s19051104 [42] Shen S, Yang H H, Yao X H, Li J H, Xu G H, Sheng M P. Ship type classification by convolutional neural networks with auditory-like mechanisms. Sensors, 2020, 20(1): Article No. 253 doi: 10.3390/s20010253 [43] Doan V S, Huynh-The T, Kim D S. Underwater acoustic target classification based on dense convolutional neural network. IEEE Geoscience and Remote Sensing Letters, 2020, 19: Article No. 1500905 [44] Miao Y C, Zakharov Y V, Sun H X, Li J H, Wang J F. Underwater acoustic signal classification based on sparse time-frequency representation and deep learning. IEEE Journal of Oceanic Engineering, 2021, 46(3): 952?962 doi: 10.1109/JOE.2020.3039037 [45] Hu G, Wang K J, Liu L L. Underwater acoustic target recognition based on depthwise separable convolution neural networks. Sensors, 2021, 21(4): Article No. 1429 doi: 10.3390/s21041429 [46] Xiao X, Wang W B, Ren Q Y, Gerstoft P, Ma L. Underwater acoustic target recognition using attention-based deep neural network. JASA Express Letters, 2021, 1(10): Article No. 106001 doi: 10.1121/10.0006299 [47] Gong Y, Chung Y A, Glass J. AST: Audio spectrogram Transformer. arXiv preprint arXiv: 2104.01778, 2021. [48] Yang H H, Li J H, Sheng M P. Underwater acoustic target multi-attribute correlation perception method based on deep learning. Applied Acoustics, 2022, 190: Article No. 108644 doi: 10.1016/j.apacoust.2022.108644 [49] Gong Y, Lai C I J, Chung Y A, Glass J. SSAST: Self-supervised audio spectrogram transformer. In: Proceedings of the AAAI Conference on Artificial Intelligence. Vancouver, Canada: AAAI, 2022. 
10699?10709 [50] He K M, Chen X L, Xie S N, Li Y H, Dollár P, Girshick R. Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 15979?15988 [51] Baade A, Peng P Y, Harwath D. MAE-AST: Masked autoencoding audio spectrogram Transformer. arXiv preprint arXiv: 2203.16691, 2022. [52] Ghosh S, Seth A, Umesh S, Manocha D. MAST: Multiscale audio spectrogram Transformers. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Rhodes Island, Greece: IEEE, 2023. 1?5 [53] Domingos P. A few useful things to know about machine learning. Communications of the ACM, 2012, 55(10): 78?87 doi: 10.1145/2347736.2347755 [54] Kelly J G, Carpenter R N, Tague J A. Object classification and acoustic imaging with active sonar. The Journal of the Acoustical Society of America, 1992, 91(4): 2073?2081 doi: 10.1121/1.403693 [55] Shi H, Xiong J Y, Zhou C Y, Yang S. A new recognition and classification algorithm of underwater acoustic signals based on multi-domain features combination. In: Proceedings of the IEEE/OES China Ocean Acoustics (COA). Harbin, China: IEEE, 2016. 1?7 [56] Li H T, Cheng Y S, Dai W G, Li Z Z. A method based on wavelet packets-fractal and SVM for underwater acoustic signals recognition. In: Proceedings of the 12th International Conference on Signal Processing (ICSP). Hangzhou, China: IEEE, 2014. 2169?2173 [57] Sherin B M, Supriya M H. SOS based selection and parameter optimization for underwater target classification. In: Proceedings of the OCEANS 2016 MTS/IEEE Monterey. Monterey, USA: IEEE, 2016. 1?4 [58] Lian Z X, Xu K, Wan J W, Li G. Underwater acoustic target classification based on modified GFCC features. In: Proceedings of the IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC). Chongqing, China: IEEE, 2017. 258?262 [59] Lian Z X, Xu K, Wan J W, Li G, Chen Y. Underwater acoustic target recognition based on Gammatone filterbank and instantaneous frequency. In: Proceedings of the IEEE 9th International Conference on Communication Software and Networks (ICCSN). Guangzhou, China: IEEE, 2017. 1207?1211 [60] Feroze K, Sultan S, Shahid S, Mahmood F. Classification of underwater acoustic signals using multi-classifiers. In: Proceedings of the 15th International Bhurban Conference on Applied Sciences and Technology (IBCAST). Islamabad, Pakistan: IEEE, 2018. 723?728 [61] Li Y X, Chen X, Yu J, Yang X H. A fusion frequency feature extraction method for underwater acoustic signal based on variational mode decomposition, duffing chaotic oscillator and a kind of permutation entropy. Electronics, 2019, 8(1): Article No. 61 doi: 10.3390/electronics8010061 [62] Aksüren ? G, Hocao?lu A K. Automatic target classification using underwater acoustic signals. In: Proceedings of the 30th Signal Processing and Communications Applications Conference (SIU). Safranbolu, Turkey: IEEE, 2022. 1?4 [63] Kim T, Bae K. HMM-based underwater target classification with synthesized active sonar signals. In: Proceedings of the 19th European Signal Processing Conference. Barcelona, Spain: IEEE, 2011. 1805?1808 [64] Li H F, Pan Y, Li J Q. Classification of underwater acoustic target using auditory spectrum feature and SVDD ensemble. In: Proceedings of the OCEANS-MTS/IEEE Kobe Techno-Oceans (OTO). Kobe, Japan: IEEE, 2018. 1?4 [65] Krizhevsky A, Sutskever I, Hinton G E. 
Imagenet classification with deep convolutional neural networks. Communication of the ACM, 2017, 60(6): 84?90 doi: 10.1145/3065386 [66] Yue H, Zhang L L, Wang D Z, Wang Y X, Lu Z Q. The classification of underwater acoustic targets based on deep learning methods. In: Proceedings of the 2nd International Conference on Control, Automation and Artificial Intelligence (CAAI 2017). Hainan, China: Atlantis Press, 2017. 526?529 [67] Kirsebom O S, Frazao F, Simard Y, Roy N, Matwin S, Giard S. Performance of a deep neural network at detecting North Atlantic right whale upcalls. The Journal of the Acoustical Society of America, 2020, 147(4): 2636?2646 doi: 10.1121/10.0001132 [68] Yin X H, Sun X D, Liu P S, Wang L, Tang R C. Underwater acoustic target classification based on LOFAR spectrum and convolutional neural network. In: Proceedings of the 2nd International Conference on Artificial Intelligence and Advanced Manufacture (AIAM). Manchester, United Kingdom: ACM, 2020. 59?63 [69] Jiang J J, Shi T, Huang M, Xiao Z Z. Multi-scale spectral feature extraction for underwater acoustic target recognition. Measurement, 2020, 166: Article No. 108227 doi: 10.1016/j.measurement.2020.108227 [70] Cao X, Togneri R, Zhang X M, Yu Y. Convolutional neural network with second-order pooling for underwater target classification. IEEE Sensors Journal, 2019, 19(8): 3058?3066 doi: 10.1109/JSEN.2018.2886368 [71] Hu G, Wang K J, Peng Y, Qiu M R, Shi J F, Liu L L. Deep learning methods for underwater target feature extraction and recognition. Computational Intelligence and Neuroscience, 2018, 2018: Article No. 1214301 [72] Zheng Y L, Gong Q Y, Zhang S F. Time-frequency feature-based underwater target detection with deep neural network in shallow sea. Journal of Physics: Conference Series, 2021, 1756: Article No. 012006 [73] Irfan M, Zheng J B, Ali S, Iqbal M, Masood Z, Hamid U. DeepShip: An underwater acoustic benchmark dataset and a separable convolution based autoencoder for classification. Expert Systems With Applications, 2021, 183: Article No. 115270 doi: 10.1016/j.eswa.2021.115270 [74] Wang B, Zhang W, Zhu Y N, Wu C X, Zhang S Z. An underwater acoustic target recognition method based on AMNet. IEEE Geoscience and Remote Sensing Letters, 2023, 20: Article No. 5501105 [75] Li Y X, Gao P Y, Tang B Z, Yi Y M, Zhang J J. Double feature extraction method of ship-radiated noise signal based on slope entropy and permutation entropy. Entropy, 2022, 24(1): Article No. 22 [76] Hong F, Liu C W, Guo L J, Chen F, Feng H H. Underwater acoustic target recognition with a residual network and the optimized feature extraction method. Applied Sciences, 2021, 11(4): Article No. 1442 doi: 10.3390/app11041442 [77] Chen S H, Tan X L, Wang B, Lu H C, Hu X L, Fu Y. Reverse attention-based residual network for salient object detection. IEEE Transactions on Image Processing, 2020, 29: 3763?3776 doi: 10.1109/TIP.2020.2965989 [78] Lu Z Y, Xu B, Sun L, Zhan T M, Tang S Z. 3-D channel and spatial attention based multiscale spatial-spectral residual network for hyperspectral image classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2020, 13: 4311?4324 doi: 10.1109/JSTARS.2020.3011992 [79] Fan R Y, Wang L Z, Feng R Y, Zhu Y Q. Attention based residual network for high-resolution remote sensing imagery scene classification. In: Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS). Yokohama, Japan: IEEE, 2019. 1346?1349 [80] Tripathi A M, Mishra A. 
Environment sound classification using an attention-based residual neural network. Neurocomputing, 2021, 460: 409?423 doi: 10.1016/j.neucom.2021.06.031 [81] Liu C W, Hong F, Feng H H, Hu M L. Underwater acoustic target recognition based on dual attention networks and multiresolution convolutional neural networks. In: Proceedings of the OCEANS 2021: San Diego-Porto. San Diego, USA: IEEE, 2021. 1?5 [82] Xue L Z, Zeng X Y, Jin A Q. A novel deep-learning method with channel attention mechanism for underwater target recognition. Sensors, 2022, 22(15): Article No. 5492 doi: 10.3390/s22155492 [83] Li J, Wang B X, Cui X R, Li S B, Liu J H. Underwater acoustic target recognition based on attention residual network. Entropy, 2022, 24(11): Article No. 1657 doi: 10.3390/e24111657 [84] Park D S, Chan W, Zhang Y, Chiu C C, Zoph B, Cubuk E D, et al. SpecAugment: A simple data augmentation method for automatic speech recognition. arXiv preprint arXiv: 1904.08779, 2019. [85] Luo X W, Zhang M H, Liu T, Huang M, Xu X G. An underwater acoustic target recognition method based on spectrograms with different resolutions. Journal of Marine Science and Engineering, 2021, 9(11): Article No. 1246 doi: 10.3390/jmse9111246 [86] Szegedy C, Ioffe S, Vanhoucke V, Alemi A. Inception-v4, inception-ResNet and the impact of residual connections on learning. arXiv preprint arXiv: 1602.07261, 2016. [87] Ren J W, Huang Z Q, Li C, Guo X Y, Xu J. Feature analysis of passive underwater targets recognition based on deep neural network. In: Proceedings of the OCEANS 2019-Marseille. Marseille, France: IEEE, 2019. 1?5 [88] Ren J W, Xie Y, Zhang X W, Xu J. UALF: A learnable front-end for intelligent underwater acoustic classification system. Ocean Engineering, 2022, 264: Article No. 112394 doi: 10.1016/j.oceaneng.2022.112394 [89] Li S C, Yang S Y, Liang J H. Recognition of ships based on vector sensor and bidirectional long short-term memory networks. Applied Acoustics, 2020, 164: Article No. 107248 doi: 10.1016/j.apacoust.2020.107248 [90] Wang Y, Zhang H, Xu L W, Cao C H, Gulliver T A. Adoption of hybrid time series neural network in the underwater acoustic signal modulation identification. Journal of the Franklin Institute, 2020, 357(18): 13906?13922 doi: 10.1016/j.jfranklin.2020.09.047 [91] Qi P Y, Sun J G, Long Y F, Zhang L G, Tian Y. Underwater acoustic target recognition with fusion feature. In: Proceedings of the 28th International Conference on Neural Information Processing. Sanur, Indonesia: Springer, 2021. 609?620 [92] Kamal S, Chandran C S, Supriya M H. Passive sonar automated target classifier for shallow waters using end-to-end learnable deep convolutional LSTMs. Engineering Science and Technology, an International Journal, 2021, 24(4): 860?871 doi: 10.1016/j.jestch.2021.01.014 [93] Han X C, Ren C X, Wang L M, Bai Y J. Underwater acoustic target recognition method based on a joint neural network. PLoS ONE, 2022, 17(4): Article No. e0266425 doi: 10.1371/journal.pone.0266425 [94] Li P, Wu J, Wang Y X, Lan Q, Xiao W B. STM: Spectrogram Transformer model for underwater acoustic target recognition. Journal of Marine Science and Engineering, 2022, 10(10): Article No. 1428 doi: 10.3390/jmse10101428 [95] Feng S, Zhu X Q. A Transformer-based deep learning network for underwater acoustic target recognition. IEEE Geoscience and Remote Sensing Letters, 2022, 19: Article No. 1505805 [96] Ying J J C, Lin B H, Tseng V S, Hsieh S Y. Transfer learning on high variety domains for activity recognition. 
In: Proceedings of the ASE BigData & SocialInformatics. Taiwan, China: ACM, 2015. 1?6 [97] Zhang Y K, Guo X S, Leung H, Li L. Cross-task and cross-domain SAR target recognition: A meta-transfer learning approach. Pattern Recognition, 2023, 138: Article No. 109402 doi: 10.1016/j.patcog.2023.109402 [98] Zhang L L, Wang D Z, Bao C C, Wang Y X, Xu K L. Large-scale whale-call classification by transfer learning on multi-scale waveforms and time-frequency features. Applied Sciences, 2019, 9(5): Article No. 1020 doi: 10.3390/app9051020 [99] Zhong M, Castellote M, Dodhia R, Ferres J L, Keogh M, Brewer A. Beluga whale acoustic signal classification using deep learning neural network models. The Journal of the Acoustical Society of America, 2020, 147(3): 1834?1841 doi: 10.1121/10.0000921 [100] Hitawala S. Evaluating ResNeXt model architecture for image classification. arXiv preprint arXiv: 1805.08700, 2018. [101] Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE, 2016. 2818?2826 [102] Kong Q Q, Cao Y, Iqbal T, Wang Y X, Wang W W, Plumbley M D. PANNs: Large-scale pretrained audio neural networks for audio pattern recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020, 28: 2880?2894 doi: 10.1109/TASLP.2020.3030497 [103] Li D H, Liu F, Shen T S, Chen L, Yang X D, Zhao D X. Generalizable underwater acoustic target recognition using feature extraction module of neural network. Applied Sciences, 2022, 12(21): Article No. 10804 doi: 10.3390/app122110804 [104] Kamal S, Mohammed S K, Pillai P R S, Supriya M H. Deep learning architectures for underwater target recognition. In: Proceedings of the Ocean Electronics (SYMPOL). Kochi, India: IEEE, 2013. 48?54 [105] Seok J. Active sonar target classification using multi-aspect sensing and deep belief networks. International Journal of Engineering Research and Technology, 2018, 11: 1999?2008 [106] 楊宏暉, 申昇, 姚曉輝, 韓振. 用于水聲目標特征學(xué)習與識別的混合正則化深度置信網(wǎng)絡(luò ). 西北工業(yè)大學(xué)學(xué)報, 2017, 35(2): 220?225Yang Hong-Hui, Shen Sheng, Yao Xiao-Hui, Han Zhen. Hybrid regularized deep belief network for underwater acoustic target feature learning and recognition. Journal of Northwestern Polytechnical University, 2017, 35(2): 220?225 [107] Yang H H, Shen S, Yao X H, Sheng M P, Wang C. Competitive deep-belief networks for underwater acoustic target recognition. Sensors, 2018, 18(4): Article No. 952 doi: 10.3390/s18040952 [108] Shen S, Yang H H, Sheng M P. Compression of a deep competitive network based on mutual information for underwater acoustic targets recognition. Entropy, 2018, 20(4): Article No. 243 doi: 10.3390/e20040243 [109] Cao X, Zhang X M, Yu Y, Niu L T. Deep learning-based recognition of underwater target. In: Proceedings of the IEEE International Conference on Digital Signal Processing (DSP). Beijing, China: IEEE, 2016. 89?93 [110] Luo X W, Feng Y L. An underwater acoustic target recognition method based on restricted Boltzmann machine. Sensors, 2020, 20(18): Article No. 5399 doi: 10.3390/s20185399 [111] Luo X W, Feng Y L, Zhang M H. An underwater acoustic target recognition method based on combined feature with automatic coding and reconstruction. IEEE Access, 2021, 9: 63841?63854 doi: 10.1109/ACCESS.2021.3075344 [112] Sun B G, Luo X W. Underwater acoustic target recognition based on automatic feature and contrastive coding. 
IET Radar, Sonar & Navigation, 2023, 17(8): 1277?1285 [113] You K, Xu K L, Feng M, Zhu B Q. Underwater acoustic classification using masked modeling-based swin Transformer. The Journal of the Acoustical Society of America, 2022, 152(4_Supplement): Article No. 296 [114] Xu K L, Xu Q S, You K, Zhu B Q, Feng M, Feng D W, et al. Self-supervised learning-based underwater acoustical signal classification via mask modeling. The Journal of the Acoustical Society of America, 2023, 154(1): 5?15 doi: 10.1121/10.0019937 [115] Le H T, Phung S L, Chapple P B, Bouzerdoum A, Ritz C H, Tran L C. Deep Gabor neural network for automatic detection of mine-like objects in sonar imagery. IEEE Access, 2020, 8: 94126?94139 doi: 10.1109/ACCESS.2020.2995390 [116] Huo G Y, Wu Z Y, Li J B. Underwater object classification in sidescan sonar images using deep transfer learning and semisynthetic training data. IEEE Access, 2020, 8: 47407?47418 doi: 10.1109/ACCESS.2020.2978880 [117] Berg H, Hjelmervik K T. Classification of anti-submarine warfare sonar targets using a deep neural network. In: Proceedings of the OCEANS 2018 MTS/IEEE Charleston. Charleston, USA: IEEE, 2018. 1?5 [118] Santos-Domínguez D, Torres-Guijarro S, Cardenal-López A, Pena-Gimenez A. ShipsEar: An underwater vessel noise database. Applied Acoustics, 2016, 113: 64?69 doi: 10.1016/j.apacoust.2016.06.008 [119] 劉妹琴, 韓學(xué)艷, 張森林, 鄭榮濠, 蘭劍. 基于水下傳感器網(wǎng)絡(luò )的目標跟蹤技術(shù)研究現狀與展望. 自動(dòng)化學(xué)報, 2021, 47(2): 235?251Liu Mei-Qin, Han Xue-Yan, Zhang Sen-Lin, Zheng Rong-Hao, Lan Jian. Research status and prospect of target tracking technologies via underwater sensor networks. Acta Automatic Sinica, 2021, 47(2): 235?251 [120] Xu K L, Zhu B Q, Kong Q Q, Mi H B, Ding B, Wang D Z, et al. General audio tagging with ensembling convolutional neural networks and statistical features. The Journal of the Acoustical Society of America, 2019, 145(6): EL521?EL527 doi: 10.1121/1.5111059 [121] Zhu B Q, Xu K L, Kong Q Q, Wang H M, Peng Y X. Audio tagging by cross filtering noisy labels. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2020, 28: 2073?2083 doi: 10.1109/TASLP.2020.3008832