Video Super-resolution via Hierarchical Feature Reuse
doi: 10.16383/j.aas.c210095  cstr: 32138.14.j.aas.c210095
1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072
Abstract: The performance improvement achieved by current deep convolutional neural network methods on the video super-resolution task is somewhat smaller than on the image super-resolution task, partly because they do not make full use of key inter-frame information contained in hierarchical features. To address this problem, we propose a hierarchical feature reuse network (HFRNet). The network retains the low-frequency content of the motion-compensated frames and uses dense hierarchical feature blocks (DHFBs) to adaptively fuse the features of the residual blocks inside each block; long-range feature reuse then fuses the features across multiple DHFBs, promoting the recovery of high-frequency detail. Experimental results show that the proposed method outperforms current methods in both quantitative and qualitative evaluations.

Keywords: Hierarchical feature reuse / Convolutional neural network / Feature fusion / Video super-resolution reconstruction
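As a reading aid, the following is a minimal PyTorch-style sketch of the architecture described in the abstract: each dense hierarchical feature block (DHFB) fuses the outputs of its internal residual blocks, and long-range feature reuse fuses the outputs of all DHFBs before adding back the shallow (low-frequency) features. The module names, channel widths, 1×1 fusion convolutions, and upsampling tail are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch (assumption, not the authors' code): DHFBs with intra-block
# fusion, long-range feature reuse across blocks, and a global low-frequency path.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class DHFB(nn.Module):
    """Dense hierarchical feature block: the outputs of all internal residual
    blocks are concatenated and fused adaptively by a 1x1 convolution."""
    def __init__(self, channels=64, num_res_blocks=6):
        super().__init__()
        self.blocks = nn.ModuleList([ResidualBlock(channels) for _ in range(num_res_blocks)])
        self.fuse = nn.Conv2d(channels * num_res_blocks, channels, 1)

    def forward(self, x):
        feats, out = [], x
        for block in self.blocks:
            out = block(out)
            feats.append(out)
        return x + self.fuse(torch.cat(feats, dim=1))  # local residual connection

class HFRNet(nn.Module):
    """Long-range feature reuse: the outputs of all DHFBs are concatenated,
    fused, and added to the shallow features that carry low-frequency content."""
    def __init__(self, in_channels=3, out_channels=3, channels=64,
                 num_dhfb=6, num_res_blocks=6, scale=2):
        super().__init__()
        self.head = nn.Conv2d(in_channels, channels, 3, padding=1)
        self.dhfbs = nn.ModuleList([DHFB(channels, num_res_blocks) for _ in range(num_dhfb)])
        self.fuse = nn.Conv2d(channels * num_dhfb, channels, 1)
        self.tail = nn.Sequential(
            nn.Conv2d(channels, channels * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(channels, out_channels, 3, padding=1))

    def forward(self, x):
        # x: reference frame and motion-compensated neighbours stacked along channels
        shallow = self.head(x)
        feats, out = [], shallow
        for dhfb in self.dhfbs:
            out = dhfb(out)
            feats.append(out)
        deep = self.fuse(torch.cat(feats, dim=1)) + shallow  # global residual path
        return self.tail(deep)
```

Under these assumptions, `HFRNet(num_dhfb=6, num_res_blocks=6)` would correspond to the R6D6 configuration compared in Table 1.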
Fig. 1 Architecture of the hierarchical feature reuse network (HFRNet)
Fig. 3 Average PSNR and SSIM obtained by our method and other methods on the VIDEO4 and Myanmar datasets
Fig. 4 Qualitative super-resolution comparison of HFRNet with other models on an image from the VIDEO4 dataset
Fig. 5 Qualitative super-resolution comparison of HFRNet with other models on an image from the Myanmar dataset
Fig. 6 Qualitative comparison of the reconstruction details of HFRNet and other models
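For reference, the average PSNR values reported in Fig. 3 and in the tables below are conventionally computed per frame and then averaged over each test sequence. A minimal sketch of that evaluation is given here, assuming 8-bit frames and the standard PSNR definition; the paper does not state whether metrics are computed on the luminance channel or on RGB, and the helper names are illustrative.

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two frames of the same shape."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def average_psnr(reference_frames, reconstructed_frames):
    """Per-frame PSNR averaged over one test sequence (e.g. CITY or CALENDAR)."""
    values = [psnr(r, s) for r, s in zip(reference_frames, reconstructed_frames)]
    return sum(values) / len(values)
```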
Table 1 Impact (PSNR (dB)) of the number of DHFBs (D) and the number of residual blocks per DHFB (R) on 2× super-resolution reconstruction
Configuration | CITY | WALK | FOLIAGE | CALENDAR | Average PSNR
R4D6 | 34.342 | 36.846 | 32.045 | 27.071 | 32.576
R6D4 | 34.339 | 37.101 | 32.117 | 27.067 | 32.656
R6D6 | 34.896 | 37.210 | 32.224 | 27.137 | 32.866
R6D8 | 34.901 (±0.035) | 37.102 (±0.054) | 32.187 (±0.069) | 27.140 (±0.007) | 32.833 (±0.041)
R8D6 | 34.633 (±0.039) | 36.873 (±0.025) | 32.144 (±0.050) | 27.109 (±0.019) | 32.690 (±0.034)

Table 2 Average PSNR and number of parameters for different network architectures
Scale | Architecture | Parameters | CITY (dB) | WALK (dB) | FOLIAGE (dB) | CALENDAR (dB) | Average PSNR (dB)
×2 | Without hierarchical feature reuse | 2.85 M | 33.793 | 35.919 | 31.884 | 26.291 | 31.972
×2 | HFRNet(a) | 3.01 M | 34.896 | 37.210 | 32.224 | 27.137 | 32.866
×2 | HFRNet(b) | 3.10 M | 35.104 | 37.218 | 32.230 | 27.158 | 32.927
×3 | Without hierarchical feature reuse | 2.85 M | 27.220 | 30.113 | 27.019 | 23.344 | 26.924
×3 | HFRNet(a) | 3.01 M | 28.235 | 31.513 | 27.539 | 24.190 | 27.869
×3 | HFRNet(b) | 3.10 M | 28.240 | 31.613 | 27.587 | 24.217 | 27.914

Table 3 Impact (PSNR (dB)) of different optical flow estimation methods on super-resolution reconstruction performance
Scale | Optical flow method | CITY | WALK | FOLIAGE | CALENDAR | Average PSNR
×2 | CNN-based | 35.226 | 37.106 | 32.244 | 27.817 | 33.098
×2 | CLG-TV | 35.104 | 37.218 | 32.230 | 27.158 | 32.927
×3 | CNN-based | 28.255 | 32.103 | 27.590 | 24.766 | 28.179
×3 | CLG-TV | 28.240 | 31.613 | 27.587 | 24.217 | 27.914

Table 4 Average PSNR (dB) of video super-resolution with different motion compensation algorithms
Scale | MC (k = 0.050) | MC (k = 0.100) | MC (k = 0.125) | MC (k = 0.175) | AMC
×2 | 32.493 | 32.510 | 32.714 | 32.615 | 32.927
×3 | 27.505 | 27.684 | 27.822 | 27.694 | 27.914
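As background for Tables 3 and 4, motion compensation in this setting is usually implemented by backward-warping each neighbouring frame toward the reference frame with the estimated optical flow before the aligned frames are passed to the reconstruction network. The sketch below shows such a warp with bilinear sampling; it is an illustrative assumption only, does not reproduce the MC/AMC variants or the parameter k compared in Table 4, and the flow could come from either CLG-TV or a CNN-based estimator as in Table 3.

```python
import torch
import torch.nn.functional as F

def warp(frame, flow):
    """Backward-warp `frame` (N, C, H, W) using optical flow `flow` (N, 2, H, W),
    where flow[:, 0] / flow[:, 1] are horizontal / vertical displacements in
    pixels from the reference frame to `frame`."""
    _, _, h, w = frame.shape
    # Base sampling grid in pixel coordinates, on the same device as the input.
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=frame.dtype, device=frame.device),
        torch.arange(w, dtype=frame.dtype, device=frame.device),
        indexing='ij')
    grid_x = xs.unsqueeze(0) + flow[:, 0]  # sample location for every output pixel
    grid_y = ys.unsqueeze(0) + flow[:, 1]
    # Normalise coordinates to [-1, 1] as expected by grid_sample.
    grid = torch.stack((2.0 * grid_x / (w - 1) - 1.0,
                        2.0 * grid_y / (h - 1) - 1.0), dim=-1)
    return F.grid_sample(frame, grid, mode='bilinear',
                         padding_mode='border', align_corners=True)
```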
[1] Liu C, Sun D. On Bayesian adaptive video super resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(2): 346?360 doi: 10.1109/TPAMI.2013.127 [2] Shahar O, Faktor A, Irani M. Space-time super-resolution from a single video. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, USA: IEEE, 2011. 3353?3360 [3] Zhou Y, Wang Y, Zhang Y, Du X, Liu H, Li C. Manifold learning based super resolution for mixed-resolution multi-view video in visual internet of things. In: Proceedings of the International Conference on Artificial Intelligence for Communications and Networks. Harbin, China: Springer, 2019. 486?495 [4] Caballero J, Ledig C, Aitken A, Acosta A, Totz J, Wang Z, et al. Real-time video super-resolution with spatio-temporal networks and motion compensation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 2848?2857 [5] Tao X, Gao H, Liao R, Wang J, Jia J. Detail-revealing deep video super-resolution. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 4482?4490 [6] Kappeler A, Yoo S, Dai Q, Katsaggelos A K. Video super-resolution with convolutional neural networks. IEEE Transactions on Computational Imaging, 2016, 2(2): 109?122 doi: 10.1109/TCI.2016.2532323 [7] Li D, Wang Z. Video super resolution via motion compensation and deep residual learning. IEEE Transactions on Computational Imaging, 2017, 3(4): 749?762 doi: 10.1109/TCI.2017.2671360 [8] Zhou Y, Zhang Y, Xie X, Kung S-Y. Image super-resolution based on dense convolutional auto-encoder blocks. Neurocomputing, 2021, 423(1): 98?109 [9] 李金新, 黃志勇, 李文斌, 周登文. 基于多層次特征融合的圖像超分辨率重建. 自動(dòng)化學(xué)報, 2023, 49(1): 161?171Li Jin-Xin, Huang Zhi-Yong, Li Wen-Bin, Zhou Deng-Wen. Image super-resolution based on multi hierarchical features fusion network. Acta Automatica Sinica, 2023, 49(1): 161?171 [10] 張毅鋒, 劉袁, 蔣程, 程旭. 用于超分辨率重建的深度網(wǎng)絡(luò )遞進(jìn)學(xué)習方法. 自動(dòng)化學(xué)報, 2020, 46(2): 274?282Zhang Yi-Feng, Liu Yuan, Jiang Cheng, Cheng Xu. A curriculum learning approach for single image super resolution. Acta Automatica Sinica, 2020, 46(2): 274?282 [11] Zhou Y, Feng L, Hou C, Kung S-Y. Hyperspectral and multispectral image fusion based on local low rank and coupled spectral unmixing. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(10): 5997?6009 doi: 10.1109/TGRS.2017.2718728 [12] 周登文, 趙麗娟, 段然, 柴曉亮. 基于遞歸殘差網(wǎng)絡(luò )的圖像超分辨率重建. 自動(dòng)化學(xué)報, 2019, 45(6): 1157?1165Zhou Deng-Wen, Zhao Li-Juan, Duan Ran, Chai Xiao-Liang. Image super-resolution based on recursive residual networks. Acta Automatica Sinica, 2019, 45(6): 1157?1165 [13] 孫旭, 李曉光, 李嘉鋒, 卓力. 基于深度學(xué)習的圖像超分辨率復原研究進(jìn)展. 自動(dòng)化學(xué)報, 2017, 43(5): 697?709Sun Xu, Li Xiao-Guang, Li Jia-Feng, Zhuo Li. Review on deep learning based image super-resolution restoration algorithms. Acta Automatica Sinica, 2017, 43(5): 697?709 [14] Xie X K, Zhou Y, Kung S-Y. Exploiting operation importance for differentiable neural architecture earch. arXiv preprint arXiv: 1911.10511, 2019. [15] Huo S, Zhou Y, Xiang W, Kung S-Y. Semi-supervised learning based on a novel iterative optimization model for saliency detection. IEEE Transactions on Neural Network and Learning System, 2019, 30(1): 225?241 doi: 10.1109/TNNLS.2018.2809702 [16] Zhou Y, Mao A, Huo S, Lei J, Kung S-Y. Salient object detection via fuzzy theory and object-level enhancement. 
IEEE Transactions on Multimedia, 2019, 1(1): 74?85 [17] Jo Y, Oh S W, Kang J, Kim S J. Deep video super-resolution network using dynamic upsampling filters without explicit motion compensation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 3224?3232 [18] 潘志勇, 郁梅, 謝登梅, 宋洋, 蔣剛毅. 采用精簡(jiǎn)卷積神經(jīng)網(wǎng)絡(luò )的快速視頻超分辨率重建. 光電子 · 激光, 2018, 29(12): 1332?1341Pan Zhi-Yong, Yu Mei, Xie Deng-Mei, Song Yang, Jiang Gang-Yi. Fast video super-resolution reconstruction using a succinct convolutional neural network. Journal of Optoelectronics · Laser, 2018, 29(12): 1332?1341 [19] Drulea M, Nedevschi S. Total variation regularization of local-global optical flow. In: Proceedings of the IEEE Conference on Intelligent Transportation Systems. Washington D C, USA: IEEE, 2011. 318?323 [20] Lucas A, López-Tapia S, Molina R, Katsaggelos A K. Generative adversarial networks and perceptual losses for video super-resolution. IEEE Transactions on Image Processing, 2019, 28(7): 3312?3327 doi: 10.1109/TIP.2019.2895768 [21] Zhou Y, Yang J X, Li H R, Cao T, Kung S-Y. Adversarial learning for multiscale crowd counting under complex scenes. IEEE Transactions on Cybernetics, 2021, 51(11): 5423?5432 [22] Zhou Y, Huo S, Xiang W, Hou C, Kung S-Y. Semi-supervised salient object detection using a linear feedback control system model. IEEE Transactions on Cybernetics, 2019, 49(4): 1173?1185 doi: 10.1109/TCYB.2018.2793278 [23] Huo S, Zhou Y, Lei J, Ling N, Hou C. Iterative feedback control-based salient object segmentation. IEEE Transactions on Multimedia, 2018, 20(6): 1350?1364 doi: 10.1109/TMM.2017.2769801 [24] Zhou Y, Zhang T, Huo S, Hou C, Kung S-Y. Adaptive irregular graph construction based salient object detection. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(6): 1569?1582 doi: 10.1109/TCSVT.2019.2904463 [25] Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S E, Anguelov D, et al. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015. 1?9 [26] He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 770?778 [27] Huang G, Liu Z, Van Der Maaten L, Weinberger K Q. Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 2261?2269 [28] Zhang Y, Tian Y, Kong Y, Zhong B, Fu Y. Residual dense network for image super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 2472?2481 [29] Zhou Y, Du X T, Wang M F, Huo S W, Zhang Y D, Kung S-Y. Cross-scale residual network: A general framework for image super-resolution, denoising, and deblocking. IEEE Transactions on Cybernetics, 2022, 52(7): 5855?5867 [30] Yi P, Wang Z, Jiang K, Shao Z, Ma J. Multi-temporal ultra dense memory network for video super-resolution. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(8): 2503?2516 doi: 10.1109/TCSVT.2019.2925844 [31] Yi P, Wang Z, Jiang K, Jiang J, Ma J. Progressive fusion video super-resolution network via exploiting non-local spatio-temporal correlations. In: Proceedings of the IEEE International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 3106?3115 [32] Yi P, Wang Z Y, Jiang K, Jiang J J, Lu T, Ma J. 
A progressive fusion generative adversarial network for realistic and consistent video super-resolution. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(5): 2264?2280 [33] H Inc. Myanmar 60p [Online], available: http://www.harmoni-cinc.com/resources/videos/4k-video-clip-center, May 20, 2021 [34] Wang L, Guo Y, Liu L, Lin Z, Deng X, An W. Deep video super-resolution using HR optical flow estimation. IEEE Transactions on Image Processing, 2020, 29(1): 4323?4336 [35] Dong C, Loy C C, He K, Tang X. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(2): 295?307 doi: 10.1109/TPAMI.2015.2439281 [36] Li D, Liu Y, Wang Z. Video super-resolution using motion compensation and residual bidirectional recurrent convolutional network. In: Proceedings of the IEEE International Conference on Image Processing. Beijing, China: IEEE, 2017. 1642?1646 [37] Kim S Y, Lim J, Na T, Kim M. Video super-resolution based on 3D-CNNs with consideration of scene change. In: Proceedings of the IEEE International Conference on Image Processing. Taipei, China: IEEE, 2019. 2831?2835 [38] Wang Z, Yi P, Jiang K, Jiang J, Han Z, Lu T, et al. Multi-memory convolutional neural network for video super-resolution. IEEE Transactions on Image Processing, 2019, 28(5): 2530?2544 doi: 10.1109/TIP.2018.2887017 [39] Kim J, Lee J K, Lee K M. Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 1646?1654 [40] Lai W S, Huang J B, Ahuja N, Yang M H. Deep Laplacian pyramid networks for fast and accurate super-resolution. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 5835?5843 [41] Wang L, Guo Y, Lin Z, Deng X, An W. Learning for video super-resolution through HR optical flow estimation. In: Proceedings of the Asian Conference on Computer Vision. Perth, Australia: Springer, 2018. 514?529