Privacy Preservation Method for Vertical Federated Learning Based on Max-min Strategy
doi: 10.16383/j.aas.c211233
1. College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023
2. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023
3. Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou 310023
4. School of Cyber Science and Technology, Zhejiang University, Hangzhou 310007
5. College of Computer Science and Technology, Zhejiang University, Hangzhou 310007
Abstract: Vertical federated learning (VFL) is an emerging distributed machine learning technique that uses data held by different institutions to jointly train machine learning models while preserving privacy. It has been widely applied in fields such as the industrial internet, financial lending, and medical diagnosis, so ensuring its privacy and security is of great importance. First, targeting the privacy leakage risk caused by the embeddings that participants exchange in the VFL protocol, we propose a general property inference attack initiated by the server. The adversary uses auxiliary data and the exchanged embeddings to train an attack model, then uses the trained model to steal the target private property of a participant. Experimental results show that the embeddings generated by VFL during training and inference readily leak private information. To counter this leakage risk, we propose a privacy preservation method for vertical federated learning based on a max-min strategy (PPVFL), which introduces a gradient regularization component to preserve the main task's prediction performance during training and a reconstruction component to hide the private property information contained in the participants' embeddings. Finally, in a steel plate defect diagnosis industrial scenario, compared with VFL without any defense, the proposed method reduces attack inference accuracy from 95% to below 55%, which is close to random guessing, while the main task's prediction accuracy drops by only 2%.
Keywords: vertical federated learning / property inference attack / privacy preservation / max-min strategy / industrial internet
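The attack outlined in the abstract (train an attack model on auxiliary data and embeddings, then apply it to a participant's embeddings) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the synthetic `toy_embedding` generator, the binary private property, and the logistic-regression attack model are hypothetical stand-ins, not the attack model or data used in the paper.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_attack_model(embeddings, private_labels, epochs=200, lr=0.1):
    """Fit a logistic-regression attack model mapping embedding -> private property."""
    dim = len(embeddings[0])
    n = len(embeddings)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(embeddings, private_labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        # Full-batch gradient descent step on the mean cross-entropy loss.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def infer_property(model, embedding):
    """Predict the binary private property from a single embedding."""
    w, b = model
    score = sum(wi * xi for wi, xi in zip(w, embedding)) + b
    return 1 if sigmoid(score) >= 0.5 else 0


def toy_embedding(private_bit, rng, dim=4):
    # Hypothetical embedding: the private bit shifts the first coordinate,
    # mimicking an embedding that unintentionally encodes the property.
    shift = 1.5 if private_bit else -1.5
    return [rng.gauss(shift if i == 0 else 0.0, 1.0) for i in range(dim)]


rng = random.Random(0)

# Auxiliary data: embeddings for which the adversary knows the private property.
aux_labels = [rng.randint(0, 1) for _ in range(200)]
aux_embeddings = [toy_embedding(y, rng) for y in aux_labels]
attack_model = train_attack_model(aux_embeddings, aux_labels)

# Target data: embeddings observed during VFL training or inference.
target_labels = [rng.randint(0, 1) for _ in range(200)]
target_embeddings = [toy_embedding(y, rng) for y in target_labels]
predictions = [infer_property(attack_model, e) for e in target_embeddings]
attack_accuracy = sum(p == y for p, y in zip(predictions, target_labels)) / len(target_labels)
print(f"attack inference accuracy: {attack_accuracy:.2f}")
```

In this toy setting the attack succeeds because the embedding carries the private bit almost directly. A defense in the spirit of PPVFL would aim to push this accuracy toward 0.5 (random guessing for a binary property) without hurting the main task.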
Fig. 7 Performance of property inference attack with different proportions of background knowledge
Fig. 11 PPVFL's privacy preservation performance in multiple parties
Fig. 13 Performance of PPVFL's privacy preservation against different attack models
Table 1 Comparison of advantages and disadvantages of VFL privacy protection techniques
Table 2 The basic statistics of VFL datasets

| Dataset | Samples | Edges | Label classes | Attribute features | Private property |
|---|---|---|---|---|---|
| Adults | 48842 | - | 2 | 14 | Marital status |
| Rochester | 4563 | 167653 | 6 | 236 | Education |
| Yale | 8578 | 405450 | 6 | 188 | Race |
Table 3 Model architectures

| Dataset | Local model | Top model |
|---|---|---|
| Adults | FCNN-1 | FCNN-2 |
| Rochester | GCN-2 | FCNN-2 |
| Yale | SGC-2 | FCNN-2 |
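Table 4 Privacy protection effect on actual industrial internet dataset

| Defense | Steel plate sequence: training inference accuracy | Steel plate sequence: training trade-off value | Steel plate sequence: test inference accuracy | Steel plate sequence: test trade-off value | Steel plate sequence: main task accuracy | A300: training inference accuracy | A300: training trade-off value | A300: test inference accuracy | A300: test trade-off value | A300: main task accuracy |
|---|---|---|---|---|---|---|---|---|---|---|
| No defense | 0.95 | 0.82 | 0.96 | 0.81 | 0.78 | 0.74 | 1.00 | 0.72 | 1.03 | 0.74 |
| Noisy$(\sigma=1.0)$ | 0.66 | 1.00 | 0.84 | 0.79 | 0.66 | 0.63 | 0.95 | 0.62 | 0.97 | 0.60 |
| Noisy$(\sigma=5.0)$ | 0.60 | 0.93 | 0.55 | 1.02 | 0.56 | 0.60 | 0.83 | 0.59 | 0.85 | 0.50 |
| Dropout$(\eta=0.5)$ | 0.91 | 0.88 | 0.91 | 0.88 | 0.80 | 0.70 | 1.03 | 0.64 | 1.13 | 0.72 |
| Dropout$(\eta=0.8)$ | 0.86 | 0.86 | 0.86 | 0.86 | 0.74 | 0.70 | 0.96 | 0.64 | 1.05 | 0.67 |
| DP$(\sigma=0.1)$ | 0.56 | 1.21 | 0.56 | 1.21 | 0.68 | 0.67 | 1.06 | 0.65 | 1.09 | 0.71 |
| DP$(\sigma=0.2)$ | 0.90 | 0.79 | 0.89 | 0.80 | 0.71 | 0.68 | 1.06 | 0.67 | 1.07 | 0.72 |
| DR$(d=8.0)$ | 0.87 | 0.85 | 0.86 | 0.86 | 0.74 | 0.69 | 0.80 | 0.67 | 0.82 | 0.55 |
| DR$(d=4.0)$ | 0.66 | 0.97 | 0.65 | 0.98 | 0.64 | 0.68 | 0.79 | 0.64 | 0.84 | 0.54 |
| PPVFL$(\lambda=0.1)$ | 0.55 | 1.38 | 0.57 | 1.33 | 0.76 | 0.60 | 1.20 | 0.62 | 1.16 | 0.72 |
| PPVFL$(\lambda=0.5)$ | 0.55 | 1.36 | 0.54 | 1.39 | 0.75 | 0.59 | 1.20 | 0.61 | 1.16 | 0.71 |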
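This excerpt does not define the trade-off value reported in Table 4. The tabulated numbers are consistent with the ratio of main task accuracy to attack inference accuracy, so the sketch below assumes that definition; the function name `trade_off` is hypothetical. It reproduces two training-data entries from the steel plate sequence columns.

```python
def trade_off(main_task_accuracy, inference_accuracy):
    # Assumed definition: higher is better (useful model, weak attack).
    return main_task_accuracy / inference_accuracy


# (attack inference accuracy on training data, main task accuracy),
# taken from the steel plate sequence columns of Table 4.
rows = {
    "No defense": (0.95, 0.78),
    "PPVFL (lambda=0.1)": (0.55, 0.76),
}

for name, (inference_accuracy, main_task_accuracy) in rows.items():
    value = trade_off(main_task_accuracy, inference_accuracy)
    print(f"{name}: {value:.2f}")
# No defense: 0.82
# PPVFL (lambda=0.1): 1.38
```

Under this reading, a value above 1 means the main task is predicted more accurately than the private property is inferred.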