
Active Learning Method for Imbalanced Concept Drift Data Stream

Li Yan-Hong, Wang Tian-Tian, Wang Su-Ge, Li De-Yu

Citation: Li Yan-Hong, Wang Tian-Tian, Wang Su-Ge, Li De-Yu. Active learning method for imbalanced concept drift data stream. Acta Automatica Sinica, 2024, 50(3): 589-606 doi: 10.16383/j.aas.c230233

doi: 10.16383/j.aas.c230233

Funds: Supported by the National Key Research and Development Program of China (2022QY0300-01), the National Natural Science Foundation of China (62076158), and the Fundamental Research Program of Shanxi Province (202203021221001)

Author Bio:

LI Yan-Hong: Associate professor at the School of Computer and Information Technology, Shanxi University. Her research interest covers data mining and machine learning. Corresponding author of this paper. E-mail: liyh@sxu.edu.cn

WANG Tian-Tian: Master student at the School of Computer and Information Technology, Shanxi University. Her research interest covers data mining and machine learning. E-mail: wttstu@163.com

WANG Su-Ge: Professor at the School of Computer and Information Technology, Shanxi University. Her research interest covers natural language processing and machine learning. E-mail: wsg@sxu.edu.cn

LI De-Yu: Professor at the School of Computer and Information Technology, Shanxi University. His research interest covers data mining and artificial intelligence. E-mail: lidy@sxu.edu.cn

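Abstract: Data stream classification studies how to provide more reliable data-driven predictive models in open, dynamic environments. The key is to detect and adapt to concept drift in data streams that arrive in real time and change continuously. To detect concept drift and update the classification model, existing data stream classification methods usually assume that the labels of all samples are known, which is unrealistic in real-world scenarios. In addition, real data streams may exhibit high and changing class imbalance ratios, which further complicates the classification task. To address these problems, an active learning method for imbalanced concept drift data stream (ALM-ICDDS) is proposed. First, a sample prediction certainty measure based on multiple prediction probabilities is defined, and an adaptive adjustment method for the margin threshold matrix is proposed, so that the label query strategy suits imbalanced data streams with many classes. Second, a sample replacement strategy based on memory strength is proposed. It keeps hard-to-distinguish samples, minority-class samples, and samples representing the current data distribution in a memory window, which improves the classification performance of newly trained base classifiers. Third, an importance evaluation and update method for base classifiers, based on classification accuracy, is defined to update the ensemble classifier after a drift. Comparative experiments on 7 synthetic data streams and 3 real data streams show that the proposed method achieves better classification performance than 6 concept drift data stream learning methods.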
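The abstract names a margin threshold matrix but this page does not give its definition. As a rough illustration only, the sketch below assumes the certainty of a prediction is the gap between the two largest class probabilities and that the matrix holds one threshold per (top class, runner-up class) pair. The function names, the initial value 0.1, and the multiplicative shrink rule are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def top_two_classes(probs: Sequence[float]) -> Tuple[int, int]:
    """Return the indices of the most and second most probable classes."""
    if len(probs) < 2:
        raise ValueError("need at least two class probabilities")
    order = sorted(range(len(probs)), key=lambda k: probs[k], reverse=True)
    return order[0], order[1]


def should_query_label(probs: Sequence[float], theta: List[List[float]]) -> bool:
    """Query the label when the top-two margin is below the pairwise threshold."""
    first, second = top_two_classes(probs)
    margin = probs[first] - probs[second]
    return margin < theta[first][second]


def shrink_threshold(theta: List[List[float]], first: int, second: int,
                     step: float = 0.01) -> None:
    """Hypothetical adaptation: lower one entry after a label has been queried."""
    theta[first][second] *= (1.0 - step)


# Three classes, every pairwise threshold initialised to 0.1 (illustrative value).
theta = [[0.1] * 3 for _ in range(3)]
uncertain = [0.50, 0.45, 0.05]   # margin 0.05, below the threshold
confident = [0.90, 0.05, 0.05]   # margin 0.85, above the threshold
print(should_query_label(uncertain, theta), should_query_label(confident, theta))
```

Keeping one threshold per class pair, rather than a single global value, is one plausible way to let frequently confused or minority classes be queried at a different rate from the rest of the stream.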
• Fig. 1  Algorithm framework
• Fig. 2  ROC curves of seven algorithms
• Fig. 3  Precision rate curves of seven algorithms
• Fig. 4  Results of the ablation experiment on DS6
• Fig. 5  Effect of the parameter $\beta$ on the algorithm
• Fig. 6  Effect of the parameter $\theta_{0}$ on the algorithm
• Fig. 7  Effect of the parameter $n$ on the algorithm
• Fig. 8  Effect of the parameter $\alpha$ on the algorithm
• Fig. 9  Effect of the parameter $n_d$ on the algorithm
• Fig. 10  Precision (P) curves on different types of concept drift data streams

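Table 1  Data stream characteristics

| No. | Data stream | Samples | Features | Classes | Class distribution | Outliers (%) | Number of drifts |
|---|---|---|---|---|---|---|---|
| 1 | DS1 | 400000 | 25 | 15 | Balanced | 0 | 0 |
| 2 | DS2 | 400000 | 25 | 15 | Balanced | 5 | 3 |
| 3 | DS3 | 400000 | 25 | 15 | (1/1/1/1/1/1/1/1/1/1/2/2/3/3/5) | 0 | 0 |
| 4 | DS4 | 400000 | 25 | 15 | (1/1/1/1/1/1/1/1/1/1/2/2/3/3/5) | 5 | 3 |
| 5 | DS5 | 400000 | 25 | 15 | (1/1/1/1/1/1/1/1/1/1/2/2/3/3/5), (2/2/3/3/5/1/1/1/1/1/1/1/1/1/1) | 0 | 0 |
| 6 | DS6 | 400000 | 25 | 15 | (1/1/1/1/1/1/1/1/1/1/2/2/3/3/5), (2/2/3/3/5/1/1/1/1/1/1/1/1/1/1) | 5 | 3 |
| 7 | DS7 | 400000 | 25 | 50 | Balanced | 5 | 3 |
| 8 | Kddcup99_10% | 494000 | 42 | 23 | | | |
| 9 | Shuttle | 570000 | 10 | 7 | | | |
| 10 | PokerHand | 830000 | 10 | 10 | | | |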
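The class distribution column of Table 1 lists 15 slash-separated numbers for the 15-class streams, which reads as relative class frequencies. The following sketch assumes that interpretation and shows how labels with such a ratio could be drawn. The generator actually used in the paper is not described on this page, so the function names and the seeding are illustrative assumptions.

```python
import random
from collections import Counter
from typing import List

# Ratios copied from Table 1; treating them as relative class weights is an assumption.
RATIO_A: List[int] = [1] * 10 + [2, 2, 3, 3, 5]
RATIO_B: List[int] = [2, 2, 3, 3, 5] + [1] * 10   # second distribution listed for DS5 and DS6


def imbalance_ratio(ratio: List[int]) -> float:
    """Largest class weight divided by the smallest one."""
    return max(ratio) / min(ratio)


def sample_labels(ratio: List[int], n: int, seed: int = 0) -> List[int]:
    """Draw n class labels whose expected frequencies follow the given ratio."""
    rng = random.Random(seed)
    return rng.choices(range(len(ratio)), weights=ratio, k=n)


labels = sample_labels(RATIO_A, 20000, seed=1)
counts = Counter(labels)
print(imbalance_ratio(RATIO_A), counts[0], counts[14])
```

Under this reading, the weights sum to 25, so the largest class covers about 20% of the stream and each of the ten smallest classes about 4%. Switching from `RATIO_A` to `RATIO_B` midway would move the majority role to different classes, which is presumably what the two distributions listed for DS5 and DS6 describe.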

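Table 2  Concept drift data stream characteristics

| No. | Data stream | Concept drift type | Samples | Features | Classes | Drift width |
|---|---|---|---|---|---|---|
| 1 | DS8 | Sudden | 400000 | 25 | 15 | 1 |
| 2 | DS9 | Recurring | 400000 | 25 | 15 | 1 |
| 3 | DS10 | Incremental | 400000 | 25 | 15 | 10000 |
| 4 | DS11 | Gradual | 400000 | 25 | 15 | 10000 |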
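Table 2 characterizes each drift by a width of 1 or 10000 samples. One common way to realize such a width, used for example by sigmoid-based stream generators, is to draw each sample from the new concept with a probability that rises smoothly around the drift position. The sketch below assumes that model. Whether the paper's streams were generated this way is not stated on this page, and the drift position of 100000 is a made-up value.

```python
import math


def new_concept_probability(t: int, position: int, width: int) -> float:
    """Sigmoid probability that sample t is drawn from the new concept.

    Assumed model: p(t) = 1 / (1 + exp(-4 * (t - position) / width)).
    """
    exponent = -4.0 * (t - position) / width
    # Clamp the exponent so math.exp cannot overflow for very narrow drifts.
    exponent = max(-700.0, min(700.0, exponent))
    return 1.0 / (1.0 + math.exp(exponent))


position = 100000  # hypothetical drift position, not taken from the paper
for width in (1, 10000):
    before = new_concept_probability(position - 5000, position, width)
    after = new_concept_probability(position + 5000, position, width)
    print(width, round(before, 4), round(after, 4))
```

With a width of 1 the switch is effectively instantaneous, matching the sudden and recurring rows, while a width of 10000 leaves both concepts mixed for several thousand samples on either side of the drift position.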

Table 3  P (precision) values of seven algorithms (%)

| Data stream | LB | BOLE | ARFRE | CALMID | OALM-IDS | ALM-ICDDS-E | ALM-ICDDS |
|---|---|---|---|---|---|---|---|
| DS1 | 96.89±0.31 | 96.36±0.11 | 98.07±0.43 | 98.01±0.41 | 98.03±0.25 | 97.18±0.48 | 99.07±0.34 |
| DS2 | 90.61±0.21 | 88.63±0.54 | 92.77±0.42 | 93.31±0.14 | 93.27±0.49 | 91.97±0.26 | 94.64±0.15 |
| DS3 | 94.41±0.11 | 96.07±0.23 | 96.74±0.45 | 96.64±0.34 | 96.75±0.56 | 96.46±0.61 | 97.84±0.24 |
| DS4 | 86.91±0.45 | 85.23±0.52 | 88.30±0.29 | 89.90±0.28 | 90.27±0.42 | 89.70±0.72 | 92.06±0.28 |
| DS5 | 93.60±0.48 | 94.04±0.52 | 96.30±0.18 | 94.65±0.49 | 95.47±0.32 | 94.24±0.35 | 96.17±0.19 |
| DS6 | 86.59±0.19 | 84.69±0.48 | 88.02±0.47 | 88.44±0.19 | 88.65±0.25 | 87.41±0.40 | 90.86±0.37 |
| DS7 | 88.25±0.86 | 87.21±0.79 | 90.16±0.92 | 90.49±0.47 | 90.51±0.53 | 89.32±0.38 | 93.67±0.40 |
| Kddcup99_10% | 83.85±0.59 | 81.10±0.15 | 85.56±0.54 | 92.12±0.45 | 92.13±0.31 | 91.24±0.51 | 95.80±0.17 |
| Shuttle | 64.63±0.42 | 63.85±0.27 | 79.07±0.31 | 85.35±0.14 | 85.70±0.32 | 83.48±0.25 | 85.99±0.13 |
| PokerHand | 51.63±0.39 | 50.36±0.35 | 52.51±0.56 | 53.93±0.28 | 54.57±0.50 | 52.90±0.18 | 55.89±0.51 |

Table 6  Kappa values of seven algorithms (%)

| Data stream | LB | BOLE | ARFRE | CALMID | OALM-IDS | ALM-ICDDS-E | ALM-ICDDS |
|---|---|---|---|---|---|---|---|
| DS1 | 95.09±0.43 | 95.47±0.26 | 97.11±0.33 | 97.84±0.18 | 97.52±0.50 | 96.31±0.53 | 98.72±0.18 |
| DS2 | 89.66±0.50 | 88.28±0.45 | 91.80±0.17 | 92.55±0.25 | 92.65±0.28 | 91.27±0.29 | 93.56±0.46 |
| DS3 | 93.08±0.13 | 95.68±0.22 | 95.62±0.53 | 96.50±0.46 | 96.46±0.60 | 96.05±0.36 | 97.69±0.21 |
| DS4 | 86.97±0.46 | 85.86±0.13 | 88.18±0.25 | 89.94±0.24 | 89.99±0.36 | 88.61±0.46 | 90.19±0.57 |
| DS5 | 92.32±0.37 | 94.18±0.45 | 95.86±0.28 | 94.40±0.50 | 95.52±0.14 | 94.29±0.20 | 95.81±0.35 |
| DS6 | 86.59±0.32 | 85.25±0.29 | 87.81±0.54 | 88.90±0.51 | 89.00±0.13 | 87.68±0.47 | 89.80±0.25 |
| DS7 | 88.28±0.46 | 87.51±0.97 | 89.93±0.71 | 90.01±0.92 | 90.19±0.40 | 89.51±0.59 | 93.67±0.54 |
| Kddcup99_10% | 80.94±0.22 | 75.68±0.25 | 79.36±0.35 | 83.32±0.24 | 85.83±0.50 | 84.87±0.16 | 86.81±0.33 |
| Shuttle | 58.73±0.39 | 61.54±0.22 | 73.78±0.20 | 79.39±0.43 | 80.11±0.53 | 80.97±0.24 | 83.56±0.54 |
| PokerHand | 50.34±0.58 | 49.86±0.40 | 50.36±0.16 | 51.24±0.21 | 51.39±0.16 | 50.55±0.41 | 52.25±0.35 |

Table 4  R values of seven algorithms (%)

| Data stream | LB | BOLE | ARFRE | CALMID | OALM-IDS | ALM-ICDDS-E | ALM-ICDDS |
|---|---|---|---|---|---|---|---|
| DS1 | 94.78±0.13 | 96.04±0.24 | 96.81±0.59 | 97.87±0.24 | 97.92±0.25 | 96.15±0.31 | 98.63±0.17 |
| DS2 | 88.65±0.25 | 87.86±0.53 | 90.35±0.30 | 91.54±0.54 | 91.84±0.58 | 90.78±0.70 | 92.30±0.24 |
| DS3 | 92.55±0.45 | 95.92±0.32 | 94.80±0.43 | 96.12±0.14 | 97.92±0.54 | 95.99±0.52 | 98.55±0.29 |
| DS4 | 87.03±0.49 | 87.08±0.39 | 88.23±0.31 | 90.50±0.30 | 91.07±0.52 | 90.13±0.43 | 91.15±0.11 |
| DS5 | 91.54±0.11 | 92.33±0.51 | 96.04±0.20 | 93.82±0.55 | 94.94±0.27 | 92.91±0.42 | 96.53±0.42 |
| DS6 | 86.56±0.50 | 85.48±0.24 | 87.83±0.49 | 89.43±0.18 | 88.85±0.36 | 88.39±0.34 | 90.63±0.21 |
| DS7 | 87.19±0.42 | 86.12±0.11 | 87.29±0.36 | 88.41±0.50 | 88.77±0.43 | 87.87±0.20 | 91.61±0.78 |
| Kddcup99_10% | 60.89±0.50 | 63.05±0.50 | 58.26±0.38 | 61.88±0.38 | 63.71±0.54 | 63.42±0.67 | 69.34±0.57 |
| Shuttle | 61.40±0.21 | 50.84±0.31 | 54.36±0.35 | 59.52±0.41 | 63.12±0.59 | 61.79±0.16 | 64.59±0.29 |
| PokerHand | 43.57±0.30 | 44.78±0.46 | 55.21±0.60 | 56.84±0.11 | 52.77±0.54 | 55.36±0.25 | 59.57±0.43 |

Table 5  F1 values of seven algorithms (%)

| Data stream | LB | BOLE | ARFRE | CALMID | OALM-IDS | ALM-ICDDS-E | ALM-ICDDS |
|---|---|---|---|---|---|---|---|
| DS1 | 95.82±0.18 | 96.20±0.16 | 97.44±0.50 | 97.94±0.30 | 97.97±0.25 | 96.66±0.37 | 98.85±0.23 |
| DS2 | 89.62±0.23 | 88.24±0.53 | 91.54±0.35 | 92.42±0.22 | 92.55±0.53 | 91.37±0.43 | 93.46±0.18 |
| DS3 | 93.47±0.18 | 95.99±0.27 | 95.76±0.44 | 96.38±0.20 | 97.33±0.55 | 96.22±0.57 | 98.19±0.26 |
| DS4 | 86.97±0.47 | 86.15±0.45 | 88.26±0.30 | 90.20±0.29 | 90.67±0.46 | 89.91±0.59 | 91.60±0.16 |
| DS5 | 92.55±0.17 | 93.18±0.30 | 96.17±0.19 | 94.23±0.52 | 95.20±0.29 | 93.57±0.38 | 96.35±0.26 |
| DS6 | 86.57±0.27 | 85.08±0.32 | 87.92±0.48 | 88.93±0.18 | 88.75±0.30 | 87.90±0.35 | 90.74±0.27 |
| DS7 | 87.72±0.56 | 86.66±0.19 | 88.70±0.52 | 89.44±0.48 | 89.61±0.47 | 88.59±0.29 | 92.63±0.40 |
| Kddcup99_10% | 70.55±0.54 | 70.94±0.23 | 69.32±0.45 | 74.03±0.22 | 75.33±0.39 | 74.82±0.54 | 80.45±0.49 |
| Shuttle | 62.97±0.28 | 56.61±0.29 | 64.43±0.33 | 70.13±0.21 | 72.70±0.41 | 71.01±0.20 | 73.77±0.18 |
| PokerHand | 47.26±0.34 | 47.41±0.40 | 53.83±0.57 | 55.35±0.16 | 56.12±0.52 | 54.10±0.23 | 57.67±0.72 |
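Tables 3 to 6 report P, R, F1 and Kappa. This page does not state how they are averaged over classes, so the sketch below makes two assumptions: R denotes recall, and P and R are macro-averaged over classes. The reported F1 values are at least consistent with the harmonic mean of the reported P and R (for ALM-ICDDS on DS1, P = 99.07 and R = 98.63 give about 98.85), which is what the helper `f1_from_pr` computes. The confusion matrix used here is invented for illustration.

```python
from typing import Dict, List


def f1_from_pr(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * p * r / (p + r) if (p + r) else 0.0


def stream_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """Macro precision/recall, their F1, and Cohen's kappa.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]                               # true counts
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]    # predicted counts
    precision = sum(cm[k][k] / col_sums[k] if col_sums[k] else 0.0 for k in range(n)) / n
    recall = sum(cm[k][k] / row_sums[k] if row_sums[k] else 0.0 for k in range(n)) / n
    observed = sum(cm[k][k] for k in range(n)) / total
    expected = sum(row_sums[k] * col_sums[k] for k in range(n)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected) if expected != 1.0 else 0.0
    return {"P": precision, "R": recall, "F1": f1_from_pr(precision, recall), "Kappa": kappa}


example = [[8, 2],
           [1, 9]]   # invented two-class confusion matrix
print(stream_metrics(example))
print(round(f1_from_pr(99.07, 98.63), 2))
```

Kappa corrects the observed accuracy for the agreement expected by chance, which makes it more informative than plain accuracy on the imbalanced streams in Table 1.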
Publication history
• Received: 2023-04-24
• Accepted: 2023-10-12
• Published online: 2024-01-25
• Issue date: 2024-03-29