A Hybrid Learning Framework for Imbalanced Stream Classification

Date

2017-09-11

Citation of Original Publication

W. Zhang and J. Wang, "A Hybrid Learning Framework for Imbalanced Stream Classification," 2017 IEEE International Congress on Big Data (BigData Congress), Honolulu, HI, USA, 2017, pp. 480-487, doi: 10.1109/BigDataCongress.2017.70.

Rights

© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract

Imbalanced class distributions are pervasive in real-world stream applications such as surveillance, security, and finance, where data arrive continuously, and this has sparked extensive interest in imbalanced stream classification. In such applications, the evolution of unstable class concepts is always accompanied and complicated by the skewed class distribution. However, most existing methods address either the class imbalance problem or the non-stationary learning problem; approaches that address both issues together have received relatively little research attention. In this paper, we propose a hybrid framework for imbalanced stream learning that consists of three components: classifier updating, resampling, and a cost-sensitive classifier. Based on this framework, we propose a hybrid learning algorithm that combines data-level and algorithm-level methods with a classifier retraining mechanism to tackle class imbalance in data streams. Our experiments on real-world and synthetic datasets show that the proposed hybrid learning algorithm can achieve better effectiveness and efficiency.
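
The three components named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details the abstract does not give. It assumes a chunk-based stream of `(features, label)` pairs with labels 0 and 1, and it swaps in simple stand-ins: random oversampling for the resampling step, a logistic regression with class-weighted gradients for the cost-sensitive classifier, and a full retrain on each new chunk for classifier updating. All names (`oversample`, `CostSensitiveLogReg`, `run_stream`) and parameter values (the cost pair, learning rate, epoch count) are hypothetical.

```python
import math
import random


def oversample(chunk, rng):
    """Data level (assumed stand-in): duplicate random examples of the
    smaller class until both classes have the same count."""
    pos = [ex for ex in chunk if ex[1] == 1]
    neg = [ex for ex in chunk if ex[1] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    if not minority:
        return list(chunk)  # nothing to duplicate
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(chunk) + extra


class CostSensitiveLogReg:
    """Algorithm level (assumed stand-in): logistic regression trained by SGD,
    with each gradient step scaled by the cost assigned to the true class."""

    def __init__(self, n_features, cost=(1.0, 5.0), lr=0.1, epochs=20):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.cost, self.lr, self.epochs = cost, lr, epochs

    def _prob(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, data, rng):
        data = list(data)
        for _ in range(self.epochs):
            rng.shuffle(data)
            for x, y in data:
                g = self.cost[y] * (self._prob(x) - y)  # cost-weighted gradient
                self.w = [wi - self.lr * g * xi for wi, xi in zip(self.w, x)]
                self.b -= self.lr * g
        return self

    def predict(self, x):
        return 1 if self._prob(x) >= 0.5 else 0


def run_stream(chunks, n_features, seed=0):
    """Classifier updating (assumed stand-in): evaluate the current model on
    each incoming chunk, then retrain from scratch on that chunk.
    Returns the recall on class 1 for every chunk after the first."""
    rng = random.Random(seed)
    model, recalls = None, []
    for chunk in chunks:
        if model is not None:
            pos = [x for x, y in chunk if y == 1]
            if pos:
                recalls.append(sum(model.predict(x) for x in pos) / len(pos))
        model = CostSensitiveLogReg(n_features).fit(oversample(chunk, rng), rng)
    return recalls
```

Calling `run_stream(chunks, n_features=2)` on a list of chunks of two-feature examples returns one recall value per chunk after the first. Retraining from scratch on every chunk is the simplest possible update policy; a drift detector or a sliding window over several chunks would be natural alternatives under the same framework.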