Date(s) - 27 Apr 2022
1:10 PM - 2:00 PM
3043 ECpE Building Addition
Advisors: Goce Trajcevski, Srikanta Tirthapura
Title: Adaptive Strategies for Streaming Data Learning
Abstract: Streaming data is the continuous (never-ending) flow of data generated by various sources (such as IoT sensors, social media, security logs, or internal/external systems) in various formats and volumes. While traditional solutions are built to ingest, process, and structure data before it can be acted upon, today’s applications require models and solutions that can analyze and incorporate newly arrived data in motion. In this presentation, we will first discuss STRSAGA, an algorithm that can efficiently maintain a machine learning model over data points that arrive over time and quickly update the model as new training data are observed. We present a competitive analysis that compares the sub-optimality of the model maintained by STRSAGA with that of an offline algorithm that is given the entire dataset beforehand. Our theoretical and experimental results show that the risk of STRSAGA is comparable to that of an offline algorithm on a variety of input arrival patterns, and its experimental performance is significantly better than that of prior algorithms suited for streaming data. We will then present DriftSurf, which learns a model over a stream of data when the distribution of data changes over time (also known as concept drift). DriftSurf is an adaptive learning algorithm that extends previous drift-detection-based methods by incorporating drift detection into a broader stable-state/reactive-state process. The algorithm is generic in its base learner and can be applied across a variety of supervised learning problems. Our theoretical analysis shows that the risk of the algorithm is (i) statistically better than standalone drift detection and (ii) competitive with an algorithm with oracle knowledge of when (abrupt) drifts occur. Experiments on synthetic and real datasets with concept drifts confirm our theoretical analysis.
Biography: Ashraf Tahmasbi is a Ph.D. candidate in Computer Engineering at Iowa State University. Her primary research is focused on machine learning and streaming data learning.