Abstract

This work presents a novel approach for forecasting streaming big time series based on nearest similar patterns. The approach combines a clustering algorithm, a classifier and the nearest neighbours algorithm, and it operates in two separate stages: offline and online. The offline stage trains and selects the best models for clustering, classification and nearest neighbours forecasting. The online stage predicts big time series in real time. In the offline stage, the data are divided into clusters and a forecasting model based on the nearest neighbours is trained for each cluster. In addition, a classifier is trained using the cluster assignments previously generated by the clustering algorithm. In the online stage, the classifier predicts the cluster label of an incoming instance, and the nearest neighbours model corresponding to the predicted label is applied to obtain the final prediction from the similar patterns. The model can be updated incrementally for online learning from data streams. Results are reported for electricity consumption data with a granularity of 10 minutes and a forecasting horizon of 4 hours, and are compared with well-known online benchmark learners, showing a remarkable improvement in prediction accuracy.
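
The two-stage scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm or the classifier, so the sketch assumes k-means for clustering and a nearest-centroid rule as a stand-in classifier. The names `ClusteredKNNForecaster`, `make_windows`, `fit`, `predict` and `update`, as well as the window length `w`, horizon `h` and neighbour count `k`, are hypothetical and chosen only for illustration.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_vec(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest(centroids, p):
    """Index of the centroid closest to p (the assumed stand-in classifier)."""
    return min(range(len(centroids)), key=lambda j: dist(centroids[j], p))


def make_windows(series, w, h):
    """Pair each past window of length w with the h values that follow it."""
    return [(series[i:i + w], series[i + w:i + w + h])
            for i in range(len(series) - w - h + 1)]


def kmeans(points, n_clusters, iters=20, seed=0):
    """Plain k-means (assumed here; the abstract does not name the algorithm)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, n_clusters)]
    for _ in range(iters):
        groups = [[] for _ in range(n_clusters)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Keep the old centroid if a cluster received no points.
        centroids = [mean_vec(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


class ClusteredKNNForecaster:
    def __init__(self, n_clusters=2, k=3):
        self.n_clusters = n_clusters
        self.k = k
        self.centroids = []
        self.stores = []

    def fit(self, series, w, h):
        """Offline stage: cluster the windows, keep one pattern store per cluster."""
        pairs = make_windows(series, w, h)
        self.centroids = kmeans([past for past, _ in pairs], self.n_clusters)
        self.stores = [[] for _ in range(self.n_clusters)]
        for past, future in pairs:
            self.stores[nearest(self.centroids, past)].append((past, future))

    def predict(self, window):
        """Online stage: predict the cluster, then average the k nearest futures."""
        label = nearest(self.centroids, window)
        # Fall back to all stored patterns if the predicted cluster is empty.
        store = self.stores[label] or [pf for s in self.stores for pf in s]
        neighbours = sorted(store, key=lambda pf: dist(pf[0], window))[:self.k]
        return mean_vec([future for _, future in neighbours])

    def update(self, window, future):
        """Incremental update: add a newly observed pattern to its cluster."""
        self.stores[nearest(self.centroids, window)].append((window, future))


# Toy periodic series; the real study uses electricity consumption data.
model = ClusteredKNNForecaster(n_clusters=2, k=3)
model.fit([0, 1, 2, 3] * 10, w=4, h=2)
print(model.predict([0, 1, 2, 3]))  # [0.0, 1.0]
```

In this sketch the incremental update only appends new patterns to the store of the predicted cluster; the centroids stay fixed after the offline stage. Whether the original method also revises the clustering or the classifier online is not stated in the abstract.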

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://dbpia.nl.go.kr/journals/pages/open_access/funder_policies/chorus/standard_publication_model)