Title: k-means based load estimation of domestic smart meter measurements

Al-Wakeel A, Wu J, Cheng M (2016). k-means based load estimation of domestic smart meter measurements. Cardiff University. http://doi.org/10.17035/d.2016.0009225471

Access Rights: Data can be made freely available subject to attribution

Access Method: Email a request for this data to opendata@cardiff.ac.uk

Cardiff University Dataset Creators

Dataset Details

Publisher: Cardiff University

Date (year) of data becoming publicly available: 2016

Data format: .csv

Estimated total storage size of dataset: Less than 100 megabytes

Number of Files In Dataset: 3776

DOI: 10.17035/d.2016.0009225471

DOI URL: http://doi.org/10.17035/d.2016.0009225471


A load estimation algorithm based on k-means cluster analysis was developed. The algorithm uses the cluster centres of previously clustered load profiles, together with distance functions, to estimate missing and future measurements. Canberra, Manhattan, Euclidean, and Pearson correlation distances were investigated. Several case studies were implemented using daily and segmented load profiles of aggregated smart meters. Segmented profiles cover a time window that is less than or equal to 24 hours. Simulation results show that the Canberra distance outperforms the other distance functions. Results also show that segmented cluster centres produce more accurate load estimates than daily cluster centres. The most accurate estimates were obtained with cluster centres covering time windows in the range of 16-24 hours. The developed load estimation algorithm can be integrated with state estimation or other network operational tools to enable better monitoring and control of distribution networks.
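The idea of estimating missing measurements from cluster centres and a distance function can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of NaN to mark missing time steps, and the rule of copying values from the nearest centre are all assumptions made for illustration.

```python
import numpy as np

def canberra(a, b):
    # Canberra distance: sum of |a - b| / (|a| + |b|); terms with a zero
    # denominator are counted as 0.
    num = np.abs(a - b)
    den = np.abs(a) + np.abs(b)
    safe = np.where(den > 0, den, 1.0)
    return float(np.sum(np.where(den > 0, num / safe, 0.0)))

def manhattan(a, b):
    return float(np.sum(np.abs(a - b)))

def euclidean(a, b):
    return float(np.sqrt(np.sum((a - b) ** 2)))

def pearson_distance(a, b):
    # 1 - Pearson correlation coefficient; undefined (NaN) if either
    # vector is constant.
    return float(1.0 - np.corrcoef(a, b)[0, 1])

def estimate_missing(profile, centres, distance=canberra):
    """Fill NaN entries of `profile` from the nearest cluster centre.

    Assumption: the nearest centre is chosen using only the time steps
    that are present in `profile`.
    """
    profile = np.asarray(profile, dtype=float)
    centres = np.asarray(centres, dtype=float)
    known = ~np.isnan(profile)
    dists = [distance(profile[known], c[known]) for c in centres]
    best = int(np.argmin(dists))
    filled = profile.copy()
    filled[~known] = centres[best][~known]
    return filled, best

# Toy example with two hypothetical cluster centres (kW) and one missing value.
centres = [[1.0, 1.0, 1.0, 1.0], [2.0, 4.0, 6.0, 8.0]]
filled, best = estimate_missing([2.0, 4.0, np.nan, 8.0], centres)
```

In this toy example the second centre matches the known values exactly, so the missing value is taken from it. A real daily profile in this dataset would have 48 half-hourly values rather than four.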
This dataset provides details of the

Input load profiles;
Output load profiles; and
Cluster centres

which comprise the average active power demand (measured in kilowatts) at each half-hourly time step during a day. The dataset also includes the values of the Mean Absolute Percentage Error (MAPE) between the actual and the estimated values of the active power demand. A readme.txt file is included in each folder to describe the type of information that folder contains.
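For readers who want to recompute the error values, the following Python sketch shows the standard MAPE definition applied to two load profiles. The function name and the choice to skip time steps where the actual demand is zero are assumptions; the dataset description does not state how such time steps were handled.

```python
import numpy as np

def mape(actual, estimated):
    """Mean Absolute Percentage Error, in percent.

    Assumption: time steps with zero actual demand are skipped to avoid
    division by zero.
    """
    actual = np.asarray(actual, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    mask = actual != 0
    errors = np.abs((actual[mask] - estimated[mask]) / actual[mask])
    return float(100.0 * np.mean(errors))

# Hypothetical actual and estimated demand values (kW).
error_percent = mape([100.0, 200.0], [110.0, 180.0])
```

Both estimates in this example are off by 10% of the actual value, so the result is approximately 10%.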

Research results based upon these data are published at


Keywords: Cluster analysis, k-means clustering, Load estimation, Smart meters

Related Projects

Last updated on 2019-12-09 at 09:41