Over the last decade, extensive effort has been devoted to energy efficiency. As a result, several energy consumption datasets have been published, each with its own properties, uses and limitations. Building energy consumption patterns, for instance, are shaped by several factors, including ambient conditions, user occupancy, weather conditions and consumer preferences. A proper understanding of the available datasets therefore provides a strong basis for improving energy efficiency. Motivated by the need for a comprehensive review of existing databases, this work surveys, studies and visualizes the numerical and methodological nature of building energy consumption datasets. A total of thirty-one databases are examined and compared in terms of several features, such as geographical location, period of collection, number of monitored households, sampling rate of the collected data, number of sub-metered appliances, extracted features and release date. The data collection platforms used in the different datasets, together with their modules for data transmission, data storage and privacy protection, are also analyzed and compared. Based on this analytical study, a novel dataset is presented, namely the Qatar University dataset, an annotated dataset for power consumption anomaly detection. It is intended for training and testing anomaly detection algorithms, and hence for reducing wasted energy. A set of recommendations is then derived to improve dataset collection, such as the adoption of multi-modal data collection, smart Internet of Things data collection, low-cost hardware platforms, and privacy and security mechanisms. Finally, future directions for improving dataset exploitation and utilization are identified, including the use of novel machine learning solutions, innovative visualization tools and explainable recommender systems.