Two approaches for synthesizing scalable residential energy consumption data

Xiufeng Liu*, Nadeem Iftikhar, Huan Huo, Rongling Li, Per Sieverts Nielsen

*Corresponding author for this work

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

Many fields require scalable and detailed energy consumption data for different study purposes. However, due to privacy issues, it is often difficult to obtain sufficiently large datasets. This paper proposes two methods for synthesizing fine-grained energy consumption data for residential households: a regression-based method and a probability-based method. Each uses supervised machine learning, training models on a relatively small real-world dataset and then generating large-scale time series from those models. The paper describes the two methods in detail, including the data generation process, optimization techniques, and parallel data generation. The evaluation compares the resulting consumption profiles with real-world data in terms of patterns and statistics, and assesses parallel data generation on a cluster. The results demonstrate the effectiveness of the proposed methods and their efficiency in generating large-scale datasets.
Original language: English
Journal: Future Generation Computer Systems
Volume: 95
Pages (from-to): 586-600
ISSN: 0167-739X
DOI: 10.1016/j.future.2019.01.045
Publication status: Published - 2019

Keywords

  • Energy consumption
  • Time series
  • Synthesize
  • Simulation
  • Data generation

Cite this

@article{8db91682179b4bc88b5d9ffafcd5be25,
title = "Two approaches for synthesizing scalable residential energy consumption data",
abstract = "Many fields require scalable and detailed energy consumption data for different study purposes. However, due to privacy issues, it is often difficult to obtain sufficiently large datasets. This paper proposes two different methods for synthesizing fine-grained energy consumption data for residential households, namely a regression-based method and a probability-based method. They each use a supervised machine learning method, which trains models with a relatively small real-world dataset and then generates large-scale time series based on the models. This paper describes the two methods in details, including data generation process, optimization techniques, and parallel data generation. This paper evaluates the performance of the two methods, which compare the resulting consumption profiles with real-world data, including patterns, statistics, and parallel data generation in the cluster. The results demonstrate the effectiveness of the proposed methods and their efficiency in generating large-scale datasets.",
keywords = "Energy consumption, Time series, Synthesize, Simulation, Data generation",
author = "Xiufeng Liu and Nadeem Iftikhar and Huan Huo and Rongling Li and Nielsen, {Per Sieverts}",
year = "2019",
doi = "10.1016/j.future.2019.01.045",
language = "English",
volume = "95",
pages = "586--600",
journal = "Future Generation Computer Systems - The International Journal of eScience",
issn = "0167-739X",
publisher = "Elsevier",

}

Two approaches for synthesizing scalable residential energy consumption data. / Liu, Xiufeng; Iftikhar, Nadeem; Huo, Huan; Li, Rongling; Nielsen, Per Sieverts.

In: Future Generation Computer Systems, Vol. 95, 2019, p. 586-600.

TY - JOUR

T1 - Two approaches for synthesizing scalable residential energy consumption data

AU - Liu, Xiufeng

AU - Iftikhar, Nadeem

AU - Huo, Huan

AU - Li, Rongling

AU - Nielsen, Per Sieverts

PY - 2019

Y1 - 2019

AB - Many fields require scalable and detailed energy consumption data for different study purposes. However, due to privacy issues, it is often difficult to obtain sufficiently large datasets. This paper proposes two different methods for synthesizing fine-grained energy consumption data for residential households, namely a regression-based method and a probability-based method. They each use a supervised machine learning method, which trains models with a relatively small real-world dataset and then generates large-scale time series based on the models. This paper describes the two methods in details, including data generation process, optimization techniques, and parallel data generation. This paper evaluates the performance of the two methods, which compare the resulting consumption profiles with real-world data, including patterns, statistics, and parallel data generation in the cluster. The results demonstrate the effectiveness of the proposed methods and their efficiency in generating large-scale datasets.

KW - Energy consumption

KW - Time series

KW - Synthesize

KW - Simulation

KW - Data generation

U2 - 10.1016/j.future.2019.01.045

DO - 10.1016/j.future.2019.01.045

M3 - Journal article

VL - 95

SP - 586

EP - 600

JO - Future Generation Computer Systems - The International Journal of eScience

JF - Future Generation Computer Systems - The International Journal of eScience

SN - 0167-739X

ER -