
Energy and Buildings 159 (2018) 228–245

Contents lists available at ScienceDirect

Energy and Buildings


journal homepage: www.elsevier.com/locate/enbuild

Determining key variables influencing energy consumption in office buildings through cluster analysis of pre- and post-retrofit building data

Chirag Deb ∗, Siew Eang Lee
Department of Building, School of Design and Environment, National University of Singapore, Singapore 117566, Singapore

ARTICLE INFO

Article history:
Received 7 September 2017
Received in revised form 11 October 2017
Accepted 6 November 2017
Available online 10 November 2017

Keywords:
Building energy
Cluster analysis
Energy efficiency
Office buildings
Building retrofit
K-means clustering

ABSTRACT

This study aims to determine key building variables influencing energy consumption in air-conditioned office buildings. The study is based in Singapore, which has a tropical climate. The analysis is based on an assessment of several energy audit reports containing pre- and post-retrofit data from 56 office buildings. A list of 14 building variables extracted from these reports forms the superset. These are systematically analyzed further to derive key variables influencing energy consumption and retrofitting decisions. For this purpose, a robust iterative process is developed utilizing k-means clustering. This process tests all combinations of the 14 variables against the change in energy use intensity (EUI, measured as kWh/m².year) for pre- and post-retrofit conditions. The results indicate that the best set of variables consists of: 1) gross floor area (GFA), 2) non-air-conditioning energy consumption, 3) average chiller plant efficiency, and 4) installed capacity of chillers. This information can be utilized to explore the energy saving potential of office buildings that need to be retrofitted. The resultant clusters can also be used to benchmark buildings based on pre-retrofit conditions and energy saving potential.
© 2017 Elsevier B.V. All rights reserved.

1. Introduction

The International Energy Agency (IEA) has identified energy efficiency in buildings as one of the five measures to secure long-term decarbonisation of the energy sector¹ [1]. Due to the marked environmental and economic benefits associated with energy efficiency in the building sector, governments around the world have embarked on various initiatives and regulations. A survey by the World Energy Council (WEC) of 63 countries that constitute 83% of global energy consumption shows that most of the countries have employed either voluntary or mandatory energy efficiency regulations [2]. These regulations outline basic requirements to achieve an energy efficient design for new buildings and upgradation of existing buildings with a view to reducing final energy consumption and related CO2 emissions. Existing buildings constitute a large part of the current and future building stock. The United Nations' report on Environment and Energy 2010 states that in rapidly developing countries, the priority for the building sector to avoid future emissions related to energy consumption in new buildings should be as high as the priority to reduce emissions from existing buildings [3]. Since buildings have a long lifetime, the penetration of new, more efficient buildings as a proportion of the total building stock is extremely slow. Therefore, in the coming decades, buildings already in existence will still be a major source of energy consumption and CO2 emissions.

The upgradation of the existing building stock so that it complies with energy efficient building regulations is a lengthy procedure. It requires seamless cooperation between public bodies and building stakeholders. In this regard, a variety of public policies and measures have been initiated and implemented worldwide. These have often proved to be successful. The European Union directive of October 2012 on energy efficiency requires its Member States to promote the availability of high quality and cost effective energy audits to all final customers [4]. Energy use forecasts show that the portion of energy consumed per capita by the commercial building sector is expected to increase while that of the residential building sector is expected to decrease [5]. This signifies the importance of realizing the potential of energy efficiency in the commercial building sector. In this regard, the growth of Energy Service Companies (ESCOs) that provide energy audit services has been the highest in the Asia Pacific and Western European regions (Fig. 1) [6].

∗ Corresponding author.
E-mail addresses: [email protected], [email protected] (C. Deb).
¹ © OECD/IEA 2015 World Energy Outlook Special Report, IEA Publishing. License: [http://www.iea.org/t&c/termsandconditions/].

https://doi.org/10.1016/j.enbuild.2017.11.007
0378-7788/© 2017 Elsevier B.V. All rights reserved.
C. Deb, S.E. Lee / Energy and Buildings 159 (2018) 228–245 229

Nomenclature

ANN    Artificial neural network
DB     Davies-Bouldin
ECM    Energy conservation measure
EIA    U.S. Energy Information Administration
ESCO   Energy service company
EUI    Energy use intensity
GFA    Gross floor area
IEA    International Energy Agency
RT     Refrigeration ton
SOM    Self-organising maps
SSE    Sum of squared errors
WEC    World Energy Council
%EUI   Percentage change in EUI

As seen in Fig. 1, the office building type promises the highest potential for retrofitting in terms of revenue generated, followed by retail and educational buildings. Although there is a vast scope of retrofitting opportunities, the literature and current retrofit practices show that energy efficiency improvement projects have been conducted on an ad hoc basis without a systematic decision making process [7,8]. This is attributed to the fact that different building types exhibit distinct characteristics due to their diverse functionalities. For example, office buildings have much higher energy consumption per capita than residential buildings. This is because office buildings are equipped with energy intensive equipment and often have high space cooling/heating demand. Similarly, different building types have contrasting levels of energy consumption.

Fig. 1. Energy efficiency revenue by building type and region for 2011.

A way to assist in the energy audit process is to benchmark buildings of the same typology so that a comparative analysis can be pursued. A great deal of research has been undertaken in this field, giving rise to several benchmarking models based on datasets from different countries [9,10]. A major advantage of such benchmarking is that it provides a holistic foundation for comparing buildings with similar energy performance. The two approaches taken for data analysis of large building datasets are the clustering approach and the simulation approach. A comprehensive review of various simulation toolkits available for energy and cost saving retrofits is presented by Lee et al. [11]. Yang et al. developed an energy performance model using EnergyPlus with energy prediction capabilities [12]. The authors also developed a K-shape clustering algorithm to classify energy use patterns in buildings [13]. Zhao et al. presented a methodology to replicate building stock energy data using energy survey data [14]. The study aims at building stock aggregation, thereby developing efficient energy models that extend the knowledge beyond individual buildings.

As for the clustering approach, the European Union directive of 2012 has prompted studies focusing on the implementation of cost-optimal analysis of retrofit improvements with respect to a representative reference building from a large dataset [15]. One of the challenges in defining such a reference building in a stock of existing ones is to find how effectively these sub-groups can be created [16]. For this purpose, the application of statistical techniques such as clustering is necessary. This approach focuses the investigation on a small number of representative buildings. Even before the European directives, Gaitani et al. analyzed a sample of 1100 schools in Greece and clustered them into five clusters with respect to seven principal components and seven building variables [17]. The following variables were selected to characterize the school buildings: heated surface (m²), age of the building (years), insulation of the building (0 for non-insulated, 1 for insulated), number of classrooms, number of students, school's operating hours per day, and age of the heating system (years). However, this study was limited to finding the reference building without any detailed recommendations on the retrofit improvements. Santamouris et al. classified school buildings in Greece into several energy classifications using intelligent clustering techniques [18]. The classification using the clustering technique was far better than that using equal frequency distribution. Even in this study, no investigation of the energy saving potential was done. A slightly different clustering analysis, hierarchical clustering using the centroid method, was performed by Filippin et al. [19]. The building variables were selected by stepwise multivariate regression and, in parallel, the buildings were grouped using clustering techniques. The centroid of each cluster was determined by taking the average of the variables that described that cluster. Yu et al. performed clustering analysis to examine the influences of occupant behavior on building energy consumption [20]. Gao and Malkawi proposed a new methodology for energy performance benchmarking using clustering algorithms [21]. The method consists of four steps starting with the feature selection step. An ordinary least squares (OLS) stepwise regression was applied to extract the features. Kontokosta analyzed a detailed survey of asset managers of 763 office buildings to determine the factors affecting retrofitting decisions by building owners [22]. The factors were characterized into four groups: building design and systems, fuel type and consumption patterns (primary fuel type and weather normalized EUI), ownership type and tenant demand, and spatial/market controls. The predictive model developed could correctly classify 81.2% of the retrofit activity in the sample. In another study by Marasco and Kontokosta, energy audit data for over 1100 buildings in New York City are analyzed to identify opportunities for ECMs across building system categories (e.g. distribution system, domestic hot water, etc.) [23]. A machine learning classifier based on binary features derived from the dataset is developed to predict ECM eligibility given a specific set of building characteristics. Hsu compared integrated clustering methods for predicting building energy consumption [24]. It was observed that cluster-wise regression was more accurate for predictions than k-means clustering but resulted in less stable clusters, and vice versa.

In the last three to four years, a lot of advancement has been made in the field of building energy efficiency improvement through data analytics. Park et al. developed a new energy benchmark for improving the rating system for office buildings in Korea using various data-mining techniques [25]. The study divided the buildings based on five criteria, and it was seen that some of the buildings were downgraded or upgraded in the rating system when these five criteria were considered. The availability of energy data through meters has led to a series of data-driven analyses, as summarized by Deb et al. [26]. Raatikainen et al. studied the energy consumption in school buildings using Self-Organizing Maps (SOM) [27]. This technique was combined with k-means clustering to obtain schools with similar electricity consumption costs per cubic meter. Ruparathna et al. critically reviewed the existing methodologies for improving the energy efficiency of existing commercial and institutional buildings [7]. It was noted that even though the existing studies predominantly focused on technical advancements, approaches such as building behavioral changes have been largely overlooked as a strategy for improving energy efficiency in buildings. A considerable amount of research has been undertaken by the Lawrence Berkeley National Laboratory (LBNL) on energy retrofit analysis toolkits in the past ten years. Ma et al. have provided a comprehensive review of existing building retrofits and highlighted the lack of decision making methods to identify the most cost effective retrofit measures [28].

Balaras et al. presented a methodology to determine the priorities for energy conservation measures (ECMs) in Hellenic residential buildings [29]. A total of 14 ECMs were assessed along with many assumptions that were combined with the available data. However, the use of an intelligent clustering technique that naturally groups elements was missing. Caldera et al. developed a statistical method to evaluate the heating energy demand for 50 multi-family residential buildings [30]. Kavgic et al. presented and compared five bottom-up building stock models for energy consumption [31]. They discussed the challenges in data availability and the lack of transparency of these models. Theodoridou et al. analyzed a large sample of Greek residential buildings with respect to energy consumption [32]. They adopted a correlation analysis approach to mine data from the large building stock. Several factors that influence energy consumption, such as year of construction, building typology and glazing type, were analyzed. Fracastoro and Serraino presented a methodology for large scale building stocks through study of the statistical distribution of floor area according to primary energy for heating demand [33]. Mata et al. presented a modelling strategy for energy, carbon, and cost assessments of building stocks in Sweden [34]. The bottom-up building physics model is based on a one-dimensional building energy balance, which gives hourly net energy demand. The energy conservation measures were evaluated on both an individual and an aggregated basis. Ascione et al. presented a multi-stage and multi-objective optimization for cost-optimal energy retrofitting of a building [35]. The time taken to simulate the large number of retrofitting scenarios is reduced by combining analysis in MATLAB with EnergyPlus. In another study by the same authors, artificial neural network (ANN) models are combined with EnergyPlus [36]. The outcomes of the EnergyPlus simulations are taken as targets for training and testing the ANN model, thereby reducing the time taken for evaluating all possible retrofitting scenarios.

1.1. Research gap

In summary, the benchmarking approach is convenient if data for a substantial sample is available. It provides an easy comparison with buildings of similar characteristics. However, the comparison of buildings is usually based on just one index, for example, the energy use intensity (EUI). The energy use intensity is the energy used per unit floor area of a building and is measured in kWh/m².year. In most benchmarking studies, other variables are ignored, mostly due to unavailability of data. However, buildings with different sets of building systems and characteristics can still have the same EUI. For example, a building with a high cooling load and a very efficient chiller system can use the same amount of electrical energy as a building with a low cooling load and an inefficient chiller system. The benchmarking approach, however, will place these two buildings in the same category.

Therefore, a more comprehensive system of analysis that takes all the necessary building characteristics into account is required. Multivariate statistical techniques like clustering are alternatives that can perform high-dimensional benchmarking by considering other variables as well. Clustering has the ability to deal with multi-dimensional data and partition them based on set criteria. This contrasts with the single-variable approach taken by current building policies to benchmark buildings. A few research studies have also highlighted and supported the clustering approach [37,38].

1.2. Research objective

To achieve such a holistic clustering approach, this study utilizes 14 building variables that are co-analyzed with the change in building EUI for pre- and post-retrofit conditions. A robust iterative process to select the appropriate variables for clustering is developed for this purpose. This process is based on the all-subset approach, where each possible combination of the 14 variables is tested to identify the best possible combination. A more detailed explanation is given in the next section. This study also takes advantage of the availability of both pre- and post-retrofit EUI data, which is usually missing in previous studies. The availability of this data presents a direct outlook into the change in energy consumption of these buildings due to retrofitting.
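As a rough illustration of what the all-subset approach implies in terms of search size, the following Python sketch enumerates every non-empty combination of 14 variables. It is not the authors' implementation: the function names and the placeholder labels `v1` to `v14` are assumptions made here for illustration, and the actual variables are those listed in Table 1.

```python
from itertools import combinations
from math import comb

N_VARIABLES = 14  # size of the variable superset reported in the study


def count_subsets_by_size(n):
    """Return {r: C(n, r)} for every subset size r = 1..n."""
    return {r: comb(n, r) for r in range(1, n + 1)}


def all_subsets(names):
    """Yield every non-empty combination of the given variable names."""
    for r in range(1, len(names) + 1):
        yield from combinations(names, r)


# Placeholder labels (assumption); the real variable names are in Table 1.
variable_names = [f"v{i}" for i in range(1, N_VARIABLES + 1)]

counts = count_subsets_by_size(N_VARIABLES)
total = sum(counts.values())
enumerated = sum(1 for _ in all_subsets(variable_names))

print("pairs:", counts[2])
print("subsets of six:", counts[6])
print("all non-empty subsets:", total, "enumerated:", enumerated)
```

In a full pipeline, each tuple yielded by `all_subsets` would select the columns passed to one clustering run, so the total count gives the number of clustering runs required per choice of cluster number.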

Fig. 2. GFA and total energy consumption for the buildings in increasing order of GFA (first 28 buildings).

2. Methodology

2.1. Data collection

This study is based on 56 energy audit reports that have been collected from 5 accredited Energy Service Companies (ESCOs) in Singapore. These energy audit reports contain detailed analysis of energy distribution and usage by various energy consuming systems in buildings. It was found that the air conditioning system has the highest energy consumption share. This is attributed to the high cooling loads in a hot and humid tropical climate such as that of Singapore. Another important characteristic of office buildings in terms of energy use is the large amount of IT equipment and the presence of data centers. However, the EUI values mentioned in this study do not include data center energy consumption. A more detailed account of the effect of the climate of Singapore on energy consumption in buildings can be found in related works by the authors [39,40].

The data collected consists of energy audit reports of 56 office buildings and some post-retrofit information for these buildings. It forms a part of the PhD work carried out by Deb in 2017 at the National University of Singapore [41]. The post-retrofit information includes overall energy consumption and the improved chiller plant efficiency in the post-retrofit condition. It was observed that ESCOs mainly focus their analyses on air conditioning plant room retrofitting. This is because the air conditioning system accounts for a large portion of the overall energy consumption in these types of buildings. It was observed that the energy savings were mostly due to chiller plant retrofit and could be as high as 96% of the total savings after retrofitting. This signifies the importance of air-conditioning related retrofitting. The Gross Floor Area (GFA) for these buildings ranges from 6588 m² to 89200 m², with the median being 28419 m². The electrical energy consumption ranges from 130 × 10⁴ to 2117 × 10⁴ kWh (Figs. 2 and 3).

The energy use intensity (EUI), also known as the 'energy utilization index', is measured as energy use per unit GFA and is used worldwide for the study of building energy performance. It includes both the electric energy directly used by the building and other forms of electric power used in the building, for example, electricity used for operating chillers. This index is convenient for normalizing the energy use of a building with regard to the GFA. The index provides a good estimate of the intensity of energy use in buildings. Chung presented a review of energy performance methodology for buildings, where EUI for several sets of buildings from different countries and climates is discussed [9]. A study by Lee and Priyadarshini discusses the EUI for office buildings in Singapore, where the average EUI for the dataset of 104 buildings is found to be around 221 kWh/m².yr [42]. For the current study, the EUI ranges from 196 to 303 kWh/m².yr for the pre-retrofit condition and from 168 to 243 kWh/m².yr for the post-retrofit condition (Fig. 4).

The relationship between pre-retrofit EUI and post-retrofit EUI provides an outlook on the extent of possible energy savings based on the existing pre-retrofit EUI of the building. A simple correlation based on Pearson's correlation coefficient is presented in Fig. 5. The correlation coefficient for this relationship is 0.63, which is not good enough for making predictions of post-retrofit EUI based on knowledge of the pre-retrofit EUI. Therefore, it is intended to explore other variables that influence energy use and compute energy savings as a function of such key variables.

2.2. K-means clustering

Clustering is a procedure to systematically identify natural groupings of data points or elements in a dataset. This is done in such a way that the characteristics of elements belonging to the same group or 'cluster' are similar to each other while differing from those of elements in other clusters. This generates a concise representation of grouping behavior in the dataset. Each cluster is defined by the member elements belonging to the cluster and by its centroid, or center. The center of each cluster represents the average value of its elements. The essence of the clustering approach is classification according to nearest location [43,44]. Clustering has also been used to classify time series data into groups [45].

There are a few popular methods of clustering that have been extensively used in the field of building energy analysis. Hierarchical clustering and k-means clustering are the most utilized. This study uses the k-means clustering technique to classify buildings based on several aspects and variables that are discussed later. K-means is a relatively efficient method, which is quite easy to implement and delivers accurate grouping of elements.
232 C. Deb, S.E. Lee / Energy and Buildings 159 (2018) 228–245

Fig. 3. GFA and total energy consumption for the buildings in increasing order of GFA (next 28 buildings).

Fig. 4. Range of EUI for pre-retrofit and post-retrofit conditions for the 56 office buildings that form the dataset.
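To make the EUI definition and the pre/post-retrofit comparison of Section 2.1 concrete, the following Python sketch computes EUI, the percentage change in EUI, and Pearson's correlation coefficient. The four buildings and all of their values are hypothetical (chosen only to fall within the EUI ranges reported above), and the function names are assumptions for illustration, not the authors' code.

```python
from math import sqrt


def eui(annual_energy_kwh, gfa_m2):
    """Energy use intensity in kWh/m2.year: annual energy use per unit GFA."""
    return annual_energy_kwh / gfa_m2


def percent_change(pre, post):
    """Percentage change from the pre-retrofit to the post-retrofit value."""
    return 100.0 * (post - pre) / pre


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical buildings: (GFA in m2, pre-retrofit kWh/year, post-retrofit kWh/year)
buildings = [
    (10000, 2500000, 2100000),
    (20000, 4400000, 4000000),
    (30000, 8400000, 6600000),
    (40000, 8000000, 7200000),
]

pre_eui = [eui(pre, gfa) for gfa, pre, _ in buildings]
post_eui = [eui(post, gfa) for gfa, _, post in buildings]
changes = [percent_change(a, b) for a, b in zip(pre_eui, post_eui)]
r = pearson_r(pre_eui, post_eui)

for a, b, c in zip(pre_eui, post_eui, changes):
    print(f"pre={a:.0f}  post={b:.0f}  change={c:.1f}%")
print(f"Pearson r = {r:.2f}")
```

A moderate coefficient such as the 0.63 reported in the study means that pre-retrofit EUI alone leaves a large share of the variation in post-retrofit EUI unexplained, which is the motivation for bringing in additional variables.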

Fig. 5. Correlation between pre-retrofit EUI and post-retrofit EUI.

The term k-means was first introduced by James MacQueen in 1967, and the algorithm is based on a measure of proximity [46]. Although there are many possibilities for measuring proximity, one approach is to note the nearness between each pair of elements to determine their closeness. Another is to observe the difference or distance between the pairs of elements, as distance is complementary to closeness. Distance measures are the most commonly used measurement of similarity between objects. These distances can be calculated on both single and multiple dimensions using Euclidean distances. The most common calculation estimates the geometric distance in multidimensional space and is given by Eq. (1).

$$d_{AB}^{2} = \sum_{j=1}^{p} \left(x_{rj} - x_{sj}\right)^{2} \tag{1}$$

Geometric distance between two elements for k-means clustering.

Here, 'd_AB' is the distance between element A and element B, 'x_k' is the position of an element in the space, 'x_k' = (x_k1, x_k2, ..., x_kp), with k = r for element A and k = s for element B, and 'p' is the space dimension. To derive true distance measures, Euclidean distances are usually calculated from raw rather than standardized data. However, if variables are measured on different scales, variables with large values contribute more to the distance measure than variables with small values. For example, the GFA of buildings is measured on a much larger scale (values can be up to 100,000) than the chiller plant efficiency (usually values between 0.5 and 2). Hence, when these two variables are taken to determine the clusters, there is a large bias towards GFA and the clusters that are formed are governed by the GFA scale. To overcome this bias, standard scores of variables are taken to perform clustering in the current study.

2.3. Methods to determine number of clusters

The k-means clustering algorithm has a set of initial conditions that need to be pre-specified. These include the number of clusters and the initial centroid locations. For initial centroid locations, this study selects random elements in the dataset. This is done because the dataset used for analysis is devoid of outliers, so there is no risk of selecting elements that are abnormally far from the mean as initial centroids. In addition, the clustering is repeated several times to neutralize any bias from random centroid selections. There is no global theoretical method to determine the optimal number of clusters. In general, an increase in the number of clusters increases the homogeneity of the clusters. However, this also results in too many clusters and presents a risk of overfitting. In practice, the clustering is performed several times with different values of 'k' and the results are compared based on certain pre-defined conditions. An example of this is the Davies–Bouldin index (DB) metric to evaluate clustering algorithms [47]. According to this index, the optimal number of clusters can be determined by relating the average distances between elements within clusters to the distances between the clusters. It is given in Eq. (2).

$$DB = \frac{1}{k} \sum_{i=1}^{k} \max_{j,\, j \neq i} \left( \frac{S_i + S_j}{d_{ij}} \right) \tag{2}$$

The DB metric to evaluate the optimal number of clusters for k-means clustering.

In this equation, the number of clusters is given by 'k'. 'S_i' is the average distance of the input elements of cluster 'i' to its centroid and 'S_j' is the average distance of the input elements of cluster 'j' to its centroid. The term 'd_ij' refers to the distance between the centroids of clusters 'i' and 'j'. Low values of the DB index indicate that elements within clusters are near to each other and cluster centers are far from each other.

Another method often employed to determine the optimal number of clusters is the 'elbow method'. In this method, the k-means clustering algorithm is run for different values of 'k' (for example, from 1 to 10) and for each run the sum of squared errors (SSE) is calculated. The SSE is the sum of the squared distances between the elements of each cluster and its centroid. Lower values of SSE indicate more compact clusters, and the SSE decreases as the number of clusters increases. An 'elbow' point in the line plot of SSE against the number of clusters is selected as the optimal number of clusters. Beyond this point, an increase in the value of 'k' results in diminishing returns. Both methods described above serve the same purpose of determining the optimal number of clusters, and the elbow method is selected for this study.
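The procedure described in Sections 2.2 and 2.3 (random-element initialisation, repeated runs, SSE for the elbow method, and the DB index) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: the nine two-dimensional points are hypothetical stand-ins for standardized building variables, and the function names, the number of restarts (`n_init`) and the fixed seed are choices made here for illustration.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def mean_point(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, n_init=30, max_iter=100, seed=0):
    """K-means with random-element initialisation, restarted n_init times.

    Returns (sse, centroids, labels) for the restart with the lowest SSE.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_init):
        centroids = rng.sample(points, k)  # random elements as initial centroids
        labels = assign(points, centroids)
        for _ in range(max_iter):
            for c in range(k):
                members = [p for p, lab in zip(points, labels) if lab == c]
                if members:  # keep the previous centroid if a cluster empties
                    centroids[c] = mean_point(members)
            new_labels = assign(points, centroids)
            if new_labels == labels:
                break
            labels = new_labels
        sse = sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
        if best is None or sse < best[0]:
            best = (sse, list(centroids), list(labels))
    return best


def davies_bouldin(points, centroids, labels):
    """DB index following Eq. (2), computed over the non-empty clusters."""
    clusters = sorted(set(labels))
    scatter = {}
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        scatter[c] = sum(dist(p, centroids[c]) for p in members) / len(members)
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(clusters)


# Hypothetical standardized data forming three well-separated groups.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1),
          (10.0, 0.0), (10.1, 0.2), (10.2, 0.1)]

sse_by_k, db_by_k = {}, {}
for k in range(1, 6):
    sse, centroids, labels = kmeans(points, k)
    sse_by_k[k] = sse
    if k >= 2:  # the DB index needs at least two clusters
        db_by_k[k] = davies_bouldin(points, centroids, labels)
    print(k, round(sse, 3), db_by_k.get(k))
```

Plotting `sse_by_k` against `k` gives the elbow curve; on data like this the SSE drops sharply up to the true number of groups and flattens afterwards, while the DB index is lowest where clusters are compact and far apart.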
234 C. Deb, S.E. Lee / Energy and Buildings 159 (2018) 228–245

Fig. 6. The effect of addition of newer variables on clustering outcome.

Table 1
List of variables extracted from the energy audit reports.

Sr. No.  Variable                                 Availability of data
                                                  Pre-retrofit   Post-retrofit
1        GFA                                      Yes            Yes
2        Total air-conditioning energy            Yes            Yes
3        Total non-air-conditioning energy        Yes            Yes
4        Total energy consumption                 Yes            Yes
5        Air-conditioning energy use percentage   Yes            Yes
6        Operating hours of the building          Yes            Yes
7        Avg. cooling load (RT) per day           Taken to be same as part of research assumption
8        Max. cooling load (RT) per day           Taken to be same as part of research assumption
9        Min. cooling load (RT) per day           Taken to be same as part of research assumption
10       Avg. Chiller Plant Eff. (kW/RT)          Yes            Yes
11       Max. Chiller Plant Eff. (kW/RT)          Yes            No
12       Min. Chiller Plant Eff. (kW/RT)          Yes            No
13       Total installed capacity of chillers     Yes            No
14       Chilled water temp. diff. (ΔT)           Yes            No

2.4. Variable selection for clustering characteristics indirectly aids in determining the cooling/heating
load for buildings.
The selection of variables to perform clustering depends on the As this study is based on the analysis of detailed energy audit
objective being targeted. For example, a set of buildings may be reports, the information on cooling load is readily available from
clustered based on the energy use index (EUI) to group buildings the energy audit reports. For this reason, the detailed exploration
exhibiting similar EUI values. Introducing a second variable, gross of physical characteristics of buildings has been avoided and only
floor area (GFA) in this case will result in clusters of buildings with the crucial variables influencing energy consumption have been
similar GFA as well as EUI. In this way, the selection of higher num- considered. The list of 14 variables extracted from energy audit
ber of variables leads to groups of buildings with higher similarity reports is presented in Table 1. As seen, the variables related to
as compared to the clustering done by selecting lesser number of cooling load have been assumed to be same in both pre- and post-
variables. This is illustrated in Fig. 6, where clustering with one, two and three variables is performed. Often, these variables or attributes that define each element are known as features. The position of each element is described by a set of these features or variables.
The criteria for selecting appropriate variables influencing energy consumption in buildings depend primarily on two aspects. The first is to assess the availability of measured data, which still poses a challenge for data-driven analysis of building energy efficiency [48]. The second is to understand the correlation of each variable with energy consumption, which is usually done through multiple linear regression. Several studies have attempted the formulation of regression models relating the influential variables to energy consumption. In most of these studies, physical characteristics like GFA, age of building, window to wall ratio, thermal transmittance values of envelope elements, occupant number, operational hours, cooling/heating degree days, etc. are taken as the influencing variables [16,21,49]. The physical characteristics have a key role in determining the cooling/heating load of buildings. For example, a building in a hot tropical climate with a high window to wall ratio and high thermal transmittance of windows would have a higher cooling load. This subsequently would lead to higher energy consumption for the building as the air-conditioning system needs to meet this higher demand in cooling. Hence, the data on physical
retrofit condition. The cooling load will only change provided there is improvement in the building envelope. Apart from that, if the occupancy and operation schedule of a building remain the same, the cooling load in pre- and post-retrofit conditions will be very similar. As mentioned earlier, most of the retrofits pertain to the chiller plant room. The detailed analysis on the correlation of these variables to the existing building EUI and on their energy consumption is presented in the following section, where the inter-dependencies of these variables are also discussed. To obtain an understanding of the mean values of these variables and their corresponding ranges, a boxplot diagram for these variables is presented in Fig. 7.
It is to be noted that standard score values have been used in the boxplot so that the range of all the variables can be viewed and compared in the same plot. The standard scores are computed using Eq. (3). The z-score represents the distance between the original value and the sample mean in units of standard deviation. The z-score is negative when the original value is below the mean, and it is positive when the original value is above the mean.

$$\text{Standard score or `z score'} = \frac{\text{Original value} - \text{Mean value}}{\text{Standard deviation}} \qquad (3)$$

Formula to obtain standard score or ‘z score’.
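As a brief illustration of Eq. (3), the following Python sketch computes standard scores for a single variable. The operating-hours values are hypothetical and serve only as an example, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Return (value - mean) / standard deviation for each value, as in Eq. (3)."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1); an assumption
    return [(v - mu) / sigma for v in values]


# Hypothetical weekly operating hours for five buildings (illustration only).
operating_hours = [50.0, 55.0, 60.0, 65.0, 70.0]
scores = z_scores(operating_hours)

for hours, z in zip(operating_hours, scores):
    # Values below the mean get a negative score, values above it a positive one.
    print(f"{hours:5.1f} h -> z = {z:+.3f}")
```

Because every variable is expressed in units of its own standard deviation, variables with very different magnitudes (for example GFA and chiller plant efficiency) can share one axis in a boxplot such as Fig. 7.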

Fig. 7. Box plot for standard scores of the fourteen variables.

In this study, the methodology to select variables for clustering is based on a robust iterative process. For this, all possible combinations of the 14 variables are explored by taking a certain number of variables at once. The number of possible combinations is derived using Eq. (4).

$$\binom{n}{r} \;\text{or}\; C_r^n = \frac{n!}{r!\,(n-r)!} = \frac{n(n-1)(n-2)\ldots(n-r+1)}{r!} \qquad (4)$$

Equation to derive the number of possible combinations.
Here, ‘n’ is the total number of variables and ‘r’ is the chosen number of variables. For example, by taking any 2 variables from the set of 14 variables, there are 91 combinations possible. Similarly, by taking any 6 variables at once, there are 3003 combinations possible. The details of the number of variables selected and the associated combinations are given in Table 2.

Table 2
Number of combinations while taking a certain number of variables. The clustering is performed for all these combinations.

Number of selected variables    Number of possible combinations
1                               14
2                               91
3                               364
4                               1001
5                               2002
6                               3003
7                               3432
8                               3003
9                               2002
10                              1001
11                              364
12                              91
13                              14
14                              1
Total                           16383

Fig. 8. Methodology to determine the appropriate set of variables for k-means clustering: an example by taking 2 variables.

The clustering is performed for each set of combinations. Since this study involves random selection of initial centroids, the clustering results may vary with the initial centroid locations. To overcome this, the clustering is repeated 100 times for each set of combinations, giving rise to a total of 1,638,300 clustering iterations. The iteration runs are limited to 100 due to time constraints, as it takes a long time to perform clustering with a high number of variables. For each clustering iteration, the values of change in EUI (%EUI) corresponding to the buildings in every cluster are recorded. They are then used to calculate the mean %EUI for that cluster. For each iteration, the top 10 results corresponding to the maximum difference in %EUI between clusters are recorded. At the end of 100 iterations for each set of combinations, there are a total of 1000 best results arising from taking the top 10 results of each iteration. Finally, the 1000 best results are plotted in a bar plot with respect to two outputs: first, to identify the combination number that has the highest occurrence number and second, to identify the combination number that corresponds to the highest values of %EUI. This combination number gives the set of variables that corresponds to the highest %EUI values. This part of the study is performed in MATLAB version R2015a. The stepwise methodology is illustrated in Fig. 8. The complete clustering process for this methodology can be summarized in the following steps. The objective is to identify the best set of variables
that produces highest change in energy consumption in pre- and post-retrofit condition.

1. Identify the number of variables to be selected for clustering.
2. Derive the number of possible combinations for the selected number.
3. Perform k-means clustering for each set of combinations (the number of clusters is decided by the elbow method).
4. Evaluate the results of the clustering based on the difference in mean values of %EUI between clusters.
5. Repeat the clustering 100 times and, for each iteration, record the top 10 results.
6. Analyze the best 1000 results (10 from each of the 100 iterations) to determine the combination number corresponding to the highest difference in mean of %EUI values.
7. Use the combination number to determine the set of variables that generates the best clustering result. The specific number associated with each variable is presented in Table 3.

Table 3
The variables taken for clustering and the associated number with each variable.

Sr. No.   Variable                                                       Associated number
1         GFA                                                            1
2         Total air-conditioning energy (before retrofit)                2
3         Total non-air-conditioning energy (before retrofit)            3
4         Total energy consumption (before retrofit)                     4
5         Air-conditioning energy use percentage                         5
6         Operating hours                                                6
7         Avg. cooling load (RT) per day                                 7
8         Max. cooling load (RT) per day                                 8
9         Min. cooling load (RT) per day                                 9
10        Avg. Chiller Plant Eff. (kW/RT)                                10
11        Max. Chiller Plant Eff. (kW/RT)                                11
12        Min. Chiller Plant Eff. (kW/RT)                                12
13        Total installed capacity of chillers                           13
14        Chilled water supply and return temperature difference (ΔT)    14

3. Results and discussion

The results of clustering are presented under two sections. These two sections capture two approaches. The first approach clusters buildings based on change in EUI between pre- and post-retrofit conditions. These buildings, belonging to specific clusters, are then assessed on their 14 variables. The idea is to see whether the variables also segregate when the buildings are clustered based on change in EUI.
The second approach is a mirrored step of the first. Here, the clustering is done based on the building variables and the resultant effect on change in EUI is studied. The challenge lies in selecting the most appropriate variables (from 14 variables) to perform the clustering. For this, a robust process is developed in which all the possible combinations of the 14 variables are tried. The best possible set of variables that provides the largest difference in change in EUI among the clusters is determined. The knowledge of these variables will help in determining the energy saving potential of buildings in their pre-retrofit condition.

3.1. Clustering based on change in EUI between pre- and post-retrofit conditions

The objective of this section is to cluster buildings based on change in pre- and post-retrofit EUI as a percent of pre-retrofit EUI (abbreviated as %EUI). Knowing the %EUI clusters and then looking into the distribution of variables in various clusters will provide an insight on the potential of %EUI for a given set of building variables. In other words, it is intended to identify the distribution and set of building variables that lead to corresponding clusters of %EUI. Since only one variable (%EUI) is used to cluster the 56 buildings, the clustering is one dimensional. It is to be noted that no other variables are used for clustering at this step; rather, all other variables are studied once the buildings are clustered based on %EUI. The elbow method to determine the number of clusters shows that the division of the dataset into three clusters is optimum (Fig. 9). This is because there is an ‘elbow’ drop at this point, as marked with a small red circle. The elbow method tends to balance between the number of clusters and the sum of squared errors of the elements of each cluster. The three clusters of buildings based on %EUI clustering are presented in Fig. 10. The results show that these three clusters have mean values of 10.19, 16.63 and 23.59%, respectively (Table 4). The numbers of buildings falling into these clusters are 13, 29 and 14, respectively. This shows that there are three distinct clusters of buildings with similar change in energy consumption between pre- and post-retrofit conditions.

Fig. 9. Result of the elbow method to determine optimum number of clusters.

Fig. 10. The three clusters based on the percentage change in EUI between pre- and post-retrofit condition.

Table 4
Clustering results for the three clusters based on percentage change in EUI.

Clustering %EUI   No. of elements   Mean (%)   Range
Cluster 1         13                10.19      5.22–12.69
Cluster 2         29                16.63      13.58–19.96
Cluster 3         14                23.59      20.34–30.04

The results of assessing variables with respect to the %EUI clustering show that there is no clear segregation in variables when compared to the clear distribution in %EUI (Fig. 11). This indicates that buildings cannot be clearly distinguished into groups based on information on their pre-retrofit condition. The three variables that have some distinction in distribution are: average chiller plant efficiency, percentage of air-conditioning energy and difference in chilled water supply and return temperature. However, this analysis shows that clustering based on %EUI is not enough to cluster buildings into groups that have similar characteristics. In other words, this analysis shows that by taking all the variables, it is difficult to determine the energy saving potential of buildings. It is to be noted that the %EUI values can provide fair information on the energy saving potential of buildings. A high value indicates high energy saving potential and vice versa. Therefore, further analysis is needed that takes the specific variables influencing %EUI. This will make it possible to determine groups of buildings with similar characteristics, exhibiting similar values of %EUI. The next section takes specific variables to cluster buildings. The study on the distribution of resultant %EUI is done thereafter. This process, in a way, is a mirror of what is attempted in this section.

3.2. Clustering based on building variables

The methodology to select variables for clustering is based on a robust iterative process. This is described in detail in Section 2.4. Here, all possible combinations for the 14 variables are tested. For each chosen set of variables, a hundred clustering iterations are performed to normalize the effect of random centroid initialization. The results of the clustering process are evaluated based on the outcome of differences between means of %EUI.
This section begins with a detailed analysis of the steps involved for one set of combinations. This is followed by a summary and discussion on all different sets of combinations of variables that are taken for clustering. As an example, the first set of clustering is performed by taking 2 variables. Based on the elbow method, the number of clusters is taken as both 2 and 3. The analysis for both is presented separately. The number of combinations that are possible by taking any 2 variables of the 14 variables is 91 ($C_2^{14}$). The k-means clustering is done using MATLAB 2015a. Before
performing the clustering, each combination of the two variables is listed and the clustering is performed for all these 91 combinations. This entire process is repeated for a hundred iterations. At each clustering iteration, the top 10 combinations (out of 91) that result in the maximum difference in mean values of %EUI are recorded (Fig. 12). In this way, at the end of 100 iterations, there are a total of 1000 best combinations recorded. At this point, there are two outputs that are examined. First, the combinations of variables that occur the highest number of times in these 1000 combinations are identified and second, the sum of differences in mean values of %EUI for these unique combinations is also identified. The analysis of these two outputs will help identify the best set of variables that should be selected for clustering. For example, the result for clustering buildings taking 2 variables and plotting the corresponding mean values of %EUI is shown in Fig. 12. It shows that the combination numbers 17, 47, 60, 67, 16, 21, 22, 23, 82 and 9 exhibit the best values for difference in mean values of %EUI. Combination number 17 corresponds to the selection of variable numbers 2 and 6, which are air-conditioning energy consumption and operation hours respectively. Similarly, combination number 47 corresponds to the selection of variable numbers 5 and 6, which are percentage of air-conditioning energy and operation hours. The list of numbers associated with the variables is presented in Table 3. The cumulative effect of such analysis for a hundred iterations is presented in Figs. 13 and 14.
As shown in these results, the cumulative sum of differences in mean values for %EUI shows that the variables corresponding to combination number 17 produce the best result. These variables are air-conditioning energy consumption and operation hours of the building. The total sum of differences in mean values of %EUI
is 456. The combinations of variables that occur the highest number of times are also noted. The combination number for variables with the highest occurrences is also 17. However, it is interesting to note that the second highest occurrence is for combination number 22. This corresponds to variable numbers 2 and 11 respectively. These two variables correspond to air-conditioning energy consumption and total installed capacity of chillers (in units of refrigeration tons (RT)). This differs from the second highest combination number for sum of differences in mean values of %EUI. For this case, the combination number is 47, corresponding to variable numbers 5 and 6 respectively. These are the variables of percentage of air-conditioning energy and operation hours of the building.

Fig. 11. Clusters of various variables based on change in EUI clusters for: a). GFA. b). Pre-retrofit air-conditioning energy consumption. c). Pre-retrofit non-air-conditioning energy consumption. d). Pre-retrofit total energy consumption. e). Percentage of air-conditioning energy. f). Operating hours. g). Average cooling load per day. h). Maximum cooling load per day. i). Minimum cooling load per day. j). Average chiller plant efficiency. k). Maximum chiller plant efficiency. l). Minimum chiller plant efficiency. m). Total installed capacity of chillers. n). Difference in chilled water main (supply and return temperature).

Fig. 12. Plot between sum of difference in mean values of change in EUI and combination of variables. Such iterations are repeated 100 times. In this case, the combination numbers 17, 47, 60, 67, 16, 21, 22, 23, 82 and 9 exhibit the top 10 results.

Fig. 13. Cumulative sum of differences in change in EUI for 100 iterations.

It is to be noted that these results are based on clustering with a ‘k’ value of 2. With a value of ‘k’ as 3, the clustering results exhibit the highest sum of differences in mean values of %EUI to be 453 for the combination number corresponding to 47. This combination number also corresponds to the highest number of occurrences.

Table 5
Combination number and sum of differences in mean values of change in EUI from the 10 best results for 100 iterations (dividing the data into 2 clusters).

Number of selected   Number of possible   Combination number   Sum of differences in
variables            combinations         for best result      mean values of %EUI
1                    14                   6                    586.3
2                    91                   17                   456
3                    364                  47                   453
4                    1001                 114                  632
5                    2002                 1834                 620.8
6                    3003                 2853                 508
7                    3432                 2938                 387.4
8                    3003                 2985                 472
9                    2002                 1054                 340.8
10                   1001                 10                   317
11                   364                  10                   317
12                   91                   10                   317
13                   14                   10                   317
14                   1                    1                    314.5
Total                16383

Table 6
Combination number and sum of differences in mean values of change in EUI from the 10 best results for 100 iterations (dividing the data into 3 clusters).

Number of selected   Number of possible   Combination number   Sum of differences in
variables            combinations         for best result      mean values of %EUI
1                    14                   5                    496.78
2                    91                   47                   540.57
3                    364                  22                   334.61
4                    1001                 209                  191.88
5                    2002                 232                  227.37
6                    3003                 2092                 134.97
7                    3432                 886                  123.95
8                    3003                 45                   173.86
9                    2002                 26                   290.37
10                   1001                 21                   302.49
11                   364                  6                    315.9
12                   91                   38                   302.59
13                   14                   3                    444.54
14                   1                    1                    6.52
Total                16383

This analysis is done for all possible combinations of the variables. However, the results in the previous three plots are presented for combinations taking only 2 variables. For combinations generated by using a higher number of variables, the results for clustering into 2 and 3 clusters are shown in Tables 5 and 6 respectively. They show that the best result is obtained by using 4 variables for the combination number of 114. This corresponds to the 4 variables of GFA, non-air-conditioning energy consumption, average chiller plant efficiency and installed capacity of chillers. The cumulative
sum of differences for change in EUI for buildings in these 2 clusters is 632. This cumulative value is a result of a hundred iterations. Therefore, on average, the difference in mean values of change in EUI for these 2 clusters is 6.32. The boxplot for the %EUI values for these two clusters is presented in Fig. 15. The next best result is seen when 5 variables are taken for clustering. These 5 variables are percentage of air-conditioning energy, average cooling load per day, average chiller plant efficiency, maximum chiller plant efficiency and difference in chilled water supply and return temperature. The combination number for the best results for clustering with 5 variables is 1834. The time taken to perform the clustering for both 2 and 3 clusters is shown in Table 7. It is seen that the time taken for the clustering process varies and broadly increases with the number of possible combinations. It is to be noted that this analysis is not based on real-time data and the mention of time spent for analysis is just to show that complex clustering with a higher number of variables indeed takes a longer time.
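The combination numbers quoted in this section can be traced back to variable sets with a short Python sketch. It assumes that combinations are numbered from 1 in lexicographic order over the variable numbers of Table 3. The text does not state this ordering, but it is consistent with the pairings reported for combination numbers 17, 47, 114 and 1834. The helper names `combination_counts` and `variables_for` are hypothetical.

```python
from itertools import combinations
from math import comb

N_VARIABLES = 14  # variables are numbered 1..14 as in Table 3


def combination_counts(n=N_VARIABLES):
    """Number of r-variable subsets for r = 1..n, following Eq. (4)."""
    return {r: comb(n, r) for r in range(1, n + 1)}


def variables_for(combination_number, r, n=N_VARIABLES):
    """Variable numbers of the subset at a 1-based position in lexicographic order.

    Lexicographic numbering is an assumption, not something stated in the text.
    """
    for index, subset in enumerate(combinations(range(1, n + 1), r), start=1):
        if index == combination_number:
            return subset
    raise ValueError("combination number out of range")


counts = combination_counts()
total = sum(counts.values())  # all non-empty subsets of 14 variables

print(counts[2], counts[6], total)
print(variables_for(17, 2))    # air-conditioning energy, operating hours
print(variables_for(47, 2))    # air-conditioning percentage, operating hours
print(variables_for(114, 4))   # GFA, non-AC energy, avg. plant eff., installed capacity
print(variables_for(1834, 5))  # the five-variable set named in the text
```

Under this assumption, the totals match Table 2 and each reported combination number resolves to the variable set named in the text.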

Fig. 14. Plot showing the number of occurrences for each combination of 2 variables for 100 iterations.

Fig. 15. Boxplot for %EUI for the 2 clusters obtained by taking the best combination of 4 variables.
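The variable-selection procedure of this section (cluster every combination, repeat with random initial centroids, keep the top results per iteration, then tally occurrences and cumulative differences) can be sketched in Python as follows. This is a minimal illustration and not the MATLAB implementation used in the study. The data are hypothetical, variables are indexed from 0, features are assumed to be already scaled, and scoring a clustering by the largest minus the smallest cluster mean of %EUI is an assumption about how the "difference in mean values" is formed when there are more than two clusters.

```python
import random
from collections import Counter, defaultdict
from itertools import combinations
from statistics import mean


def kmeans(points, k, rng, max_iter=50):
    """Plain k-means with randomly chosen initial centroids; returns one label per point."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels


def eui_spread(labels, eui_change, k):
    """Largest minus smallest cluster mean of %EUI (0.0 if only one cluster is populated)."""
    means = [
        mean(e for e, lab in zip(eui_change, labels) if lab == c)
        for c in range(k) if c in labels
    ]
    return max(means) - min(means) if len(means) > 1 else 0.0


def rank_combinations(features, eui_change, r, k=2, iterations=100, top=10, seed=0):
    """Tally how often each r-variable combination is among the top results."""
    rng = random.Random(seed)
    subsets = list(combinations(range(len(features[0])), r))
    occurrences, cumulative = Counter(), defaultdict(float)
    for _ in range(iterations):
        scored = []
        for number, subset in enumerate(subsets, start=1):
            points = [tuple(row[j] for j in subset) for row in features]
            scored.append((eui_spread(kmeans(points, k, rng), eui_change, k), number))
        for score, number in sorted(scored, reverse=True)[:top]:
            occurrences[number] += 1
            cumulative[number] += score
    best = max(cumulative, key=cumulative.get)
    return best, subsets[best - 1], occurrences, cumulative


# Hypothetical data: 12 buildings, 3 variables; only variable 0 tracks %EUI.
features = [
    (0.0, 1.0, 3.0), (0.1, 5.0, 3.0), (0.2, 1.0, 3.0),
    (0.1, 5.0, 7.0), (0.0, 1.0, 7.0), (0.2, 5.0, 7.0),
    (10.0, 1.0, 3.0), (10.1, 5.0, 3.0), (9.9, 1.0, 3.0),
    (10.2, 5.0, 7.0), (10.0, 1.0, 7.0), (9.8, 5.0, 7.0),
]
eui_change = [10.0, 11.0, 9.0, 10.0, 12.0, 11.0, 24.0, 25.0, 23.0, 26.0, 24.0, 25.0]

best, best_subset, occurrences, cumulative = rank_combinations(
    features, eui_change, r=1, k=2, iterations=20, top=2
)
print(best, best_subset, occurrences[best], cumulative[best])
```

In this toy data set the single variable that separates the buildings into a low-%EUI and a high-%EUI group is selected in every iteration, which mirrors how the occurrence count and the cumulative sum of differences are used to pick a combination number.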

4. Conclusions

This study presents an analysis of pre- and post-retrofit energy consumption data for 56 buildings. There are 14 building variables that are analyzed. The prime objective is to identify the best set of variables that can be used to cluster buildings to obtain an insight on their energy saving potential. The existing procedures to select variables for clustering are either based on intuitive human experience or the development of regression models. In this study, by contrast, the methodology to select variables for clustering is based on a robust iterative process. For this, all possible combinations of the 14 available variables are explored. This amounts to a total of 16,383 combinations. The clustering is performed for all these combinations with a stopping criterion for each iteration using k-means clustering. It will be interesting to see how other clustering approaches like k-medoids and hierarchical clustering perform in this analysis. However, this is beyond the scope of this study. To accommodate the random selection of cluster centroids for k-means clustering, the clustering is repeated for a hundred iterations. This analysis provides building variables that should be used for clustering and can be applied for the study of existing buildings needing retrofitting. This methodology enhances the current approaches for clustering buildings with respect to building variables and energy performance. The following major conclusions are drawn from this study.
· As per the first approach, there are 3 clusters of buildings based on the change in EUI (%EUI). These 3 clusters have mean values of 10.19, 16.63 and 23.59%, respectively. The numbers of elements in these 3 clusters are 13, 29 and 14. This shows that groups of buildings experience different levels of change in EUI values between pre- and post-retrofit conditions. However, there is no clear segregation of the building variables for the three clusters when compared to the clear distribution in %EUI. Only the variables of average chiller plant efficiency and percentage of air-conditioning energy have some segregation.
· As per the second approach, the building variables are taken for clustering. A robust iterative process is developed for this purpose. The results show that the 4 variables of GFA, non-air-conditioning energy consumption, average chiller plant efficiency and installed capacity of chillers provide the best result for clustering.

These clustering variables and boxplots for change in EUI can be used to explore the expected energy saving potential for buildings targeted for retrofitting. This method does not remove the need for data collection on building variables but rather provides a significant comparison platform with similar buildings that have undergone retrofitting. A limitation of this study is that it considers only 56 energy audit reports by 5 ESCOs. Data from more buildings would have added more robustness to the study. However, most of the office buildings in Singapore are high-rise buildings, with very similar building design and construction. Due to this, the sample of buildings used in this study represents the population of office buildings in Singapore fairly well. In addition, this methodology can be expanded to other building typologies like retail, health-care and residential buildings. An associated challenge is the collection of data, but the methodology can be readily applied to the study of other building types as well.

Table 7
Information on the number of combinations that are possible and the time taken to perform all the iterations.

Number of selected   Number of possible   Time taken for 100 iterations (minutes)
variables            combinations         2 clusters   3 clusters
1                    14                   0.26         0.1
2                    91                   1.07         0.90
3                    364                  3.9          3.7
4                    1001                 10.53        10.52
5                    2002                 22.08        18.98
6                    3003                 33.86        28
7                    3432                 42.08        32.58
8                    3003                 27.05        28.31
9                    2002                 21.75        48.26
10                   1001                 7.53         13.48
11                   364                  2.72         5.18
12                   91                   0.72         1.23
13                   14                   0.13         0.37
14                   1                    0.1          0.1
Total                16383

Acknowledgements

The authors deeply acknowledge the contribution of the following five Energy Service Companies (ESCOs) in providing all the energy audit reports used in this study: EMSI Singapore, E2green Pte Ltd., G-Energy Global Pte Ltd., ING-Energy Global Pte Ltd., LJ Energy Pte Ltd.

References

[1] IEA, Energy and Climate Change, 2015, http://dx.doi.org/10.1038/479267b.
[2] W.E. Council, Energy Efficiency Policies and Indicators, 2004, London.
[3] United Nations Development Programme, Promoting Energy Efficiency in Buildings: Lessons Learned from International Experience, 2010, New York, USA.
[4] Energy Efficiency Directive, Off. J. Eur. Union, 2012.
[5] US Department of Energy, Buildings Energy Data Book, 2011, Washington D.C.
[6] N.C. Inc, Energy Efficiency Retrofits for Commercial and Public Buildings, 2014 https://www.navigantresearch.com/newsroom/commercial-building-energy-efficiency-retrofits-will-surpass-127-billion-in-annual-market-value-by-2023.
[7] R. Ruparathna, K. Hewage, R. Sadiq, Improving the energy efficiency of the existing building stock: a critical review of commercial and institutional buildings, Renew. Sustain. Energy Rev. 53 (2016) 1032–1045, http://dx.doi.org/10.1016/j.rser.2015.09.084.
[8] S. Hall, Development and initial trial of a tool to enable improved energy & human performance in existing commercial buildings, Renew. Energy 67 (2014) 109–118, http://dx.doi.org/10.1016/j.renene.2013.11.022.
[9] W. Chung, Review of building energy-use performance benchmarking methodologies, Appl. Energy 88 (2011) 1470–1479, http://dx.doi.org/10.1016/j.apenergy.2010.11.022.
[10] L. Pérez-Lombard, J. Ortiz, R. González, I.R. Maestre, A review of benchmarking, rating and labelling concepts within the framework of building energy certification schemes, Energy Build. 41 (2009) 272–278, http://dx.doi.org/10.1016/j.enbuild.2008.10.004.
[11] S.H. Lee, T. Hong, M.A. Piette, S.C. Taylor-Lange, Energy retrofit analysis toolkits for commercial buildings: a review, Energy 89 (2015) 1087–1100, http://dx.doi.org/10.1016/j.energy.2015.06.112.
[12] J. Yang, M. Santamouris, S.E. Lee, C. Deb, Energy performance model development and occupancy number identification of institutional buildings, Energy Build. (2015), http://dx.doi.org/10.1016/j.enbuild.2015.12.018.
[13] J. Yang, C. Ning, C. Deb, F. Zhang, D. Cheong, S. Eang Lee, C. Sekhar, K. Wai Tham, k-Shape clustering algorithm for building energy usage patterns analysis and forecasting model accuracy improvement, Energy Build. 146 (2017) 27–37, http://dx.doi.org/10.1016/j.enbuild.2017.03.071.
[14] F. Zhao, S.H. Lee, G. Augenbroe, Reconstructing building stock to replicate energy consumption data, Energy Build. 117 (2016) 301–312, http://dx.doi.org/10.1016/j.enbuild.2015.10.001.
[15] E. Commission, Commission Delegated Regulation (EU) No 244/2012 of 16 January 2012 Supplementing Directive 2010/31/EU of the European Parliament and of the Council on the Energy Performance of Buildings by Establishing a Comparative Methodology Framework for Calculating, 2012.
[16] R. Arambula Lara, G. Pernigotto, F. Cappelletti, A. Gasparella, Energy audit of schools by means of cluster analysis, Energy Build. 95 (2015) 160–171, http://dx.doi.org/10.1016/j.enbuild.2015.03.036.
[17] N. Gaitani, C. Lehmann, M. Santamouris, G. Mihalakakou, P. Patargias, Using principal component and cluster analysis in the heating evaluation of the school building sector, Appl. Energy 87 (2010) 2079–2086, http://dx.doi.org/10.1016/j.apenergy.2009.12.007.
[18] M. Santamouris, G. Mihalakakou, P. Patargias, N. Gaitani, K. Sfakianaki, M. Papaglastra, C. Pavlou, P. Doukas, E. Primikiri, V. Geros, M.N. Assimakopoulos, R. Mitoula, S. Zerefos, Using intelligent clustering techniques to classify the energy performance of school buildings, Energy Build. 39 (2007) 45–51, http://dx.doi.org/10.1016/j.enbuild.2006.04.018.
[19] C. Filippín, F. Ricard, S. Flores Larsen, Evaluation of heating energy consumption patterns in the residential building sector using stepwise selection and multivariate analysis, Energy Build. 66 (2013) 571–581, http://dx.doi.org/10.1016/j.enbuild.2013.07.054.
[20] Z. Yu, B.C.M. Fung, F. Haghighat, H. Yoshino, E. Morofsky, A systematic procedure to study the influence of occupant behavior on building energy consumption, Energy Build. 43 (2011) 1409–1417, http://dx.doi.org/10.1016/j.enbuild.2011.02.002.
[21] X. Gao, A. Malkawi, A new methodology for building energy performance benchmarking: an approach based on intelligent clustering algorithm, Energy Build. 84 (2014) 607–616, http://dx.doi.org/10.1016/j.enbuild.2014.08.030.
[22] C.E. Kontokosta, Modeling the energy retrofit decision in commercial office buildings, Energy Build. 131 (2016) 1–20, http://dx.doi.org/10.1016/j.enbuild.2016.08.062.
[23] D.E. Marasco, C.E. Kontokosta, Applications of machine learning methods to identifying and predicting building retrofit opportunities, Energy Build. 128 (2016) 431–441, http://dx.doi.org/10.1016/j.enbuild.2016.06.092.
[24] D. Hsu, Comparison of integrated clustering methods for accurate and stable prediction of building energy consumption data, Appl. Energy 160 (2015) 153–163, http://dx.doi.org/10.1016/j.apenergy.2015.08.126.
[25] H.S. Park, M. Lee, H. Kang, T. Hong, J. Jeong, Development of a new energy benchmark for improving the operational rating system of office buildings using various data-mining techniques, Appl. Energy 173 (2016) 225–237, http://dx.doi.org/10.1016/j.apenergy.2016.04.035.
[26] C. Deb, F. Zhang, J. Yang, S. Eang Lee, K. Wei Shah, A Review on Time Series Forecasting Techniques for Building Energy Consumption, 2017, http://dx.doi.org/10.1016/j.rser.2017.02.085.
[27] M. Raatikainen, J.-P. Skön, K. Leiviskä, M. Kolehmainen, Intelligent analysis of energy consumption in school buildings, Appl. Energy 165 (2016) 416–429, http://dx.doi.org/10.1016/j.apenergy.2015.12.072.
[28] Z. Ma, P. Cooper, D. Daly, L. Ledo, Existing building retrofits: methodology and state-of-the-art, Energy Build. 55 (2012) 889–902, http://dx.doi.org/10.1016/j.enbuild.2012.08.018.
[29] C.A. Balaras, A.G. Gaglia, E. Georgopoulou, S. Mirasgedis, Y. Sarafidis, D.P. Lalas, European residential buildings and empirical assessment of the Hellenic building stock, energy consumption, emissions and potential energy savings, Build. Environ. 42 (2007) 1298–1314, http://dx.doi.org/10.1016/j.buildenv.2005.11.001.
[30] M. Caldera, S.P. Corgnati, M. Filippi, Energy demand for space heating through a statistical approach: application to residential buildings, Energy Build. 40 (2008) 1972–1983, http://dx.doi.org/10.1016/j.enbuild.2008.05.005.
[31] M. Kavgic, A. Mavrogianni, D. Mumovic, A. Summerfield, Z. Stevanovic, M. Djurovic-Petrovic, A review of bottom-up building stock models for energy consumption in the residential sector, Build. Environ. 45 (2010) 1683–1697, http://dx.doi.org/10.1016/j.buildenv.2010.01.021.
[32] I. Theodoridou, A.M. Papadopoulos, M. Hegger, Statistical analysis of the Greek residential building stock, Energy Build. 43 (2011) 2422–2428, http://dx.doi.org/10.1016/j.enbuild.2011.05.034.
[33] G.V. Fracastoro, M. Serraino, A methodology for assessing the energy performance of large scale building stocks and possible applications, Energy Build. 43 (2011) 844–852, http://dx.doi.org/10.1016/j.enbuild.2010.12.004.
[34] É. Mata, A.S. Kalagasidis, F. Johnsson, A modelling strategy for energy, carbon, and cost assessments of building stocks, Energy Build. 56 (2013) 100–108, http://dx.doi.org/10.1016/j.enbuild.2012.09.037.
[35] F. Ascione, N. Bianco, C. De Stasio, G.M. Mauro, G.P. Vanoli, Multi-stage and multi-objective optimization for energy retrofitting a developed hospital reference building: a new approach to assess cost-optimality, Appl. Energy 174 (2016) 37–68, http://dx.doi.org/10.1016/j.apenergy.2016.04.078.
[36] F. Ascione, N. Bianco, C. De Stasio, G.M. Mauro, G.P. Vanoli, Artificial neural networks to predict energy performance and retrofit scenarios for any member of a building category: a novel approach, Energy 118 (2017) 999–1017, http://dx.doi.org/10.1016/j.energy.2016.10.126.
[37] E. Wang, Benchmarking whole-building energy performance with multi-criteria technique for order preference by similarity to ideal solution using a selective objective-weighting approach, Appl. Energy 146 (2015) 92–103, http://dx.doi.org/10.1016/j.apenergy.2015.02.048.
[38] W.-S. Lee, L.-C. Lin, Evaluating and ranking the energy performance of office building using technique for order preference by similarity to ideal solution, Appl. Therm. Eng. 31 (2011) 3521–3525, http://dx.doi.org/10.1016/j.applthermaleng.2011.07.005.
[39] C. Deb, L.S. Eang, J. Yang, M. Santamouris, Forecasting diurnal cooling energy load for institutional buildings using Artificial Neural Networks, Energy Build. (2015), http://dx.doi.org/10.1016/j.enbuild.2015.12.050.
[40] C. Deb, L.S. Eang, J. Yang, M. Santamouris, Forecasting energy consumption of institutional buildings in Singapore, Procedia Eng. 121 (2015) 1734–1740, http://dx.doi.org/10.1016/j.proeng.2015.09.144.
[41] C. Deb, Development of an Automated Energy Audit Protocol for Office Buildings, National University of Singapore, 2017 http://scholarbank.nus.edu.sg/handle/10635/136280.
[42] S.E. Lee, P. Rajagopalan, Building energy efficiency labeling programme in Singapore, Energy Policy 36 (2008) 3982–3992, http://dx.doi.org/10.1016/j.enpol.2008.07.014.
[43] J.F. Hair, R.E. Anderson, R.L. Tatham, Multivariate Data Analysis with Readings, Macmillan, 1987.
[44] J.D. Jobson, Applied Multivariate Data Analysis, Springer New York, New York, NY, 1992, http://dx.doi.org/10.1007/978-1-4612-0921-8.
[45] T. Warren Liao, Clustering of time series data—a survey, Pattern Recognit. 38 (2005) 1857–1874, http://dx.doi.org/10.1016/j.patcog.2005.01.025.
[46] J. MacQueen, Some methods for classification and analysis of multivariate observations, Proc. Fifth Berkeley Symp. Math. Stat. Prob., Vol. 1, University of California Press, Berkeley, 1967, pp. 281–297, http://projecteuclid.org/euclid.bsmsp/1200512992. (Accessed 6 November 2016).
[47] D.L. Davies, D.W. Bouldin, A cluster separation measure, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-1 (1979) 224–227, http://dx.doi.org/10.1109/TPAMI.1979.4766909.
[48] T. Hong, L. Yang, D. Hill, W. Feng, Data and analytics to inform energy retrofit of high performance buildings, Appl. Energy 126 (2014) 90–106, http://dx.doi.org/10.1016/j.apenergy.2014.03.052.
[49] W. Chung, Y.V. Hui, Y.M. Lam, Benchmarking the energy efficiency of commercial buildings, Appl. Energy 83 (2006) 1–14, http://dx.doi.org/10.1016/j.apenergy.2004.11.003.
