Determining Key Variables Influencing Energy Consumption in Office Buildings Through Cluster Analysis of Pre- and Post-Retrofit Building Data
Article info

Article history:
Received 7 September 2017
Received in revised form 11 October 2017
Accepted 6 November 2017
Available online 10 November 2017

Keywords:
Building energy
Cluster analysis
Energy efficiency
Office buildings
Building retrofit
K-means clustering

Abstract

This study aims to determine key building variables influencing energy consumption in air-conditioned office buildings. The study is based in Singapore, which has tropical climatic conditions. The analysis is based on an assessment of several energy audit reports concerning pre- and post-retrofit data from 56 office buildings. A list of 14 building variables, extracted from these reports, forms the superset. These are systematically analyzed further to derive key variables influencing energy consumption and retrofitting decisions. For this purpose, a robust iterative process is developed utilizing k-means clustering. This process tests all combinations of the 14 variables against change in energy use intensity (EUI, measured as kWh/m².year) for pre- and post-retrofit conditions. The results indicate that the best set of variables consists of: 1) gross floor area (GFA), 2) non-air-conditioning energy consumption, 3) average chiller plant efficiency, and 4) installed capacity of chillers. This information can be utilized to explore the energy saving potential of office buildings that need to be retrofitted. The resultant clusters can also be used to benchmark buildings based on pre-retrofit conditions and energy saving potential.

© 2017 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.enbuild.2017.11.007
0378-7788/© 2017 Elsevier B.V. All rights reserved.
C. Deb, S.E. Lee / Energy and Buildings 159 (2018) 228–245 229
Fig. 1. Energy efficiency revenue by building type and region for 2011.
frequency distribution. Even for this study, no investigation on the energy saving potential was done. A slightly different clustering analysis with hierarchical clustering using the centroid method was performed by Filippin et al. [19]. The building variables were selected by stepwise multivariate regression and, in parallel, the buildings were grouped using clustering techniques. The centroid of each cluster was determined by taking the average of the variables that described that cluster. Yu et al. performed clustering analysis to examine the influences of occupant behavior on building energy consumption [20]. Gao and Malkawi proposed a new methodology for energy performance benchmarking using clustering algorithms [21]. The method consists of four steps starting with the feature selection step. An ordinary least squares (OLS) stepwise regression was applied to extract the features. Kontokosta analyzed a detailed survey from asset managers of 763 office buildings to determine the factors affecting retrofitting decisions by building owners [22]. The factors were characterized into four groups: building design and systems, fuel type and consumption patterns (primary fuel type and weather normalized EUI), ownership type and tenant demand, and spatial/market controls. The predictive model developed could correctly classify 81.2% of the retrofit activity in the sample. In another study by Marasco and Kontokosta, energy audit data for over 1100 buildings in New York City were analyzed to identify opportunities for ECMs across building system categories (e.g. distribution system, domestic hot water, etc.) [23]. A machine learning classifier based on binary features derived from the dataset was developed to predict ECM eligibility given a specific set of building characteristics. Hsu compared the integrated clustering methods for predicting building energy consumption [24]. It was observed that cluster-wise regression was more accurate for predictions when compared to the k-means clustering but resulted in less stable clusters and vice versa.

In the last three to four years, considerable advances have been made in the field of building energy efficiency improvement through data analytics. Park et al. developed a new energy benchmark for improving the rating system for office buildings in Korea using various data-mining techniques [25]. The study divided the buildings based on five criteria, and it was seen that some of the buildings were downgraded or upgraded in the rating system when these five criteria were considered. The availability of energy data through meters has led to a series of data-driven analyses, as summarized by Deb et al. [26]. Raatikainen et al. studied the energy consumption in school buildings using Self-Organizing Maps (SOM) [27]. This technique was combined with k-means clustering to obtain schools with similar electricity consumption costs per cubic meter. Ruparathna et al. critically reviewed the existing methodologies for improving the energy efficiency of existing commercial and institutional buildings [7]. It was noted that even though the existing studies predominantly focused on technical advancements, approaches such as building behavioral changes have been largely overlooked as a strategy for improving energy efficiency in buildings. A considerable amount of research has been undertaken by the Lawrence Berkeley National Laboratory (LBNL) on energy retrofit analysis toolkits in the past ten years. Ma et al. have provided a comprehensive review of existing building retrofits and highlighted the lack of decision-making methods to identify the most cost-effective retrofit measures [28].

Balaras et al. presented a methodology to determine the priorities for energy conservation measures (ECMs) in Hellenic residential buildings [29]. A total of 14 ECMs were assessed along with many assumptions that were combined with the available data. However, the use of an intelligent clustering technique that naturally groups elements was missing. Caldera et al. developed a statistical method to evaluate the heating energy demand for 50 multi-family residential buildings [30]. Kavgic et al. presented and compared five bottom-up building stock models for energy consumption [31]. They discussed the challenges in data availability and the lack of transparency of these models. Theodoridou et al. analyzed a large sample of Greek residential buildings with respect to energy consumption [32]. They adopted a correlation analysis approach to mine data from the large building stock. Several factors that influence energy consumption, such as year of construction, building typology, glazing type, etc., were analyzed. Fracastoro and Serraino presented a methodology for large scale building stocks through study of the statistical distribution of floor area according to primary energy for heating demand [33]. Mata et al. presented a modelling strategy for energy, carbon, and cost assessments of building stocks in Sweden [34]. The bottom-up building physics model is based on a one-dimensional building energy balance, which gives hourly net energy demand. The energy conservation measures were evaluated on both an individual and an aggregated basis. Ascione et al. presented a multi-stage and multi-objective optimization for cost-optimal energy retrofitting of a building [35]. The time taken to simulate the large number of retrofitting scenarios is reduced by combining analysis in MATLAB with EnergyPlus. In another study by the same authors, artificial neural network (ANN) models are combined with EnergyPlus [36]. The outcomes of the EnergyPlus simulations are taken as targets for training and testing the ANN model, thereby reducing the time taken for evaluating all possible retrofitting scenarios.

1.1. Research gap

In summary, the benchmarking approach is convenient if data for a substantial sample is available. It provides an easy comparison with buildings of similar characteristics. However, the comparison of buildings is usually based on just one index, for example, the energy use intensity (EUI). The energy use intensity is the energy used per unit floor area of a building and is measured in kWh/m².year. In most benchmarking studies, other variables are ignored, mostly due to unavailability of data. However, buildings with different sets of building systems and characteristics can still have the same EUI. For example, a building with a high cooling load and a very efficient chiller system can use the same amount of electrical energy as a building with a low cooling load and an inefficient chiller system. The benchmarking approach, however, will place these two buildings in the same category.

Therefore, a more comprehensive system of analysis that takes all the necessary building characteristics into account is required. Multivariate statistical techniques like clustering are alternatives that can perform high-dimensional benchmarking by considering other variables as well. Clustering has the ability to deal with multi-dimensional data and partition them based on set criteria. This contrasts with the single-variable approach taken by current building policies to benchmark buildings. A few research studies have also highlighted and supported the clustering approach [37,38].

1.2. Research objective

To achieve such a holistic clustering approach, this study utilizes 14 building variables that are co-analyzed with the change in building EUI for pre- and post-retrofit conditions. A robust iterative process to select the appropriate variables for clustering is developed for this purpose. This process is based on the all-subset approach, where each possible combination of the 14 variables is tested to identify the best possible combination. A more detailed explanation is given in the next section. This study also takes advantage of the availability of both pre- and post-retrofit EUI data, which is usually missing in previous studies. The availability of this data presents a direct outlook into the change in energy consumption of these buildings due to retrofitting.
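The research-gap example of two different buildings sharing one EUI can be made concrete with a small calculation. This is only an illustrative sketch: all numbers are hypothetical, the chiller plant efficiency is assumed to be expressed in kW/RT, and air-conditioning electricity is simplified to cooling load multiplied by plant efficiency.

```python
import math

def annual_energy_kwh(cooling_load_rth, chiller_kw_per_rt, non_ac_kwh):
    """Simplified annual electricity use: chiller plant energy plus all other loads."""
    return cooling_load_rth * chiller_kw_per_rt + non_ac_kwh

def eui(energy_kwh, gfa_m2):
    """Energy use intensity in kWh/m2.year."""
    return energy_kwh / gfa_m2

GFA_M2 = 10_000  # hypothetical gross floor area, same for both buildings

# Building A: high cooling load, efficient plant (0.6 kW/RT, assumed unit).
energy_a = annual_energy_kwh(2_000_000, 0.6, 800_000)
# Building B: low cooling load, inefficient plant (1.2 kW/RT, assumed unit).
energy_b = annual_energy_kwh(1_000_000, 1.2, 800_000)

eui_a = eui(energy_a, GFA_M2)
eui_b = eui(energy_b, GFA_M2)
print(round(eui_a, 1), round(eui_b, 1))  # both buildings land on the same EUI
```

A single-index benchmark would rank these two hypothetical buildings identically, although only building B has an obvious chiller-plant retrofit opportunity.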
Fig. 2. GFA and total energy consumption for the buildings in increasing order of GFA (first 28 buildings).
Fig. 3. GFA and total energy consumption for the buildings in increasing order of GFA (next 28 buildings).
Fig. 4. Range of EUI for pre-retrofit and post-retrofit conditions for the 56 office buildings that form the dataset.
The term k-means was first introduced by James MacQueen in 1967, and the algorithm is based on a measure of proximity [46]. Although there are many possibilities for measuring proximity, one approach is to note the nearness between each pair of elements to determine their closeness. Another is to observe the difference or distance between the pairs of elements, as the distance is complementary to closeness. Distance measures are the most commonly used measurement of similarity between objects. These distances can be calculated on both single and multiple dimensions using Euclidean distances. The most common calculation estimates the geometric distance in multidimensional space and is given by Eq. (1).

$$d_{AB}^{2} = \sum_{j=1}^{p} \left( x_{rj} - x_{sj} \right)^{2} \qquad (1)$$

Geometric distance between two elements for k-means clustering.

Here, $d_{AB}$ is the distance between element A and element B, $x_k = (x_{k1}, x_{k2}, \ldots, x_{kp})$ is the position of element $k$ in the space (with $k = r$ for element A and $k = s$ for element B), and $p$ is the space dimension. To derive true distance measures, Euclidean distances are usually calculated from the raw and not from standardized data. However, if variables are measured on different scales, variables with large values contribute more to the distance measure than variables with small values. For example, the GFA of buildings is measured on a much larger scale (values can be up to 100,000) than the chiller plant efficiency (usually values between 0.5 and 2). Hence, when these two variables are

information on the average distances between elements of clusters to the distance between the clusters. It is given as in Eq. (2).

$$DB = \frac{1}{k} \sum_{i=1}^{k} \max_{j,\, j \neq i} \left( \frac{S_i + S_j}{d_{ij}} \right) \qquad (2)$$
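The index in Eq. (2) can be sketched in a few lines of Python. The sketch follows the standard Davies-Bouldin definition [47] and makes two assumptions that the surviving text does not spell out: $S_i$ is taken as the mean Euclidean distance of the elements of cluster $i$ to their centroid, and $d_{ij}$ as the Euclidean distance between the centroids of clusters $i$ and $j$. The sample points are hypothetical.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(column) / len(points) for column in zip(*points))

def davies_bouldin(clusters):
    """Eq. (2): mean over clusters of the worst (S_i + S_j) / d_ij ratio."""
    centres = [centroid(c) for c in clusters]
    # S_i: average distance of the members of cluster i to its centroid (assumed).
    scatter = [
        sum(math.dist(p, centre) for p in members) / len(members)
        for members, centre in zip(clusters, centres)
    ]
    k = len(clusters)
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centres[i], centres[j])
            for j in range(k)
            if j != i
        )
        for i in range(k)
    ]
    return sum(worst_ratios) / k

# Two hypothetical partitions: same within-cluster scatter, different separation.
well_separated = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
poorly_separated = [[(0.0, 0.0), (0.0, 2.0)], [(4.0, 0.0), (4.0, 2.0)]]

db_far = davies_bouldin(well_separated)
db_near = davies_bouldin(poorly_separated)
print(db_far, db_near)
```

A lower value indicates compact clusters that lie far apart, so the well-separated partition scores lower than the poorly separated one.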
Table 1
List of variables extracted from the energy audit reports.
Pre-retrofit Post-retrofit
2.4. Variable selection for clustering

The selection of variables to perform clustering depends on the objective being targeted. For example, a set of buildings may be clustered based on the energy use intensity (EUI) to group buildings exhibiting similar EUI values. Introducing a second variable, gross floor area (GFA) in this case, will result in clusters of buildings with similar GFA as well as EUI. In this way, selecting a higher number of variables leads to groups of buildings with higher similarity than clustering with fewer variables. This is illustrated in Fig. 6, where clustering with one, two and three variables is performed. Often, these variables or attributes that define each element are known as features. The position of each element is described by a set of these features or variables.

The criteria for selecting appropriate variables influencing energy consumption in buildings depend primarily on two aspects. The first is the availability of measured data, which still poses a challenge for data-driven analysis of building energy efficiency [48]. The second is the correlation of each variable with energy consumption, which is usually assessed through multiple linear regression. Several studies have attempted the formulation of regression models relating the influential variables to energy consumption. In most of these studies, physical characteristics like GFA, age of building, window to wall ratio, thermal transmittance values of envelope elements, occupant number, operational hours, cooling/heating degree days, etc. are taken as the influencing variables [16,21,49]. The physical characteristics have a key role in determining the cooling/heating load of buildings. For example, a building in a hot tropical climate with a high window to wall ratio and high thermal transmittance of windows would have a higher cooling load. This would subsequently lead to higher energy consumption for the building, as the air-conditioning system needs to meet this higher demand in cooling. Hence, the data on physical characteristics indirectly aid in determining the cooling/heating load for buildings.

As this study is based on the analysis of detailed energy audit reports, the information on cooling load is readily available from the energy audit reports. For this reason, the detailed exploration of physical characteristics of buildings has been avoided and only the crucial variables influencing energy consumption have been considered. The list of 14 variables extracted from energy audit reports is presented in Table 1. As seen, the variables related to cooling load have been assumed to be the same in both pre- and post-retrofit conditions. The cooling load will only change if there is an improvement in the building envelope. Apart from that, if the occupancy and operation schedule of a building remain the same, the cooling load in pre- and post-retrofit conditions will be very similar. As mentioned earlier, most of the retrofits pertain to the chiller plant room. The detailed analysis of the correlation of these variables to the existing building EUI and of their energy consumption is presented in the following section, where the inter-dependencies of these variables are also discussed. To obtain an understanding of the mean values of these variables and their corresponding ranges, a boxplot diagram for these variables is presented in Fig. 7.

It is to be noted that standard score values have been used in the boxplot so that the range of all the variables can be viewed and compared in the same plot. The standard scores are computed using Eq. (3). The z-score represents the distance between the original value and the sample mean in units of standard deviation. The z-score is negative when the original value is below the mean, and it is positive when the original value is above the mean.

$$\text{Standard score or `z score'} = \frac{\text{Original value} - \text{Mean value}}{\text{Standard deviation}} \qquad (3)$$

Formula to obtain standard score or ‘z score’.
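The effect of Eq. (3) on a distance calculation can be illustrated with a short sketch. The three buildings below are hypothetical, chosen only to fall within the ranges quoted in the text for GFA and chiller plant efficiency, and the sample standard deviation (n - 1 denominator) is an assumption, since the source does not state which form is used.

```python
from statistics import mean, stdev

def z_scores(values):
    """Eq. (3): (original value - mean value) / standard deviation."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation; an assumption
    return [(v - centre) / spread for v in values]

def squared_distance(a, b):
    """Squared Euclidean distance between two elements."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical values for three buildings.
gfa = [20_000.0, 50_000.0, 100_000.0]   # gross floor area
efficiency = [0.6, 1.0, 1.8]            # chiller plant efficiency

raw = list(zip(gfa, efficiency))
scaled = list(zip(z_scores(gfa), z_scores(efficiency)))

# Share of the squared distance between buildings 0 and 2 that is due to GFA.
gfa_share_raw = (raw[0][0] - raw[2][0]) ** 2 / squared_distance(raw[0], raw[2])
gfa_share_scaled = (scaled[0][0] - scaled[2][0]) ** 2 / squared_distance(scaled[0], scaled[2])

print(f"GFA share of squared distance, raw data:    {gfa_share_raw:.6f}")
print(f"GFA share of squared distance, z-scored:    {gfa_share_scaled:.6f}")
```

On the raw values GFA accounts for essentially the whole distance, while after standardization both variables contribute on a comparable footing.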
Fig. 8. Methodology to determine the appropriate set of variables for k-means clustering − an example by taking 2 variables.
ing results may vary with the initial centroid locations. To overcome this, the clustering is repeated 100 times for each set of combinations, giving rise to a total of 1,638,300 clustering iterations. The iteration runs are limited to 100 due to time constraints, as it takes a long time to perform clustering with a high number of variables. For each clustering iteration, the values of change in EUI (%EUI) corresponding to the buildings in every cluster are recorded. These are then used to calculate the mean %EUI for that cluster. For each iteration, the top 10 results corresponding to the maximum difference in %EUI between clusters are recorded. At the end of 100 iterations for each set of combinations, there are a total of 1000 best results, arising from taking the top 10 results for each iteration. Finally, the 1000 best results are plotted in a bar plot with respect to two outputs: first, to identify the combination number that has the highest occurrence number and second, to identify the combination number that corresponds to the highest values for %EUI. This combination number gives the set of variables that corresponds to the highest %EUI values. This part of the study is performed in MATLAB version R2015a.

The stepwise methodology is illustrated in Fig. 8. The complete clustering process for this methodology can be summarized in the following steps. The objective is to identify the best set of variables
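The iteration scheme described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB code: the k-means routine is a plain Lloyd's algorithm with random initial centroids, the "difference in %EUI between clusters" is assumed to be the largest minus the smallest cluster mean, and the demonstration data are random and hypothetical. All function names are illustrative.

```python
import itertools
import math
import random
from collections import defaultdict

N_VARIABLES, N_REPEATS = 14, 100
# All non-empty subsets of 14 variables, each clustered 100 times.
total_subsets = sum(math.comb(N_VARIABLES, r) for r in range(1, N_VARIABLES + 1))
total_runs = total_subsets * N_REPEATS

def kmeans(points, k, rng, max_iter=50):
    """Plain Lloyd's k-means with random initial centroids; returns labels."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def eui_gap(labels, eui_change, k):
    """Assumed score: largest minus smallest cluster mean of the change in EUI."""
    means = []
    for c in range(k):
        values = [v for v, lab in zip(eui_change, labels) if lab == c]
        if values:
            means.append(sum(values) / len(values))
    return max(means) - min(means)

def rank_subsets(data, eui_change, size, k=2, repeats=100, top=10, seed=0):
    """Repeat clustering for every subset of `size` variables; keep the top results."""
    rng = random.Random(seed)
    combos = list(itertools.combinations(range(len(data[0])), size))
    occurrences = defaultdict(int)   # how often a combination number is in the top list
    gap_sum = defaultdict(float)     # cumulative score of a combination number
    for _ in range(repeats):
        scored = []
        for number, combo in enumerate(combos, start=1):
            points = [tuple(row[j] for j in combo) for row in data]
            scored.append((eui_gap(kmeans(points, k, rng), eui_change, k), number))
        for gap, number in sorted(scored, reverse=True)[:top]:
            occurrences[number] += 1
            gap_sum[number] += gap
    return occurrences, gap_sum

# Hypothetical demo: 8 buildings, 4 already-standardized variables, random EUI changes.
demo_rng = random.Random(1)
data = [tuple(demo_rng.gauss(0, 1) for _ in range(4)) for _ in range(8)]
eui_change = [demo_rng.uniform(5, 40) for _ in range(8)]
occurrences, gap_sum = rank_subsets(data, eui_change, size=2, repeats=20, top=3)
best = max(gap_sum, key=gap_sum.get)
print(total_subsets, total_runs, best)
```

The two dictionaries correspond to the two outputs examined in the text: the number of occurrences of each combination number among the recorded best results, and its cumulative difference in mean %EUI.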
Table 3
The variables taken for clustering and the associated number with each variable.
Sr. No.    Variable    Associated number

Table 4
Clustering results for the three clusters based on percentage change in EUI.
Clustering %EUI    No. of elements    Mean (%)    Range
Fig. 10. The three clusters based on the percentage change in EUI between pre- and post-retrofit condition.
performing the clustering, each combination of the two variables is listed and the clustering is performed for all these 91 combinations. This entire process is repeated for a hundred iterations. At each clustering iteration, the top 10 combinations (out of 91) that result in the maximum difference in mean values of %EUI are recorded (Fig. 12). In this way, at the end of 100 iterations, there are a total of 1000 best combinations recorded. At this point, two outputs are examined. First, the combinations of variables that occur the highest number of times in these 1000 combinations are identified and second, the sum of differences in mean values of %EUI for these unique combinations is also identified. The analysis of these two outputs helps identify the best set of variables that should be selected for clustering. For example, the result for clustering buildings taking 2 variables and plotting the corresponding mean values of %EUI is shown in Fig. 12. It shows that the combination numbers 17, 47, 60, 67, 16, 21, 22, 23, 82 and 9 exhibit the best values for difference in mean values of %EUI. Combination number 17 corresponds to the selection of variable numbers 2 and 6, which are air-conditioning energy consumption and operation hours respectively. Similarly, combination number 47 corresponds to the selection of variable numbers 5 and 6, which are percentage of air-conditioning energy and operation hours. The list of numbers associated with the variables is presented in Table 3. The cumulative effect of such analysis for a hundred iterations is presented in Figs. 13 and 14.

As shown in these results, the cumulative sum of differences in mean values of %EUI shows that the variables corresponding to combination number 17 produce the best result. These variables are air-conditioning energy consumption and operation hours of the building. The total sum of differences in mean values of %EUI is 456. The combinations of variables that occur the highest number of times are also noted. The combination number for variables with the highest occurrences is also 17. However, it is interesting to note that the second highest occurrence is for combination number 22. This corresponds to variable numbers 2 and 11. These two variables correspond to air-conditioning energy consumption and total installed capacity of chillers (in units of refrigeration tons, RT). This is different from the combination with the second highest sum of differences in mean values of %EUI. For that case, the combination number is 47, corresponding to variable numbers 5 and 6. These are the percentage of air-conditioning energy and the operation hours of the building.

It is to be noted that these results are based on clustering with a ‘k’ value of 2. With a ‘k’ value of 3, the clustering results exhibit the highest sum of differences in mean values of %EUI to be 453 for combination number 47. This combination number also corresponds to the highest number of occurrences.

This analysis is done for all possible combinations of the variables. However, the results in the previous three plots are presented for combinations taking only 2 variables. For combinations generated by using a higher number of variables, the results for clustering into 2 and 3 clusters are shown in Tables 5 and 6 respectively. They show that the best result is obtained by using 4 variables for combination number 114. This corresponds to the 4 variables of GFA, non-air-conditioning energy consumption, average chiller plant efficiency and installed capacity of chillers. The cumulative
Fig. 11. Clusters of various variables based on change in EUI clusters for: a). GFA. b). Pre-retrofit air-conditioning energy consumption. c). Pre-retrofit non-air-conditioning energy consumption. d). Pre-retrofit total energy consumption. e). Percentage of air-conditioning energy. f). Operating hours. g). Average cooling load per day. h). Maximum cooling load per day. i). Minimum cooling load per day. j). Average chiller plant efficiency. k). Maximum chiller plant efficiency. l). Minimum chiller plant efficiency. m). Total installed capacity of chillers. n). Difference in chilled water main (supply and return temperature).
Table 5
Combination number and sum of differences in mean values of change in EUI from the 10 best results for 100 iterations (dividing the data into 2 clusters).

Number of            Number of possible    Combination number    Sum of differences in
selected variables   combinations          for best result       mean values of %EUI
1                    14                    6                     586.3
2                    91                    17                    456
3                    364                   47                    453
4                    1001                  114                   632
5                    2002                  1834                  620.8
6                    3003                  2853                  508
7                    3432                  2938                  387.4
8                    3003                  2985                  472
9                    2002                  1054                  340.8
10                   1001                  10                    317
11                   364                   10                    317
12                   91                    10                    317
13                   14                    10                    317
14                   1                     1                     314.5
Total                16383

Table 6
Combination number and sum of differences in mean values of change in EUI from the 10 best results for 100 iterations (dividing the data into 3 clusters).

Number of            Number of possible    Combination number    Sum of differences in
selected variables   combinations          for best result       mean values of %EUI
1                    14                    5                     496.78
2                    91                    47                    540.57
3                    364                   22                    334.61
4                    1001                  209                   191.88
5                    2002                  232                   227.37
6                    3003                  2092                  134.97
7                    3432                  886                   123.95
8                    3003                  45                    173.86
9                    2002                  26                    290.37
10                   1001                  21                    302.49
11                   364                   6                     315.9
12                   91                    38                    302.59
13                   14                    3                     444.54
14                   1                     1                     6.52
Total                16383
Fig. 12. Plot between sum of difference in mean values of change in EUI and combination of variables. Such iterations are repeated 100 times. In this case, the combination
number of 17, 47, 60, 67, 16, 21, 22, 23, 82 and 9 exhibit top 10 results.
Fig. 13. Cumulative sum of differences in change in EUI for 100 iterations.
sum of differences for change in EUI for buildings in these 2 clusters is 632. This cumulative value is the result of a hundred iterations. Therefore, on average, the difference in mean values of change in EUI for these 2 clusters is 6.32. The boxplot of the %EUI values for these two clusters is presented in Fig. 15. The next best result is seen when 5 variables are taken for clustering. These 5 variables are percentage of air-conditioning energy, average cooling load per day, average chiller plant efficiency, maximum chiller plant efficiency and difference in chilled water supply and return temperature. The combination number for the best result for clustering with 5 variables is 1834. The time taken to perform the clustering for both 2 and 3 clusters is shown in Table 7. It is seen that the time taken for the clustering process varies and is directly proportional to the number of possible combinations. It is to be noted that this analysis is not based on real-time data, and the time spent on analysis is mentioned only to show that complex clustering with a higher number of variables indeed takes a longer time.
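The combination numbers quoted in the results can be mapped back to variable numbers with a short sketch. It assumes that combinations are numbered in lexicographic order, starting from 1; this is an assumption, although it reproduces the pairs reported in the text for combination numbers 17, 22 and 47. The helper name `variables_for` is illustrative.

```python
import math
from itertools import combinations, islice

def variables_for(number, size, n_variables=14):
    """Return the 1-based variable numbers of the `number`-th subset (1-based),
    assuming subsets of `size` variables are listed in lexicographic order."""
    subsets = combinations(range(1, n_variables + 1), size)
    return next(islice(subsets, number - 1, None))

pairs = {n: variables_for(n, 2) for n in (17, 22, 47)}
four = variables_for(114, 4)
five = variables_for(1834, 5)

# Average per-iteration difference: cumulative sum over 100 iterations / 100.
mean_gap = 632 / 100

print(pairs)     # {17: (2, 6), 22: (2, 11), 47: (5, 6)}
print(four)      # (1, 3, 10, 13)
print(five)      # (5, 7, 10, 11, 14)
print(mean_gap)  # 6.32
```

If the variables are numbered in the panel order of Fig. 11 (a proxy used here because the body of Table 3 is not reproduced), the tuples for combination numbers 114 and 1834 line up with the four and five variables named in the text.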
Fig. 14. Plot showing the number of occurrences for each combination of 2 variables for 100 iterations.
Fig. 15. Boxplot for%EUI for the 2 clusters obtained by taking the best combination of 4 variables.
4. Conclusions

This study presents an analysis of pre- and post-retrofit energy consumption data for 56 buildings. There are 14 building variables that are analyzed. The prime objective is to identify the best set of variables that can be used to cluster buildings to obtain an insight into their energy saving potential. The existing procedures to select variables for clustering are based either on intuitive human experience or on the development of regression models. In this study, by contrast, the methodology to select variables for clustering is based on a robust iterative process. For this, all possible combinations of the 14 available variables are explored. This amounts to a total of 16,383 combinations. The clustering is performed for all these combinations with a stopping criterion for each iteration using k-means clustering. It will be interesting to see how other clustering approaches like k-medoids and hierarchical clustering perform in this analysis. However, this is beyond the scope of this study. To accommodate the random selection of cluster centroids for k-means clustering, the clustering is repeated for a hundred iterations. This analysis provides building variables that should be used for clustering and can be applied to the study of existing buildings needing retrofitting. This methodology enhances the current approaches for clustering buildings with respect to building variables and energy performance. The following major conclusions are drawn from this study.
[29] C.A. Balaras, A.G. Gaglia, E. Georgopoulou, S. Mirasgedis, Y. Sarafidis, D.P. Lalas, European residential buildings and empirical assessment of the Hellenic building stock, energy consumption, emissions and potential energy savings, Build. Environ. 42 (2007) 1298–1314, http://dx.doi.org/10.1016/j.buildenv.2005.11.001.
[30] M. Caldera, S.P. Corgnati, M. Filippi, Energy demand for space heating through a statistical approach: application to residential buildings, Energy Build. 40 (2008) 1972–1983, http://dx.doi.org/10.1016/j.enbuild.2008.05.005.
[31] M. Kavgic, A. Mavrogianni, D. Mumovic, A. Summerfield, Z. Stevanovic, M. Djurovic-Petrovic, A review of bottom-up building stock models for energy consumption in the residential sector, Build. Environ. 45 (2010) 1683–1697, http://dx.doi.org/10.1016/j.buildenv.2010.01.021.
[32] I. Theodoridou, A.M. Papadopoulos, M. Hegger, Statistical analysis of the Greek residential building stock, Energy Build. 43 (2011) 2422–2428, http://dx.doi.org/10.1016/j.enbuild.2011.05.034.
[33] G.V. Fracastoro, M. Serraino, A methodology for assessing the energy performance of large scale building stocks and possible applications, Energy Build. 43 (2011) 844–852, http://dx.doi.org/10.1016/j.enbuild.2010.12.004.
[34] É. Mata, A.S. Kalagasidis, F. Johnsson, A modelling strategy for energy, carbon, and cost assessments of building stocks, Energy Build. 56 (2013) 100–108, http://dx.doi.org/10.1016/j.enbuild.2012.09.037.
[35] F. Ascione, N. Bianco, C. De Stasio, G.M. Mauro, G.P. Vanoli, Multi-stage and multi-objective optimization for energy retrofitting a developed hospital reference building: a new approach to assess cost-optimality, Appl. Energy 174 (2016) 37–68, http://dx.doi.org/10.1016/j.apenergy.2016.04.078.
[36] F. Ascione, N. Bianco, C. De Stasio, G.M. Mauro, G.P. Vanoli, Artificial neural networks to predict energy performance and retrofit scenarios for any member of a building category: a novel approach, Energy 118 (2017) 999–1017, http://dx.doi.org/10.1016/j.energy.2016.10.126.
[37] E. Wang, Benchmarking whole-building energy performance with multi-criteria technique for order preference by similarity to ideal solution using a selective objective-weighting approach, Appl. Energy 146 (2015) 92–103, http://dx.doi.org/10.1016/j.apenergy.2015.02.048.
[38] W.-S. Lee, L.-C. Lin, Evaluating and ranking the energy performance of office building using technique for order preference by similarity to ideal solution, Appl. Therm. Eng. 31 (2011) 3521–3525, http://dx.doi.org/10.1016/j.applthermaleng.2011.07.005.
[39] C. Deb, L.S. Eang, J. Yang, M. Santamouris, Forecasting diurnal cooling energy load for institutional buildings using Artificial Neural Networks, Energy Build. (2015), http://dx.doi.org/10.1016/j.enbuild.2015.12.050.
[40] C. Deb, L.S. Eang, J. Yang, M. Santamouris, Forecasting energy consumption of institutional buildings in Singapore, Procedia Eng. 121 (2015) 1734–1740, http://dx.doi.org/10.1016/j.proeng.2015.09.144.
[41] C. Deb, Development of an Automated Energy Audit Protocol for Office Buildings, National University of Singapore, 2017, http://scholarbank.nus.edu.sg/handle/10635/136280.
[42] S.E. Lee, P. Rajagopalan, Building energy efficiency labeling programme in Singapore, Energy Policy 36 (2008) 3982–3992, http://dx.doi.org/10.1016/j.enpol.2008.07.014.
[43] J.F. Hair, R.E. Anderson, R.L. Tatham, Multivariate Data Analysis with Readings, Macmillan, 1987.
[44] J.D. Jobson, Applied Multivariate Data Analysis, Springer New York, New York, NY, 1992, http://dx.doi.org/10.1007/978-1-4612-0921-8.
[45] T. Warren Liao, Clustering of time series data - a survey, Pattern Recognit. 38 (2005) 1857–1874, http://dx.doi.org/10.1016/j.patcog.2005.01.025.
[46] J. MacQueen, Some methods for classification and analysis of multivariate observations, Proc. Fifth Berkeley Symp. Math. Stat. Prob., Vol. 1, University of California Press, Berkeley, 1967, pp. 281–297, http://projecteuclid.org/euclid.bsmsp/1200512992. (Accessed 6 November 2016).
[47] D.L. Davies, D.W. Bouldin, A cluster separation measure, IEEE Trans. Pattern Anal. Mach. Intell. PAMI-1 (1979) 224–227, http://dx.doi.org/10.1109/TPAMI.1979.4766909.
[48] T. Hong, L. Yang, D. Hill, W. Feng, Data and analytics to inform energy retrofit of high performance buildings, Appl. Energy 126 (2014) 90–106, http://dx.doi.org/10.1016/j.apenergy.2014.03.052.
[49] W. Chung, Y.V. Hui, Y.M. Lam, Benchmarking the energy efficiency of commercial buildings, Appl. Energy 83 (2006) 1–14, http://dx.doi.org/10.1016/j.apenergy.2004.11.003.