**Chapter 4. Virtual Control Group to Improve Adjusted Statistical Methods**

**4.2 Experiment**

**4.2.1 Experimental Settings for Virtual Control Group Adjustment**

Table 4.1. Definition of two types of virtual control group adjustment.

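| Adjustment method | Virtual control group adjusted estimate |
|---|---|
| Additive type | $\hat{y}^{v}_{c_i,d,m} = \hat{y}_{c_i,d,m} + \frac{1}{\lvert G_{NP} \rvert} \sum_{i' \in G_{NP}} \left( y_{c_{i'},d,m} - \hat{y}_{c_{i'},d,m} \right)$ |
| Ratio type | $\hat{y}^{v}_{c_i,d,m} = \hat{y}_{c_i,d,m} \cdot \dfrac{\sum_{i' \in G_{NP}} y_{c_{i'},d,m}}{\sum_{i' \in G_{NP}} \hat{y}_{c_{i'},d,m}}$ |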

an additive way. In this case, the ratio type can be the better choice. A similar problem also appears for customers with very large usage. The ratio type is therefore expected to be more effective when such abnormal customers are considered. In this study, we compare both types but use the ratio type as the default option for the virtual control group adjusted statistical method.
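The two adjustment types of Table 4.1 can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function names and all load values are hypothetical, and the inputs are assumed to be the baseline estimate of one customer plus the actual and baseline loads of the non-participating group for the same day and time slot.

```python
def additive_adjustment(baseline_i, actual_np, baseline_np):
    """Additive type: shift the baseline by the mean residual of the non-participants."""
    residuals = [y - y_hat for y, y_hat in zip(actual_np, baseline_np)]
    return baseline_i + sum(residuals) / len(residuals)


def ratio_adjustment(baseline_i, actual_np, baseline_np):
    """Ratio type: scale the baseline by total actual load over total baseline load."""
    return baseline_i * sum(actual_np) / sum(baseline_np)


# Hypothetical non-participating group (loads in Wh): actual load is 10% above baseline.
actual_np = [550.0, 1100.0, 330.0]
baseline_np = [500.0, 1000.0, 300.0]

# Hypothetical low-usage customer with a 50 Wh baseline.
low_usage_baseline = 50.0
additive_estimate = additive_adjustment(low_usage_baseline, actual_np, baseline_np)
ratio_estimate = ratio_adjustment(low_usage_baseline, actual_np, baseline_np)
print(additive_estimate, ratio_estimate)
```

In this made-up case the additive type adds the group's mean residual of 60 Wh to a 50 Wh baseline, more than doubling it, whereas the ratio type scales the baseline by the same 10% observed in the group. This is the kind of behavior that motivates preferring the ratio type for customers with unusually small or large usage.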

end at 20:00. Although the actual event times of our pilot program were not used for the non-event day study, the results were confirmed to be consistent when the analysis was repeated with the event time and duration of the real DR events (i.e., the time is chosen as one of 2 pm, 3 pm, 7 pm, 8 pm, and 9 pm for each day, and each DR event lasts for 1 hour). Evaluating the control group approach and the virtual control group adjusted statistical method for a DR event requires a group of customers who do not join the DR event. Only with such a non-participating group can the performance of the two methods be calculated. In our study, we randomly chose 60% of the customers and assumed them to be the non-participating group. The other 40% were assumed to participate in the DR event.
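The random 60/40 assignment described above can be sketched as follows. The function name, the customer identifiers, the population size, and the fixed seed are assumptions made for illustration; the text does not state how the random draw was implemented.

```python
import random


def split_customers(customer_ids, non_participant_share=0.6, seed=0):
    """Randomly assign customers to a non-participating and a participating group."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    shuffled = list(customer_ids)
    rng.shuffle(shuffled)
    n_non_participants = round(len(shuffled) * non_participant_share)
    return shuffled[:n_non_participants], shuffled[n_non_participants:]


# Hypothetical customer identifiers.
customers = [f"customer_{i}" for i in range(1000)]
non_participants, participants = split_customers(customers)
print(len(non_participants), len(participants))
```

Because the split is purely random, the two groups should have similar characteristics on average, which is the property the control group approach relies on.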

This setting is reasonable for evaluating the control group approach, where the control group needs to be large enough and the treatment and control groups should have similar characteristics. The setting, however, can be inadequate for evaluating the virtual control group adjusted statistical method. In real DR events the virtual control group can have a selection bias, so simply choosing the non-participating group at random can be too optimistic. This issue is addressed in Subsection 4.2.3, where we analyze the effect of a virtual control group’s size and selection bias.

**4.2.2 Result and Analysis of Performance Comparison**

The average ME performances evaluated over 213 non-event days of 2017 are shown in Table 4.2. The statistical method performs worse than both the control group approach and the virtual control group adjusted method, most likely because it cannot handle external factors. The adjusted models did not perform well either.

The pre-hour adjusted statistical method showed a meaningful improvement over the statistical method, but the resulting performance was still much worse than that of the control group-based methods. Both the control group approach and the virtual control group adjusted method showed outstanding performance for ME. Between the two, the virtual control group adjusted method reduces the error by 26.58% compared to the control group approach (3.01 vs. 2.21), making it the best performing model for impact estimation. Compared to the statistical method, the virtual control group adjusted method reduces the error by 88.30% (18.89 vs. 2.21). Among the 10 baseline functions, High5of5 showed the best performance for the statistical method, the pre-hour adjusted method (additive), and the weather adjusted method. ExpMovAvg showed the best performance for the pre-hour adjusted method (ratio) and the virtual control group adjusted method. For ME, there was no meaningful difference between the additive type and the ratio type. For individual customers, one of the two might outperform the other, but the overall impact on the population was not noticeable.

The average MAE performances evaluated over the 213 non-event days of 2017 are shown in Table 4.3. For MAE, the control group approach performs the worst because it fails to address the different usage levels of the customers. Some customers use a large amount of electricity and others use a small amount, yet the control group approach assigns the same predicted load to all of them. All the other forecasting models adopt a baseline function that utilizes the historical usage level of each residential customer, and thus show better performance. For MAE, the virtual control group adjusted method still achieves the best performance. A surprising result is that the statistical method works well and shows performance comparable to the virtual control group adjusted method. DID is not effective because residential customers often change their daily usage pattern regardless of DR events. Such random changes in each customer's usage pattern cannot be handled even with the DID method, which results in comparable performance between the statistical method and the virtual control group adjusted method. To be precise, however, the virtual control group adjusted method showed an improvement of 3.06% over the statistical method (98.60 vs. 101.71). In addition, it should be noted that the ratio type worked better for the virtual control group adjusted method. Between the two adjusted methods, this time the weather adjusted method showed better performance.
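The relative improvements quoted in the text can be checked with a short calculation. The helper name `relative_reduction` is hypothetical; the error values are the ones given in parentheses in the text.

```python
def relative_reduction(reference_error, new_error):
    """Percentage by which new_error is smaller than reference_error."""
    return 100.0 * (reference_error - new_error) / reference_error


# (reference error, new error) pairs quoted in the text, in Wh.
comparisons = {
    "ME, control group vs. virtual control group adjusted": (3.01, 2.21),
    "ME, statistical vs. virtual control group adjusted": (18.89, 2.21),
    "MAE, statistical vs. virtual control group adjusted": (101.71, 98.60),
}

for label, (reference, new) in comparisons.items():
    print(f"{label}: {relative_reduction(reference, new):.2f}%")
```

The three results reproduce the 26.58%, 88.30%, and 3.06% figures stated above.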

Table 4.2. ME of unadjusted/adjusted statistical methods and virtual control group adjusted statistical method evaluated for non-event days. For the control group, the baseline functions are not applicable and the calculated ME value is pasted in both mean and best rows for an easier comparison. The unit of the error values is Wh.

| Model | Statistical method | Pre-hour adjusted (additive) | Pre-hour adjusted (ratio) | Weather adjusted (additive) | Weather adjusted (ratio) | Control group | Virtual control group adjusted (additive) | Virtual control group adjusted (ratio) |
|---|---|---|---|---|---|---|---|---|
| High10of10 | 21.1 | 13.6 | 16.2 | 22.4 | 22.4 | - | 2.2 | 2.2 |
| High5of5 | 18.9 | 12.3 | 16.7 | 19.9 | 20.0 | - | 2.3 | 2.3 |
| High4of5 | 31.0 | 13.9 | 23.3 | 30.5 | 30.6 | - | 2.3 | 2.3 |
| High5of10 | 70.5 | 22.5 | 26.9 | 68.8 | 68.3 | - | 2.4 | 2.3 |
| Low4of5 | 31.1 | 14.2 | 15.3 | 31.4 | 31.1 | - | 2.4 | 2.4 |
| Low5of10 | 67.8 | 24.8 | 15.4 | 68.6 | 68.4 | - | 2.4 | 2.4 |
| Mid4of6 | 19.4 | 12.3 | 19.8 | 20.5 | 20.5 | - | 2.4 | 2.4 |
| Mid8of10 | 20.7 | 13.1 | 16.9 | 22.1 | 22.2 | - | 2.2 | 2.3 |
| Regression | 27.6 | 13.5 | 39.1 | 28.1 | 28.4 | - | 7.8 | 5.7 |
| ExpMovAvg | 26.1 | 12.3 | 13.2 | 27.8 | 27.8 | - | 2.2 | 2.2 |
| Mean absolute ME | 33.4 | 15.3 | 20.3 | 34.0 | 34.0 | 3.0 | 2.9 | 2.7 |
| Best absolute ME | 18.9 | 12.3 | 13.2 | 19.9 | 20.0 | 3.0 | 2.2 | 2.2 |

The different results for MAE as compared to ME are owing to the nature of MAE, where the customer level of accuracy is analyzed instead of the aggregated level of accuracy.
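The contrast between the aggregated and the customer level can be illustrated with a small sketch. It assumes that ME is the mean of the signed errors over customers and that MAE is the mean of the absolute errors; these definitions, the function names, and all load values are assumptions made for illustration. The control group approach is modeled here as assigning the mean control group load to every treatment customer.

```python
from statistics import mean


def mean_error(predicted, actual):
    """Signed errors can cancel across customers (aggregated view)."""
    return mean(p - a for p, a in zip(predicted, actual))


def mean_absolute_error(predicted, actual):
    """Absolute errors cannot cancel (customer-level view)."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


# Hypothetical loads in Wh for customers with very different usage levels.
control_loads = [100.0, 400.0, 1000.0]
treatment_loads = [120.0, 380.0, 1000.0]

# Same predicted load for every treatment customer.
predicted = [mean(control_loads)] * len(treatment_loads)

me = mean_error(predicted, treatment_loads)
mae = mean_absolute_error(predicted, treatment_loads)
print(me, mae)
```

In this made-up example the signed errors cancel to an ME of zero while the MAE exceeds 300 Wh, which mirrors the pattern in which the control group approach does well on ME but worst on MAE.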

Table 4.3. MAE of unadjusted/adjusted statistical methods and virtual control group adjusted statistical method evaluated for non-event days. For the control group, the baseline functions are not applicable and the calculated MAE value is pasted in both mean and best rows for an easier comparison. The unit of the error values is Wh.

| Model | Statistical method | Pre-hour adjusted (additive) | Pre-hour adjusted (ratio) | Weather adjusted (additive) | Weather adjusted (ratio) | Control group | Virtual control group adjusted (additive) | Virtual control group adjusted (ratio) |
|---|---|---|---|---|---|---|---|---|
| High10of10 | 102.7 | 129.8 | 143.9 | 103.5 | 103.4 | - | 101.7 | 100.1 |
| High5of5 | 104.3 | 134.8 | 150.1 | 105.1 | 105.1 | - | 103.6 | 102.2 |
| High4of5 | 110.3 | 140.5 | 155.9 | 110.9 | 111.1 | - | 106.9 | 103.6 |
| High5of10 | 127.4 | 146.2 | 157.6 | 127.4 | 127.4 | - | 113.4 | 105.5 |
| Low4of5 | 103.0 | 134.0 | 151.5 | 104.0 | 104.1 | - | 103.7 | 103.3 |
| Low5of10 | 109.7 | 132.1 | 148.8 | 111.2 | 110.9 | - | 106.9 | 106.4 |
| Mid4of6 | 103.7 | 136.1 | 153.8 | 104.6 | 104.7 | - | 103.6 | 102.3 |
| Mid8of10 | 101.7 | 130.1 | 145.6 | 102.6 | 102.6 | - | 101.3 | 99.9 |
| Regression | 116.2 | 152.9 | 184.7 | 116.8 | 116.9 | - | 117.2 | 113.7 |
| ExpMovAvg | 102.3 | 126.3 | 140.2 | 103.0 | 102.9 | - | 100.7 | 98.6 |
| Mean MAE | 108.1 | 136.3 | 153.2 | 108.9 | 108.9 | 173.7 | 105.9 | 103.6 |
| Best MAE | 101.7 | 126.3 | 140.2 | 102.6 | 102.6 | 173.7 | 100.7 | 98.6 |

Table 4.4. MAE, MAPE, SMAPE, RMSE of unadjusted/adjusted statistical methods and virtual control group adjusted statistical method evaluated for non-event days.

| Metric | Statistical method | Pre-hour adjusted (additive) | Pre-hour adjusted (ratio) | Weather adjusted (additive) | Weather adjusted (ratio) | Control group | Virtual control group adjusted (additive) | Virtual control group adjusted (ratio) |
|---|---|---|---|---|---|---|---|---|
| Best MAE | 101.7 | 126.3 | 140.2 | 102.6 | 102.6 | 173.7 | 100.7 | 98.6 |
| Best MAPE | 32.8 | 43.6 | 45.2 | 33.4 | 33.2 | 88.5 | 41.5 | 39.6 |
| Best SMAPE | 32.2 | 45.3 | 36.5 | 32.6 | 32.5 | 50.4 | 34.3 | 32.0 |
| Best RMSE | 225.2 | 256.5 | 293.8 | 225.5 | 226.0 | 307.5 | 221.0 | 219.3 |

Besides MAE, we also evaluated MAPE, SMAPE, and RMSE, which are commonly used in lieu of MAE. The results are shown in Table 4.4, and the overall performance for each metric is shown in Appendix B. The RMSE results were similar to those of MAE, and the virtual control group adjusted method still outperformed all the others. For MAPE, however, the statistical method performed better than the virtual control group adjusted method. This indicates that there are customers whose load is better estimated when the virtual control group is not used for their load prediction, and that they are most likely low-load customers whose contribution to MAE and RMSE is relatively small. With the normalization in the MAPE calculation, every customer contributes equally to the error metric, whether low-load or high-load. As an example, there can be a low-load customer whose load stays almost constant regardless of external factors such as weather and temperature changes. Using the virtual control group to eliminate the external factors can be harmful for such a customer, and the customer's error is weighted equally in the MAPE calculation even though the customer is low-load and thus less influential to the overall reduction. For SMAPE, the virtual control group adjusted method performed slightly better than the statistical method. Ideally, the gain from using the virtual control group would be very large for most customers, so that the virtual control group adjusted method would remain the best forecasting algorithm for any reasonable choice of metric.

This is not the case for the pilot dataset that we have evaluated. Nonetheless, the virtual control group adjusted method is certainly helpful for a large portion of the individual customers, and there are ways to improve MAPE performance. This issue is addressed later in the discussion section.
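The different weighting of low-load customers across the metrics can be made concrete with a small sketch. The definitions below are common textbook forms of MAE, RMSE, MAPE, and SMAPE and are assumed here for illustration; the exact definitions used in the study are not shown in this section, and all load values are hypothetical.

```python
from math import sqrt
from statistics import mean


def mae(predicted, actual):
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def rmse(predicted, actual):
    return sqrt(mean((p - a) ** 2 for p, a in zip(predicted, actual)))


def mape(predicted, actual):
    """Mean absolute percentage error; each error is normalized by the actual load."""
    return 100.0 * mean(abs(p - a) / abs(a) for p, a in zip(predicted, actual))


def smape(predicted, actual):
    """Symmetric MAPE; each error is normalized by the mean of actual and predicted load."""
    return 100.0 * mean(
        abs(p - a) / ((abs(a) + abs(p)) / 2.0) for p, a in zip(predicted, actual)
    )


# Hypothetical loads in Wh: one low-load customer and one high-load customer.
actual = [20.0, 2000.0]
predicted = [30.0, 1900.0]

print(f"MAE   = {mae(predicted, actual):.2f} Wh")
print(f"RMSE  = {rmse(predicted, actual):.2f} Wh")
print(f"MAPE  = {mape(predicted, actual):.2f} %")
print(f"SMAPE = {smape(predicted, actual):.2f} %")
```

In this made-up case the low-load customer contributes 10 Wh of the 110 Wh total absolute error, yet 50 of the 55 percentage points summed in MAPE. A small absolute error on a low-load customer can therefore dominate MAPE while barely affecting MAE or RMSE.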

The daily performances of the statistical method, the control group approach, and the virtual control group adjusted method are shown in Figure 4.2. While Table 4.2 and Table 4.3 show the average performance over a year, Figure 4.2 shows the individual daily error curves over the year. In the ME plot (a), the statistical method’s error fluctuates wildly in the summer. Korea’s summer in 2017 was hotter than usual, and the statistical method could not handle the temperature fluctuation and the resulting fluctuation in electricity usage. The control group approach and the virtual control group adjusted method, however, were not affected by the fluctuations because of their inherent ability to cancel out external factors (only the virtual control group adjusted method is shown because the control group approach closely overlaps with it). Note that the weather adjusted method showed poor performance in Table 4.2. The particular weather adjustment method we used was not effective, which indicates that the control group approach and the virtual control group adjusted method are easier and more robust to use because they work well without requiring tuning. As we will see in Subsection 4.2.3, the virtual control group adjusted method has an additional advantage: it is also insensitive to selection bias and can thus be used without paying much attention to which customers are included in the control group.

In the MAE plot (b), the statistical method and the virtual control group adjusted method both strongly outperform the control group approach (only the virtual control group adjusted method is shown because the statistical method closely overlaps with it). While the two methods had approximately the same performance, as can be confirmed in Table 4.3, there were time periods when the virtual control group adjusted method clearly outperformed the statistical method. To show the improvement, the difference between the statistical method’s MAE and the virtual control group adjusted method’s MAE is plotted in (c). At the end of summer, the virtual control group adjusted method outperformed the statistical method by up to 45-50 Wh. There was a sudden decrease in temperature at the end of summer, and this external factor was properly handled by the virtual control group adjusted method while the statistical method failed to take the change into account. Although the virtual control group adjusted method did not strongly outperform the statistical method in terms of average MAE, it is clearly much more robust to abrupt changes owing to its inherent design for handling external factors.

We identify the characteristics of the virtual control group of our residential DR program and discuss the effects of unadjusted/adjusted statistical methods on actual DR events, using the actual event day data, in Appendix C.