
Energy Modeling Isn’t Very Accurate

Before spending time or money on energy modeling, it’s important to know its limitations

Posted on Mar 30 2012 by Martin Holladay

Energy consultants and auditors use energy modeling software for a variety of purposes, including rating the performance of an existing house, calculating the effect of energy retrofit measures, estimating the energy use of a new home, and determining the size of new heating and cooling equipment. According to most experts, the time and expense spent on energy modeling is an excellent investment, because it leads to better decisions than those made by contractors who use rules of thumb.

Yet Michael Blasnik, an energy consultant in Boston, has a surprisingly different take on energy modeling. According to Blasnik, most modeling programs aren’t very accurate, especially for older buildings. Unfortunately, existing models usually aren’t revised or improved, even when utility bills from existing houses reveal systematic errors in the models.

Most energy models require too many inputs, many of which don’t improve the accuracy of the model, and energy modeling often takes up time that would be better spent on more worthwhile activities. Blasnik presented data to support these conclusions on March 8, 2012, at the NESEA (Northeast Sustainable Energy Association) Building Energy 12 conference in Boston.

Blasnik sees more data in a day than most raters do in a lifetime

Blasnik has worked as a consultant for utilities and energy-efficiency programs all over the country. “I bought one of the first blower doors on the market,” Blasnik said. “I’ve been trying to find out how to save energy in houses for about 30 years. I’ve spent a lot of time looking at energy bills, and comparing bills before and after retrofit work is done. I’ve looked at a lot of data. Retrofit programs are instructive, because they show how the models perform.”

According to Blasnik, most energy models do a poor job of predicting actual energy use, especially for older houses. And since large datasets show that the differences between the models and actual energy use are systematic, we can’t really blame the occupants; we have to blame the models.

Blasnik isn’t the only researcher to note that most energy models do a poor job with existing houses. Blasnik cited several other researchers who have reached the same conclusion, including Scott Pigg, whose 1999 Wisconsin HERS study found that REM/Rate energy-use predictions are, on average, 22% higher than the energy use shown on actual energy bills.

Retrofit studies are consistent: projected savings are overestimated

Blasnik cited five studies that found that the measured savings from retrofit work equal 50% to 70% of projected savings. “The projected savings are always higher than the actual savings,” said Blasnik, “whether you are talking about insulation retrofit work, air sealing, or lightbulb swaps.”
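
The ratio Blasnik describes is often called a realization rate in program evaluation: measured savings divided by projected savings. Here is a minimal sketch in Python. The function name and the therm figures are illustrative assumptions for one hypothetical house, not data from the studies he cited.

```python
def realization_rate(projected_savings, measured_savings):
    """Return measured savings as a fraction of projected savings."""
    if projected_savings <= 0:
        raise ValueError("projected savings must be positive")
    return measured_savings / projected_savings


# Hypothetical house: the model projected 300 therms/year of savings,
# but bills before and after the retrofit show only 180 therms/year.
rate = realization_rate(projected_savings=300.0, measured_savings=180.0)
print(f"Realization rate: {rate:.0%}")
```

A rate below 100% means the model promised more than the bills delivered; the studies Blasnik cited put the typical figure between 50% and 70%.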

So why do energy-efficiency programs almost always overestimate anticipated savings? The main culprit, Blasnik said, is not the takeback (or rebound) effect. Citing data from researchers who looked into the question, Blasnik noted, “People don’t turn up the thermostat after weatherization work. References to the takeback effect are mostly attempts to scapegoat the occupants for the energy model deficiencies.”

Many assumptions, inputs, simplifications, and algorithms are bad

The biggest errors occur in modeling estimates of energy use in older homes. “Post-retrofit energy use is pretty close to modeled estimates,” said Blasnik, “but pre-retrofit use is dramatically overestimated because of poor assumptions, biased inputs, and bad algorithms.”

Poor assumptions. “Models and auditors underestimate the efficiency of existing heating equipment,” said Blasnik. “They often assume 60% efficiency for old furnaces.”

Low R-value estimates for existing walls (R-3.5) and attics. “They also use lots of biased defaults,” said Blasnik. “They assume R-3.5 for an old wall, when many old walls actually perform at R-5 or R-6.” Energy models often underestimate the effects of a high framing factor, thick sheathing, and multiple layers of old siding, all of which improve a wall’s R-value.

Low R-value estimates for existing single-pane windows. “They assume that old single-pane windows are R-1, when they are probably closer to R-1.35 or R-1.4. When calculating the outside surface film coefficient, they assume worst-case conditions — in other words, that the wind is always blowing away heat from the window. They do it that way because the design load is always calculated for the coldest, windiest day of the year (even though the coldest day usually isn’t windy). If an auditor calculates single-pane windows at R-1, he’s assuming that the wind is blowing continuously nonstop all winter long. But in a real house, the wind speed is often close to zero up against the window.”
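
The gap between R-1 and R-1.35 can be roughly reproduced by adding up the layers of a single-pane window: the indoor air film, the glass, and the outdoor air film. The sketch below uses typical handbook film resistances in h·ft²·°F/Btu. These constants are assumptions taken from standard reference tables, not figures from Blasnik's talk.

```python
# Typical handbook resistances in h·ft²·°F/Btu (assumed values).
R_FILM_INSIDE = 0.68          # indoor air film on a vertical surface
R_FILM_OUTSIDE_WINDY = 0.17   # outdoor air film at a 15 mph design wind
R_FILM_OUTSIDE_STILL = 0.68   # outdoor air film when the air is nearly still
R_GLASS_SINGLE_PANE = 0.02    # the glass itself adds almost nothing


def window_r_value(r_outside_film):
    """Total R-value of a single-pane window for a given outdoor air film."""
    return R_FILM_INSIDE + R_GLASS_SINGLE_PANE + r_outside_film


r_design = window_r_value(R_FILM_OUTSIDE_WINDY)
r_calm = window_r_value(R_FILM_OUTSIDE_STILL)

print(f"Design-wind assumption: R-{r_design:.2f}")
print(f"Still air at the glass: R-{r_calm:.2f}")
print(f"Heat loss reduction in still air: {1 - r_design / r_calm:.0%}")
```

Under these assumptions the windy case comes out a little under R-1 and the calm case close to R-1.4, which is consistent with the range Blasnik quotes. Actual conditions fall somewhere between the two.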

Low or absent estimates for thermal regain. Blasnik explained that energy models underestimate thermal regain from basements and crawlspaces. “Most models get big things wrong, like how basements and crawlspaces work,” he said. “Vented crawl spaces usually aren’t at the outdoor temperature. When the outdoor temperature is 10 degrees, a vented crawl space can be at 50 degrees. Why is it that when you insulate a basement ceiling, you get very little savings, maybe zero savings, or maybe $20 a year? Well, if you have a furnace and ductwork in the basement, you are regaining a lot of the heat given off by the furnace and ducts, due to the directional nature of air leakage in the wintertime. The stack effect brings basement air upstairs. The basement is pretty warm, so the air leaking into the house is warmer than the models predict. A similar effect happens in attics: because of the stack effect, most of the air leaving the house leaves through the attic. In a leaky house, you might have 200 cfm of air flow being dumped into the attic. That makes the attic warmer than the models predict. If the attic is 50 degrees, the heat loss through the ceiling insulation is less than the model assumes.”
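
To see how much the attic temperature matters, consider steady-state conduction through the ceiling, Q = A × ΔT / R. The sketch below compares the heat loss a model would calculate with the attic at the outdoor temperature against the loss with a 50-degree attic. The ceiling area, R-value, and indoor temperature are hypothetical inputs chosen for illustration.

```python
def conductive_heat_loss(area_ft2, r_value, t_warm_f, t_cold_f):
    """Steady-state conduction in Btu/h: Q = A * (T_warm - T_cold) / R."""
    return area_ft2 * (t_warm_f - t_cold_f) / r_value


# Hypothetical ceiling: 1,000 square feet of R-19 insulation, 70°F indoors.
CEILING_AREA = 1000.0
CEILING_R = 19.0
INDOOR_TEMP = 70.0

# What a model assumes: the attic sits at the 10°F outdoor temperature.
loss_modeled = conductive_heat_loss(CEILING_AREA, CEILING_R, INDOOR_TEMP, 10.0)

# What Blasnik describes: leaking house air keeps the attic near 50°F.
loss_warm_attic = conductive_heat_loss(CEILING_AREA, CEILING_R, INDOOR_TEMP, 50.0)

print(f"Attic at 10°F: {loss_modeled:,.0f} Btu/h")
print(f"Attic at 50°F: {loss_warm_attic:,.0f} Btu/h")
print(f"Ratio: {loss_warm_attic / loss_modeled:.2f}")
```

With these inputs, conduction through the ceiling drops to one-third of the modeled figure. This covers conduction only; the heat carried into the attic by the leaking air is still lost, which is why air sealing remains worthwhile.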

Models also ignore interactions between air flow and conduction. “Every single house acts like an HRV [heat-recovery ventilator], since outdoor air flowing through walls is picking up some of the heat that is leaving the house,” said Blasnik. “The heat exchange is always going on, but it’s not being quantified or accounted for. Complicated models use algorithms for air infiltration that aren’t very good; the infiltration and conduction interactions aren’t modeled.”

Too many inputs

Anyone designing a computer model has to decide which inputs to require. “The trouble with the complicated models is that they ask for inputs that you can’t measure well,” said Blasnik. “After all, a lot of people don’t even know which orientation is south. Unfortunately, many existing models ask for inputs that are difficult to assess — for example, window shading percentages, wind exposure ratings, and soil conditions. What’s the water table height? What’s the flow rate of the water? Who knows?”

As Blasnik noted, “It’s hard enough to get auditors to agree on the area of a house.”

Many models ask for inputs which are open to interpretation. Blasnik asked, “How do you decide if a basement is conditioned or unconditioned? Perhaps it’s semi-conditioned? Or unintentionally conditioned? Or maybe unintentionally semi-unconditioned?”

When making these types of assessments, it’s hard for technicians to avoid unintentional bias. Technicians entering pre-retrofit information on an older home often come up with pessimistic R-value estimates for existing insulation levels, leading to overestimated savings projections.

Because of these problems with input accuracy, default assumptions are often more accurate than data collection. But even when using a model with the best possible default assumptions, there are limitations to accuracy. “Houses are complicated, and that’s a problem,” said Blasnik. “Lots of factors are difficult to model: foundation heat loss, infiltration, wall heat loss, attic heat loss, framing factors, edge effects, window heat loss, window heat gain, exterior shading, interior shading, the effect of insect screens, air films, HVAC equipment performance, duct efficiency and regain, AC refrigerant charge, and air flows over HVAC coils. There are many unknowns: soil conductivity and ground temperatures are unknown. Wind speed is unknown. Leak locations are unknown.”

The good news: energy models do a better job with newer homes

Because newer homes tend to have lower rates of air leakage and higher R-values than older homes, energy models usually do a better job of predicting energy use in newer homes.

A study of 10,258 recently built Energy Star homes in Houston showed that the median discrepancy between the REM/Rate prediction and the actual energy use in the homes was 17%. In other words, in half of the homes the discrepancy between the modeled and actual energy use was 17% or less; in the rest of the homes, the discrepancy was greater.

Is energy modeling cost-effective?

Blasnik noted the irony that energy experts who analyze the cost-effectiveness of window replacement or refrigerator swaps haven’t bothered to calculate the cost-effectiveness of energy modeling.

“How do the time, effort, and costs of collecting detailed data and using complicated models compare to the benefits?” Blasnik asked. “For most residential retrofits, it is hard to justify the cost of a detailed model that takes more than a few minutes to fill out. It makes more sense to just fix the obvious problems instead of doing a detailed modeling exercise. Data collection work distracts you from other tasks. Often raters spend so much time filling out the audit software that they never talk to the occupants — the homeowner is just sitting there. So here’s an idea: maybe you could talk to the homeowners.”

Blasnik even questions the wisdom of modeling new homes. “If you are building super-efficient homes, the heating usage will be dominated by hard-to-model factors, including internal gains like light bulbs and plug loads,” said Blasnik. “Small changes make a significant difference. Do the owners have a few big dogs? How long does the bathtub water sit before it drains down the pipes? Are the shading calculations accurate? What about internal shading by the occupants? How clean are the windows? How big a swing in indoor temperature will the occupants accept? Most models pay close attention to heating use, but in a super-efficient home, the hot water load and plug loads are bigger than the heating load; these other loads dominate. One large-screen plasma TV may matter more than the thickness of the foam insulation under the slab.”

A study compares energy models

Energy Trust of Oregon is an independent nonprofit organization that sponsors a variety of energy-efficiency programs; its work is funded by public benefit charges tacked onto ratepayers’ electric bills. “In 2008, the Energy Trust of Oregon was aiming to come up with a low-cost energy rating for homes,” said Blasnik. “The question was, is there a low-cost alternative to paying $600 for a HERS rating of a house? Is there such a thing as a $100 energy rating — a ‘light’ energy rating?”

To help answer the question, the Energy Trust hired Blasnik to give advice on which energy models to test. When the team couldn't identify a promising simplified model, Blasnik offered to develop a spreadsheet that would be easier to use than existing energy models. Dubbed the Simple spreadsheet, Blasnik's creation required only 32 inputs and less operator knowledge than other energy models. Blasnik explained, “The spreadsheet was quickly designed to see if a simpler tool could work OK. The model asks for the conditioned floor area and number of stories, but it doesn’t ask you the area of the windows, walls, or attic. The model doesn’t want to know R-values for the walls or attic, or what kind of windows you have. No blower door or duct leakage numbers are necessary.” Instead of requiring R-value inputs, the Simple spreadsheet asks a technician to choose from a limited menu of options — for example, options like “some insulation,” “standard insulation,” or “average airtightness.” (See Image 3, below, for a list of the Simple spreadsheet inputs.)

A research project called the Earth Advantage Energy Performance Score Pilot compared Blasnik’s Simple spreadsheet to three well-established energy models: REM/Rate and two versions of Home Energy Saver, dubbed Home Energy Saver (full) and Home Energy Saver (mid). The Home Energy Saver models were developed by the Department of Energy and Lawrence Berkeley National Laboratory. “The Simple spreadsheet has 32 data inputs,” said Blasnik. “This compares to 185 data inputs required for the full Home Energy Saver model.”

The four energy models made energy use projections for 300 existing houses, and these projections were then compared to actual energy bills. The Simple spreadsheet performed better in most situations; it had the smallest average error and far fewer cases with large errors. The mean absolute percentage errors for the four energy models were:

  • Simple, 25.1%
  • Home Energy Saver (full), 33.4%
  • REM/Rate, 43.7%
  • Home Energy Saver (mid), 96.6%.
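
For readers unfamiliar with the metric, mean absolute percentage error is the average, across houses, of the absolute gap between predicted and billed use, expressed as a percentage of billed use. The sketch below computes it for four hypothetical houses, along with the median version of the same statistic. The therm figures are invented for illustration and are not from the Oregon study.

```python
from statistics import mean, median


def absolute_percentage_errors(predicted, actual):
    """Per-house absolute error as a percentage of actual use."""
    return [100.0 * abs(p - a) / a for p, a in zip(predicted, actual)]


# Hypothetical annual gas use for four houses, in therms.
actual_therms = [500.0, 600.0, 800.0, 400.0]
predicted_therms = [650.0, 570.0, 1200.0, 420.0]

errors = absolute_percentage_errors(predicted_therms, actual_therms)
mape = mean(errors)          # pulled upward by the one badly modeled house
median_ape = median(errors)  # the error of the "typical" house

print(f"Per-house errors: {errors}")
print(f"Mean absolute percentage error: {mape:.1f}%")
print(f"Median absolute percentage error: {median_ape:.1f}%")
```

Because the errors are taken as absolute values, overpredictions and underpredictions do not cancel out. A model can therefore show a small average bias and still have a large mean absolute percentage error.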

“My dumb spreadsheet does better than REM/Rate and the other models because the other models are horrible. For predicting gas use in older homes, REM/Rate had a median error of 85%. Two-thirds of the REM/Rate houses had huge errors. The mean actual use of gas was 617 therms a year, but REM/Rate predicted 1,089 therms. My Simple spreadsheet overpredicted by only 27 therms.”
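
The therm figures in that quote translate into percentages with simple arithmetic. This short sketch uses only the three numbers Blasnik gives; the variable names are illustrative.

```python
ACTUAL_MEAN_THERMS = 617.0     # mean actual gas use per year
REM_RATE_MEAN_THERMS = 1089.0  # mean REM/Rate prediction
SIMPLE_OVERPREDICTION = 27.0   # Simple spreadsheet's mean overprediction

rem_rate_bias = (REM_RATE_MEAN_THERMS - ACTUAL_MEAN_THERMS) / ACTUAL_MEAN_THERMS
simple_bias = SIMPLE_OVERPREDICTION / ACTUAL_MEAN_THERMS

print(f"REM/Rate mean overprediction: {rem_rate_bias:.1%}")
print(f"Simple mean overprediction: {simple_bias:.1%}")
```

On these numbers, REM/Rate overpredicted mean gas use by roughly three-quarters, while the Simple spreadsheet was off by less than 5%.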

Blasnik said, “The other models are very sophisticated, but they focus on the wrong areas. The moral is to get the big stuff right, and don’t waste your time with the other stuff. You can get worse answers if you collect more data than if you just make reasonable default assumptions. These detailed models are precise but not accurate — so they miss the target. The simplified models are accurate but not precise. It is better to be approximately right than precisely wrong.”

Unfortunately, Blasnik's Simple spreadsheet is not available. However, an energy modeling tool based on Blasnik's Simple algorithms has been developed; it just isn't particularly easy to purchase. The software, EPS Auditor Pro, is available from Earth Advantage Institute in Portland, Oregon; the catch is that in order to be eligible to purchase the software, you must be a certified BPI analyst. Once you've obtained your BPI certification, you still can't get the EPS software until you complete an additional multistage training program that includes a 5-hour online class, a 3-hour Webinar, and a final exam. The cost for the whole EPS package (training and software) is $199 for individual users.

Remember, it’s a house, not a science project

Blasnik reminds energy nerds that not every house needs to be a science project. “For energy retrofits, don’t waste your time doing simulations with dozens of inputs,” he said. “Do the obvious stuff. Just fix the leaky uninsulated house — don’t model it. If you need a computer to find out what work you need to do, then you don’t know the answer — no matter what the computer says. There are more important issues that come up in a retrofit project, like: Do we have people who know how to do the work? Will they do the work well?”

Energy nerds can get distracted by modeling and testing. “Bruce Manclark, an energy consultant working with Puget Sound Energy, realized that their duct-sealing program would have been cost-effective if only they didn’t have to do Duct Blaster testing before and after the sealing,” said Blasnik. “So Bruce said, ‘Let’s not test them.’ He called it the ‘Duct Ninja’ program. He recommended that workers just start sealing: seal the air handler and then seal every single duct connection you can access, without any testing. That way you don’t need testing equipment or training in using testing equipment, and you don’t need to spend hours testing. A lot of us are getting distracted by tests and computer software. What we really need are efficient processes to improve homes.”

Experienced energy retrofit workers rarely rely on models. “When we make retrofit decisions, other factors like experience are more important than modeling,” said Blasnik. “Even if you need modeling to make design decisions, you don’t have to model every house. Model something well just once, and then apply the lesson to lots of buildings. If a house isn’t unique, modeling is a waste of time.”

What about PHPP?

Blasnik’s analysis raises important questions about the need for fine details in residential energy models. Passivhaus designers, who work to a standard that caps annual heating energy use at 15 kWh per square meter, are on the opposite end of the spectrum from Blasnik; the software used by Passivhaus designers (PHPP, the Passive House Planning Package) is so complicated that most energy consultants don’t attempt to use it without first taking nine days of classroom training.

For example, consider this formula used to calculate window heat losses in Passivhaus buildings:

[Image: window heat-loss calculation formula, from Bronwyn Barry]

This level of detail raises several questions, including:

  • Do most PHPP users supply accurate inputs?
  • Is the PHPP model accurate?
  • How much do the small differences that PHPP users sweat over really matter?

The Ja/Nein Fallacy

At the Building Energy 12 conference in Boston, Matthew O’Malia, an architect at GO Logic in Belfast, Maine, explained how Passivhaus designers approach their work. “PHPP is a massive spreadsheet,” said O’Malia. “It’s the mother of all spreadsheets. Here’s what I like about the Passivhaus approach: You either achieve the standard or you don’t. At the end of the spreadsheet, your answer appears in this box. The answer is either ‘Ja’ or ‘Nein.’ There is no ‘maybe’ in German.”

Some Passivhaus designers go further than O’Malia, implying that a building that falls short of the magic 15 kWh per square meter is at risk of failure. To these designers, the Passivhaus standard represents an important threshold for performance and moisture control. The implication is that designers who aren’t conversant with WUFI or THERM can end up designing buildings that encourage condensation and mold.

I propose a name for this mistake — the “Ja/Nein Fallacy.”

In fact, there is no evidence that superinsulated buildings that fall on the “Nein” side of the Passivhaus divide are experiencing moisture or performance problems. Moreover, as Blasnik pointed out, once the homeowners move into their new Passivhaus abode, variations in plug loads can overwhelm the small envelope issues that Passivhaus designers lose hours of sleep over.

Don't throw your energy models out the window

Good energy models, including PHPP, can be very instructive for new-home designers. The best models clearly reveal the importance of choosing a compact shape, avoiding bump-outs, installing orientation-specific glazingWhen referring to windows or doors, the transparent or translucent layer that transmits light. High-performance glazing may include multiple layers of glass or plastic, low-e coatings, and low-conductivity gas fill., and addressing thermal bridges. Once learned, however, these valuable lessons do not need to be rediscovered for every new house.

Of course, designers of custom superinsulated homes are likely to continue using energy modeling programs, and their designs — resulting from an iterative process of continual refinement — help instruct designers and builders of simpler homes who may choose to avoid the expense of energy modeling.

Last week’s blog: “Solar Thermal is Dead.”


Image Credits:

  1. Table and graph from Michael Blasnik; window calculation formula from Bronwyn Barry

Apr 10, 2012 1:48 PM ET

How detailed is detailed enough?
by Michael Blasnik


It may surprise you, but the SIMPLE spreadsheet model does include the interactive effects between lighting and heating and cooling loads. It also includes the heat/cool interactions for hot water, plug loads, etc., as well as the interactions between cool roofs / radiant barriers and duct system efficiencies and attic heat gains/losses.

I guess you could say that SIMPLE is only simple when it comes to reducing the number of required inputs and avoiding hourly simulations. But it is actually fairly complex under the hood. I tried to include significant interactive effects throughout. It's not at all clear that you need to use an hourly simulation or collect a lot of extra data elements to capture these effects reasonably well.

Apr 10, 2012 8:09 PM ET

Modeling use cases
by Gregory Thomas

So why do we model currently and do the modeling tools actually support those use cases? Simply tossing rocks at modeling in general is doing a disservice to the industry when there are actually strong use cases for modeling. That is not what Michael is doing, but it is easy for modeling critics to make it seem that way.

So what are the basic use cases for modeling?

We model to design efficient new homes where potentially each feature could be changed to improve energy use and there is no historical usage for comparison. Seems like detailed modeling might be useful here and it sounds like the models are actually OK at this. You can get picky about which model to use and how each of the models supports calculating really low energy use but that is a matter of personal taste and the style of building to some extent. I understand that there is testing underway to compare REM to the Passive House Model to allow REM to be used in that program.

We model to help us make decisions about retrofit options. We have a billing history to help us improve our pre-retrofit model (if we make use of the billing data to calibrate the model and find data entry errors) and we learn a lot about what saves energy and what does not. But after doing a lot of models do I really need to keep modeling? What more am I learning? Not much. But in my experience, those building performance contractors who have not modeled some good number of homes (50-100?) are not as strong at understanding where to cost-effectively save energy and tend to over-design energy saving solutions for customers, spending more of the customer’s money unnecessarily. So modeling to learn is good. But modeling once we have learned may not be as valuable.

We model to get incentives. Programs that want to encourage deep retrofits like to use modeling to control access to dollars. That way you get to design the best solution for a home. Deemed savings programs do not fit well with home performance approaches because they tend to have you putting things into homes that do not need them just to get incentives. Other funding sources, such as Congress, like performance based incentives because everyone wins (most everyone anyway). They don’t have to go through getting lobbied by different industries to get their energy saving solution included in the law. At a different session at ACI, Jake Oster of Congressman Welch's office stated this explicitly. We currently have two national incentive programs introduced with bipartisan support that depend on modeling. Boy wouldn’t that make an impact if these passed! And since they create jobs and have bipartisan support they have a not so terrible chance at passing.

Is the modeling process for getting incentives robust enough to trust with determining incentive values? The proposed laws reference several requirements that were designed to make the process robust and enforceable through quality assurance. First, they require that the software pass the RESNET Audit Software standard. These are a set of physics tests that stretch the tools. Stretching is a good test for relative model robustness but this type of testing is different than the accuracy tests that measure both user error (often complexity driven) as well as the performance of the physics at the mean. These tests can and will be improved but they are the best available solution now for a performance based (meaning developers can run the test themselves which helps you develop faster and less expensively) accuracy test that can be referenced as a standard in legislation.

Second, we need to establish a baseline for the energy usage in the building. This is what the incentive dollar amount will depend on. But guess what, we have a simple method for getting a baseline. It is the existing energy bills. If we start from the weather normalized energy use and subtract the energy use of the building after it has gotten a whole house improvement, we ought to be fairly accurate. The real bills are the real bills after all and we have data showing that improved houses model better than houses with all sorts of problems. It is the problems that are the problem. They are hard to model correctly because lots of inputs are needed, and many of those inputs are hard to measure, so we estimate them and tend to estimate high. All these assumptions about problems, combined with any errors in the modeling software for the actual conditions, make it tough to model poor performing homes. But for an incentive calculation we can actually ignore these issues. We can make the baseline the weather normalized actual bills.

But there is an interesting thing. An energy model calibrated to the actual weather normalized bills is functionally the same as the weather normalized bills. And if we calibrate the model we get the extra benefit of eliminating a lot of gross user error that otherwise creeps into the models. So we solve two things at once, we get an accurate baseline and we improve model quality.
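
As a rough illustration of what calibrating to weather normalized bills can involve, the sketch below fits monthly gas use against heating degree days, restates the billed use for a typical-weather year, and compares that with a model's prediction. This is a generic degree-day regression with a single scaling factor, offered as an assumption about one simple way to do it; it is not the procedure defined in the ANSI standard mentioned below, and all of the data are hypothetical.

```python
def fit_line(x_values, y_values):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    sxx = sum((x - mean_x) ** 2 for x in x_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical billing year: heating degree days and gas use for each month.
monthly_hdd = [1000, 800, 600, 300, 100, 0, 0, 0, 100, 400, 700, 1000]
monthly_therms = [120, 100, 80, 50, 30, 20, 20, 20, 30, 60, 90, 120]

# Intercept = weather-independent base load; slope = heating use per degree day.
base_per_month, therms_per_hdd = fit_line(monthly_hdd, monthly_therms)

# Weather normalize: restate the billed use for a typical-weather year.
TYPICAL_YEAR_HDD = 5500  # assumed long-term average for this location
normalized_therms = 12 * base_per_month + therms_per_hdd * TYPICAL_YEAR_HDD

# Compare with a hypothetical uncalibrated pre-retrofit model prediction.
MODELED_THERMS = 1100.0
calibration_factor = normalized_therms / MODELED_THERMS

print(f"Weather normalized use: {normalized_therms:.0f} therms/year")
print(f"Model prediction: {MODELED_THERMS:.0f} therms/year")
print(f"Calibration factor: {calibration_factor:.2f}")
```

A factor well below 1.0, as in this made-up case, signals that the uncalibrated model overstates pre-retrofit use. In practice a modeler would revisit the inputs rather than simply scale the output, which is how calibration also catches data entry errors.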

BPI has worked with RESNET and a group of industry contributors to create an ANSI standard for this calibration process. This standard is what is referenced in the legislation. The joint BPI and RESNET effort here was considerable and very important. Congressional staff did not believe we could make this joint effort work but we did and congratulations are due to the participants. The joint effort also means that the bill is more likely to pass. The President has endorsed these efforts in his budget, getting the attention of DOE in the process.

Other programs besides Congress use performance based incentives. These programs would benefit from setting baselines using model calibration also. Efforts like Green Button and utility connections to EPA Portfolio Manager also improve access to the energy information needed to do the calibrations. Other efforts like HPXML will make it possible to choose modeling tools and to collect data outside of modeling tools and import that data into modeling for incentive access. There is a lot of infrastructure growing that will make performance based incentives more cost effective to perform and administer.

Finally, we can use modeling to track results. If we don't know how much we expect to save it gets pretty hard to figure out if we are hitting our targets. So modeling helps us improve the performance of our work if we can get access to post performance data and compare it to expected results. (Long discussion about occupancy here but this post is too long already.)

But do I need to run a model to put insulation in a house and air seal without a performance incentive? No. It is not worth the effort and the results with models with enough detail to be used for performance based incentives will be suspect unless I take more time than it is worth. A simple approach might be quicker.

There are a range of simple approaches that reduce user input and error and would make contractors' lives easier. But our experience is that a strong minority percentage of program participants like the detailed modeling. So there is no one size fits all. If you make it too simple, some people complain and if you provide more detail people complain about that too. Oh well.

Apr 10, 2012 9:57 PM ET

Edited Apr 10, 2012 11:19 PM ET.

The Future
by Danny Parker


Glad to hear that SIMPLE accounts for the utilizability of internal gains. However, while I particularly like your laundry list of things that have plagued simulations-- and we have worked to address those in current efforts-- I still remain unconvinced that simulation is not the superior tool.

I like to see how loads line up with PV output by hour. And I like the possibility of evaluating technologies that I think have future promise, but complex interactions, such as heat pump water heaters being used to scavenge cooling in hot climates, but being sensitive to location. Hourly simulation tools, whether HES, BEopt, TRNSYS or EnergyGauge, allow those possibilities, and if we address many of the shortcomings you mention, we are left with a more powerful saw.

Finally, tools such as BEopt and EnergyGauge with its CostOpt module allow economics to be blended into the evaluation process. This, in turn, allows hundreds of exhaustive simulations to be performed (beyond the patience of reasonable analysts using a single spreadsheet or single set of favorite parameters in a simulation) to locate the approaches that are likely to produce the most cost effective means of reaching energy reduction targets.

I liken the situation today to that of chess computers 15 years ago. While one can study chess and become reasonably good, fifteen years ago even a reasonably strong player could best computer programs. Now the tables have turned, and even a grandmaster cannot challenge desktop software. World Champion Vladimir Kramnik was crushed by Deep Fritz in 2006. Since then, the gap has widened.

Admittedly chess is a mathematically deterministic system, but except for the occupant related variations, this is largely true for buildings too. Physics matter. And if the current understanding is flawed (you have pointed out several shortcomings, such as windows), it can be made right. [A current favorite problem assumption for me of late is the assumption of uniformity of temperature in buildings-- which can't be right because of interior walls]. Well, there is Passivhaus.

Anyway, improvements are already manifest. Just a matter of time and effort, as we have been involved with recently with HES. We're getting better-- a lot better.

Eventually, I envision software that will best the energy predictions of human operators-- at least in terms of locating the best performing or most cost effective options.

And if you want to add occupant related variation-- think Monte Carlo simulation with triangular probability distributions. Get ready. With billing records, judicious tests and a series of probing questions, the "Deep Energy Blue" of tomorrow will know how to get there in the way you want, and in a way that is more clever than any of our preconceptions.
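A minimal sketch of what that could look like, using only the Python standard library. The input ranges and the toy_annual_kwh function are hypothetical stand-ins for a real simulation engine, and the coefficients are made up for illustration:

```python
import random
import statistics

# Hypothetical occupant-driven inputs as (low, mode, high).
# These ranges are illustrative assumptions, not measured data.
OCCUPANT_INPUTS = {
    "setpoint_f": (64.0, 68.0, 74.0),
    "hot_water_gal_per_day": (20.0, 45.0, 90.0),
    "plug_load_kwh_per_year": (1500.0, 3000.0, 6000.0),
}


def toy_annual_kwh(setpoint_f, hot_water_gal_per_day, plug_load_kwh_per_year):
    """Stand-in for a real simulation run; the coefficients are invented."""
    heating = 400.0 * (setpoint_f - 55.0)
    hot_water = 60.0 * hot_water_gal_per_day
    return heating + hot_water + plug_load_kwh_per_year


def monte_carlo(n_runs, seed=0):
    """Draw each occupant input from a triangular distribution and run the model."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        draw = {
            # random.triangular takes its arguments in the order (low, high, mode)
            name: rng.triangular(low, high, mode)
            for name, (low, mode, high) in OCCUPANT_INPUTS.items()
        }
        results.append(toy_annual_kwh(**draw))
    return results


results = monte_carlo(5000)
deciles = statistics.quantiles(results, n=10)  # nine cut points: 10th to 90th percentile
print(f"10th percentile: {deciles[0]:.0f} kWh, 90th percentile: {deciles[8]:.0f} kWh")
```

The spread between the 10th and 90th percentiles is the kind of occupant-driven band a tool could report alongside a single predicted number.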

They will eventually be able to learn about their own modeling shortcomings and suggest code and/or algorithms for review and revision.

By then most of the tedium will be gone; the computer will let you know the level of parsimony or complexity required to resolve the question. However, testing, and good billing data (and even end-use data) will continue to be key requirements.

Still, don't be surprised if the computer eventually second guesses some inputs, however!

A long time ago, in a confession telling my age, I eagerly took a college course on the use of the slide rule. What an elegant tool! However, my love affair was later dashed when Dr. Tamami Kusuda (one of my heroes) came up with a method to predict household energy use with a TI-59 calculator, a numbers box, which was God's own salvation for engineers at that point. I was smitten.

Since then, things have changed a lot. And while he was living, Dr. Kusuda eagerly embraced it all and contributed mightily to TARP that was developed by NBS.

Having experienced the beauty of the optimizations made by BEopt or CostOpt-- and how much they have surprised me by their clever approach-- I want to be around to see that happen.

Yes, you see, I am biased. I find energy simulations beautiful. I just hope they feel the same way about me.

Danny Parker

Apr 10, 2012 11:27 PM ET

Edited Apr 10, 2012 11:28 PM ET.

Replies to questions in Martin's Comment #51
by Evan Mills


Actually, it doesn't seem that we agree on any of those points.... Here are your three bullet points, followed by our responses:

"● The Oregon study revealed defects in some of the algorithms in HES and REM/Rate."

Not at all. First, the opaqueness (thus non-reproducibility) and flaws in the methodology did not give us confidence in the results, and there was no basis for attributing differences between predicted and actual energy use to how the tool was used (and hamstrung) as opposed to how the tool actually works. Second, the study's results were not presented in a way that would have been useful in diagnosing possible ways to improve the model other than perhaps the most vague indications (e.g., look at electricity outcomes versus gas outcomes). Other, higher-fidelity data sets have proven much more useful in this respect. We can't speak for what the authors of REM/Rate may have gotten out of the analysis, but it's a good question and we encourage you to ask them.

"● Due in part to the Oregon findings, the HES algorithms have been improved."

No. Given the lack of documentation and no response to our follow-up questions, we could not glean anything particularly useful other than curiosity about the true accuracy of our system. Although we made improvements in the intervening 4 years (!) since the Oregon analysis was done, the main difference between our results and theirs is likely that we removed the handicaps they imposed. The frequency with which the study was cited and misrepresented inspired us to dig more deeply into the question of accuracy. Note that if you read the fine print in the study (and not just the headlines), HES did better than the other tools in various respects.

"● While energy modeling may not be appropriate for every energy retrofit project, it is useful for designers of custom zero-energy homes."

No, that isn't our view. It has much broader application than ZNET homes. Again, the choice of analytical approach really depends on the underlying purpose of the analysis.

Danny Parker provides more discussion in Comment 54, above.

Apr 11, 2012 8:50 AM ET

Edited Apr 11, 2012 8:55 AM ET.

Response to Evan Mills
by Martin Holladay

My attempt to identify possible points of agreement was extended as an olive branch to Danny Parker. I never implied that you agreed with my three bullet points. However, your response is clear; perhaps never before has an extended olive branch been so forcefully rebuffed. Your disagreements are duly noted.

My comments were directed to Danny Parker, who wrote, "A great many adjustments were made to the simulation algorithms over the last years -- many of these in response to greater scrutiny of comparing model to billing records. If one wishes to assign guilt to that process, then we (and the entire simulation community) are fully at fault. As mentioned, in the blog entry, I credit the Oregon study with starting the ball rolling on that process."

I don't think I was doing much violence to Danny's meaning by interpreting these sentences the way I did. But I'll let Danny speak for himself; perhaps his rebuff will be as forceful as yours.

Your third and final point -- that energy modeling "has much broader application than ZNET homes" -- is one I fully agree with. Your attempt at disagreement is based on a deliberate misreading of what I wrote: "While energy modeling may not be appropriate for every energy retrofit project, it is useful for designers of custom zero-energy homes." Syntax and logic compel you to admit that my contention that energy modeling is useful for designers of custom zero-energy homes does not exclude the possibility that there are many other useful applications of energy modeling. If you return to my original blog, you will in fact see an enumeration of four such uses for energy modeling in the first paragraph I wrote. Moreover, the final two paragraphs of my article include a defense of the value of energy modeling software.

Apr 11, 2012 9:15 AM ET

Response to Danny Parker
by Martin Holladay

I'm old enough to have used a slide rule in high school; my father showed me how to use a slide rule when I was still in elementary school. I bought my first calculator when I was in college; it was amazingly cheap -- only $99. (The reason that it was so cheap was that it had a stylus on the end of an electrical cord which was used to tap copper rectangles; there were no buttons. That saved the manufacturer a few bucks.)

I'm afraid that your chess-program analogy is of limited value when discussing building energy software. Computer chess is usually played on a chessboard equipped with sensors, so the computer has access to real-time data -- all data relevant to the game in question. Alas, no one has yet invented a house with enough sensors to provide the relevant home-performance data to a computer.

That means that for the foreseeable future, we're stuck with ordinary data entry by ordinary humans with limited knowledge of all of the factors governing the performance of a house. Even if we granted an auditor a month for data entry, many of the relevant data points can't even be measured.

Apr 11, 2012 1:24 PM ET

response to Greg
by Michael Blasnik

Greg- I very much agree with the importance of calibrating models to actual energy use and did some work on the BPI-2400 standard to help push that along. I also agree about the "use case" discussion but I may draw the line differently. My point is that modeling needs to be cost-effective. We need a range of options when modeling a home so that in a typical home with typical energy use and typical problems very little time is expended on collecting data for models or running models. In more complicated situations more data collection and modeling effort may be justified. The challenge is creating the tools that do this seamlessly. I know you are quite aware of the challenges involved in making this a reality and I think we agree that the industry is generally heading in the right direction on this.

Apr 11, 2012 2:16 PM ET

Edited Apr 11, 2012 2:18 PM ET.

response to Danny
by Michael Blasnik

Danny -

I completely agree that, if done well, a more detailed simulation model should certainly be able to make more accurate assessments of energy use and retrofit savings than a simpler model. I have never disputed that. But the key part of that sentence is the "if done well". A simplified model with decent inputs/defaults/assumptions can easily perform better than a detailed simulation with poor inputs/defaults/assumptions. I think that was the main finding of the Oregon study. It is also unclear how much detail and accuracy is needed in each home -- just because we can model something doesn't mean we should.

But there is also another point worth making -- which is also related to your chess computer analogy -- and that is fundamental uncertainty and the propagation of error. Unlike with chess computers, for energy models of homes we will never be able to provide good field values for many of the modeling inputs -- the distribution of leaks, the site wind speeds, the interior and exterior shading, the properties of the soil, the framing factor of the walls, the regain from duct losses, even the outdoor temperature. The list goes on and on.

The result of all this uncertainty is that each component of the building model has some minimum uncertainty -- often in the range of 10% and often more. If we assume that these errors are uncorrelated in a given home, then propagation of error tells us that we should sum the squares of the errors and then take the square root. I've done some calculations using what I think are reasonable estimates of these uncertainties and found an overall uncertainty in heating use of 10%-15% (with no occupancy effects). I then re-did the analysis using larger component uncertainties to represent a simplified modeling approach and found that the overall uncertainty only went up by about 5 percentage points even though I increased some component uncertainties by much more. This type of exercise shows that even if you reduce the uncertainty in one part of the building model by a lot, you haven't improved the overall uncertainty very much. It's an open question how much more time and effort it is worth to reduce uncertainty from +/-20% to +/-15% -- is it worth an extra hour at every home? two hours? 10 minutes? Does it even affect our actions -- the retrofits or design? Currently, I think many program designs end up requiring too much time spent collecting data for the building model and running the software. Better software design could help change this conclusion and simplified models may be a part of getting there.
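The root-sum-square rule described above can be sketched in a few lines of Python. The component names and uncertainty values below are hypothetical, chosen only to illustrate the arithmetic, and are not the figures used in the calculations mentioned above:

```python
import math

# Hypothetical component uncertainties, each expressed in percent of total
# heating use. These numbers are illustrative assumptions only.
COMPONENT_ERRORS = {
    "walls": 6.0,
    "windows": 8.0,
    "infiltration": 10.0,
    "foundation": 4.0,
    "ducts": 7.0,
}


def root_sum_square(errors):
    """Combine uncorrelated errors: square each, sum, take the square root."""
    return math.sqrt(sum(e ** 2 for e in errors))


baseline = root_sum_square(COMPONENT_ERRORS.values())

# Cut the largest single uncertainty in half and recombine.
improved_errors = dict(COMPONENT_ERRORS, infiltration=5.0)
improved = root_sum_square(improved_errors.values())

print(f"baseline: +/-{baseline:.1f}%  improved: +/-{improved:.1f}%")
```

With these made-up values, halving the largest component uncertainty moves the combined figure only from roughly 16% to roughly 14%, which is the pattern described above: a large improvement in one component buys a small improvement overall.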


p.s. I would not suggest that people use spreadsheets out in the field to model homes -- that is why I have worked with groups like Earth Advantage, CSG, and energysavvy so that they can turn algorithms into actual software. Optimization methods can be applied to any set of algorithms whether they involve hourly simulations or not.

p.p.s. To further show that I don't think complicated simulations are useless, I am currently using a dynamic simulation model with a 30 second time step that I developed to explore the dynamics of HVAC equipment control strategies.

Apr 11, 2012 11:58 PM ET

Edited Apr 12, 2012 12:12 AM ET.

Luddite lite?
by Danny Parker

Dear Friends,

Not sure how much longer I can keep this up. Not sure I am really helping as I don’t seem to be convincing anyone of my greater faith in simulation. Nor, am I particularly helping myself; my skin is not thick enough. But I will try one last outing as my opinion isn’t ultimately more important than that of anyone else. So I’ll spill my guts again.

I believe almost everyone in the business is making their best effort to do better to predict and assess energy use. We just don’t all agree on how to do that. But at least the intentions are good, and I acknowledge that. Reducing energy in our homes is important– very important.

For me, I am convinced that simulation is the way to go. For instance, something like HES can be run with a bare minimum of inputs– as few as any other strategy one might consider.

And yet, at the same time, under the hood, HES is running DOE-2.1E which has been hot rodded in recent years. And the problems that Michael Blasnik usefully pointed out are slowly being addressed in a fairly complete fashion. That means that the basic engine underneath the hood is powerful, robust, filled with engineering acumen that is only exceeded by EnergyPlus or TRNSYS. That doesn’t mean that a simple calculation cannot be good, or very instructive. It just means that I don’t think it is as intrinsically reliable as an hour-by-hour simulation driven by TMY weather with all the goodies.

Why not have the most powerful tool running underneath the hood? Even if you have a list of only ten inputs? Do those really well.

While Evan is correct that the Oregon study did not give the full information necessary to make things better, Martin is right that the controversy generated from that study did result in deep review of simulation methods from top to bottom, not only at LBNL and FSEC, but at NREL as well. That is still going on with the labs checking on each other in the truest deference to the scientific method.

That multi-year process has borne fruit and continues to result in some surprising findings– for instance Jon Winkler’s dramatic findings on HVAC modeling of the most efficient heat pump equipment and how sensitive it is to the equipment size (through an unintended ARI testing loophole). And there are the more mundane, but much more common, influences of insect screening and drapes and blinds on window thermal and solar conductance (much of this being done at the University of Waterloo in Ontario). That, along with the non-uniformity of interior temperature conditions, looks to be really quite important in simulation.

But as I said earlier, HES is slowly incorporating some of these things and all of it just makes the results more accurate, more robust and believable.

Unfortunately, what we have learned in recent months is that billing data is not enough to help troubleshoot our simulations. That happens because compensating errors can shield the analyst from knowing where there are shortfalls or where the simulations do well. We need end-use data so we can see where the predictions fall apart.

Luckily, we have some detailed monitored data sets where we have such data. And in an ACEEE paper that will be presented this summer, we show how predictions from “asset” data (and even blind asset data without many details) can be improved by detailed information and finally by “operational level” data. For instance we found one household uses 100 kWh a year for the clothes dryer while another uses 3,500 kWh although they are precisely the same unit. No clothes line involved either. Anyway, you get the picture. Could clothes washer/dryer technology be more important than CFLs, new refrigerator or added ceiling insulation for such a household? Ah, well, yes...for that household.

And the questions HES can pose to users can help to ferret out such an influence and find that such a situation might exist. And use that to inform users that clothes drying energy might be a critical household energy use.

For those unhappy with my chess analogy (Garry Kasparov was destroyed by a computer playing a deterministic game that doesn’t include the uncertainty we face with simulation of houses), I have bad news for you. In 2008, a University of Alberta computer program, Polaris, won in Las Vegas playing poker against human experts in over 500 hands at a rate of 3:2. Bad enough, that the houses in Vegas are trying to weed out the “pokerbots.” You see, even if the computer was unable to read its human bluffers (an advantage that the human players took advantage of), it compensated by the sheer force of neural networks and Bayesian game theory.

Sorry, but not only are computers good at dealing with deterministic prediction, they also excel in coping with uncertainty. And computers can come to understand where their uncertainty is most critical. Michael is certainly right that we will always face uncertainty in model, engineering and weather. However, we may not agree on the need to use computers to explore that parameter space further.

For instance, I envision that the computer of the future would feed off past success or failure and learn heuristically what questions and unknowns are most important to get answered or clarified (the partial derivatives to any particular input provides a first approximation). And the answers to certain questions may branch off in the most fruitful directions based on past experience. We are working on such a system for HES now. But rather than a simplified calculation, it uses a very detailed calculation with a truncated series of inputs that are based on an expert system of which parameters are likely to be most critical. Simplified inputs/ detailed calculation.

What we need more is good-quality end-use data to feed the computer so it can learn about its shortcomings. We’re getting there.

Yes, simplified calculation can be good. Maybe even very good. But it can’t be best and most rigorous from an engineering standpoint.

Those who disagree might consider if they would be willing to step onto a “Dreamliner” designed by “good enough” engineering. I’ll anticipate the critics saying that “good enough” for houses is not catastrophic like it might be for a jet liner that exceeds design boundaries.

But ponder this: might that kind of thinking have to do with retrofit savings estimates that fall short? Or maybe Zero Energy Homes that don’t reach the mark? I wonder how many of the contestants in the Solar Decathlon are using spreadsheets rather than EnergyPlus or TRNSYS?

I’ll choose improved simulations, thank you. Good luck to those hardy souls on the other side.

Danny Parker

Apr 12, 2012 1:37 AM ET

Beauty is where you find it: (Most everywhere)
by albert rooks


Thanks for your comments and your good work:

"I find energy simulations beautiful. I just hope they feel the same way about me."

Well... for some reason, I'm sure that they do. I find it a far deeper world than we know. (I hope!)

This has been an immensely interesting, rewarding and frustrating discussion to watch. I feel like it's the 1900s and I'm witnessing the horse vs. automobile debate: In 1903, the president of the Michigan Savings Bank advised Henry Ford’s lawyer not to invest in Ford Motor Company, saying, “The horse is here to stay but the automobile is only a novelty, a fad.”

The need here in a national venue is to continually test the accuracy and value of what is purported to be the "best practice". I appreciate that. It's good for "us".

However ... We need to keep an open mind, not get stuck in our failures, and push ahead. It appears to me that development work that sets the bar higher creates a vacuum that eventually gets filled.

Where would the "Pretty Good House" be without the Passive House? By definition "pretty good" can only be quantified by something "better". It was obviously an outcome of Passive House Development in the US. (Note that I hold both of these "schools" in great respect).

It's wise and great that these modeling programs are being tested and held accountable for their claims today. It's the one thing that will drive improvement for better results.

I'm neither a good horse doctor nor a good energy modeler... However we both know that it would be foolish to discount energy modeling's development and effectiveness in the next decade(s).

Cheers to you Danny!

Albert Rooks
The Small Planet Workshop
USA Reseller of the Passive House Planning Package
(and therefore a BIG proponent of energy modeling)

Apr 12, 2012 1:38 AM ET

Edited Apr 12, 2012 1:55 AM ET.

Nice job Ken Levenson
by albert rooks

It's hard to maintain the "value" of more information and choices (PHPP) when it requires more time and energy. My second job at 16 was in a cabinet shop where the owner's quality statement was: "Perfect is good enough." It was pretty hard to argue with him.

We put out some beautiful work...

Keep at it!


Apr 12, 2012 6:39 AM ET

Edited Apr 12, 2012 9:08 AM ET.

Response to Danny Parker
by Martin Holladay

Thanks for taking the time for your long, thoughtful comment. I think this dialogue has been very instructive and interesting.

This time, I'll avoid any temptation to emphasize points of agreement.

As those familiar with the legacy of Energy Design Update (which I edited for seven years) realize, I have long championed the cause of residential energy research. I have often come to the defense, in print, of the work of researchers like Evan Mills and Danny Parker. All of us involved in the design and construction of superinsulated homes stand on their shoulders. We owe them an incalculable debt.

Whatever its flaws, the Oregon study bore useful fruit. The surprising findings of the Oregon study, which Michael Blasnik has accurately shared, should have been exciting to any scientist. Scientists love unexpected data. In fact, it sounds as if Danny Parker, Evan Mills, and others in the modeling community responded to the findings exactly as scientists would be expected to respond -- by reviewing their models to see if any algorithms could be improved. That's the way the system is supposed to work. This is all good news.

The needs of research scientists and residential energy auditors are not the same. A powerful software engine that is capable of hour-by-hour simulations is a wonderful tool for research, and I don't doubt that these tools get more accurate every year. These tools can also be the hidden engines for software used by energy auditors, even when the auditor's version of the software requires only a few inputs.

That said, one of Michael's most important points -- that not every home needs to be a science project, and that not every energy retrofit job requires modeling -- is an astute observation that should help inform people designing residential energy retrofit programs. I feel confident in saying that Michael Blasnik meant no disrespect to Evan Mills when he made this point; nor did I.

Finally, both Michael Blasnik and I have clearly and repeatedly stated that energy modeling software is useful — and for some purposes, essential. Let's use it when we need it, and skip the modeling whenever it distracts us from the tasks at hand.

Apr 12, 2012 8:08 AM ET

Edited Apr 12, 2012 8:09 AM ET.

response to Danny
by Michael Blasnik


I'm not exactly sure who you are arguing with on many of your points -- I think we agree more than you realize.

I think we should use the best models we can, but when it comes to field applications for energy labeling or retrofit analysis, we need tools that require as few inputs as possible, are easy to use, and provide solid estimates of energy use and retrofit savings. I'm agnostic as to exactly how that is accomplished. Many people have thought that if you ask for many data inputs and run an hourly simulation model then you are guaranteed more accurate results than using some simplified approach involving fewer inputs and perhaps a simpler modeling engine. I think people are realizing that isn't necessarily true.

I'm not sure where you got the idea that I'm against using computers or detailed modeling or sophisticated algorithms -- that isn't true at all. I'm just aware that in many retrofit programs the audit tools can impose a significant burden in time/cost/focus and still not produce useful results. It's important that the tools are useful and not burdensome.

It's interesting that you mention tools of the future that learn from the past and assess model adjustments based on derivatives of inputs -- I've been working on these very things. The "learning from the past" part has been a manual process -- I've done many research projects and retrofit program impact evaluations over the past 25 years and have tried to incorporate the lessons learned from that work into improved modeling assumptions and methods. The model calibration part involves using those first derivatives you mention (d_energy use/ d_model inputs) along with an estimated var/covar matrix of input uncertainties to develop a unique solution to the model calibration problem while also being able to give feedback to field people about potential problems with the data they are entering. Like you, I love developing and playing around with this stuff and I love simulations and modeling. But I want to make sure that all of this sophistication can take place in the background and that the tools don't become a burden to use and actually provide useful information. I think we can agree on that as well.
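One common way to combine first derivatives with a variance/covariance matrix of input uncertainties is a linearized, Bayesian-style update. The sketch below illustrates that general technique; it is an assumption about how such a calibration could work, not a description of the method used in the tools mentioned above. The function name calibrate, the input names, and every number are hypothetical:

```python
def calibrate(inputs, sensitivities, covariance, predicted, billed, bill_variance=0.0):
    """Linearized calibration of model inputs to one billed energy total.

    inputs:        prior input values entered by the auditor
    sensitivities: d(energy use)/d(input) for each input, at the prior values
    covariance:    variance/covariance matrix of the input uncertainties
    Returns the adjusted inputs and how many standard deviations each one moved.
    """
    n = len(inputs)
    # Covariance matrix times the sensitivity vector.
    cov_times_sens = [
        sum(covariance[i][k] * sensitivities[k] for k in range(n)) for i in range(n)
    ]
    # Variance of the predicted energy use implied by the input uncertainties.
    predicted_variance = (
        sum(sensitivities[i] * cov_times_sens[i] for i in range(n)) + bill_variance
    )
    residual = billed - predicted
    # Uncertain, influential inputs absorb most of the residual.
    adjusted = [
        x + c * residual / predicted_variance for x, c in zip(inputs, cov_times_sens)
    ]
    sigma_shifts = [
        (a - x) / covariance[i][i] ** 0.5
        for i, (a, x) in enumerate(zip(adjusted, inputs))
    ]
    return adjusted, sigma_shifts


# Hypothetical example: three inputs with independent uncertainties.
names = ["ach50", "wall_r_value", "furnace_efficiency"]
inputs = [10.0, 11.0, 0.80]
sensitivities = [30.0, -20.0, -900.0]  # therms per unit change in each input
covariance = [
    [4.0, 0.0, 0.0],
    [0.0, 9.0, 0.0],
    [0.0, 0.0, 0.0025],
]

adjusted, sigma_shifts = calibrate(
    inputs, sensitivities, covariance, predicted=900.0, billed=780.0
)
for name, value, shift in zip(names, adjusted, sigma_shifts):
    flag = "  <-- check this entry" if abs(shift) > 2.0 else ""
    print(f"{name}: {value:.3f} ({shift:+.2f} sigma){flag}")
```

Because the adjustment is weighted by the input uncertainties, the solution is unique even though many combinations of inputs could match the bill, and an input that has to move several standard deviations is a natural candidate for feedback to the person who entered the data.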

Let's all keep working to improve the tools we have and keep checking their outputs against the real world. There's lots to be done.

Apr 12, 2012 8:19 AM ET

response to Albert
by Michael Blasnik

I hope you realize that when Danny talks about simulation models, that does not include PHPP. PHPP is not an hourly simulation model but is truly a spreadsheet. I think that means you are the one arguing in favor of the horse?

I'm just advocating for either a car that actually works, or else a horse if there isn't a working car around, or maybe even walking if my destination is nearby.

Apr 12, 2012 2:12 PM ET

Edited Apr 12, 2012 2:25 PM ET.

Well... ok.
by albert rooks


Long thread and I came in late.

I mis-read a few comments as "downplaying the value of modeling" due to the time and cost associated. That was what I was reacting to. I see that they were meant to temper budgets and not devalue the practice overall.

When I was thinking of the future in Modeling, I wasn't thinking of the static PHPP. I was imagining what had been presented as the next stage as WUFI 3D: A 3D dynamic model detailing temperature & humidity at optional selected points throughout the wall/roof assembly. Seems like it would detail performance quite well. Now that'd be the car worth pulling out of the garage.

To me, the PHPP is graceful in its single dimension, rigorous in its demand of Therm calcs for bridge areas, its "ja" or "nein" for airtightness. All in all, a pretty reliable and accurate "horse".

All great tools for new construction. Retrofits will probably remain tough. Perhaps that's really a guided walk.

Apr 13, 2012 10:05 PM ET

by Evan Mills

I believe the discussion following Martin’s article has been a healthy one, and may help to clear up various misconceptions.

No offense was intended by our Comment 56 in response to Martin’s Comment 51. Danny and I discussed the original comment and agreed on the essence of a response; Danny was quite busy and asked that I pen the reply. I indeed missed the qualifying term “every” in his third point, for which I apologize; it certainly was not deliberate… There are certainly home energy upgrades that don’t require modeling, or at least the kind of modeling (math) that home performance professionals do. In fact, the consumer version of the Home Energy Saver enables laypeople to do those kinds of low-touch assessments with a minimum of time investment, and without having to pay out of pocket for the information.

The Oregon study wasn’t a particular watershed, honestly. Validation work on the underlying DOE2 engine and improvements to the HES system had actually been ongoing long before then. In fact HES was in the middle of a major upgrade exactly when the Oregon runs (unbeknownst to us) were happening, which was a bit disconcerting.

From our perspective HES actually came out better, in many respects, in the Oregon study than the other tools (best symmetry of errors around actual values, and better median results by many metrics) and so we weren’t particularly concerned. Our concern, rather, was around deficiencies in the study methodology and analysis, and repeated questionable interpretation of the study’s findings.

In any case, validation is a highly important and worthwhile pursuit, if done correctly and, ideally, in a way that helps actually improve the models. We’re also encouraged that many of the trends discussed in this thread bode well for smarter and lower-cost ways of bringing good analysis to bear in an increasingly cost-effective manner. In fact, with this very much in mind, DOE is soon to launch the Home Energy Scoring Tool for asset rating, which is built on the HES architecture.

Apr 14, 2012 2:27 PM ET

Intent of the EPS Pilot
by David Heslam

Although I love a good hourly building simulation as much as the next person, divining the merits of different modeling approaches was not the focus of the 2008 study. Energy Trust of Oregon sponsored that research to explore the parameters of a cost effective asset rating program for existing homes. As one of the study's authors and the manager of the field work I would like to clarify a few things.

The study's two key findings, that a score should based on a representation total energy consumption and that models optimized for fewer inputs could be developed to deliver such a rating, have both shown relevancy over the past few years. In creating the Home Energy Score, DOE created a metric that is less granular than the report proposed, but is consistent in concept. Also models have been improved for this purpose in the subsequent years.

Does that make it a landmark study? Probably not. Useful for guiding policy and technical development, sure. Was there bias in the study? Not intentionally towards one tool or another that's for sure. Was there measurement error? It was a field study with 5 different auditors, so yes. To minimize this there was very extensive error checking, including follow up phone calls and follow up home visits just to very suspect data points.

The merits of energy modeling are always a hot-button issue, so it is not surprising that the study produced debate about the merits of modeling and different modeling approaches in areas of work far outside the study's focus. It should be noted the study pointed out merits of Home Energy Saver and suggested the improved optimization of its inputs if it were to be used for an asset rating.

Evan, Danny and others obviously disagree with the study's methodology and analysis; fair enough. I would just caution that those opinions may not be considering the actual framework of the study: when analyzing modeling tools we wanted to determine whether tools were suitable for delivering a cost-effective asset rating for homes. That focus determined the comparative analytics used. Were those analysis methods different from those used for other research purposes? Yes, because we were asking different questions.

On one last note, I would like to point out that Evan and Danny are now working with some of the study data in their current work. The 2008 study did a very thorough (my staff might say excruciatingly thorough) job of collecting and cataloging the Home Energy Saver data. It is doubtful any other study will collect that level of field data in the near term. Painting the study as technically deficient with a broad brush is not helpful, especially if elements are proving useful to those very detractors.

Personally, I love energy models; I just don't want to waste unnecessary time creating them. For years I utilized them to determine best practices for building high-performing homes. Hourly simulation is great for that, but I didn't/don't feel the need to model every home. My focus has changed from that pursuit to making our existing buildings better and generating a rating that lets a building owner take credit for that improvement. For that purpose we need tools that get it right as quickly as possible.

We're making better tools. Let's keep it up.

Apr 15, 2012 12:58 AM ET

Occam's Razor Meets George Jetson
by Danny Parker

Dear Friends,

I hadn’t checked here in a few days and see that I should have taken a look earlier. I think I need to address several individuals in hopes there are no bad feelings.

First to David Heslam, we very much appreciate Earth Advantage sharing the Oregon data with us to till again. That has been in progress in recent weeks and it has been useful, just as other data from Wisconsin, North Carolina and Florida have been. It is also helpful to know that the Oregon data was prepared with great care, and knowing that makes addressing model shortcomings all the more important. Thank you.

To Michael, I would say that we have very similar perspectives. Although this series of commentaries has concerned simulation and modeling, I always believe that measured data is on first and I know you see it that way too. Indeed it is why we are here discussing our ability (and inability) to understand it completely and predict well. Still, I typically find myself nodding to many of the observations you make– particularly the recent one relative to how error from models tends to propagate at the square root of their differences. That makes better prediction tough from the start.

Indeed, that has been a fundamental finding in work over the last months: unless you can see where simulations are making errors in the end-uses, the prospect of improving predictions is pretty daunting as compensating errors from one “refinement” after another does not necessarily take one straight away to reduced error. There are lots of compensating changes where imprecise input leads to some things helping prediction for an individual case while others take one further away.

My main argument is not with you, but rather directed at the conclusion that seemed to emerge from the Oregon study– that the accuracy problems with prediction lie with simulation itself. To go over that again, I believe simulation is the better way to meet the challenges at prediction:

• Using hour-by-hour simulation provides weather data at fine granularity based on detailed meteorological observation, with respect to outdoor temperature, coincident solar radiation, nighttime long wave irradiance and wind speed. While cooling and heating degree days and information on solar irradiance can be distilled, I like to think that the actual measured weather is important to our models. (Indeed I have been arguing for years that rain should be important to reset the roof temperature when the TMY rain flag shows precipitation). Of course, as Michael has already observed, using these data willy-nilly can be trouble– for instance using airport wind speed at a 10 meter height says little about the wind speeds down at house neutral pressure point and with localized shielding and terrain from the suburban landscape. Otherwise, surface film coefficients on windows are undervalued and infiltration estimates are exaggerated. Indeed, these are some of the improvements that we have been carving into the simulations lately. Makes a difference. Still, I look at detailed weather as a good thing.
• Simulation models such as DOE-2, TRNSYS and EnergyPlus generally have the most rigorous engineering models in them. That doesn’t mean they are correct, however. As I mentioned earlier in this blog, one of the tenets of simulation– that of a homogeneous interior temperature– is very often violated in real houses, particularly older, more poorly insulated ones. As Michael knows, there is no bigger knob for space conditioning models than the interior temperature (thermostat) approximation. Thus, our increased attention to this phenomenon and the collective ‘Ugh’ from everyone recognizing that interior walls and thermostat location are about to become important– at least if you want to predict the savings of insulating a turn of the century (the previous one) brick two-story. There is work there, yet to do.
• Hour by hour simulation models allow prediction of hourly loads. This becomes more important as more and more utilities move to time-based Time of Use (TOU) rates or even Critical Peak Pricing. As PV costs drop below $6/Watt, you’ll see a lot more of that and how that matches up with TOU rates will be important. Same thing for plug-in hybrids: we’ll eventually need to simulate them and how they affect the TOU mix. This move by utilities will only grow in the future since their costs of generation vary with time, season and weather conditions. The competitive nature of the business dictates that they face us with that music eventually. Simulation will be needed.
• The time required for running an hourly annual simulation is trivial (<4 seconds) compared with the time for input and, particularly, for developing input. Why not do the best calculation possible with those precious inputs?
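
The point about hourly loads and TOU rates can be illustrated with a minimal sketch. Everything in it is assumed for illustration: the two-tier rate, the 4 p.m. to 8 p.m. peak window, the prices, and both load shapes are invented, and the function name is hypothetical.

```python
# Hypothetical sketch: the same daily kWh can cost different amounts under a
# time-of-use (TOU) rate, which is why hourly load predictions matter.
# The rates, peak window, and load shapes below are invented for illustration.

def tou_cost(hourly_kwh, peak_hours, peak_rate, off_peak_rate):
    """Return the cost of one day of hourly loads under a two-tier TOU rate."""
    cost = 0.0
    for hour, kwh in enumerate(hourly_kwh):
        rate = peak_rate if hour in peak_hours else off_peak_rate
        cost += kwh * rate
    return cost

PEAK_HOURS = set(range(16, 20))   # assumed peak window: 4 p.m. to 8 p.m.
PEAK_RATE = 0.30                  # assumed $/kWh during peak hours
OFF_PEAK_RATE = 0.10              # assumed $/kWh at all other hours

# Two made-up days with the same total use but different timing.
flat_day = [1.0] * 24
evening_day = [3.5 if hour in PEAK_HOURS else 0.5 for hour in range(24)]

flat_cost = tou_cost(flat_day, PEAK_HOURS, PEAK_RATE, OFF_PEAK_RATE)
evening_cost = tou_cost(evening_day, PEAK_HOURS, PEAK_RATE, OFF_PEAK_RATE)

print(f"daily totals: {sum(flat_day):.1f} vs {sum(evening_day):.1f} kWh")
print(f"TOU cost: ${flat_cost:.2f} vs ${evening_cost:.2f}")
```

A model that only predicts daily or monthly totals would treat these two days as identical, even though the bill under this assumed rate would not be.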

That said, I do think that simple models have some intrinsic advantages over complex simulation. The key one is parsimony where getting things wrong in a model with a simple engine is less likely than getting something wrong in a very complex model, such as DOE-2, where such an eventuality becomes a near certainty.

Even so, the engineering model needs to be as fully complex as the situation demands it, but no more. It’s a difficult edge that evokes Occam’s Razor and allows me to bring up Einstein again: “Everything should be kept as simple as possible,” he said, “but no simpler.”

As I have made clear, I believe the effort to get the complex models right and use best quality weather data is worth it to the extent that some phenomena can otherwise be underdetermined. In any case, the national labs and FSEC have been policing each other as we do pretty thorough examination of comparative models and use that process to illuminate differences.

Yes, we get carried away at times (a science project), but often back off to what works well enough. There is the BESTEST suite of simulation cases which allows one to see how simulations stack up against each other for pre-arranged cases. However, BESTEST is no panacea either. How can it help if all the models are in error? That has already been productive to correct some real differences in simulating windows– something being corrected in BEopt (made clear by differences in the DOE-2.2 and EnergyPlus implementations). Similarly, some inadequacies in the EnergyGauge simulation of heat pumps have been corrected (strip heat is commonly engaged when the reversing valve is activated and defrost is in progress). A variety of improvements have been made in HESPro– particularly for modeling machines, basements and air infiltration. Gotta get those things right– particularly if best assumptions were not used before.

Recently, we have been able to examine the predictions of HES against natural gas consumption for space and water heating in a collective sample of 450 homes around the U.S. While we still have appreciable scatter (yes, a lot), we are spot on for the averages. Turns out that electricity is another matter– none of the models do that well, including SIMPLE, and for reasons that aren’t immediately apparent. We’re looking into that. Still, I would hope that if Michael runs HES today against his home, it wouldn’t still be high by 40%!

A comment from just about everyone, as Martin clearly reiterated, is that auditors should not be overtaxed in providing information to a model. And that is a fundamental point we agree on.

However, this is one key area where Evan and I are trying to address a misconception. While one may believe that something like HESPro requires a plethora of inputs, that is not true. It can be run with a very abbreviated list of inputs– as simple as any other model. The key limitation is that USERS have not been given any guidance about what those “most important” inputs are.

And perhaps we are guilty of not helping with that as much as we could. “Quick inputs” in HES was one answer; but many users choose Detailed mode. Having 150 inputs may suggest all of them need to be filled out, but that is not the case. Not all the inputs are of equal import. And even knowing that only some need to be addressed, how is a lay person to know which ones? How to span that gap? Leaving users to their own devices only invites sub-par performance and frustration.

HESscore is one answer to that process– a truncated list of 30-odd inputs based on “expert opinion.” But while adequate and consistent, that fixed series of inputs, however simple, may not provide the most accurate result.

It sounds like Michael B. and I are on the trail of the same thing: using the heuristic smarts of computers to help find the right inputs to demand from auditors based on what is learned from past performance of the system in predicting future loads. Of course, they would not always be the same inputs necessarily. The problem, of course, is that auditors and home owners typically have limited attention. We need to use their attention to maximum benefit before they glaze over or exceed the audit budget.

Based on Evan’s priorities, we are working on that over the next year for HESPro. We’ll see what we manage. (One trick is to obtain two years of billing data; use the first half to tune the model and then see how well it can predict the recent year– Delphi method in action). Cluster analysis might then be used to help sort out the most important groups and their common critical inputs. Or that’s my idea.
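
The tune-then-predict trick can be sketched in a few lines. This is only an illustration under stated assumptions, not how HESPro works: it swaps in a plain degree-day regression for the simulation, and the monthly heating degree days (HDD), gas bills, and function names are all made up.

```python
# Hypothetical sketch: tune a simple degree-day model on one year of bills,
# then check how well it predicts the following year. All data are invented.

def fit_degree_day_model(hdd, use):
    """Ordinary least squares fit of use = base + slope * hdd."""
    n = len(hdd)
    mean_x = sum(hdd) / n
    mean_y = sum(use) / n
    sxx = sum((x - mean_x) ** 2 for x in hdd)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hdd, use))
    slope = sxy / sxx
    base = mean_y - slope * mean_x
    return base, slope

def predict(base, slope, hdd):
    """Predicted monthly use for a list of monthly HDD values."""
    return [base + slope * x for x in hdd]

def percent_error(predicted, actual):
    """Signed error of the annual total, as a percent of actual use."""
    return 100.0 * (sum(predicted) - sum(actual)) / sum(actual)

# Year 1 (tuning year): monthly HDD and gas use in therms, invented.
hdd_year1 = [1100, 950, 800, 500, 250, 50, 0, 0, 150, 450, 750, 1050]
therms_year1 = [130, 115, 100, 70, 45, 25, 20, 20, 35, 65, 95, 125]

# Year 2 (test year): different weather, bills not used for tuning.
hdd_year2 = [1200, 900, 700, 550, 200, 40, 0, 0, 100, 400, 800, 1000]
therms_year2 = [145, 108, 92, 70, 42, 26, 21, 19, 33, 58, 104, 118]

base, slope = fit_degree_day_model(hdd_year1, therms_year1)
predicted_year2 = predict(base, slope, hdd_year2)
error = percent_error(predicted_year2, therms_year2)

print(f"base = {base:.1f} therms/month, slope = {slope:.3f} therms/HDD")
print(f"annual prediction error for year 2: {error:+.1f}%")
```

The design choice that matters is that the second year's bills never touch the fit, so the reported error is a fair test of prediction rather than of curve fitting.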

It also turns out that end-use loads are very important, as mentioned in the previous blog. This is important to help understand where prediction error is coming from. Homes are not just heating and cooling; they are water heating (where knowing gallons per day is vitally important), laundry (washer and dryer that are very sensitive to occupancy), refrigeration (guess what? second refrigerators are often way different from the first), fans and blowers, cooking, lighting and entertainment and plug loads. Lots of places for error.

In fact, it is worse than first blush when it comes to predicting retrofit measure savings. For instance, a model that predicts monthly energy right can’t be known to be as reliable for predicting the savings of a heat pump water heater as one that has been checked to see how well it predicts the daily hot water gallons without bias. (By the way, speaking of uncertainty, hot water gallons appear particularly variable even given knowledge of occupants and other fundamental factors.) Same for predicting the savings of an air conditioner: better be predicting space cooling well regardless of how close you are on the monthly utility bill. Compensating errors don’t help much then.
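
A small worked example shows why compensating errors hurt savings predictions. The end-use numbers and the 50% savings fraction below are invented purely for illustration.

```python
# Hypothetical sketch: a model can match the total bill exactly and still
# mispredict retrofit savings if its end-use split is wrong.
# All kWh/year figures and the savings fraction are invented.

def measure_savings(end_uses, end_use, fraction_saved):
    """Savings from a measure that cuts one end use by a given fraction."""
    return end_uses[end_use] * fraction_saved

actual = {"space_heating": 9000, "water_heating": 3000, "other": 4000}
modeled = {"space_heating": 7500, "water_heating": 4500, "other": 4000}

# The heating underestimate and water heating overestimate cancel out.
total_actual = sum(actual.values())
total_modeled = sum(modeled.values())

# Assume a heat pump water heater halves water heating energy.
FRACTION_SAVED = 0.5
modeled_savings = measure_savings(modeled, "water_heating", FRACTION_SAVED)
actual_savings = measure_savings(actual, "water_heating", FRACTION_SAVED)
overprediction = modeled_savings / actual_savings

print(f"totals: modeled {total_modeled} vs actual {total_actual} kWh/yr")
print(f"savings: modeled {modeled_savings:.0f} vs actual {actual_savings:.0f} kWh/yr")
```

In this made-up case the model is perfect on the annual bill yet overstates the water heater savings by half, which is the situation end-use data would expose.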

Unfortunately, the cases where we have the above end-use information are spotty. But we are in search of it and have found some– that data being uniquely valuable in the quest for the Holy Grail of improved accuracy.

Will we do better? For sure.

Yet, as I mentioned, computers and computing power and the ability of machines to help us understand their own limitations and ours may prove invaluable. Such expert-system applications may play an ever greater role in improving prediction while reducing the onerous nature of audits and lengthy forms. It’s my conviction that simulation with a forward-chaining inference system for asking the right questions will play an important role in that process. The computer can refine the prediction as the auditor or homeowner provides data and then seek more where it is most needed. It should be able to ask the most important questions first.
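
One way to picture "ask the most important questions first" is a greedy ranking of unanswered inputs. The sketch below is not a forward-chaining inference engine and not anything from HES or HESPro; it is a toy ordering that assumes each input's contribution to prediction uncertainty is already known and independent of the others (so contributions combine as a root sum of squares). The input names and percentages are invented.

```python
import math

# Hypothetical sketch: ask about the input that contributes the most
# uncertainty first, and stop once the remaining uncertainty is small enough
# or the question budget runs out. All numbers are invented.

def combined_uncertainty(contributions):
    """Root-sum-of-squares of independent uncertainty contributions."""
    return math.sqrt(sum(c ** 2 for c in contributions))

def plan_questions(uncertainty_by_input, target, max_questions):
    """Return (questions to ask in order, uncertainty left afterwards)."""
    remaining = dict(uncertainty_by_input)
    asked = []
    while remaining and len(asked) < max_questions:
        if combined_uncertainty(remaining.values()) <= target:
            break
        worst = max(remaining, key=remaining.get)
        asked.append(worst)
        # Assumption: a good answer removes that input's uncertainty entirely.
        del remaining[worst]
    return asked, combined_uncertainty(remaining.values())

# Assumed contribution of each unknown input, in percent of annual use.
uncertainty_by_input = {
    "thermostat_setpoint": 12.0,
    "wall_insulation": 9.0,
    "air_leakage": 8.0,
    "window_type": 4.0,
    "lighting": 3.0,
}

asked, residual = plan_questions(uncertainty_by_input, target=6.0, max_questions=4)
print("ask in this order:", asked)
print(f"remaining uncertainty: {residual:.1f}%")
```

With these made-up numbers the plan stops after three questions, because the two inputs left unasked no longer push the combined uncertainty above the target.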

That won’t come to energy analyst George Jetson right away, but the next few years could see many improvements. I have been pleased to be able to help with these things, at least in a small way, and I appreciate the efforts of others, even when we do not see eye-to-eye.

Everyone agrees we are all trying to improve things. I’ll add that we should be turned out to pasture if we are not.

Danny Parker

Apr 16, 2012 2:23 PM ET

Yes, but....
by Robert McClellan

I do have to agree that today's energy models remind me of the scene in Animal House where John Belushi carefully measures the windshield of the Cadillac...then smashes it out with a sledge hammer. On the other hand, rocket scientists started with the same sort of inaccurate models but eventually got to the moon. These models may not weight the variables appropriately and we know that we can't model human behavior, but we still want to aspire to having models for heat and moisture flow that approach the reliability of structural force models.

Another reason to use models is strictly cosmetic. If you notice, women's shampoo is often advertised as having some ingredient with a long scientific name like Importantene. The name means nothing, but it does sell shampoo. Likewise, as an instructor, it really helps to have an inscrutable model when you are trying to teach long time builders that fiberglass isn't state of the art anymore.

Apr 22, 2012 2:24 PM ET

Great discussion...
by Martin Holladay

It's a pleasure to come back from a week's vacation and find a series of stimulating, thoughtful posts. Thanks to everyone contributing to this discussion.

Apr 26, 2012 8:07 PM ET

Edited Apr 26, 2012 8:13 PM ET.

Simple. Danny Parker? Simple
by aj builder, Upstate NY Zone 6a

Simple Danny Parker? Not your post at least.

Simple is moving taxes from good to bad.

Raise taxes on fossil fuel, while lowering taxes on green sustainable income.
Mandate solar, outlaw discontinuous insulation.
Deliver only so many BTUs of energy per residence. You build a 1000sqft or 10,000 or 100,000sqft and you are given the same amount of energy to work with per year. So if you have a huge home, you also surely can afford to install enough solar to cover your needs.

My plan makes modeling work because one of the biggest problems with models is the lack of control of the "knob turning" homeowners. With the limited energy delivery scheme it becomes very personal for a homeowner to control his "knob" turning or pay for his lack of knob control not the community, the planet.

Sep 7, 2012 3:07 PM ET

Edited Sep 7, 2012 3:26 PM ET.

GREAT article!
by Ted Kidd


"So why do energy-efficiency programs almost always overestimate anticipated savings? The main culprit, Blasnik said, is not the takeback (or rebound) effect. Citing data from researchers who looked into the question, Blasnik noted, “People don’t turn up the thermostat after weatherization work. References to the takeback effect are mostly attempts to scapegoat the occupants for the energy model deficiencies.”

But following the data too rigidly may have led to obvious conclusions, and sometimes obvious conclusions are incorrect. I've found this to be false:

"most energy models do a poor job of predicting actual energy use, especially for older houses."

The real problem is data input. If you do a crappy job inputting your data, then don't even bother to true to actual consumption (anybody reconcile their bank account?), of course you get gross savings overestimates.

Add energy program minimum savings, and sales people that consciously or unconsciously want reports to show more savings, cover it all with no accountability for accuracy, and blaming the software seems to me like jumping to the easy conclusion, or "scapegoating," also.

I'd CHANGE THIS: "References to the takeback effect are mostly attempts to scapegoat the occupants for the energy model deficiencies.”

to THIS: "References to the takeback effect are mostly attempts to scapegoat the occupants for the energy MODELLING deficiencies.”

Is it any surprise modelling sucks? All the classes pooh pooh it. Heck, there is no dedicated certification yet it's arguably the most complicated and important piece of the process. It's nearly universally treated as something to be rushed through. With all these pressures against accuracy, where is there any counterbalance? There is none.

Jan 29, 2013 6:22 AM ET

Retrofit studies are consistent: projected savings are overestim
by Hein Bloed

Here is a Cambridge study about the so-called "prebound effect", covering Germany, the UK, Belgium and France:

See bottom of the page for the full paper.

Jan 29, 2013 8:50 AM ET

Response to Hein Bloed
by Martin Holladay

Thanks for the link. The European study reinforces Blasnik's conclusion (and my reporting): “Blasnik cited five studies that found that the measured savings from retrofit work equal 50% to 70% of projected savings. ‘The projected savings are always higher than the actual savings,’ said Blasnik, ‘whether you are talking about insulation retrofit work, air sealing, or lightbulb swaps.’ ... Energy-efficiency programs almost always overestimate anticipated savings ....”

One of the authors of the European study, Dr Minna Sunikka-Blank, noted, "This challenges the prevailing view that large cuts in energy consumption can be achieved by focusing purely on technical solutions, such as retrofitting homes. In some cases, doing so may bring only half the expected savings, perhaps less."

Dec 27, 2013 12:14 PM ET

$ + Hubris = GIGO
by Russell Higgins

Accurate energy modeling is readily achievable, if you remember to KISS ....your S.O. every morning when you leave for work --- no, the other KISS: Keep It Stupidly Simple.
Energy savings from retrofit insulation, EASILY done, on a napkin, waiting for your lunch.
Then relax, enjoy your BLT and do the window replacement while waiting for the check.
Hubris - Yes. Rather than trying for an all-seeing and all-knowing BEM, modeling smaller, simpler and FEWER components will give you more reliable information; that is, get the inputs right, remember all you need to consider, get data, make thoughtful guesstimates, etc.
Money - Software is supposed to save money, making it possible to achieve the improbable for nothing? No, it's just a tool, one that requires KNOWLEDGE (knowing the building type / occupancy, the software and the data) to create reliable models. Only the rarest of clients, those building a state-of-the-art eco-monument, will be able / willing to pay for knowledgeable staff to accurately collect / estimate the literally thousands of data points necessary for a reliable whole-building energy model.

As others said, "Trueing" models to yearly energy bills and climate data (adjusted for local microclimate) etc. is MANDATORY for a reliable BEM. Yet new buildings don't have historical energy use data. VERY general info is now used for this, while what we need is the energy data for the building down the block - which means talking them out of it / talking them into doing an energy audit of their building (see how I turned that "problem" / "cost" into an opportunity???), and you spread the cost of collecting and massaging climatic data over two (or more) buildings. Why not try for even more synergistic savings and go down the block offering energy audits at a "discount" to everyone on the street? It's what driveway contractors do; it must work, since most of them live better than I do.
For reliability, BEM needs validated open source data, summarized by individual and grouped buildings / occupancies and listing major data points (size, shape, orientation, occupancy, users, energy per area / user / system, etc.), as input to shortcut the expensive / inaccurate data collection process with known good data.
A non-trivial problem.
Anyone good at grant writing?
Team with your alma mater, get the A/E depts going and some lucky professor published.

Oct 31, 2014 12:26 PM ET

A synopsis of Parker's presentation
by Nate Adams

If it is helpful, this can be read in a few minutes with what I found most poignant, which doesn't mean I'm interpreting as Danny, Evan, et al would like.

No one cared about accuracy in many models, as indicated by Blasnik's presentation that the error 'Was almost entirely from the pre-retrofit usage estimate'. (Page 9, top left slide of presentation below from Summer Camp.) Truing the models to actual consumption would make models accurate, so why aren't we doing that before claiming models are junk?

Dec 10, 2015 10:43 AM ET

Use the utility bills
by Sherwood Botsford

Seems to me that the best sanity check would be to ask them for the last year's utility bills, and from the weather office, last year's heating/cooling degree days. From this you can calculate the true performance overall of the envelope.
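
As a rough sketch of that sanity check, annual heating fuel and heating degree days can be turned into an overall heat loss coefficient using the standard degree-day relation (delivered heat = UA x HDD x 24). The function name and every input value are hypothetical, and the estimate lumps in infiltration while ignoring solar and internal gains, so it is only a first-order figure.

```python
# Hypothetical sketch: back out an overall heat loss coefficient (UA) from a
# year of gas bills and heating degree days. All inputs are invented.

BTU_PER_THERM = 100_000

def overall_heat_loss_coefficient(annual_therms, baseload_therms, hdd,
                                  furnace_efficiency):
    """Return UA in BTU/h per degree F from bills and degree days (base F)."""
    heating_therms = annual_therms - baseload_therms
    delivered_btu = heating_therms * BTU_PER_THERM * furnace_efficiency
    return delivered_btu / (hdd * 24.0)

# Assumed inputs: 900 therms/yr billed, 20 therms/month of non-heating use
# (estimated from summer bills), 6000 HDD, 80% efficient furnace.
ua = overall_heat_loss_coefficient(
    annual_therms=900.0,
    baseload_therms=20.0 * 12,
    hdd=6000.0,
    furnace_efficiency=0.80,
)
print(f"estimated UA: {ua:.0f} BTU/h-F")
```

Comparing this billed figure with the UA a model implies for the same house is one quick way to see whether the model is in the right neighborhood.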
