For solar professionals, the weather isn’t boring small talk. Sunshine makes electrons flow, rain slows construction and storms test a system’s ruggedness. Lately, this weather wildcard has been getting wilder, making PV production modeling tougher.
While weather is becoming more variable, weather data is gaining nuance. The combination means choosing the right data should take more thought than simply picking a familiar source from a dropdown menu.
Changing weather demands current data
If you think the weather is changing, you’re right. First, there’s an upward trend in temperature, and as anyone who has read the temperature coefficient of Pmax on a module datasheet knows, warmer temperatures lower the efficiency, and therefore the output, of solar arrays. On top of that, extreme weather events are becoming more frequent, and they can cut production and damage systems.
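To put a number on the temperature effect, here is a minimal sketch of the usual linear derate based on a datasheet's temperature coefficient of Pmax. The 400 W rating and the -0.35 %/°C coefficient are hypothetical values chosen for illustration, not figures from any particular module, and the function name is our own.

```python
def power_at_temperature(p_stc_w, gamma_pct_per_c, cell_temp_c, stc_temp_c=25.0):
    """Linear derate of module power with cell temperature.

    p_stc_w: rated power at standard test conditions (W)
    gamma_pct_per_c: temperature coefficient of Pmax (% per °C, normally negative)
    """
    return p_stc_w * (1.0 + gamma_pct_per_c / 100.0 * (cell_temp_c - stc_temp_c))


# Hypothetical 400 W module with a -0.35 %/°C coefficient (illustrative only).
cool = power_at_temperature(400.0, -0.35, 45.0)
warm = power_at_temperature(400.0, -0.35, 46.0)  # cell runs 1 °C hotter
print(round(cool, 1), round(warm, 1))
```

Under these assumptions, each extra degree of cell temperature costs about 1.4 W per module, which adds up across a large array and a full year.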
Akanksha Bhat, SolarAnywhere product manager at Clean Power Research, says there’s been an increase in year-to-year variance in solar resource. “We’ve seen more annual variability in the eastern United States, and also the Pacific Northwest, and a couple of factors that could be causing that could be more cloud formation and more tropical storms and rainfall in these areas.”
Solar resource is measured by global horizontal irradiance (GHI), the total solar radiation received on a horizontal surface. GHI combines the direct normal irradiance (DNI), projected onto the horizontal plane, with the diffuse horizontal irradiance (DHI) scattered by the sky. Ground-reflected radiation is added separately when modeling tilted arrays. Meteorological datasets also include humidity, temperature, wind and other factors.
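The relationship between the three irradiance components is commonly written as GHI = DNI × cos(solar zenith angle) + DHI. A minimal sketch follows; the function name and the input values are illustrative assumptions, not taken from any dataset.

```python
import math


def ghi_from_components(dni, dhi, zenith_deg):
    """Estimate GHI (W/m^2) from DNI, DHI and the solar zenith angle in degrees."""
    # When the sun is below the horizon the direct beam contributes nothing.
    cos_zenith = max(math.cos(math.radians(zenith_deg)), 0.0)
    return dni * cos_zenith + dhi


# Illustrative values: 800 W/m^2 direct beam, 100 W/m^2 diffuse, sun 60 degrees from vertical.
print(round(ghi_from_components(800.0, 100.0, 60.0), 1))
```

Checking that a weather file's GHI, DNI and DHI columns roughly satisfy this relationship is a quick way to spot mislabeled or misaligned data.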
Bhat said that air pollutants such as haze, ash and dust, measured by aerosol optical depth (AOD), are also changing and impacting solar irradiance. “In the eastern United States, there has been a reduction in AOD, and that can be attributed to coal plant retirements and stricter emission standards. So, we’ve seen the AOD going down and sunshine going up because of that. In the western United States, on the other hand, we’ve seen the AOD increase, and that could be attributed to more extreme events like wildfires,” she said.
How significant are the changes? The National Oceanic and Atmospheric Administration’s U.S. Climate Normals study, which it releases every 10 years, showed temperatures in the last 30-year period typically were more than one degree Fahrenheit higher than the 20th century norm. And while extreme weather events such as wildfires can’t be predicted, they are happening more often, and their impact can be significant. A National Center for Atmospheric Research study found that in 2020, smoke in California was so heavy some days that peak production was cut by 10 to 30%.
What this means for weather datasets is that the latest data is the greatest data. Even though no predictive model knows when extreme weather events will happen, any dataset that generalizes too much or leaves out the most recent years will give a false picture.
The first rule in choosing weather data sets:
Select a source that reflects recent temperature rises and increased variability.
Bhat warns that some of the free datasets don’t reflect current information. “Having a more up-to-date dataset generally means that you are able to account for some of the recent trends that you’re seeing.”
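One rough way to see whether recent years differ from the long-term record is to compare the mean of the last decade with the mean of the full history. The sketch below assumes you already have a list of annual GHI totals ordered oldest to newest; the numbers are made up purely to show the calculation.

```python
from statistics import mean


def recent_vs_long_term_pct(annual_values, recent_years=10):
    """Percent difference between the recent-period mean and the full-record mean."""
    long_term = mean(annual_values)
    recent = mean(annual_values[-recent_years:])
    return (recent - long_term) / long_term * 100.0


# Made-up annual GHI totals (kWh/m^2), oldest first, for illustration only.
history = [1600.0] * 10 + [1680.0] * 10
print(round(recent_vs_long_term_pct(history), 2))
```

A dataset that stops a decade ago cannot show this kind of shift at all, which is the risk Bhat describes.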
More granular means better accuracy
While weather is getting wilder, weather data is getting wiser.
Older sources, which derive data from nearby weather stations or use a statistics-based Typical Meteorological Year (TMY), lack the granularity that current datasets offer. For example, the National Renewable Energy Laboratory’s (NREL’s) Physical Solar Model (PSM V3) uses geostationary satellites that provide higher resolution, with two- or four-square-kilometer granularity at five- or 10-minute intervals. Commercial providers, such as SolarAnywhere, offer even higher resolution with half- or one-kilometer square measurements.
There’s variation in data, even among the more granular datasets available. Bruno Wittmer, scientific collaborator at PVsyst, said a past study showed data from different weather data providers varied by a few percent. “The simulated PV generation is, in first approximation, proportional to GHI, so any variation of this value between providers will be seen directly in the simulation results.”
Not all analyses need granular time data. “For annual anticipated production from a planned PV plant, hourly data should be adequate,” said Nate Blair, group manager of the distributed systems and storage group at NREL. “For analysis which might include analyzing clipping or electrical impacts (voltage, frequency, etc.) or short timestep impacts on connected batteries, then solar data with much shorter timesteps is often valuable.”
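The clipping case Blair mentions can be illustrated with a toy calculation. The sketch below is a simplification under stated assumptions: it treats array DC power as if it passed straight to the inverter with no conversion losses, and the 100 kW inverter limit and the minute-by-minute power values are hypothetical.

```python
def clipped_energy_kwh(power_kw, ac_limit_kw, step_hours):
    """Energy delivered when each timestep's power is capped at the inverter limit."""
    return sum(min(p, ac_limit_kw) for p in power_kw) * step_hours


# Hypothetical hour with passing clouds: 30 minutes at 120 kW, 30 minutes at 40 kW.
minute_power = [120.0] * 30 + [40.0] * 30
# The same hour as a single hourly average (80 kW), which never reaches the limit.
hourly_power = [sum(minute_power) / len(minute_power)]

ac_limit = 100.0  # hypothetical inverter AC rating, kW
minute_energy = clipped_energy_kwh(minute_power, ac_limit, 1.0 / 60.0)
hourly_energy = clipped_energy_kwh(hourly_power, ac_limit, 1.0)
print(round(minute_energy, 1), round(hourly_energy, 1))
```

In this made-up hour, the hourly average hides the clipping entirely and overstates delivered energy, while the one-minute data captures it.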
The need for granular data also varies based on the modeling software. For example, Wittmer said the PVsyst simulation “runs in hourly steps, which means that going to sub-hourly data will not bring any improvement in accuracy,” but noted this will change in the medium term.
Bhat pointed out one other data source: many utility-scale providers have ground measurements from on-site pyranometers. However, those datasets go back only as far as the site has been active, so she indicated they are not sufficient on their own. “You want to look back in history as much as possible to see how weather has been changing because climate changes over decades, not years.” SolarAnywhere’s dataset goes back to 1998. “You can look at the entire history, in fact that’s a key aspect of having a data set that’s bankable.”
The second rule of choosing data:
Understand the source and the use. There will be times when acquiring the most granular dataset makes a difference, and others when it is less critical.
As for datasets that look ahead? We’re not there yet. Weather datasets will continue to evolve as data collection and predictive models improve. As Blair said, “There are some experiments happening with combining climate change modeling projections with hourly solar data, but most of that data currently is in the research stage.”
Does this data make my production look fat?
There are many reasons why solar assets underperform, but picking an advantageous weather dataset is a surefire recipe for missing production targets. The challenge is that many in the industry do just that.
“Developers are incentivized to shop their P50 estimates, engineers are incentivized to provide higher estimates, and as a result we end up with some unrealistic expectations about performance,” said Ben Browne, data science manager, specialty insurance at kWh Analytics. While weather data selection is just one input that can lead to those inflated estimates, that inflation is common. In its 2022 Solar Risk Assessment report, kWh Analytics found that “an overwhelming majority of the projects (85%) had aggressive P50 estimates,” as we reported late last year.
Solar lenders, investors and insurers have learned to question production estimates because optimistic assumptions undermine the potential for a project to produce as predicted. Inflated estimates may juice the short-term value of an individual asset, making it easier to finance or sell, but they damage investor confidence in the solar asset class.
The resi and C&I markets aren’t immune to the temptation of inflating estimates. Shortening system payback can give an installer an edge over other quotes. However, a customer complaining that their system never produced what they were promised is not a good look for any installer or the industry.
The third rule of choosing a weather dataset:
Resist the temptation to choose data that inflates output.
Browne said solar professionals should be wary of consciously or unconsciously skewing results. “The weather file should be chosen according to a predetermined methodology to avoid the temptation of choosing the data source with the highest irradiance.”
The end user of production estimates dictates how confident the estimate needs to be. For many uses, P50 (a production estimate that has a 50% chance of being exceeded) does not offer enough confidence. Perhaps because of the tendency to inflate yield estimates, most lenders, investors and insurers base their risk analyses on P90, an estimate with a 90% chance of being exceeded.
Regardless of the degree of confidence needed, if the models use the most favorable weather data, the output will be hard to trust.
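For readers who want to see how P50 and P90 relate, here is a minimal sketch. It assumes annual production is normally distributed around the P50 value with a known overall uncertainty, which is a common simplification rather than a method described by anyone quoted here. The 1,000 MWh P50 and the 5% uncertainty are hypothetical.

```python
from statistics import NormalDist


def exceedance_estimate(p50, sigma_fraction, exceedance_prob):
    """Production level exceeded with the given probability, assuming a normal distribution.

    sigma_fraction: one standard deviation as a fraction of P50 (e.g. 0.05 for 5%).
    exceedance_prob: 0.90 for P90, 0.99 for P99, and so on.
    """
    z = NormalDist().inv_cdf(1.0 - exceedance_prob)  # negative for probabilities above 0.5
    return p50 * (1.0 + z * sigma_fraction)


# Hypothetical project: P50 of 1,000 MWh per year with 5% total uncertainty.
p90 = exceedance_estimate(1000.0, 0.05, 0.90)
print(round(p90, 1))
```

The sketch also shows why an inflated P50 is so damaging: every other exceedance level is scaled from it, so the P90 that a lender relies on is inflated by the same proportion.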
Data due diligence
With a variety of private and public data sources available, choosing the right weather data for modeling requires some due diligence.
Running performance assessment models with a variety of both public and private datasets is a good idea, Blair said. “That way, they are reporting a really large range of possible inter-annual variation to the potential buyer and from various sources of solar resource estimate methodologies. Picking just a single TMY for just one grid cell likely won’t tell as robust of a range of outcomes.”
The model that uses the weather database can also have an impact. “To get more realistic estimates, developers will have to closely vet their modeling vendors,” Browne said. He recommends looking for models that have been back-tested against historical performance. Where that’s not available, “evaluate whether the assumptions in the model are overly optimistic. For example, P50 estimates should account for the impact of subhourly clipping,” he said.
The fourth rule of choosing a weather dataset:
Sanity test your outputs by trying different datasets.
This due diligence is more important now than ever before, with a proliferation of modeling software and meteorological datasets, a rise in temperatures and extreme weather events, and a growth of technology that delivers more granular weather data.
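The sanity test in the fourth rule can be as simple as running the same system model against several weather sources and looking at the spread. In the sketch below, the source names and annual yield figures are placeholders, not results from any real provider.

```python
def spread_report(annual_yield_by_source):
    """Summarize how much modeled annual yield varies across weather data sources."""
    values = list(annual_yield_by_source.values())
    average = sum(values) / len(values)
    return {
        "mean": average,
        "lowest": min(annual_yield_by_source, key=annual_yield_by_source.get),
        "highest": max(annual_yield_by_source, key=annual_yield_by_source.get),
        "spread_pct": (max(values) - min(values)) / average * 100.0,
    }


# Placeholder annual yields (MWh) from three hypothetical weather sources.
estimates = {"dataset_a": 1520.0, "dataset_b": 1485.0, "dataset_c": 1560.0}
report = spread_report(estimates)
print(report["highest"], round(report["spread_pct"], 1))
```

If the estimate you plan to report always comes from the source flagged as highest, that is a sign the selection methodology deserves a second look.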
The bottom line
As temperatures rise and weather becomes more variable, increasingly detailed datasets will feed into more sophisticated models to make more accurate estimates. However, models are just that: models. They are not crystal balls, but they are the best tool for understanding the value of solar assets. Getting your modeling right boils down to four steps: ensure you are using relevant and recent weather data, match the data set to the modeling need, resist the temptation to skew toward sunshine, and check your work by seeing what different data sets and models tell you.
Dej Knuckey is a contributor to Solar Builder. She’s a journalist, author and freelance writer who has covered energy for publications in Australia and the United States.