Among the more attractive tools on real estate search websites these days, for prospective homebuyers and sellers alike, are those that provide automated house value estimates. In today's world the thirst for real-time information is almost unquenchable. Yet arming consumers with such tools without disclaimers about the limitations of the underlying models can magnify price volatility in housing markets and hurt house sales.
Behind online home value estimators is a host of complicated statistical models that attempt to predict a home's price by comparing it with similar properties (comparable sales), using county property record data on attributes such as square footage and the number of bedrooms and bathrooms, along with features unique to that property. The aggregation and updating of property record data from thousands of offices around the country has improved over the years. However, significant data lags of six months or more still exist in many areas, depending on each record office's ability to update this information.
Moreover, a danger in using this data is that the quality of information captured by these recording offices varies a great deal. Square footage can be wildly off, as can other recorded property characteristics. These models have been around for a number of years and during the boom years were used extensively behind the scenes by banks (including those I worked for) as well as Fannie Mae and Freddie Mac as a way of reducing processing costs. During these years, regulatory agencies identified a number of significant limitations surrounding the use of these models. The models tend to work better on homes of similar type and quality but break down quickly on custom homes, homes in rural areas or small neighborhoods, and properties with nonstandard features. Further, no statistical model can understand the market and condition issues of a particular property. For example, these models cannot factor in property upkeep or whether the property sits directly across the street from a gas station, which could reduce its value.
Of some concern for regulators is the tradeoff between the percentage of homes where an automated valuation model could be used (known as AVM coverage) and the accuracy of the estimated value. Typically, the more broadly the model is used across property types, the higher the valuation errors. Those tradeoffs remain, and as we have learned from the crisis, becoming overly reliant on modeled outcomes leads to a misunderstanding of risk.
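The coverage-versus-accuracy tradeoff can be made concrete with a small sketch. Everything in it is assumed for illustration: the sample records, the idea that each estimate carries a confidence score, and the function name `coverage_and_error` are hypothetical and do not describe any actual AVM. The point is only that loosening the confidence cutoff lets the model value more homes while the typical error grows.

```python
from statistics import median

# Hypothetical records: (model confidence score, AVM estimate, actual sale price).
# All figures are invented for illustration only.
SAMPLE = [
    (0.95, 302_000, 300_000),
    (0.90, 247_000, 250_000),
    (0.85, 410_000, 400_000),
    (0.75, 188_000, 200_000),
    (0.70, 345_000, 320_000),
    (0.65, 520_000, 600_000),
    (0.55, 150_000, 190_000),
    (0.40, 700_000, 500_000),
]


def coverage_and_error(records, min_confidence):
    """Return (coverage, median absolute percentage error) for the
    properties whose confidence score meets the cutoff.

    Coverage is the share of all properties the model is allowed to value.
    The error is None when no property meets the cutoff.
    """
    kept = [(est, actual) for conf, est, actual in records if conf >= min_confidence]
    coverage = len(kept) / len(records)
    if not kept:
        return coverage, None
    errors = [abs(est - actual) / actual for est, actual in kept]
    return coverage, median(errors)


# Loosen the cutoff step by step and watch both numbers move.
for cutoff in (0.90, 0.65, 0.40):
    cov, err = coverage_and_error(SAMPLE, cutoff)
    print(f"cutoff {cutoff:.2f}: coverage {cov:.0%}, median error {err:.1%}")
```

With this made-up sample, a strict cutoff values only a quarter of the homes but with small errors, while accepting every home raises the median error several times over.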
In the specific application of online valuation engines, buyer offers can be highly influenced by the estimates generated from these models. In forming an offer, a buyer needs to obtain a reasonable view of what a home is worth. The values displayed by online real estate sites provide an easy way for a buyer to develop an offer.
Consider a scenario where, unknown to a buyer, the comps used by a valuation model are as much as six months old and the county has not recorded the 1,000 square feet added to the home the year before. Even though prices for this hypothetical property have been picking up briskly over the last couple of months, the data feeding the online valuation model does not reflect current market conditions. Further, the valuation model used by the buyer sets a low accuracy tolerance for its use. In other words, to spit out a number it has to be only 65% sure that the estimate doesn't exceed the actual value of the property by more than 15%. Such a low threshold allows more properties to be evaluated by the model, but at the expense of a higher error rate on the valuation. Granted, that particular quirk might partly offset the downside bias from the data lag. Even so, the net effect is a valuation estimate that is 20% below true market value because of these data limitations. The buyer has no idea of these issues, and so a low bid would be the natural outcome.
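To put numbers on this scenario, here is a minimal sketch. The 65% threshold and the 20% shortfall come from the scenario above; the $400,000 true market value, the 70% confidence figure and both function names are assumptions made purely for illustration, not a description of how any real valuation engine is built.

```python
def publishes_estimate(prob_within_tolerance, min_probability=0.65):
    """Hypothetical gate: display a value only if the model is at least
    `min_probability` sure that the estimate does not exceed the actual
    value by more than its tolerance (15% in the scenario)."""
    return prob_within_tolerance >= min_probability


def estimate_after_shortfall(true_value, shortfall_pct=20):
    """Estimate implied by a given percentage shortfall below true value.
    Integer arithmetic keeps the dollar figures exact."""
    return true_value * (100 - shortfall_pct) // 100


true_value = 400_000  # assumed true market value, for illustration only
estimate = estimate_after_shortfall(true_value)
gap = true_value - estimate

# A model that is only 70% sure still shows a number under the 65% rule,
# but would stay silent under a stricter, assumed 90% rule.
shown_at_65 = publishes_estimate(0.70)
shown_at_90 = publishes_estimate(0.70, min_probability=0.90)

print(f"estimate shown to buyer: ${estimate:,}")
print(f"gap versus true value:   ${gap:,}")
print(f"shown at 65% threshold: {shown_at_65}; at 90% threshold: {shown_at_90}")
```

On these assumed figures, a buyer anchoring on the displayed estimate would start $80,000 below what the home is actually worth, with nothing on the screen to suggest the number is stale.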
Such outcomes create an artificial drag on house values during recovery periods and amplify price appreciation trends during boom periods, given potential data lags in market pricing. In addition, the use of such valuations in forming bids can delay real estate transactions or prevent them from being consummated, given the potentially large gaps between sale and offer prices that these estimates produce.
AVMs may have a limited place in the real estate and mortgage business, but they remain a black box to homebuyers and most market participants, susceptible to a host of data weaknesses despite their technical sophistication. Consumers should receive proper disclosures about such models, and limits should be placed on their use, beyond the meager ones that exist today. This should be one area of focus for the Consumer Financial Protection Bureau.
Clifford Rossi is the Professor-of-the-Practice at the Robert H. Smith School of Business at the University of Maryland.