Wheelhouse Comparable Sets
Table of Contents
How we determine which units are competing with you for guests
In the STR space, automatically detecting an accurate comparable set can be quite challenging. This is due to the highly variable supply of the STR space, that sometimes creates non-intuitive comparable sets.
For example, it is not unusual to have a 1 BR unit that can sleep 4 (or even more!) guests. Due to this, it is no surprise to find that an ‘accurate’ comparable set for this unit would include 1BR, 2BR, or even 3BR units.
Additionally complicating matters is the fact that many ‘similar’ nearby units may not actually be truely comparable. (Said differently, not all 1BR places are the same!) This can be due to neighborhood boundaries, differing amenities, reviews, or more. For example, if your unit has a pool, your comparable set should likely include (at least in the summertime!) nearby units that also have pools.
Given this complexity, we have found that the most effective Comparable Set can be determined by determining which units have the most similar combination of features in terms of their value on the market and resulting booking patterns.
How attributes are transformed to similarities
In order to understand similarity, we need to transform the raw attributes (like number of bedrooms, room type, location, etc) into a different form in which a comparison is more adequate. A potential guest usually looks at four important factors (and their trade-offs) to make a decision:
As discussed above, the attributes of a unit or a host are hard to compare directly. However, we have a way to measure the similarity of unit attributes through the Base Price Model. If the attribute is less important to guests as a differentiator, the impact on the average booking price will be low. At the same time, attributes that matter a lot for the decision will significantly change the price guests are willing to pay.
The Base Price Model is a regression model that is trained based on actual market performance data, including the achieved prices and occupancies. As such it presents a transformation from the unit attribute space to the market value space and performs a dimensionality reduction for the comparison analysis.
Another way to look at it is that the Base Price Model imposes an implicit distance metric on the unit attributes. A standard approach to measure distances in high dimensional data points is to take the individual differences per dimension and sum up those differences for the total distance. However, it is not clear how a difference in bedrooms of two compares to a difference in cleaning fees by $20 or to a non-scalar attribute like having/not having a pool. The Base Price Model learns how these can be compared and thus provides a metric.
Next to the unit itself, it is important how the unit is managed by the host and how guests are perceiving the unit and are willing to book it. To capture this factor, an Occupancy Model is formed which is trained in a similar fashion as our Base Price Model. In this case it predicts the occupancy of a unit rather than its price and hence reflects both the host performance and the guest impression.
The Occupancy Model is then used just like the Base Price Model as a similarity measure to compare the booking performance of different units.
For the final distance measure between two units, we combine the predicted base price, the predicted occupancy, and the geographical distance. To be considered as part of the automaticaly detected Comparable Set, the total relative diference in predicted base price and occupancy between two units is required to be below a threshold.
The threshold is relative to the local density of units to reflect the choices a potential guest has. If a location has only few units to offer, the Comprabale Set is a little more lenient, just like a guest would be less strict about attributes with limited supply. Vice versa, in a dense area with a lot of units, the set will be more strict to pick the most similar units.
In addition, units are only considered as comparable if they are not too far away geographically. Again, the threshold depends on the market as well as the local unit density. A rural market will usually consider larger distances than an urban market and sparse neighborhoods will be more lenient than dense areas with many units.
Note: Keep in mind that the desireability of specific neighborhoods is already implicitly covered by the Base Price and the Occupancy Models. See our section on spatial kriging for more details.
How a Comparable Set is chosen in practice
To explore this Comp Set model, let us take a sample ‘private room’ in downtown San Francisco, with zero bedrooms, i.e. a studio, and one bathroom that sleeps two.
In this case, our Comp Set model identifies a comparable set of 90 units. Among this comparable set, 68 units are also zero bedrooms, while the other 32 have one bedroom. And, while 74 of the units are also classified as ‘private rooms’, 16 of the units are entire homes/apartments.
The results are shown in the following visual, which depicts each nearby unit as a singular dot.
As you can see, some of these nearby units are considered too far away, geographically, to be truly comparable (these units are the ‘red circles’).
For the next set of more proximal units, our model removes units for which the base price and the expected occupancy do not align (these units are the ‘blue squares’). However, when both distance and similarity are a ‘match’, our model includes this unit in a default ‘comparable set’ (yellow circles, clustered at the center).
Pretty cool! But… ready to dive in even deeper?
Let us further our comp set analysis by examining the distribution of the two key metrics, base price and occupancy, for the identified comparable units. As we can see on the first chart, our Base Price graph shows where, amongst a comp set, this unit’s Base Price falls. In the second chart, we can see that the distribution around occupancy is actually considerably broader, though still similar, to our sample unit.
And, while some units show a larger deviation in one of these metrics (e.g. base price), these units, in turn, are more similar in the other metric (e.g. occupancy).
The third chart below offers one more vantage point to see how our Comp Set model works.
In short, when we examine the distance metric, you can see that while our comparable set model clearly favors units that are close by, it also considers some units farther away, if they show very similar performance metrics.
In total, Wheelhouse’s Comp Set model analyzes and identifies both a broad and narrow set of comparable units. Over the years, our data has illustrated that it is best to price against a larger set of ‘potential comparable units’, as opposed to a smaller set of ‘certain comparable units’.
Additionally, due to this approach, we can show customers a broader set of ‘potential comparable units’, which can be useful in providing a broader range of insights when either (a) comparing performance metrics or (b) deciding on a pricing strategy.