Tuesday, November 17, 2009

Hyper-realism and Continued Judgement by Probability

We're all familiar with judgement by probability. So familiar, in fact, that many of us don't realise the extent to which it is our default way of making decisions - and never stop to ask whether using it so often is wise.
So what is it exactly?
An example would be choosing a restaurant based on how popular it is. Our reasoning is that, in the absence of first-hand knowledge, this is probably a good choice. I doubt that any of us think it likely that a bad restaurant could be popular.
Another example would be choosing a product in the middle price range, figuring that it is the least likely to be overpriced or of poor quality. We are using an assessed probability to make a judgement.
We often hear the advice that this is what we should do when we don't have expert knowledge ourselves. I think it should surprise us that the same advice is often given as an indicator of expert strategy.
This raises the question: when should anyone not follow this rule? How often is it not the best choice?
Well, in many situations the concept is based on the idea that we, at that time, are the only ones choosing in that way - i.e., that most other people are indicating a value they independently recognised, presumably from prior experience.
But this isn't always the case. In my city there is a certain restaurant which is busy and successful, yet it only gets bad reviews and everyone who goes swears never to go there again. It succeeds because it has one unmatchable feature and everyone goes once for that. My city is large enough that one visit per lifetime from every adult is still plenty of trade for the restaurant to work through for years to come.
Another anomaly is profiling, be it for fraud detection or customer potential. Here an analysis is done, say, to find the most common income bracket among a product's buyers. Once chosen, that group will be further targeted - the rationale being that this gives the best probability of campaign value.
[Aside: One fairly obvious place where this doesn't work is in a saturated market. There, it might be more productive to chase the customers unlike those you already have. For the purpose of discussion let's presume that the world in which we are choosing is large and that our choice has only a small effect.]
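To make that profiling step concrete, here is a minimal sketch in Python, with made-up purchase records and hypothetical bracket labels, of picking the most common income bracket among past buyers and making it the target group:

```python
from collections import Counter

# Hypothetical purchase records: (customer_id, income_bracket)
purchases = [
    (1, "20-40k"), (2, "40-60k"), (3, "40-60k"),
    (4, "60-80k"), (5, "40-60k"), (6, "20-40k"),
]

# Profile: pick the single most frequent bracket among past buyers...
bracket_counts = Counter(bracket for _, bracket in purchases)
target_bracket, _ = bracket_counts.most_common(1)[0]

# ...and target only prospects in that bracket from now on.
print("campaign will target the", target_bracket, "bracket")
```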
Is choosing the most probable really a good strategy?
There are two different types of situation in this regard, which I will label:
- recursive popularity;
- range frequency.
In recursive popularity the problem is that sometimes the only reason some choice is already more popular than the others is... because it was already more popular than the others. In the early stages there may not have been any compelling reason at all.
Therefore in choosing it now you are really only selecting the one with a lucky history.
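A toy simulation shows the effect. It assumes, deliberately, that the only thing driving each new choice is existing popularity - a stripped-down "rich get richer" model, not a claim about any real market:

```python
import random

def simulate_popularity(n_options=5, n_customers=10_000, seed=1):
    """Each new customer picks an option with probability proportional
    to how many customers chose it before. All options start identical,
    so any eventual gap is down to luck in the early rounds."""
    random.seed(seed)
    counts = [1] * n_options              # every option starts with one token customer
    for _ in range(n_customers):
        choice = random.choices(range(n_options), weights=counts)[0]
        counts[choice] += 1
    return counts

print(simulate_popularity())  # typically one option ends up far ahead of the rest
```

Run it with different seeds and a different option wins each time - which is the point: the eventual "most popular" choice carries no information beyond its lucky history.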
With range frequency, the characteristic is something which has a natural range of values. In people an example would be height. [Aside: you might be surprised how often people make selections biased by height.] A classic example of this is using an economic metric - of a business, or a customer - to bias a selection. Imagine any such metric. Most likely it will have something akin to a normal distribution. There can be many reasons why those parts of the population away from the norm are abnormal in other respects too.
If we use height to predict clumsiness we might effectively just discover that it's easy to find tall clumsy people, because they hit their heads on doorways more than short clumsy people do. So if our way of detecting clumsiness is to look for forehead bruises then yes, we will probably find that preselecting tallish people is more productive than random selection.
[We may also find it better to concentrate on the averagely tall rather than just start from the tallest and work down. This might be because there are not many very tall people, and they have learned to duck their heads in doorways.]
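Here is a sketch of that height-and-clumsiness example, using invented numbers purely for illustration: clumsiness is generated independently of height, but the "detector" (forehead bruises) only fires for clumsy people tall enough to hit the door frame, so preselecting the tall looks productive even though it misses most clumsy people:

```python
import random

random.seed(0)
N = 100_000
people = []
for _ in range(N):
    height = random.gauss(170, 10)     # height in cm, roughly normally distributed
    clumsy = random.random() < 0.10    # clumsiness assigned independently of height
    # Our 'detector' is forehead bruises, which need both clumsiness and
    # enough height to actually hit the door frame.
    bruised = clumsy and height > 185
    people.append((height, clumsy, bruised))

tall = [p for p in people if p[0] > 180]            # preselect the tallish
rate_tall = sum(p[2] for p in tall) / len(tall)
rate_all = sum(p[2] for p in people) / len(people)  # random-selection baseline
found = sum(p[2] for p in people)
total = sum(p[1] for p in people)

print(f"bruise rate among tallish preselection: {rate_tall:.2%}")
print(f"bruise rate under random selection:     {rate_all:.2%}")
print(f"clumsy people the bruise test finds:    {found} of {total}")
```

The preselection does look better than random selection - but the last line is the catch: most clumsy people never show a bruise at all.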
This idea of selecting away from the norm has been a catchy meme in some quarters, especially as it seems "targeted" yet conveniently selects a suitable number of potentials who are inherently a bit _different_.
The problem here is that reverse thinking is at work. Even if the subset is truly productive, the unasked question is: where are most of the population that we're seeking? I suspect that many of them remain unnoticed amongst the throng nearer the norm.
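A quick back-of-envelope calculation makes the suspicion concrete. The rates and population sizes here are made up; only the shape of the argument matters. Even when the group away from the norm is richer per head in whatever we're seeking, the sheer number of people near the norm can mean that is where most of the sought population actually sits:

```python
# Invented figures: 2% of people sit in the 'different' tail, 98% near the norm.
tail_pop, norm_pop = 2_000, 98_000
# Per-head rate of the trait we want: three times higher in the tail.
tail_rate, norm_rate = 0.30, 0.10

tail_hits = tail_pop * tail_rate    # 600 found by chasing the tail
norm_hits = norm_pop * norm_rate    # 9,800 sitting unnoticed near the norm

print(tail_hits, norm_hits)
```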
These issues become even more pertinent where analysis is trying to uncover the results of human guile. The best place to hide a leaf is in the forest - a principle well known to those with something to hide.