It seems I'm not the only one disturbed by Amazon's insensitive plug for their new apparel store. At Better Living Through Software the very same suggestion I was exposed to is also ridiculed.
And there's also an article in The Wall Street Journal about profiling gone wrong.
The examples from that article all seem to be about cultural overfitting in the profiling models. Some random selection by a purchaser is used as a high-valued proximity generator in search space, and suggestions for like-minded but fatally off-target titles are the result.
One has to wonder why this happens to these people. Granted, the stickiness of a particular search session on Amazon can be annoying (browse one unrelated item and idiotic links get added to the conversational state of the session for its entire duration). In general, though, I don't experience this kind of thing very much. But then I don't have a TiVo, of course - and would much rather have a ReplayTV if I had the choice.
Among the possible reasons are:
- No model for failing to match, and so no notion of a fit that is too good. A Bayesian model that does not somehow measure when the fit is too tight to the data will quickly become ridiculous (see the sketch after this list)
- Limited data on the purchaser. If the profile of a particular purchaser only includes a few items, the suggestion mechanism will have very low reliability
- Limited connectedness of the problem space, or very bad metric properties of the problem space. The problem space has to be relatively smooth, and I would suggest the arts in general could have very bad continuity properties. Some of the examples would indicate this too. If you are thoroughly mystified by this statement, it just means that small tweaks in preference space look to the individual user like a very striking move away from that person's own preferences. This would be a popular theory - people are complex, sensitive beings and you can't meter them out like a statistic - but the sad fact is that models of who we are, and of how our beliefs match our ideas about good taste, do work as far as I know
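To make the first two points concrete, here is a minimal sketch - not Amazon's actual algorithm, and every name and number in it is made up - of how a Bayesian co-purchase estimate can avoid over-trusting sparse evidence: instead of using the raw ratio count(A and B) / count(A), shrink the estimate toward a baseline popularity with a pseudo-count prior. With only a handful of observations the prior dominates and the suggestion stays conservative; with lots of evidence the data takes over.

```python
def smoothed_copurchase_prob(co_count, item_count, baseline, prior_strength=10.0):
    """Estimate P(buy B | bought A), shrunk toward B's baseline popularity.

    co_count       -- number of customers who bought both A and B
    item_count     -- number of customers who bought A
    baseline       -- overall fraction of customers who bought B
    prior_strength -- pseudo-observations backing the baseline
    """
    return (co_count + prior_strength * baseline) / (item_count + prior_strength)


if __name__ == "__main__":
    # One shopper who bought a programming book also bought a cookbook.
    # The raw estimate would be 1/1 = 100%; the smoothed estimate stays
    # near the cookbook's baseline popularity of 2%.
    print(smoothed_copurchase_prob(co_count=1, item_count=1, baseline=0.02))    # ~0.11
    # With 80 co-purchases out of 100 buyers, the data dominates the prior.
    print(smoothed_copurchase_prob(co_count=80, item_count=100, baseline=0.02)) # ~0.73
```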
I think the first and second explanations are the most probable, and that the third only matters inasmuch as, when I'm browsing for technical literature, there are very few strong personal feelings attached to the individual book selection. Furthermore, technical books tend to cover a subject more evenly, so any failure to match true preferences isn't nearly as intrusive as it would be to receive a suggestion to buy a ridiculous band like the now-defunct Guns'n'Roses just because I happen to like Iggy Pop.
The failure to model overfitting shouldn't be dismissed, though.
A pure Bayesian 'most likely secondary purchase' model on a per-book basis would probably fail miserably.
Due to the relative sparseness of the purchase space compared to the true number of dimensions (3 million books in print), you definitely need to employ some technique to avoid fitting the noise. A principal components analysis of the search space is probably a good idea, and good incremental algorithms exist to compute one.
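Here is a minimal sketch of that kind of incremental principal-components reduction, assuming scikit-learn's IncrementalPCA - purely illustrative, not whatever Amazon actually runs, and the sizes are toy numbers. Purchase histories arrive in batches, the components are updated without holding the whole customer-by-title matrix in memory, and recommendations can then be made by comparing customers in the low-dimensional "taste" space rather than in the raw, noisy purchase space.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

N_TITLES = 5000        # stand-in for the millions of books in print
N_COMPONENTS = 20      # latent taste dimensions to keep

ipca = IncrementalPCA(n_components=N_COMPONENTS)

rng = np.random.default_rng(0)
for _ in range(10):                               # ten batches of customers
    batch = (rng.random((200, N_TITLES)) < 0.01)  # sparse 0/1 purchase rows
    ipca.partial_fit(batch.astype(float))         # update components incrementally

# Project one customer's purchase history into taste space; nearby
# customers there are candidates for "people like you also bought".
customer = (rng.random((1, N_TITLES)) < 0.01).astype(float)
taste_vector = ipca.transform(customer)
print(taste_vector.shape)   # (1, 20)
```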
It would be interesting to know how people handle overfitting when the only observations that really make any sense are the successes. It would make sense to assume a general decay of success for a particular association of titles, and then let the successes reinforce the associations that aren't failures.
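A rough sketch of that decay-plus-reinforcement idea, with all names and constants invented for illustration: every association between titles loses a little strength each period, and only observed successes (a suggested title actually bought) add strength back, so associations that never pay off fade away without any explicit "failure" observation.

```python
DECAY = 0.99      # per-period multiplicative decay of every association
REWARD = 1.0      # strength added when a suggestion leads to a purchase

associations = {}  # (title_a, title_b) -> strength

def decay_all():
    """Apply the periodic decay to every stored association."""
    for pair in associations:
        associations[pair] *= DECAY

def record_success(title_a, title_b):
    """Reinforce the association when suggesting title_b after title_a worked."""
    associations[(title_a, title_b)] = associations.get((title_a, title_b), 0.0) + REWARD

def best_suggestion(title_a):
    """Pick the strongest surviving association for title_a, if any."""
    candidates = {b: s for (a, b), s in associations.items() if a == title_a}
    return max(candidates, key=candidates.get) if candidates else None

if __name__ == "__main__":
    record_success("Iggy Pop - Lust for Life", "The Stooges - Fun House")
    for _ in range(100):
        decay_all()
    print(best_suggestion("Iggy Pop - Lust for Life"))  # still there, just faded
```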
Posted by Claus at November 27, 2002 01:00 AM