What? Aren't computers perfect?

I'm so excited someone has finally experimentally recreated what we've all known for years! The most common cause of traffic jams is people driving like idiots!

Anyway, I'm not naïve enough to think that our results should be treated as gospel. The selection committee is made of people. As much as we can try to model their behavior, people are inconsistent and sometimes subjective. I have no doubt that we'll whiff on some of their choices. However, it can be useful to try to understand where our mistakes are and how we can correct them, if indeed we can. (Plus there's no sense in arguing with the ilk of writers who think that the BPs and Bill Jameses of the world are enslaved to what the computer spits out.)

One way is to review performance from previous years and look for patterns in the big mistakes. I've incorporated some of those changes into this year's model. Of course, we won't know how that worked until this Sunday.

Unfortunately, there are some aspects of the process that would be difficult, if not downright impossible, to model accurately.

Injuries

We know the committee takes injuries into consideration, but the problem is that each case is unique and subjective. Trying to include that in the model would be tricky and may cause more problems than it solves. Lack of historical data to use in comparisons is also a problem. And what if a team performs better while a key player is missing - are they penalized when the player returns?

As a result, I haven't even tried to include injuries and I don't foresee doing so any time soon. Fortunately, there aren't many Kenyon Martin situations where injuries have a significant impact.

What about this year? Here are a few cases we might mishandle because of injuries. (See this post if you're not familiar with the sparklines used here.)

Dayton - Probably the most affected and most difficult for the committee to judge. They started 14-1 with Chris Wright but are 7-8 without him, including a number of bad losses. If he can't return to give the committee a chance to see him, they'll likely judge Dayton more on the latter than the former.
Pitt - The Panthers lost Mike Cook and Levance Fields in successive games at the end of a 10-1 start. They were a good (though inconsistent) 7-4 in conference without both, which is probably enough to make the field. Since Cook will not be returning and Fields has been back since mid-February, the effect may not be as bad as I fear. In fact, Pitt is 4-4 since Fields' return, though the losses were against four of the top six in the Big East. Seeding will be our biggest problem evaluating Pitt.
Arizona - National POY candidate Jerryd Bayless missed four games, three of which the Wildcats lost. I don't think they'll get much consideration from the committee on these games because one was at likely one-seed Memphis, which they may have lost anyway, and they lost rematches against Arizona State and Oregon after Bayless returned. Even without a free pass on those games, they're in the field as of Thursday.

Jekyll and Hyde teams

Some teams have bipolar profiles and they may be treated differently by the committee than teams with the same overall numbers but a more even level of performance. CTD looks at the overall profile of teams, not caring in what order the numbers accumulated. Two teams this year will provide a good test for how the committee handles such cases.

You may have heard that Kentucky struggled early in the year but turned it around once SEC season arrived. In the years for which we have data (2000-2007), there has not been a team with a Top 10 RPI in conference games (#6) with a lower non-conference RPI (#207) than Kentucky.

Top 10 conference RPI and sub-100 non-conference RPI, 2000-2007

Team	Season	Seed	RPI Rank (All)	RPI Rank (Conf)	RPI Rank (Non-conf)
Miami FL	2000	6	37	6	150
Kansas	2006	4	20	6	134
West Virginia	2006	6	38	9	123
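
For anyone who wants to reproduce a table like this, here is a rough sketch of the filter. This is not the CTD code; the Profile type and the find_split_comps function are names I made up for illustration, and the rows are just the ones shown in this post's two comparison tables (2000-2007), not the full data set.

```python
from typing import List, NamedTuple, Optional


class Profile(NamedTuple):
    """One team-season. Field names are illustrative, not CTD's schema."""
    team: str
    season: int
    seed: Optional[int]  # None means the team missed the field
    rpi_all: int
    rpi_conf: int
    rpi_nonconf: int


# Rows copied from the two comparison tables in this post.
HISTORY = [
    Profile("Miami FL", 2000, 6, 37, 6, 150),
    Profile("Kansas", 2006, 4, 20, 6, 134),
    Profile("West Virginia", 2006, 6, 38, 9, 123),
    Profile("Toledo", 2001, None, 84, 168, 8),
    Profile("Rhode Island", 2004, None, 82, 160, 10),
    Profile("Southern Mississippi", 2000, None, 62, 127, 10),
]


def find_split_comps(history: List[Profile],
                     conf_max: int = 10,
                     nonconf_min: int = 100) -> List[Profile]:
    """Teams with a top-`conf_max` conference RPI rank and a
    non-conference RPI rank worse than `nonconf_min`."""
    return [p for p in history
            if p.rpi_conf <= conf_max and p.rpi_nonconf > nonconf_min]


comps = find_split_comps(HISTORY)
worst_nonconf = max(p.rpi_nonconf for p in comps)

for p in comps:
    print(p.team, p.season, p.seed, p.rpi_conf, p.rpi_nonconf)
print("Worst non-conference rank among comps:", worst_nonconf)
```

The worst non-conference rank among the matching teams is a quick way to see how far outside the historical range a new team falls.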

The better half of the season is more recent, so conventional wisdom says the committee will overlook that loss to Gardner-Webb and include the 'Cats. We train the CTD predictor on data from past seasons. When it encounters a team like this where there is no real comparison in previous years, we can have problems. My guess is that barring an SEC tournament collapse, they're in, but certainly not seeded as high as those comps without a strong tourney.

Ole Miss has the mirror image of Kentucky's resume, with the good non-conference portion (#8 non-conference RPI) first and the mediocre and maddeningly inconsistent conference portion (#105 conference RPI) second.

Top 10 non-conference RPI and sub-100 conference RPI, 2000-2007

Team	Season	Seed	RPI Rank (All)	RPI Rank (Conf)	RPI Rank (Non-conf)
Toledo	2001		84	168	8
Rhode Island	2004		82	160	10
Southern Mississippi	2000		62	127	10

This will be a great test of what the committee looks for. Conference vs. non-conference? (Notice the seed column in the above tables - blank means they didn't make the field.) Overall profile or recent play only? If you look solely at the profiles, Ole Miss would seem to have a slight edge.

School	Conf.	W-L	Sag. Rank	Poll Rank	RPI Rank	Conf RPI	Conf +/-	N/C RPI	1-25 W	1-50 W	1-50 +/-	1-100 W	1-100 +/-	1-200 +/-	101+ L
Mississippi	SEC	21-9	44		45	105	-2	8	2	5	2	8	4	7	5
Kentucky	SEC	18-11	51	30	51	6	8	208	2	4	-3	5	-5	1	1

Am I hot or not?

A more general version of the Kentucky/Mississippi question is how a team is playing at the end of the season. CTD has included each team's record in its last 10 games for the last few seasons, but it didn't seem to make much difference. The problem is that the last n games (previously 10, but 12 this year) is an arbitrary window. What the committee looks for is recent performance, which is a bit too Potter Stewart for me. (Hey, Prof. Stone's censorship course finally paid off!) How many games does "recent" include? Do they only look at W/L record or also quality of opponents?

Some of the teams I've mentioned in the other categories can also fall under this one, making it even harder to separate how the committee came to their conclusions. If they include Kentucky, is it because they have won 11 of 13, or because their conference RPI is in the top 10?

Potential vs. resume

I thought that the committee's decision to place Florida as the #1 overall seed (much less a #1 seed period) last year was terrible. There were at least 4 teams with better profiles than the Gators, and the outcome of a one-and-done tournament is irrelevant in the discussion. This is a perfect example of seeding teams based on potential (perceived or otherwise) over the documented accomplishments on the resume. Because this is a subjective measure, there's no way we can account for it.

Kansas is a good example of this mentality this year. The proverbial "many people" (i.e., I'm too lazy to find links right now) have said Kansas may well be the "most talented" team in the land, but the team hasn't been getting much love from CTD or other Web bracketologists. The same thing happened last year with Florida, though Kansas doesn't have the same sense of inevitability as the defending champs did last year. I doubt they'll make it up to the #1 seed line without some help, but with a few pre-Sunday upsets they may sneak in.

Bad calls?

Luke Winn brings up the idea that the committee will entertain evidence that teams were harmed by poor officiating, specifically mentioning Villanova's tough zebra luck in losses against N.C. State and Georgetown. It's hard to know what effect that will have; bad calls are a part of the game, and everyone seems to get their fair share. What about teams that benefit from a few favorable calls? (I'm looking at you, UCLA.) Can arguments be made against those teams? I'm not going to worry too much about this one.