2008 review

As promised, let's look at the hows and whys of the big misses (and hits) from last year's bracket, and what we can learn for this year.

Missed selections

Illinois State (CTD had #12 seed)

South Alabama (24-6) and Illinois State (23-9) had nearly identical profiles.

	RPI 1-25	RPI 26-100	RPI 101-200	Avg RPI (W)	Avg RPI (L)	Conf tourn
Ill. St.	0-5	5-0	12-3	150	75	2-1 neutral
S. Ala.	0-1	4-2	9-3	189	82	1-1 at home

Guess which team was selected? Maybe the 30-point loss in the Missouri Valley final left a bad taste in the committee's mouth, but that was just one game. Or maybe it was the bad loss (in December) at Eastern Michigan. I'm not sure what can be learned from this one. I'd like to think it wasn't about choosing a BCS conference team (Oregon) over a non-BCS one. Committee chair Tom O'Connor says:

The committee doesn't get into scoring differential into one game [losing large to Drake in the MVC final]. Illinois State had four losses below [RPI top] 100 and the strength of schedule in the 70s. That's a factor in the process.

OK, then. South Alabama had three losses below 100, a strength of schedule in the 120s, and a worse average RPI in both its wins and its losses. Why are they in but Illinois State is not?

Dayton (CTD had #11 seed)

There wasn't a whole lot we could have done about Dayton in 2008. I mentioned a few days before Selection Sunday that I was afraid CTD would botch Dayton because of Chris Wright's injury, and unfortunately I was right. Only 4 of the 53 Web bracketologists included Dayton, and Crashing the Dance was put to shame. Gotta do something about those injuries...

Oregon (#9 seed, CTD had 2nd team out)

Though they were our second team out, their profile left a lot to be desired. The Web crowd agreed, with only 38% going with the committee on Oregon - easily the most missed team in the field. The #9 seed was also too high; even those in the Web crowd who included Oregon put them at the #11 seed line on average. No great wins (home vs. Stanford was their best) and a sub-.500 record against top 200 teams made this one a head scratcher. I guess 4-9 against the top 50 and 8-11 against the top 100 were good enough.

The only explanation I could find was from committee chair Tom O'Connor:

They had three really good wins against the top 100 on the road. Where you play is important. Quite frankly, the strength of schedule was pretty good.

Of course, one of those three "really good" wins was at Cal, which barely fell in the top 100 (#92) and lost seven other games at home. (The other two were #37 Arizona and #50 Kansas State - a combined 53-40 between the three teams.) And their non-conference strength of schedule (the one they have control over) was ranked 167th. Even if you think they deserved to be in, you have to admit that the #9 seed is a reach.

Kentucky (#11 seed, CTD had 8th team out)

Another team I was worried about before Selection Sunday. My lesson learned here is that a top 10 RPI in conference games trumps a sub-200 in non-conference games, which is almost like saying a strong second half of the season can nearly cancel out a putrid first half. Jerry Palm agrees:

Q: Why does Kentucky (18-12, RPI 57) get in over Ole Miss (21-10, RPI 48)?

Palm: Because the committee, all things being equal, would rather take a team that did well in conference (especially the same conference) than one that did better out of conference. Ole Miss was in the weaker division (albeit slightly) of a down league and went 7-10. While they were undefeated out of conference, their only win of note was over Clemson on a neutral floor. They also had a 3-point win over South Bama at home. I also think UK got some consideration for their injury struggles in non-conference play.

What does that teach us for this year? It should bode well for teams like South Carolina (conference RPI #14; non-conf RPI #84) and not for teams like Georgetown (conf RPI #84; non-conf RPI #7).

Bad seeds

Butler (CTD had #3, actual #7)

The model had Butler as #12 on the S-Curve, and was pretty confident about it. The Web consensus had them as the top #5 seed (17th on their S-Curve). Yet, the committee put the Bulldogs on the #7 line. What does that mean? Does the model overrate the profiles of teams from non-BCS conferences? Not necessarily - CTD nailed Xavier as a #3 seed and missed #5 seed Drake by one line.
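The S-Curve positions quoted above convert to seed lines in groups of four. Here is a minimal Python sketch of that arithmetic, assuming a plain four-teams-per-line S-Curve with no play-in or bracketing adjustments (the function name is made up for illustration and is not part of the CTD model):

```python
import math


def seed_line(s_curve_position: int) -> int:
    """Map a 1-based S-Curve position to a seed line, assuming four teams per line."""
    if s_curve_position < 1:
        raise ValueError("S-Curve positions start at 1")
    return math.ceil(s_curve_position / 4)


# Positions quoted in the text for Butler.
ctd_seed = seed_line(12)  # CTD model: last spot on the #3 line
web_seed = seed_line(17)  # Web consensus: first spot on the #5 line
print(ctd_seed, web_seed)
```

Under that assumption, the committee's #7 seed corresponds to somewhere between 25th and 28th on the S-Curve.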

What made the model think they deserved a #3 seed? I think the biggest factor was their #10 coaches poll ranking (and in fact I hinted as much last Selection Sunday morning). Here are the teams in the CTD training database that were ranked 10th in the last coaches poll after selection.

	Year	Seed	Sag	RPI	W-L	1to25	1to50	1to100	101+ L
Tennessee	2000	4	19	11	23-6	5-2	6-4	12-6	0
Kentucky	2001	2	8	10	22-9	7-5	11-7	17-9	0
Marquette	2002	5	9	23	26-6	3-3	4-4	10-5	1
Illinois	2003	4	11	18	24-6	2-1	6-5	14-5	1
Wisconsin	2004	6	16	12	24-6	2-2	5-3	12-4	2
Kansas	2005	3	9	1	23-6	3-3	8-4	20-6	0
Florida	2006	3	9	15	27-6	3-2	6-3	12-6	0
Southern Ill.	2007	4	17	7	26-6	2-1	8-4	13-5	1
Butler	2008	7	24	17	29-3	0-1	2-1	10-3	0

Butler is the only one without a win against an RPI Top 25 team, and they only had one game (Drake) against such teams. Their record against RPI Top 50 teams also does not compare to the others. For comparison, here are the actual #3 seeds from last year.

	Poll	Sag	RPI	W-L	1to25	1to50	1to100	101+ L
Wisconsin	5	6	11	29-4	5-2	6-4	10-4	0
Xavier	12	12	9	27-6	2-1	9-4	12-6	0
Stanford	11	10	14	26-7	3-3	7-4	13-7	0
Louisville	13	13	13	24-8	5-4	7-6	11-6	2

Each had at least two wins (and three games) against RPI Top 25 and six wins (and ten games) against Top 50. Butler was nowhere close.
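That bar is easy to check mechanically. The sketch below is only an illustration in Python using the records copied from the two tables above; the thresholds are the ones stated in the previous sentence, and the helper names are hypothetical.

```python
def parse(record: str) -> tuple[int, int]:
    """Split a 'W-L' string into (wins, losses)."""
    wins, losses = record.split("-")
    return int(wins), int(losses)


def meets_three_seed_bar(top25: str, top50: str) -> bool:
    """At least 2 wins in 3+ games vs. RPI Top 25 and 6 wins in 10+ games vs. Top 50."""
    w25, l25 = parse(top25)
    w50, l50 = parse(top50)
    return w25 >= 2 and w25 + l25 >= 3 and w50 >= 6 and w50 + l50 >= 10


# (record vs. Top 25, record vs. Top 50), copied from the tables above.
three_seeds = {
    "Wisconsin": ("5-2", "6-4"),
    "Xavier": ("2-1", "9-4"),
    "Stanford": ("3-3", "7-4"),
    "Louisville": ("5-4", "7-6"),
}
butler = ("0-1", "2-1")

for team, records in three_seeds.items():
    print(team, meets_three_seed_bar(*records))
print("Butler", meets_three_seed_bar(*butler))
```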

The lesson learned here is that while the coaches poll is important, the other data has to support a high seed. A top 10 ranking in the coaches poll doesn't guarantee a top 3 seed. Ten of the 90 teams in the training database ranked in the coaches poll top 10 were seeded #4 or worse.

	Seed	Year	Poll	Sag	RPI	W-L	1to25	1to50	1to100	101+ L
Louisiana St.	4	2000	9	13	12	26-5	6-2	9-4	14-4	1
Tennessee	4	2000	10	19	11	23-6	5-2	6-4	12-6	0
Marquette	5	2002	10	9	23	26-6	3-3	4-4	10-5	1
Gonzaga	6	2002	6	10	21	28-3	1-2	4-3	7-3	0
Illinois	4	2003	10	11	18	24-6	2-1	6-5	14-5	1
Wisconsin	6	2004	10	16	12	24-6	2-2	5-3	12-4	2
Louisville	4	2005	4	7	12	27-4	3-1	7-2	12-3	1
Boston Coll.	4	2006	7	16	22	25-7	2-3	3-4	11-6	1
Southern Ill.	4	2007	10	17	7	26-6	2-1	8-4	13-5	1
Butler	7	2008	10	24	17	29-3	0-1	2-1	10-3	0
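To put a number on how often that happens, here is a quick Python tally of the table above. It is only a sketch: the seeds are copied from the ten rows shown, and the 90-team total is the figure quoted in the text, not something recomputed from the training database.

```python
from collections import Counter

# Seeds of the poll top-10 teams seeded #4 or worse, in table order.
low_seeds = [4, 4, 5, 6, 4, 6, 4, 4, 4, 7]
top10_teams = 90  # total poll top-10 teams, as quoted in the text

share = len(low_seeds) / top10_teams
by_seed = Counter(low_seeds)

print(f"{share:.1%} of poll top-10 teams were seeded #4 or worse")
for seed in sorted(by_seed):
    print(f"#{seed} seed: {by_seed[seed]}")
```

Most of those ten slipped only to the #4 line; Butler's #7 is the lowest seed in the group.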

For Butler the other data did not support the high seed, and we were fooled. (Actually, the best lesson here is that the "coaches" voting in the poll can also be fooled.)

Indiana (CTD had #5, actual #8)

Another case where the Web consensus was also fooled (#22 on their S-Curve, or a #6 seed) nearly as much as we were. Even the top human predictors were fooled - Joe Lunardi, Jerry Palm, and Bracketology 101 (the most trustworthy bracketologist) each put Indiana on the #5 line.

This is as good an example of subjective analysis as I've seen since I've been doing this. Recall that the Hoosiers went through some off-the-court troubles last year with Kelvin Sampson's resignation and the team never fully rallying under interim coach Dan Dakich. However, despite a 29-point loss at Michigan State and an OT loss at Penn State, their on-the-court performance was largely solid, certainly deserving of a top 6 seed. Perhaps, as The Dagger suggests, the committee saw the Dakich-led Hoosiers as a lame duck team.

Other than adding a "good-team-quits-under-lame-duck-coach" factor to the model, I don't think there was much I could have done here. The good thing is, nobody else foresaw it either.

Others

These are the other teams the model missed by 2 or more seed lines.

	Real seed	CTD seed	Web seed	Poll	Sag	RPI	W-L	1to25	1to50	1to100	101+ L
Wash. St.	4	6	5	21	11	23	24-8	0-5	4-7	12-8	0
Oklahoma	6	8	8		31	28	22-11	0-5	6-8	11-10	1
Southern Cal	6	8	7	29	21	31	21-11	2-7	4-8	12-10	1
Miami FL	7	9	9		38	38	21-10	2-3	4-4	8-6	4
Miss. St.	8	10	8	28	40	41	22-10	0-3	1-6	5-10	0
Kent St.	9	7	9	30	42	21	28-6	0-2	2-2	11-3	3
Kansas St.	11	9	9		32	45	19-11	1-3	3-6	6-10	1
Winthrop	13	15	14		110	104	20-11	0-0	1-3	3-4	7

In all but two of these, the model underseeded the team. One of the two overseeds was a non-BCS (Kent State) and the other was the enigma that was Michael Beasley and Kansas State. Winthrop's case was odd. Teams on the #13 seed line or lower typically aren't missed by 2 lines. Perhaps they were one of the teams affected by Georgia's inclusion?
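The under/over split can be read straight off the table. A small Python sketch, using the real and CTD seeds copied from the rows above (a larger seed number means a worse seed, so a CTD seed above the real seed is an underseed):

```python
# team: (real seed, CTD seed), copied from the table above.
misses = {
    "Wash. St.": (4, 6),
    "Oklahoma": (6, 8),
    "Southern Cal": (6, 8),
    "Miami FL": (7, 9),
    "Miss. St.": (8, 10),
    "Kent St.": (9, 7),
    "Kansas St.": (11, 9),
    "Winthrop": (13, 15),
}

# CTD seed number higher than the real one means the model rated the team too low.
underseeded = [team for team, (real, ctd) in misses.items() if ctd > real]
overseeded = [team for team, (real, ctd) in misses.items() if ctd < real]

print(len(underseeded), "underseeded:", ", ".join(underseeded))
print(len(overseeded), "overseeded:", ", ".join(overseeded))
```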

Good picks

These are cases where the model outperformed the wisdom of the bracketology crowds (i.e., the Web consensus after excluding CTD).

	Real seed	CTD seed	Web seed	Poll	Sag	RPI	W-L	1to25	1to50	1to100	101+ L
Vanderbilt	4	4	6	19	35	12	26-7	1-1	4-3	12-6	1
W. Virginia	7	7	8		20	27	23-10	3-6	3-8	5-9	1
Arkansas	9	8	7		36	33	22-11	3-1	5-3	9-7	4
Temple	12	12	11		64	48	21-12	1-2	4-4	8-8	4

Not much to say here. Vandy was a good catch, but the others were marginal.
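For a rough sense of the margin, here is a Python sketch that measures each pick by how many seed lines it was off, using the three seed columns from the table above. Absolute seed-line error is my choice of yardstick for this illustration, not necessarily how the Web rankings are scored.

```python
# team: (real seed, CTD seed, Web consensus seed), copied from the table above.
good_picks = {
    "Vanderbilt": (4, 4, 6),
    "W. Virginia": (7, 7, 8),
    "Arkansas": (9, 8, 7),
    "Temple": (12, 12, 11),
}

ctd_errors = {team: abs(ctd - real) for team, (real, ctd, web) in good_picks.items()}
web_errors = {team: abs(web - real) for team, (real, ctd, web) in good_picks.items()}

for team in good_picks:
    print(f"{team}: CTD off by {ctd_errors[team]}, Web off by {web_errors[team]}")
print("Total lines missed:", sum(ctd_errors.values()), "vs", sum(web_errors.values()))
```

Vanderbilt and Arkansas are the only rows where the Web consensus missed by two lines; the other two are one-line differences.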

Overall, I'm pleased with the improvement over the last three seasons of doing this. Only two weeks to Selection Sunday 2009!