In other words, insights from your "gut" often smell like Sh#$.
Experts have been bested by simple models in a variety of arenas. Some favorites:
Leli and Filskov (1979) reported cross-validated classification accuracy that equalled 83% for a discriminant function derived on two measures of intellectual deterioration. This investigation made a preliminary assessment of the clinical utility of this function through a clinical-actuarial classification paradigm. Wechsler-Bellevue Intelligence Scale Form I protocols from 12 nonpsychotic nonimpaired and 12 cerebrally impaired individuals were used by experienced clinicians and predoctoral interns to identify the presence of intellectual deterioration associated with brain damage through their own clinical experience (Clinical Judgment condition) and, then, in conjunction with the discriminant function (Clinical-Actuarial condition). The classification accuracy from the discriminant function weights (Actuarial condition) and those from clinicians in the Clinical-Actuarial condition were statistically comparable and significantly above chance levels. These results indicate that the clinician who is assessing for the presence of intellectual deterioration associated with brain damage should rely heavily upon a valid actuarial index.
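To make "a discriminant function derived on two measures" a little more concrete, here is a minimal sketch of the idea. Everything in it is an assumption for illustration: the data are synthetic (not the study's protocols), the function names are made up, Fisher's linear discriminant stands in for whatever weights the original authors derived, and leave-one-out is used as one simple form of cross-validation.

```python
import random


def mean_vec(rows):
    """Mean of each of the two measures."""
    n = len(rows)
    return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]


def pooled_cov(a, b):
    """Pooled within-group 2x2 covariance matrix."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (a, b):
        m = mean_vec(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(a) + len(b) - 2
    return [[v / dof for v in row] for row in s]


def fit_discriminant(a, b):
    """Return (weights, cutoff) for a two-group Fisher linear discriminant."""
    ma, mb = mean_vec(a), mean_vec(b)
    s = pooled_cov(a, b)
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [mb[0] - ma[0], mb[1] - ma[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cutoff sits at the score of the midpoint between the two group means.
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    return w, w[0] * mid[0] + w[1] * mid[1]


def classify(w, cutoff, x):
    """1 = group b (e.g. impaired), 0 = group a (e.g. nonimpaired)."""
    return 1 if w[0] * x[0] + w[1] * x[1] > cutoff else 0


def loo_accuracy(a, b):
    """Leave-one-out accuracy: refit without each case, then score that case."""
    data = [(x, 0) for x in a] + [(x, 1) for x in b]
    hits = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        ta = [p for p, lab in train if lab == 0]
        tb = [p for p, lab in train if lab == 1]
        w, cutoff = fit_discriminant(ta, tb)
        hits += classify(w, cutoff, x) == label
    return hits / len(data)


# Synthetic stand-in data: 12 cases per group, two made-up measures each.
rng = random.Random(0)
nonimpaired = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(12)]
impaired = [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(12)]
accuracy = loo_accuracy(nonimpaired, impaired)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

The point of the sketch is only that the "actuarial" side of the comparison is this simple: two weights and a cutoff, scored on cases the function has not seen.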
But, what do we call that process which is characterized by a disruption of the naturally occurring order of observations plus immediate feedback on cue-criterion links, followed by some concrete form of tallying the accuracy of one's hypotheses? We call it RESEARCH [authors' emphasis].