Abstract

The focus of the article is the application of modern Bayesianism in the context of forensic science, as advocated by many in England and Europe. The article reviews the many aspects of modern Bayesianism that extend well beyond the analytic truth of Bayes’s Theorem, and focuses, among other things, on the limits of what can be accomplished by the invocation of ‘subjective’ probabilities. In a sense, all probabilities are subjective, since they are all mind dependent. However, the important issue in forensic contexts, as in others, is not the subjective nature of invoked probabilities, but the characteristics of the belief warrant to be required for them in different decisional contexts.

One of the usual suspects (either Shaw or Wilde or somebody else1) famously said that the UK and the USA are two countries separated by a common language. It turns out that in things legal it is worse than that—we are separated by path dependence in the scholarly ideas that we examine closely and internalize. In fact, countries which at any time in the past 60 years have regarded the Queen as the nominal sovereign share a roughly common academic pool generally centered on England, while the United States remains largely on its own. This is largely true in regard to forensic science issues as well,2 and increasingly, the English pool has in some areas expanded to include important continental sources and ideas also. This does not necessarily give one academic pool an advantage over the other in terms of truth and justice, either in regard to general legal ideas or those related to forensic science, but it does mean that we in the United States are sometimes only vaguely aware of, and fail to come to grips with, developments in the other pool. I have become exquisitely sensitized to this by virtue of preparing for the conference that precipitated this article.3 Starting in the early-to-mid 1990s, there have been major developments in the English forensic science pool4 which have given rise to a body of doctrine on the proper mode of approaching the interpretation of the results of forensic science examinations and the proper means of presenting those interpretations in court (referred to by its proponents as the Bayesian approach) of which I must freely admit, mea culpa, I was only vaguely aware until fairly recently.

This is not to say that Bayes’s Theorem5 was completely unknown to me. I lived through the ‘Bayes Wars’ of the 1980s and 1990s which were fought over the general theory of evidence and inference.6 But I was more an observer than a participant, and at least to me those debates seemed more driven by different visions of human capacities than by full-scale commitment to ‘Bayesianism’ (in quotation marks) as the only revealed way to rational inference from evidence. And there is no doubt that the term ‘Bayesianism’ now covers schools of thought that would have surprised the Reverend Bayes himself,7 and which embody a set of doctrines at least some of which are not implied by Bayes’s Theorem itself.

But let me not put forth such a perhaps controversial and contentious proposition on my own authority. Let me quote the eminent statistician, the late Jack Good, an early adherent, who said only a few years ago that the term Bayesian is now usually used to refer to ‘a whole philosophy or methodology in which “subjective” or logical probabilities are used’.8 So Bayes’s Theorem does not define modern Bayesianism (or ‘neo-Bayesianism’ as it is sometimes called) so much as the mysterious concept of ‘subjective’ probabilities9 does (that is, mysterious at least in some ways to me10). And indeed, a little research reveals what I am sure many readers already know, that there are ‘objective’ applications of Bayes’s Theorem, so-called ‘frequentist’ applications of Bayes’s Theorem, and so forth,11 and these would hardly qualify for a seat at the table among today’s Bayesians, many of whom appear to be committed to the celebration of subjective probability without much attention to the issue of varying the belief warrant required for the assignment of those probabilities in different classes of decision problems.12
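For reference, the analytic core around which all of this has grown can be stated in a line. The following is the standard discrete form; the notation, with H for a hypothesis and E for the evidence, is conventional and not drawn from any of the works discussed here:

\[
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}
\]

Nothing in this identity itself compels any position on where the probabilities on the right-hand side come from; that is precisely where the schools divide.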

There seems to be a consensus that neo-Bayesianism, or simply ‘Bayesianism’ full stop, as it is now known, as a ‘whole philosophy or methodology’ began to emerge as a movement in the 1950s, and came to full flood in the 60s with, among other things, the emergence of influential rational choice models of expected utility decision theory.13 These models, perhaps best represented by the work of Howard Raiffa and Robert Schlaifer,14 have plenty of helpful applications, but they may not be as universalizable as their proponents believe, for reasons I will return to later.15 But nevertheless, Bayesianism based on freewheeling notions of subjective probability became a constellation of beliefs and investments that attracted many adherents. In this, it reminds me somewhat of another movement of a hundred years earlier which I have given some study to (and written about a bit), that is, Auguste Comte’s Positivism, which was driven by a similar secular desire to identify and embody pure rationalism, and which ironically turned into a kind of secular religion.16

Whether Bayesianism has actually gone that far I leave it to others to judge,17 but if it might be viewed in this light, then we might view those who have been the apostles of Bayesianism to the Forensic Science community as a kind of missionary order, with the statistician Dennis Lindley as its original prophet,18 the statistician Colin Aitken as its later messenger from the central church19 (along with those prophets from afar, Robertson and Vignaux20), and the most influential early converts from inside the forensic science community, Ian Evett, Graham Jackson, James Lambert, R. Cook and P.J. Jones,21 later joined and reinforced by wise men from afar—that is to say, Lausanne, such as Christophe Champod,22 Franco Taroni23 and the sometimes miracle worker, Cedric Neumann,24 and by other wise men from the lowlands, such as Charles Berger.25 These were all joined (sometimes with reservations) by some of the most influential members of the English legal academy on proof matters, such as Mike Redmayne26 and Paul Roberts,27 leaving only one important recalcitrant group of wayward non-believers, the Judiciary (as shown by the recent Court of Appeal decision Regina v. T.,28 among others).

In theoretic things, I tend not to be a joiner of schools, but something of a pragmatic magpie, taking a bit from here and there with regard to how it might work in the context with which I am dealing. And in the area of forensic science, the context is ultimately the criminal jury trial, at least in the United States, and in most English-speaking jurisdictions. And my concern with Bayesian orthodoxy in forensic science is that it may in some circumstances not be fit for purpose, at least as the details are currently conceived, in the context of criminal law enforcement and the jury trial.

Before I elaborate on these concerns, let me say that I am very much attracted to Bayesianism as an alternative to current American practices involving flat source attribution.29 (In this, I might be viewed as a largely unchurched catechumen who has not yet accepted the faith.) These are what I see as the strengths of Bayesianism applied to forensic science: The first strength is the emphasis on taking into account some explicit alternative hypothesis inconsistent with the strongest hypothesis which points toward guilt. I consider this an extremely important step in the journey toward making forensic science practice neutral, so that it is not functionally inclined toward mere confirmation of the prosecution’s hypothesis pointing toward the guilt of the suspect or defendant on trial. It is at least in part a fulfillment of the obligation I believe exists on the part of all participants in the criminal justice system, including prosecutors and case detectives, to look for evidence inconsistent with or disconfirming the hypothesis of guilt,30 which has been part of science since Sir Francis Bacon,31 and for ethical reasons should be at least as large a part of criminal prosecution as it is a part of science.32

The second strength is an exquisite sensitivity to the habit humans have of making an important kind of error when faced with expressions of probability resulting from tests. That tendency is to simply substitute the results of the test for all previous information bearing on the issue, rather than combining the results of the test with the previous information. This problem, in Bayesian terms, is the problem of ignoring prior probabilities,33 and is often exacerbated by committing the prosecutor’s fallacy, that is, the transposition of the conditional, so that the probability of seeing the evidence given guilt is turned into the probability of guilt given the evidence.34
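The structure of both errors is easiest to see in the odds form of the theorem. What follows is a sketch in conventional notation, with purely illustrative numbers of my own rather than figures from any actual case:

\[
\underbrace{\frac{P(H \mid E)}{P(\bar{H} \mid E)}}_{\text{posterior odds}}
=
\underbrace{\frac{P(E \mid H)}{P(E \mid \bar{H})}}_{\text{likelihood ratio}}
\times
\underbrace{\frac{P(H)}{P(\bar{H})}}_{\text{prior odds}}
\]

Ignoring prior probabilities amounts to treating the likelihood ratio alone as if it were the posterior odds; transposing the conditional treats P(E | H) as if it were P(H | E). To illustrate with invented numbers: a test with P(E | H) = 0.99 and P(E | not-H) = 0.01 yields a likelihood ratio of 99, but if the prior odds are 1 to 1000, the posterior odds are only 99 to 1000, a probability under 10%, nowhere near the 99% that the transposed conditional suggests.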

However, commendable sensitivity to these problems does not mean that the solutions proposed are either inevitable, or necessarily desirable in all circumstances. Let me use my first reservation as an initial illustration. In a very influential series of articles in the late 1990s,35 Evett, Jackson, Cook, Jones and Lambert set out their ‘hierarchy of propositions’ for what we must assume was limited to what has been called traditional forensic identification or source attribution techniques,36 where they defined three levels of propositions which in their opinion could be the subject of what they called the prosecution hypothesis, and what they called the corresponding defence hypothesis. Before getting to the levels, let me register a fairly strong objection to their characterization of these hypotheses in their model structure. I suppose that, given the prosecution’s burden of producing evidence and burden of persuasion, it is trivial to object to the labelling of the proposition that most favours guilt in regard to the interpretation of a forensic test ‘the prosecution hypothesis’. But that is not the case in regard to the so-called ‘defence hypothesis’. As someone who has done a little criminal defence work, I am here to tell you that, in the United States at least, the defence is under no obligation to have a hypothesis, and often has none beyond the failure of the prosecution’s proof to meet the required standard of proof. To my mind it is a gross invasion of the right to stand mute to characterize the forensic scientist’s competing hypothesis as the ‘defence hypothesis’, and indeed, this problem has been at least alluded to in the orthodox literature.37

But if all that were at stake was labelling, proper labels such as the ‘hypothesis inconsistent with the prosecution hypothesis’ could be settled on. But under the ‘hierarchy of propositions’ vision of the ‘optimum usefulness’ of likelihood ratios, the problem is not so easily solved.

As I have already said, Evett et al. set out a three-level classification of hypotheses applicable to criminal cases. They called these levels ‘the offence level’ (which generally corresponds to the elements of the crime), the ‘source level’ (which generally corresponds to the attribution of a common source to a known and questioned sample) and the intermediate ‘activity level’, where propositions are about not just the origin of trace evidence, but the actions of the perpetrator.38 As to the ‘offence level’, they generally conceded that for the forensic practitioner, at least in the forensic identification areas, to make statements concerning facts corresponding exactly to these ultimate legal issues is generally beyond the scope of their forensic expertise, and also usually deals with judgements reserved for the factfinder above any claimed expertise.39 Further, it seemed obvious to them, as it generally does to me, that judgements concerning hypotheses on the ‘source level’ do not present any problems, assuming that anything properly called reliable expertise exists in the first place.40 But it is at the ‘activity level’ where, in my opinion, the wheels threaten to come off forensic science Bayesianism as it is now conceived. For it is on this level where the now standard doctrine encourages the forensic practitioner to obtain lots of case-specific information, in the effort to aid the case detectives in the formulation of their hypothesis and the ‘defence’ hypothesis.41 I and my co-authors Michael Saks, William Thompson and Robert Rosenthal have cautioned against the temptation for forensic scientists to begin to view their role as general detectives,42 and I have written at length on my own view that Sherlock Holmes was one of the most destructive influences on forensic science and its practitioners.43 To be sure, some forensic scientists would make very good general detectives, but allowing themselves to indulge this attractive temptation always risks a confusion of roles to the detriment of whatever science they bring to bear.44

I have spent the better part of the last decade attempting to convince the forensic science community of the distorting effects of domain-irrelevant information, the concomitant need for masking protocols in forensic science practice, and the responsibility of each forensic discipline to define the line between the information needed for the exercise of their forensic expertise (domain-relevant information) and information outside their expertise. Expanding the forensic scientists’ domain to the ‘activity level’ destroys the line between their expertise in their specific forensic discipline and a more general (and dangerous) claim to general investigative expertise. In so doing, I fear it justifies exposure to case information that will inevitably distort the results of the application of their actual expertise in a forensic discipline in light of case information they should not have and should not want to know.

Indeed, the danger of such induced pro-prosecution bias has been the subject of a recent editorial by Ian Evett in Science and Justice,45 though his solution, a kind of willed professional neutrality, is unlikely to be effective in the face of all the evidence that has been accumulated on the insidious effects of such biasing information despite the best of intentions.46 This problem can be ameliorated a great deal by the adoption of a sequential unmasking model, where the forensic scientist who deals with the case detective is obliged to send the evidence with only domain-relevant case information, in the least biasing order, to the person who is doing the testing and the initial characterizations and interpretations.47 But all in all, I would strongly suggest generally sticking to ‘source level’ hypotheses, or such ‘activity level’ hypotheses as are only slightly removed from source level and are conditioned on facts not capable of reasonable dispute. These are not only the least controversial, they are the easiest to put into binary form, and while I know that likelihood ratios can be used to judge the relative strengths of non-binary hypotheses, the proper formulation of such, especially at the ‘activity level’, is a treacherous undertaking best left alone. In my opinion.

I have two more reservations, which I will outline briefly. The first deals with what I consider to be the abuse of the notion of subjective probability to allow people to render opinions, and to develop likelihood ratios, on the basis of what is sometimes no better than a guess. This is, to reiterate, a question not so much about the nature of probability statements as it is about the belief warrant required for probability statements in specific decisional contexts.48 To be sure, you don’t have to be a Bayesian to do that, as any number of faulty forms of assertion by putative forensic scientists have shown, both in the United States and elsewhere. My current favourite is bite mark identification, the weaknesses of which, presented Bayesianly or not, are being revealed more and more each day by the superb research of Mary and Peter Bush and David Sheets and their collaborators.49 But it seems to me that the general modern Bayesian tradition, which includes expected utility versions of decision theory, encourages faulty expression by making it seem acceptable for people to obtain the probabilities that are incorporated into likelihood ratios by simply making their best guess from experience when more should be required. This may be a more or less educated guess, but it remains a guess nonetheless, and does not justify the apparently exact quantified form in which it is expressed. In the context for which such decision theories were developed, which was primarily personal or corporate business decisions, there can be no real objection if people want to act on their best estimations and then use the theory to discipline themselves to act consistently. Their utilities are by definition their utilities, after all, and they will bear the costs of error (at least absent a bailout). But this should not be the approach when the fate of others is the subject of the exercise. Here there is not a free choice of utilities on the part of the decision maker. The utilities are generally supposed to be supplied by the law itself in ways that obligate the decision maker. It’s a little bit like the difference between poker and litigation, as outlined by Steven Lubet in his book Lawyers’ Poker.50 In poker, you are playing with your own money. In litigation, you are playing with the clients—not just with their money, but with the clients themselves.51 And that goes for all players in the system, not just lawyers, but judges and forensic scientists also. This being said, I must observe that it is my impression that forensic scientists in the non-American pool often seek and value real data on issues before them more than similarly situated American practitioners, so perhaps I shouldn’t be so hard on the Bayesian position. But there is something about the generation of likelihood ratios with numbers from nowhere that tends to cover up the weakness of the ingredients, I believe, so my reservation remains. The problem of proper warrants for the subjective assessment of conditional probabilities, and the biasing effect of the case information necessary to construct hypotheses at the activity level, feed on each other. The process of rendering an opinion regarding a likelihood ratio at the source level usually does not involve enough iterations using empirically derived information to wash out its sensitivity to initial subjective estimates of base rates and other such quantities; it remains exquisitely sensitive to those initial estimates. I might be willing to accept the ‘experience’ of the forensic practitioner as a sufficient warrant to set that initial value in an environment properly masked from information irrelevant to that task. But I am not ready to concede a sufficient warrant for such a subjective assessment at the activity level, especially when it necessarily depends on case information that is irrelevant to the source level determination, and will necessarily bias the conditional probabilities that are set for source level conclusions.
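A toy computation makes the point about numbers from nowhere concrete. Every figure below is invented for illustration; the point is only that the quantity an examiner must guess, the denominator probability, swings the resulting likelihood ratio, and the posterior built on it, by orders of magnitude:

# Hypothetical sketch: sensitivity of a likelihood ratio (and the
# posterior it produces) to a guessed denominator probability.
# Every number here is invented for illustration.

def posterior_probability(prior_odds, likelihood_ratio):
    """Convert prior odds and a likelihood ratio into a posterior probability."""
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

p_e_given_h = 0.95      # examiner's estimate of P(evidence | same source)
prior_odds = 1 / 100    # assumed prior odds, for illustration only

# Three equally plausible 'experience-based' guesses at
# P(evidence | different source):
for p_e_given_not_h in (0.1, 0.01, 0.001):
    lr = p_e_given_h / p_e_given_not_h
    post = posterior_probability(prior_odds, lr)
    print(f"guessed P(E|not-H) = {p_e_given_not_h}: LR = {lr:,.1f}, posterior = {post:.1%}")

# The reported LR runs from 9.5 to 950, and the posterior from under 9%
# to over 90%, on nothing but a change in the guess.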

Finally, I must register my reservation about the terms in which testimony is recommended to be given under the standard forensic Bayesian model. The general form is that the findings ‘provide moderate (or strong, or very strong, etc.) support for the prosecution proposition’. While this verbal scale was intended to be an improvement over other expressions of estimative probability, I am not sure it actually delivers what was intended, especially since it will have to be completely unpacked on cross examination. True, the steps in the verbal scale are tied to specified likelihood ratios.52 And if the derivation of those ratios were itself reliable, the scale would perhaps manifest an improvement in reliability, if not in validity, over competing approaches. But I remain skeptical that this reliability is real, and I also believe the different levels cover too broad a range of likelihood ratios, even if those ratios were reliably derived.53 However, that is a long discussion, and one that must be put off for another time.54

So in conclusion, I am quite attracted to the Bayesian Persuasion, at least in regard to forensic science interpretation, but I am not yet ready to submit completely. Perhaps after some reformation. But as of now, I remain attracted, but not fully converted.

I would like to thank William C. Thompson for helpful comments on the final draft of this article, and two anonymous referees for observations that made the final product better.

1 Wilde, Shaw and Bertrand Russell, among others, apparently made somewhat similar observations, but not as pithily as the saying finally became. Wilde was first, saying in 1887 (in The Canterville Ghost) ‘We really have everything in common with America nowadays, except, of course, language.’ See the discussion at http://forum.wordreference.com/showthread.php?t=146783.

2 The major exception to this generalization regarding forensic science seems to be Canada, where forensic science practices correspond more closely to those of its southern neighbor, the United States. Conversation with Dr. Madlen Margau, Forensic Scientist, Chemistry Section, Centre of Forensic Sciences, Toronto, Ontario, Canada.

3 Eighth International Conference on Forensic Inference and Statistics, Seattle, Washington, 18 July 2011.

4 Which of course includes Australia and New Zealand, New Zealand being the source of one of the major texts, Bernard Robertson and G. A. Vignaux, Interpreting Evidence: Evaluating Forensic Science in the Courtroom (J. Wiley, 1995). But one must remember that Augustine wrote in Hippo, not in Rome.

5 There is some disagreement over whether the proper possessive form is ‘Bayes’s Theorem’ or ‘Bayes’ Theorem’. Compare, e.g. Stephen M. Stigler, Who Discovered Bayes’s Theorem? 37 Amer. Statistician 290 (1983), reprinted in Stephen M. Stigler, Statistics on the Table 291 (1999), with the Wikipedia entry ‘Bayes’ Theorem’, http://en.wikipedia.org/wiki/Bayes'_theorem. Under the tutelage of my wife, I have finally settled on the general rule that I will always add the possessive ‘s’ any time it is not horribly unpronounceable, as I have done here (although I will not ‘sic’ the other form in the notes that follow). Of course, as everyone knows, in its usual algebraic forms, it was not originated by Bayes at all, but by Laplace, but that’s another story for another footnote.

6 For a précis, see D. Michael Risinger, Introduction to “Bayes Wars Redivivus,” 8 International Commentary on Evidence, article 1 (2010).

7 This is the footnote promised in footnote 5. The exposition in the single article on the subject attributed to Thomas Bayes, published posthumously in 1763 (after editing by Richard Price, and with an introduction and commentary by him) in the Philosophical Transactions of the Royal Society, set out certain important aspects of what is now known as Bayes’s Theorem, without any specific algebraic formulae. See (as the title quaintly puts it) An Essay towards Solving a Problem in the Doctrine of Chances. By the Late Rev. Mr. Bayes, F.R.S. Communicated by Mr. Price, in a Letter to John Canton, A.M.F.R.S., 53 Philosophical Transactions 370 (1763), available at http://rstl.royalsocietypublishing.org/content/53/370. But it was clearly Laplace, almost certainly independently, whose work gave the more generalized notion its impetus and content. See generally Stephen M. Stigler, The History of Statistics: The Measurement of Uncertainty before 1900 (Belknap Press, 1986). As Stephen Fienberg has noted (noting others who have noted the same thing), ‘Bayes did not actually give us a statement of Bayes’ Theorem, either in its discrete form (this came with Laplace), or in its continuous form with integration, although he solved a special case of the latter.’ Stephen E. Fienberg, When did Bayesian Inference Become “Bayesian”?, 1 Bayesian Analysis 1, 3 (2006) (formula and note omitted), available at http://ba.stat.cmu.edu/journal/2006/vol01/issue01/fienberg.pdf.

8 Quoted from Good’s personal correspondence with Stephen Fienberg in Fienberg, supra n 7, at 15.

9 Professor Fienberg traces the genesis of the explicit notion of ‘subjective’ probabilities as degrees of belief to John Maynard Keynes’s 1921 Treatise on Probability, although he notes that Keynes ‘allowed for the possibility that degrees of belief might not be numerically measurable’. Fienberg, supra n 7, at 9. Fienberg also notes the use of the notion of ‘degrees of reasonable belief’ (the qualifier seems later to have fallen out) by Harold Jeffreys and Dorothy Wrinch in a 1919 paper. Later, beginning with Borel and Ramsey, approaches to quantifying subjective probability through mind experiments involving betting under various assumptions and constraints were put forth as adequate mechanisms for quantification.

10 I am aware of one sense in which all probabilities must be regarded as subjective, in the same way that language, being an abstraction, is subjective. Both are products of mind, and are therefore mind-dependent. Something like this seems to have been the starting point for Bruno De Finetti. See Bruno De Finetti, Theory of Probability, vol. 1, p. 1 (English translation, J. Wiley, 1974) (‘Probability does not exist’), quoted in Robert F. Nau, De Finetti Was Right: Probability Does Not Exist, 51 Theory & Decision 89 (2001). Perhaps the most mysterious thing, then, is why anyone claims any probabilities are not subjective. However, I think I get the main point of the frequentists: despite various levels of mystery (including the ontological status of mathematics in general, see Mary Leng, Mathematics and Reality (Oxford University Press, 2010)), objective or ‘frequentist’ probability is the appropriate model when one has reason to believe that abstract notions of exact mathematical probability specification (represented in their simplest form by specified ‘mental experiment’ urn drawing problems or similarly specified mental experiments regarding gambling games) actually represent a reasonable model of the empirical world in which a problem is faced and must be answered (real poker games, etc.). When such circumstances do not exist (and more often than not they don’t), best guesses (subjective probabilities) must be resorted to, or else we must defer decision.

 But in both language and in notions of probability, there is a necessary contribution from an exterior physical reality, and more so in regard to warranted probability statements about factual conditions than in regard to some kinds of natural language expressions such as value statements. So my own emphasis is not on the subjective nature of probabilities vel non in this general sense, but on the empirical warrant that is, or ought to be, necessary to derive and use a probability statement in different contexts. It may be that numbers derived from subjective individual mental betting experiments are perfectly sufficient for some kinds of problems (contexts in which no iterative process is available to refine prior assumptions out of results, a decision must be made, and all the costs will fall on the betting party or parties, for instance), but not in others. However, appropriate criteria for acceptable best guesses for different classes of problems seem hardly ever to be discussed, at least so far as I have been able to discover. That is the most mysterious part about subjective probabilities, to me.

11 In a recent popular book on Bayesianism, Sharon Bertsch McGrayne attributes to Jack Good the tongue-in-cheek claim to have identified 46,656 varieties of Bayesians. She then lists the following varieties: subjective, personalist, objective, empirical, semi-empirical, semi-Bayes, epistemic, intuitionist, logical, fuzzy, hierarchical, pseudo, quasi, compound, parametric, nonparametric, hyperparametric and non-hyperparametric Bayes. See Sharon Bertsch McGrayne, The Theory that Would Not Die: How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines & Emerged Triumphant from Two Centuries of Controversy 129 (Yale University Press, 2011) (e. e. cummings-like lack of initial capitalization copied directly from spine and title page of book). These categories obviously are not mutually exclusive. A further word about this book might be in order. It is an exercise in a kind of triumphalist Whig history, which fails to make very clear the actual contributions of ‘Bayes’ to many of the episodes it recounts as examples of Bayesian triumph. On the other hand, it is a terrific compendium of anecdotes concerning the contempt in which ‘frequentists’ held ‘Bayesians’ (and vice versa) in the statistics community for over half a century. I have blessedly never seen such vitriol in the circles I inhabit. Who would have guessed that statisticians could be so much more vicious than lawyers in any context?

12 See discussion in fn 10 supra. In fact, in many scientific contexts, ‘subjective’ priors are not assigned by a subjective process at all, but according to formulas that appear to give reasonable prior assumptions for classes of cases. See Robert E. Kass and Larry Wasserman, The Selection of Prior Distributions by Formal Rules, 91 J. Am. Stat. Assoc. 1343 (1996).

13 Fienberg, supra n 7, at 14–20; McGrayne, supra n 11, at 97–107.

14 The wellspring work is Howard Raiffa and Robert Schlaifer, Applied Statistical Decision Theory (Harvard University, 1961).

15 See generally Matt Jones and Bradley C. Love, Bayesian Fundamentalism or Enlightenment? On the Explanatory Status and Theoretical Contributions of the Bayesian Model of Cognition, 34 Behav. & Brain Sci. 169 (2011).

16 See D. Michael Risinger, Boxes in Boxes: Julian Barnes, Conan Doyle, Sherlock Holmes and the Edalji Case, 4 Int’l Comment. on Evidence, Iss. 2, Art. 3 at 7, n. 13, available at http://www.bepress.com/ice/vol4/iss2/art3:

The foundational theorist of 19th-century positivism was Auguste Comte (1798–1857), who was, among many other things, dedicated to the proposition that the methods of science could be extended to give sure predictive knowledge to human social actions. See generally The Positive Philosophy of Auguste Comte freely translated and condensed by Harriet Martineau (1855); see also John H. Zammito, A Nice Derangement of Epistemes: Post-Positivism from Quine to Latour 6–8 (2004). Comte, like Marx, was an enemy of religion who rather paradoxically founded a sort of social religion. While Comte’s positions are looked upon as foundational to the social sciences, his views influenced popular views of science more than they influenced science itself. Ironically, he wrote at a time when the switch among natural scientists from certain knowledge goals to probabilistic goals was just beginning to take hold. See Larry Laudan, Science and Values 83–85 (1984).

17 In light of the extremely partisan nature of the conflict between ‘frequentists’ and ‘Bayesians’, the religious trope in this context has become almost a cliché. See, e.g. Appendix A to McGrayne, supra n 11, at 253 (I can’t keep up the uncapitalized forms in that book) by Michael J. Campbell, Professor of Medical Statistics at the University of Sheffield, which begins:

As one gets older one’s thoughts turn naturally to religion, and I have been pondering the religious metaphors in statistics. Clearly the frequentists are the metaphorical Catholics … the frequentist prayer is ‘Our Fisher, who art in Heaven …’ On the other hand Bayesians are born-again fundamentalists. One must be a ‘believer’ and Bayesians can often pinpoint the day when Bayes came into their lives …

See also Mike Redmayne, Expert Evidence and Criminal Justice 51–52 (Oxford University Press, 2001), where Professor Redmayne notes the almost ‘religious fervor’ of many adherents of Bayesianism that led one prominent Bayesian, perhaps tongue in cheek, to refer to ‘Bayesianity’. However, I believe I may be the first to adopt the stance of critical catechumen, at least in print.

18 For decades, Professor Dennis V. Lindley has been a leader of the Bayesian movement in statistics. See http://en.wikipedia.org/wiki/Dennis_Lindley; McGrayne, supra n 11, at 99–107. Although forensic science does not appear to have been a major focus for him, his 1977 article A Problem in Forensic Science, 64 Biometrika 207, on the interpretation of crime scene glass fragments, introduced forensic scientists to the whole structure of Bayesian interpretation. It is referred to as the ‘fons et origo’ of Bayesian interpretation in forensic science by Mike Redmayne. See Redmayne, supra n 17, at 37. Its influence was profound. See Ian Evett, Interpretation: A Personal Odyssey in Colin G.G. Aitken and David Stoney (eds), The Use of Statistics in Forensic Science (E. Horwood, 1991).

19 See Colin G.G. Aitken, Statistics and the Evaluation of Evidence for Forensic Scientists (J. Wiley, 1995).

20 See Robertson and Vignaux, supra n 4.

21 The most influential series of articles on the topic of interpretation of the results of forensic testing to appear in any British forensic science journal is undoubtedly R. Cook, I.W. Evett, G. Jackson, P.J. Jones and J.A. Lambert, A model for case assessment and interpretation, 38 Sci. & Justice 151 (1998); R. Cook, I.W. Evett, G. Jackson, P.J. Jones and J.A. Lambert, A hierarchy of propositions: deciding which level to address in case work, 38 Sci. & Justice 231 (1998); and I.W. Evett, G. Jackson and J.A. Lambert, More on the hierarchy of propositions: exploring the distinctions between explanations and propositions, 40 Sci. & Justice 3 (2000). In the first two, the author grouping of the first two names is alphabetical by institution (Metropolitan Laboratory), but Dr. Evett is listed as the contact author, and by all subsequent indices it must be said that he is the Bayesian leader among the English forensic science practitioners. He was exploring Bayesian doctrine even before the publication of Aitken, supra n 19. See, e.g. I.W. Evett and J.S. Buckleton, Some aspects of the Bayesian approach to evidence evaluation, 29 J. Forens. Sci. Society 317 (1989). Graham Jackson has also been particularly active in publishing works on Bayesian interpretation in forensics. See particularly Graham Jackson, Understanding Forensic Science Opinions, Ch. 16 in Jim Fraser and Robin Williams (eds), The Handbook of Forensic Science (Willan Publishing, 2009).

22 Professor Champod is on the faculty of law and forensic science, University of Lausanne, Lausanne, Switzerland. Lausanne is one of the leading forensic science institutions in the world, and, it is fair to say, a hotbed of Bayesianism. Professor Champod has published many works with Bayesian themes, including ones co-authored with Dr. Evett, such as C. Champod and I. W. Evett, A probabilistic approach to fingerprint evidence, 51 J. Forens. Ident. 101 (2001). Others from the Lausanne faculty could be listed, but I have confined myself to the attendees at the 2011 ICFIS Conference, where the original paper on which the text is based was delivered.

23 Professor Taroni is also on the faculty of law and forensic science at the University of Lausanne. He co-authored the second edition of Dr. Aitken’s book, supra n 19, see Colin Aitken and Franco Taroni, Statistics and the Evaluation of Evidence for Forensic Scientists (2nd ed., J. Wiley, 2004), and is more recently the lead author on two further volumes, Franco Taroni, Colin Aitken, Paolo Garbolino and Alex Biedermann, Bayesian Networks and Probabilistic Inference in Forensic Science (J. Wiley, 2006), and Franco Taroni, Sylvia Bozza, Alex Biedermann, Paolo Garbolino and Colin Aitken, Data Analysis in Forensic Science (J. Wiley, 2010).

24 Dr. Cedric Neumann was trained at Lausanne, and is now an assistant professor of statistics and forensic science at Penn State University. I refer to him as a miracle worker because he is the lead author on Cedric Neumann, Ian Evett and J. Skerrett, Quantifying the Weight of Evidence from a Forensic Fingerprint Comparison: A New Paradigm, set for publication in the Journal of the Royal Statistical Society in 2012. Without going into too much detail, the article shows how to use common minutiae in fingermarks as anchors for triangles that can then be specified in such a way as to act as virtual alleles for that fingermark in a search of a large database. The search will give a population incidence rate for that triangle. Repeat it with other such triangle specifications and you can potentially get a truly useful random match probability that can easily be turned into a warranted likelihood ratio, should that be your preferred means of understanding such information. Brilliant, whether you are a Bayesian or not.
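The combination logic described above can be sketched schematically. What follows is my own gloss for illustration only, not the Neumann et al. algorithm; every incidence rate in it is hypothetical, and the independence assumption is the strong one doing all the work:

# Schematic gloss on the 'virtual allele' idea described above, not the
# Neumann et al. method itself. All incidence rates are hypothetical.

# Suppose database searches returned population incidence rates for three
# minutiae-triangle specifications found in a fingermark:
triangle_incidence_rates = [0.02, 0.05, 0.01]

# If the triangles may be treated as independent, the random match
# probability is (roughly) the product of their incidence rates:
rmp = 1.0
for rate in triangle_incidence_rates:
    rmp *= rate
print(f"random match probability: {rmp:.0e}")     # 1e-05 with these rates

# Should a likelihood ratio be the preferred expression, one could take
# LR ~ 1 / RMP, under the simplifying assumption that the probability of
# the observed features given a common source is close to 1:
print(f"implied likelihood ratio: {1 / rmp:,.0f}")   # 100,000 here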

25 Dr. Charles Berger is principal scientist at the Netherlands Forensic Institute. At the conference, which generated this article, he joined Professor Aitken in giving a one-day workshop on Bayesian interpretation in forensic science. I am informed that Bayesian interpretation is not uniformly followed on the European continent, Lausanne and the Netherlands being its main centres. It is not generally used in Germany, for instance. Conversation with Dr. Stefan Becker, Scientific Director, Forensic Science Institute, Federal Criminal Police Office, Wiesbaden, Germany, August, 2011.

26 I am perhaps overstating Professor Redmayne’s commitment to the Bayesian approach. In Redmayne, supra n 17, he gives a careful evaluation of the pros and cons of the Bayesian approach, registering some of the same reservations that I have. However, his recent article in the wake of Regina v. T., EWCA Crim. 2439 (2010), 1 Cr. App. R. 9 (2011), co-authored with Paul Roberts, Colin Aitken and Graham Jackson, seems more positive. Perhaps it was the rather extreme, and not altogether clear, exposition of the Court of Appeal in that case that precipitated this. See Mike Redmayne, Paul Roberts, Colin Aitken and Graham Jackson, Forensic Evidence in Question, 2011 Crim. L. Rev. 347 (2011).

27 Professor Roberts hardly mentions Bayesianism in connection with expert evidence and forensic evidence interpretation in the chapter on expert evidence in his leading treatise (with Adrian Zuckerman), Paul Roberts and Adrian Zuckerman, Criminal Evidence 467–509 (2nd ed., Oxford University Press, 2010). However, he not only co-authored Redmayne et al., supra n 26, he is also the co-author of the thoroughly Bayesian Colin Aitken, Paul Roberts and Graham Jackson, Fundamentals of Probability and Statistical Evidence in Criminal Proceedings: Guidance for Judges, Lawyers, Forensic Scientists and Expert Witnesses (Royal Statistical Society, 2011).

28 EWCA Crim. 2439 (2010), 1 Cr. App. R. 9 (2011). This opinion’s restrictions on the use of likelihood ratios in developing and presenting evidence to a jury have precipitated a good bit of response, some flatly critical and some attempting to cabin the decision narrowly. Aside from Redmayne et al., supra n 26, see, e.g. Norman Fenton and Martin Neil, Limiting the Use of Bayes in Presenting Forensic Evidence: An Irrational but Understandable Ruling (manuscript on file with author); The Forensic Institute, Commentary on the case of R. v. T. on footwear mark evidence and use of likelihood ratios, available at http://www.theforensicinstitute.com/PDF/Commentary%20on%20R%20v%20T%202011.pdf; Geoffrey S. Morrison, The likelihood-ratio framework and forensic evidence in court: a response to R v T., 16 Int. J. Ev. and Proof 1 (2012); and Bernard Robertson, G.A. Vignaux and Charles E. H. Berger, Extending the Confusion about Bayes, 74 MLR 444 (2011), among others.

29 That is, testimony in the form of ‘this trace came from this source to the exclusion of all other potential sources in the world’, or some explicit or implied variant thereof, perhaps with a less absolute qualifier. This approach is described and critiqued in Dawn McQuiston-Surrett and Michael J. Saks, Communicating Opinion Evidence in the Forensic Identification Sciences: Accuracy and Impact, 59 Hastings L. J. 1159 (2008). This approach is generally defended by the American footwear expert William Bodziak in William J. Bodziak, Traditional conclusions in footwear examinations versus the use of the Bayesian approach and likelihood ratio: A review of a recent UK appellate court decision, Law, Probability and Risk, doi: 10.1093/lpr/mgs018 (2012).

30 See D. Michael Risinger and Lesley C. Risinger, Innocence is Different: Taking Innocence into Account in Reforming Criminal Procedure, 56 N.Y.L. Sch. L. Rev. 869 (2012).

31 Id, at 890–894.

32 Id.

33 See the discussion in Michael J. Saks and D. Michael Risinger, Baserates, the Presumption of Guilt, Admissibility Rulings, and Erroneous Convictions, 2003 Mich. St. L. Rev. 1051 (2003) (discussing the work of Daniel Kahneman and Amos Tversky on heuristics and biases, reflected, e.g. in Amos Tversky and Daniel Kahneman, Heuristics and Biases, in Daniel Kahneman, Paul Slovic and Amos Tversky (eds), Judgment Under Uncertainty: Heuristics and Biases (Cambridge University Press, 1982)). But as I and another co-author have previously observed:

It should be noted that there is an ongoing debate over the extent to which ordinary humans are subject to processing errors from “probability blindness” in circumstances of decision presented by everyday life in the modern world. Some, most notably Gerd Gigerenzer, assert that the poor performance of people in laboratory experiments are more an artifact of the artificiality of the way information is presented in the experiment than a function of inaccurate judgment in normal circumstances. See, e.g., Gerd Gigerenzer & Peter M. Todd, Fast and Frugal Heuristics: The Adaptive Toolbox, in Simple Heuristics That Make Us Smart (Gerd Gigerenzer et al. eds., 1999); Gerd Gigerenzer, How to Make Cognitive Illusions Disappear: Beyond “Heuristics and Biases,” in 2 Eur. Rev. Soc. Psychol. (Wolfgang Stroebe & Miles Hewstone eds., 1991). The debate appears to be a debate over whether our cognitive cup is half or more empty, or half or more full, since both sides concede that there are some problems we solve well, and some problems we deal with poorly.

D. Michael Risinger and Jeffrey C. Loop, Three Card Monte, Monty Hall, Modus Operandi and “Offender Profiling”: Some Lessons of Modern Cognitive Science for the Law of Evidence, 24 Cardozo Law Rev 193, 196, fn 10 (2002). For a recent review of the current state of the Kahneman/Tversky-Gigerenzer debate and the implications for the law, see generally Mark Kelman, The Heuristics Debate (Oxford University Press, 2011).

34 The term ‘prosecutor’s fallacy’ for the transposition of the conditional was coined by William Thompson and Edward Schumann in a 1987 article, William C. Thompson and Edward L. Schumann, Interpretation of Statistical Evidence in Criminal Trials: The Prosecutor’s Fallacy and the Defense Attorney’s Fallacy, 11 L. & Human Behav. 167 (1987). Why this is an error (if it is not already well known to the reader, or obvious upon a moment’s reflection) is fully discussed in that article.

35 See Cook et al., supra n 21.

36 This is because the hierarchy of propositions as set out and discussed concentrates on source attribution techniques. Obviously, much forensic science identification of chemical compounds is undertaken in the context of drug crimes or other possessory crimes, where the identity of the compound is in itself an element of the crime. In such situations, the distinction between the ‘source level’ and the ‘offence level’ disappears, and there is likewise little room left for propositions on the ‘activity level’.

37 See, e.g. Robertson & Vignaux, supra n 4, at 44–45.

38 Cook et al., supra n 21, at 232–33.

39 ‘With [offence level] propositions there are, in general, considerations which are completely outside the domain of the scientist.’ Id, at 233.

40 Id, at 232–33.

41 ‘Another important difference is that [source level] propositions generally require little in the way of circumstantial information, but [activity level] propositions cannot be addressed without a framework of circumstances.’ Id, at 233.

42 See the extended discussion in D. Michael Risinger, Michael J. Saks, William C. Thompson and Robert Rosenthal, The Daubert/Kumho Implications of Observer Effects in Forensic Science: Hidden Problems of Expectation and Suggestion, 90 Cal. L. Rev. 1, 27–30 (2002), entitled ‘Proper and Improper Information in the Forensic Science Practice’, and warning that ‘A forensic scientist is not a detective’. Mike Redmayne noted the potential for such a problematic expansion of role under the ‘hierarchy of propositions’ approach in 2001, see Redmayne, supra n 17, at 41. Perhaps such role expansion would be less problematic in the setting of non-jury criminal adjudications, where the expert can be anointed as a kind of official semi-judge if that role is thought to be desirable from an epistemic point of view, but in any system still committed to retaining lay juries to determine guilt of serious crime, I believe it ought to be deeply disturbing.

43 See Risinger, supra n 16, at 5–19.

44 The amount of case information and the level of detail thought appropriate in conducting ‘activity level’ analyses is well illustrated by the checklist given in Graham Jackson and Philip J. Jones, Case Assessment and Interpretation (Wiley Online Library 2009), available at http://onlinelibrary.wiley.com/doi/10.1002/9780470061589.fsa124/abstract. The list of appropriate questions set out in treating a hypothetical case in which a car was used in an armed robbery and then abandoned includes such questions as ‘Do the police suspect the person of driving the car?’ and ‘What strength of evidence is required?’, which was notionally answered in the hypothetical (with implied approval), ‘Suspect has been charged but there is very little other evidence against the suspect and so strong evidence would be required to proceed with the charge; if strong evidence is not forthcoming, the charge will be dropped.’ Id, at 8. Both the problem of defining the proper scope of the expertise in question, and the biasing effects of such information, seem obvious.

45 Ian Evett, Evaluation and Professionalism (editorial), 49 Science & Justice 159 (2009).

46 For discussions of such ‘observer effects’, their immunity to well-intended acts of will, and the pertinent literature, see Risinger, Saks, Thompson and Rosenthal, supra n. 42, at 6–27, 51; D. Michael Risinger, The NAS/NRC Report on Forensic Science: A Glass Nine/Tenths Full (This Is About the Other Tenth), 50 Jurimetrics 21, 22–34 (2009); Itiel E. Dror and Greg Hampikian, Subjectivity and Bias in Forensic DNA Mixture Interpretation, 51 Science & Justice 204 (2011).

47 See Risinger et al., supra n. 42, at 45–47; Dan E. Krane, Simon Ford, Jason R. Gilder, Keith Inman, Allan Jamieson, Roger Koppl, Irving Kornfield, D. Michael Risinger, Norah Rudin, Marc Scott Taylor, and William C. Thompson, Letter to the Editor: Sequential Unmasking: A Means of Minimizing Observer Effects in Forensic DNA Interpretation, 53 J. Forensic Sci. 1006 (2008); William C. Thompson, Simon Ford, Jason R. Gilder, Keith Inman, Allan Jamieson, Roger Koppl, Irving Kornfield, Dan E. Krane, Jennifer L. Mnookin, D. Michael Risinger, Norah Rudin, Michael J. Saks, and Sandy Zabell, Commentary on Thornton re Sequential Unmasking (Letter to the Editor), 55 J. Forensic Sci. 1663 (2010).

48 See discussion in n. 10 supra.

49 These studies show the exquisite unreliability of the attribution of the source of bitemarks on human skin under most conditions that will be met with in practice. See Mary A. Bush, Raymond G. Miller, Peter J. Bush, and Robert B. J. Dorion, Biomechanical Factors in Human Dermal Bitemarks in a Cadaver Model, 54 J. Forensic Sci. 167 (2009); Raymond G. Miller, Peter J. Bush, Robert B. J. Dorion, and Mary A. Bush, Uniqueness of the Dentition as Impressed in Human Skin: A Cadaver Model, 54 J. Forensic Sci. 909 (2009); Mary A. Bush, Kyle Thorsrud, Raymond G. Miller, Robert B. J. Dorion, and Peter J. Bush, The Response of Skin to Applied Stress: Investigation of Bitemark Distortion in a Cadaver Model, 55 J. Forensic Sci. 71 (2010); Mary A. Bush, Howard I. Cooper, and Robert B. J. Dorion, Inquiry into the Scientific Basis for Bitemark Profiling and Arbitrary Distortion Compensation, 55 J. Forensic Sci. 976 (2010); Mary A. Bush, Peter J. Bush, and H. David Sheets, Similarity and Match Rates of the Human Dentition in Three Dimensions: Relevance to Bitemark Analysis, 125 International Journal of Legal Medicine 779 (2010); H. David Sheets and Mary A. Bush, Mathematical Matching of a Dentition to Bitemarks: Use and Evaluation of Affine Methods, 207 Forensic Science International 111 (2011); Mary A. Bush, Peter J. Bush, and H. David Sheets, Statistical Evidence for the Similarity of the Human Dentition, 56 J. Forensic Sci. 118 (2011); H. David Sheets, Peter J. Bush, Cynthia Brzozowski, Lillian A. Nawrocki, Phyllis Ho, and Mary A. Bush, Dental Shape Match Rates in Selected and Orthodontically Treated Populations in New York State: A Two Dimensional Study, 56 J. Forensic Sci. 621 (2011); Mary A. Bush, Peter J. Bush, and H. David Sheets, A Study of Multiple Bitemarks Inflicted in Human Skin by a Single Dentition Using Geometric Morphometric Analysis, 211 Forensic Science International 1 (2011).

50 Steven Lubet, Lawyers’ Poker: 52 Lessons That Lawyers Can Learn from Card Players (Oxford University Press, 2006).

51 Id, at 215–18.

52 That is, to the extent they are still usable after Regina v. T. The chart showing the likelihood ratios associated with each of the six affirmative categories of strength of support for the prosecution’s proposition (weak support, moderate support, moderately strong support, strong support, very strong support and extremely strong support) is given in Association of Forensic Science Providers, Standards for the formulation of evaluative forensic science expert opinion, 49 Sci. and Justice 161, 163 (2009).

53 Id. Each of the first four categories covers likelihood ratios varying by an order of magnitude, and the fifth by two orders of magnitude. This scheme of expressions of ‘estimative probability’, as they are sometimes called, seems to lack actual function, since it seems that the actual likelihood ratios will have to come out on direct or cross examination in order to explain to the jury what is intended by the verbal expressions.
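To make the breadth concrete with an invented pair of figures, on a scale whose categories each span an order of magnitude, as just described,

\[
LR = 110 \quad\text{and}\quad LR = 990
\]

would both be reported in the same verbal category, although the second likelihood ratio represents evidence nine times stronger than the first.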

54 I know that one does not have to be a Bayesian to see the utility of likelihood ratios, and to generate them. But they are a device heavily associated with forensic science Bayesianism. Frankly, I think random match probabilities are sometimes more effective in communicating with ordinary folks than likelihood ratios, since I believe they invite simple set theory explanations that are easier to understand, and which can be graphically represented visually more easily in argument. I find it more congenial when talking to regular people about the implications of a well warranted random match probability to explain what it suggests about how many people would match the criterion in question in a population of a given size—a perpetrator candidate population, if you will. An RMP or similar measure enables that discussion directly in a way that a likelihood ratio does not. Plus, a likelihood ratio can generate its own version of the prosecutor’s fallacy. See Jonathan J. Koehler, On Conveying the Probative Value of DNA Evidence: Frequencies, Likelihood Ratios, and Error Rates, 67 U. Colo. L. Rev. 859, 883 (1996). There has been a great deal of research in the past 25 years aimed at determining how well ordinary people of the sort that end up on juries deal with probabilistic information supplied by experts when combining it with evidence of the more ordinary sort, and how best to communicate the meaning of probabilistic information so that jurors neither overvalue nor undervalue it. That literature is reviewed in McQuiston-Surrett and Saks, supra n 29. Suffice it to say that the sparse research record concerning likelihood ratios as a means of communication does not establish the superiority of likelihood ratios in performing this task either in relation to population incidence numbers derived from formal studies, or in regard to such incidence numbers derived from clinical judgements based on experience (which McQuiston-Surrett and Saks refer to as ‘guesstimates’). But a full airing of these issues is another discussion for a different time.
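The arithmetic I have in mind when talking to regular people is of the following sort; the figures are invented purely for illustration:

\[
\underbrace{500{,}000}_{\text{candidate population}} \times \underbrace{\tfrac{1}{10{,}000}}_{\text{RMP}} = \underbrace{50}_{\text{expected matches}}
\]

That is, a well warranted random match probability of 1 in 10,000, in a perpetrator candidate population of 500,000, suggests that about fifty people would be expected to match the criterion, one of whom is the source; the jury can reason about that set directly.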