Conviction in criminal trials in the USA, the UK and many other common law countries requires establishing a defendant’s guilt beyond a reasonable doubt. By contrast, in Title IX proceedings at American colleges and universities, allegations of wrongdoing are adjudicated according to a much lower ‘preponderance of the evidence’ standard. Victims’ rights advocates correctly argue that a lower burden of proof makes it easier to ensure that the guilty are punished. But there is also a mathematically inevitable corollary: a lower burden of proof increases the probability of concluding that the innocent are guilty. This article provides a framework for using information regarding false conviction probabilities in criminal trials to model the probability of false guilty verdicts in Title IX proceedings in American colleges and universities. The quantitative results presented herein show that an innocent defendant faces a dramatically increased risk of conviction when tried under the preponderance of the evidence standard as opposed to under the beyond a reasonable doubt standard.

1. Introduction

In relation to the U.S. federal statute known as Title IX, there has been dramatic recent growth in what amounts to a parallel justice system at American colleges and universities. While the concept of campus disciplinary proceedings in relation to issues such as academic dishonesty dates back centuries, since 2011 American institutions of higher education have been under a federal mandate relating to Title IX to decide guilt or innocence in relation to accusations of sexual violence and sexual harassment. And, under the same federal mandate, these accusations must be adjudicated not under the ‘beyond a reasonable doubt’ burden of proof used in criminal courts, but instead under the much lower ‘preponderance of the evidence’ standard.

There is a robust policy debate regarding both the merits and the perils of using on-campus tribunals1 to assess guilt or innocence for actions that constitute serious crimes. This article aims to contribute to that debate by examining an issue that is of critical importance in any justice system: the probability of false convictions.

Given the recency and procedural opacity of campus Title IX tribunals, it will be many years, if ever, before a sufficient body of data exists to directly examine the rates of false guilty findings in this context. To help address this gap, this article proposes that information regarding the much better studied issue of false convictions in the traditional criminal justice system can be used, with the aid of an appropriately designed and parameterized probabilistic model, and with suitable adjustments to account for the different burden of proof levels, to make inferences regarding the probabilities of false determinations of guilt in the context of Title IX proceedings.

The approach is based on the premise that in any justice system, when considered in the aggregate over a large number of proceedings, the assessments regarding the probability of guilt reached by triers of fact can be mathematically modelled using random variables.2 More specifically, one random variable can be used to model the decisions of a tribunal with respect to innocent defendants and a different random variable can be used to model the decisions of the tribunal with respect to guilty defendants. The probability distributions associated with these random variables overlap, reflecting the fact that no justice system is perfect: sometimes the guilty are acquitted and sometimes the innocent are convicted. More formally, expressed in the language of hypothesis testing, under a ‘type I’ error, an innocent defendant is wrongly convicted3 and under a ‘type II’ error, a guilty defendant is wrongly declared innocent.4

There is a trade-off between these two types of errors. If the burden of proof necessary to find a defendant guilty is very low, there will be an unacceptably high rate of innocent defendants being found guilty (i.e. too many type I errors). If the burden of proof is made higher, type I errors become less frequent but type II errors become more common. At the hypothetical other end of the spectrum, in a system where guilt can be declared only when there is 100% confidence that the defendant is truly guilty, there will be many type II errors; i.e. many guilty people will be wrongly declared innocent.

The ‘beyond a reasonable doubt’ standard used in criminal trials represents a widely accepted solution that balances this trade-off while also reflecting the belief, long present in English and American law, that to wrongly convict an innocent defendant presents a greater injustice than wrongly acquitting a guilty one.5 The issue of what specific level of confidence to associate with beyond a reasonable doubt has been the subject of long-standing interest in the legal literature. Enquiries on this issue have aimed to answer the following question: when a tribunal is asked to render a decision, what is the confidence threshold regarding the likelihood of guilt that must be exceeded to satisfy ‘beyond a reasonable doubt’?

Studies of the assessed likelihood of guilt associated with beyond a reasonable doubt have often yielded probabilities of approximately 90%. For example, in a 1970s law review article, Simon and Mahan reported results from a survey of judges that yielded a median response of 88% as the requisite probability threshold.6 Simon and Mahan also surveyed prospective jurors and college students and reported median probabilities of 86% and 91% respectively.7 In an early 1980s survey of 171 federal judges by McCauliff, the median probability level associated with beyond a reasonable doubt was 90% and the average was 90.3%.8 Other probability thresholds that have been suggested include 95%9 and ‘well above’ 80%.10

By contrast, the ‘preponderance of the evidence’11 standard used in Title IX proceedings on U.S. campuses is a much lower burden of proof. This is often expressed using the phrase ‘more likely than not’, meaning that a tribunal is instructed to return a guilty finding if its assessment of the likelihood of guilt exceeds 50%. Inevitably, there will be many defendants who would be declared innocent if the evidence is evaluated using a beyond a reasonable doubt standard but found guilty if that same evidence is evaluated using a preponderance of the evidence standard. Consider a tribunal that, after weighing the evidence, believes that there is a 70% likelihood that the defendant is guilty. Given the corresponding 30% likelihood of innocence, the tribunal could not reasonably find the defendant guilty beyond a reasonable doubt. Yet under a preponderance of the evidence standard, the tribunal would declare the defendant to be guilty.

It is a mathematical and logical inevitability that the odds that an innocent defendant will be found guilty are higher under a preponderance of the evidence standard than under a beyond a reasonable doubt standard. This article aims to move beyond this qualitative characterization and offer a framework for quantitatively evaluating the issue of false guilty verdicts under a preponderance of the evidence standard.

The remainder of this article is organized as follows. Section 2 provides background regarding Title IX and on the 2011 letter from the U.S. Department of Education that spurred the growth in on-campus Title IX tribunals and the associated preponderance of the evidence standard. Section 3 describes the probabilistic modelling framework and provides numerical results for several different probability models. Additional discussion and analysis are presented in Section 4 and conclusions are offered in Section 5.

2. Title IX and the preponderance of the evidence standard

Title IX, which was enacted by the U.S. Congress as part of the Education Amendments of 1972,12 provides that ‘No person in the United States shall, on the basis of sex, be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any education program or activity receiving Federal financial assistance.’13 Title IX is extremely broad in scope, impacting all U.S. universities, colleges, and elementary and secondary schools that receive federal funds. During the first several decades after its enactment, much of the attention regarding Title IX focused on its impact on athletic programmes, in particular in relation to the relative participation rates of men and women in sports.

Recent years have seen a significant expansion of the impact of Title IX on U.S. institutions of higher education. In April 2011, the Office for Civil Rights (OCR) within the U.S. Department of Education issued a letter14 stating that ‘[s]exual harassment of students, which includes acts of sexual violence, is a form of sex discrimination prohibited by Title IX.’ The letter also stated that ‘Title IX regulations require all recipients [of federal financial assistance] to adopt and publish grievance procedures providing for the prompt and equitable resolution of sex discrimination complaints.’15

In addition, the OCR letter addressed the issue of burden of proof, stating that ‘in order for a school’s grievance procedures to be consistent with Title IX standards, the school must use a preponderance of the evidence standard (i.e., it is more likely than not that sexual harassment or violence occurred)’.16 Notably, OCR explicitly rejected the ‘clear and convincing’ standard, a burden of proof that lies between preponderance of the evidence and beyond a reasonable doubt, writing that ‘the “clear and convincing” standard … is a higher standard of proof. Grievance procedures that use this higher standard are inconsistent with the standard of proof established for violations of the civil rights laws, and are thus not equitable under Title IX. Therefore, preponderance of the evidence is the appropriate standard for investigating allegations of sexual harassment or violence.’17

There has been vigorous debate regarding both the response of university administrations18 to OCR’s requirements and whether OCR overstepped its authority19 by issuing what amounts to new regulation without the notice and comment opportunities required under the rule-making provisions of the Administrative Procedure Act.20 However, colleges and universities are understandably unwilling to risk jeopardizing their federal funding, and so have treated the OCR letter as if it were binding regulation. As a result, the preponderance of the evidence standard has been very widely adopted on U.S. college and university campuses.

This standard has an enormous potential negative impact on the subset of defendants in Title IX proceedings who are innocent. Wrongly accused defendants face the sobering prospect of having their cases adjudicated under a system in which the threshold for declaring them guilty is only a hair over 50%. This means that even if the tribunal reviewing the evidence concludes there is a 49% chance that a defendant did not engage in the alleged conduct, he or she will be pronounced guilty. In addition to receiving punishment imposed by their educational institution, defendants wrongly found guilty will often be subject to widespread opprobrium on social media, including by people unaware of or uninterested in the fact that a low burden of proof was used in reaching the guilty verdict. Given the high social costs of improper findings of guilt in this context, it is important to understand the statistics that will govern how often they will occur. An additional issue is that a lower burden of proof may increase the likelihood that proceedings will be initiated in the first place. As Allen has written (in an article addressing burdens of proof generally, not in the context of Title IX proceedings), ‘what the standard is affects the decisions that people make about whether to risk trial. If the standard is lowered, prosecutors will have the incentive to bring cases that they would not bring if the standard is higher.’21

3. When an innocent defendant is found guilty: modelling the probabilities

3.1 Framing the probabilities

In any proceeding designed to assess guilt or innocence, there are multiple ways to measure the probabilities associated with false findings of guilt. One possibility is to measure the probability that an innocent defendant will be found guilty. This can be expressed as the conditional probability P(conviction | innocent); i.e. the probability that a person will be convicted given that he or she is innocent. Another way to measure false convictions is to consider the fraction of all convicted persons who are innocent. This can be expressed as P(innocent | conviction); i.e. the probability that a person who has been convicted is innocent. These two probabilities are related through Bayes’ theorem.

For example, consider a group of 100 people accused of independent acts of wrongdoing, 84 of whom are guilty and 16 of whom are innocent. Suppose a tribunal convicts22 76 of the 84 guilty people but acquits the remaining 8. In addition, suppose that the tribunal acquits 12 of the 16 innocent people but erroneously declares 4 of them guilty. Thus, in this example P(conviction | innocent) = 0.25, since 4 of the 16 innocent people were found guilty. In addition, P(innocent | conviction) = 0.05, since 4 of the 80 people found guilty were in fact innocent.23
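The arithmetic in this example can be checked with a few lines of Python. The sketch below uses only the counts given in the text; the variable names are illustrative. It also confirms that the two conditional probabilities are linked by Bayes’ theorem, P(innocent | conviction) = P(conviction | innocent) × P(innocent) / P(conviction).

```python
# Counts taken from the worked example in the text (100 accused people).
guilty_convicted = 76
guilty_acquitted = 8
innocent_convicted = 4
innocent_acquitted = 12

total_accused = (guilty_convicted + guilty_acquitted
                 + innocent_convicted + innocent_acquitted)   # 100
total_innocent = innocent_convicted + innocent_acquitted      # 16
total_convicted = guilty_convicted + innocent_convicted       # 80

# Probability that an innocent defendant is convicted.
p_conviction_given_innocent = innocent_convicted / total_innocent    # 0.25

# Fraction of convicted defendants who are in fact innocent.
p_innocent_given_conviction = innocent_convicted / total_convicted   # 0.05

# The same quantity recovered through Bayes' theorem.
p_innocent = total_innocent / total_accused
p_conviction = total_convicted / total_accused
via_bayes = p_conviction_given_innocent * p_innocent / p_conviction

print(p_conviction_given_innocent, p_innocent_given_conviction, via_bayes)
```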

While both P(conviction | innocent) and P(innocent | conviction) are important metrics, they measure very different things. From the standpoint of someone who has been wrongly accused and is facing a trial, the fact that in the example above P(innocent | conviction) = 0.05—i.e. that ‘only’ 5% of those found guilty are in fact innocent—would furnish little comfort. Instead, the far more pertinent metric is P(conviction | innocent) (25% in the above example), since it is a direct measure of the odds of a miscarriage of justice with respect to a wrongly accused person.

This article focuses primarily on the probability that an accused innocent person will be found guilty, P(conviction | innocent), though through Bayes’ theorem the results presented here could be mapped to P(innocent | conviction). More specifically, this article focuses on modelling the decisions of a tribunal in possession of imperfect information, and tasked with making a determination of the probability that an innocent defendant (whose innocence is unknown to the tribunal) is guilty.

3.2 Modelling tribunal decisions using probability density functions

With respect to any particular defendant, a tribunal is asked to make a binary decision: either the defendant can be declared guilty or the defendant can be declared innocent.24 However, to arrive at that decision the tribunal first assesses guilt along a continuum, with that assessment subsequently mapped to one of the binary outcomes. For example, a jury in a bank robbery trial might collectively conclude that there is a 99% chance that the defendant committed the crime, and, therefore, decide that he is guilty beyond a reasonable doubt. Alternatively, if the same jury instead believes that there is only a 20% chance that the defendant committed the crime, they will return a verdict of innocent. Note that while the above examples cite specific likelihood values, the decision process used by a jury or other tribunal can be probabilistically modelled even if the persons on it never explicitly contemplate or discuss specific numerical probabilities. For instance, the members of a jury might conclude that they are ‘very sure’ that the defendant did indeed rob the bank, and on that basis decide to return a verdict of guilty. By contrast, if the jury concludes that the defendant ‘probably didn’t’ rob the bank, they will know, even if they never make an explicit mathematical comparison, that this does not satisfy beyond a reasonable doubt, and will return a verdict of innocent.

Against this backdrop, a tribunal’s assessments of the probability of guilt can be modelled as realizations of a random variable described by a probability density function $f_X(x)$ that must be zero outside the interval $0 \le x \le 1$, since the range of valid probabilities is bounded (in percentage terms) between 0% and 100%. If the assessed probability exceeds the threshold associated with the applicable burden of proof, the tribunal will return a verdict of guilty; otherwise, the verdict will be innocent.

For Title IX proceedings on U.S. college and university campuses, the preponderance of the evidence standard is used. Under this standard, if the tribunal concludes that the likelihood that the defendant committed the accused act exceeds 50%, the tribunal will declare him or her to be guilty.25 Given this threshold and a probability density function that models decisions regarding innocent defendants, it is straightforward26 to identify the probability that an innocent defendant will be declared guilty.

The same approach can also be used to find the probability that an innocent defendant will be declared guilty under a beyond a reasonable doubt standard. The difference, of course, is that the requisite confidence threshold for a tribunal to declare guilt under that standard is much higher. Consistent with the studies cited earlier, this article will use a beyond a reasonable doubt threshold of 90%,27 though the approach presented here can easily be applied for different threshold levels. All else being equal, the probability of a guilty verdict for an innocent defendant will be much higher under the preponderance of the evidence standard than under the beyond a reasonable doubt standard.
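The decision rule and the resulting probability of a guilty verdict can be sketched as follows. The 90% and 50% thresholds are those used in the text; the function names and the midpoint-rule integration are illustrative choices. The triangular density is a hypothetical stand-in used only to exercise the calculation, not one of the models analysed in this article.

```python
BEYOND_REASONABLE_DOUBT = 0.90   # threshold adopted in the text
PREPONDERANCE = 0.50             # 'more likely than not'


def verdict(assessed_probability, threshold):
    """Map a tribunal's assessed probability of guilt to a binary outcome."""
    return "guilty" if assessed_probability > threshold else "innocent"


def tail_probability(density, threshold, steps=10_000):
    """Midpoint-rule estimate of the integral of `density` from `threshold` to 1.

    When `density` models assessments of innocent defendants, this is the
    probability that an innocent defendant is declared guilty.
    """
    width = (1.0 - threshold) / steps
    total = sum(density(threshold + (i + 0.5) * width) for i in range(steps))
    return total * width


def triangular_density(x):
    """Hypothetical decreasing density on [0, 1]; an assumption for illustration."""
    return 2.0 * (1.0 - x)


# The 70% example from the Introduction: same evidence, different outcomes.
print(verdict(0.70, BEYOND_REASONABLE_DOUBT))   # innocent
print(verdict(0.70, PREPONDERANCE))             # guilty

# Probability of a guilty verdict for an innocent defendant under each standard,
# for the hypothetical density above.
p_brd = tail_probability(triangular_density, BEYOND_REASONABLE_DOUBT)
p_poe = tail_probability(triangular_density, PREPONDERANCE)
print(p_brd, p_poe)
```

For this stand-in density the tail integral has the closed form (1 − t)² for a threshold t, which gives a simple way to check the numerical integration.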

3.3 Comparing false convictions across different burden of proof levels

This framework makes it possible to relate the probabilities of wrongful conviction under different burden of proof levels. In other words, given a particular probability that an innocent defendant will be found guilty under the beyond a reasonable doubt standard, the model here allows computation of the corresponding probability of finding an innocent defendant guilty under the preponderance of the evidence standard.

This is useful because while there is very little information regarding the statistics of false findings of guilt in campus Title IX proceedings, the issue of false convictions in criminal trials has been a topic of long-standing interest and research.28 False convictions are in turn a subset of a broader class of problematic outcomes that can occur in adjudicative proceedings. In his 2004 book titled Errors of Justice, Forst explained that an ‘error of justice is any departure from an optimal outcome of justice for a criminal case’.29 This includes errors of impunity (i.e. when a criminal escapes conviction) as well as errors of due process (which include cases when an innocent person is convicted).30

Despite the requirement of proof ‘beyond a reasonable doubt’ in criminal trials, false convictions appear to occur with alarming frequency. In a widely cited 2014 paper published in the Proceedings of the National Academy of Sciences, Gross et al. reported on a study of exonerations among death-sentenced defendants in the USA over a period of three decades and found that ‘if all death-sentenced defendants remained under sentence of death indefinitely, at least 4.1% would be exonerated. We conclude that this is a conservative estimate of the proportion of false conviction among death sentences in the United States.’31

The 4.1% figure is likely conservative in another sense as well, since it arises from a study focused only on death-sentenced defendants. Among those defendants who were found guilty of capital crimes but not sentenced to death, as Gross et al. noted, ‘the rate of innocence must be higher’.32 Among the much broader set of people who have been convicted for any crime at all, the rate could be even higher.

It is also important to note that Gross et al. were examining the fraction of innocent persons among those who had been convicted, and further, sentenced to death. To convert this using Bayes’ theorem to the probability that an innocent person, once charged with capital murder, would be convicted and sentenced to death, would require knowledge of conviction rates (among all capital murder defendants, both guilty and innocent), and further, the fraction of those convicted who are sentenced to death. It would also require knowledge of the fraction of defendants who were innocent.

There is some published information regarding conviction rates more broadly. For example, one source from the U.S. Department of Justice gives a 70% conviction rate for murder (including but not limited to capital murder) defendants whose cases were adjudicated in a 1-year tracking period.33 In addition, Gross et al. note examples of U.S. states where between 29% and 49% of those convicted of capital murder are actually sentenced to death. In combination, this suggests that at least in approximate terms, the probability that an innocent defendant, once charged with capital murder, will be found guilty and also sentenced to death is likely to be in the range of several percent or higher.34 Of course, the overwhelming majority of crimes are not capital murder. For prosecutions of more general crimes, the probability that an innocent person will be found guilty may be higher than what application (through Bayes’ theorem) of the results from Gross et al. might suggest because, among other reasons, non-capital crimes do not involve an additional determination regarding whether to apply the death sentence.
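The Bayes’ theorem conversion described above can be sketched in code. The 4.1%, 70% and 29% to 49% figures are those quoted in the text, with the caveats already noted: the 4.1% figure is a lower-bound estimate and the 70% figure covers murder generally rather than capital murder alone. The fraction of charged defendants who are innocent is not known. The values swept below are purely hypothetical assumptions, chosen only to show how the result depends on that fraction.

```python
def p_sentenced_given_innocent(p_innocent_given_sentenced,
                               p_conviction,
                               p_death_given_conviction,
                               p_innocent):
    """P(convicted and death-sentenced | innocent) via Bayes' theorem."""
    # Probability that a charged defendant is convicted and then sentenced to death.
    p_sentenced = p_conviction * p_death_given_conviction
    return p_innocent_given_sentenced * p_sentenced / p_innocent


P_INNOCENT_GIVEN_SENTENCED = 0.041   # Gross et al. estimate quoted in the text
P_CONVICTION = 0.70                  # murder conviction rate quoted in the text
DEATH_SENTENCE_RATES = (0.29, 0.49)  # range of state examples quoted in the text

# HYPOTHETICAL innocent fractions among charged defendants (assumptions, not data).
ASSUMED_INNOCENT_FRACTIONS = (0.10, 0.20, 0.30)

for p_innocent in ASSUMED_INNOCENT_FRACTIONS:
    for p_death in DEATH_SENTENCE_RATES:
        result = p_sentenced_given_innocent(
            P_INNOCENT_GIVEN_SENTENCED, P_CONVICTION, p_death, p_innocent)
        print(f"innocent fraction {p_innocent:.2f}, "
              f"death-sentence rate {p_death:.2f}: {result:.3f}")
```

Because the innocent fraction appears in the denominator, smaller assumed fractions produce larger conditional probabilities. The output is therefore an illustration of sensitivity to that assumption rather than an estimate.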

More broadly, it is not necessary for the purposes of the present article to identify the specific probability that an innocent but nonetheless criminally charged defendant will be found guilty under the beyond a reasonable doubt standard. Rather, the goal of the foregoing discussion is to show generally that numbers in the range of several percent are not only reasonable but in fact likely conservative. Thus, the analysis below asks and answers two questions. First, if the probability that an innocent person, once charged, will be found guilty under the beyond a reasonable doubt standard is 4%, then, all else being equal, what is the probability of such a person being found guilty under the preponderance of the evidence standard? Secondly, if the probability that an innocent person, once charged, will be found guilty under the beyond a reasonable doubt standard is 1%, then, all else being equal, what is the probability of such a person being found guilty under the preponderance of the evidence standard? We will consider these questions for several different probability models.

3.4 Results for various probability models

The above approach admits an arbitrary choice of probability density functions. However, the range of choices can be narrowed by assuming that the probability density function, which as noted above can be non-zero only in the range $0 \le x \le 1$, is monotonically decreasing within that range. This reflects the fact that on average a tribunal considering evidence involving an innocent defendant should be more likely to conclude that there is a lower probability of guilt than a higher probability of guilt.35

The first function considered is a truncated normal distribution (also called a truncated Gaussian distribution).36 If the truncated normal distribution is placed with its maximum37 at x = 0, then there is only one further degree of freedom. That degree of freedom is removed when a particular probability of conviction under the beyond a reasonable doubt standard is specified.38 Figure 1a shows a truncated normal distribution for modelling a tribunal’s assessment of the likelihood of guilt of an innocent defendant. The distribution is parameterized39 so that there is a 4% probability that a tribunal using a beyond a reasonable doubt threshold of 90% will return a guilty verdict, as shown by the shaded region in the figure spanning the horizontal axis from 0.9 to 1. Figure 1b shows the same curve as in Fig. 1a but shaded to show the range of assessed likelihoods of guilt (i.e. those exceeding 50%) that would lead to a guilty verdict under the preponderance of the evidence standard. Under this model, when the preponderance of the evidence standard is used to judge an innocent defendant, a guilty verdict would be returned with probability 33%.
Fig. 1. A probability density function in accordance with the truncated normal distribution model. As shown by the shaded regions, an innocent person facing a 4% probability of being found guilty under the beyond a reasonable doubt standard would face a 33% probability of being found guilty under a preponderance of the evidence standard. (a) Beyond a reasonable doubt. (b) Preponderance of the evidence.

Figure 2a illustrates a truncated normal distribution parameterized40 so that there is a 1% probability that a tribunal using a beyond a reasonable doubt threshold of 90% will return a guilty verdict despite the defendant’s innocence. As shown in Fig. 2b, the corresponding probability of a guilty verdict under the preponderance of the evidence standard is 19%.
Fig. 2. A probability density function in accordance with the truncated normal distribution model. As shown by the shaded regions, an innocent person facing a 1% probability of being found guilty under the beyond a reasonable doubt standard would face a 19% probability of being found guilty under a preponderance of the evidence standard. (a) Beyond a reasonable doubt. (b) Preponderance of the evidence.
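The truncated normal calculation can be approximated with the Python standard library. This is a sketch under stated assumptions: the density is taken to be a zero-mean normal restricted to the interval from 0 to 1 and renormalized, and its scale parameter `sigma` is found by bisection so that the area above 0.9 matches the specified probability. The function names, search bounds and bisection approach are illustrative and may differ from the parameterization actually used to produce the figures.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_tail(threshold, sigma):
    """P(X > threshold) for a zero-mean normal with scale sigma truncated to [0, 1]."""
    upper = normal_cdf(1.0 / sigma)
    return (upper - normal_cdf(threshold / sigma)) / (upper - 0.5)


def solve_sigma(target, threshold=0.9, lo=0.05, hi=5.0, iterations=200):
    """Bisection for the sigma at which P(X > threshold) equals target.

    Relies on the tail probability increasing with sigma over [lo, hi].
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if truncated_normal_tail(threshold, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for p_brd in (0.04, 0.01):
    sigma = solve_sigma(p_brd)
    p_poe = truncated_normal_tail(0.5, sigma)
    print(f"beyond a reasonable doubt {p_brd:.0%} -> "
          f"preponderance of the evidence {p_poe:.1%} (sigma = {sigma:.3f})")
```

Under these assumptions the computed preponderance probabilities should land close to the 33% and 19% values reported for Figs 1 and 2.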

Another possible model is a truncated one-sided exponential distribution.41 This distribution is shown in Fig. 3a parameterized42 so that there is a 4% probability that a tribunal using a beyond a reasonable doubt threshold of 90% will return a guilty verdict. Figure 3b shows the same curve as in Fig. 3a, but with the shading corresponding to a guilty finding of an innocent defendant when the preponderance of the evidence standard is used. As indicated in Fig. 3b, the resulting probability is 29%. It is also possible to generate a truncated exponential distribution (plot not shown) where there is a 1% probability that a tribunal using a beyond a reasonable doubt threshold of 90% will return a guilty verdict, and then to compute the resulting probability of a guilty verdict (which is 13%) under the preponderance of the evidence standard.
Fig. 3. A probability density function in accordance with the truncated exponential model. As shown by the shaded regions, an innocent person facing a 4% probability of being found guilty under the beyond a reasonable doubt standard would face a 29% probability of being found guilty under a preponderance of the evidence standard. (a) Beyond a reasonable doubt. (b) Preponderance of the evidence.
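A similar sketch applies to the truncated exponential model. It assumes a density proportional to exp(−λx) on the interval from 0 to 1, for which the area above a threshold t has the closed form (exp(−λt) − exp(−λ)) / (1 − exp(−λ)). The rate parameter, called `rate` here, is found by bisection; the names and search bounds are illustrative assumptions.

```python
import math


def truncated_exponential_tail(threshold, rate):
    """P(X > threshold) for a density proportional to exp(-rate * x) on [0, 1]."""
    return ((math.exp(-rate * threshold) - math.exp(-rate))
            / (1.0 - math.exp(-rate)))


def solve_rate(target, threshold=0.9, lo=1e-6, hi=50.0, iterations=200):
    """Bisection for the rate at which P(X > threshold) equals target.

    Relies on the tail probability decreasing as the rate increases.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if truncated_exponential_tail(threshold, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for p_brd in (0.04, 0.01):
    rate = solve_rate(p_brd)
    p_poe = truncated_exponential_tail(0.5, rate)
    print(f"beyond a reasonable doubt {p_brd:.0%} -> "
          f"preponderance of the evidence {p_poe:.1%} (rate = {rate:.3f})")
```

Under these assumptions the results should be close to the 29% and 13% values given in the text.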

It is also interesting to examine the most conservative model possible in terms of minimizing the difference between the conviction probabilities for the two burden of proof levels under consideration here. Put another way, this conservative model, while unrealistic, represents the answer to the question: given an innocent defendant and a particular probability of conviction of that defendant under the beyond a reasonable doubt standard, what is the form of the probability density function that will minimize the probability of conviction of that defendant under the preponderance of the evidence standard? This model is shown in Fig. 4a for the case where an innocent defendant stands a 4% probability of being convicted under the beyond a reasonable doubt standard. Figure 4b shows the same distribution, shaded to show the probability of conviction if the preponderance of the evidence standard is used.
Fig. 4. A probability density function in accordance with the most conservative possible model. As shown by the shaded regions, an innocent person facing a 4% probability of being found guilty under the beyond a reasonable doubt standard would face a 20% probability of being found guilty under a preponderance of the evidence standard. (a) Beyond a reasonable doubt. (b) Preponderance of the evidence.

As the figure shows, this minimizes the area under the distribution in the range between 0.5 and 0.9.43 It should be emphasized that while this is clearly an unrealistic model (among other reasons, because of the sharp drop-off at 0.5 on the horizontal axis), it is useful because, given the other constraints, it provides a lower bound on the probability of conviction under the preponderance of the evidence standard.
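The lower bound follows from the monotonicity assumption: a non-increasing density cannot be lower anywhere between 0.5 and 0.9 than its average height between 0.9 and 1, so the minimizing density is flat from 0.5 to 1. The sketch below encodes that reasoning. The function name and the feasibility check are illustrative additions.

```python
def conservative_preponderance_probability(p_brd,
                                           brd_threshold=0.9,
                                           poe_threshold=0.5):
    """Smallest P(X > poe_threshold) for a non-increasing density on [0, 1]
    whose area above brd_threshold equals p_brd."""
    # Average density height needed on [brd_threshold, 1] to enclose area p_brd.
    height = p_brd / (1.0 - brd_threshold)
    # The flat density extends down to poe_threshold at the same height.
    p_poe = height * (1.0 - poe_threshold)
    # Feasibility: the remaining mass, spread over [0, poe_threshold), must sit
    # at least as high as the flat section for the density to be non-increasing.
    if (1.0 - p_poe) / poe_threshold < height:
        raise ValueError("no non-increasing density satisfies these constraints")
    return p_poe


for p_brd in (0.04, 0.01):
    p_poe = conservative_preponderance_probability(p_brd)
    print(f"beyond a reasonable doubt {p_brd:.0%} -> "
          f"preponderance of the evidence at least {p_poe:.0%}")
```

With thresholds of 0.9 and 0.5 the bound is five times the beyond a reasonable doubt probability, which reproduces the 20% value shown in Fig. 4.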

The results from the above examples44 can be summarized in Table 1.

Table 1 Probabilities of conviction of an innocent defendant under the preponderance of the evidence standard (bottom row) as a function of assumed probabilities of conviction of an innocent defendant under the beyond a reasonable doubt standard (top row)

| | Model: truncated normal distribution (%) | Model: truncated normal distribution (%) | Model: truncated exponential distribution (%) | Model: truncated exponential distribution (%) | Model: conservative (%) | Model: conservative (%) |
| --- | --- | --- | --- | --- | --- | --- |
| If the probability that an innocent person, once charged, will be found guilty under the beyond a reasonable doubt standard is: | 4 | 1 | 4 | 1 | 4 | 1 |
| Then the probability that an innocent person, once charged, will be found guilty under the preponderance of the evidence standard is: | 33 | 19 | 29 | 13 | 20 | 5 |

4. Discussion

One of the most interesting aspects of the results in Table 1 is the relatively limited impact of the choice of model on the outcome. For example, for the case where conviction under the beyond a reasonable doubt standard occurs with 4% probability, the truncated normal and exponential models provide conviction probabilities under the preponderance of the evidence standard of 33% and 29%, respectively, a difference of only four percentage points despite the significant differences between those two models.45 Even when the conviction probability is minimized using the conservative model, the result is a still-sobering 20%. For the case where conviction under the beyond a reasonable doubt standard occurs with 1% probability, the differences among the models are wider but still not enormous, with the truncated normal and exponential models providing conviction probabilities under the preponderance of the evidence standard of 19% and 13%, respectively.
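The Table 1 entries can be approximately reproduced with a short script. The following is a minimal sketch, not the author's code: it assumes the 0.9 and 0.5 thresholds used in the examples, uses the closed-form tail areas of the truncated normal and truncated exponential pdfs given in the footnotes, and fits each model's single parameter by bisection. All function and variable names are hypothetical.

```python
import math

BRD_THRESHOLD = 0.9  # beyond a reasonable doubt, as assumed in the examples
POE_THRESHOLD = 0.5  # preponderance of the evidence


def tail_normal(t, gamma):
    """P(x > t) for a zero-mean normal with scale gamma, truncated to [0, 1]."""
    scaled_erf = lambda x: math.erf(x / (gamma * math.sqrt(2.0)))
    return (scaled_erf(1.0) - scaled_erf(t)) / scaled_erf(1.0)


def tail_exponential(t, lam):
    """P(x > t) for an exponential with rate lam, truncated to [0, 1]."""
    return (math.exp(-lam * t) - math.exp(-lam)) / (1.0 - math.exp(-lam))


def solve(f, target, lo, hi, iters=100):
    """Bisection for f(param) == target; assumes f is monotone on [lo, hi]."""
    f_lo = f(lo) - target
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid) - target
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid  # root lies in the lower half
    return 0.5 * (lo + hi)


def fit_parameters(p_brd):
    """Parameters (gamma, lam) that place area p_brd above the 0.9 threshold."""
    gamma = solve(lambda g: tail_normal(BRD_THRESHOLD, g), p_brd, 0.05, 5.0)
    lam = solve(lambda l: tail_exponential(BRD_THRESHOLD, l), p_brd, 0.01, 50.0)
    return gamma, lam


def table_1_column(p_brd):
    """False-conviction probability above the 0.5 threshold for each model."""
    gamma, lam = fit_parameters(p_brd)
    return {
        "truncated normal": tail_normal(POE_THRESHOLD, gamma),
        "truncated exponential": tail_exponential(POE_THRESHOLD, lam),
        # Flat pdf between 0.5 and 1: area scales with interval width.
        "conservative": p_brd * (1.0 - POE_THRESHOLD) / (1.0 - BRD_THRESHOLD),
    }


if __name__ == "__main__":
    for p_brd in (0.04, 0.01):
        for model, p_poe in table_1_column(p_brd).items():
            print(f"BRD {p_brd:.0%}  {model:<22} POE {p_poe:.1%}")
```

Under these assumptions the fitted parameters for the 4% case should land close to the γ and λ values reported in the footnotes, and the printed probabilities should agree with Table 1 to within rounding.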

It is also interesting to use the numbers in the table to form ratios indicating how much higher the risk of an improper guilty verdict is under the preponderance of the evidence standard than under the beyond a reasonable doubt standard. Under the (unrealistic) conservative model (the right two columns in the table), that ratio is five, i.e. for that model, moving from beyond a reasonable doubt to preponderance of the evidence increases the probability of finding an innocent defendant guilty by a factor of five.46 For the more realistic truncated normal and exponential models, the ratios are higher. Among the examples discussed here, the maximum ratio occurs for the truncated normal model if conviction under the beyond a reasonable doubt standard occurs with 1% probability: in that case, moving to a preponderance of the evidence standard would multiply the risk of finding an innocent defendant guilty by a factor of 19. The inescapable conclusion—which is not at all unexpected but is borne out here in stark quantitative terms—is that the preponderance of the evidence standard places innocent defendants at dramatically greater risk of improper guilty findings.
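The ratios for every cell of Table 1 follow from simple division. The snippet below is an illustrative sketch using the rounded percentages from the table; the dictionary layout and function name are hypothetical.

```python
# Rounded Table 1 values: {model: {BRD probability (%): POE probability (%)}}
TABLE_1 = {
    "truncated normal": {4: 33, 1: 19},
    "truncated exponential": {4: 29, 1: 13},
    "conservative": {4: 20, 1: 5},
}


def risk_ratios(table):
    """Ratio of the POE false-conviction probability to the BRD one, per cell."""
    return {
        model: {brd: poe / brd for brd, poe in column.items()}
        for model, column in table.items()
    }


if __name__ == "__main__":
    for model, ratios in risk_ratios(TABLE_1).items():
        for brd, ratio in ratios.items():
            print(f"{model:<22} BRD {brd}%: risk multiplied by {ratio:g}")
```

Because the inputs are rounded, the ratios are approximate: 8.25 and 19 for the truncated normal model, 7.25 and 13 for the truncated exponential model, and 5 in both conservative cases.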

It is important to address some counterarguments that could be raised with respect to the approach used in this article. One counterargument would be to assert that comparing criminal trials under a beyond a reasonable doubt standard to Title IX proceedings at U.S. universities under a preponderance of the evidence standard is an apples to oranges comparison. This is a true statement, but not in a way that favours the counterargument. As multiple legal experts have pointed out,47 campus Title IX proceedings often lack many of the most basic features of due process that help ensure the rights of the accused in criminal trials. Thus, using the same probability model for both settings is in at least this respect unfairly generous to Title IX proceedings. Put another way, innocent defendants in Title IX proceedings may be even more exposed to guilty verdicts than the approach used here suggests.

Another potential counterargument could question the choice of probability models (the truncated normal and exponential distributions) used in this article. The response to this is twofold. First, the truncated normal and exponential distributions are well-known models applicable to a broad range of phenomena. While we do not have enough information to know the ‘true’ model that characterizes tribunal decisions regarding guilt (or whether there is even any single model that would suffice), considering both truncated normal and exponential models provides insight into how different models impact the risk of false conviction.48

Second, as the values in Table 1 show, among the more realistic models—and thus excluding the conservative model shown in Fig. 4—there is surprisingly little variation in the results. In other words, while the choice of probability model matters, it does not alter the overarching conclusions regarding the dramatically increased exposure faced by innocent defendants when a preponderance of the evidence standard is used instead of beyond a reasonable doubt.

An additional counterargument would question the choice in the examples above to use 4% and 1% for the probabilities of conviction of innocent defendants in criminal trials under the beyond a reasonable doubt standard. In response, it should be emphasized that the analysis could just as easily be performed for different rates of conviction of innocent defendants under a beyond a reasonable doubt standard. Also, given the discussion in Section 3.3 regarding applying Bayes’ theorem in light of the conclusions of Gross et al., the true rate of convictions for innocent criminal defendants is very unlikely to be ‘lower’ than 1%. In short, the choice to consider 1% is conservative. After all, it would strain credulity to suggest that verdicts in the criminal justice system are so accurate that fewer than 1 out of every 100 innocent people subjected to criminal trials are found guilty. Additionally considering 4% has the dual advantage of being more likely to be realistic, and also, when combined with the 1% results, enabling insight into how changes in the rate of improper findings of guilt under a beyond a reasonable doubt standard correlate to the rate of improper findings of guilt under the preponderance of the evidence standard. Finally, it is important to emphasize that this article presents a generalized framework that is not tied to the specific probability models (truncated normal and exponential) or false beyond a reasonable doubt conviction probabilities (1% and 4%) explored here. Once this framework is created, it is straightforward to apply it in many different ways.
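To illustrate that generality, the sketch below applies the same two steps (fit a one-parameter pdf to an assumed false-conviction rate at the 0.9 threshold, then read off the area above 0.5) to an arbitrary density using numerical integration. The power-law family used here is a hypothetical example chosen only because it is monotonically decreasing; it is not one of the models analysed in this article, and all names are assumptions.

```python
def tail_mass(pdf, threshold, n=4000):
    """Share of the area under pdf on [0, 1] lying above threshold (midpoint rule)."""
    h = 1.0 / n
    total = above = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        weight = pdf(x)
        total += weight  # normalization, so pdf need not integrate to 1
        if x > threshold:
            above += weight
    return above / total


def fit_and_transfer(family, p_brd, lo, hi, brd=0.9, poe=0.5, iters=40):
    """Fit the family's parameter so the area above brd equals p_brd,
    then return (parameter, area above poe). Assumes monotone dependence."""
    def gap(theta):
        return tail_mass(lambda x: family(x, theta), brd) - p_brd

    gap_lo = gap(lo)
    for _ in range(iters):  # plain bisection on the parameter
        mid = 0.5 * (lo + hi)
        gap_mid = gap(mid)
        if (gap_mid > 0) == (gap_lo > 0):
            lo, gap_lo = mid, gap_mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    return theta, tail_mass(lambda x: family(x, theta), poe)


def power_law(x, k):
    """Hypothetical monotonically decreasing family, proportional to (1 - x)**k."""
    return (1.0 - x) ** k


if __name__ == "__main__":
    for p_brd in (0.005, 0.01, 0.02, 0.04):
        k, p_poe = fit_and_transfer(power_law, p_brd, 0.0, 20.0)
        print(f"BRD {p_brd:.1%} -> k = {k:.3f}, POE {p_poe:.1%}")
```

Any other rate, threshold pair, or one-parameter density could be substituted, provided the density is defined on the interval from 0 to 1 and the fitted tail area varies monotonically with the parameter.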

A final, overarching potential counterargument would hold that the very idea of using probabilistic concepts with respect to burdens of proof is problematic. There are plenty of examples of statements on both sides of this argument. After all, it is impossible to reconcile the position of a U.S. federal appeals court judge who wrote:

All burdens of persuasion deal with probabilities. The preponderance standard is a more-likely-than-not rule, under which the trier of fact rules for the plaintiff if it thinks the chance greater than 0.5 that the plaintiff is in the right. The reasonable doubt standard is much higher, perhaps 0.9 or better.49

with that of another U.S. federal appeals court judge, who wrote ‘I believe that the entire effort to quantify the standard of proof beyond a reasonable doubt is a search for fool’s gold.’50 Similarly, while one can find a law review article arguing that decisions in the legal system involve ‘an evidence threshold, denoted here by x_T, which indicates the value of x above which liability will be assigned and below which there is no liability’,51 one can also find law review articles stating that there are ‘serious and fundamental impediments to scholars hoping to articulate a probabilistic theory of evidence’,52 and that ‘[m]ost evidence scholars believe that adjudicative factfinding is fundamentally incompatible with mathematical probability.’53

It is well beyond the scope of this article to attempt to resolve the decades-old dispute over the best role for probability theory in modelling legal proceedings. Rather, the approach in this article is premised on the belief that, as imperfect as probabilistic models may be in the legal context, they nonetheless can be highly instructive in illustrating—or at least in framing—the quantitative consequences of decisions regarding which burden of proof standard to apply.

5. Conclusions

This article has presented a framework for calculating the risk that an innocent defendant, when subjected to a judicial proceeding using the preponderance of the evidence standard, will be found guilty. This is a particularly critical issue in light of the significant growth on U.S. college and university campuses of Title IX proceedings which, due to a 2011 mandate from the U.S. Department of Education, must be conducted using the preponderance of the evidence standard. Even under the most conservative mathematical assumptions possible under the framework and examples presented herein, this article has demonstrated that an innocent defendant faces a five times higher risk of being wrongly found guilty under a preponderance of the evidence standard than under a beyond a reasonable doubt standard. Under many circumstances, including the more realistic (relative to the ‘conservative’ model) probability models explored here, the relative risk would be even higher, often by a very large margin. While the examples presented in this article used truncated normal and exponential distributions, the framework is general and can be applied to any probability model.

In any justice system, including the systems that U.S. colleges and universities have created in response to recent interpretations of the requirements of Title IX, the pool of defendants will include both those who are guilty and those who are innocent. Victims’ rights advocates correctly argue that a lower burden of proof makes it easier to ensure that the guilty are punished.54 Without in any way diminishing the importance of holding the guilty to account, a holistic understanding of the social impacts of these on-campus justice systems requires exploring not only their effectiveness in punishing the guilty but also the risks faced by those who are innocent but nonetheless find themselves subjected to proceedings to establish guilt. It should come as no surprise that when the burden of proof is preponderance of the evidence, the risks faced by innocent defendants will be substantial. This article has demonstrated in quantitative terms just how substantial those risks can be.

1 These on-campus tribunals are convened in addition to, not in place of, any proceedings in the traditional criminal justice system.

2 The use of probabilistic concepts in association with decisions in a legal context is not new, and there is a wide range of views on when and how those concepts should be applied. See e.g. Allen, R. J. & Stein, A. (2013) Evidence, probability, and the burden of proof. Arizona Law Review, 55, 557 and the references cited therein; and Schwartz, D. L. & Seaman, C. B. (2013) Standards of proof in civil litigation: an experiment from patent law. Harvard Journal of Law and Technology, 26, 429 and the references cited therein.

3 In this context, a type I error occurs when the null hypothesis (i.e. that the defendant is innocent) is incorrectly rejected.

4 In this context, a type II error occurs when the null hypothesis (i.e. that the defendant is innocent) should be rejected, but is not.

5 Despite Blackstone’s oft-quoted statement ‘[B]etter that ten guilty persons escape, than that one innocent suffer’ (Blackstone, W. (1769) Commentaries on the laws of England, Book the fourth, Clarendon Press, Oxford, p. 352), views on exactly how ‘much’ worse it is to convict an innocent person than to acquit a guilty person have ranged widely. See e.g. Volokh, A. (1997) n Guilty Men. University of Pennsylvania Law Review, 146, 173.

6 Simon, R. J. & Mahan, L. (1971) Quantifying burdens of proof: a view from the bench, the jury, and the classroom. Law and Society Review, 5, 319, 324.

7 Ibid.

8 See McCauliff, C. M. A. (1982) Burdens of proof: degrees of belief, quanta of evidence, or constitutional guarantees? Vanderbilt Law Review, 35, 1293, 1325. McCauliff did not report a median or an average; the respective values of 90% and 90.3% provided herein were computed using the data in the table on p. 1325 of McCauliff.

9 See e.g. Weinstein, J. B. & Dewsbury, I. (2006) Comment on the meaning of ‘proof beyond a reasonable doubt’. Law, Probability, and Risk, 5, 167, 169 (stating ‘[w]e personally favour burden of proof in the realm of 95+% probability of guilt’).

10 See Franklin, J. (2006) Case comment—United States v. Copeland, 369 F. Supp. 2d 275 (E.D.N.Y. 2005): quantification of the ‘proof beyond reasonable doubt’ standard. Law, Probability, and Risk, 5, 159, 165 (stating ‘proof beyond reasonable doubt means “well above a probability of 0.8”. Any suggestion from a jury that 0.8 or less is adequate can be ruled out, while the qualification ‘well above’ will avoid any suggestions that something just above 0.8 is in fact adequate, and will not obstruct any later attempts to quantify the standard more exactly.’).

11 The preponderance of the evidence standard is widely used in civil litigation. However, in this article, the focus is on the use of that standard to make determinations regarding guilt in the context of campus Title IX tribunals.

12 Pub. L. No. 92-318, 86 Stat. 235 (1972), codified at 20 U.S.C. §§ 1681–1688. The associated implementing regulations are at 34 C.F.R. § 106.

13 20 U.S.C. § 1681(a).

14 U.S. Dept. of Education, OCR (4 April 2011) Dear Colleague letter, http://www2.ed.gov/about/offices/list/ocr/letters/colleague-201104.pdf [accessed 14 June 2016].

15 34 C.F.R. § 106.8(b) provides that ‘A recipient shall adopt and publish grievance procedures providing for prompt and equitable resolution of student and employee complaints alleging any action which would be prohibited by this part.’

16 Dear Colleague letter, supra n. 14 at 11, parentheses in original.

17 Ibid.

18 See e.g. the 2014 letter from 28 Harvard Law School faculty members asserting that ‘Harvard has adopted procedures for deciding cases of alleged sexual misconduct which lack the most basic elements of fairness and due process, are overwhelmingly stacked against the accused, and are in no way required by Title IX law or regulation.’ Bartolet, E. et al. (15 October 2014) Rethink Harvard’s Sexual Harassment Policy. The Boston Globe, https://www.bostonglobe.com/opinion/2014/10/14/rethink-harvard-sexual-harassment-policy/HFDDiZN7nU2UwuUuWMnqbM/story.html. See also Cohn, J. (1 October 2012) Campus Is a Poor Court for Students Facing Sexual-Misconduct Charges. Chronicle of Higher Education, http://chronicle.com/article/Campus-Is-a-Poor-Court-for/134770/ (stating that under the preponderance of the evidence standard required by OCR, ‘without any of the safeguards designed to increase the reliability and fairness of civil trials, the risk of erroneous findings of guilt increases substantially, especially when a fact finder is asked to decide only if it is merely 50.01 percent more likely that a sexual assault occurred’). See also the counterpoint raised in Hogshead-Makar, N. & Sokolow, B. A. (15 October 2012) Setting a Realistic Standard of Proof in Sexual-Misconduct Cases, Chronicle of Higher Education, http://chronicle.com/article/Setting-a-Realistic-Standard/135084/ (arguing that the preponderance of the evidence standard ‘is the only standard that is equally fair to men and women’).

19 See e.g. the 7 January 2016 letter from Senator James Lankford (Chair of the Subcommittee on Regulatory Affairs and Federal Management, Committee on Homeland Security and Governmental Affairs) to Acting Secretary John B. King, Jr. of the U.S. Department of Education, expressing ‘continued alarm regarding’ OCR’s Dear Colleague letters of 23 October 2010 (on harassment and bullying) and 4 April 2011 (on sexual violence). https://www.scribd.com/doc/294821262/Sen-Lankford-letter-to-Education-Department [accessed 14 June 2016].

20 Pub. L. No. 79-404, 60 Stat. 237 (1946). Rule-making under the Administrative Procedure Act is codified at 5 U.S.C. § 553.

21 Allen, R. J. (2014) Burdens of proof. Law, Probability, and Risk, 13, 195, 212.

22 The terms ‘convict’ and ‘conviction’ herein are used generically to describe the outcome when a tribunal concludes that a defendant is guilty. When discussing Title IX proceedings specifically, phrases such as ‘find guilty’ will be used as the persons conducting those proceedings are not empowered to criminally convict defendants.

23 As noted in the text, P(conviction | innocent) and P(innocent | conviction) are related through Bayes’ theorem. In this example, the probability of conviction P(conviction) is 0.8 (since 80 out of the 100 defendants were convicted) and the probability of being innocent is 0.16 (since 16 of the 100 defendants are innocent). Under Bayes’ theorem, $P(\text{conviction} \mid \text{innocent}) = \frac{P(\text{innocent} \mid \text{conviction})\,P(\text{conviction})}{P(\text{innocent})} = \frac{(0.05)(0.8)}{0.16} = 0.25$.

24 Of course, there is also the possibility of a hung jury, or its equivalent, in which no decision is reached. However, in this article we focus on the situation in which a finding regarding guilt or innocence is rendered. In addition, this article focuses on the assessment of guilt as a single decision—i.e. whether the defendant is, or is not, guilty. This single decision can in some instances correspond to an aggregation of multiple sub-decisions (e.g. a defendant is found guilty only if he or she is found to have committed both of acts ‘A’ and ‘B’).

25 In general, it would theoretically be possible to argue that ‘preponderance of the evidence’ might be interpreted to require a threshold other than 50% (e.g. 60%). However, in the context of Title IX proceedings, the explicit instruction to assess whether ‘it is more likely than not that sexual harassment or violence occurred’ (see Dear Colleague letter, supra n. 14 at 11) constitutes a de facto instruction to return a guilty verdict if the defendant is deemed more than 50% likely to have committed the accused act.

26 More specifically, computing the probability that an innocent defendant will be found guilty requires finding the area under the pdf in the range 0.5 < x ≤ 1.

27 Under this threshold for beyond a reasonable doubt, computing the probability that an innocent defendant will be declared guilty requires finding the area under the pdf in the range 0.9 < x ≤ 1.

28 See e.g. Garrett, B.L. (2012) Convicting the Innocent: Where Criminal Prosecutions Go Wrong, Harvard University Press, Cambridge, MA.

29 Forst, B. (2004) Errors of Justice: Nature, Sources, and Remedies, Cambridge University Press, New York. The quoted sentence is on p. 4. In addition, in chapter 6 (pp. 57–65) of Errors of Justice, Forst provides an analysis that is both relevant and complementary to the approach presented in the present article. Forst provides a series of tables and figures exploring the impact of conviction rates on groups of hypothetical defendants in which various percentages (70, 80 and 90%) of those defendants are truly guilty. Among other things, the figures and tables explore the number of ‘offenders freed per innocent convicted’ (figures 6.1a and 6.1b) as well as the ‘number of offenders acquitted per innocent person found guilty’ (table 6.5). By contrast, the present article presents a general framework based on underlying probability distributions, and then applies that framework to enable information regarding false conviction rates under one burden of proof standard (beyond a reasonable doubt) to be used to infer false conviction rates under a different burden of proof standard (preponderance of the evidence). In addition, the present article explores the impact of several specific probability density functions on the outcomes.

30 Ibid. at pp. 4 and 5. In Forst, errors of due process include not only false convictions but also more broadly cases in which an innocent person ‘is harassed, detained, or sanctioned’ as well as ‘excessive intrusions against those who violate the law’.

31 Gross, S. R., O’Brien, B., Hu, C. and Kennedy, E. H. (2014) Rate of false conviction of criminal defendants who are sentenced to death. Proceedings of the National Academy of Sciences, 111, 7230. It is also worth noting that in a 2006 New York Times op-ed, an Oregon District Attorney estimated the rate of false convictions at 0.027% (Marquis, J. (20 January 2006) The Innocent and the Shamed. New York Times). This 0.027% rate was in turn cited by Justice Scalia in Kansas v. Marsh, 548 U.S. 163, 182 (2006) (Justice Scalia concurring in the judgment). However, this rate was rebutted by Gross et al. (at p. 7235).

32 Ibid. at 7235.

33 See U.S. Department of Justice, Bureau of Justice Statistics. What is the probability of conviction for felony defendants? http://www.bjs.gov/index.cfm?ty=qa&iid=403 [accessed 14 June 2016]. This is approximately consistent with other reported statistics. In Errors of Justice (supra n. 29) at p. 58, Forst, citing several other publications from the Bureau of Justice Statistics as well as his own prior work, writes (with respect to criminal trials generally, not necessarily murder) that ‘available data from state and local prosecutors and courts suggest that the current standard of evidentiary proof results in about 75% of all persons whose cases come to trial being found guilty’.

34 This is the case because P(conviction and death sentence | innocent) = P(innocent | conviction and death sentence) P(conviction and death sentence)/P(innocent). P(conviction and death sentence) can be roughly estimated at 0.7 × 0.29 = 0.2, corresponding to an estimated 70% capital murder conviction rate and lower (conservative) 29% rate of sentencing those convicted of capital murder to death cited above. If the fraction of innocent defendants is 20%, that would mean that if P(innocent | conviction and death sentence) is 0.04 in accordance with the ‘conservative estimate’ provided by Gross et al., then P(conviction and death sentence | innocent) would also be 0.04, or 4%. In the unlikely event that 40% of defendants were innocent (and again assuming P(innocent | conviction and death sentence) is 0.04), that would reduce P(conviction and death sentence | innocent) to 0.02, or 2%. If 10% of defendants were innocent, P(conviction and death sentence | innocent) would be 0.08, or 8%.

35 More formally, a monotonically decreasing pdf will ensure that given any two different but equally wide ranges of probabilities (e.g. from 10% to 15% and from 60% to 65%), when adjudicating the case of an innocent defendant, the tribunal has a greater likelihood of concluding that the probability of guilt lies in the lower range than the higher range.

36 A regular (non-truncated) normal distribution with mean $x = \mu$ and standard deviation $\gamma$ has a pdf $f_X(x) = \frac{1}{\gamma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\gamma^2}\right)$. (In this equation using ‘γ’ instead of the more conventional ‘σ’ will help reduce the potential for confusion since, after truncation, the standard deviation will change.) To truncate the normal distribution, multiplication by a rect function is performed, where $\operatorname{rect}\left(\frac{x-b}{W}\right)$ is defined to have value 1 in the range $b - \frac{W}{2} \le x \le b + \frac{W}{2}$ and zero otherwise. For the model in this article, $\mu = 0$ (since, for an innocent defendant, the maximum of the pdf should occur at a probability of guilt equal to zero) and the rect function is parameterized by $b = 0.5$ and $W = 1$ (since probabilities of guilt must be in the range from 0% to 100%, meaning that the truncation must occur outside the range $0 \le x \le 1$). In addition, truncation requires rescaling to ensure that the non-truncated portion has a total area of 1. The resulting pdf is $f_X(x) = \frac{\frac{1}{\gamma\sqrt{2\pi}} \exp\left(-\frac{x^2}{2\gamma^2}\right)}{\frac{1}{2}\operatorname{erf}\left(\frac{1}{\gamma\sqrt{2}}\right)} \operatorname{rect}(x - 0.5)$, where $\operatorname{erf}(z)$ is the error function defined by $\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z \exp(-\tau^2)\,d\tau$.

37 Maximum here is assumed to refer to the peak of a normal distribution, where the slope is zero. It is possible, of course, to envision other truncations in which the peak is removed, and in which the maximum would not have a zero slope.

38 The statement that there is one further degree of freedom assumes, as specified earlier, that beyond a reasonable doubt is associated with a 90% threshold. Of course, removing that constraint and allowing values other than 90% introduces an additional degree of freedom.

39 For Figures 1a and 1b, γ = 0.5843 was used in the equation for the truncated normal distribution, supra n. 36.

40 For Figures 2a and 2b, γ = 0.5843 was used in the equation for the truncated normal distribution, supra n. 36.

41 A non-truncated exponential pdf can be specified by $f_X(x) = \lambda \exp(-\lambda x)\,u(x)$, where $\lambda$ is positive and $u(x)$ is a unit step function that has value 1 for nonnegative $x$ and zero otherwise. The corresponding cdf is $F_X(x) = \left(1 - \exp(-\lambda x)\right) u(x)$. As with the truncated normal function discussed previously (supra, n. 36), truncation over the range from 0 to 1 can be accomplished by multiplying by $\operatorname{rect}(x - 0.5)$ and appropriately renormalizing, yielding the truncated exponential pdf $f_X(x) = \frac{\lambda \exp(-\lambda x)}{1 - \exp(-\lambda)} \operatorname{rect}(x - 0.5)$. (The unit step function is no longer needed since it is subsumed by the rect function.)

42 For Figures 3a and 3b, λ = 1.76 was used in the equation for the truncated exponential distribution. See ibid.

43 Of course if there were no constraints at all, minimization of the pdf in the range from 0.5 to 0.9 would require setting the pdf to zero in that range while maintaining it at a non-zero level in the range 0.9 to 1. However, this would violate the requirement discussed earlier that the pdf must be monotonically decreasing. The pdf shown in Figure 4 is non-increasing in the interval from 0.5 to 1, so represents what a monotonically decreasing pdf could approach, in the limit, over this range.

44 In the interest of brevity, the curves for the truncated exponential distribution for the case where $P_{\mathrm{BRD}}(\text{conviction} \mid \text{innocent})$ = 1% (obtained when the equation for the truncated exponential pdf, supra n. 41, is parameterized by λ = 3.88; ‘BRD’ indicates ‘beyond a reasonable doubt’) are not shown, though the corresponding probabilities are included in the table.

45 Among other differences, the truncated Gaussian has a form in the vicinity of the origin that is concave downward (i.e. shaped like a portion of an umbrella), while the truncated exponential has a form that is concave upward (i.e. shaped like a portion of an upside down umbrella).

46 Given the flat nature of the pdf in the conservative model, the ratio of 5 is a simple consequence of the fact that the 50% threshold associated with preponderance of the evidence is five times as far from 100% as is the 90% threshold associated with beyond a reasonable doubt.

47 See e.g. Bartolet et al., supra n. 18 and Cohn, supra n. 18.

48 Generally, normal distributions are more common than exponential distributions. However, an exponential distribution is helpful to include as an example of a more conservative model (conservative in the sense that, for a given probability of conviction under a beyond a reasonable doubt standard, the risk faced by an innocent defendant is lower under the exponential model than under the normal model).

49 Brown v. Bowen, 847 F.2d 342, 354 (7th Cir. 1988).

50 In Re As. H., 851 A.2d 456, 463 (D.C. 2004) (Judge Farrell dissenting).

51 Kaplow, L. (2012) Burden of proof. Yale Law Journal, 121, 738, 758 (emphasis in original).

52 Cheng, E. K. (2013) Reconceptualizing the Burden of Proof. Yale Law Journal, 122, 1254, 1257.

53 Allen, R. J. & Stein, A. (2013) Evidence, probability, and the burden of proof. Arizona Law Review, 55, 557, 562.

54 See e.g. Hogshead-Makar & Sokolow, supra n. 18 (writing that ‘a higher standard, such as clear and convincing evidence, would make it less likely that those who commit sexual misconduct would be held accountable’.).

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.