In this vignette, we will explain how to compute a Bayes factor for informed hypotheses on multiple binomial parameters.
As example for an inequality-constrained hypothesis on multiple
independent binomials, we will use the dataset, journals
,
which is included in the package multibridge
. The dataset
is based on a study by Nuijten et al.
(2016) which evaluated the prevalence of statistical reporting
errors (i.e., inconsistencies between the reported test statistic and
degrees of freedom, and the reported p-value) in the field of
psychology. In total, the dataset contains a summary of statistical
reporting errors of 16,695 research articles reporting results from null
hypothesis significance testing (NHST). The selected articles were
published in eight major journals in psychology between 1985 to 2013:
Developmental Psychology (DP), Frontiers in Psychology
(FP), the Journal of Applied Psychology (JAP), the Journal
of Consulting and Clinical Psychology (JCCP), Journal of
Experimental Psychology: General (JEPG), the Journal of
Personality and Social Psychology (JPSP), the Public Library of
Science (PLoS), and Psychological Science (PS).
## journal articles_downloaded articles_with_NHST perc_articles_with_NHST
## 1 DP 3379 2607 77.2
## 2 FP 2139 702 32.8
## 3 JAP 2782 1638 58.9
## 4 JCCP 3519 2413 68.6
## 5 JEPG 1184 821 69.3
## 6 JPSP 5108 4346 85.1
## 7 PLOS 10299 2487 24.1
## 8 PS 2307 1681 72.9
## nr_NHST mean_nr_NHST_per_article_with_NHST
## 1 37658 14.44
## 2 10149 14.46
## 3 15134 9.24
## 4 27429 11.37
## 5 18921 23.05
## 6 101621 23.38
## 7 31539 12.68
## 8 15654 9.31
## mean_nr_NHST_per_article_all_included errors dec_errors perc_errors
## 1 11.14 3821 654 10.15
## 2 4.74 1045 134 10.30
## 3 5.44 1146 302 7.57
## 4 7.79 3396 463 12.38
## 5 15.98 1495 210 7.90
## 6 19.89 8484 1139 8.35
## 7 3.06 4151 488 13.16
## 8 6.79 1423 191 9.09
## perc_dec_errors perc_articles_with_errors perc_articles_with_dec_errors
## 1 1.74 50.90 15.15
## 2 1.32 50.85 11.97
## 3 2.00 33.64 12.45
## 4 1.69 48.90 11.56
## 5 1.11 54.81 12.91
## 6 1.12 57.62 15.81
## 7 1.55 49.70 11.50
## 8 1.22 39.74 6.48
## APAfactor
## 1 0.7827710
## 2 0.7408296
## 3 0.7321825
## 4 0.7710022
## 5 0.8031367
## 6 0.8106632
## 7 0.6701959
## 8 0.7795777
The model that we will use assumes that the elements in the vector of successes x1, ⋯, xK and the elements in the vector of total number of observations n1, ⋯, nK in the K categories follow independent binomial distributions. The parameter vector of the binomial success probabilities, θ1, ⋯, θK, contains the probabilities of observing a value in a particular category; here, it reflects the probabilities of a statistical reporting error in one of the 8 journals. The parameter vector θ1, ⋯, θK are drawn from independent beta distributions with parameters α1, ⋯, αK and β1, ⋯, βK. The model can be described as follows:
Here, we test the inequality-constrained hypothesis ℋr formulated by Nuijten et al. (2016) that the prevalence for statistical reporting errors for articles published in social psychology journals (i.e., JPSP) is higher than for articles published in other journals. We will test this hypothesis against the encompassing hypothesis ℋe without any constraints. In addition, this hypothesis will be tested against the null hypothesis ℋ0 that all journals have the same prevalence to include an article with a statistical reporting error:
To evaluate the inequality-constrained hypothesis, we need to specify (1) a vector with observed successes, and (2) a vector containing the total number of observations, (3) the informed hypothesis, (4) a vector with prior parameters alpha for each binomial proportion, (5) a vector with prior parameters beta for each binomial proportion, and (6) the labels of the categories of interest (i.e., journal names):
# since percentages are rounded to two decimal values, we round the articles
# with an error to receive integer values (step 1)
x <- round(journals$articles_with_NHST * (journals$perc_articles_with_errors/100))
# total number of articles (step 2)
n <- journals$articles_with_NHST
# Specifying the informed Hypothesis (step 3)
Hr <- c('JAP , PS , JCCP , PLOS , DP , FP , JEPG < JPSP')
# Prior specification (step 4 and 5)
# We assign a uniform beta distribution to each binomial propotion
a <- rep(1, 8)
b <- rep(1, 8)
# categories of interest (step 6)
journal_names <- journals$journal
With this information, we can now conduct the analysis with the
function binom_bf_informed()
. Since we are interested in
quantifying evidence in favor of the restricted hypothesis, we set the
Bayes factor type to BFre
. For reproducibility, we are also
setting a seed with the argument seed
:
ineq_results <- multibridge::binom_bf_informed(x=x, n=n, Hr=Hr, a=a, b=b,
factor_levels=journal_names,
bf_type = 'BFre',
seed = 2020)
m1 <- summary(ineq_results)
m1
## Bayes factor analysis
##
## Hypothesis H_e:
## All parameters are free to vary.
##
## Hypothesis H_r:
## JAP , PS , JCCP , PLOS , DP , FP , JEPG < JPSP
##
## Bayes factor estimate BFre:
##
## 7.4289
##
## Based on 1 independent inequality-constrained hypothesis.
##
## Relative Mean-Square Error:
##
## 0.066
##
## Posterior Median and Credible Intervals Of Marginal Beta Distributions:
## alpha beta 2.5% 50% 97.5%
## 1 JAP 1 + 551 1 + 1087 0.314 0.337 0.360
## 2 PS 1 + 668 1 + 1013 0.374 0.397 0.421
## 3 JCCP 1 + 1180 1 + 1233 0.469 0.489 0.509
## 4 PLOS 1 + 1236 1 + 1251 0.477 0.497 0.517
## 5 DP 1 + 1327 1 + 1280 0.490 0.509 0.528
## 6 FP 1 + 357 1 + 345 0.472 0.509 0.545
## 7 JEPG 1 + 450 1 + 371 0.514 0.548 0.582
## 8 JPSP 1 + 2504 1 + 1842 0.561 0.576 0.591
From the summary of the results we can see that the overall relative mean-square error of 0.066 is quite high, which might suggest unstable results. This becomes apparent if we look at the result as percentage error which can be extracted from the output object:
## re2 cv percentage
## 1 0.06597936 0.2568645 25.6864%
Here the percentage error is at almost 25.6864%. For that reason, we
will rerun the binom_bf_informed()
with more samples.
ineq_results <- multibridge::binom_bf_informed(x=x, n=n, Hr=Hr, a=a, b=b,
factor_levels=journal_names,
bf_type = 'BFre',
seed = 2020,
niter = 2e4)
m2 <- summary(ineq_results)
With more samples, the percentage error is considerably smaller:
## re2 cv percentage
## 1 0.003796574 0.06161634 6.1616%
Now, the overall relative mean-square error is 0.0038, which translates to a percentage error of 6.1616%.
The data are more likely under the informed hypothesis than under the alternative hypothesis; in fact we collected moderate evidence for the informed hypothesis. The results suggest that the data are 7.29 more likely under the informed hypothesis than under the hypothesis that all parameters are free to vary.
If we would like to compare the inequality-constrained hypothesis
ℋr against the null
hypothesis ℋ0 which states
that the probability for a statistical reporting error is equal across
all journals, we can set the Bayes factor type in
binom_bf_equality()
to BFr0
.
eq_results <- multibridge::binom_bf_informed(x=x, n=n, Hr=Hr, a=a, b=b,
factor_levels=journal_names,
bf_type = 'BFr0',
seed = 2020,
niter = 2e4)
m3 <- summary(eq_results)
m3
## Bayes factor analysis
##
## Hypothesis H_0:
## All parameters are exactly equal.
##
## Hypothesis H_r:
## JAP , PS , JCCP , PLOS , DP , FP , JEPG < JPSP
##
## Bayes factor estimate BFr0:
##
## 5.3802e+68
##
## Based on 1 independent inequality-constrained hypothesis.
##
## Relative Mean-Square Error:
##
## 0.0038
##
## Posterior Median and Credible Intervals Of Marginal Beta Distributions:
## alpha beta 2.5% 50% 97.5%
## 1 JAP 1 + 551 1 + 1087 0.314 0.337 0.360
## 2 PS 1 + 668 1 + 1013 0.374 0.397 0.421
## 3 JCCP 1 + 1180 1 + 1233 0.469 0.489 0.509
## 4 PLOS 1 + 1236 1 + 1251 0.477 0.497 0.517
## 5 DP 1 + 1327 1 + 1280 0.490 0.509 0.528
## 6 FP 1 + 357 1 + 345 0.472 0.509 0.545
## 7 JEPG 1 + 450 1 + 371 0.514 0.548 0.582
## 8 JPSP 1 + 2504 1 + 1842 0.561 0.576 0.591
The resulting Bayes factor that compares the informed to the null hypothesis provides extreme evidence for the informed hypothesis; the data are 5.38^{68} more likely under the informed hypothesis than under the null hypothesis. But since the data provided only moderate evidence against the encompassing hypotheses (i.e., BFre=7.29), it would be more sensible to say that this result suggests a misspecification of the null model rather than a well specified informed hypothesis.