We just discovered and fixed an overflow bug in chi_squared_test that has existed in the 0.2 development branch since commit 01875754aa921fab22e1c4e8f11949e21fdc884c
merged on June 25, 2018.
If you used a version since that commit to run chi_squared_test with counts a, b, c, and d, then:
- if a * d or b * c exceeded 2,147,483,647, the odds ratio is wrong. E.g., a minimal table with an incorrect odds ratio is (46341, 1, 1, 46341).
- if (a + b) * (c + d) * (b + d) * (a + c) exceeded 2,147,483,647, the p-value is wrong. E.g., two minimal tables with an incorrect p-value are (108, 108, 108, 107) and (216, 0, 0, 215).
Tables with total count below 431 were unaffected.
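To check whether a particular table you ran was affected, the two conditions above can be evaluated without overflow by doing the arithmetic in BigInt. This is only a minimal sketch: the object and method names (OverflowCheck, oddsRatioAffected, pValueAffected) are hypothetical helpers for illustration and are not part of the library.

```scala
object OverflowCheck {
  // Largest value a 32-bit signed Int can hold: 2,147,483,647.
  private val IntMax = BigInt(Int.MaxValue)

  // True if a * d or b * c would not fit in an Int,
  // i.e. the odds ratio condition described above.
  def oddsRatioAffected(a: Int, b: Int, c: Int, d: Int): Boolean =
    BigInt(a) * BigInt(d) > IntMax || BigInt(b) * BigInt(c) > IntMax

  // True if the product of the four marginal totals would not fit in an Int,
  // i.e. the p-value condition described above.
  def pValueAffected(a: Int, b: Int, c: Int, d: Int): Boolean = {
    val (ba, bb, bc, bd) = (BigInt(a), BigInt(b), BigInt(c), BigInt(d))
    (ba + bb) * (bc + bd) * (bb + bd) * (ba + bc) > IntMax
  }
}
```

For example, OverflowCheck.pValueAffected(108, 108, 108, 107) is true, while the same call on (108, 108, 107, 107), which has a total count of 430, is false.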
The bug also applied to contingency_table_test when the total cell count was at least min_cell_count, since in this case the function calls chi_squared_test.
The fisher_exact_test is unrelated and unaffected.
Here is the problematic Scala implementation:
val ad = a * d
val bc = (b * c).toDouble
val oddsRatio = ad / bc
val det = ad - bc
val chiSquare = (det * det * (a + b + c + d)) / ((a + b) * (c + d) * (b + d) * (a + c))
val pValue = chiSquaredTail(chiSquare, 1)
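In the code above, a * d, b * c, and the four-factor denominator are all computed in Int arithmetic, so they wrap around before any conversion to Double takes place. One way to avoid this is to convert each count to Double before multiplying. The following is a minimal sketch of that idea and not necessarily the fix that was merged; the names ChiSquaredSketch and stats are hypothetical, and the p-value step is left out because chiSquaredTail is not defined here.

```scala
object ChiSquaredSketch {
  // Returns (oddsRatio, chiSquare) for the 2x2 table with counts a, b, c, d.
  // Each count is widened to Double first, so no Int product can overflow.
  def stats(a: Int, b: Int, c: Int, d: Int): (Double, Double) = {
    val (da, db, dc, dd) = (a.toDouble, b.toDouble, c.toDouble, d.toDouble)
    val ad = da * dd
    val bc = db * dc
    val oddsRatio = ad / bc
    val det = ad - bc
    val chiSquare =
      (det * det * (da + db + dc + dd)) /
        ((da + db) * (dc + dd) * (db + dd) * (da + dc))
    (oddsRatio, chiSquare)
  }
}
```

With this version, the table (46341, 1, 1, 46341) gives an odds ratio of 2147488281.0, the exact product 46341 * 46341, which is just above the Int limit of 2,147,483,647.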