Fixed overflow bug in 0.2 chi_squared_test / contingency_table_test, present since June 25, 2018

We just discovered and fixed an overflow bug in chi_squared_test that has existed in the 0.2 development branch since commit 01875754aa921fab22e1c4e8f11949e21fdc884c, which was merged on June 25, 2018.

If you used a version since that commit to run chi_squared_test with counts a, b, c, and d, then:

  • if a * d or b * c exceeded 2,147,483,647 (the largest 32-bit signed integer), the odds ratio is wrong. E.g., a minimal table with an incorrect odds ratio is (46341, 1, 1, 46341).
  • if (a + b) * (c + d) * (b + d) * (a + c) exceeded 2,147,483,647, the p-value is wrong. E.g., two minimal tables with an incorrect p-value are (108, 108, 108, 107) and (216, 0, 0, 215).

Tables with a total count below 431 were unaffected.
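
To check whether one of your own tables meets the conditions above, a small sketch like the following may help. It assumes the counts are 32-bit Ints, as the 2,147,483,647 threshold suggests; the names OverflowDemo, oddsRatioOverflows, and pValueOverflows are hypothetical and not part of the library.

```scala
object OverflowDemo {
  // Largest value a 32-bit signed Int can hold: 2,147,483,647.
  val IntMax: Long = Int.MaxValue.toLong

  // True if a * d or b * c does not fit in an Int (odds ratio condition).
  def oddsRatioOverflows(a: Int, b: Int, c: Int, d: Int): Boolean =
    a.toLong * d > IntMax || b.toLong * c > IntMax

  // True if the product of the four marginal sums does not fit in an Int
  // (p-value condition). BigInt avoids any overflow in the check itself.
  def pValueOverflows(a: Int, b: Int, c: Int, d: Int): Boolean = {
    val product =
      BigInt(a.toLong + b) * BigInt(c.toLong + d) *
        BigInt(b.toLong + d) * BigInt(a.toLong + c)
    product > BigInt(Int.MaxValue)
  }

  def main(args: Array[String]): Unit = {
    val a = 46341
    val d = 46341
    println(a * d)               // Int arithmetic wraps around: -2147479015
    println(a.toLong * d)        // Long arithmetic: 2147488281

    println(oddsRatioOverflows(46341, 1, 1, 46341)) // true
    println(pValueOverflows(108, 108, 108, 107))    // true
    println(pValueOverflows(216, 0, 0, 215))        // true
    println(pValueOverflows(108, 108, 107, 107))    // false (total count 430)
  }
}
```

The first two printed values show why (46341, 1, 1, 46341) is the smallest failing case for the odds ratio: 46341 * 46341 is the first such square that no longer fits in an Int.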

The bug also applied to contingency_table_test when the total cell count was at least min_cell_count, since in this case the function calls chi_squared_test.

The fisher_exact_test is unrelated and unaffected.

Here is the problematic Scala implementation:

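    // Int multiplication: wraps around when a * d exceeds Int.MaxValue
    val ad = a * d
    // b * c is computed (and can overflow) as an Int before the conversion to Double
    val bc = (b * c).toDouble
    val oddsRatio = ad / bc
    val det = ad - bc
    // the denominator is an Int product and can overflow
    val chiSquare = (det * det * (a + b + c + d)) / ((a + b) * (c + d) * (b + d) * (a + c))
    val pValue = chiSquaredTail(chiSquare, 1)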
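
For comparison, here is a minimal sketch of one way to avoid the overflow: convert each count to Double before multiplying. This is only an illustration under the assumption that a, b, c, and d are Ints. It is not necessarily the fix that was committed, and the names ChiSquaredSketch and oddsRatioAndChiSquare are hypothetical. The p-value step is left out because chiSquaredTail is defined elsewhere.

```scala
object ChiSquaredSketch {
  // Returns (oddsRatio, chiSquare), doing every product in Double arithmetic
  // so that nothing wraps around at Int.MaxValue.
  def oddsRatioAndChiSquare(a: Int, b: Int, c: Int, d: Int): (Double, Double) = {
    val ad = a.toDouble * d
    val bc = b.toDouble * c
    val oddsRatio = ad / bc
    val det = ad - bc
    val n = a.toDouble + b + c + d
    // Product of the four marginal sums, computed in Double.
    val denom = (a.toDouble + b) * (c.toDouble + d) * (b.toDouble + d) * (a.toDouble + c)
    val chiSquare = det * det * n / denom
    (oddsRatio, chiSquare)
  }

  def main(args: Array[String]): Unit = {
    val (or1, _) = oddsRatioAndChiSquare(46341, 1, 1, 46341)
    println(or1)   // 2.147488281E9
    val (_, chi2) = oddsRatioAndChiSquare(216, 0, 0, 215)
    println(chi2)  // 431.0
  }
}
```

Doubles can lose precision for very large products, but they do not wrap around to negative values the way Int arithmetic does.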