I was surprised to get different p-values from Hail and from R. I manually checked that the input data is identical. The Hail code is as follows:
For one row, the p-value from Hail is 0.0995 (beta = -2.19, se = 1.23, t = -1.79), while the p-value from R is 0.086 (beta = -2.19150, se = 1.17954, t = -1.858). I also ran Python's statsmodels.api.OLS with a constant term and got the same p-value as R (0.086).
The fact that the beta values agree suggests there is no difference in the input data. I'm not sure why the standard errors differ.
I forgot to mention that the discrepancy occurs when I use 20 samples. If I use 25,000 samples, the p-values are identical up to rounding precision.
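A quick sanity check on where the SE gap could come from. In OLS the squared standard error of a coefficient is proportional to RSS / df, where df is the residual degrees of freedom (samples minus fitted parameters). If beta and the design matrix are the same, a different SE points to a different df. The sketch below compares the squared SE ratio from the numbers above with the ratio you would get if the two fits differed by exactly one degree of freedom. That "off by one df" idea is only a hypothesis (nothing here confirms it), and the candidate df values in the loop are illustrative, not taken from my actual model.

```python
# Values reported above (Hail's SE is only given to two decimals).
se_hail = 1.23
se_r = 1.17954

# With identical beta and design matrix, SE^2 is proportional to RSS / df,
# so the squared SE ratio estimates df_R / df_Hail.
ratio_sq = (se_hail / se_r) ** 2


def df_ratio(df_small):
    """Ratio of residual degrees of freedom when two fits differ by one df."""
    return (df_small + 1) / df_small


print(f"squared SE ratio: {ratio_sq:.4f}")

# Illustrative candidate df values (hypothetical, not from the real model).
for df in (10, 12, 14, 18, 24990):
    print(f"df {df} vs {df + 1}: {df_ratio(df):.5f}")
```

If the hypothesis holds, the squared SE ratio should sit close to one of the small-df ratios, while at around 25,000 samples the same one-df difference changes the SE by a negligible amount. That would be consistent with the discrepancy disappearing at the larger sample size, though it does not say which tool is counting differently or why.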
Could you elaborate a bit more on “entry wise”? The only difference between the 20-sample and 25k-sample runs is the sample size; all data structures are the same.