I get the following error after importing a table and trying to use the tiebreaker expression in ibd_prune:
Can you copy the full text of the command and the error trace, rather than a screenshot? You can use triple backticks to format code blocks. Thanks!
We’d also strongly recommend switching to 0.2 – we will deprecate (stop supporting) 0.1 soon. This will unfortunately mean rewriting pipelines and recreating data files.
I can't really copy the code as text, so I hope this screenshot will help. Otherwise I would have to type it all out by hand.
Ah, the problem is with your filterSamplesExpr - the inbreeding aggregator returns a structure with several values.
Probably you want to compare the f statistic to something?
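To illustrate the point in plain Python rather than Hail's expression language: a filter needs a single boolean per sample, so a struct has to be reduced to one by comparing a field against a threshold. The `Inbreeding` tuple, its field names, and the 0.2 cutoff below are all assumptions made up for this sketch, not the actual Hail struct or a recommended value.

```python
from collections import namedtuple

# Hypothetical stand-in for the struct an inbreeding aggregator returns.
# Field names are assumptions for illustration only.
Inbreeding = namedtuple("Inbreeding", ["f_stat", "n_called"])


def keep_sample(result, max_f=0.2):
    """Turn the struct into a boolean by comparing one field to a threshold."""
    # max_f=0.2 is an arbitrary example cutoff, not a recommendation.
    return result.f_stat < max_f


samples = {
    "s1": Inbreeding(f_stat=0.01, n_called=1000),
    "s2": Inbreeding(f_stat=0.35, n_called=980),
}

# Passing the whole struct as the filter would not be a boolean;
# comparing the f statistic is.
kept = [name for name, result in samples.items() if keep_sample(result)]
print(kept)
```

In the real filter expression, the equivalent change would be to select the f statistic field from the aggregator result and compare that to your cutoff.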
How many samples do you have?
Also, do you have plans to move to 0.2? We’ll be fully deprecating 0.1 soon.
I also hope to move to 0.2, but it is not entirely up to me. I have 18,000 samples.
I’m guessing that the java process died during the local step of IBD prune (or possibly LD prune).
How much memory does the master machine have?
I am working on 3 nodes with 350 GB of RAM each at the moment. This is on a small data subset with only a few variants included.
For some reason a second run does not end in the same error. But ibd_prune has now been running for hours on the small subset. Is there an issue, or is it just very inefficient?
I believe we made the maximal_independent_set algorithm immensely faster from 0.1 to 0.2.
It’s possible that the 0.1 algorithm is unusable on sample sizes greater than a few thousand.
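For anyone unfamiliar with the term, here is a rough sketch of what a maximal independent set computation does in a pruning context, assuming samples are nodes and each related pair is an edge. This is a simple greedy version written for illustration. It is not Hail's implementation in either 0.1 or 0.2, and the function name and inputs are made up.

```python
def greedy_maximal_independent_set(nodes, edges):
    """Pick a set of nodes such that no two picked nodes share an edge.

    Nodes are visited from lowest to highest degree, and a node is kept
    only if none of its neighbors has already been kept. The result is
    maximal (no further node can be added) but not necessarily the
    largest possible set.
    """
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    chosen = set()
    # Low-degree nodes first: they exclude the fewest other samples.
    for n in sorted(nodes, key=lambda n: len(neighbors[n])):
        if neighbors[n].isdisjoint(chosen):
            chosen.add(n)
    return chosen


# Toy example: B is related to both A and C; D is unrelated to everyone.
samples = ["A", "B", "C", "D"]
related_pairs = [("A", "B"), ("B", "C")]
print(sorted(greedy_maximal_independent_set(samples, related_pairs)))
```

A tiebreaker expression would play the role of the sort key here, deciding which sample of a related pair is kept.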
The ibd_prune step is the same computation whether you have 1000 variants or 100M - pairwise comparisons of 18K samples.
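To put a number on that, a quick count of the sample pairs involved (plain Python, nothing Hail-specific):

```python
from math import comb

n_samples = 18_000

# Number of unordered sample pairs: n * (n - 1) / 2.
n_pairs = comb(n_samples, 2)
print(f"{n_pairs:,} pairs")

# The pair count depends only on the number of samples,
# so it is the same for 1,000 variants as for 100M.
```

That is roughly 162 million pairs for 18K samples.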