Issues with tiebreaker expression

Mathias_Hansen · October 25, 2018, 9:40am

I get the following error after importing a table and trying to use the tiebreaker expression in ibd_prune:

jbloom · October 25, 2018, 11:24am

Hi Mathias,
Can you copy the full text of the command and the error trace, rather than a screenshot? You can use triple backticks to format code blocks. Thanks!

tpoterba · October 25, 2018, 11:32am

We’d also strongly recommend switching to 0.2 – we will deprecate (stop supporting) 0.1 soon. This will unfortunately mean rewriting pipelines and recreating data files.

Mathias_Hansen · October 25, 2018, 12:40pm

I can't really copy the code as text, so I hope this screenshot will help. Otherwise I will have to type it all out by hand.

tpoterba · October 25, 2018, 12:54pm

Ah, the problem is with your filterSamplesExpr - the inbreeding aggregator returns a structure with several values.

tpoterba · October 25, 2018, 12:55pm

Probably you want to compare the f statistic to something?
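To illustrate the point in plain Python (this is not Hail code): a sample filter needs a single true/false value, while an inbreeding result is a bundle of several values, so one field has to be compared to a cutoff. The class name, field names, and the 0.5 threshold below are assumptions made up for this sketch; check the Hail docs for the actual fields of the struct.

```python
from dataclasses import dataclass


@dataclass
class InbreedingResult:
    # Hypothetical field names, for illustration only.
    f_stat: float
    n_called: int
    expected_homs: float
    observed_homs: int


def inbreeding(observed_homs: int, expected_homs: float, n_called: int) -> InbreedingResult:
    # Standard inbreeding coefficient: F = (O - E) / (N - E)
    f_stat = (observed_homs - expected_homs) / (n_called - expected_homs)
    return InbreedingResult(f_stat, n_called, expected_homs, observed_homs)


result = inbreeding(observed_homs=900, expected_homs=850.0, n_called=1000)

# Using `result` itself as a filter condition is a type mismatch:
# it is a struct, not a boolean. Compare one field to a threshold instead.
keep_sample = abs(result.f_stat) < 0.5  # hypothetical threshold
```

The same idea applies to the filter expression: select the f statistic from the aggregator result and compare it to a value, so the expression evaluates to a boolean.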

Mathias_Hansen · October 29, 2018, 12:38pm

That helped. However, I keep having problems with a NullPointerException when using ibd_prune.

tpoterba · October 29, 2018, 12:54pm

How many samples do you have?

Also, do you have plans to move to 0.2? We’ll be fully deprecating 0.1 soon.

Mathias_Hansen · October 29, 2018, 1:01pm

I also hope to move to 0.2, but it is not entirely up to me. We have 18,000 samples.

tpoterba · October 29, 2018, 1:08pm

I’m guessing that the java process died during the local step of IBD prune (or possibly LD prune).

How much memory does the master machine have?

Mathias_Hansen · October 29, 2018, 1:26pm

I am working on 3 nodes with 350 GB of RAM each at the moment. It is on a small data subset with only a few variants included.

Mathias_Hansen · November 2, 2018, 8:03am

For some reason a second run does not end in the same error. But ibd_prune has now been running for hours on the small subset. Is there an issue, or is it just very inefficient?

tpoterba · November 2, 2018, 10:34am

I believe we made the maximal_independent_set algorithm immensely faster from 0.1 to 0.2.

It’s possible that the 0.1 algorithm is unusable on sample sizes greater than a few thousand.

The ibd_prune step is the same computation whether you have 1000 variants or 100M - pairwise comparisons of 18K samples.
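As a rough back-of-the-envelope check in plain Python (not Hail code; the helper name `n_pairs` is made up for this sketch), the number of sample pairs depends only on the sample count, not on the number of variants:

```python
def n_pairs(n_samples: int) -> int:
    """Number of unordered sample pairs: n * (n - 1) / 2."""
    return n_samples * (n_samples - 1) // 2


# 18,000 samples, regardless of how many variants are in the dataset
pairs = n_pairs(18_000)
print(pairs)  # 161991000
```

That is roughly 162 million pairs for the pruning step to consider, even on a subset with only a few variants.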

tpoterba · November 2, 2018, 7:25pm

@Mathias_Hansen - what is it going to take to get you to 0.2?

We are imminently going to announce end-of-life for 0.1 – meaning no support for external users running 0.1.
