Hi,
Is it possible for me to write a matrix table to a CSV file?
Thank you
Hi @Jaden30 !
It sure is possible! The docs are hidden at Expression.export:
>>> small_mt.GT.export('output/gt.tsv')
>>> with open('output/gt.tsv', 'r') as f:
...     for line in f:
...         print(line, end='')
locus alleles 0 1 2 3
1:1 ["A","C"] 0/1 0/1 0/0 0/0
1:2 ["A","C"] 1/1 0/1 1/1 1/1
1:3 ["A","C"] 1/1 0/1 0/1 0/0
1:4 ["A","C"] 1/1 0/1 1/1 1/1
You’ll want to use delimiter=',' to get a CSV instead of a TSV.
Hi Danking,
The examples in the code only show writing a single field to a TSV/CSV file. Is there no way for me to write the entire matrix table to a CSV file?
If you want to write the entire entry as a JSON object, that works:
In [1]: import hail as hl
...:
...: mt = hl.balding_nichols_model(1, 3, 3)
...: mt = mt.annotate_entries(AD=5)
...: mt.entry.export('/tmp/bar.tsv')
In [2]: !cat /tmp/bar.tsv
locus alleles 0 1 2
1:1 ["A","C"] {"GT":"0/1","AD":5} {"GT":"1/1","AD":5} {"GT":"0/1","AD":5}
1:2 ["A","C"] {"GT":"1/1","AD":5} {"GT":"0/1","AD":5} {"GT":"1/1","AD":5}
1:3 ["A","C"] {"GT":"0/1","AD":5} {"GT":"0/0","AD":5} {"GT":"0/0","AD":5}
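If you later need to read such a file back outside of Hail, a rough sketch with only the Python standard library could look like the following. The function name read_entry_tsv and the sample text are made up for illustration. It assumes the first two columns are locus and alleles, as in the output above, and that no cell holds a missing-value marker.

```python
import csv
import io
import json


def read_entry_tsv(tsv_text):
    """Parse a TSV whose sample columns each hold a JSON entry struct."""
    # QUOTE_NONE keeps the double quotes inside the JSON cells untouched.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader)
    sample_ids = header[2:]  # assumption: columns 0 and 1 are locus and alleles
    rows = []
    for fields in reader:
        locus = fields[0]
        alleles = json.loads(fields[1])
        entries = {s: json.loads(cell) for s, cell in zip(sample_ids, fields[2:])}
        rows.append((locus, alleles, entries))
    return rows


# Illustrative input shaped like the export above (not real data).
sample = (
    'locus\talleles\t0\t1\n'
    '1:1\t["A","C"]\t{"GT":"0/1","AD":5}\t{"GT":"1/1","AD":5}\n'
)
rows = read_entry_tsv(sample)
print(rows[0][2]["1"]["GT"])
```

Each row comes back as a (locus, alleles, entries) tuple, where entries maps the column key to a plain dict.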
You can include more row fields by adding them to the key:
In [4]: import hail as hl
...:
...: mt = hl.balding_nichols_model(1, 3, 3)
...: mt = mt.annotate_entries(AD=5)
...: non_key_row_fields = set(mt.row) - set(mt.row_key)
...: mt.key_rows_by(*mt.row_key, *non_key_row_fields).entry.export('/tmp/bar.tsv')
In [5]: !cat /tmp/bar.tsv
locus alleles af ancestral_af 0 1 2
1:1 ["A","C"] [0.44805795611590166] 5.39051e-01 {"GT":"0/1","AD":5} {"GT":"0/1","AD":5} {"GT":"0/1","AD":5}
1:2 ["A","C"] [0.7042578478282053] 8.67678e-01 {"GT":"1/1","AD":5} {"GT":"1/1","AD":5} {"GT":"1/1","AD":5}
1:3 ["A","C"] [0.35246827252547935] 4.37646e-01 {"GT":"0/1","AD":5} {"GT":"0/0","AD":5} {"GT":"0/1","AD":5}
If you don’t want JSON for entries, you can do this admittedly very ugly thing:
In [23]: import hail as hl
...:
...: mt = hl.balding_nichols_model(1, 3, 3)
...: mt = mt.annotate_entries(AD=5)
...: mt = mt.annotate_cols(entry_id=list(range(len(mt.entry))))
...: mt = mt.explode_cols(mt.entry_id)
...: mt = mt.key_cols_by(sample_id = hl.str(mt.sample_idx) + hl.literal('_') + hl.literal(list(mt.entry))[mt.entry_id])
...: mt = mt.select_entries(entries_as_str = [hl.str(mt[f]) for f in mt.entry])
...: mt = mt.select_entries(the_entry=mt.entries_as_str[mt.entry_id])
...: non_key_row_fields = set(mt.row) - set(mt.row_key)
...: mt.key_rows_by(*mt.row_key, *non_key_row_fields).the_entry.export('/tmp/bar.tsv')
In [22]: !cat /tmp/bar.tsv
locus alleles af ancestral_af 0_GT 0_AD 1_GT 1_AD 2_GT 2_AD
1:1 ["A","C"] [0.5383546190066579] 5.39051e-01 0/1 5 0/0 5 0/1 5
1:2 ["A","C"] [0.9595560241510789] 8.67678e-01 1/1 5 0/1 5 1/1 5
1:3 ["A","C"] [0.5301809406988318] 4.37646e-01 0/1 5 0/0 5 0/1 5
As for the column fields, you cannot include those in the CSV; I’m not really sure how you would even represent them in CSV. I would store the column fields in a separate CSV file.
Also, beware that commas often appear in the JSON that Hail generates for non-scalar fields like the alleles list, so a comma-delimited export can be ambiguous to parse.
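One way around the comma problem, sketched here with only the Python standard library, is to export a TSV and convert it afterwards with the csv module, which quotes any field that contains a comma. The helper name tsv_to_csv and the sample text are illustrative and not part of Hail.

```python
import csv
import io


def tsv_to_csv(tsv_text):
    """Convert tab-separated text to CSV, quoting fields that contain commas."""
    # QUOTE_NONE on the reader treats the JSON double quotes as ordinary characters.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    out = io.StringIO()
    # The writer defaults to QUOTE_MINIMAL: only fields that need quoting get it.
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Illustrative input shaped like a Hail TSV export (not real data).
sample = 'locus\talleles\t0\n1:1\t["A","C"]\t0/1\n'
converted = tsv_to_csv(sample)
print(converted)
```

The alleles field ends up wrapped in quotes with its inner quotes doubled, which is the usual CSV convention, so a standard CSV reader recovers the original JSON text.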
Thank you very much for your response
Traceback (most recent call last):
File "transvcf.py", line 66, in <module>
fire.Fire(VCF)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/fire/core.py", line 471, in _Fire
target=component.__name__)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "transvcf.py", line 50, in identify_AF
dataset = hl.import_vcf(vcf, skip_invalid_loci = True, array_elements_required=False)
File "<decorator-gen-1348>", line 2, in import_vcf
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/typecheck/check.py", line 576, in wrapper
args_, kwargs_ = check_all(__original_func, args, kwargs, checkers, is_method=is_method)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/typecheck/check.py", line 543, in check_all
args_.append(arg_check(args[i], name, arg_name, checker))
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/typecheck/check.py", line 584, in arg_check
return checker.check(arg, function_name, arg_name)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/typecheck/check.py", line 82, in check
return tc.check(x, caller, param)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/typecheck/check.py", line 328, in check
return f(tc.check(x, caller, param))
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/genetics/reference_genome.py", line 10, in <lambda>
reference_genome_type = oneof(transformed((str, lambda x: hl.get_reference(x))), rg_type)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/context.py", line 554, in get_reference
Env.hc()
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/utils/java.py", line 55, in hc
init()
File "<decorator-gen-1714>", line 2, in init
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/typecheck/check.py", line 577, in wrapper
return __original_func(*args_, **kwargs_)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/context.py", line 252, in init
skip_logging_configuration, optimizer_iterations)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/hail/backend/spark_backend.py", line 176, in __init__
self._jbackend, log, True, append, branching_factor, skip_logging_configuration, optimizer_iterations)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/py4j/java_gateway.py", line 1305, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/Users/jaden/miniconda3/envs/hail/lib/python3.7/site-packages/py4j/protocol.py", line 328, in get_return_value
format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling z:is.hail.HailContext.apply.
: is.hail.utils.HailException: Hail requires Java
I got the error above when running Hail. I do not understand it or know how to fix it. Can anybody help? Thank you.
Did you install Java? It’s listed in our installation instructions: Hail | Installing Hail
Yes, I have, following the instructions. It is still returning the error.
This looks like it’s missing part of the message. It should say something like:
Hail requires Java 8 or 11, found 12
What version of Java (java -version) do you have installed?
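If it is easier to check from the same Python environment that runs Hail, here is a small sketch that reads the major version out of the java -version output. The function names are made up for illustration, and the regular expression assumes the usual version "..." format printed by common JDKs.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x, e.g. "1.8.0_281".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major() -> Optional[int]:
    """Return the major version of the `java` found on PATH, or None."""
    if shutil.which("java") is None:
        return None
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    # `java -version` writes its report to stderr rather than stdout.
    return parse_java_major(result.stderr + result.stdout)
```

Calling installed_java_major() shows which Java the Python process actually picks up, which can differ from the one in your interactive shell if PATH or JAVA_HOME is set differently.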