Hi Roy,
We don’t have any functions to do this for you right now, but I agree it would be useful to add!
It’s possible to do this in Python, though it looks a bit ugly. It’s also going to be slow-ish, since the implementation is a bit mangy, but it works!
# Assumes TStruct is in scope, e.g. `from hail.expr import TStruct`
# (the module it lives in may differ between Hail versions).
def lower_va_schema(vds):
    # Called on structs to recursively build an expression that re-creates
    # every field under its lower-case name.
    def generate_struct_expr(schema, prefix):
        assert isinstance(schema, TStruct)
        exprs = []
        for field in schema.fields:
            name = field.name
            typ = field.typ
            # Backticks escape field names in the expression language.
            full_name = prefix + '.`{}`'.format(name)
            lower = name.lower()
            if isinstance(typ, TStruct):
                # Nested struct: recurse so its fields are lowered too.
                right_hand = generate_struct_expr(typ, full_name)
            else:
                # Any other type: copy the original value unchanged.
                right_hand = full_name
            exprs.append('`{}`: {}'.format(lower, right_hand))
        return '{' + ','.join(exprs) + '}'

    va_expr = generate_struct_expr(vds.variant_schema, 'va')
    return vds.annotate_variants_expr('va = {}'.format(va_expr))
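To make it clearer what expression string the function hands to `annotate_variants_expr`, here is a small sketch that runs without Hail. `Field` and `TStruct` below are hypothetical stand-ins made up for illustration (they only mimic the `.fields`, `.name`, and `.typ` attributes used above), not the real Hail classes, and the two-field schema is a toy example.

```python
from collections import namedtuple

# Hypothetical stand-ins for Hail's schema classes, for illustration only.
Field = namedtuple('Field', ['name', 'typ'])


class TStruct(object):
    def __init__(self, fields):
        self.fields = fields


def generate_struct_expr(schema, prefix):
    # Same logic as the inner function in lower_va_schema.
    exprs = []
    for field in schema.fields:
        full_name = prefix + '.`{}`'.format(field.name)
        if isinstance(field.typ, TStruct):
            right_hand = generate_struct_expr(field.typ, full_name)
        else:
            right_hand = full_name
        exprs.append('`{}`: {}'.format(field.name.lower(), right_hand))
    return '{' + ','.join(exprs) + '}'


# Toy schema: one flat field and one doubly nested struct.
schema = TStruct([
    Field('rsid', 'String'),
    Field('FOO', TStruct([Field('BAR', TStruct([Field('BAZ', 'Int')]))])),
])

va_expr = generate_struct_expr(schema, 'va')
print('va = {}'.format(va_expr))
# va = {`rsid`: va.`rsid`,`foo`: {`bar`: {`baz`: va.`FOO`.`BAR`.`BAZ`}}}
```

The left-hand names are lower-cased while the right-hand paths still point at the original fields, which is why the whole `va` struct gets rebuilt in one expression.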
Here’s a test:
Pre-lowering schema:
In [21]: pprint(vds.variant_schema)
Struct{
    rsid: String,
    qual: Double,
    filters: Set[String],
    info: Struct{
        NEGATIVE_TRAIN_SITE: Boolean,
        HWP: Double,
        AC: Array[Int],
        culprit: String,
        MQ0: Int,
        ReadPosRankSum: Double,
        AN: Int,
        InbreedingCoeff: Double,
        AF: Array[Double],
        GQ_STDDEV: Double,
        FS: Double,
        DP: Int,
        GQ_MEAN: Double,
        POSITIVE_TRAIN_SITE: Boolean,
        VQSLOD: Double,
        ClippingRankSum: Double,
        BaseQRankSum: Double,
        MLEAF: Array[Double],
        MLEAC: Array[Int],
        MQ: Double,
        QD: Double,
        END: Int,
        DB: Boolean,
        HaplotypeScore: Double,
        MQRankSum: Double,
        CCC: Int,
        NCC: Int,
        DS: Boolean
    },
    FOO: Struct{
        BAR: Struct{
            BAZ: Int
        }
    },
    f: Struct{
        F: Struct{
            f: Struct{
                F: Int
            }
        }
    }
}
Now calling the function:
In [22]: pprint(lower_va_schema(vds).variant_schema)
Struct{
    rsid: String,
    qual: Double,
    filters: Set[String],
    info: Struct{
        negative_train_site: Boolean,
        hwp: Double,
        ac: Array[Int],
        culprit: String,
        mq0: Int,
        readposranksum: Double,
        an: Int,
        inbreedingcoeff: Double,
        af: Array[Double],
        gq_stddev: Double,
        fs: Double,
        dp: Int,
        gq_mean: Double,
        positive_train_site: Boolean,
        vqslod: Double,
        clippingranksum: Double,
        baseqranksum: Double,
        mleaf: Array[Double],
        mleac: Array[Int],
        mq: Double,
        qd: Double,
        end: Int,
        db: Boolean,
        haplotypescore: Double,
        mqranksum: Double,
        ccc: Int,
        ncc: Int,
        ds: Boolean
    },
    foo: Struct{
        bar: Struct{
            baz: Int
        }
    },
    f: Struct{
        f: Struct{
            f: Struct{
                f: Int
            }
        }
    }
}