I would like to use the Hail team’s hosted version of Hail Batch (https://batch.hail.is) with datasets stored in Terra. How do I do that?
Great question! First you must register your Hail Batch service account with FireCloud/Terra. You can do that from Hail Batch itself, like this:
import hailtop.batch as hb

# Replace the YOUR_... placeholders with your own values.
b = hb.Batch(backend=hb.ServiceBackend(
    billing_project='YOUR_HAIL_BATCH_BILLING_PROJECT',
    remote_tmpdir='gs://YOUR_BUCKET/hail-batch-tmp'
))

j = b.new_job()
j.image('danking00/firecloud-tools:413.0.0-slim')
# -j takes the path to the service account key file, -e takes your email address.
j.command('python3 /scripts/register_service_account/register_service_account.py -j $GOOGLE_APPLICATION_CREDENTIALS -e YOUR_BROAD_EMAIL')
b.run()
The job should run successfully and the output should look like:
The service account dking-rew76@hail-vdc.iam.gserviceaccount.com is now registered with FireCloud. You can share workspaces with this address, or use it to call APIs.
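If you want to copy the address out of the job log without retyping it, here is a minimal sketch that pulls it out with a regular expression. The helper name `extract_service_account` is hypothetical (it is not part of Hail or firecloud-tools), and the pattern assumes the log contains an address ending in `.iam.gserviceaccount.com`, as in the sample output above.

```python
import re

# Sample log text, copied from the output shown above.
log = (
    "The service account dking-rew76@hail-vdc.iam.gserviceaccount.com "
    "is now registered with FireCloud. You can share workspaces with "
    "this address, or use it to call APIs."
)


def extract_service_account(log_text):
    """Return the first service account address in log_text, or None.

    Hypothetical helper: assumes the address ends in
    .iam.gserviceaccount.com, as in the sample output.
    """
    match = re.search(r"[\w.-]+@[\w.-]+\.iam\.gserviceaccount\.com", log_text)
    return match.group(0) if match else None


email = extract_service_account(log)
print(email)
```

The printed address is the one to share workspaces with.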
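Now you just need to grant your service account permission to access the bucket containing the data of interest, for example by sharing the relevant Terra workspace with the service account address printed in the job output.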