Variant Transforms supports custom networks. This can be used to start the processing VMs in a specific subnetwork of your Google Cloud project as opposed to the default network.
Specify a subnetwork by using the --subnetwork
flag and provide the name of the subnetwork as follows: --subnetwork my-subnet
. Just use the name of the subnet, not the full path.
Example:
COMMAND="/opt/gcp_variant_transforms/bin/vcf_to_bq ...
docker run gcr.io/cloud-lifesciences/gcp-variant-transforms \
--project "${GOOGLE_CLOUD_PROJECT}" \
--subnetwork my-subnet \
...
"${COMMAND}"
Variant Transforms allows disabling the use of external IP addresses with the
--use_public_ips
flag. If not specified, this defaults to true, so to restrict the
use of external IP addresses, use --use_public_ips false
. Note that without external
IP addresses, VMs can only send packets to other internal IP addresses. To allow these
VMs to connect to the external IP addresses used by Google APIs and services, you can
enable Private Google Access
on the subnet.
Example:
COMMAND="/opt/gcp_variant_transforms/bin/vcf_to_bq ...
docker run gcr.io/cloud-lifesciences/gcp-variant-transforms \
--project "${GOOGLE_CLOUD_PROJECT}" \
--use_public_ips false \
...
"${COMMAND}"
By default Variant Transforms uses a custom docker image to run the pipeline in: gcr.io/cloud-lifesciences/variant-transforms-custom-runner:latest
.
This image contains all the necessary python/linux dependencies needed to run variant transforms so that they are not downloaded from the internet when the pipeline starts.
You can override which container is used by passing a --sdk_container_image
as in the following example:
COMMAND="/opt/gcp_variant_transforms/bin/vcf_to_bq ...
docker run gcr.io/cloud-lifesciences/gcp-variant-transforms \
--project "${GOOGLE_CLOUD_PROJECT}" \
--sdk_container_image gcr.io/path/to/my/container\
...
"${COMMAND}"
By default the dataflow workers will use the default compute service account. You can override which service account to use with the --service_account
flag as in the following example:
COMMAND="/opt/gcp_variant_transforms/bin/vcf_to_bq ...
docker run gcr.io/cloud-lifesciences/gcp-variant-transforms \
--project "${GOOGLE_CLOUD_PROJECT}" \
--service_account my-cool-dataflow-worker@<PROJECT_ID>.iam.gserviceaccount.com\
...
"${COMMAND}"
Other Service Account Notes: