Google BigQuery Storage Client for Java

Java idiomatic client for BigQuery Storage.

Quickstart

If you are using Maven with BOM, add this to your pom.xml file:

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.49.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-bom</artifactId>
      <version>1.43.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-bigquerystorage</artifactId>
  </dependency>
</dependencies>

If you are using Maven without the BOM, add this to your dependencies:

<dependency>
  <groupId>com.google.cloud</groupId>
  <artifactId>google-cloud-bigquerystorage</artifactId>
  <version>3.10.2</version>
</dependency>

If you are using Gradle 5.x or later, add this to your dependencies:

implementation platform('com.google.cloud:libraries-bom:26.49.0')

implementation 'com.google.cloud:google-cloud-bigquerystorage'

If you are using Gradle without BOM, add this to your dependencies:

implementation 'com.google.cloud:google-cloud-bigquerystorage:3.10.2'

If you are using SBT, add this to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-bigquerystorage" % "3.10.2"

Authentication

See the Authentication section in the base directory's README.

Authorization

The client application making API calls must be granted authorization scopes required for the desired BigQuery Storage APIs, and the authenticated principal must have the IAM role(s) required to access GCP resources using the BigQuery Storage API calls.

Getting Started

Prerequisites

You will need a Google Cloud Platform Console project with the BigQuery Storage API enabled, and billing must be enabled to use Google BigQuery Storage. Follow these instructions to get your project set up. You will also need to set up the local development environment by installing the Google Cloud Command Line Interface and running the following commands on the command line:

gcloud auth login

gcloud config set project [YOUR PROJECT ID]

Installation and setup

You'll need to obtain the google-cloud-bigquerystorage library. See the Quickstart section to add google-cloud-bigquerystorage as a dependency in your code.

About BigQuery Storage

BigQuery Storage is an API for reading data stored in BigQuery. This API provides direct, high-throughput read access to existing BigQuery tables, supports parallel access with automatic liquid sharding, and allows fine-grained control over what data is returned.

See the BigQuery Storage client library docs to learn how to use this BigQuery Storage Client Library.

OpenTelemetry support

The client supports emitting metrics to OpenTelemetry. This is disabled by default. It can be enabled by calling

JsonStreamWriter.Builder.setEnableOpenTelemetry(true)

The following metric attributes are supported.

Key           Value
------------- ------------------------------------------------------------
error_code    Specifies the error code in the event an append request fails or a connection ends.
is_retry      Indicates this was a retry operation. This can be set for either ack’ed requests or connection retry attempts.
table_id      Holds the fully qualified name of the destination table.
trace_field_1 If a colon-separated traceId is provided, this holds the first portion. Must be non-empty. Currently populated only for Dataflow.
trace_field_2 If a colon-separated traceId is provided, this holds the second portion. Must be non-empty. Currently populated only for Dataflow.
trace_field_3 If a colon-separated traceId is provided, this holds the third portion. Must be non-empty. Currently populated only for Dataflow.
writer_id     Specifies the writer instance ID.

The following metrics are supported.

Name                           Kind
------------------------------ ---------------------
active_connection_count        Asynchronous gauge
append_requests_acked          Synchronous counter
append_request_bytes_acked     Synchronous counter
append_rows_acked              Synchronous counter
connection_end_count           Synchronous counter
connection_start_count         Synchronous counter
inflight_queue_length          Asynchronous gauge
network_response_latency       Histogram
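
As a rough illustration of how a colon-separated trace ID could map onto the trace_field_1 to trace_field_3 attributes listed above, the sketch below splits an ID with plain Java. The class TraceIdFields and the method fromTraceId are hypothetical names and are not part of google-cloud-bigquerystorage. Skipping empty portions and ignoring anything past the third portion are assumptions, since the library's exact handling is not described here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper only; not part of google-cloud-bigquerystorage. */
final class TraceIdFields {
  private TraceIdFields() {}

  /**
   * Splits a colon-separated trace ID into trace_field_1..trace_field_3.
   * Assumption: empty portions are skipped (the attributes must be non-empty)
   * and portions after the third are ignored.
   */
  static Map<String, String> fromTraceId(String traceId) {
    Map<String, String> attributes = new LinkedHashMap<>();
    if (traceId == null || traceId.isEmpty()) {
      return attributes;
    }
    // A limit of -1 keeps trailing empty portions so positions stay stable.
    String[] parts = traceId.split(":", -1);
    for (int i = 0; i < parts.length && i < 3; i++) {
      if (!parts[i].isEmpty()) {
        attributes.put("trace_field_" + (i + 1), parts[i]);
      }
    }
    return attributes;
  }
}
```

For example, fromTraceId("Dataflow:job:step") yields three entries keyed trace_field_1 through trace_field_3, while an ID with an empty middle portion such as "a::c" yields only trace_field_1 and trace_field_3.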

Exporting OpenTelemetry metrics

An exporter or collector must be installed by the application in order for OpenTelemetry metrics to be captured. The sample application uses Google Monitoring Metrics Exporter to export metrics to a Google Cloud project.

Samples

Samples are in the samples/ directory.

Sample                          Source Code Try it
------------------------------- ----------- -------------------
Export Open Telemetry           source code Open in Cloud Shell
Json Writer Stream Cdc          source code Open in Cloud Shell
Parallel Write Committed Stream source code Open in Cloud Shell
Storage Arrow Sample            source code Open in Cloud Shell
Storage Sample                  source code Open in Cloud Shell
Write Buffered Stream           source code Open in Cloud Shell
Write Committed Stream          source code Open in Cloud Shell
Write Pending Stream            source code Open in Cloud Shell
Write To Default Stream         source code Open in Cloud Shell

Troubleshooting

To get help, follow the instructions in the shared Troubleshooting document.

Transport

BigQuery Storage uses gRPC for the transport layer.

Supported Java Versions

Java 8 or above is required for using this client.
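
An application that wants to fail fast on an unsupported runtime could check the Java 8 minimum at startup. The sketch below is one way to do that with the standard java.specification.version system property. The class JavaVersionCheck and its methods are hypothetical names, not part of this client.

```java
/** Illustrative startup check only; not part of google-cloud-bigquerystorage. */
final class JavaVersionCheck {
  private static final int MINIMUM_MAJOR_VERSION = 8;

  private JavaVersionCheck() {}

  /** Extracts the major version from a value such as "1.8", "11" or "17". */
  static int majorVersion(String specVersion) {
    // Java 8 and earlier report "1.x"; Java 9 and later report "9", "11", ...
    String v = specVersion.startsWith("1.") ? specVersion.substring(2) : specVersion;
    int dot = v.indexOf('.');
    return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
  }

  /** Returns true when the given specification version is Java 8 or above. */
  static boolean isSupported(String specVersion) {
    return majorVersion(specVersion) >= MINIMUM_MAJOR_VERSION;
  }

  public static void main(String[] args) {
    String spec = System.getProperty("java.specification.version");
    if (!isSupported(spec)) {
      throw new IllegalStateException("Java 8 or above is required, found " + spec);
    }
    System.out.println("Java " + majorVersion(spec) + " meets the Java 8 minimum");
  }
}
```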

Google's Java client libraries, Google Cloud Client Libraries and Google Cloud API Libraries, follow the Oracle Java SE support roadmap (see the Oracle Java SE Product Releases section).

For new development

In general, new feature development occurs with support for the lowest Java LTS version covered by Oracle's Premier Support (which typically lasts 5 years from initial General Availability). If the minimum required JVM for a given library is changed, it is accompanied by a semver major release.

Java 11 and (in September 2021) Java 17 are the best choices for new development.

Keeping production systems current

Google tests its client libraries with all current LTS versions covered by Oracle's Extended Support (which typically lasts 8 years from initial General Availability).

Legacy support

Google's client libraries support legacy versions of Java runtimes through long-term stable libraries that do not receive feature updates. This support is provided on a best-efforts basis, as it may not be possible to backport all patches.

Google provides updates on a best-efforts basis to apps that continue to use Java 7, though apps might need to upgrade to a current version of the library that supports their JVM.

Where to find specific information

The latest versions and the supported Java versions are identified on the individual GitHub repository github.com/GoogleAPIs/java-SERVICENAME and on google-cloud-java.

Versioning

This library follows Semantic Versioning.
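
Under Semantic Versioning, a change in the first number of a MAJOR.MINOR.PATCH version signals a potentially breaking release. The sketch below shows one way a build script or upgrade check might detect that for version strings such as 3.10.2. The class SemVer and its methods are hypothetical, and pre-release or build suffixes are deliberately not handled.

```java
/** Illustrative MAJOR.MINOR.PATCH parser only; suffixes such as -beta are not handled. */
final class SemVer {
  final int major;
  final int minor;
  final int patch;

  private SemVer(int major, int minor, int patch) {
    this.major = major;
    this.minor = minor;
    this.patch = patch;
  }

  /** Parses a plain version string such as "3.10.2". */
  static SemVer parse(String version) {
    String[] parts = version.split("\\.");
    if (parts.length != 3) {
      throw new IllegalArgumentException("Expected MAJOR.MINOR.PATCH, got: " + version);
    }
    return new SemVer(
        Integer.parseInt(parts[0]), Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
  }

  /** Returns true when moving from this version to target crosses a major version. */
  boolean isMajorUpgradeTo(SemVer target) {
    return target.major > this.major;
  }
}
```

For example, moving from 3.10.2 to 3.11.0 is not a major upgrade under this check, while moving from 3.10.2 to 4.0.0 is.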

Contributing

Contributions to this library are always welcome and highly encouraged.

See CONTRIBUTING for more information on how to get started.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms. See Code of Conduct for more information.

License

Apache 2.0 - See LICENSE for more information.

CI Status

Java Version   Status
-------------- ---------
Java 8         Kokoro CI
Java 8 OSX     Kokoro CI
Java 8 Windows Kokoro CI
Java 11        Kokoro CI

Java is a registered trademark of Oracle and/or its affiliates.