protobq

protobq is a tool that simplifies schema management for materialized views in BigQuery. Instead of managing base tables and materialized views separately, developers define only the schema of the materialized view. From this schema, protobq automatically constructs and maintains the corresponding base table.

This approach ensures consistency between the materialized view and its source data, allowing developers to focus on high-level data modeling without worrying about the complexities of table creation or maintenance.

Key features:

Idempotent Schema Management: Define your materialized view schema declaratively. Applying the same schema repeatedly converges on the same result, and protobq handles updates when the schema changes.
Base Table Automation: Automatically create and manage the base table from your materialized view schema.
BigQuery-Native Optimization: Leverage BigQuery’s best practices, such as partitioning, clustering, and incremental refreshes, directly through schema definitions.
Protocol Buffers Integration: Use Protocol Buffers to define your schemas, enabling compatibility, extensibility, and multi-language support.
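
To make the idea of idempotent schema management concrete, here is a minimal sketch of a "diff then apply" step. It is not protobq's implementation: the `Column` type and the `missingColumns` helper are hypothetical names introduced only for illustration. The point is that a second run against an already up-to-date table finds nothing left to do.

```go
package main

import (
	"fmt"
	"sort"
)

// Column is a hypothetical, simplified description of one table column.
// It is not a type from protobq or from the BigQuery client library.
type Column struct {
	Name string
	Type string
}

// missingColumns returns the columns that appear in desired but not in
// existing, sorted by name so the result is deterministic.
func missingColumns(desired, existing []Column) []Column {
	have := make(map[string]bool, len(existing))
	for _, c := range existing {
		have[c.Name] = true
	}
	var out []Column
	for _, c := range desired {
		if !have[c.Name] {
			out = append(out, c)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	desired := []Column{{"timestamp", "TIMESTAMP"}, {"message", "STRING"}}
	existing := []Column{{"timestamp", "TIMESTAMP"}}

	// First run: the base table lacks one column.
	fmt.Println("to add on first run:", missingColumns(desired, existing))

	// Second run: the table already matches, so there is nothing to add.
	fmt.Println("to add on second run:", missingColumns(desired, desired))
}
```

A real tool would also have to handle type changes and removed columns. This sketch covers only additions.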

Philosophy

Why Protocol Buffers?

Schema-First Approach
BigQuery’s schema-driven nature aligns with protobuf, enabling structured and type-safe schema definitions.
Versioning and Evolution
Protobuf supports backward and forward compatibility, simplifying schema updates and ensuring long-term maintainability.
Seamless BigQuery Integration
Common BigQuery types have a direct protobuf counterpart (for example, STRING → string), which keeps schemas consistent and reduces conversion complexity.
Readable and Extensible
Protobuf schemas are both human-readable and machine-readable, aiding collaboration, automation, and extensibility.
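
The type correspondence mentioned above can be pictured as a small lookup table. The sketch below is illustrative only and is not protobq's actual mapping: apart from STRING → string, which this README states, and google.protobuf.Timestamp, which the usage example below uses for a timestamp column, the entries are assumptions based on the usual pairing of scalar types. `ProtoType` is a hypothetical helper name.

```go
package main

import "fmt"

// protoTypeFor is an illustrative table of BigQuery type names and the
// protobuf types they are commonly paired with. It is an assumption for
// this sketch, not the mapping protobq implements.
var protoTypeFor = map[string]string{
	"STRING":    "string",
	"INT64":     "int64",
	"FLOAT64":   "double",
	"BOOL":      "bool",
	"BYTES":     "bytes",
	"TIMESTAMP": "google.protobuf.Timestamp",
}

// ProtoType looks up the protobuf type for a BigQuery type name.
// The second return value is false when this sketch has no entry.
func ProtoType(bqType string) (string, bool) {
	t, ok := protoTypeFor[bqType]
	return t, ok
}

func main() {
	for _, bq := range []string{"STRING", "TIMESTAMP", "GEOGRAPHY"} {
		if pt, ok := ProtoType(bq); ok {
			fmt.Printf("%s -> %s\n", bq, pt)
		} else {
			fmt.Printf("%s -> (no entry in this sketch)\n", bq)
		}
	}
}
```

Types without an obvious scalar counterpart, such as GEOGRAPHY, are deliberately left out rather than guessed.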

Why Materialized View First?

Simplified Architecture
Consolidating data into a unified base table simplifies data pipelines and downstream processes.
Query Optimization
Materialized views allow flexible clustering and partitioning, improving query performance for diverse use cases.
Cost and Performance Benefits
Precomputed results lower query costs and significantly improve performance for repetitive workloads.
Consistency and Reusability
A single base table ensures data integrity and facilitates schema reuse across multiple materialized views.

Quick Start

Installation

go install github.com/averak/protobq/cmd/protobq@latest

Usage

1. Define schema

syntax = "proto3";

package example;

import "averak/protobq/protobq.proto";
import "google/protobuf/timestamp.proto";

message View1 {
  option (protobq.materialized_view) = {
    base_table: "example"
    enable_refresh: true
  };

  google.protobuf.Timestamp timestamp = 1 [(protobq.materialized_view_field) = {
    origin_path: "timestamp"
    is_partitioned: true
  }];
  
  string message = 2;
}

2. Apply schema

protobq apply -i example/view1.proto --project-id {YOUR_PROJECT_ID} --dataset-id {YOUR_DATASET_ID}
