Skip to content

Latest commit

 

History

History
211 lines (160 loc) · 8.53 KB

README.md

File metadata and controls

211 lines (160 loc) · 8.53 KB

a data pipe for quackheads

🐤 quackpipe

QuackPipe is a serverless OLAP API built on top of DuckDB emulating and aliasing the ClickHouse HTTP API

Play with DuckDB SQL and Cloud storage though a familiar API, without giving up old habits and integrations.

🐥 Demos

🐥 try a sample s3/parquet query in our miniature playground (fly.io free tier, 1x-shared-vcpu, 256Mb)
🐥 launch your own free instance on fly.io



🌱 Get Started

Download a binary release, use docker or build from source

🐋 Using Docker

docker pull ghcr.io/metrico/quackpipe:latest
docker run -ti --rm -p 8123:8123 ghcr.io/metrico/quackpipe:latest

📦 Download Binary

curl -fsSL github.com/metrico/quackpipe/releases/latest/download/quackpipe-amd64 --output quackpipe \
&& chmod +x quackpipe
🔌 Start Server w/ parameters
./quackpipe --port 8123
🔌 Start Server w/ file database, READ-ONLY access
./quackpipe --port 8123 --params "/tmp/test.db?access_mode=READ_ONLY"
🔌 Start Server w/ Motherduck authentication token
Using DuckDB Params
./quackpipe --port 8123 --params "/tmp/test.db?motherduck_token=YOUR_TOKEN_HERE"
Using System ENV
export motherduck_token='<token>'
./quackpipe --port 8123 

Run with -h for a full list of parameters

Parameters
params usage default
--port HTTP API Port 8123
--host HTTP API Host 0.0.0.0
--stdin STDIN query mode false
--format FORMAT handler JSONCompact
--params Optional Parameters

👉 Playground

Execute stateless queries w/o persistence using the embedded playground

👉 Stateful Queries

Execute stateful queries with data persistence by adding unique HTTP Authentication. No registration required.

👉 API

Execute queries using the POST API

curl -X POST https://quackpipe.fly.dev 
   -H "Content-Type: application/json"
   -d 'SELECT version()'  

👉 STDIN

Execute queries using STDIN

# echo "SELECT 'hello', version() as version FORMAT CSV" | ./quackpipe --stdin
hello,v1.1.1

🤜 Clickhouse SQL (chsql)

Quackpipe speaks a little ClickHouse SQL using the chsql DuckDB Extension providing users with 100+ ClickHouse SQL Command Macros two clients (HTTP/S and Native) to interact with remote ClickHouse APIs

Example

--- Install and load chsql
D INSTALL chsql FROM community;
D LOAD chsql;

--- Use any of the 100+ ClickHouse Function Macros
D SELECT IPv4StringToNum('127.0.0.1'), IPv4NumToString(2130706433);
┌──────────────────────────────┬─────────────────────────────┐
│ ipv4stringtonum('127.0.0.1') │ ipv4numtostring(2130706433) │
│            int32             │           varchar           │
├──────────────────────────────┼─────────────────────────────┤
│                   2130706433127.0.0.1                   │
└──────────────────────────────┴─────────────────────────────┘

Remote Queries

The built-in ch_scan function can be used to query remote ClickHouse servers using the HTTP/s API

--- Set optional X-Header Authentication
D CREATE SECRET extra_http_headers (
      TYPE HTTP,
      EXTRA_HTTP_HEADERS MAP{
          'X-ClickHouse-User': 'user',
          'X-ClickHouse-Key': 'password'
      }
  );
--- Query using the HTTP API
D SELECT * FROM ch_scan("SELECT number * 2 FROM numbers(10)", "https://play.clickhouse.com");

🤜 Extensions

Several extensions are pre-installed by default in Docker images, including parquet, json, httpfs
When using HTTP API, httpfs, parquet, json extensions are automatically pre-loaded by the wrapper.

Users can pre-install extensions and execute quackpipe using a custom parameters:

echo "INSTALL httpfs;" | ./quackpipe --stdin --params "?extension_directory=/tmp/"
./quackpipe --port 8123 --host 0.0.0.0 --params "?extension_directory=/tmp/"

ClickHouse HTTP

Quackpipe can be used to query a remote instance of itself and/or ClickHouse using the HTTP API

CREATE OR REPLACE MACRO quackpipe(query, server := 'https://play.clickhouse.com', user := 'play', format := 'JSONEachRow') AS TABLE
    SELECT * FROM read_json_auto(concat(server, '/?default_format=', format, '&user=', user, '&query=', query));

SELECT * FROM quackpipe("SELECT number as once, number *2 as twice FROM numbers(10)")

ClickHouse UDF

Quackpipe can be used as executable UDF to get DuckDB data IN/OUT of ClickHouse queries:

SELECT *
FROM executable('quackpipe -stdin -format TSV', TSV, 'id UInt32, num UInt32', (
    SELECT 'SELECT 1, 2'
))
Query id: dd878948-bec8-4abe-9e06-2f5813653c3a
┌─id─┬─num─┐
│  12 │
└────┴─────┘
1 rows in set. Elapsed: 0.155 sec.

🃏 What is this? Think of it as a SELECT within a SELECT with a different syntax.
🃏 Format confusion? Make DuckDB SQL feel like ClickHouse with the included ClickHouse Macro Aliases



🚧 Feature Status

  • DuckDB Core 1
    • cgo binding
    • Extension preloading
    • Aliases Extension
  • REST API 2
    • CH FORMAT Emulation
      • CSV, CSVWithNames
      • TSV, TSVWithNames
      • JSONCompact
      • Native
    • Web Playground (from ClickkHouse, Apache2 Licensed) 3
  • STDIN Fast Query Execution
  • ClickHouse Executable UDF
  • :memory: mode Cloud Storage (s3/r2/minio, httpfs, etc)
  • :file: mode using optional parameters

Contributors

    Contributors @metrico/quackpipe

Community

Stargazers for @metrico/quackpipe

🃏 Disclaimers

Footnotes

  1. DuckDB ® is a trademark of DuckDB Foundation. All rights reserved by their respective owners.

  2. Released under the MIT license. See LICENSE for details. All rights reserved by their respective owners.

  3. ClickHouse ® is a trademark of ClickHouse Inc. No direct affiliation or endorsement.