This is a work-in-progress demonstration of reading a series of Cap'n Proto messages into Arrow. It uses the dynamic value Reader to traverse arbitrary schemas, which keeps the library schema-agnostic.
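To make the schema-agnostic idea concrete, the sketch below maps a simplified stand-in for a Cap'n Proto type description onto a simplified stand-in for an Arrow data type by recursing through lists and structs. The `CapnpType` and `ArrowType` enums and the `to_arrow` function are hypothetical and use only the standard library. The real crates expose richer types (`capnp::introspect::TypeVariant` and `arrow2::datatypes::DataType`), and this is not necessarily how the demo itself is implemented.

```rust
/// Hypothetical, trimmed-down stand-in for a Cap'n Proto type description.
#[derive(Debug, Clone, PartialEq)]
pub enum CapnpType {
    Bool,
    Int32,
    Float64,
    Text,
    List(Box<CapnpType>),
    Struct(Vec<(String, CapnpType)>),
}

/// Hypothetical, trimmed-down stand-in for an Arrow data type.
#[derive(Debug, Clone, PartialEq)]
pub enum ArrowType {
    Boolean,
    Int32,
    Float64,
    Utf8,
    List(Box<ArrowType>),
    Struct(Vec<(String, ArrowType)>),
}

/// Recursively translate a schema type into its columnar counterpart.
/// Nothing here depends on a specific schema: nested lists and structs
/// are handled by recursion rather than by generated code.
pub fn to_arrow(ty: &CapnpType) -> ArrowType {
    match ty {
        CapnpType::Bool => ArrowType::Boolean,
        CapnpType::Int32 => ArrowType::Int32,
        CapnpType::Float64 => ArrowType::Float64,
        CapnpType::Text => ArrowType::Utf8,
        CapnpType::List(inner) => ArrowType::List(Box::new(to_arrow(inner))),
        CapnpType::Struct(fields) => ArrowType::Struct(
            fields
                .iter()
                .map(|(name, field_ty)| (name.clone(), to_arrow(field_ty)))
                .collect(),
        ),
    }
}
```

Because the mapping is driven by the type description at runtime, the same function handles any nesting depth without knowing the schema ahead of time.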
rustup override set nightly
sudo apt install capnproto # install compiler
Generate an id: capnp id
Create a JSON Lines file in which each line holds one list of points:
cat << EOF > points.jsonl
{"values": [{"x": 0, "y": 1}, {"x": -1, "y": 2}]}
{"values": [{"x": 0, "y": 0}]}
{"values": [{"x": -2, "y": 3}]}
EOF
Convert the JSONL to binary Cap'n Proto messages based on the schema:
cat points.jsonl | capnp convert json:binary ./src/schema/point.capnp Points > points.bin
Run the binary messages through the demo:
$ cat points.bin | cargo run
shape: (3, 1)
┌─────────────────────────┐
│ values                  │
│ ---                     │
│ list[struct[2]]         │
╞═════════════════════════╡
│ [{0.0,1.0}, {-1.0,2.0}] │
│ [{0.0,0.0}]             │
│ [{-2.0,3.0}]            │
└─────────────────────────┘
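The `list[struct[2]]` column above holds a variable number of points per row. Arrow list arrays store such data as contiguous child values plus an offsets buffer that marks where each row starts and ends. The sketch below illustrates that layout with plain vectors. `PointsColumn` and `flatten_rows` are hypothetical names, the field types are assumed to be `f64`, and the code is independent of the demo's actual implementation.

```rust
/// Hypothetical columnar form of a list-of-points column.
#[derive(Debug, PartialEq)]
pub struct PointsColumn {
    /// offsets[i]..offsets[i + 1] is the range of child values for row i.
    pub offsets: Vec<i32>,
    /// All `x` values from every row, stored contiguously.
    pub xs: Vec<f64>,
    /// All `y` values from every row, stored contiguously.
    pub ys: Vec<f64>,
}

/// Flatten row-oriented lists of (x, y) points into offsets plus child columns.
pub fn flatten_rows(rows: &[Vec<(f64, f64)>]) -> PointsColumn {
    let mut offsets = Vec::with_capacity(rows.len() + 1);
    offsets.push(0); // the first row always starts at child index 0
    let mut xs = Vec::new();
    let mut ys = Vec::new();
    for row in rows {
        for &(x, y) in row {
            xs.push(x);
            ys.push(y);
        }
        // Record where this row ends, which is also where the next one begins.
        offsets.push(xs.len() as i32);
    }
    PointsColumn { offsets, xs, ys }
}
```

For the three rows shown above, `flatten_rows` would yield offsets `[0, 2, 3, 4]`, with four `x` values and four `y` values in the child columns.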
cargo test
The test schema is from the capnproto-rust repo:
wget -qO- https://raw.githubusercontent.com/capnproto/capnproto-rust/master/capnpc/test/test.capnp > tests/test.capnp
rust-gdb -q target/debug/capnp2arrow
(gdb) b rust_panic
(gdb) r < points.bin
- Reflection-based Debug implementation: https://github.com/capnproto/capnproto-rust/blob/f7c86befe11b27f33c2a45957d402abff2b9e347/capnp/src/stringify.rs
- Reflection-based example: https://github.com/capnproto/capnproto-rust/blob/master/example/fill_random_values/src/lib.rs
- Cap'n Proto TypeVariant: https://docs.rs/capnp/latest/capnp/introspect/enum.TypeVariant.html
- Arrow2 DataTypes: https://docs.rs/arrow2/latest/arrow2/datatypes/enum.DataType.html
- Cap'n Proto Language Reference: https://capnproto.org/language.html
- Cap'n Proto test schema: https://github.com/capnproto/capnproto/blob/master/c%2B%2B/src/capnp/test.capnp
- Cap'n Proto test JSON: https://github.com/capnproto/capnproto/blob/master/c%2B%2B/src/capnp/testdata/pretty.json