Introduction to ASON: AssemblyScript Object Notation

ASON is a data-oriented serialization algorithm designed for compact, fast storage of AssemblyScript objects in a binary format.

Many serialization methods, such as JSON and protobuf, can turn a conceptual object into a buffer or string of some kind. ASON, however, is fine-tuned specifically for AssemblyScript objects.

JSON and XML are declarative, tree-like data formats. ASON takes a data-oriented approach instead: rather than a declarative data structure, it uses a collection of tables to describe object shapes. Assembling a tree then becomes a linear-time operation with a minimal number of jumps.
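
To make the table idea concrete, here is a minimal sketch in plain TypeScript (not AssemblyScript, so it runs anywhere). It is not ASON's actual layout: the `TreeNode` and `NodeTables` shapes and both function names are assumptions made up for this illustration. It only shows why a tree stored as flat tables can be reassembled in a single linear pass.

```typescript
// Hypothetical node shape, used only for this illustration.
interface TreeNode {
  value: number;
  children: TreeNode[];
}

// Table form: row i describes node i. Rows are written in pre-order,
// so a parent's row always comes before the rows of its children.
interface NodeTables {
  values: number[];
  parents: number[]; // -1 marks the root
}

// Walk the tree once and append one row per node.
function flattenTree(root: TreeNode): NodeTables {
  const tables: NodeTables = { values: [], parents: [] };
  const stack: Array<{ node: TreeNode; parent: number }> = [{ node: root, parent: -1 }];
  while (stack.length > 0) {
    const { node, parent } = stack.pop()!;
    const id = tables.values.length;
    tables.values.push(node.value);
    tables.parents.push(parent);
    // Push children in reverse so they are popped in their original order.
    for (let i = node.children.length - 1; i >= 0; i--) {
      stack.push({ node: node.children[i], parent: id });
    }
  }
  return tables;
}

// Rebuild the tree with one pass over the rows: O(n), no recursion.
function assembleTree(tables: NodeTables): TreeNode {
  const nodes: TreeNode[] = [];
  for (let i = 0; i < tables.values.length; i++) {
    const node: TreeNode = { value: tables.values[i], children: [] };
    nodes.push(node);
    const parent = tables.parents[i];
    if (parent >= 0) nodes[parent].children.push(node);
  }
  return nodes[0];
}
```

Because every parent row precedes its children, the assembly loop never has to look ahead or backtrack, which is the property the paragraph above describes.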

Uses

This library is well suited to transferring references from one module to another module of exactly the same type, so that runtime type information matches on both sides.

These serialization methods are also useful for storing references, such as configuration files, on disk. If JSON is too verbose, requires too much memory, or takes too long to parse for the fast world of WebAssembly, ASON is a better alternative, because it greatly reduces the number of overhead bytes stored.

How To Use

Install from npm:

npm install --save-dev @ason/assembly

Modify your asconfig to include the transform:

{
  "options": {
    ... // other options here
    "transform": ["@ason/transform"]
  }
}

Import the library and serialize away!

You can use the built-in functions ASON.serialize() and ASON.deserialize():

import { ASON } from "@ason/assembly";

// serialize can determine type information
let buffer: StaticArray<u8> = ASON.serialize([3.14, 99, 25.624] as Array<f64>);

// deserialize must have the type passed (to perform type assertions)
let result: Array<f64> = ASON.deserialize<Array<f64>>(buffer);

assert(result.length == 3);
assert(result[0] == <f64>3.14);
assert(result[1] == <f64>99);
assert(result[2] == <f64>25.624);

It's also possible to save heap allocations by creating a Serializer and a Deserializer object once and reusing them. This is optimal when serializing multiple objects of the same type:

import { ASON } from "@ason/assembly";

class Vec3 {
  constructor(public x: f32, public y: f32, public z: f32) {}
}

let result = new Array<StaticArray<u8>>(); // an array of buffers
let serializer = new ASON.Serializer<Vec3>();

for (let i = 0; i < 10; i++) {
  result.push(serializer.serialize(new Vec3(1, 2, 3)));
}

let deserializer = new ASON.Deserializer<Vec3>();

for (let i = 0; i < 10; i++) {
  let vec = deserializer.deserialize(result[i]);
  assert(vec); // make sure the reference isn't null
  assert(vec.x == 1); // check the properties
  assert(vec.y == 2);
  assert(vec.z == 3);
}

Advanced: Writing Custom Serializers and Deserializers

Some objects have values that don't need to be stored in an ASON byte array in order to be preserved. To save space, you can write your own serializer and deserializer functions that leave out those extraneous values. Do this by defining two methods on your class:

  • __asonSerialize(): StaticArray<u8> - This function should take all values you wish to preserve in your object, store them in a StaticArray<u8>, and return that Byte Array.
  • __asonDeserialize(buffer: StaticArray<u8>): void - This function should take in the Byte Array generated by the __asonSerialize() function, and use that to rebuild the object.

class CustomVector {
  x: f32 = 1;
  y: f32 = 2;
  z: f32 = 3;

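  __asonSerialize(): StaticArray<u8> {
    // offsetof<CustomVector>() is the byte size of the instance's fields
    let result = new StaticArray<u8>(offsetof<CustomVector>());
    // copy the raw field bytes (x, y, z) out of this object
    memory.copy(changetype<usize>(result), changetype<usize>(this), offsetof<CustomVector>());
    return result;
  }

  __asonDeserialize(buffer: StaticArray<u8>): void {
    // reject buffers that do not match the field layout
    assert(buffer.length == offsetof<CustomVector>());
    // copy the stored bytes back over this object's fields
    memory.copy(changetype<usize>(this), changetype<usize>(buffer), offsetof<CustomVector>());
  }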
}
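
The example above copies every field. To illustrate the space-saving case, where a derived value is left out of the buffer and recomputed on load, here is a rough plain TypeScript sketch of the same idea. The class `PackedVector`, its `cachedLength` field, and the method names `toBytes` and `fromBytes` are hypothetical stand-ins for this illustration, not part of ASON.

```typescript
// Hypothetical class: `cachedLength` is derived from x, y, z,
// so it does not need to be stored in the byte array.
class PackedVector {
  x = 1;
  y = 2;
  z = 3;
  cachedLength = Math.hypot(1, 2, 3);

  // Store only x, y, z as three little-endian 32-bit floats (12 bytes).
  toBytes(): Uint8Array {
    const bytes = new Uint8Array(12);
    const view = new DataView(bytes.buffer);
    view.setFloat32(0, this.x, true);
    view.setFloat32(4, this.y, true);
    view.setFloat32(8, this.z, true);
    return bytes;
  }

  // Restore the stored fields, then recompute the derived one.
  fromBytes(bytes: Uint8Array): void {
    if (bytes.length !== 12) throw new RangeError("expected exactly 12 bytes");
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    this.x = view.getFloat32(0, true);
    this.y = view.getFloat32(4, true);
    this.z = view.getFloat32(8, true);
    this.cachedLength = Math.hypot(this.x, this.y, this.z);
  }
}
```

The trade-off is the usual one: the buffer is smaller, but the deserializer has to know how to rebuild whatever was skipped.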

Caveats

  • If the module that serializes and the module that deserializes are different, then runtime type information might not match. This will result in runtime errors, instanceof checks failing, and undefined behavior. ASON also validates type information for objects at the top level, so providing the wrong reference type parameter to the ASON.deserialize function will result in a runtime error.

  • ASON serialization optimizes for large object trees, at the cost of making simple serialization slightly more expensive.

  • ASON cannot serialize objects with more than 2^32-1 values or references in them. This is because u32.MAX_VALUE is reserved for null references so that map keys and set entries can contain null values. We have chosen to accept this limitation, because if you are attempting to serialize single objects that are 4 Gigabytes in size (at an absolute minimum), we will not pass judgment, but we will recommend refactoring.
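
The reserved null value in the last caveat can be illustrated with a small plain TypeScript sketch. The constant `NULL_ENTRY` and both function names are assumptions for this example; only the idea of reserving the largest u32 value for null comes from the text above.

```typescript
// Assumed sentinel: the largest u32 value stands for "null",
// so it can never be used as a real entry id.
const NULL_ENTRY = 0xffffffff;

// Encode a list of optional entry ids into a table of u32 values.
function encodeEntryIds(ids: Array<number | null>): Uint32Array {
  const out = new Uint32Array(ids.length);
  ids.forEach((id, i) => {
    if (id !== null && (!Number.isInteger(id) || id < 0 || id >= NULL_ENTRY)) {
      throw new RangeError("entry id must be an integer below 2^32 - 1");
    }
    out[i] = id === null ? NULL_ENTRY : id;
  });
  return out;
}

// Decode the table, turning the sentinel back into null.
function decodeEntryIds(raw: Uint32Array): Array<number | null> {
  return Array.from(raw, (id) => (id === NULL_ENTRY ? null : id));
}
```

Giving up one id in this way is what lets a map key or set entry be null without needing a separate flag per slot.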

Implementation

The object that the serializer returns is a StaticArray<u8>, which is just a buffer of bytes. This array is composed of a Header at the beginning of the buffer (which will keep changing until the API becomes stable) and the individual Tables that compose an ASON serialization. The Header simply contains the byte length of each Table.
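
As a rough picture of a header that records the byte length of each table, here is a plain TypeScript sketch. The layout (one little-endian u32 length per table, followed by the tables back to back) and the function names are assumptions for illustration only. The real ASON header is explicitly unstable and may look different.

```typescript
// Pack tables behind a header of little-endian u32 byte lengths.
function packTables(tables: Uint8Array[]): Uint8Array {
  const headerSize = 4 * tables.length;
  const total = tables.reduce((sum, table) => sum + table.length, headerSize);
  const out = new Uint8Array(total);
  const view = new DataView(out.buffer);
  let offset = headerSize;
  tables.forEach((table, i) => {
    view.setUint32(4 * i, table.length, true); // header slot i
    out.set(table, offset); // table bytes follow the header
    offset += table.length;
  });
  return out;
}

// Read `tableCount` lengths from the header and slice out each table.
function unpackTables(buffer: Uint8Array, tableCount: number): Uint8Array[] {
  const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  const tables: Uint8Array[] = [];
  let offset = 4 * tableCount;
  for (let i = 0; i < tableCount; i++) {
    const byteLength = view.getUint32(4 * i, true);
    tables.push(buffer.subarray(offset, offset + byteLength));
    offset += byteLength;
  }
  return tables;
}
```

With the lengths up front, a reader can locate any table by summing the preceding lengths, without scanning the table contents.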

After the ASONHeader is a series of Tables describing the shape and contents of every field, every reference (stored like a Table of c-like pointers), and every possible combination of data segments, sets, maps, etc. that could be contained within an object.

It also holds a linking Table that defines every way objects are linked to each other within the serialized object. These links must be defined and asserted while deserializing. Otherwise, the garbage collection algorithm could free objects, or otherwise mishandle them, while they are being assembled back together at deserialization time. An additional benefit of this kind of linking Table is that it gracefully handles circular references: the serializer recognizes that a reference to a specific object already exists within one of the other Tables, and instead of adding a duplicate reference, the linking Table simply points to the existing reference.
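
The way a linking table copes with circular references can be sketched in plain TypeScript. This is a simplified model under stated assumptions, not ASON's implementation: `GraphNode`, `GraphTables`, and both function names are invented for the example, and garbage collection concerns are ignored entirely.

```typescript
// Hypothetical object shape: a labelled node that references other nodes.
interface GraphNode {
  label: string;
  links: GraphNode[];
}

interface GraphTables {
  labels: string[]; // row i holds the label of entry i
  links: Array<[number, number]>; // [fromEntry, toEntry] pairs
}

function serializeGraph(root: GraphNode): GraphTables {
  const ids = new Map<GraphNode, number>();
  const tables: GraphTables = { labels: [], links: [] };

  // Return the existing entry id for a node, or register a new entry.
  function visit(node: GraphNode): number {
    const seen = ids.get(node);
    if (seen !== undefined) return seen; // already stored: reuse the entry
    const id = tables.labels.length;
    ids.set(node, id);
    tables.labels.push(node.label);
    for (const target of node.links) {
      tables.links.push([id, visit(target)]);
    }
    return id;
  }

  visit(root);
  return tables;
}

function deserializeGraph(tables: GraphTables): GraphNode {
  // Pass 1: allocate every object so that all link targets exist.
  const nodes: GraphNode[] = tables.labels.map((label) => ({ label, links: [] }));
  // Pass 2: apply the linking table; cycles need no special handling.
  for (const [from, to] of tables.links) {
    nodes[from].links.push(nodes[to]);
  }
  return nodes[0];
}
```

Since each object is registered before its outgoing links are followed, a cycle simply resolves to an id that already exists, and the traversal terminates.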

Lastly, we assert that the entryId stored at entry 0 is the primary entry of the buffer. If the generic type T of the Serializer is a number, it asserts that the primary entry of the reference table is a Box<T> instead of a T. The serializer wraps the primary entry in a Box<T>, if it does not already have a box.

MIT License

MIT License

Copyright (c) 2021 Joshua Tenner <tenner.joshua@gmail.com>

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.