This is a tool for parsing Adobe Audience Manager CDF files, the log files produced by Adobe's DMP platform. Each line uses a particular separator notation to flatten what is actually a hierarchical structure.
While ingesting this data into a tool like BigQuery or any other data science platform, we at Divisadero needed to transform it into something accessible.
The easiest way to preserve the whole line/log structure without losing any data seemed to be another line/log structure, one that is easier to access.
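To illustrate the idea, here is a minimal sketch that turns one delimited line into one JSON line. The delimiters (`|`, `,`, `=`) and the field names (`eventTime`, `device`, `traits`, `params`) are hypothetical and chosen only for readability; they are not the real CDF separators or the schema used by this library.

```javascript
// Hypothetical delimiters, chosen for readability; NOT the real CDF separators.
const FIELD_SEP = '|';
const ITEM_SEP = ',';
const PAIR_SEP = '=';

// Split a list field, treating an empty field as an empty list.
const splitList = (field) => (field === '' ? [] : field.split(ITEM_SEP));

// Turn one delimited line into a nested object.
// The field names are illustrative assumptions, not the library's schema.
function lineToRecord(line) {
  const [eventTime, device, traits = '', params = ''] = line.split(FIELD_SEP);
  return {
    eventTime,
    device,
    traits: splitList(traits),
    params: splitList(params).map((pair) => {
      const [key, value] = pair.split(PAIR_SEP);
      return { key, value };
    }),
  };
}

// One input line becomes one JSON line, so the line-per-event layout is kept
// while the nested parts become directly addressable.
const toJsonLine = (line) => JSON.stringify(lineToRecord(line));

console.log(toJsonLine('2019-01-01 10:00:00|abc123|t1,t2|color=red,size=xl'));
```

Writing one such JSON object per line keeps the output as easy to stream and append as the original log.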
Simply use your preferred Node package manager to add it:
yarn add aam-cdf-parser
A small demonstration script, cmd.js, is provided to show how it works in
general terms. You can invoke it with:
./cmd.js input.gz output.ndj
We provide a small script for creating the table in BigQuery with the current
schema (defined in schema.json). To create the table, either load the schema
in the Web UI or call the script from the terminal like so:
./mktable.js my_dataset
This creates a partitioned table in BigQuery (partitioned by event time, not ingestion time), so you can insert the generated files directly.
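For reference, partitioning by an event-time column rather than by ingestion time comes down to naming that column in the table's time-partitioning settings. The fragment below is a sketch in the shape of the BigQuery table resource; the column name `EventTime` is an assumption, and mktable.js may build its request differently.

```javascript
// Sketch of time-partitioning settings for a BigQuery table.
// ASSUMPTION: 'EventTime' is a hypothetical name for the event timestamp
// column; use the TIMESTAMP or DATE column defined in schema.json.
const tableOptions = {
  timePartitioning: {
    type: 'DAY', // one partition per day
    field: 'EventTime', // omit `field` to partition by ingestion time instead
  },
};

console.log(JSON.stringify(tableOptions, null, 2));
```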
parse(in: InputStream, out: OutputStream): Stream
The library has several convenience methods/wrappers around the main parse
method, which is the primitive method and the core of the library.
It chains several stream transformations and returns the last one (in case
you want to keep chaining).
const {parse} = require('aam-cdf-parser');
// ...
const input = some.method.to.get.an.inputStream(); // placeholder: any readable stream
const output = some.method.to.get.an.outputStream(); // placeholder: any writable stream
const onFinish = () => {console.log('done')};
parse(input, output).on('finish', onFinish);
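The "chain and return the last stream" pattern can be sketched with Node's built-in stream module. This is not the library's implementation: `chain`, `upperCase`, and the sample data are hypothetical stand-ins for the real transformations.

```javascript
const { Readable, Writable, Transform } = require('stream');

// Hypothetical transformation standing in for the library's real ones.
const upperCase = () =>
  new Transform({
    transform(chunk, _encoding, done) {
      done(null, chunk.toString().toUpperCase());
    },
  });

// Pipe input through each transform and into output.
// pipe() returns its destination, so the reduce yields the last stream,
// which the caller can listen on or keep chaining from.
function chain(input, transforms, output) {
  return [...transforms, output].reduce((src, dest) => src.pipe(dest), input);
}

// Usage: collect the result in memory and print it once writing has finished.
const chunks = [];
const sink = new Writable({
  write(chunk, _encoding, done) {
    chunks.push(chunk.toString());
    done();
  },
});

chain(Readable.from(['a\n', 'b\n']), [upperCase()], sink).on('finish', () => {
  console.log(chunks.join(''));
});
```

Returning the last stream is what makes `.on('finish', ...)` work at the call site: the 'finish' event fires on the writable end once all data has been flushed to it.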
promiseParse(in: InputStream, out: OutputStream): Promise<boolean>
Import it into your code with either require or import:
const {promiseParse} = require('aam-cdf-parser');
// ...
const input = some.method.to.get.an.inputStream(); // placeholder: any readable stream
const output = some.method.to.get.an.outputStream(); // placeholder: any writable stream
const onFinish = () => {console.log('done')};
promiseParse(input, output).then(onFinish);
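A promise wrapper around a stream-returning function can be sketched as follows. It is only an illustration of the pattern, not necessarily how the library does it: `fakeParse` is a hypothetical stand-in for `parse`, and `toPromise` is a made-up helper name.

```javascript
const { Readable, Writable } = require('stream');

// Hypothetical stand-in for parse(): pipes input to output and returns
// the last stream in the chain.
const fakeParse = (input, output) => input.pipe(output);

// Wrap a stream-returning function so it resolves with true once the last
// stream has finished, and rejects if either end emits an error.
function toPromise(parseFn) {
  return (input, output) =>
    new Promise((resolve, reject) => {
      input.on('error', reject);
      parseFn(input, output)
        .on('finish', () => resolve(true))
        .on('error', reject);
    });
}

// Usage: discard the data and report completion.
const devNull = new Writable({
  write(_chunk, _encoding, done) {
    done();
  },
});

toPromise(fakeParse)(Readable.from(['line\n']), devNull).then((ok) => {
  console.log('done', ok);
});
```

The same shape works for a callback-style wrapper: call the callback from the 'finish' and 'error' handlers instead of resolving or rejecting.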
callbackParse(in: InputStream, out: OutputStream, callback: Function)
Import it into your code with either require or import:
const {callbackParse} = require('aam-cdf-parser');
// ...
const input = some.method.to.get.an.inputStream(); // placeholder: any readable stream
const output = some.method.to.get.an.outputStream(); // placeholder: any writable stream
const onFinish = () => {console.log('done')};
callbackParse(input, output, onFinish);
local(in: String, out: String, callback: Function)
Import it into your code with either require or import:
const {local} = require('aam-cdf-parser');
// ...
const input = 'my-input-cdf-file.gz';
const output = 'my-output-file.json';
const onFinish = () => {console.log('done')};
local(input, output, onFinish);
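A path-based wrapper like this could be sketched with Node's built-in modules. The sketch assumes the input is gzipped and that decompression happens inside the wrapper, which the source does not confirm. `localSketch` is a hypothetical name, and the `PassThrough` stream is only a placeholder for the real parsing transforms.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');
const { pipeline, PassThrough } = require('stream');

// Hypothetical sketch of a path-based wrapper, not the library's code.
// ASSUMPTION: the input file is gzipped and is decompressed here.
function localSketch(inputPath, outputPath, callback) {
  pipeline(
    fs.createReadStream(inputPath),
    zlib.createGunzip(),
    new PassThrough(), // placeholder for the real parsing transforms
    fs.createWriteStream(outputPath),
    callback // receives an error on failure, nothing on success
  );
}

// Usage: build a small gzipped file in a temp directory and process it.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cdf-'));
const demoInput = path.join(dir, 'input.gz');
const demoOutput = path.join(dir, 'output.ndj');
fs.writeFileSync(demoInput, zlib.gzipSync('line 1\nline 2\n'));

localSketch(demoInput, demoOutput, (err) => {
  if (err) throw err;
  console.log(fs.readFileSync(demoOutput, 'utf8'));
});
```

Using `pipeline` rather than chained `pipe` calls means an error in any stage reaches the callback and the remaining streams are cleaned up.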