Plugins

Jakub Puchała edited this page Mar 27, 2018 · 10 revisions

Writing your own queryable sources

Musoq uses the concept of schemas and virtual tables. When you type ... from #csv.file('path/to/file.csv'), you are referring to a schema named csv and a parameterized source from which the virtual table will be created. The steps required to create your own source are:

  1. Implement ISchemaTable. It defines which columns the table has.
  2. Implement a queryable source by inheriting from RowSource.
  3. Implement user-defined functions by inheriting from LibraryBase.
  4. Implement a schema by inheriting from SchemaBase.
  5. Edit the service .app config file.

All of these tasks are fairly easy, and we will implement a very simple log source to make sure everything is clear and explained in detail.

Let's start with something really easy: a simple flat file reader. It will treat a file as a queryable source, read its lines one by one, and do something useful with them. Our table will consist of two columns, LineNumber and Line. The first column is the position of the line in the file. The second one is the text of that line.

Implement ISchemaTable

After that short introduction, let's write some code. The first piece is ISchemaTable. This interface defines which columns your source has. Based on it, the evaluator can access the proper column of the source and knows its name, type, and index. Before we implement the table, we need a helper class. Take a look at it below:

using System;
using System.Collections.Generic;
// Also requires the Musoq namespace(s) that declare ISchemaColumn and SchemaColumn.

public static class FlatFileHelper
{
    // Column name -> column index.
    public static readonly IDictionary<string, int> FlatNameToIndexMap;
    // Column index -> function that reads the column's value from a row.
    public static readonly IDictionary<int, Func<FlatFileEntity, object>> FlatIndexToMethodAccessMap;
    // Column metadata: name, index, and value type.
    public static readonly ISchemaColumn[] FlatColumns;

    static FlatFileHelper()
    {
        FlatNameToIndexMap = new Dictionary<string, int>
        {
            {nameof(FlatFileEntity.LineNumber), 0},
            {nameof(FlatFileEntity.Line), 1}
        };

        FlatIndexToMethodAccessMap = new Dictionary<int, Func<FlatFileEntity, object>>
        {
            {0, info => info.LineNumber},
            {1, info => info.Line}
        };

        // Indexes here must match the ones used in the two maps above.
        FlatColumns = new ISchemaColumn[]
        {
            new SchemaColumn(nameof(FlatFileEntity.LineNumber), 0, typeof(int)),
            new SchemaColumn(nameof(FlatFileEntity.Line), 1, typeof(string))
        };
    }
}

This class does three things:

  1. Defines the columns in the FlatColumns field. As you can see, it just instantiates an array of columns with their names, indexes, and the types of the stored values.
  2. Defines FlatNameToIndexMap, which maps a column name to its index.
  3. Defines FlatIndexToMethodAccessMap, which describes how to read the value of a given column from a row of the source.

One thing has not been mentioned yet: what is FlatFileEntity in public static readonly IDictionary<int, Func<FlatFileEntity, object>> FlatIndexToMethodAccessMap? The FlatFileEntity type holds a single row from the source. In our case, it contains a single line of the file and its line number. With this helper class in place, we can move forward and implement the table itself.
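As a minimal sketch, FlatFileEntity could be a plain class with the two properties the helper refers to. The property types follow the typeof(int) and typeof(string) declarations in FlatColumns. Everything else here (the ColumnLookupDemo class and its ReadColumn method) is hypothetical and only illustrates how a column name can be resolved to a value through the two maps; it is not Musoq's actual evaluator code.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape: one row of the source, i.e. one line of the file.
public class FlatFileEntity
{
    public int LineNumber { get; set; }
    public string Line { get; set; }
}

// Hypothetical helper, for illustration only.
public static class ColumnLookupDemo
{
    // Same content as FlatFileHelper.FlatNameToIndexMap.
    private static readonly IDictionary<string, int> NameToIndex =
        new Dictionary<string, int>
        {
            {nameof(FlatFileEntity.LineNumber), 0},
            {nameof(FlatFileEntity.Line), 1}
        };

    // Same content as FlatFileHelper.FlatIndexToMethodAccessMap.
    private static readonly IDictionary<int, Func<FlatFileEntity, object>> IndexToAccessor =
        new Dictionary<int, Func<FlatFileEntity, object>>
        {
            {0, info => info.LineNumber},
            {1, info => info.Line}
        };

    // Resolves a column name to its index, then reads the value from the row.
    public static object ReadColumn(FlatFileEntity row, string columnName)
    {
        var index = NameToIndex[columnName];
        return IndexToAccessor[index](row);
    }
}
```

The two-step lookup (name to index, index to accessor) means a column name only has to be resolved once; after that, rows can be read by index alone.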

The simplest implementation is really short and needs little explanation. It just assigns the columns from the helper class to our table instance.

public class FlatFileTable : ISchemaTable
{
    public ISchemaColumn[] Columns { get; } = FlatFileHelper.FlatColumns;
}

As you can see, no file access happens here, so where do we access our file? Not in this class: it only defines metadata about your source. We will implement the file access a little further below.

Implement queryable source

The row source is the most substantial piece of code you will need to write, and as you will see, its complexity depends mostly on how hard it is to access the rows for your use case. In this tutorial, it won't be hard. The basic approach proposed by the framework is chunked source loading: the file is loaded in parts and only those parts are processed, so the processor doesn't have to wait for the whole file to be loaded.
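The idea of chunked loading can be sketched without any Musoq types. The example below is an assumption-laden illustration, not the framework's RowSource API: ChunkedFlatFileReader, CollectChunks, and the chunkSize parameter are hypothetical names, and line numbers are assumed to start at 1. A producer reads lines, groups them into fixed-size chunks, and hands each chunk over through a BlockingCollection, so a consumer can start working before the whole file has been read.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;

// Repeated here so the sketch compiles on its own.
public class FlatFileEntity
{
    public int LineNumber { get; set; }
    public string Line { get; set; }
}

// Hypothetical helper, for illustration only.
public static class ChunkedFlatFileReader
{
    // Reads lines from 'reader' and publishes them in chunks of at most
    // 'chunkSize' rows. Marks the collection complete when the input ends.
    public static void CollectChunks(
        TextReader reader,
        int chunkSize,
        BlockingCollection<IReadOnlyList<FlatFileEntity>> chunks)
    {
        var lineNumber = 0;
        var chunk = new List<FlatFileEntity>(chunkSize);
        string line;

        while ((line = reader.ReadLine()) != null)
        {
            // Assumption: line numbers are 1-based.
            chunk.Add(new FlatFileEntity { LineNumber = ++lineNumber, Line = line });

            if (chunk.Count < chunkSize)
                continue;

            // Chunk is full: hand it over and start a new one.
            chunks.Add(chunk);
            chunk = new List<FlatFileEntity>(chunkSize);
        }

        // Publish the last, possibly partial, chunk.
        if (chunk.Count > 0)
            chunks.Add(chunk);

        chunks.CompleteAdding();
    }
}
```

In a real source the producer would typically run on a background task (for example via Task.Run) with a StreamReader over the file, while the consumer iterates chunks.GetConsumingEnumerable(), which ends once CompleteAdding has been called and all chunks have been taken.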