This is an adaptation of the original reference implementation of recursive model indexes (RMIs) that generates P4 source code files. A prototype RMI was first described in The Case for Learned Index Structures by Kraska et al. in 2017. The original reference implementation, which generates C++ source code files, can be found at learnedsystems/RMI.
This work is at best a proof of concept and is by no means a viable solution for real-world switches. A detailed analysis of the existing limitations and a more in-depth description of my work can be found in my bachelor thesis.
That said, this implementation focuses on generating P4 source code files that are tested with, and meant to be used with, the BMv2 reference software switch. As a result, it is subject to far fewer performance and similar constraints than a real-world switch would impose.
To use this implementation, clone this repository and install Rust.
Like the reference implementation, RMI-P4 is a compiler. It takes a dataset as input and produces P4 and Python source files as output. The input data file must be a binary file containing:
- The number of items, as a 64-bit unsigned integer (little endian)
- The data items, either 32-bit or 64-bit unsigned integers (little endian)
If the input file contains 32-bit integers, the filename must end with uint32. If the input file contains 64-bit integers, the filename must end with uint64.
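The layout described above can be produced with a few lines of Python. This is a minimal sketch for creating small test inputs; the helper names write_dataset and read_dataset are hypothetical and not part of this repository, and sorting the keys before writing is an assumption about what the compiler expects.

```python
import struct


def write_dataset(path, keys, bits=32):
    """Write keys as: uint64 item count, then the items, all little endian."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    item_fmt = "I" if bits == 32 else "Q"
    keys = sorted(keys)  # assumption: the compiler expects sorted keys
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(keys)))
        f.write(struct.pack(f"<{len(keys)}{item_fmt}", *keys))


def read_dataset(path, bits=32):
    """Read a file written by write_dataset back into a list of integers."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    item_fmt = "I" if bits == 32 else "Q"
    with open(path, "rb") as f:
        (count,) = struct.unpack("<Q", f.read(8))
        payload = f.read(count * (bits // 8))
    return list(struct.unpack(f"<{count}{item_fmt}", payload))
```

For example, write_dataset("example_uint32", [5, 1, 3]) creates a 20-byte file: 8 bytes for the count followed by three 4-byte items. The filename suffix has to match the chosen bit width.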
In addition to the input dataset, you must also provide a model structure. For example, to build a 2-layer RMI on the data file books_200M_uint32 (available from the Harvard Dataverse) with a branching factor of 100, one could run:
cargo run --release -- books_200M_uint32 p4-rmi linear,linear 100 --data-path path/where/to/store/model/parameters
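To make the linear,linear structure with a branching factor more concrete, here is a conceptual sketch of a two-layer RMI in Python. It is not the code this compiler generates and not its training procedure; all function names are hypothetical, and the error-bounded binary search at the end is one common way to correct the model's prediction. The root model picks one of branching_factor leaf models, and the chosen leaf predicts the position of the key in the sorted data.

```python
import bisect


def fit_linear(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n == 0:
        return 0.0, 0.0
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var = sum((x - mean_x) ** 2 for x in xs)
    if var == 0:
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov / var
    return slope, mean_y - slope * mean_x


def predict(model, key, upper):
    """Evaluate a linear model and clamp the result to [0, upper - 1]."""
    slope, intercept = model[0], model[1]
    return max(0, min(upper - 1, int(slope * key + intercept)))


def build_rmi(keys, branching_factor):
    """Train a root model that selects a leaf, and one linear model per leaf."""
    n = len(keys)
    # The root maps a key to a leaf index in [0, branching_factor).
    root = fit_linear(keys, [p * branching_factor / n for p in range(n)])
    buckets = [([], []) for _ in range(branching_factor)]
    for pos, key in enumerate(keys):
        xs, ys = buckets[predict(root, key, branching_factor)]
        xs.append(key)
        ys.append(pos)
    leaves = []
    for xs, ys in buckets:
        slope, intercept = fit_linear(xs, ys)
        # Largest prediction error on the training keys, rounded up.
        err = max((abs(slope * x + intercept - y) for x, y in zip(xs, ys)),
                  default=0)
        leaves.append((slope, intercept, int(err) + 1))
    return root, leaves


def lookup(keys, rmi, key):
    """Predict the position of key, then search within the error bound."""
    root, leaves = rmi
    leaf = leaves[predict(root, key, len(leaves))]
    guess = predict(leaf, key, len(keys))
    lo = max(0, guess - leaf[2])
    hi = min(len(keys), guess + leaf[2] + 1)
    return bisect.bisect_left(keys, key, lo, hi)
```

In this sketch the error bound is only guaranteed for keys that were present when the model was built; looking up absent keys would need extra handling.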
Logging of useful diagnostic information can be enabled by setting the RUST_LOG environment variable to trace: export RUST_LOG=trace
The RMI generator produces a P4 source code file and a Python source code file in the current directory. The P4 source code file is meant to be compiled with a P4 compiler such as p4c. Currently, the compiled P4 program is meant to be run on the BMv2 switch: copy the generated P4 file into the corresponding Mininet folder and run make. As a final step, execute the generated Python source file to send the necessary model parameters to the switch using P4Runtime.
To install BMv2 and the p4c compiler, follow the instructions given in the official P4 Tutorial in the section Obtaining required software.
Currently, the following types of RMI layers are supported for P4:
- linear, simple linear regression
- cubic, connected cubic spline segments
- radix, eliminates common prefixes and returns a fixed number of significant bits based on the branching factor
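The radix layer is the least self-explanatory of the three, so here is a rough Python sketch of the idea as described above. It is not taken from this implementation: the function names are hypothetical, and rounding the branching factor down to a power of two to obtain the number of bits is an assumption.

```python
def common_prefix_bits(lo, hi, width):
    """Number of leading bits shared by lo and hi in a width-bit encoding."""
    return width - (lo ^ hi).bit_length()


def make_radix(keys, branching_factor, width=32):
    """Return a function mapping a key to its significant bits after the prefix."""
    # Every key between min and max shares the prefix common to both.
    prefix = common_prefix_bits(min(keys), max(keys), width)
    # Assumption: use floor(log2(branching_factor)) bits.
    bits = min(branching_factor.bit_length() - 1, width - prefix)
    shift = width - prefix - bits
    mask = (1 << bits) - 1

    def radix(key):
        # Shift away the low bits, then mask off the common prefix.
        return (key >> shift) & mask

    return radix
```

With the four 32-bit keys 0xAB000000, 0xAB400000, 0xAB800000 and 0xABC00000 and a branching factor of 4, the shared prefix is the leading byte 0xAB, and the two bits that follow it map the keys to 0, 1, 2 and 3.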
The following remaining types of RMI layers are not (yet) supported for P4:
- normal, normal CDF with tuned mean, variance, and scale
- loglinear, simple linear regression with a log transform
- lognormal, normal CDF with log transform
- bradix, same as radix, but attempts to choose the number of bits based on balancing the dataset
- histogram, partitions the data into several even-sized blocks (based on the branching factor)