TeRM: Extending RDMA-Attached Memory with SSD

This is the open-source repository for our paper TeRM: Extending RDMA-Attached Memory with SSD, published at FAST '24 and in ACM Transactions on Storage.

Note that TeRM's codename is PDP (one step beyond ODP, on-demand paging). The name appears in identifiers such as libpdp.so and the PDP_* environment variables.

Directory structure

TeRM
|---- ae                 # artifact evaluation files
    |---- bin            # binaries built from the source code
    |---- scripts        # common scripts
    |---- figure-*.py    # scripts to execute experiments
    |---- run-all.sh     # run all experiments
|---- app                # source code of octopus and xstore with some bugs fixed 
|---- driver             # TeRM's driver
    |---- driver.patch   # patches to the official driver
    |---- mlnx-*.zip     # the patched driver
|---- libterm            # TeRM's userspace shared library

How to build

Environment

  • OS: Ubuntu 22.04.2 LTS
  • Kernel: Linux 5.19.0-50-generic
  • OFED driver: 5.8-2.0.3

We recommend using the same environment we used in development. You may need to adapt the source code for a different environment. The environment matters mainly for the driver, and the patched driver is needed only on the server side.
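
To compare a machine against the versions listed above, a check along the following lines may help. It is only a sketch: the function names check_version and check_term_env are hypothetical, and it assumes that ofed_info (shipped with MLNX_OFED) is on the PATH. Because the exact output of ofed_info -s can vary, the check only looks for the expected version as a substring.

```shell
# Hypothetical helper: report whether an observed version string contains the expected one.
check_version() {
    name="$1"; expected="$2"; actual="$3"
    case "$actual" in
        *"$expected"*)
            echo "OK $name: $actual" ;;
        *)
            echo "MISMATCH $name: expected $expected, found ${actual:-nothing}"
            return 1 ;;
    esac
}

# Compare this machine with the environment listed in this README.
check_term_env() {
    rc=0
    os="$(. /etc/os-release 2>/dev/null && echo "$PRETTY_NAME")"
    check_version "OS" "Ubuntu 22.04" "$os" || rc=1
    check_version "Kernel" "5.19.0-50-generic" "$(uname -r)" || rc=1
    # ofed_info is absent when MLNX_OFED is not installed; that shows up as a mismatch.
    check_version "OFED" "5.8-2.0.3" "$(ofed_info -s 2>/dev/null)" || rc=1
    return "$rc"
}
```

Running check_term_env on the server prints one line per component and returns non-zero if anything differs. A mismatch does not necessarily mean TeRM cannot work, only that the driver patch may need adapting.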

Dependencies

sudo apt install libfmt-dev libaio-dev libboost-coroutine-dev libmemcached-dev libgoogle-glog-dev libgflags-dev

Settings

Some settings are hard-coded in the source code. Please modify them to match your cluster.

  1. memcached. TeRM uses memcached to synchronize cluster metadata. Install memcached in your cluster and update its IP address and port in ae/scripts/reset-memc.sh, libterm/ibverbs-pdp/global.cc, and libterm/include/node.hh.

  2. CPU affinity. The relevant code is in class Schedule in libterm/include/util.hh. Modify the constants to match your CPU hardware.

Build the driver

The patched driver is required on the server side. There are two ways to build it. The second option uses an out-of-the-box driver zip file that we provide.

  1. Download the driver source code from the official website. Apply the official backport patches first, then apply the modifications listed in driver/driver.patch, and build the driver. Note that we apply only the minimum set of backport patches needed for our environment, not all of them. Do not git apply driver/driver.patch directly, because line numbers may differ. Read the patch and apply its changes manually.

  2. Use driver/mlnx-ofed-kernel-5.8-2.0.3.0.zip. Unzip it and run the contained build.sh.

Build libterm

We provide a CMakeLists.txt for building. It produces two outputs: the userspace shared library libpdp.so and a program named perf. Copy both files to ae/bin before running the AE scripts.

$ cd libterm
$ mkdir -p build && cd build
$ cmake .. -DCMAKE_BUILD_TYPE=Release # Release for compiler optimizations and high performance
$ make -j
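
The copy step mentioned above might look like the following. This is a sketch under assumptions: the helper name install_term_bins is hypothetical, and it assumes that both outputs land directly in the build directory, which depends on how CMakeLists.txt places them.

```shell
# Hypothetical helper: copy libpdp.so and perf from a build directory into ae/bin.
# Assumes both outputs sit directly in the build directory; adjust if they are placed elsewhere.
install_term_bins() {
    build_dir="$1"
    bin_dir="$2"
    for f in libpdp.so perf; do
        if [ ! -f "$build_dir/$f" ]; then
            echo "missing build output: $build_dir/$f" >&2
            return 1
        fi
    done
    mkdir -p "$bin_dir" &&
        cp "$build_dir/libpdp.so" "$build_dir/perf" "$bin_dir/"
}

# Example, run from libterm/build:
#   install_term_bins . ../../ae/bin
```

Checking for both files before copying avoids leaving ae/bin with a stale binary next to a fresh library when one of the build targets failed.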

How to use

  1. On the server side, replace the driver *.ko files with the modified ones and restart the openibd service.
  2. Restart the memcached instance. We provide a script ae/scripts/reset-memc.sh to do so.
  3. mmap an SSD in the RDMA program with MAP_SHARED and ibv_reg_mr the memory area as an ODP MR.
  4. Set LD_PRELOAD=libpdp.so on all nodes to enable TeRM. Also set the environment variables PDP_server_mmap_dev=nvmeXnY for the SSD backend and PDP_server_memory_gb=Z for the size of the mapped area. Set PDP_is_server=1 on the server side only.
  5. Run the RDMA application.
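
Steps 4 and 5 can be wrapped in a small launcher, sketched below. The function run_with_term and the PDP_LIB override are hypothetical and not part of the repository. The sketch sets the two PDP_server_* variables on every node, since the steps above do not say whether clients need them, and it leaves PDP_is_server unset on clients.

```shell
# Hypothetical wrapper: run a command with the TeRM environment from step 4.
# Usage: run_with_term <server|client> <nvme-device> <size-gb> <command> [args...]
run_with_term() {
    role="$1"; dev="$2"; size_gb="$3"
    shift 3
    (
        # Run in a subshell so the caller's environment is left untouched.
        export PDP_server_mmap_dev="$dev"      # e.g. nvme0n1 (placeholder)
        export PDP_server_memory_gb="$size_gb" # size of the mapped area in GB
        case "$role" in
            server) export PDP_is_server=1 ;;
            client) unset PDP_is_server ;;
            *) echo "role must be server or client" >&2; exit 2 ;;
        esac
        # PDP_LIB is an assumed override for the library path; the default follows step 4.
        export LD_PRELOAD="${PDP_LIB:-libpdp.so}"
        exec "$@"
    )
}
```

For example, run_with_term server nvme0n1 64 ./my_rdma_server on the server and run_with_term client nvme0n1 64 ./my_rdma_client on each client, where the device name, size, and binary names are placeholders. If libpdp.so is not on the loader's search path, set PDP_LIB to its full path (for instance the copy in ae/bin).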

libterm accepts a series of environment variables for configuration. Please refer to libterm/ibverbs-pdp/global.cc for more details.

If you have further questions about the repository or are interested in it, please feel free to open an issue or contact me by email (yangzhe.ac AT outlook.com). You can find me on GitHub at yzim.

To cite our paper:

@inproceedings {fast24-term,
author = {Zhe Yang and Qing Wang and Xiaojian Liao and Youyou Lu and Keji Huang and Jiwu Shu},
title = {{TeRM}: Extending {RDMA-Attached} Memory with {SSD}},
booktitle = {22nd USENIX Conference on File and Storage Technologies (FAST 24)},
year = {2024},
isbn = {978-1-939133-38-0},
address = {Santa Clara, CA},
pages = {1--16},
url = {https://www.usenix.org/conference/fast24/presentation/yang-zhe},
publisher = {USENIX Association},
month = feb
}

@article{tos24-term,
author = {Yang, Zhe and Wang, Qing and Liao, Xiaojian and Lu, Youyou and Huang, Keji and Shu, Jiwu},
title = {Efficiently Enlarging RDMA-Attached Memory with SSD},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {1553-3077},
url = {https://doi.org/10.1145/3700772},
doi = {10.1145/3700772},
journal = {ACM Trans. Storage},
month = oct
}