ScalaBFS is a BFS accelerator built on an FPGA equipped with HBM (i.e., an FPGA-HBM platform). Its performance scales with the number of memory channels available on a single card. It uses multiple processing elements to exploit the high bandwidth of HBM and improve efficiency. We implemented the ScalaBFS prototype on a Xilinx Alveo U280 FPGA card (real hardware). Paper: https://arxiv.org/abs/2105.11754
The Chisel code for ScalaBFS is located in the src/ directory. The Vitis project is located in the ScalaBFS-proj/ directory after deployment. Graph data processing files are provided in the preprocess/ directory.
This project runs on the Xilinx U280 Data Center Accelerator card.
Ubuntu 18.04 LTS
U280 Package File on Vitis 2019.2
Notice:
- After installing xdma and manually updating the shell on the Alveo card, you should cold reboot your machine. Normally the update command is shown during the xdma installation; if not, you can use "/opt/xilinx/xrt/bin/xbmgmt flash --update". A cold reboot means shutting down the machine, unplugging the power, waiting several minutes, plugging the power back in, and booting up again. You can then use the commands
/opt/xilinx/xrt/bin/xbmgmt flash --scan
/opt/xilinx/xrt/bin/xbutil validate
to make sure that the runtime environment and the Alveo card are ready.
- Don't forget to add XRT and Vitis to your PATH. Typically you can run
source /opt/xilinx/xrt/setup.sh
source /tools/Xilinx/Vitis/2019.2/settings64.sh
You can also add these two commands to your .bashrc file. If building ScalaBFS fails with "make: vivado: Command not found", you very likely skipped this step.
- If you meet "PYOPENCL INSTALL FAILED" during the installation of XRT, refer to AR# 73055.
- If you meet "XRT Requires opencl header" when you open Vitis, refer to Vitis prompt "XRT Requires opencl header".
To compile the Chisel code, you need to install:
- Java 1.8
sudo apt install openjdk-8-jre-headless
sudo apt-get install java-wrappers
sudo apt-get install default-jdk
- sbt 1.4.2
echo "deb https://dl.bintray.com/sbt/debian /" | \
sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 \
--recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823
sudo apt-get update
sudo apt-get install sbt
- Scala 2.11.12
sudo apt install scala
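Before building, it can help to confirm that the tools mentioned above are actually reachable from your shell. The following is a minimal sketch and not part of ScalaBFS: the tool list is an assumption drawn from the commands and the error message quoted in this README, and `missing_tools` is a hypothetical helper name.

```python
import shutil

# Executables referenced in this README (assumed list; adjust to your setup).
REQUIRED_TOOLS = ["vivado", "xbutil", "java", "sbt", "scala"]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

If `vivado` or `xbutil` is reported as missing, the two `source` commands from the notice above have probably not been run in the current shell.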
$ git clone https://github.com/lizardll/ScalaBFS.git
$ cd ScalaBFS
$ make
Before deploying and running ScalaBFS, make sure that you have graph data in the divided CSC-CSR format that ScalaBFS requires. For the complete graph data preprocessing guide, see Data Preprocess.
We use a small directed graph named Wiki-Vote as an example. First, build the preprocessing tool for directed or undirected graphs, as appropriate. Then generate the divided graph data with 32 channels and 64 PEs for ScalaBFS.
cd data_preprocess
make all
./GraphToScalaBFS Wiki-Vote.txt 32 64
ScalaBFS-proj/workspace
For the preprocessed Wiki-Vote graph data from the previous step, first modify the input file name at line 121:
string bfs_filename = "YOUR_DIR_HERE/ScalaBFS/data_preprocess/Wiki-Vote_pe_64_ch_";
Then modify lines 122-127 according to data_preprocess/Wiki-Vote_addr_pe_64_ch_32.log:
cl_uint csr_c_addr = 260;
cl_uint csr_r_addr = 0;
cl_uint level_addr = 2958;
cl_uint node_num = 8298;
cl_uint csc_c_addr = 1780;
cl_uint csc_r_addr = 1520;
To report the correct performance value, we also need to set the edge count of the dataset on line 132 (in this case, Wiki-Vote has 103689 edges):
result = 103689;
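The edge count matters because BFS throughput is usually reported in traversed edges per second. As a rough sketch of that arithmetic, assuming the common definition (edge count divided by BFS execution time, scaled to giga-edges), the calculation looks like this. The function name `gteps` and the runtime value are hypothetical; the actual host code may compute the figure differently.

```python
def gteps(edge_count, seconds):
    """Throughput in giga-traversed edges per second (GTEPS)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return edge_count / seconds / 1e9


if __name__ == "__main__":
    wiki_vote_edges = 103689   # edge count given in this README
    example_runtime = 10e-6    # hypothetical runtime in seconds, NOT a measurement
    print(f"{gteps(wiki_vote_edges, example_runtime):.4f} GTEPS")
```

A wrong edge count scales the reported GTEPS by the same factor, which is why the value has to match the dataset.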
After that, it's time to build the whole project in Vitis. Select the "Hardware" target in the lower-left corner and press the hammer button to build it. The build generally takes 10 to 15 hours.
The running results will be like this:
TABLE 1: Graph datasets
Graphs | Vertices (M) | Edges (M) | Avg. Degree | Directed |
---|---|---|---|---|
soc-Pokec (PK) | 1.63 | 30.62 | 18.75 | Y |
soc-LiveJournal (LJ) | 4.85 | 68.99 | 14.23 | Y |
com-Orkut (OR) | 3.07 | 234.37 | 76.28 | N |
hollywood-2009 (HO) | 1.14 | 113.89 | 99.91 | N |
RMAT18-8 | 0.26 | 2.05 | 7.81 | N |
RMAT18-16 | 0.26 | 4.03 | 15.39 | N |
RMAT18-32 | 0.26 | 7.88 | 30.06 | N |
RMAT18-64 | 0.26 | 15.22 | 58.07 | N |
RMAT22-16 | 4.19 | 65.97 | 15.73 | N |
RMAT22-32 | 4.19 | 130.49 | 31.11 | N |
RMAT22-64 | 4.19 | 256.62 | 61.18 | N |
RMAT23-16 | 8.39 | 132.38 | 15.78 | N |
RMAT23-32 | 8.39 | 262.33 | 31.27 | N |
RMAT23-64 | 8.39 | 517.34 | 61.67 | N |
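The Avg. Degree column appears to be the edge count divided by the vertex count. The short sketch below recomputes it for the four real-world graphs, using the rounded values printed in Table 1, so small differences from the listed numbers are expected. The relationship is inferred from the table, not stated by the authors, and `avg_degree` is a hypothetical helper.

```python
# (vertices in millions, edges in millions, listed average degree) from Table 1
TABLE1 = {
    "soc-Pokec": (1.63, 30.62, 18.75),
    "soc-LiveJournal": (4.85, 68.99, 14.23),
    "com-Orkut": (3.07, 234.37, 76.28),
    "hollywood-2009": (1.14, 113.89, 99.91),
}


def avg_degree(vertices_m, edges_m):
    """Average degree as edges per vertex (both given in millions)."""
    return edges_m / vertices_m


if __name__ == "__main__":
    for name, (vertices, edges, listed) in TABLE1.items():
        computed = avg_degree(vertices, edges)
        print(f"{name}: computed {computed:.2f}, listed {listed}")
```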
TABLE 2: Performance comparison between Gunrock and ScalaBFS (32-PC/64-PE configuration)
Datasets | Gunrock on V100: Throughput (GTEPS) | Gunrock on V100: Power eff. (GTEPS/watt) | ScalaBFS on U280: Throughput (GTEPS) | ScalaBFS on U280: Power eff. (GTEPS/watt) |
---|---|---|---|---|
soc-Pokec (PK) | 14.9 | 0.050 | 16.2 | 0.506 |
soc-LiveJournal (LJ) | 18.5 | 0.062 | 11.2 | 0.350 |
com-Orkut (OR) | 150.6 | 0.502 | 19.1 | 0.597 |
hollywood-2009 (HO) | 73 | 0.243 | 16.4 | 0.513 |
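Power efficiency in Table 2 is throughput divided by power, so dividing the two columns recovers the power figure each row implies. The sketch below does that and also compares the efficiency of the two platforms per dataset. The wattages it prints are back-computed from the rounded table values rather than stated in this README, and the function names are hypothetical.

```python
# (throughput in GTEPS, power efficiency in GTEPS/watt) from Table 2
GUNROCK_V100 = {
    "PK": (14.9, 0.050),
    "LJ": (18.5, 0.062),
    "OR": (150.6, 0.502),
    "HO": (73.0, 0.243),
}
SCALABFS_U280 = {
    "PK": (16.2, 0.506),
    "LJ": (11.2, 0.350),
    "OR": (19.1, 0.597),
    "HO": (16.4, 0.513),
}


def implied_power(throughput, efficiency):
    """Power in watts implied by a throughput and a GTEPS/watt figure."""
    return throughput / efficiency


def efficiency_ratio(dataset):
    """ScalaBFS power efficiency relative to Gunrock for one dataset."""
    return SCALABFS_U280[dataset][1] / GUNROCK_V100[dataset][1]


if __name__ == "__main__":
    for name in GUNROCK_V100:
        v100_watts = implied_power(*GUNROCK_V100[name])
        u280_watts = implied_power(*SCALABFS_U280[name])
        print(f"{name}: V100 ~{v100_watts:.0f} W, U280 ~{u280_watts:.0f} W, "
              f"efficiency ratio {efficiency_ratio(name):.2f}x")
```

Read this way, the table is consistent with roughly 300 W for the V100 and roughly 32 W for the U280, which explains why ScalaBFS is ahead on power efficiency in every row even where Gunrock has the higher throughput.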
FIGURE 1: Performances and aggregated bandwidths of ScalaBFS (with 32 HBM PCs and 64 PEs) and baseline case