MBM MBA how to guide

kmabbasi edited this page Oct 2, 2020 · 4 revisions

What is Intel® RDT?

Intel® Resource Director Technology (RDT) is designed to monitor and manage shared CPU resources and to maintain the performance of applications and VMs that share them.

Intel® RDT includes monitoring and control technologies. Monitoring technologies include CMT (Cache Monitoring Technology), which monitors occupancy of last level cache, and MBM (Memory Bandwidth Monitoring). Control technologies include CAT (Cache Allocation Technology), CDP (Code Data Prioritization) and MBA (Memory Bandwidth Allocation).

What are MBA and MBM?

MBA limits the memory bandwidth available to specified cores/processes.

MBM monitors memory traffic (to and from RAM) for specified cores/processes.

Why use MBA and MBM?

MBA and MBM can be used to identify and manage applications that over-utilize memory bandwidth and thereby degrade other applications competing for this resource.

MBA was introduced in Intel® Xeon® Scalable processors, and MBM in Intel® Xeon® D-15XX processors.

To see which software release is needed for each feature, visit https://github.com/intel/intel-cmt-cat/wiki#msr-interface-feature-support and https://github.com/intel/intel-cmt-cat/wiki#os-interface-feature-support

How to monitor memory bandwidth using MBM

PQoS can monitor memory bandwidth per core or per process/task.

Memory bandwidth – local and remote

Local memory bandwidth is bandwidth sourced from local memory controllers, on the same package. Remote memory bandwidth is bandwidth sourced from remote memory controllers.

For example, in a two-socket server, each socket has its own memory controller. If a CPU in the first socket reads data through a controller located in the same socket, the traffic counts as local memory bandwidth. If it reads memory through a controller located in the second socket, the traffic counts as remote memory bandwidth. This guide uses MBL and MBR as abbreviations of local and remote memory bandwidth, respectively.

MBM - core monitoring

To monitor memory bandwidth usage for one or more specified cores, run the following command:

sudo pqos [-I] -m 'mbl:cores;mbr:cores'

For example:

sudo pqos -m 'mbl:1,3-4;mbr:1,3-4'

or

sudo pqos -I -m 'mbl:1,3-4;mbr:1,3-4'

It will print the current local and remote memory bandwidth usage.

sudo pqos -m 'mbl:1,3-4;mbr:1,3-4'
TIME 2019-05-07 08:01:16
CORE   IPC   MISSES   MBL[MB/s]   MBR[MB/s]
   1  1.17  124254k     15161.8         0.1
   3  1.54   38010k      4595.0         0.1
   4  1.41  135584k      8210.6         0.2
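
When several cores are monitored, it can be handy to total the bandwidth columns of one sample. The following is a minimal sketch that assumes the column layout shown above (CORE, IPC, MISSES, MBL, MBR); `sum_bandwidth` is a hypothetical helper name, not part of pqos.

```shell
# Sum the MBL and MBR columns of one pqos sample read from stdin.
# Assumes the column order CORE IPC MISSES MBL MBR shown in this guide.
sum_bandwidth() {
    awk '
        # Data rows start with a plain core number; TIME and header rows are skipped.
        $1 ~ /^[0-9]+$/ && NF >= 5 { mbl += $4; mbr += $5 }
        END { printf "MBL=%.1f MBR=%.1f\n", mbl, mbr }
    '
}

# Example using the sample output from this guide:
sum_bandwidth <<'EOF'
TIME 2019-05-07 08:01:16
CORE IPC MISSES MBL[MB/s] MBR[MB/s]
1 1.17 124254k 15161.8 0.1
3 1.54 38010k 4595.0 0.1
4 1.41 135584k 8210.6 0.2
EOF
# prints: MBL=27967.4 MBR=0.4
```

If a different set of events is monitored, the column positions change and the field numbers in the awk script need to be adjusted.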

MBM - process monitoring

PQoS can also monitor memory bandwidth usage per process/task. This requires the OS interface (-I). To monitor memory bandwidth usage for a process/task, run the following command:

sudo pqos -I -p 'mbl:pids;mbr:pids'

For example:

sudo pqos -I -p 'mbl:2357;mbr:2357'
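
The core and process forms use the same event specification, with the ID list repeated for `mbl` and `mbr`. A small sketch of building that string once, so the two halves cannot drift apart (`mbm_spec` is a hypothetical helper name, not part of pqos):

```shell
# Build the 'mbl:<ids>;mbr:<ids>' specification used with "pqos -m" (cores)
# and "pqos -p" (PIDs). The argument is passed through unchanged.
mbm_spec() {
    printf 'mbl:%s;mbr:%s\n' "$1" "$1"
}

mbm_spec '1,3-4'   # prints mbl:1,3-4;mbr:1,3-4
mbm_spec 2357      # prints mbl:2357;mbr:2357

# Typical use (needs pqos and root privileges, so it is left commented out):
#   sudo pqos -I -p "$(mbm_spec 2357)"
```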

How to allocate memory bandwidth using MBA

PQoS configures MBA per class of service (COS). A COS can be associated with multiple cores, whereas each core can be associated with only one COS.

Checking number of available classes of service (COS)

Before allocating memory bandwidth, it is useful to check how many classes of service are available for MBA. The PQoS utility prints this number when run with the -d flag:

sudo pqos -d

Hardware capabilities
    Monitoring
        Cache Monitoring Technology (CMT) events:
            LLC Occupancy (LLC)
        Memory Bandwidth Monitoring (MBM) events:
            Total Memory Bandwidth (TMEM)
            Local Memory Bandwidth (LMEM)
            Remote Memory Bandwidth (RMEM) (calculated)
        PMU events:
            Instructions/Clock (IPC)
            LLC misses
    Allocation
        Cache Allocation Technology (CAT)
            L3 CAT
                CDP: enabled
                Num COS: 16
        Memory Bandwidth Allocation (MBA)
            Num COS: 8

Please note that the number of classes of service available for Cache Allocation Technology (CAT) and Memory Bandwidth Allocation (MBA) might be different.
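
Because the CAT and MBA counts can differ, a script should read the value under the MBA heading rather than the first "Num COS" line it finds. A minimal sketch, assuming the `pqos -d` layout shown above (`mba_num_cos` is a hypothetical helper name):

```shell
# Print the "Num COS" value listed under the MBA heading of "pqos -d" output
# read from stdin. Assumes the layout shown in this guide.
mba_num_cos() {
    awk '
        /Memory Bandwidth Allocation \(MBA\)/ { in_mba = 1; next }
        in_mba && /Num COS:/ { print $NF; exit }
    '
}

# Example using an excerpt of the sample output above:
mba_num_cos <<'EOF'
    Allocation
        Cache Allocation Technology (CAT)
            L3 CAT
                CDP: enabled
                Num COS: 16
        Memory Bandwidth Allocation (MBA)
            Num COS: 8
EOF
# prints: 8

# On a real system (needs pqos and root privileges):
#   sudo pqos -d | mba_num_cos
```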

Logical core - allocation class of service (COS) association

Before allocating memory bandwidth to a core, it must be first associated with a COS. It can be done by running:

sudo pqos [-I] -a 'core:cos=cores'

For example:

sudo pqos -a 'core:1=3'

or

sudo pqos -I -a 'core:1=3'

This command associates COS 1 with core 3. Please note that it matters which interface is used: MSR (the default) or OS (-I). Do not mix the two interfaces, as this can corrupt the RDT configuration.

$ sudo pqos -a 'core:1=3'

NOTE: Mixed use of MSR and kernel interfaces to manage  
CAT or CMT & MBM may lead to unexpected behavior.  
Allocation configuration altered.
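
One way to avoid mixing interfaces in scripts is to decide on the flag once and reuse it for every pqos call. The sketch below rests on an assumption that is not stated in this guide: that a mounted resctrl filesystem indicates the OS interface is already in use on the machine. `pqos_interface_flag` is a hypothetical helper name.

```shell
# Print "-I" if a resctrl filesystem appears in the given mounts file,
# otherwise print nothing (MSR interface, the pqos default).
# The file argument defaults to /proc/mounts; it is a parameter only so the
# helper can be tried on sample data.
pqos_interface_flag() {
    awk '$3 == "resctrl" { found = 1 } END { if (found) print "-I" }' "${1:-/proc/mounts}"
}

# Typical use (needs pqos and root privileges, so it is left commented out):
#   IFACE=$(pqos_interface_flag)
#   sudo pqos $IFACE -a 'core:1=3'
```

Treat the result as a hint only; the choice of interface remains a decision for whoever administers the system.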

MBA - configure MBA in %

After cores have been associated with a COS, MBA can be configured for that COS by running:

sudo pqos [-I] -e 'mba:cos=mba'

For example:

sudo pqos -e 'mba:1=20'

or

sudo pqos -I -e 'mba:1=20'

The command above restricts the memory bandwidth of COS 1 to 20%. In other words, an 80% delay is applied to cores associated with COS 1.

$ sudo pqos -e 'mba:1=20'

NOTE: Mixed use of MSR and kernel interfaces to manage  
CAT or CMT & MBM may lead to unexpected behavior.  
SOCKET 0 MBA COS1 => 20% requested, 20% applied  
SOCKET 1 MBA COS1 => 20% requested, 20% applied  
Allocation configuration altered.
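
The output lists both a requested and an applied value per socket, which suggests the two can differ. A minimal sketch for flagging such differences in scripts, assuming the line format shown above (`check_mba_applied` is a hypothetical helper name):

```shell
# Read "pqos -e 'mba:...'" output on stdin and report sockets where the
# applied MBA value differs from the requested one.
# Returns non-zero if any socket differs.
check_mba_applied() {
    awk '
        / MBA COS[0-9]+ => / {
            # Fields: SOCKET <id> MBA COS<n> => <req>% requested, <app>% applied
            if ($6 != $8) {
                print "socket " $2 ": requested " $6 ", applied " $8
                bad = 1
            }
        }
        END { exit bad }
    '
}

# Example using the sample output above (prints nothing, returns 0):
check_mba_applied <<'EOF'
SOCKET 0 MBA COS1 => 20% requested, 20% applied
SOCKET 1 MBA COS1 => 20% requested, 20% applied
Allocation configuration altered.
EOF
```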

MBA - MBM - configure MBA in MB/s

Besides a percentage, MBA can also be configured as a precise value in megabytes per second (MB/s). In this mode, MBA works together with MBM to dynamically adjust memory bandwidth to the level specified in MB/s. Please note that the operating system must support MBA CTRL. To check whether your environment can use MBA CTRL, refer to https://github.com/intel/intel-cmt-cat/blob/master/README#L248. MBA CTRL works only with the OS interface.

MBA CTRL requires resctrl to be mounted with the mba_MBps option. It can also be enabled by resetting the allocation settings with the mbaCtrl-on option:

sudo pqos -I -R mbaCtrl-on

Note: This will reset all allocation configuration including COS-cores association.
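
Since enabling MBA CTRL resets the allocation configuration, a script may want to check first whether it is already active. The sketch below assumes that the option appears as `mba_MBps` in the options column of `/proc/mounts` for the resctrl filesystem; `mba_ctrl_enabled` is a hypothetical helper name.

```shell
# Succeed if a resctrl filesystem in the given mounts file lists the
# mba_MBps option. The file argument defaults to /proc/mounts.
mba_ctrl_enabled() {
    awk '
        $3 == "resctrl" && $4 ~ /(^|,)mba_MBps(,|$)/ { ok = 1 }
        END { exit !ok }
    ' "${1:-/proc/mounts}"
}

# Typical use (needs pqos and root privileges, so it is left commented out):
#   mba_ctrl_enabled || sudo pqos -I -R mbaCtrl-on
```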

MBA can be configured in MB/s by running:

sudo pqos -I -e 'mba_max:cos=mba'

For example:

sudo pqos -I -e 'mba_max:1=4000'

It will set the cap (maximum level) to 4000 MB/s of memory bandwidth for COS 1.

$ sudo pqos -I -e 'mba_max:1=4000'

NOTE: Mixed use of MSR and kernel interfaces to manage  
CAT or CMT & MBM may lead to unexpected behavior.  
SOCKET 0 MBA COS1 => 4000 MBps  
SOCKET 1 MBA COS1 => 4000 MBps  
Allocation configuration altered.

Generating memory traffic using membw

membw is a tool that simulates memory usage by performing memory-intensive operations. It is used in the examples below. To generate memory traffic, run this command:

sudo membw -c <CPU> -b <BANDWIDTH> <OPERATION>

For example:

sudo ./membw -c 3 -b 10000 --read

will generate memory traffic for core 3 of approximately 10000 MB/s using x86 loads.

$ sudo ./membw -c 3 -b 10000 --read

- THREAD logical core id: 3, memory bandwidth [MB]: 10000, starting...

Example usage of MBM and MBA

Monitoring per core - MBA in %

  1. Run a task on a specified core. As an example, you may run membw from tools/ on core 3:
sudo ./membw -c 3 -b 10000 --read
  2. Monitor MBL and MBR on core 3:
sudo pqos -m 'mbl:3;mbr:3'
  3. Associate COS 1 with core 3:
sudo pqos -a 'core:1=3'
  4. Configure MBA on COS 1:
sudo pqos -e 'mba:1=20'
  5. Monitor MBL and MBR on core 3:
sudo pqos -m 'mbl:3;mbr:3'

It is expected that pqos reports lower memory bandwidth usage than in step 2.

Note: The commands above can also be run with the -I flag to use the OS interface.
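
The walkthrough above can be parameterized by core, COS and percentage. The following sketch only prints the commands instead of running them, so it can be reviewed before anything touches the RDT configuration; `mba_core_plan` is a hypothetical helper name, and passing `-I` as the fourth argument applies the OS interface consistently to every pqos call.

```shell
# Print the commands of the per-core, MBA-in-% walkthrough without running them.
# Arguments: core, COS, MBA percentage, and optionally "-I" for the OS interface.
mba_core_plan() {
    core=$1
    cos=$2
    pct=$3
    iface=""
    if [ -n "${4:-}" ]; then
        iface=" $4"
    fi
    printf "sudo ./membw -c %s -b 10000 --read\n" "$core"
    printf "sudo pqos%s -m 'mbl:%s;mbr:%s'\n" "$iface" "$core" "$core"
    printf "sudo pqos%s -a 'core:%s=%s'\n" "$iface" "$cos" "$core"
    printf "sudo pqos%s -e 'mba:%s=%s'\n" "$iface" "$cos" "$pct"
    printf "sudo pqos%s -m 'mbl:%s;mbr:%s'\n" "$iface" "$core" "$core"
}

# Reproduces the five commands listed above:
mba_core_plan 3 1 20
```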

Monitoring per core - MBA in MB/s

  1. Reset allocation configuration and enable MBA CTRL:
sudo pqos -I -R mbaCtrl-on
  2. Run a task on a specified core. As an example, you may run membw from tools/ on core 3:
sudo ./membw -c 3 -b 10000 --read
  3. Monitor MBL and MBR on core 3:
sudo pqos -I -m 'mbl:3;mbr:3'
  4. Associate COS 1 with core 3:
sudo pqos -I -a 'core:1=3'
  5. Configure MBA using MBA CTRL on COS 1:
sudo pqos -I -e 'mba_max:1=4000'
  6. Monitor MBL and MBR on core 3:
sudo pqos -I -m 'mbl:3;mbr:3'

It is expected that pqos reports much lower memory bandwidth usage than in step 3 (approximately 4000 MB/s). The reported value can sometimes exceed the MBA setting in MB/s because the underlying mechanism must take into account the MBA granularity that the platform provides.
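
Because the measured value can overshoot the cap, an automated check should allow some tolerance rather than compare against the cap exactly. A minimal sketch; the tolerance is a value chosen by the user, not something defined by pqos, and `within_cap` is a hypothetical helper name.

```shell
# Succeed if the measured bandwidth does not exceed the cap by more than
# tol_pct percent. Arguments: measured MB/s, cap MB/s, tolerance in percent.
within_cap() {
    awk -v bw="$1" -v cap="$2" -v tol="$3" \
        'BEGIN { exit !(bw <= cap * (1 + tol / 100)) }'
}

# 4100 MB/s measured against a 4000 MB/s cap with 10% tolerance:
within_cap 4100 4000 10 && echo "within tolerance"
```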

Monitoring per process - MBA in %

  1. Run an application on a specified core. As an example, you may run membw from tools/ on core 3:
sudo ./membw -c 3 -b 10000 --read
  2. Find the PID of the running application:
pidof membw

Let’s assume the application has PID 1234.

  3. Monitor MBL and MBR for PID 1234:
sudo pqos -I -p 'mbl:1234;mbr:1234'
  4. Associate COS 1 with PID 1234:
sudo pqos -I -a 'pid:1=1234'
  5. Configure MBA on COS 1:
sudo pqos -I -e 'mba:1=20'
  6. Monitor MBL and MBR for PID 1234:
sudo pqos -I -p 'mbl:1234;mbr:1234'

It is expected that pqos reports much lower memory bandwidth usage than in step 3 (about 20% of the previously reported values).
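
To compare the two monitoring readings, the second value can be expressed as a percentage of the first. A minimal sketch with illustrative numbers (`bw_ratio` is a hypothetical helper name, and the values are not measurements):

```shell
# Print the bandwidth after throttling as a percentage of the bandwidth before.
# Arguments: MB/s before, MB/s after. Fails if the "before" value is zero.
bw_ratio() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        if (before == 0) exit 1
        printf "%.1f\n", 100 * after / before
    }'
}

bw_ratio 10000 2000   # prints 20.0
```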

Monitoring per process - MBA in MB/s

  1. Reset allocation configuration and enable MBA CTRL:
sudo pqos -I -R mbaCtrl-on
  2. Run an application on a specified core. As an example, you may run membw from tools/ on core 3:
sudo ./membw -c 3 -b 10000 --read
  3. Find the PID of the running application:
pidof membw

Let’s assume the application has PID 1234.

  4. Monitor MBL and MBR for PID 1234:
sudo pqos -I -p 'mbl:1234;mbr:1234'
  5. Associate COS 1 with PID 1234:
sudo pqos -I -a 'pid:1=1234'
  6. Configure MBA using MBA CTRL on COS 1:
sudo pqos -I -e 'mba_max:1=4000'
  7. Monitor MBL and MBR for PID 1234:
sudo pqos -I -p 'mbl:1234;mbr:1234'

It is expected that pqos reports much lower memory bandwidth usage than in step 4 (approximately 4000 MB/s). The reported value can sometimes exceed the MBA setting in MB/s because the underlying mechanism must take into account the MBA granularity that the platform provides.

Configuring MBA using rdtset

MBA can also be configured using the rdtset utility instead of pqos. rdtset is a taskset-like application that provides an easy-to-use interface for running applications and configuring RDT for them. Once the application is no longer running, rdtset automatically reverts the RDT configuration.

MBA configuration via pqos can be replaced with rdtset. The equivalent commands are listed below.

MBA in %, per core

sudo rdtset -t 'mba=20;cpu=3' -c 3 ./membw -c 3 -b 10000 --read

replaces

sudo pqos -a 'core:<COS>=3'
sudo pqos -e 'mba:<COS>=20'
sudo ./membw -c 3 -b 10000 --read

MBA in %, per process/task

sudo rdtset -t 'mba=20' -I -p 1234

replaces

sudo pqos -I -a 'pid:<COS>=1234'
sudo pqos -I -e 'mba:<COS>=20'
sudo ./membw -c 3 -b 10000 --read # assuming this process has PID 1234

MBA in MB/s, per core

sudo rdtset -t 'mba_max=4000;cpu=3' -c 3 ./membw -c 3 -b 10000 --read

replaces

sudo pqos -I -a 'core:<COS>=3'
sudo pqos -I -e 'mba_max:<COS>=4000'
sudo ./membw -c 3 -b 10000 --read

MBA in MB/s, per process/task

sudo rdtset -t 'mba_max=4000' -I -p 1234

replaces

sudo pqos -I -a 'pid:<COS>=1234'
sudo pqos -I -e 'mba_max:<COS>=4000'
sudo ./membw -c 3 -b 10000 --read # assuming this process has PID 1234
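
The per-core rdtset forms above repeat the core number in both the `-t` specification and the `-c` option. A small sketch that builds the command line from one set of arguments and prints it for review; `rdtset_core_cmd` is a hypothetical helper name, and it mirrors only the per-core forms shown in this guide.

```shell
# Print the rdtset command for running a program on one core with an MBA limit.
# Arguments: key (mba or mba_max), value, core, then the program and its arguments.
rdtset_core_cmd() {
    key=$1
    value=$2
    core=$3
    shift 3
    printf "sudo rdtset -t '%s=%s;cpu=%s' -c %s %s\n" \
        "$key" "$value" "$core" "$core" "$*"
}

rdtset_core_cmd mba 20 3 ./membw -c 3 -b 10000 --read
# prints: sudo rdtset -t 'mba=20;cpu=3' -c 3 ./membw -c 3 -b 10000 --read
```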