Skip to content

Latest commit

 

History

History

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 

AI Engine Development

See Vitis™ Development Environment on xilinx.com
See Vitis™ AI Development Environment on xilinx.com

Data Multicasting

Introduction

In a complete system design there are a lot of situations where the same data must be used by various independent kernels. These kernels are not aware of the existence of the other kernels and should operate completely independently of the other kernel operations.

There can be situations where you can have data split (multicast) but the various branches do not have the same buffer size. This case is handled automatically in case of buffer interface.

Case 1: Stream and Buffer Multicasting

When the data source is a stream and all the input interfaces of the destinations are streaming interfaces, the aiecopiler has nothing really special to do. The AXI-Stream interface interconnect is used to connect the destinations to the same source.

No image

Note: Be aware of inefficiencies and deadlocks. You may have to introduce FIFOs on the various paths to overcome stream stalls on some branches.

The situation is different when the source and destination are buffers. Memory interfaces can be accessed only by the four neighbors and cannot be extended further. That is why the aiecompiler has to use the AXI-Stream interconnect and the DMAs (MM2S -> S2MM) to multicast the source memory content to the destination memories.

No image

All the infrastructure _dma[0] ... _dma[4] has been added by the compiler to multicast the data to all destination memories.

Type make CASE=1 clean data aie aieviz and explore the graph view and the array view shown by Vitis Analyzer.

At the end of the simulation the simulator displays the throughput of the system for every port:

--------------------------------------------------------------------------------------------------
Port Name           | Type              | Average Throughput
--------------------------------------------------------------------------------------------------
Inputb              | IN                | 1250.775916 MBps  
Inputs              | IN                | 1251.141553 MBps  
Outputb_0           | OUT               | 1252.446184 MBps  
Outputs_0           | OUT               | 1252.446184 MBps  
Outputb_1           | OUT               | 1252.446184 MBps  
Outputs_1           | OUT               | 1252.446184 MBps  
Outputb_2           | OUT               | 1252.446184 MBps  
Outputs_2           | OUT               | 1252.446184 MBps  
Outputb_3           | OUT               | 1252.446184 MBps  
Outputs_3           | OUT               | 1252.446184 MBps  
--------------------------------------------------------------------------------------------------

This case being single rate, the throughput is the same for all the ports.

Case 2: Multirate Buffer Multicasting

Multirate design is a subject explored in another section of this tutorial. This subsection emphasizes the fact that in case of different buffer sizes for destinations, the aiecompiler is able to compute a number of runs per iteration for each kernel so that the data generated by the maker is consumed by all the takers.

The architecture is exactly the same but the repetition counts are different from case 1:

No image

Type make CASE=2 clean data aie aieviz and explore the graph view and the array view shown by the AMD Vitis™ Analyzer.

At the end of the simulation the simulator displays the throughput of the system for every port:

--------------------------------------------------------------------------------------------------
Port Name           | Type              | Average Throughput
--------------------------------------------------------------------------------------------------
Inputb              | IN                | 1250.344637 MBps  
Outputb_0           | OUT               | 1250.610650 MBps  
Outputb_1           | OUT               | 1250.814332 MBps  
--------------------------------------------------------------------------------------------------

Here also the system is single rate, the only difference between the cores are the input and output buffer sizes.

Support

GitHub issues will be used for tracking requests and bugs. For questions, go to support.xilinx.com.

Copyright © 2023 Advanced Micro Devices, Inc

Terms and Conditions