See Vitis™ Development Environment on xilinx.com See Vitis™ AI Development Environment on xilinx.com |
In a complete system design there are a lot of situations where the same data must be used by various independent kernels. These kernels are not aware of the existence of the other kernels and should operate completely independently of the other kernel operations.
There can be situations where you can have data split (multicast) but the various branches do not have the same buffer size. This case is handled automatically in case of buffer interface.
When the data source is a stream and all the input interfaces of the destinations are streaming interfaces, the aiecopiler
has nothing really special to do. The AXI-Stream interface interconnect is used to connect the destinations to the same source.
Note: Be aware of inefficiencies and deadlocks. You may have to introduce FIFOs on the various paths to overcome stream stalls on some branches.
The situation is different when the source and destination are buffers. Memory interfaces can be accessed only by the four neighbors and cannot be extended further. That is why the aiecompiler
has to use the AXI-Stream interconnect and the DMAs (MM2S -> S2MM) to multicast the source memory content to the destination memories.
All the infrastructure _dma[0] ... _dma[4]
has been added by the compiler to multicast the data to all destination memories.
Type make CASE=1 clean data aie aieviz
and explore the graph view and the array view shown by Vitis Analyzer.
At the end of the simulation the simulator displays the throughput of the system for every port:
--------------------------------------------------------------------------------------------------
Port Name | Type | Average Throughput
--------------------------------------------------------------------------------------------------
Inputb | IN | 1250.775916 MBps
Inputs | IN | 1251.141553 MBps
Outputb_0 | OUT | 1252.446184 MBps
Outputs_0 | OUT | 1252.446184 MBps
Outputb_1 | OUT | 1252.446184 MBps
Outputs_1 | OUT | 1252.446184 MBps
Outputb_2 | OUT | 1252.446184 MBps
Outputs_2 | OUT | 1252.446184 MBps
Outputb_3 | OUT | 1252.446184 MBps
Outputs_3 | OUT | 1252.446184 MBps
--------------------------------------------------------------------------------------------------
This case being single rate, the throughput is the same for all the ports.
Multirate design is a subject explored in another section of this tutorial. This subsection emphasizes the fact that in case of different buffer sizes for destinations, the aiecompiler
is able to compute a number of runs per iteration for each kernel so that the data generated by the maker
is consumed by all the takers
.
The architecture is exactly the same but the repetition counts are different from case 1:
Type make CASE=2 clean data aie aieviz
and explore the graph view and the array view shown by the AMD Vitis™ Analyzer.
At the end of the simulation the simulator displays the throughput of the system for every port:
--------------------------------------------------------------------------------------------------
Port Name | Type | Average Throughput
--------------------------------------------------------------------------------------------------
Inputb | IN | 1250.344637 MBps
Outputb_0 | OUT | 1250.610650 MBps
Outputb_1 | OUT | 1250.814332 MBps
--------------------------------------------------------------------------------------------------
Here also the system is single rate, the only difference between the cores are the input and output buffer sizes.
GitHub issues will be used for tracking requests and bugs. For questions, go to support.xilinx.com.
Copyright © 2023 Advanced Micro Devices, Inc