Configuring the System Architecture
In SDAccel Compilation Flow and Execution Model, you learned of the two distinct phases in the SDAccel™ environment kernel build process:
- Compilation stage: The compilation process is controlled by the xocc -c option. At the end of the compilation stage, one or more kernel functions are compiled into separate .xo files. At this stage, the xocc compiler extracts the hardware intent from the C/C++ code and associated pragmas. Refer to the SDx Command and Utility Reference Guide for more information on the xocc compiler.
- Linking stage: The linking stage is controlled by the xocc -l option. During the linking process, all the .xo files are integrated into the FPGA hardware.
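The two stages can be sketched as a small dry-run script that only prints the commands instead of invoking the compiler. The platform name, kernel name, and file names are hypothetical placeholders, and the -t, --platform, -k, and -o options are assumed to match your installed xocc version; check them against the SDx Command and Utility Reference Guide.

```shell
#!/bin/sh
# Dry-run sketch: prints the xocc command for each stage, does not run xocc.
# PLATFORM, KERNEL, and all file names are hypothetical placeholders.
PLATFORM="my_platform"
KERNEL="foo"

# Compilation stage: one kernel function is compiled into one .xo file.
compile_cmd() {
    echo "xocc -c -t hw --platform $PLATFORM -k $1 -o $1.xo $1.cpp"
}

# Linking stage: all .xo files passed as arguments are integrated together.
link_cmd() {
    echo "xocc -l -t hw --platform $PLATFORM -o design.xclbin $*"
}

compile_cmd "$KERNEL"
link_cmd "$KERNEL.xo"
```

Printing the commands first makes it easy to review the options before starting a long hardware build.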
If needed, the kernel linking process can be customized to improve the SDAccel environment runtime performance. This chapter introduces a few such techniques.
Multiple Instances of a Kernel
By default, a single hardware instance is implemented from a kernel. If the host intends to execute the same kernel multiple times, then multiple such kernel executions take place on the same hardware instance sequentially. However, you can customize the kernel compilation (linking stage) to create multiple hardware instances from a single kernel. This can improve execution performance as the multiple kernel calls can now run concurrently, overlapping their execution while running on separate hardware instances.
Multiple instances of the kernel can be created by using the xocc --nk switch during linking.
For example, for a kernel named foo, two hardware instances can be implemented as follows:
# xocc --nk <kernel_name>:<number_of_instances>
xocc --nk foo:2
By default, the implemented instance names are foo_1 and foo_2. However, you can optionally change the default instance names as shown below:
# xocc --nk <kernel_name>:<number_of_instances>:<name_1>.<name_2>…<name_N>
xocc --nk foo:3:fooA.fooB.fooC
This example implements three identical copies, or hardware instances, of kernel foo, named fooA, fooB, and fooC, on the FPGA programmable logic.
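When a build script creates these arguments, a small helper can check that the number of custom names matches the instance count. The following is a minimal sketch; nk_arg is a hypothetical helper and not part of xocc.

```shell
#!/bin/sh
# nk_arg (hypothetical helper): builds a --nk argument from a kernel name,
# an instance count, and an optional list of custom instance names.
nk_arg() {
    kernel="$1"
    count="$2"
    shift 2
    # No custom names given: keep the default instance names.
    if [ "$#" -eq 0 ]; then
        echo "--nk ${kernel}:${count}"
        return 0
    fi
    # The number of custom names must match the instance count.
    if [ "$#" -ne "$count" ]; then
        echo "nk_arg: expected $count instance names, got $#" >&2
        return 1
    fi
    # Join the names with '.', as in foo:3:fooA.fooB.fooC
    names=$(IFS=.; echo "$*")
    echo "--nk ${kernel}:${count}:${names}"
}

nk_arg foo 2                  # prints: --nk foo:2
nk_arg foo 3 fooA fooB fooC   # prints: --nk foo:3:fooA.fooB.fooC
```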
Customization of DDR Bank to Kernel Connection
By default, all the memory interfaces from all the kernels are connected to a single global memory bank. As a result, only one memory interface at a time can transfer data to and from the memory bank, limiting the performance of the kernel. If the FPGA contains only one DDR (or global) memory bank, this is the only option.
However, some FPGA devices contain multiple DDR memory banks. You can customize the connections among the kernel memory interfaces and the DDR memory bank of such a device by altering the default connection.
The above approach can even improve the performance of a single kernel.
Consider the following example:
void cnn( int *image,   // Read-Only Image
          int *weights, // Read-Only Weight Matrix
          int *out,     // Output Filters/Images
          ...           // Other input or Output ports
#pragma HLS INTERFACE m_axi port=image offset=slave bundle=gmem
#pragma HLS INTERFACE m_axi port=weights offset=slave bundle=gmem
#pragma HLS INTERFACE m_axi port=out offset=slave bundle=gmem
The example shows two memory interface inputs for the kernel: image and weights. If both are connected to the same DDR bank, a concurrent transfer of both of these inputs into the kernel is not possible.
The following steps are needed to implement separate DDR bank connections for the image and weights inputs:
- Specify separate bundle names for these inputs. This is discussed in Memory Data Inputs and Outputs. However, for reference, the code is shown here again.
void cnn( int *image,   // Read-Only Image
          int *weights, // Read-Only Weight Matrix
          int *out,     // Output Filters/Images
          ...           // Other input or Output ports
#pragma HLS INTERFACE m_axi port=image offset=slave bundle=gmem
#pragma HLS INTERFACE m_axi port=weights offset=slave bundle=gmem1
#pragma HLS INTERFACE m_axi port=out offset=slave bundle=gmem
IMPORTANT: When specifying a bundle= name, you should use all lowercase characters to be able to assign it to a specific memory bank using the --sp option.
The memory interface inputs image and weights are assigned different bundle names in the example above.
- Alter the XOCC link process to create custom DDR bank connections. This is done using the --sp switch:
--sp <kernel_instance_name>.<interface_name>:<bank_name>
Where:
  - <kernel_instance_name> is the instance name of the kernel as specified by the --nk option, described in Multiple Instances of a Kernel.
  - <interface_name> is the name of the interface bundle defined by the HLS INTERFACE pragma, including m_axi_ as a prefix, and the bundle= name when specified.
    TIP: If the port is not specified as part of a bundle, then the <interface_name> is the specified port= name, without the m_axi_ prefix.
  - <bank_name> is denoted as bank0, bank1, etc. For a device with four DDR banks, the bank names are bank0, bank1, bank2, and bank3.
For the above example, considering a single instance of the cnn kernel, the --sp switch can be specified as follows:
--sp cnn_1.m_axi_gmem:bank0 \
--sp cnn_1.m_axi_gmem1:bank1
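The naming rules for the --sp argument can be captured in a short helper that adds the m_axi_ prefix and rejects bundle names containing uppercase characters. This is a sketch only; sp_arg is a hypothetical helper and not part of xocc.

```shell
#!/bin/sh
# sp_arg (hypothetical helper): builds a --sp argument from a kernel instance
# name, a bundle= name, and a DDR bank index.
sp_arg() {
    instance="$1"
    bundle="$2"
    bank="$3"
    # Bundle names should be all lowercase to be assignable with --sp.
    lower=$(printf '%s' "$bundle" | tr '[:upper:]' '[:lower:]')
    if [ "$bundle" != "$lower" ]; then
        echo "sp_arg: bundle name '$bundle' must be all lowercase" >&2
        return 1
    fi
    # The interface name is the bundle name prefixed with m_axi_.
    echo "--sp ${instance}.m_axi_${bundle}:bank${bank}"
}

sp_arg cnn_1 gmem 0    # prints: --sp cnn_1.m_axi_gmem:bank0
sp_arg cnn_1 gmem1 1   # prints: --sp cnn_1.m_axi_gmem1:bank1
```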
The customized bank connection needs to be reflected in the host code as well. This was previously discussed in Specifying Exact Memory from the Host Code.
If the --nk and --sp switches are used together for a kernel, each hardware instance should have identical memory connectivity. If not, you should use the OpenCL™ sub-device to allocate each kernel instance separately in the host code.
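One way to keep the connectivity identical is to generate the --sp arguments in a loop, so every instance receives the same bundle-to-bank mapping. The sketch below assumes two instances of the cnn kernel with default instance names of the form cnn_1, cnn_2, and the gmem and gmem1 bundles from the earlier example; link_opts is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical sketch: apply the same bank mapping to every kernel instance.
# Default instance names are assumed to be <kernel>_1, <kernel>_2, and so on.
KERNEL="cnn"
INSTANCES=2

link_opts() {
    opts="--nk ${KERNEL}:${INSTANCES}"
    i=1
    while [ "$i" -le "$INSTANCES" ]; do
        # Same mapping for each instance: gmem to bank0, gmem1 to bank1.
        opts="$opts --sp ${KERNEL}_${i}.m_axi_gmem:bank0"
        opts="$opts --sp ${KERNEL}_${i}.m_axi_gmem1:bank1"
        i=$((i + 1))
    done
    echo "$opts"
}

# Print the options that would be appended to the xocc -l command line.
link_opts
```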
Summary
- Consider creating multiple instances of a kernel on the fabric of the FPGA by specifying the xocc --nk switch if the kernel is called multiple times from the host code.
- Consider using the xocc --sp switch to customize the DDR bank connection to kernel memory interfaces to achieve concurrent access.