Lab 3: Optimize the Application Code
This tutorial demonstrates how you can modify your code to optimize the hardware-software system generated by the SDx environment. You will also learn how to find more information about build errors so that you can correct your code.
Introduction to System Ports and DMA
In Zynq®-7000 All Programmable SoC device systems, the memory seen by the ARM A9 processors has two levels of on-chip cache followed by a large off-chip DDR memory. From the programmable logic side, the SDx IDE creates a hardware design that might contain a Direct Memory Access (DMA) block to allow a hardware function to directly read and/or write to the processor system memory via the system interface ports.
As shown in the simplified diagram below, the processing system (PS) block in Zynq devices has three kinds of system ports that transfer data between processor memory and the Zynq device programmable logic (PL). The Accelerator Coherency Port (ACP) allows the hardware to directly access the L2 cache of the processor in a coherent fashion. The High Performance ports 0-3 (HP0-3) provide direct buffered access from the hardware to the DDR memory or the on-chip memory, bypassing the processor cache, using the Asynchronous FIFO Interface (AFI). The General-Purpose IO ports (GP0/GP1) allow the processor to read and write hardware registers.
When the software running on the ARM A9 processor “calls” a hardware function, it actually invokes a stub function generated by sds++. The stub in turn calls underlying drivers to send data from the processor memory to the hardware function and to return data from the hardware function to the processor memory over the three types of system ports shown: GPx, ACP, and AFI.
The table below shows the different system ports and their properties. The sds++ compiler automatically chooses the best possible system port to use for any data transfer, but allows you to override this selection by using pragmas.
System Port | Properties |
---|---|
ACP | Hardware functions have cache coherent access to DDR via the PS L2 cache. |
AFI (HP) | Hardware functions have fast non-cache coherent access to DDR via the PS memory controller. |
GP | Processor directly writes/reads data to/from hardware function. Inefficient for large data transfers. |
MIG | Hardware functions access DDR from PL via a MIG IP memory controller. |
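As a rough illustration of overriding the compiler's port selection, the sketch below places `sys_port` pragmas immediately before a function declaration. The function `vadd` and the array length `N` are hypothetical and chosen only for this example; the port assignments are arbitrary. A host compiler does not recognize the pragmas and ignores them, so the function can still be built and run as plain software.

```cpp
#include <cstddef>

// Hypothetical array length, chosen only for illustration.
constexpr std::size_t N = 1024;

// Ask sds++ to route `a` through the cache-coherent ACP port and to route
// `b` and `out` through the non-coherent AFI (HP) ports, instead of letting
// the compiler choose. The pragmas apply to the declaration that follows.
#pragma SDS data sys_port(a:ACP)
#pragma SDS data sys_port(b:AFI)
#pragma SDS data sys_port(out:AFI)
void vadd(const int a[N], const int b[N], int out[N]);

// Hypothetical function that would be selected for hardware in the project.
void vadd(const int a[N], const int b[N], int out[N]) {
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Whether ACP or AFI is the better choice depends on where the data lives: ACP tends to suit buffers the processor has recently touched, while AFI avoids the cache for large buffers that the processor does not need to see coherently.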
Learning Objectives
- Use pragmas to select ACP or AFI ports for data transfer
- Observe the error detection and reporting capabilities of the SDSoC environment.
- Use pragmas to select different data movers for your hardware function arguments
- Understand the use of sds_alloc()
- Use pragmas to control the number of data elements that are transferred to/from the hardware function.
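To give a feel for the constructs named in these objectives, here is a minimal sketch, not taken from the lab sources. The function `scale`, the helper `run_scale`, and their parameters are hypothetical. `sds_alloc()` and `sds_free()` come from `sds_lib.h`; the sketch assumes the `__SDSCC__` macro is defined by the SDSoC compilers and falls back to the ordinary heap otherwise, so that it can also be tried on a host machine.

```cpp
#include <cstddef>
#include <cstdlib>

#ifdef __SDSCC__
#include "sds_lib.h"  // provides sds_alloc() and sds_free()
#else
// Host-only fallback (an assumption for illustration): use the ordinary heap.
static void *sds_alloc(std::size_t bytes) { return std::malloc(bytes); }
static void sds_free(void *p) { std::free(p); }
#endif

// Transfer only `len` elements of each array, and request the simple AXI DMA
// data mover, which expects physically contiguous buffers (hence sds_alloc).
#pragma SDS data copy(in[0:len], out[0:len])
#pragma SDS data data_mover(in:AXIDMA_SIMPLE, out:AXIDMA_SIMPLE)
#pragma SDS data access_pattern(in:SEQUENTIAL, out:SEQUENTIAL)
void scale(const int *in, int *out, int len, int factor);

// Hypothetical hardware function: multiply every element by `factor`.
void scale(const int *in, int *out, int len, int factor) {
    for (int i = 0; i < len; ++i) {
        out[i] = in[i] * factor;
    }
}

// Hypothetical helper: allocate buffers, call scale(), and return the sum of
// the results, or -1 if an allocation fails.
long run_scale(int len, int factor) {
    int *in = static_cast<int *>(sds_alloc(len * sizeof(int)));
    int *out = static_cast<int *>(sds_alloc(len * sizeof(int)));
    long sum = -1;
    if (in != nullptr && out != nullptr) {
        for (int i = 0; i < len; ++i) {
            in[i] = i;
        }
        scale(in, out, len, factor);
        sum = 0;
        for (int i = 0; i < len; ++i) {
            sum += out[i];
        }
    }
    if (in != nullptr) sds_free(in);
    if (out != nullptr) sds_free(out);
    return sum;
}
```

The `copy` pragma is what ties the transfer size to the runtime value `len`; without it, pointer arguments give the compiler no indication of how many elements to move.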