SDAccel Debug Features

This chapter examines the features of the SDAccel™ environment that support debugging and introduces the tools available to analyze a project. The next chapter illustrates debug techniques using the features described here.

Defensive Programming

The SDAccel environment is capable of creating very efficient implementations. In some cases, however, implementation issues can occur. One such case is if a write request is emitted before there is enough data available in the process to complete the write transaction. This can cause deadlock conditions when multiple concurrent kernels are affected by this issue and the write request of a kernel depends on the input read being completed.

To avoid such situations, a conservative mode is available on the adapter. In principle, it delays the write request until it has all of the data necessary to complete the write. This mode is enabled during compilation by applying the following --xp option to the xocc compiler:
--xp param:compiler.axiDeadLockFree=yes

Because enabling this mode can impact performance, you might prefer to use this as a defensive programming technique where this option is inserted during development and testing and then removed during optimization. You might also want to add this option when the accelerator hangs repeatedly.
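
One way to treat the option as a development-time default is to gate it behind a switch in the build script. The fragment below is a minimal sketch under stated assumptions: the helper name deadlock_flags and the variable DEADLOCK_FREE are made up for illustration, and the script only prints the resulting command line instead of invoking xocc.

```shell
# deadlock_flags is a hypothetical helper: it prints the --xp option from the
# text above when its argument is "yes" and prints nothing otherwise.
deadlock_flags() {
    if [ "$1" = "yes" ]; then
        printf '%s' "--xp param:compiler.axiDeadLockFree=yes"
    fi
}

# DEADLOCK_FREE is an assumed variable name. It defaults to "yes" so that
# development and test builds use the conservative adapter mode.
DEADLOCK_FREE="${DEADLOCK_FREE:-yes}"

# Print the command instead of running it; the remaining xocc arguments are
# project specific and intentionally left out.
echo "xocc $(deadlock_flags "$DEADLOCK_FREE")"
```

Running the script with DEADLOCK_FREE=no drops the option again, which matches the suggestion to remove it during optimization.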

SDAccel Software Debug

The SDAccel environment supports typical software-like debugging for the host as well as kernel code. This flow is supported during software and hardware emulation and allows the use of breakpoints and the analysis of variables as commonly done during software debugging.

Note: The host code can still be debugged in this mode even when the actual hardware is executed.

IDE Debug Flow

The SDAccel integrated design environment (IDE) flow provides easy access to the debug capabilities. Setting up an executable for debugging requires many steps when performed manually. These steps are handled by the IDE when you use the IDE debug flow.

Note: The SDAccel debug flow relies on shell scripts during debugging. This requires that setup files such as .bashrc or .cshrc do not interfere with the SDAccel setup, such as the LD_LIBRARY_PATH setting.

Preparing the executable for debugging requires that you change the build configurations to enable the application of debug flags. You can set these options through the Project Settings in the SDx™ environment. There are two checkboxes provided in the Options section for the Active build configuration. One enables host debug builds while the other enables debugging of the kernels. The checkboxes are named Host debug and Kernel debug respectively.

Figure: Software Project Settings Options

A more intuitive way to set these build options is through the context menu settings. To do this, right-click on the build configuration in the Assistant view and select Settings. Alternatively, you can double-click on the build configuration. The same two checkboxes are presented. While you can enable host debug on all targets, kernel debug is only supported for software emulation and hardware emulation build targets. This completes the setup; cleaning the build directory and rebuilding the application ensures that the project is ready for running in the GDB debug environment.

Running a GDB session from the IDE takes care of all the setup required. It automatically manages the environment setup for hardware or software emulation. It configures the SDAccel runtime to ensure debug support by the runtime environment, and manages the different consoles required for the execution of the kernel model, the host model, and the debug server.

As a result, when initiating the debug session, the SDAccel environment asks to switch into the debug perspective, which presents several windows to manage the different debug consoles and source code windows.

Figure: GDB Console

After starting the application, by default the application is stopped right at the beginning of the main function body in the host code. As with any GDB graphical front end, you can now set breakpoints and inspect variables in the host code. The SDAccel environment enables the same capabilities for the accelerated kernel implementation in a transparent way.

Note: In hardware emulation, because the C/C++/OpenCL™ kernel code is translated for efficient implementation, breakpoints cannot be placed on all statements. Mostly, untouched loops and functions are available for breakpoints, and similarly only preserved variables can be accessed.

Command Line Debug Flow

The command line debug flow in the SDAccel environment provides tools to debug the host and kernel application running in all modes: software emulation, hardware emulation, or hardware execution.

Note: In hardware execution mode, only the host code can be debugged using this feature.

There are four steps to debugging in the SDAccel environment using the command line flow:

  1. General environment setup.
  2. Prepare the host code for debug.
  3. Prepare the kernel code for debug.
  4. Launch GDB Standalone to debug.

IMPORTANT: The SDAccel environment supports host program debugging in all modes, but kernel debugging with gdb is only supported in the emulation flows. In addition, more hardware-centric debugging support, such as waveform analysis, is provided for the kernels.

General Environment Setup

Running software or hardware emulation first requires setting up the tools and then selecting the emulation mode.
  1. To set up the tool environment and run the SDx tool, source the file below so that SDx command settings are in the PATH:
    • C Shell: source /settings64.csh
    • Bash: source /settings64.sh
  2. To set up the runtime environment responsible for the interaction between the software and hardware implementation, source the file below:
    • C Shell: source /opt/xilinx/xrt/setup.csh
    • Bash: source /opt/xilinx/xrt/setup.sh

Table 1. Select Emulation Mode

Environment Variable    Value
XCL_EMULATION_MODE      sw_emu or hw_emu

These environment settings are used by the runtime library to correctly execute the desired emulation. This is required in addition to building the executable for the specific emulation flow.
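
Because a mistyped mode name is easy to miss, the variable can be set through a small guard. This is a minimal sketch for a Bourne-style shell; set_emulation_mode is a hypothetical helper name, and it accepts only the two values listed in Table 1.

```shell
# set_emulation_mode is a hypothetical helper for this sketch. It accepts
# only the two values listed in Table 1 and exports XCL_EMULATION_MODE.
set_emulation_mode() {
    case "$1" in
        sw_emu|hw_emu)
            export XCL_EMULATION_MODE="$1"
            ;;
        *)
            # Reject anything else and leave the variable untouched.
            echo "unsupported emulation mode: $1" >&2
            return 1
            ;;
    esac
}

set_emulation_mode sw_emu && echo "XCL_EMULATION_MODE=$XCL_EMULATION_MODE"
```

The function has to be defined in, or sourced into, the shell that later launches the host executable, since an exported variable is only visible to that shell and its child processes.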

Preparing the Host Code

The host program needs to be compiled with debugging information generated in the executable by adding the -g option to the xcpp command line, as follows:

xcpp -g ...

TIP: Because xcpp is simply a wrapper around the system compiler (gcc), the -g option enables the compiler to generate debug information.

Preparing the Kernel

Kernel code can be debugged together with the host program in either software emulation or hardware emulation. Debugging information needs to be generated first in the binary container by passing the -g option to the xocc command line executable:

xocc -g -t [sw_emu | hw_emu | hw] ...

The -t (or --target) option is used to specify the compilation target as either software emulation (sw_emu), hardware emulation (hw_emu), or hardware execution (hw).

In the software emulation flow, additional runtime checks can be performed for OpenCL-based kernels. The runtime checks include:
  • Checking out-of-bound accesses made by kernel interface buffers (option: address)
  • Checking accesses to uninitialized memory that are initiated by the kernel and local to the kernel (option: memory)

The options are enabled through the --xp option and the param:compiler.fsanitize directive, and need to be enabled during the link stage (-l) as shown in the following examples:

xocc -l -t sw_emu --xp param:compiler.fsanitize=address -o bin_kernel.xclbin
xocc -l -t sw_emu --xp param:compiler.fsanitize=memory -o bin_kernel.xclbin
xocc -l -t sw_emu --xp param:compiler.fsanitize=address,memory -o bin_kernel.xclbin

When applied, the emulation run produces a debug log with emulation diagnostic messages, such as /Emulation-SW/-Default>/emulation_debug.log.

Launching GDB Host Code Debug

You can launch GDB standalone to debug the host program if the code is built with debug information (built with the -g flag). This flow should also work while using a graphical front-end for GDB, such as the Data Display Debugger (DDD) available from GNU. The following steps are the instructions for launching GDB.

  1. To set up the environment to run the SDx tool, source the file below so that SDx command settings are in the PATH:
    • C Shell: source /settings64.csh
    • Bash: source /settings64.sh
  2. To set up the runtime environment responsible for the interaction between the software and hardware implementation, source the file below:
    • C Shell: source /opt/xilinx/xrt/setup.csh
    • Bash: source /opt/xilinx/xrt/setup.sh
  3. Ensure that the environment variable XCL_EMULATION_MODE is set to the correct mode.
  4. The application debug feature must be enabled at runtime using an attribute in the sdaccel.ini file. Create an sdaccel.ini file in the same directory as your host executable, and include the following lines:
    [Debug]
    app_debug=true
    This informs the runtime library that the kernel is debug enabled.
  5. Start gdb through the Xilinx® wrapper:
    xgdb --args host.exe test.xclbin
    The xgdb wrapper performs the following setup steps under the hood:
    • Launches GDB on the host program:
      gdb --args host.exe test.xclbin
    • Sets up the environment variables PYTHONHOME and PYTHONPATH to point to the Python installation. Currently, the gdb in the SDx environment expects Python 2.6 or Python 2.7. For example, if the Python available on the machine is Python 2.6, set the environment as shown (Bash shell shown):
      export PYTHONHOME=/usr
      export PYTHONPATH=/usr/lib64/python2.6/:/usr/lib64/python2.6/lib-dynload/
    • Sources the Python script in the GDB console to enable the Xilinx GDB extensions:
      gdb> source ${XILINX_SDX}/scripts/appdebug.py
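
Step 4 can be scripted so that the sdaccel.ini file is always created next to the host executable. The following is a minimal sketch: write_debug_ini and RUN_DIR are hypothetical names, and a scratch directory stands in for the directory that holds the host executable.

```shell
# write_debug_ini is a hypothetical helper: it creates sdaccel.ini in the
# directory given as its first argument, with the [Debug] section from step 4.
write_debug_ini() {
    cat > "$1/sdaccel.ini" <<'EOF'
[Debug]
app_debug=true
EOF
}

# RUN_DIR is an assumed name; a scratch directory is used here in place of
# the directory that contains the host executable.
RUN_DIR="$(mktemp -d)"
write_debug_ini "$RUN_DIR"
cat "$RUN_DIR/sdaccel.ini"
```

In a real project, the argument would be the directory of host.exe, so that the runtime library finds the file when the executable starts.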

Launching Host and Kernel Debug

In software emulation, to better mimic the hardware being emulated, kernels are spawned off as separate processes. If you are using GDB to debug the host code, breakpoints set on kernel lines are not hit because the kernel code is not run within that process. To support the concurrent debugging of the host code and the kernel code, the SDAccel environment provides a mechanism to attach to spawned kernels through the use of sdx_server.
  1. You must start three different terminals in the command line flow. In the first terminal, start the sdx_server using the following command:
    ${XILINX_VIVADO}/bin/sdx_server --sdx-url
  2. In a second terminal, run the host code in xgdb as described in Launching GDB Host Code Debug.

    At this point, the first terminal running the sdx_server should provide a GDB listener port NUM on standard out. Keep track of the number returned by the sdx_server, because the GDB listener port is used by GDB to debug the kernel process. When the GDB listener port is printed, the spawned kernel process has attached to the sdx_server and is waiting for commands from you. To control this process, you must start a new instance of GDB and connect to the sdx_server.

    IMPORTANT: If the sdx_server is running, then all spawned processes compiled for debug connect and wait for control from you. If no GDB ever attaches or provides commands, the kernel code appears to hang.
  3. In a third terminal, run the xgdb command, and at the GDB prompt, run the following commands:
    • For software emulation:
      file ${XILINX_SDX}/data/emulation/unified/cpu_em/generic_pcie/model/genericpciemodel
    • For hardware emulation:
      1. Locate the sdx_server temporary directory: /tmp/sdx/$uid.
      2. Find the sdx_server process ID (PID) containing the DWARF file of this debug session.
      3. At the gdb command line, run: file /tmp/sdx/$uid/$pid/NUM.DWARF.
    • In either case, connect to the kernel process:
      target remote :NUM

      Where NUM is the number returned by the sdx_server as the GDB listener port.

    TIP: When debugging software/hardware emulation kernels in the SDAccel IDE, these steps are handled automatically and the kernel process is automatically attached, providing multiple contexts to debug both the host code and kernel code simultaneously.

After these commands are executed, you can set breakpoints on your kernels as needed, run the continue command, and debug your kernel code. When all kernel invocations have finished, the host code continues, and the sdx_server connection drops.
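
The two GDB commands typed in the third terminal can be collected in a command file to avoid retyping them for every session. This is a minimal sketch under stated assumptions: write_kernel_gdb_cmds and CMD_FILE are hypothetical names, and /path/to/model and 12345 are placeholders for the file to load and the listener port reported by sdx_server.

```shell
# write_kernel_gdb_cmds is a hypothetical helper. Arguments:
#   $1  file to load with the gdb "file" command (the emulation model in
#       software emulation, or the NUM.DWARF file in hardware emulation)
#   $2  GDB listener port printed by sdx_server
#   $3  output file that receives the generated gdb commands
write_kernel_gdb_cmds() {
    {
        echo "file $1"
        echo "target remote :$2"
    } > "$3"
}

# Placeholder values only; substitute the real file and port number.
CMD_FILE="$(mktemp)"
write_kernel_gdb_cmds /path/to/model 12345 "$CMD_FILE"
cat "$CMD_FILE"
```

The generated file can then be read at the GDB prompt with the standard source command, after which breakpoints are set as usual.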

For both software and hardware emulation flows, there are restrictions with respect to the accelerated kernel code debug interactions. Because this code is preprocessed in the software emulation flow, and then translated in the hardware emulation flow into a hardware description language (HDL) and simulated during debugging, it is not always possible to set breakpoints at all locations. Especially with hardware emulation, only a limited number of breakpoints such as on preserved loops and functions are supported. Nevertheless, this mode is useful for debugging the kernel/host interface.

Utilities for Hardware Debugging

In some cases, the normal SDAccel IDE and command line debug features are limited in their ability to isolate an issue. This is especially true when the software or hardware appears not to make any progress (hangs). These kinds of system issues are best analyzed with the help of the utilities mentioned in this section.

Using the Linux dmesg Utility

Well-designed Linux kernels and modules report issues through the kernel ring buffer. This is also true for SDAccel environment modules that allow you to debug the interaction with the accelerator board on the lowest Linux level.

Note: This utility is intended for use in hardware debug only.

TIP: In most cases, it is sufficient to work with the less verbose xbutil feature to localize a problem. Refer to the SDx Command and Utility Reference Guide for more information on the xbutil command.

The dmesg utility is a Linux tool that lets you read the kernel ring buffer. The kernel ring buffer holds kernel information messages in a circular buffer. A circular buffer of fixed size is used to limit the resource requirements by overwriting the oldest entry with the next incoming message.

In the SDAccel tool, the xocl and xclmgmt driver modules write informational messages to the ring buffer. Thus, for an application hang or crash, or for that matter any unexpected behavior (like being unable to program the bitstream, and so on), the dmesg tool should be used to check the ring buffer.

The following image shows the layers of the software platform associated with the SDAccel board platform.

Figure: Software Platform Layers

To review messages from the Linux tool, you should first clear the ring buffer:
sudo dmesg -c
This flushes all messages from the ring buffer and makes it easier to spot messages from the xocl and xclmgmt modules. After that, start your application and run dmesg in another terminal:
sudo dmesg
The dmesg utility prints a record such as the following module reports:

In the example shown above, the AXI Firewall 2 has tripped, which is better examined using the xbutil utility.
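
On a busy system the ring buffer also contains messages from unrelated drivers, so it can help to filter the output by module name. The following is a minimal sketch: filter_sdaccel_msgs is a hypothetical helper name, and the demonstration input consists of made-up placeholder lines, not real driver output.

```shell
# filter_sdaccel_msgs is a hypothetical helper: it reads kernel log text on
# standard input and keeps only lines that mention xocl or xclmgmt.
filter_sdaccel_msgs() {
    grep -E 'xocl|xclmgmt'
}

# Typical use on a real system (may require sudo, as in the commands above):
#   dmesg | filter_sdaccel_msgs
# Self-contained demonstration with made-up placeholder lines:
printf '%s\n' \
    'placeholder: unrelated driver message' \
    'placeholder: xocl message' \
    'placeholder: xclmgmt message' | filter_sdaccel_msgs
```

Matching on the module names alone is an assumption about how the messages are tagged; if a relevant record does not carry either name, the unfiltered dmesg output remains the reference.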

Using the Xilinx xbutil Utility

The Xilinx board utility (xbutil) is a powerful standalone command line utility that can be used to debug lower level hardware/software interaction issues. A full description of this utility can be found in the SDx Command and Utility Reference Guide.

With respect to debugging, the following xbutil options are of special interest:
query
  Provides an overall status of an SDAccel environment platform.
program
  Downloads a binary (xclbin) to the programmable region of the Xilinx device.
status
  Extracts the status of the SDx environment Performance Monitors (spm) and the Lightweight AXI Protocol Checkers (lapc).

Hardware Debugging Using ChipScope

After the final system image (xclbin) is generated and executed on the SDAccel environment platform, the entire system including the host application running on the CPU, and the accelerated kernels on the Xilinx FPGA, can be confirmed to be executing correctly on the actual hardware. At this stage you can validate the functioning of the host code and kernel in the target hardware, and debug any issues found. Some of the conditions that can be looked for or analyzed are listed as follows:
  • System hangs that could be due to protocol violations:
    • These violations can take down the entire system.
    • These violations can cause the kernel to get invalid data or to hang.
    • It is hard to determine where or when these violations originated.
    • To debug this condition, you should use an ILA triggered off of the AXI protocol checker, which needs to be configured on the SDAccel platform in use.
  • Problems inside the RTL kernel:
    • These problems are sometimes caused by the implementation: timing issues, race conditions, and bad design constraints.
    • Functional bugs that hardware emulation did not show.
  • Performance problems:
    • For example, the frames per second processing is not what you expect.
    • You can examine data beats and pipelining.
    • Using an ILA with trigger sequencer, you can examine the burst size, pipelining, and data width to locate the bottleneck.

Checking the FPGA Board for Hardware Debug Support

Supporting hardware debugging requires the platform to support several IP components, most notably the Debug Bridge. Talk to your platform designer to determine if these components are included in the platform shell. If a Xilinx platform is used, debug availability can be verified using the platforminfo utility to query the platform. Debug capabilities are listed under the chipscope_debug objects.

For example, to query a platform for hardware debug support, the following platforminfo command can be used. A response can be seen showing that the platform contains a user and management debug network, and also supports debugging a MicroBlaze™ processor.
$ platforminfo --json="hardwarePlatform.extensions.chipscope_debug" --platform xilinx_u200_xdma_201830_1
{
    "debug_networks": {
        "user": {
            "name": "User Debug Network",
            "pcie_pf": "1",
            "bar_number": "0",
            "axi_baseaddr": "0x000C0000",
            "supports_jtag_fallback": "false",
            "supports_microblaze_debug": "true",
            "is_user_visible": "true"
        },
        "mgmt": {
            "name": "Management Debug Network",
            "pcie_pf": "0",
            "bar_number": "0",
            "axi_baseaddr": "0x001C0000",
            "supports_jtag_fallback": "true",
            "supports_microblaze_debug": "true",
            "is_user_visible": "false"
        }
    }
}

Enabling ChipScope from the SDx IDE

The SDx IDE provides options to enable the ChipScope™ debug feature on all the interface ports of the compute units in the design. When enabling this option on a compute unit, the SDAccel environment compiler adds a System ILA debug core to monitor the interface ports of the compute unit. This ensures that you can debug the interface signals on the SDAccel environment platform hardware while the kernel is running. You can access this through the Settings command by right-clicking on a kernel in the system build configuration in the Assistant window as shown below.

Figure: SDx Assistant View

This brings up the Hardware Function Settings dialog box as shown in the following figure. You can use the Debug and Profiling Settings table in this dialog box to enable the ChipScope Debug checkbox for specific compute units of the kernel, which enables the monitoring of all the interfaces/ports on the compute unit.

Figure: SDx Hardware Function Settings

TIP: Enabling the ChipScope Debug option on larger designs with multiple kernels and/or compute units can result in overuse of the FPGA device resources. Xilinx recommends using the xocc --dk list_ports option on the command line to determine the number and type of interfaces on the compute units. If you know which ports need to be monitored for debug as the design runs in hardware, the recommended methodology is to use the --dk option documented in the following topic.

Command Line Flow

The full SDAccel kernel code compilation and linking command line flow can be found in the SDAccel Environment User Guide, Chapter 8. The following section covers the xocc linker options that can be used to list the available kernel ports as well as enable the System Integrated Logic Analyzer core on the selected ports. You should only use this flow if you are already familiar with the steps to build an SDAccel kernel at the command line.

The System Integrated Logic Analyzer debug core provides transaction-level visibility into an accelerated kernel or function running on hardware. AXI traffic of interest can also be captured and viewed using the System ILA core. The ILA core can be instantiated in the overall hardware of an existing RTL IP design to enable debugging features within that design, or it can be inserted automatically by the compiler. The xocc compiler provides the --dk option to attach System ILA cores at the interfaces to the kernels for debugging and performance monitoring purposes.

The --dk option to enable ILA IP core insertion has the following syntax:

--dk <[chipscope|list_ports]<:compute_unit_name><:interface_name>>

In general, the <interface_name> is optional. If not specified, all ports are expected to be analyzed. The chipscope option requires the explicit name of the compute unit to be provided for the <compute_unit_name> and <interface_name>. The list_ports option generates a list of valid compute units and port combinations in the current design and must be used after the kernel has been compiled.

Before using the --dk option, the kernel must be compiled into an .xo file. For a complete description of each xocc command line option as well as the complete SDAccel command line build flow, refer to the SDAccel Environment User Guide.

The first command compiles the kernel source files into an .xo file:

xocc -c -k  --platform  -o .xo 

After the kernel has been compiled into an .xo file, --dk list_ports can be added to the command line options used during the xocc linking process. This causes the xocc compiler to print the list of valid compute units and port combinations. See the following example:

xocc -l --platform  --nk ::--dk list_ports.xo

Finally, ChipScope debug can be enabled on the desired ports by replacing list_ports with the appropriate --dk chipscope command:

xocc -l --platform  --nk ::--dk chipscope::.xo

Note: Multiple --dk option switches can be specified in a single command line to additively increase interface monitoring capability.
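
When several ports of one compute unit need monitoring, the repeated switches can be generated instead of typed by hand. This is a minimal sketch that follows the --dk syntax given earlier in this section: dk_chipscope_opts is a hypothetical helper, and the compute unit and interface names in the example are made up for illustration, not taken from a real design.

```shell
# dk_chipscope_opts is a hypothetical helper. The first argument is a compute
# unit name; every further argument is an interface name on that compute unit.
# It prints one "--dk chipscope:<compute_unit_name>:<interface_name>" switch
# per interface, separated by spaces.
dk_chipscope_opts() {
    cu="$1"
    shift
    opts=""
    for port in "$@"; do
        opts="$opts --dk chipscope:$cu:$port"
    done
    # Drop the leading space before printing.
    printf '%s\n' "${opts# }"
}

# Made-up names; take the real ones from the --dk list_ports output.
dk_chipscope_opts vadd_1 M_AXI_GMEM S_AXI_CONTROL
```

The printed switches can be added to the xocc link command line. Because each monitored interface consumes device resources, the list is best limited to the ports that are actually needed.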

Refer to the SDx Command and Utility Reference Guide for more information on any xocc option. When the design is built, you can debug the design using the Vivado® hardware manager as described in Vivado Design Suite User Guide: Programming and Debugging (UG908).