SDAccel Development Environment Features
Compiler
- RTL Kernel Support
2017.1 significantly improves usability and performance for importing and optimizing RTL kernels in SDx™.
- A new RTL Kernel wizard provides a template for importing RTL IP into SDx.
- Scripts are provided to build the RTL kernel into a .xo (Xilinx Object) file, avoiding the error-prone creation of kernel.xml files that earlier releases required.
- For Vivado® experts optimizing the performance of RTL kernels, xocc has been enhanced to include a Vivado script through -custom_script during xocc --link.
- Utilization reports are automatically generated during compilation. These reports show utilization of LUTs, registers, BRAMs, and DSPs for the platform region, as well as for each of the kernels.
- The SDAccel™ compiler employs mathematical techniques to statically identify the access pattern of inputs and outputs, which allows better memory coalescing and burst inferencing. This is an early access feature and is enabled by passing both of the following xocc options:
- --xp param:compiler.version=39
- --xp param:compiler.advancedLoopOptimizations=true
- SDx/HLS should be able to automatically infer a shift register pattern for OpenCL™ kernels. The OpenCL shifter design pattern is inferred if:
- The shift logic is described using a simple for-loop.
- The array is initialized with 0, and each new value is assigned to the beginning or to the end of the array.
- The access points are at constant offsets.
- The compiler automatically decides whether a shift register is implemented in SRL or BRAM to improve resource use and timing closure.
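The three conditions above can be illustrated with a small sketch. It is written as plain C so that it can be compiled and checked on a host; the function name moving_sum and the TAPS constant are hypothetical, and whether the compiler infers a shifter for any particular kernel is not guaranteed by this example.

```c
#include <assert.h>
#include <stddef.h>

#define TAPS 4  /* hypothetical shift register length */

/* Plain-C model of a 4-tap moving sum written in the shifter style. */
void moving_sum(const int *in, int *out, size_t n)
{
    assert(in != NULL && out != NULL);

    /* Condition 2: the array is initialized with 0. */
    int shift_reg[TAPS] = {0};

    for (size_t i = 0; i < n; i++) {
        /* Condition 1: the shift is a simple for-loop. */
        for (int j = TAPS - 1; j > 0; j--) {
            shift_reg[j] = shift_reg[j - 1];
        }
        /* Condition 2: the new value is assigned at the beginning. */
        shift_reg[0] = in[i];

        /* Condition 3: every access uses a constant offset. */
        out[i] = shift_reg[0] + shift_reg[1] + shift_reg[2] + shift_reg[3];
    }
}
```

In an OpenCL kernel the same loop structure would sit inside the kernel body, with in and out as global memory pointers.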
- Dataflow support for OpenCL is now a production feature.
- The compiler can support dataflow on functions with arbitrary sized parameters, sub-functions, or loops. Dataflow can also apply to loop statements.
- The xocc option to define the OpenCL compiler dataflow FIFO size is: --xp param:compiler.xclDataflowFifoDepth=4
- The following warning message might appear during compilation:
kernel.cl:28:17: warning: unknown attribute 'xcl_dataflow' ignored __attribute__ ((xcl_dataflow))
Ignore the warning, or use -k kernel_name to avoid it.
- SDx introduces the OpenCL 2.0 image data type, which provides the ability to read and write to images in kernels through OpenCL 2.0 image built-ins.
- The supported APIs are:
- clCreateImage() for the image types listed above.
- clGetSupportedImageFormats()
- clEnqueueReadImage()
- clEnqueueWriteImage()
- clGetImageInfo()
- Enhancements to OpenCL math built-in functions to improve performance and reduce resources.
- To provide better control to expert users, xocc allows users to write out the default script as well as apply custom scripts.
xocc -c -export_script
xocc -c -custom_script
- Changes to SDAccel platform include the following:
- XDMA enhanced to support two Physical Functions to provide one PF for secure management.
- AXI Firewall at XDMA Full AXI4 and two AXI Lite interfaces to insulate the platform from hangs caused by AXI protocol violations in kernels.
- Feature ROM to embed platform data in a ROM in the DSA to enable Run Time checks.
- Bitstream download through ICAP on all platforms rather than MCAP for faster downloads.
- Introducing support for Kintex® UltraScale™ FPGA KCU1500 Reconfigurable Acceleration PCIe® card.
Board | Device | Supported DSAs | Kernel Clock Frequency (MHz) | Status | Features |
---|---|---|---|---|---|
XIL-ACCEL-RD-KU115 | KU115 | xilinx:xil-accel-rd-pcie3-ku115:4ddr:4.0 | 300 | Production | |
KCU1500 | KU115 | xilinx:kcu1500:4ddr:4.0 | 300 | Production | |
ADM-PCIE-KU3 | KU60 | xilinx:adm-pcie-ku3:2ddr-xpr:4.0 | 250 | Production | PCIe Gen3x8, 2 DDR. Automatic frequency scaling. Increased fabric resources for compute units. Global memory changes to volatile between binary loads. |
ADM-PCIE-7V3 | V7690T | xilinx:adm-pcie-7v3:1ddr:3.0 | 200 | Production | PCIe Gen3x8, 1 DDR. Automatic frequency scaling. |
DSA | User PF Driver | User Device Node | Management PF Driver | Management PF Node |
---|---|---|---|---|
xilinx:kcu1500:4ddr-xpr:4.0 | xdma | /dev/xdmaX_user, /dev/xdmaX_c2h_0, /dev/xdmaX_c2h_1, /dev/xdmaX_h2c_0, /dev/xdmaX_h2c_1 | xclmgmt | /dev/xclmgmtX |
xilinx:xil-accel-rd-ku115:4ddr-xpr:4.0 | xdma | /dev/xdmaX_user, /dev/xdmaX_c2h_0, /dev/xdmaX_c2h_1, /dev/xdmaX_h2c_0, /dev/xdmaX_h2c_1 | xclmgmt | /dev/xclmgmtX |
xilinx:xil-accel-rd-vu9p:4ddr-xpr:4.1 | xdma | /dev/xdmaX_user, /dev/xdmaX_c2h_0, /dev/xdmaX_c2h_1, /dev/xdmaX_h2c_0, /dev/xdmaX_h2c_1 | xclmgmt | /dev/xclmgmtX |
xilinx:adm-pcie-ku3:2ddr-xpr:4.0 | xdma | /dev/xdmaX_user, /dev/xdmaX_c2h_0, /dev/xdmaX_c2h_1, /dev/xdmaX_h2c_0, /dev/xdmaX_h2c_1 | xclmgmt | /dev/xclmgmtX |
2017.1 enhancements also include:
- SDx Eclipse UI
- Platform setup and project creation.
- Add custom platform(s) without creating an SDx project.
- SDx example store: access to GitHub SDSoC and SDAccel examples.
- Wizard for creating RTL kernels.
Acceleration
- Support for accelerating C function templates.
- Launch HLS for specified hardware functions.
Reporting
- New post-implementation utilization reports.
- Collapsible headers on all SDx reports.
- Xilinx Runtime.
- The early access xocl kernel driver, based on the Linux kernel GEM framework for PCIe-based DSAs, has the following features:
- Support for host page pinning, which improves DMA bandwidth.
- Uses Linux kernel based memory management for device memory management.
- Is multi-threading safe and provides a single device node per device for all device operations.
- xocl requires Red Hat 6.9 or later, or Ubuntu 16.04.
- Users can install xocl by invoking xbinst -gem. The rest of the steps (for example, running ./install.sh) remain the same.
- Feature ROM in the device, which advertises the device configuration to the driver.
- Migration to the xclbin2 format, which has features such as skipping the bitstream re-download if the same bitstream is already running.
- Calling clReleaseContext() truly releases the exclusive lock on the device so that another concurrently running application can create a context with clCreateContext().
- Several new features in xbsak, including the new commands scan, mem, and status.
- xbsak flash requires root permissions.
- 4.X DSAs have an AXI Firewall IP, which protects PCIe from hangs and stalls inside the device. If an AXI bus error occurs, the AXI Firewall IP trips, which causes the driver to send SIGBUS to all applications that have opened the device node.
- Runtime now includes support for the latest version of the OpenCL C++ wrappers from Khronos. The cl2.hpp header file ships along with the standard OpenCL C API header files.
- Early access: xocc supports setting a target kernel frequency. Overriding the kernel clock frequency with a lower frequency might help designs that fail to meet timing on platform clocks.
- --kernel_frequency sets a user-defined clock frequency in MHz for the kernel, overriding the default value from the DSA.
- To change the target of a kernel compilation to 150 MHz, add --kernel_frequency 150.
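Because an AXI Firewall trip on a 4.X DSA causes the driver to send SIGBUS to every application that has the device node open, a host application may want to notice the signal instead of terminating with the default action. The following is a minimal sketch using the standard POSIX sigaction() call; the names bus_error_seen, on_sigbus, and install_sigbus_handler are hypothetical and are not part of the Xilinx runtime.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() under strict -std= modes */

#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler; sig_atomic_t can be written safely from a handler. */
static volatile sig_atomic_t bus_error_seen = 0;

/* Hypothetical handler: only records that SIGBUS arrived. */
void on_sigbus(int signo)
{
    (void)signo;
    bus_error_seen = 1;
}

/* Install `handler` for SIGBUS. Returns 0 on success, -1 on failure. */
int install_sigbus_handler(void (*handler)(int))
{
    struct sigaction sa;

    assert(handler != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);  /* block no additional signals in the handler */
    return sigaction(SIGBUS, &sa, NULL);
}
```

A host program would call install_sigbus_handler(on_sigbus) before opening the device, then check bus_error_seen at convenient points and release its OpenCL objects before exiting. How the device should be recovered after a firewall trip is outside the scope of this sketch.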
- Usability
- GDB extension to provide visibility into the OpenCL data structures cl_queue, cl_event, and cl_mem to debug host application hangs.
- Application timeline trace refinements.
- Multicolor support for better visualization of OpenCL API calls.
- Support for additional OpenCL APIs:
- clCreateContext
- clCreateImage
- clEnqueueTask
- clEnqueueMigrateMemObjects
- clEnqueueReadImage
- clEnqueueWriteImage
- clEnqueueMigrateMem
- clEnqueueMapBuffer
- clEnqueueUnmapMemObject
- Detailed Kernel Trace
- Support for RTL kernels.
- Reporting loop pipeline activity in waveform.
- Enhanced debug checks in HW Emulation Flow covering:
- Kernel or system transaction hangs.
- Uninitialized memory read by kernel.
- Out of DDR Range access.
- Out of Bounds array access.
- Periodic aliveness status during long HW Emulation runs.