Debugging Performance Tips

The SDSoC environment provides some basic performance monitoring capabilities in the form of the sds_clock_counter() function. Use this function to determine how much time different code sections, such as the accelerated code and the non-accelerated code, take to execute.
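As a minimal sketch of this kind of measurement, the following C fragment reads the counter before and after a call and returns the difference. The workload sum_array() and the wrapper time_sum_array() are hypothetical names used only for illustration. The sketch assumes that sds_clock_counter() is declared in sds_lib.h and that the __SDSCC__ macro is defined when building with sdscc; the host fallback based on clock() is a stand-in so that the fragment also compiles off-target, and its ticks are not comparable to the hardware counter.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __SDSCC__
#include "sds_lib.h"   /* assumed to declare sds_clock_counter() */
#else
#include <time.h>
/* Host-only stand-in (assumption): lets the sketch build without SDSoC. */
static unsigned long long sds_clock_counter(void)
{
    return (unsigned long long)clock();
}
#endif

/* Hypothetical workload standing in for the code section being timed. */
static uint32_t sum_array(const uint32_t *data, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += data[i];
    }
    return sum;
}

/* Runs sum_array() once, stores its result, and returns the elapsed ticks. */
static unsigned long long time_sum_array(const uint32_t *data, size_t n,
                                         uint32_t *result)
{
    unsigned long long start = sds_clock_counter();
    *result = sum_array(data, n);
    unsigned long long stop = sds_clock_counter();
    return stop - start;
}
```

Timing the accelerated and the non-accelerated version of the same function with the same wrapper pattern gives two tick counts that can be compared directly.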

Estimate the actual hardware acceleration time by looking at the latency numbers in the Vivado® Design Suite HLS report files (_sds/vhls/…/*.rpt). A latency of X accelerator clock cycles corresponds to X * (processor_clock_freq/accelerator_clock_freq) processor clock cycles. Compare this with the time measured for the actual function call; the difference is the data transfer overhead.
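The conversion and the overhead estimate can be sketched in C as follows. The function names and the numbers in the usage note are hypothetical; the only relationship taken from the text is the scaling by processor_clock_freq/accelerator_clock_freq.

```c
#include <assert.h>

/* Converts a latency in accelerator clock cycles to processor clock cycles:
 * X * (processor_clock_freq / accelerator_clock_freq). */
static double accel_to_cpu_cycles(double accel_cycles,
                                  double cpu_mhz, double accel_mhz)
{
    assert(accel_mhz > 0.0);
    return accel_cycles * (cpu_mhz / accel_mhz);
}

/* Estimates the data transfer overhead, in processor clock cycles, as the
 * measured call time minus the converted HLS latency. A negative difference
 * is clamped to zero, because it only means the estimate was too coarse. */
static double transfer_overhead_cycles(double measured_call_cpu_cycles,
                                       double accel_cycles,
                                       double cpu_mhz, double accel_mhz)
{
    double compute = accel_to_cpu_cycles(accel_cycles, cpu_mhz, accel_mhz);
    double overhead = measured_call_cpu_cycles - compute;
    return overhead > 0.0 ? overhead : 0.0;
}
```

For example, with an assumed report latency of 5000 accelerator cycles, an 800 MHz processor clock, and a 100 MHz accelerator clock, the compute time is 40000 processor cycles. If the measured call took 100000 processor cycles, roughly 60000 cycles are attributable to data transfer.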

For best performance improvement, the time required for executing the accelerated function must be much smaller than the time required for executing the original software function. If this is not true, try to run the accelerator at a higher frequency by selecting a different clkid on the sdscc/sds++ command line. If that does not work, try to determine whether the data transfer overhead is a significant part of the accelerated function execution time, and reduce the data transfer overhead. Note that the default clkid corresponds to 100 MHz for all platforms. More details about the clkid values for the given platform can be obtained by running sdscc -sds-pf-info <platform_name>.

If the data transfer overhead is large, the following changes might help:
  • Move more code into the accelerated function so that the computation time increases, and the ratio of computation to data transfer time is improved.
  • Reduce the amount of data to be transferred by modifying the code or using pragmas to transfer only the required data.
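As an illustration of the second point, the sketch below uses the SDS data copy pragma to limit the transfer to the first n elements of each array instead of the full buffers. The function scale_accel() and its body are hypothetical, and the pragma is written in the ArrayName[offset:length] form on the assumption that this matches the SDSoC release in use. A regular C compiler ignores the unknown pragma, so the fragment also builds on a host.

```c
#include <stddef.h>

/* Hypothetical accelerated function. The pragma asks the data mover to copy
 * only elements [0, n) of each array, rather than the whole allocation. */
#pragma SDS data copy(in[0:n], out[0:n])
void scale_accel(const int *in, int *out, int n);

void scale_accel(const int *in, int *out, int n)
{
    /* Doubles each of the first n input values. */
    for (int i = 0; i < n; i++) {
        out[i] = 2 * in[i];
    }
}
```

If the caller allocates large buffers but only a small prefix is valid for a given call, passing that prefix length as n keeps the transferred data proportional to the work actually done.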