pragma HLS loop_flatten

Description

Allows nested loops to be flattened into a single loop hierarchy with improved latency.

In the RTL implementation, it requires one clock cycle to move from an outer loop to an inner loop, and from an inner loop to an outer loop. Flattening nested loops allows them to be optimized as a single loop. This saves clock cycles, potentially allowing for greater optimization of the loop body logic.
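As a rough illustration of the effect, the sketch below hand-flattens a two-level nest into a single loop with one counter. The names `sum_nested`, `sum_flattened`, `ROWS`, and `COLS` are hypothetical and chosen only for this example. The RTL that the tool actually generates for a flattened nest may differ.

```c
#define ROWS 4   /* hypothetical constant loop bounds */
#define COLS 8

/* Original nest: in RTL, entering and leaving the inner loop
   each cost a clock cycle on every outer iteration. */
int sum_nested(int data[ROWS][COLS]) {
    int acc = 0;
    for (int i = 0; i < ROWS; i++) {
        for (int j = 0; j < COLS; j++) {
            acc += data[i][j];
        }
    }
    return acc;
}

/* Hand-flattened equivalent: one loop of ROWS * COLS iterations,
   so there are no transitions between loop levels. */
int sum_flattened(int data[ROWS][COLS]) {
    int acc = 0;
    for (int k = 0; k < ROWS * COLS; k++) {
        acc += data[k / COLS][k % COLS];
    }
    return acc;
}
```

Both functions compute the same result. Based on the one-cycle transitions described above, the nested form spends about two extra cycles per outer iteration that the flattened form avoids.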

Apply the LOOP_FLATTEN pragma to the loop body of the inner-most loop in the loop hierarchy. Only perfect and semi-perfect loops can be flattened in this manner:
  • Perfect loop nests:
    • Only the innermost loop has loop body content.
    • There is no logic specified between the loop statements.
    • All loop bounds are constant.
  • Semi-perfect loop nests:
    • Only the innermost loop has loop body content.
    • There is no logic specified between the loop statements.
    • The outermost loop bound can be a variable.
  • Imperfect loop nests: When the inner loop has variable bounds (or the loop body is not exclusively inside the inner loop), try to restructure the code, or unroll the loops in the loop body to create a perfect loop nest.
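The restructuring advice for imperfect loop nests can be sketched as follows. This is a minimal example under stated assumptions: the function names, `N`, and `M` are hypothetical, and moving the initialization into the inner loop under a condition is only one possible restructuring. A standard C compiler ignores the unrecognized pragma, so the code can also be checked in ordinary C simulation.

```c
#define N 4   /* hypothetical constant loop bounds */
#define M 8

/* Imperfect nest: the assignment between the two for statements
   is logic outside the innermost loop body. */
void row_sums_imperfect(int in[N][M], int out[N]) {
    for (int i = 0; i < N; i++) {
        out[i] = 0;
        for (int j = 0; j < M; j++) {
            out[i] += in[i][j];
        }
    }
}

/* Restructured as a perfect nest: all logic sits in the innermost
   loop body and both bounds are constant. */
void row_sums_perfect(int in[N][M], int out[N]) {
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < M; j++) {
#pragma HLS loop_flatten
            if (j == 0) {
                out[i] = 0;   /* initialization moved inside the inner loop */
            }
            out[i] += in[i][j];
        }
    }
}
```

The two functions produce identical results. The second form trades a small conditional in the loop body for a nest that meets the perfect-loop conditions listed above.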

Syntax

Place the pragma in the C source within the boundaries of the nested loop.

#pragma HLS loop_flatten off

Where:

  • off: Optional keyword that prevents flattening from taking place. Use it to keep specific loops from being flattened while all other loops in the specified location are flattened.
    Note: The presence of the LOOP_FLATTEN pragma enables the optimization.

Example 1

Flattens loop_1 in function foo and all (perfect or semi-perfect) loops above it in the loop hierarchy, into a single loop. Place the pragma in the body of loop_1.

void foo (num_samples, ...) {
  int i;
  ...
  loop_1: for(i=0;i< num_samples;i++) {
    #pragma HLS loop_flatten
    ...
    result = a + b;
  }
}

Example 2

Prevents loop flattening in loop_1:

loop_1: for(i=0;i< num_samples;i++) {
  #pragma HLS loop_flatten off
  ...

See Also