pragma SDS async

Description

The ASYNC pragma must be paired with the WAIT pragma to support manual control of hardware function synchronization.

The ASYNC pragma is specified immediately preceding a call to a hardware function, directing the compiler not to automatically generate the wait based on data flow analysis. The WAIT pragma must be inserted at an appropriate point in the program to direct the CPU to wait until the associated ASYNC function call with the same ID has completed.

In the presence of an ASYNC pragma, the SDSoC system compiler does not generate an sds_wait() in the stub function for the associated call. The program must contain the matching sds_wait(ID) or #pragma SDS wait(ID) at an appropriate point to synchronize the controlling thread running on the CPU with the hardware function thread. An advantage of using #pragma SDS wait(ID) over the sds_wait(ID) function call is that the source code can then be compiled by compilers other than the SDSoC compiler, such as gcc, which do not interpret either the ASYNC or the WAIT pragma.
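The portability point above can be illustrated with a small sketch. It is not SDSoC-generated code: vadd is a hypothetical software stand-in for a hardware function, and run_with_pragma_wait is an illustrative name. The sketch assumes a build with a compiler that does not interpret SDS pragmas, in which case the call runs synchronously and both pragmas have no effect.

```c
#include <assert.h>
#include <stddef.h>

#define N 4

/* Hypothetical software stand-in for a hardware function. Under SDSoC
 * this would be a function selected for hardware acceleration. */
void vadd(const int a[N], const int b[N], int out[N])
{
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < N; i++)
        out[i] = a[i] + b[i];
}

/* Issues the call under ASYNC and synchronizes with the WAIT pragma.
 * A compiler that does not interpret SDS pragmas skips both lines, so
 * the call is an ordinary synchronous call and the result is already
 * complete when vadd returns. */
int run_with_pragma_wait(void)
{
    int a[N] = {1, 2, 3, 4};
    int b[N] = {10, 20, 30, 40};
    int out[N] = {0};

#pragma SDS async(1)
    vadd(a, b, out);

    /* Independent CPU work that does not touch out could go here. */

#pragma SDS wait(1)
    /* out is read only after the matching wait. */
    return out[0] + out[1] + out[2] + out[3];
}
```

Because no sds_wait(ID) call appears in the source, nothing here depends on an SDSoC header or library. Note that a compiler such as gcc may still report the pragmas as unknown when warnings like -Wunknown-pragmas are enabled.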

Syntax

Place the pragma in the C source immediately before the function call:

    #pragma SDS async(ID)
    ...
    #pragma SDS wait(ID)

Where:

ID: Is a user-defined ID for the ASYNC/WAIT pair, specified as a compile-time unsigned integer constant.

Example 1

The following code snippet shows an example of using these pragmas with different IDs:

    {
        #pragma SDS async(1)
        mmult(A, B, C);
        #pragma SDS async(2)
        mmult(D, E, F);
        ...
        #pragma SDS wait(1)
        #pragma SDS wait(2)
    }

The program running on the CPU first transfers A and B to the mmult hardware and returns immediately. It then transfers D and E to the mmult hardware and again returns immediately. When execution reaches #pragma SDS wait(1), the program waits for the output C to be ready. When execution reaches #pragma SDS wait(2), it waits for the output F to be ready.

Example 2

The following code snippet shows an example of using these pragmas with the same ID to pipeline the data transfer and accelerator execution:

    for (int i = 0; i < pipeline_depth; i++) {
        #pragma SDS async(1)
        mmult_accel(A[i%NUM_MAT], B[i%NUM_MAT], C[i%NUM_MAT]);
    }
    for (int i = pipeline_depth; i < NUM_TESTS-pipeline_depth; i++) {
        #pragma SDS wait(1)
        #pragma SDS async(1)
        mmult_accel(A[i%NUM_MAT], B[i%NUM_MAT], C[i%NUM_MAT]);
    }
    for (int i = 0; i < pipeline_depth; i++) {
        #pragma SDS wait(1)
    }

In the above example, the first loop ramps up the pipeline to a depth of pipeline_depth, the second loop executes the pipeline, and the third loop ramps down the pipeline. The hardware buffer depth (pragma SDS data buffer_depth) should be set to the same value as pipeline_depth. The goal of this pipeline is to transfer data to the accelerator for the next execution while the current execution is still running. See Increasing System Parallelism and Concurrency for more information.
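The ramp-up, steady-state, and ramp-down structure can be checked with a small bookkeeping model. This is a sketch under stated assumptions, not a simulation of the hardware: it assumes each async(1) call adds one outstanding call and each wait(1) retires exactly one. The names pipeline_stats and model_pipeline are illustrative, and no data transfer is modeled.

```c
#include <assert.h>

/* Counters collected by the model below. */
struct pipeline_stats {
    int issued;        /* total async(1) calls issued */
    int waited;        /* total wait(1) points executed */
    int max_in_flight; /* peak number of calls not yet waited on */
};

/* Mirrors the three loops of the example using counters only. */
struct pipeline_stats model_pipeline(int num_tests, int pipeline_depth)
{
    struct pipeline_stats s = {0, 0, 0};
    int in_flight = 0;

    assert(pipeline_depth > 0 && num_tests >= 2 * pipeline_depth);

    /* Ramp up: issue pipeline_depth calls without waiting. */
    for (int i = 0; i < pipeline_depth; i++) {
        in_flight++;
        s.issued++;
        if (in_flight > s.max_in_flight)
            s.max_in_flight = in_flight;
    }

    /* Steady state: retire one outstanding call, then issue the next. */
    for (int i = pipeline_depth; i < num_tests - pipeline_depth; i++) {
        in_flight--;
        s.waited++;
        in_flight++;
        s.issued++;
        if (in_flight > s.max_in_flight)
            s.max_in_flight = in_flight;
    }

    /* Ramp down: retire the remaining outstanding calls. */
    for (int i = 0; i < pipeline_depth; i++) {
        in_flight--;
        s.waited++;
    }

    assert(in_flight == 0); /* every issued call has a matching wait */
    return s;
}
```

For example, model_pipeline(10, 2) reports 8 issued calls, 8 waits, and a peak of 2 outstanding calls. The peak equals pipeline_depth, which is consistent with setting the buffer depth to the same value, and with the loop bounds as written the total number of calls is num_tests minus pipeline_depth.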
