xcl_reqd_pipe_depth

Description

The OpenCL 2.0 specification introduces a new memory object called a pipe. A pipe stores data organized as a FIFO. Pipes can be used to stream data from one kernel to another inside the FPGA device without going through external memory, which greatly reduces overall system latency.

In the SDAccel development environment, pipes must be statically defined outside of all kernel functions. The depth of a pipe must be specified by using the xcl_reqd_pipe_depth attribute in the pipe declaration:
pipe int p0 __attribute__((xcl_reqd_pipe_depth(512)));
Important: A given pipe can have one and only one producer and one consumer, in different kernels.

Pipes can only be accessed using the standard OpenCL read_pipe() and write_pipe() built-in functions in non-blocking mode, or using the Xilinx® extended read_pipe_block() and write_pipe_block() functions in blocking mode. Pipe objects are not accessible from the host CPU. The status of pipes can be queried using the OpenCL get_pipe_num_packets() and get_pipe_max_packets() built-in functions. See The OpenCL C Specification from the Khronos OpenCL Working Group for more details on these built-in functions.
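The difference between the two access modes can be illustrated with a small host-side model in plain C. This is only a sketch under stated assumptions: the names model_pipe, model_write_pipe, model_read_pipe, model_num_packets, and model_max_packets are hypothetical and are not part of OpenCL or SDAccel, and the model says nothing about how the FPGA implements a pipe. It mirrors the convention in the OpenCL C specification that the non-blocking read_pipe() and write_pipe() return 0 on success and a negative value otherwise.

```c
#include <stddef.h>

#define MODEL_DEPTH 16  /* smallest depth the attribute accepts */

/* Hypothetical host-side model of an int pipe; not the Xilinx implementation. */
typedef struct {
    int    data[MODEL_DEPTH];
    size_t head;   /* index of the oldest packet */
    size_t count;  /* number of packets currently stored */
} model_pipe;

/* Non-blocking write: returns 0 on success, -1 if the pipe is full. */
int model_write_pipe(model_pipe *p, const int *src)
{
    if (p->count == MODEL_DEPTH)
        return -1;
    p->data[(p->head + p->count) % MODEL_DEPTH] = *src;
    p->count++;
    return 0;
}

/* Non-blocking read: returns 0 on success, -1 if the pipe is empty. */
int model_read_pipe(model_pipe *p, int *dst)
{
    if (p->count == 0)
        return -1;
    *dst = p->data[p->head];
    p->head = (p->head + 1) % MODEL_DEPTH;
    p->count--;
    return 0;
}

/* Model counterparts of get_pipe_num_packets() and get_pipe_max_packets(). */
size_t model_num_packets(const model_pipe *p) { return p->count; }
size_t model_max_packets(const model_pipe *p) { (void)p; return MODEL_DEPTH; }
```

In this model, a caller of the non-blocking functions must check the return value and decide what to do when the pipe is full or empty. A blocking call corresponds to waiting until the operation can succeed, which is why kernels that use read_pipe_block() and write_pipe_block() need no such check.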

Syntax

This attribute must be assigned at the declaration of the pipe object:

pipe int p0 __attribute__((xcl_reqd_pipe_depth(n)));

Where:
  • n: Specifies the depth of the pipe. Valid depth values are 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768.
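
The depths listed above are exactly the powers of two from 16 to 32768. As a minimal sketch, a helper in plain C can check a candidate depth before it is written into the attribute. The name is_valid_pipe_depth is hypothetical and is not part of the SDAccel or OpenCL APIs.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of the SDAccel or OpenCL APIs.
 * Returns true when n is one of the documented depths:
 * a power of two between 16 and 32768 inclusive. */
bool is_valid_pipe_depth(unsigned int n)
{
    /* (n & (n - 1)) == 0 holds only for powers of two once n > 0. */
    return n >= 16u && n <= 32768u && (n & (n - 1u)) == 0u;
}
```

For example, is_valid_pipe_depth(512) is true, while is_valid_pipe_depth(100) and is_valid_pipe_depth(65536) are false.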

Examples

The following is the dataflow_pipes_ocl example from Xilinx GitHub that uses pipes to pass data from one processing stage to the next using the blocking read_pipe_block() and write_pipe_block() functions:

pipe int p0 __attribute__((xcl_reqd_pipe_depth(32)));
pipe int p1 __attribute__((xcl_reqd_pipe_depth(32)));

// Input Stage Kernel : Read Data from Global Memory and write into Pipe P0
kernel __attribute__ ((reqd_work_group_size(1, 1, 1)))
void input_stage(__global int *input, int size)
{
    __attribute__((xcl_pipeline_loop))
    mem_rd: for (int i = 0 ; i < size ; i++)
    {
        //blocking Write command to pipe P0
        write_pipe_block(p0, &input[i]);
    }
}

// Adder Stage Kernel: Read Input data from Pipe P0 and write the result
// into Pipe P1
kernel __attribute__ ((reqd_work_group_size(1, 1, 1)))
void adder_stage(int inc, int size)
{
    __attribute__((xcl_pipeline_loop))
    execute: for(int i = 0 ; i < size ; i++)
    {
        int input_data, output_data;
        //blocking read command to Pipe P0
        read_pipe_block(p0, &input_data);
        output_data = input_data + inc;
        //blocking write command to Pipe P1
        write_pipe_block(p1, &output_data);
    }
}

// Output Stage Kernel: Read result from Pipe P1 and write the result to
// Global Memory
kernel __attribute__ ((reqd_work_group_size(1, 1, 1)))
void output_stage(__global int *output, int size)
{
    __attribute__((xcl_pipeline_loop))
    mem_wr: for (int i = 0 ; i < size ; i++)
    {
        //blocking read command to Pipe P1
        read_pipe_block(p1, &output[i]);
    }
}

See Also