AI Engine API User Guide (AIE) 2021.2
AIE provides hardware support for special multiplication patterns that accelerate specific application use cases such as (but not limited to) signal processing.
Typedefs |
|
template |
|
using | aie::sliding_mul_sym_x_ops=sliding_mul_sym_ops< Lanes, Points, CoeffStep, DataStepX, 1, CoeffType, DataType, AccumTag > |
|
template |
|
using | aie::sliding_mul_sym_xy_ops=sliding_mul_sym_ops< Lanes, Points, CoeffStep, DataStepXY, DataStepXY, CoeffType, DataType, AccumTag > |
|
template |
|
using | aie::sliding_mul_sym_y_ops=sliding_mul_sym_ops< Lanes, Points, CoeffStep, 1, DataStepY, CoeffType, DataType, AccumTag > |
|
template |
|
using | aie::sliding_mul_x_ops=sliding_mul_ops< Lanes, Points, CoeffStep, DataStepX, 1, CoeffType, DataType, AccumTag > |
|
template |
|
using | aie::sliding_mul_xy_ops=sliding_mul_ops< Lanes, Points, CoeffStep, DataStepXY, DataStepXY, CoeffType, DataType, AccumTag > |
|
template |
|
using | aie::sliding_mul_y_ops=sliding_mul_ops< Lanes, Points, CoeffStep, 1, DataStepY, CoeffType, DataType, AccumTag > |
|
Functions |
|
template |
|
auto | aie::accumulate(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, const NextVecData &...next_data) -> operand_base_type_t< Acc > |
|
template |
|
auto | aie::accumulate(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, const NextVecData &...next_data) -> accum< std::conditional_t< std::is_same_v< AccumTag, accauto >, detail::default_accum_tag_t< typename VecCoeff::value_type, typename VecData::value_type >, AccumTag >, Lanes > |
|
struct aie::sliding_mul_ops |
This type provides a parametrized multiplication that implements the following compute pattern:
Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
16b x 16b | 8 / 16 | 1,2,3,4 | 1 | 1 | Unsigned smaller than 8 | Signed |
16b x 32b | 8 / 16 | 1,2,3,4 | 1,2,3,4 | 1,2 / 1 | Unsigned smaller than 8 | Signed |
32b x 16b | 8 / 16 | 1,2,3,4 | 1,2,3,4 | 1,2 / 1 | Unsigned smaller than 8 | Signed |
16b x c16b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
c16b x 16b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
c16b x c16b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
c16b x 32b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
32b x c16b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
c32b x 16b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
16b x c32b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
c32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |

Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
32b x 16b | 8 | 1,2,3,4 | 1,2,3,4 | 1,2 | Unsigned smaller than 8 | Signed |
16b x 32b | 8 | 1,2,3,4 | 1,2,3,4 | 1,2 | Unsigned smaller than 8 | Signed |
32b x 32b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c16b x 32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c32b x 16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c32b x 32b | 2 / 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
32b x c32b | 2 / 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c32b x c32b | 2 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |

Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
float x float | 8 | 1,2,3,4 | 1,2,3,4 | 1,2 | Unsigned smaller than 8 | Signed |
float x cfloat | 4 | 1,2,3 | 1,2,3,4 | 1,2,3 | Unsigned smaller than 8 | Signed |
cfloat x float | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3 | Unsigned smaller than 8 | Signed |
cfloat x cfloat | 4 | 1,2,3 | 1,2,3,4 | 1,2,3 | Unsigned smaller than 8 | Signed |
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepX | Step used to select elements from the data buffer. This step is applied to element selection within a lane. |
DataStepY | Step used to select elements from the data buffer. This step is applied to element selection across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
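The compute pattern itself did not survive in this copy of the page, so the following is a minimal scalar sketch reconstructed from the parameter descriptions above. It assumes that lane `l`, point `p` reads `coeff[coeff_start + p * CoeffStep]` and `data[data_start + p * DataStepX + l * DataStepY]`. The function name `sliding_mul_ref` and the use of `std::array` with `int`/`long` elements are hypothetical and are not part of the AIE API; index wrap-around and accumulator widths are not modelled.

```cpp
#include <array>
#include <cstddef>

// Scalar reference model of a sliding multiplication (illustrative only;
// this is not the AIE implementation).
// Assumption: lane l, point p reads
//   coeff[coeff_start + p * CoeffStep]
//   data [data_start  + p * DataStepX + l * DataStepY]
// All indices must stay inside the arrays; no wrap-around is modelled.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX, int DataStepY,
          std::size_t NC, std::size_t ND>
constexpr std::array<long, Lanes> sliding_mul_ref(const std::array<int, NC>& coeff,
                                                  unsigned coeff_start,
                                                  const std::array<int, ND>& data,
                                                  int data_start)
{
    std::array<long, Lanes> out{};
    for (unsigned lane = 0; lane < Lanes; ++lane) {
        long sum = 0;
        for (unsigned p = 0; p < Points; ++p) {
            // CoeffStep and DataStepX advance within a lane.
            const int c = static_cast<int>(coeff_start) + static_cast<int>(p) * CoeffStep;
            // DataStepY advances from one lane to the next.
            const int d = data_start + static_cast<int>(p) * DataStepX
                                     + static_cast<int>(lane) * DataStepY;
            sum += static_cast<long>(coeff[static_cast<std::size_t>(c)])
                   * data[static_cast<std::size_t>(d)];
        }
        out[lane] = sum;
    }
    return out;
}
```

With coefficients `{1, 2, 3}` and data `{0, 1, 2, ...}`, calling `sliding_mul_ref<4, 3, 1, 1, 1>(coeff, 0, data, 0)` yields `{8, 14, 20, 26}` under this model: each lane is a 3-tap dot product over a window that slides by one sample per lane.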
Public Types |
|
using | accum_type = accum< std::conditional_t< std::is_same_v< AccumTag, accauto >, detail::default_accum_tag_t< CoeffType, DataType >, AccumTag >, Lanes > |
using | coeff_type = typename impl_type::coeff_type |
using | data_type = typename impl_type::data_type |
using | impl_type = detail::sliding_mul< Lanes, Points, CoeffStep, DataStepX, DataStepY, accum_bits, CoeffType, DataType > |
enum class | MulType { Mul, Acc_Mul, NegMul } |
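The `accum_type` alias resolves the accumulator tag at compile time: an explicit `AccumTag` is kept, and only `accauto` is replaced by a default derived from the coefficient and data types. The sketch below reproduces that selection logic with stand-in types. Every name ending in `_model`, the alias `resolved_tag_t`, and the default mapping are hypothetical; the real tags and defaults are defined by the AIE API.

```cpp
#include <type_traits>

// Hypothetical stand-ins for accumulator tags.
struct accauto_model {};   // "pick the default for these operand types"
struct acc48_model {};
struct acc80_model {};

// Assumed default mapping, for illustration only: every pair maps to acc48_model.
template <typename CoeffType, typename DataType>
struct default_accum_tag_model { using type = acc48_model; };

// Same shape as the accum_type expression: keep an explicit tag and
// substitute the default only when the "auto" tag was passed.
template <typename AccumTag, typename CoeffType, typename DataType>
using resolved_tag_t =
    std::conditional_t<std::is_same_v<AccumTag, accauto_model>,
                       typename default_accum_tag_model<CoeffType, DataType>::type,
                       AccumTag>;
```

Under these assumptions, `resolved_tag_t<accauto_model, short, short>` is `acc48_model`, while `resolved_tag_t<acc80_model, short, short>` stays `acc80_model`.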
Static Public Member Functions |
|
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
|
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
|
template<MulType Mul, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, const Acc &...acc) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | negmul(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
|
Static Public Attributes |
|
static constexpr unsigned | columns_per_mul= impl_type::columns_per_mul |
static constexpr unsigned | lanes= impl_type::lanes |
static constexpr unsigned | lanes_per_mul= impl_type::lanes_per_mul |
static constexpr unsigned | num_mul= impl_type::num_mul |
static constexpr unsigned | points= impl_type::points |
|
inline static constexpr |
Performs a multiply-add with the pattern defined by the class parameters using the input coefficient and data arguments.
acc | Accumulator that is added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
data_start | Index of the first data element to be used in the multiplication. |
|
inline static constexpr |
Performs the multiplication pattern defined by the class parameters using the input coefficient and data arguments.
|
inline static constexpr |
Performs a negation of the multiplication pattern defined by the class parameters using the input coefficient and data arguments.
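The three operations described above differ only in how the product sum is combined: the plain result, the result added to an accumulator, and the negated result. The sketch below models that relationship with scalar code and unit steps. The `*_ref` names and the `std::array`-based types are hypothetical and are not the AIE API.

```cpp
#include <array>
#include <cstddef>

// Illustrative scalar models with CoeffStep = DataStepX = DataStepY = 1.
template <unsigned Lanes, unsigned Points, std::size_t NC, std::size_t ND>
constexpr std::array<long, Lanes> mul_ref(const std::array<int, NC>& coeff, unsigned coeff_start,
                                          const std::array<int, ND>& data, unsigned data_start)
{
    std::array<long, Lanes> out{};
    for (unsigned lane = 0; lane < Lanes; ++lane)
        for (unsigned p = 0; p < Points; ++p)
            out[lane] += static_cast<long>(coeff[coeff_start + p]) * data[data_start + lane + p];
    return out;
}

// Multiply-add: the accumulator is added to the multiplication result.
template <unsigned Lanes, unsigned Points, std::size_t NC, std::size_t ND>
constexpr std::array<long, Lanes> mac_ref(const std::array<long, Lanes>& acc,
                                          const std::array<int, NC>& coeff, unsigned coeff_start,
                                          const std::array<int, ND>& data, unsigned data_start)
{
    auto out = mul_ref<Lanes, Points>(coeff, coeff_start, data, data_start);
    for (unsigned lane = 0; lane < Lanes; ++lane)
        out[lane] += acc[lane];
    return out;
}

// Negated multiplication: every lane of the result changes sign.
template <unsigned Lanes, unsigned Points, std::size_t NC, std::size_t ND>
constexpr std::array<long, Lanes> negmul_ref(const std::array<int, NC>& coeff, unsigned coeff_start,
                                             const std::array<int, ND>& data, unsigned data_start)
{
    auto out = mul_ref<Lanes, Points>(coeff, coeff_start, data, data_start);
    for (unsigned lane = 0; lane < Lanes; ++lane)
        out[lane] = -out[lane];
    return out;
}
```

One consequence of this structure is that a filter with more taps than `Points` can be split into steps: in this model, a 4-point `mul_ref` gives the same lanes as a 2-point `mul_ref` on taps 0 and 1 followed by a 2-point `mac_ref` on taps 2 and 3 with `coeff_start` and `data_start` both advanced by 2.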
struct aie::sliding_mul_sym_ops |
This type provides a parametrized multiplication that implements the following compute patterns:
If Points is an even number:
If Points is an odd number:
Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
16b x 16b | 8 / 16 | 1,2,3,4 | 1 | 1 | Unsigned smaller than 8 | Signed |
16b x 32b | 8 / 16 | 1,2,3,4 | 1,2,3,4 | 1,2 / 1 | Unsigned smaller than 8 | Signed |
32b x 16b | 8 / 16 | 1,2,3,4 | 1,2,3,4 | 1,2 / 1 | Unsigned smaller than 8 | Signed |
16b x c16b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
c16b x 16b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
c16b x c16b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
c16b x 32b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
32b x c16b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
c32b x 16b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
16b x c32b | 4 / 8 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
c32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |

Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
32b x 16b | 8 | 1,2,3,4 | 1,2,3,4 | 1,2 | Unsigned smaller than 8 | Signed |
16b x 32b | 8 | 1,2,3,4 | 1,2,3,4 | 1,2 | Unsigned smaller than 8 | Signed |
32b x 32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 / 1,2 | Unsigned smaller than 8 | Signed |
32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c16b x 32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c32b x 16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c16b x c32b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c32b x 32b | 2 / 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
32b x c32b | 2 / 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c32b x c32b | 2 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |

Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepX | Step used to select elements from the data buffer. This step is applied to element selection within a lane. |
DataStepY | Step used to select elements from the data buffer. This step is applied to element selection across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
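The even and odd compute patterns are missing from this copy of the page. The sketch below is therefore an assumption based on the usual folding of a symmetric filter: each coefficient multiplies the sum of a sample and its mirror within the `Points`-wide window, and an odd `Points` adds one unpaired centre tap. The antisymmetric case is modelled by subtracting the mirrored sample (`Sign = -1`); how the centre tap is treated in that case is also an assumption. The name `sliding_mul_sym_ref`, the `Sign` parameter, and the `std::array` types are hypothetical.

```cpp
#include <array>
#include <cstddef>

// Scalar model of a symmetric / antisymmetric sliding multiplication
// (illustrative only; not the AIE implementation).
// Sign = +1: coeff * (left + right);  Sign = -1: coeff * (left - right).
template <unsigned Lanes, unsigned Points, unsigned CoeffStep, unsigned DataStepX,
          unsigned DataStepY, int Sign, std::size_t NC, std::size_t ND>
constexpr std::array<long, Lanes> sliding_mul_sym_ref(const std::array<int, NC>& coeff,
                                                      unsigned coeff_start,
                                                      const std::array<int, ND>& data,
                                                      unsigned data_start)
{
    static_assert(Sign == 1 || Sign == -1, "Sign must be +1 or -1");
    std::array<long, Lanes> out{};
    for (unsigned lane = 0; lane < Lanes; ++lane) {
        const unsigned base = data_start + lane * DataStepY;  // window start for this lane
        long sum = 0;
        // Paired taps: point p shares its coefficient with point Points-1-p.
        for (unsigned p = 0; p < Points / 2; ++p) {
            const long left  = data[base + p * DataStepX];
            const long right = data[base + (Points - 1 - p) * DataStepX];
            sum += coeff[coeff_start + p * CoeffStep] * (left + Sign * right);
        }
        // Odd Points: one unpaired centre tap.
        if (Points % 2 != 0) {
            const unsigned mid = Points / 2;
            sum += static_cast<long>(coeff[coeff_start + mid * CoeffStep])
                   * data[base + mid * DataStepX];
        }
        out[lane] = sum;
    }
    return out;
}
```

In this model a 5-point symmetric call with stored coefficients `{1, 2, 3}` matches a plain 5-tap filter with taps `{1, 2, 3, 2, 1}`, which is why only about half of the coefficients need to be stored and multiplied.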
Public Types |
|
using | accum_type = accum< std::conditional_t< std::is_same_v< AccumTag, accauto >, detail::default_accum_tag_t< CoeffType, DataType >, AccumTag >, Lanes > |
using | coeff_type = typename impl_type::coeff_type |
using | data_type = typename impl_type::data_type |
using | impl_type = detail::sliding_mul_sym< Lanes, Points, CoeffStep, DataStepX, DataStepY, accum_bits, CoeffType, DataType > |
enum class | SymMulType { Sym, Antisym, Acc_Sym, Acc_Antisym } |
Static Public Member Functions |
|
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_antisym(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
|
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_antisym(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |
|
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_antisym(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |
|
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_sym(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
|
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_sym(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |
|
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_sym(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |
|
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_antisym(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
|
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_antisym(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |
|
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_antisym(const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |
|
template<SymMulType MulType, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, const Acc &...acc) |
template<SymMulType MulType, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start, const Acc &...acc) |
template<SymMulType MulType, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common(const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, const Acc &...acc) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_sym(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start) |
|
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_sym(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned ldata_start, unsigned rdata_start) |
|
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_sym(const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start) |
|
Static Public Attributes |
|
static constexpr unsigned | columns_per_mul= impl_type::columns_per_mul |
static constexpr unsigned | lanes= impl_type::lanes |
static constexpr unsigned | lanes_per_mul= impl_type::lanes_per_mul |
static constexpr unsigned | num_mul= impl_type::num_mul |
static constexpr unsigned | points= impl_type::points |
|
inline static constexpr |
Performs the antisymmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments. This variant allows two separate start indices for left/right elements.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata_start | Index of the first right data element to be used in the multiplication. |
|
inline static constexpr |
Performs the antisymmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments. This variant allows two separate start indices for left/right elements.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata_start | Index of the first right data element to be used in the multiplication. |
|
inline static constexpr |
Performs the antisymmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments. This variant uses two input buffers for left/right elements.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
ldata | Vector of left data samples. The size is limited to vectors of up to 512 bits. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata | Vector of right data samples. The size is limited to vectors of up to 512 bits. |
rdata_start | Index of the first right data element to be used in the multiplication. |
|
inline static constexpr |
Performs the symmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
data_start | Index of the first data element to be used in the multiplication. |
|
inline static constexpr |
Performs the symmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments. This variant allows two separate start indices for left/right elements.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata_start | Index of the first right data element to be used in the multiplication. |
|
inline static constexpr |
Performs the symmetric multiply-add pattern defined by the class parameters using the input coefficient and data arguments. This variant uses two input buffers for left/right elements.
acc | Accumulator to be added to the result of the multiplication. |
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
ldata | Vector of left data samples. The size is limited to vectors of up to 512 bits. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata | Vector of right data samples. The size is limited to vectors of up to 512 bits. |
rdata_start | Index of the first right data element to be used in the multiplication. |
|
inline static constexpr |
Performs the antisymmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments. This variant allows two separate start indices for left/right elements.
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata_start | Index of the first right data element to be used in the multiplication. |
|
inline static constexpr |
Performs the antisymmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments. This variant allows two separate start indices for left/right elements.
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata_start | Index of the first right data element to be used in the multiplication. |
|
inline static constexpr |
Performs the antisymmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments. This variant uses two input buffers for left/right elements.
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
ldata | Vector of left data samples. The size is limited to vectors of up to 512 bits. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata | Vector of right data samples. The size is limited to vectors of up to 512 bits. |
rdata_start | Index of the first right data element to be used in the multiplication. |
|
inline static constexpr |
Performs the symmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments.
|
inline static constexpr |
Performs the symmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments. This variant allows two separate start indices for left/right elements.
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
data | Vector of data samples. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata_start | Index of the first right data element to be used in the multiplication. |
|
inline static constexpr |
Performs the symmetric multiplication pattern defined by the class parameters using the input coefficient and data arguments. This variant uses two input buffers for left/right elements.
coeff | Vector of coefficients. On AIE the size is limited to vectors of up to 256 bits. |
coeff_start | Index of the first coefficient element to be used in the multiplication. |
ldata | Vector of left data samples. The size is limited to vectors of up to 512 bits. |
ldata_start | Index of the first left data element to be used in the multiplication. |
rdata | Vector of right data samples. The size is limited to vectors of up to 512 bits. |
rdata_start | Index of the first right data element to be used in the multiplication. |
struct aie::sliding_mul_sym_uct_ops |
This type provides a parametrized multiplication across the lower half of its lanes (equivalent to sliding_mul_sym_ops), and upshifts one selected set of data in the upper half of the lanes.
It implements the following compute pattern:
Types (coeff x data) | Lanes | CoeffStep | DataStepX | DataStepY | coeff_start | data_start |
---|---|---|---|---|---|---|
c16b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
c32b x c16b | 4 | 1,2,3,4 | 1,2,3,4 | 1,2,3,4 | Unsigned smaller than 8 | Signed |
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane in the first half of the output Lanes. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStep | Step used to select elements from the data buffer. This step is applied to element selection within a lane and across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
Public Types |
|
using | accum_type = accum< std::conditional_t< std::is_same_v< AccumTag, accauto >, detail::default_accum_tag_t< CoeffType, DataType >, AccumTag >, Lanes > |
using | coeff_type = typename impl_type::coeff_type |
using | data_type = typename impl_type::data_type |
using | impl_type = detail::sliding_mul_sym_uct< Lanes, Points, CoeffStep, DataStep, accum_bits, CoeffType, DataType > |
enum class | SymMulType { Sym, Antisym, Acc_Sym, Acc_Antisym } |
Static Public Member Functions |
|
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_antisym_uct(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, unsigned uct_shift) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_antisym_uct(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, unsigned uct_shift) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_sym_uct(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, unsigned uct_shift) |
template<AccumOrOp Acc, VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mac_sym_uct(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, unsigned uct_shift) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_antisym_uct(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, unsigned uct_shift) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_antisym_uct(const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, unsigned uct_shift) |
template<SymMulType MulType, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, unsigned uct_shift, const Acc &...acc) |
template<SymMulType MulType, VectorOrOp VecCoeff, VectorOrOp VecData, AccumOrOp... Acc> | |
static constexpr accum_type | mul_common(const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, unsigned uct_shift, const Acc &...acc) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_sym_uct(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, unsigned data_start, unsigned uct_shift) |
template<VectorOrOp VecCoeff, VectorOrOp VecData> | |
static constexpr accum_type | mul_sym_uct(const VecCoeff &coeff, unsigned coeff_start, const VecData &ldata, unsigned ldata_start, const VecData &rdata, unsigned rdata_start, unsigned uct_shift) |
Static Public Attributes |
|
static constexpr unsigned | columns_per_mul= impl_type::columns_per_mul |
static constexpr unsigned | lanes= impl_type::lanes |
static constexpr unsigned | lanes_per_mul= impl_type::lanes_per_mul |
static constexpr unsigned | num_mul= impl_type::num_mul |
static constexpr unsigned | points= impl_type::points |
|
using aie::sliding_mul_sym_x_ops = typedef sliding_mul_sym_ops |
Similar to sliding_mul_sym_ops, but DataStepY is always 1.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepX | Step used to select elements from the data buffer. This step is applied to element selection within a lane. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
using aie::sliding_mul_sym_xy_ops = typedef sliding_mul_sym_ops |
Similar to sliding_mul_sym_ops, but DataStepX is equal to DataStepY.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepXY | Step used to select elements from the data buffer. This step is applied to element selection within a lane and across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
using aie::sliding_mul_sym_y_ops = typedef sliding_mul_sym_ops |
Similar to sliding_mul_sym_ops, but DataStepX is always 1.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepY | Step used to select elements from the data buffer. This step is applied to element selection across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
using aie::sliding_mul_x_ops = typedef sliding_mul_ops |
Similar to sliding_mul_ops, but DataStepY is always 1.
For the list of valid parameters, check sliding_mul_ops.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepX | Step used to select elements from the data buffer. This step is applied to element selection within a lane. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
using aie::sliding_mul_xy_ops = typedef sliding_mul_ops |
Similar to sliding_mul_ops, but DataStepX is equal to DataStepY.
For the list of valid parameters, check sliding_mul_ops.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepXY | Step used to select elements from the data buffer. This step is applied to element selection within a lane and across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
using aie::sliding_mul_y_ops = typedef sliding_mul_ops |
Similar to sliding_mul_ops, but DataStepX is always 1.
Lanes | Number of output elements. |
Points | Number of data elements used to compute each lane. |
CoeffStep | Step used to select elements from the coeff buffer. This step is applied to element selection within a lane. |
DataStepY | Step used to select elements from the data buffer. This step is applied to element selection across lanes. |
CoeffType | Type of the coefficient elements. |
DataType | Type of the data elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
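The `_x`, `_y` and `_xy` aliases above are ordinary C++ alias templates that pin one or both data steps of the underlying type. The sketch below shows the mechanism with a stand-in struct; `ops_model` and the three `*_ops_model` aliases are hypothetical, and the `CoeffType`, `DataType` and `AccumTag` parameters of the real types are left out for brevity.

```cpp
#include <type_traits>

// Hypothetical stand-in for the full ops type; only the step parameters matter here.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX, int DataStepY>
struct ops_model {
    static constexpr int data_step_x = DataStepX;
    static constexpr int data_step_y = DataStepY;
};

// "_x" flavour: the caller chooses DataStepX, DataStepY is fixed to 1.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepX>
using x_ops_model = ops_model<Lanes, Points, CoeffStep, DataStepX, 1>;

// "_y" flavour: DataStepX is fixed to 1, the caller chooses DataStepY.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepY>
using y_ops_model = ops_model<Lanes, Points, CoeffStep, 1, DataStepY>;

// "_xy" flavour: one step is used both within a lane and across lanes.
template <unsigned Lanes, unsigned Points, int CoeffStep, int DataStepXY>
using xy_ops_model = ops_model<Lanes, Points, CoeffStep, DataStepXY, DataStepXY>;
```

Because these are aliases and not new types, `x_ops_model<8, 4, 1, 2>` is exactly the same type as `ops_model<8, 4, 1, 2, 1>`, so any member available on the full type is available through the alias.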
auto aie::accumulate(const Acc &acc, const VecCoeff &coeff, unsigned coeff_start, const VecData &data, const NextVecData &...next_data) -> operand_base_type_t< Acc >
This function provides a parametrized multiplication that implements the following compute pattern:
Lanes | Number of output elements. |
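auto aie::accumulate(const VecCoeff &coeff, unsigned coeff_start, const VecData &data, const NextVecData &...next_data) -> accum< std::conditional_t< std::is_same_v< AccumTag, accauto >, detail::default_accum_tag_t< typename VecCoeff::value_type, typename VecData::value_type >, AccumTag >, Lanes >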
This function provides a parametrized multiplication that implements the following compute pattern:
Lanes | Number of output elements. |
AccumTag | Accumulator tag that specifies the required accumulation bits. The class must be compatible with the result of the multiplication of the coefficient and data types (real/complex). |
coeff | Vector with the coefficients. |
coeff_start | First element from the coeff vector to be used. |
data | First vector of data. |
next_data | Rest of the data vectors. |
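The compute pattern for `accumulate` is also missing from this copy of the page. Based on the parameter list, the sketch below assumes a weighted sum of vectors: the k-th data vector is scaled by `coeff[coeff_start + k]` and added lane by lane. That reading is an assumption, and `accumulate_ref`, `add_weighted` and the `std::array` types are hypothetical helpers, not the AIE API.

```cpp
#include <array>
#include <cstddef>

// Adds weight * v[i] to out[i] for the first Lanes elements of v.
template <std::size_t Lanes, std::size_t N>
constexpr void add_weighted(std::array<long, Lanes>& out, int weight, const std::array<int, N>& v)
{
    for (std::size_t i = 0; i < Lanes; ++i)
        out[i] += static_cast<long>(weight) * v[i];
}

// Scalar model (assumed pattern):
//   out[i] = coeff[coeff_start] * data[i] + coeff[coeff_start + 1] * next_data_0[i] + ...
template <unsigned Lanes, std::size_t NC, typename VecData, typename... NextVecData>
constexpr std::array<long, Lanes> accumulate_ref(const std::array<int, NC>& coeff,
                                                 unsigned coeff_start,
                                                 const VecData& data,
                                                 const NextVecData&... next_data)
{
    std::array<long, Lanes> out{};
    unsigned k = coeff_start;
    add_weighted(out, coeff[k++], data);
    // Comma fold: operands are evaluated left to right, so each further
    // vector consumes the next coefficient in order.
    (add_weighted(out, coeff[k++], next_data), ...);
    return out;
}
```

With coefficients `{2, 3, 5}` and vectors `{1, 1, 1, 1}`, `{0, 1, 2, 3}`, `{10, 20, 30, 40}`, this model gives `out[0] = 2*1 + 3*0 + 5*10 = 52`. The overload that takes `acc` would, under the same assumption, start from the accumulator contents instead of zero.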