AI Engine API User Guide (AIE) 2021.2
Matrix Multiplication

Overview

The AIE API encapsulates the matrix multiplication functionality in the aie::mmul class template. This class template is parametrized with the matrix multiplication shape (MxKxN), the data types and, optionally, the requested accumulation precision. The resulting class defines functions that perform the multiplication and a data type for the result that can be converted to an accumulator or a vector. The functions interpret the input vectors as matrices, as described by the shape parameters.

The following code snippet shows a sample blocked multiplication using the aie::mmul class. The matrices are assumed to be pre-tiled as defined by the mmul shape (MxK for A, KxN for B, and MxN for C).

template <unsigned M, unsigned K, unsigned N>
void mmul_blocked(unsigned rowA, unsigned colA, unsigned colB,
                  const int16 * __restrict pA, const int16 * __restrict pB, int16 * __restrict pC)
{
    // rowA, colA and colB are counted in tiles, not in elements.
    using MMUL = aie::mmul<M, K, N, int16, int16>;

    // Each iteration of the two outer loops produces a 2x2 block of C tiles.
    for (unsigned z = 0; z < rowA; z += 2) chess_loop_range(2,) {
        int16 * __restrict pC1 = pC + (      z * colB + 0) * MMUL::size_C;
        int16 * __restrict pC2 = pC + ((z + 1) * colB + 0) * MMUL::size_C;

        for (unsigned j = 0; j < colB; j += 2) chess_loop_range(2,) {
            const int16 * __restrict pA1 = pA + (      z * colA + 0) * MMUL::size_A;
            const int16 * __restrict pA2 = pA + ((z + 1) * colA + 0) * MMUL::size_A;
            const int16 * __restrict pB1 = pB + (0 * colB +       j) * MMUL::size_B;
            const int16 * __restrict pB2 = pB + (0 * colB + (j + 1)) * MMUL::size_B;

            aie::vector<int16, MMUL::size_A> A0 = aie::load_v<MMUL::size_A>(pA1); pA1 += MMUL::size_A;
            aie::vector<int16, MMUL::size_A> A1 = aie::load_v<MMUL::size_A>(pA2); pA2 += MMUL::size_A;
            aie::vector<int16, MMUL::size_B> B0 = aie::load_v<MMUL::size_B>(pB1); pB1 += MMUL::size_B * colB;
            aie::vector<int16, MMUL::size_B> B1 = aie::load_v<MMUL::size_B>(pB2); pB2 += MMUL::size_B * colB;

            // The first tile product initializes the four results.
            MMUL C00; C00.mul(A0, B0);
            MMUL C01; C01.mul(A0, B1);
            MMUL C10; C10.mul(A1, B0);
            MMUL C11; C11.mul(A1, B1);

            // The remaining tile products along the shared dimension are accumulated.
            for (unsigned i = 1; i < colA; ++i) chess_prepare_for_pipelining chess_loop_range(3,) {
                A0 = aie::load_v<MMUL::size_A>(pA1); pA1 += MMUL::size_A;
                A1 = aie::load_v<MMUL::size_A>(pA2); pA2 += MMUL::size_A;
                B0 = aie::load_v<MMUL::size_B>(pB1); pB1 += MMUL::size_B * colB;
                B1 = aie::load_v<MMUL::size_B>(pB2); pB2 += MMUL::size_B * colB;

                C00.mac(A0, B0);
                C01.mac(A0, B1);
                C10.mac(A1, B0);
                C11.mac(A1, B1);
            }

            aie::store_v(pC1, C00.template to_vector<int16>()); pC1 += MMUL::size_C;
            aie::store_v(pC1, C01.template to_vector<int16>()); pC1 += MMUL::size_C;
            aie::store_v(pC2, C10.template to_vector<int16>()); pC2 += MMUL::size_C;
            aie::store_v(pC2, C11.template to_vector<int16>()); pC2 += MMUL::size_C;
        }
    }
}
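
The snippet above expects its inputs to be pre-tiled already. As a rough illustration of what such a layout could look like, the following portable C++ sketch rearranges a row-major matrix into consecutive tiles. It is not part of the AIE API: tile_row_major is a hypothetical host-side helper, and it assumes that tiles are stored in row-major tile order (which matches the pointer arithmetic in the snippet) and that the elements inside each tile are row-major as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an AIE API function): copies a row-major
// rows x cols matrix into consecutive TR x TC tiles. Tiles are laid out in
// row-major tile order and elements inside a tile are row-major (assumed).
template <unsigned TR, unsigned TC, typename T>
std::vector<T> tile_row_major(const std::vector<T>& src, unsigned rows, unsigned cols)
{
    // The sketch only handles dimensions that are whole multiples of the tile.
    assert(rows % TR == 0 && cols % TC == 0);
    assert(src.size() == std::size_t(rows) * cols);

    std::vector<T> dst(src.size());
    const unsigned tiles_per_row = cols / TC;
    for (unsigned r = 0; r < rows; ++r) {
        for (unsigned c = 0; c < cols; ++c) {
            const std::size_t tile   = std::size_t(r / TR) * tiles_per_row + (c / TC);
            const std::size_t within = std::size_t(r % TR) * TC + (c % TC);
            dst[tile * TR * TC + within] = src[std::size_t(r) * cols + c];
        }
    }
    return dst;
}
```

With an MxKxN shape, A would be tiled with TR = M and TC = K, B with TR = K and TC = N, and the C output would come back in MxN tiles that need the inverse rearrangement.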

Classes

struct aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >

Supported matrix multiplication shapes

Matrix multiplication modes for real types
8b x 8b 16b x 8b 8b x 16b 16b x 16b 32b x 16b 16b x 32b 32b x 32b float
4x8x4
4x16x4
8x8x4
2x8x8
4x8x8
2x16x8
4x16x8
4x4x4
8x4x4
4x8x4
4x4x8
4x4x8
4x4x4
4x4x4
2x4x8
4x4x8
4x2x8
2x4x8
4x4x4
4x2x4
2x2x4
2x4x4
4x4x2
2x2x8
4x2x2
2x4x8
4x4x4
4x2x4
2x2x2
2x4x2
2x8x2
4x2x2
4x4x2
2x4x4
4x2x4
2x2x2
2x4x2
2x8x2
4x2x2
4x4x2
2x4x4
Matrix multiplication modes for complex types (c16b/c32b/cfloat represent complex types)
16b x c16b 16b x c32b c16b x 16b c16b x c16b c16b x 32b c16b x c32b 32b x c16b 32b x c32b c32b x 16b c32b x c16b c32b x 32b c32b x c32b float x cfloat cfloat x float cfloat x cfloat
4x2x2
4x4x4
2x4x2
2x4x4
2x8x2
4x4x2
2x2x4
2x2x8
2x4x4
2x4x8
4x2x4
4x4x2
4x4x4
2x2x2
2x4x2
2x8x2
2x4x4
4x2x2
4x4x2
4x2x4
2x2x2
2x4x2
2x8x2
2x4x4
4x2x2
4x4x2
4x2x4
2x2x2
2x4x2
2x2x2
2x4x2
2x8x2
2x4x4
4x2x2
4x4x2
4x2x4
2x2x2
2x4x2
2x4x2
2x8x2
2x4x4
4x4x2
2x2x2
2x4x2
1x2x2
2x2x2
2x4x2
1x2x2
2x2x1
2x2x2
2x2x2
2x4x2
2x2x2
2x4x2
2x2x2
2x2x4
2x4x2
4x2x2

Class Documentation

aie::mmul

struct aie::mmul
template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
struct aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >

Type that encapsulates a blocked matrix multiplication C = A x B

Objects of this type encapsulate the current result of the multiplication. The first result is computed with the mul method. New multiplications can be accumulated using the mac method.
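
To make the mul/mac semantics concrete, here is a minimal scalar sketch in portable C++. It is not the AIE API implementation: mmul_ref, its acc member and mmul_ref_demo are hypothetical names, the 64-bit accumulator is an arbitrary choice, and row-major storage of A, B and C is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the mul/mac behaviour (not the AIE API).
// Assumes A (M x K), B (K x N) and the result C (M x N) are row-major.
template <unsigned M, unsigned K, unsigned N, typename T, typename Acc = std::int64_t>
struct mmul_ref {
    std::array<Acc, M * N> acc{};

    // C = A x B: discards any previous result.
    void mul(const std::array<T, M * K>& a, const std::array<T, K * N>& b)
    {
        acc.fill(0);
        mac(a, b);
    }

    // C += A x B: accumulates onto the current result.
    void mac(const std::array<T, M * K>& a, const std::array<T, K * N>& b)
    {
        for (unsigned m = 0; m < M; ++m)
            for (unsigned n = 0; n < N; ++n)
                for (unsigned k = 0; k < K; ++k)
                    acc[m * N + n] += Acc(a[m * K + k]) * Acc(b[k * N + n]);
    }
};

// Usage: the first product initializes the result, the second is accumulated.
inline void mmul_ref_demo()
{
    const std::array<std::int16_t, 4> a{1, 2, 3, 4}; // [[1, 2], [3, 4]]
    const std::array<std::int16_t, 4> b{5, 6, 7, 8}; // [[5, 6], [7, 8]]
    mmul_ref<2, 2, 2, std::int16_t> c;
    c.mul(a, b);
    assert((c.acc == std::array<std::int64_t, 4>{19, 22, 43, 50}));
    c.mac(a, b);
    assert((c.acc == std::array<std::int64_t, 4>{38, 44, 86, 100}));
}
```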

Template Parameters
M_Elems Rows in matrix A.
K_Elems Columns in matrix A / Rows in matrix B.
N_Elems Columns in matrix B.
TypeA Type of the elements in matrix A. It must meet ElemBaseType.
TypeB Type of the elements in matrix B. Defaults to TypeA. It must meet ElemBaseType.
AccumTag Type of the elements of the accumulator that contains the results to be written in matrix C. It must meet AccumElemBaseType. If not specified, the default accumulation type for multiplications of TypeA x TypeB is used.
Inheritance diagram for aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >:
aie::detail::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeA, detail::to_native_accum_bits_for_mul_types_tag< TypeA, TypeA, accauto >()>

Public Types

using accum_type = typename mmul_impl::accum_type
using mmul_impl = detail::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, detail::to_native_accum_bits_for_mul_types_tag< TypeA, TypeB, AccumTag >()>

Public Member Functions

mmul()
mmul(const accum_type &acc)
template<typename T>
mmul(const vector< T, M*N > &v, int shift = 0)
template<unsigned ElemsA, unsigned ElemsB>
void mac(const vector< TypeA, ElemsA > &a, const vector< TypeB, ElemsB > &b)
template<unsigned ElemsA, unsigned ElemsB>
void mul(const vector< TypeA, ElemsA > &a, const vector< TypeB, ElemsB > &b)
operator accum_type() const
accum_type to_accum() const
template<typename T>
vector< T, M*N > to_vector(int shift = 0) const

Static Public Attributes

static constexpr unsigned K = K_Elems
static constexpr unsigned M = M_Elems
static constexpr unsigned N = N_Elems
static constexpr unsigned size_A = M*K
static constexpr unsigned size_B = K*N
static constexpr unsigned size_C = M*N
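
As a quick check of how these attributes relate to the shape parameters, the sketch below mirrors them in a standalone struct. mmul_shape is a hypothetical stand-in used only for illustration, not an AIE API type. The 4x16x8 shape is taken from the supported-shapes list above.

```cpp
// Hypothetical stand-in that mirrors only the static attributes of aie::mmul.
template <unsigned M_Elems, unsigned K_Elems, unsigned N_Elems>
struct mmul_shape {
    static constexpr unsigned M = M_Elems;
    static constexpr unsigned K = K_Elems;
    static constexpr unsigned N = N_Elems;
    static constexpr unsigned size_A = M * K; // elements in one A tile
    static constexpr unsigned size_B = K * N; // elements in one B tile
    static constexpr unsigned size_C = M * N; // elements in one C tile
};

// For a 4x16x8 shape: A holds 64 elements, B holds 128 and C holds 32.
static_assert(mmul_shape<4, 16, 8>::size_A == 64, "size_A is M*K");
static_assert(mmul_shape<4, 16, 8>::size_B == 128, "size_B is K*N");
static_assert(mmul_shape<4, 16, 8>::size_C == 32, "size_C is M*N");
```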

Constructor & Destructor Documentation

mmul()[1/3]

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::mmul ( )
inline

Constructor. Data is undefined.

mmul()[2/3]

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::mmul(const accum_type &acc)
inline

Constructor. Data is initialized from the given accumulator.

Parameters
acc Accumulator data is initialized from.

mmul()[3/3]

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
template<typename T>
aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::mmul(const vector< T, M*N > &v, int shift = 0)
inline

Constructor. Data is initialized from the given vector.

Parameters
v Vector the data is initialized from.
shift Upshift in bits to be applied to input data. This parameter is ignored for floating-point types.

Member Function Documentation

mac()

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
template<unsigned ElemsA, unsigned ElemsB>
void aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::mac(const vector< TypeA, ElemsA > &a, const vector< TypeB, ElemsB > &b)
inline

Multiply the two given matrices and add the product to the current result.

Parameters
a Vector that represents the A input matrix. The number of elements must be M * K.
b Vector that represents the B input matrix. The number of elements must be K * N.

mul()

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
template<unsigned ElemsA, unsigned ElemsB>
void aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::mul(const vector< TypeA, ElemsA > &a, const vector< TypeB, ElemsB > &b)
inline

Initialize the result value with the multiplication of the two given matrices.

Parameters
a Vector that represents the A input matrix. The number of elements must be M * K.
b Vector that represents the B input matrix. The number of elements must be K * N.

operator accum_type()

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::operator accum_type() const
inline

Conversion operator to accumulator.

to_accum()

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
accum_type aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::to_accum() const
inline

Return the result of the multiplication as an accumulator.

to_vector()

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
template<typename T>
vector< T, M*N > aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::to_vector(int shift = 0) const
inline

Return the result of the multiplication as a vector.

Parameters
shift Downshift in bits to be applied to output data. This parameter is ignored for floating-point types.
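
The effect of the downshift on an integer result can be pictured with the scalar sketch below. It is only a model under stated assumptions, not the AIE API conversion: downshift_to_int16 is a hypothetical name, the shift is modelled as a truncating arithmetic right shift, and out-of-range values are saturated to the int16 range. Rounding and saturation behaviour on the device are configured separately and may differ from this model.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical scalar model of converting one accumulator lane to int16.
// Assumptions: truncating arithmetic right shift, then saturation.
inline std::int16_t downshift_to_int16(std::int64_t acc, int shift)
{
    assert(shift >= 0 && shift < 64);
    // Right shift of a negative value is arithmetic on mainstream compilers
    // and is guaranteed to be arithmetic from C++20 onwards.
    const std::int64_t shifted = acc >> shift;
    const std::int64_t lo = std::numeric_limits<std::int16_t>::min();
    const std::int64_t hi = std::numeric_limits<std::int16_t>::max();
    return static_cast<std::int16_t>(std::clamp(shifted, lo, hi));
}
```

A non-zero shift is typically useful when the inputs are fixed-point values, since the product of two such values carries the sum of their fractional bits.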

Member Data Documentation

K

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::K = K_Elems
static constexpr

Number of columns in matrix A, and number of rows in matrix B.

M

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::M = M_Elems
static constexpr

Number of rows in matrix A.

N

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::N = N_Elems
static constexpr

Number of columns in matrix B.

size_A

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::size_A = M*K
static constexpr

Number of elements in matrix A.

size_B

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::size_B = K*N
static constexpr

Number of elements in matrix B.

size_C

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::size_C = M*N
static constexpr

Number of elements in matrix C.