Overview

The AIE API encapsulates the matrix multiplication functionality in the aie::mmul class template. This class template is parametrized with the matrix multiplication shape (MxKxN), the data types and, optionally, the requested accumulation precision. The resulting class defines a function that performs the multiplication and a data type for the result that can be converted to an accumulator/vector. The function interprets the input vectors as matrices as described by the shape parameters.

The following code snippet shows a sample blocked multiplication using the aie::mmul class. The matrices are assumed to be pre-tiled as defined by the mmul shape (MxK for A, KxN for B, and MxN for C).

      #include <aie_api/aie.hpp>

      // rowA, colA and colB are counted in tiles, not in elements.
      template <unsigned M, unsigned K, unsigned N>
      void mmul_blocked(unsigned rowA, unsigned colA, unsigned colB,
                        const int16 * __restrict pA,
                        const int16 * __restrict pB,
                        int16 * __restrict pC)
      {
          using MMUL = aie::mmul<M, K, N, int16, int16>;

          // Each iteration of the two outer loops produces a 2x2 block of C tiles.
          for (unsigned z = 0; z < rowA; z += 2) chess_loop_range(2,) {
              int16 * __restrict pC1 = pC + (      z * colB + 0) * MMUL::size_C;
              int16 * __restrict pC2 = pC + ((z + 1) * colB + 0) * MMUL::size_C;

              for (unsigned j = 0; j < colB; j += 2) chess_loop_range(2,) {
                  // Two tile rows of A and two tile columns of B
                  const int16 * __restrict pA1 = pA + (      z * colA + 0) * MMUL::size_A;
                  const int16 * __restrict pA2 = pA + ((z + 1) * colA + 0) * MMUL::size_A;
                  const int16 * __restrict pB1 = pB + (0 * colB +       j) * MMUL::size_B;
                  const int16 * __restrict pB2 = pB + (0 * colB + (j + 1)) * MMUL::size_B;

                  aie::vector<int16, MMUL::size_A> A0 = aie::load_v<MMUL::size_A>(pA1); pA1 += MMUL::size_A;
                  aie::vector<int16, MMUL::size_A> A1 = aie::load_v<MMUL::size_A>(pA2); pA2 += MMUL::size_A;
                  aie::vector<int16, MMUL::size_B> B0 = aie::load_v<MMUL::size_B>(pB1); pB1 += MMUL::size_B * colB;
                  aie::vector<int16, MMUL::size_B> B1 = aie::load_v<MMUL::size_B>(pB2); pB2 += MMUL::size_B * colB;

                  // The first product initializes the four accumulators
                  MMUL C00; C00.mul(A0, B0);
                  MMUL C01; C01.mul(A0, B1);
                  MMUL C10; C10.mul(A1, B0);
                  MMUL C11; C11.mul(A1, B1);

                  // The remaining products along the shared dimension are accumulated
                  for (unsigned i = 1; i < colA; ++i) chess_prepare_for_pipelining chess_loop_range(3,) {
                      A0 = aie::load_v<MMUL::size_A>(pA1); pA1 += MMUL::size_A;
                      A1 = aie::load_v<MMUL::size_A>(pA2); pA2 += MMUL::size_A;
                      B0 = aie::load_v<MMUL::size_B>(pB1); pB1 += MMUL::size_B * colB;
                      B1 = aie::load_v<MMUL::size_B>(pB2); pB2 += MMUL::size_B * colB;

                      C00.mac(A0, B0);
                      C01.mac(A0, B1);
                      C10.mac(A1, B0);
                      C11.mac(A1, B1);
                  }

                  aie::store_v(pC1, C00.template to_vector<int16>()); pC1 += MMUL::size_C;
                  aie::store_v(pC1, C01.template to_vector<int16>()); pC1 += MMUL::size_C;
                  aie::store_v(pC2, C10.template to_vector<int16>()); pC2 += MMUL::size_C;
                  aie::store_v(pC2, C11.template to_vector<int16>()); pC2 += MMUL::size_C;
              }
          }
      }
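
The pre-tiled layout assumed by the snippet can be modeled in standard C++, without any AIE API calls. The sketch below is illustrative only: tile_row_major is a hypothetical helper that is not part of the AIE API, and it assumes that tiles are stored in row-major tile order with row-major elements inside each tile. That assumption matches the pointer arithmetic above, where the tile at tile-row r and tile-column c of A starts at (r * colA + c) * size_A.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the AIE API): rearranges a row-major
// rows x cols matrix into contiguous R x C tiles. Tiles are laid out in
// row-major tile order, and elements inside each tile are row-major too.
// Assumes rows % R == 0 and cols % C == 0.
template <unsigned R, unsigned C, typename T>
std::vector<T> tile_row_major(const std::vector<T>& src, unsigned rows, unsigned cols)
{
    std::vector<T> dst(src.size());
    const unsigned tile_cols = cols / C;  // number of tiles in one tile-row
    for (unsigned r = 0; r < rows; ++r) {
        for (unsigned c = 0; c < cols; ++c) {
            const unsigned tile   = (r / R) * tile_cols + (c / C);  // which tile
            const unsigned offset = (r % R) * C + (c % C);          // position inside it
            dst[std::size_t(tile) * R * C + offset] = src[std::size_t(r) * cols + c];
        }
    }
    return dst;
}
```

With this layout, A would be tiled with R = M and C = K, B with R = K and C = N, and the result C is produced in tiles with R = M and C = N.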

Classes
struct	aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >

Supported matrix multiplication shapes

Matrix multiplication modes for real types
Types	Supported shapes (MxKxN)
8b x 8b	4x8x4, 4x16x4, 8x8x4, 2x8x8, 4x8x8, 2x16x8, 4x16x8
16b x 8b	4x4x4, 8x4x4, 4x8x4, 4x4x8
8b x 16b	4x4x8, 4x4x4
16b x 16b	4x4x4, 2x4x8, 4x4x8, 4x2x8
32b x 16b	2x4x8, 4x4x4, 4x2x4, 2x2x4, 2x4x4, 4x4x2, 2x2x8
16b x 32b	4x2x2, 2x4x8, 4x4x4
32b x 32b	4x2x4, 2x2x2, 2x4x2, 2x8x2, 4x2x2, 4x4x2, 2x4x4
float	4x2x4, 2x2x2, 2x4x2, 2x8x2, 4x2x2, 4x4x2, 2x4x4

Matrix multiplication modes for complex types (c16b/c32b/cfloat represent complex types)
Types	Supported shapes (MxKxN)
16b x c16b	4x2x2, 4x4x4
16b x c32b	2x4x2, 2x4x4, 2x8x2, 4x4x2
c16b x 16b	2x2x4, 2x2x8, 2x4x4, 2x4x8, 4x2x4, 4x4x2, 4x4x4
c16b x c16b	2x2x2, 2x4x2, 2x8x2, 2x4x4, 4x2x2, 4x4x2, 4x2x4
c16b x 32b	2x2x2, 2x4x2, 2x8x2, 2x4x4, 4x2x2, 4x4x2, 4x2x4
c16b x c32b	2x2x2, 2x4x2
32b x c16b	2x2x2, 2x4x2, 2x8x2, 2x4x4, 4x2x2, 4x4x2, 4x2x4
32b x c32b	2x2x2, 2x4x2
c32b x 16b	2x4x2, 2x8x2, 2x4x4, 4x4x2
c32b x c16b	2x2x2, 2x4x2
c32b x 32b	1x2x2, 2x2x2, 2x4x2
c32b x c32b	1x2x2, 2x2x1, 2x2x2
float x cfloat	2x2x2, 2x4x2
cfloat x float	2x2x2, 2x4x2
cfloat x cfloat	2x2x2, 2x2x4, 2x4x2, 4x2x2

Class Documentation

◆ aie::mmul

struct aie::mmul

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>
struct aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >

Type that encapsulates a blocked matrix multiplication C = A x B.

Objects of this type encapsulate the current result of the multiplication. The first result is computed with the mul method. New multiplications can be accumulated using the mac method.
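
The mul/mac protocol described above can be illustrated with a scalar model in standard C++. This is a sketch under stated assumptions, not the AIE implementation: ref_mmul is a hypothetical stand-in, it uses int16 inputs with a 64-bit accumulator for simplicity, and it assumes row-major tiles.

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar stand-in for the mul/mac protocol (not the AIE API).
// Tiles are assumed row-major: A is M x K, B is K x N, the result C is M x N.
template <unsigned M, unsigned K, unsigned N>
struct ref_mmul {
    std::array<std::int64_t, M * N> acc{};  // wide accumulator, one entry per C element

    // C = A x B: discards any previous result.
    void mul(const std::array<std::int16_t, M * K>& a,
             const std::array<std::int16_t, K * N>& b)
    {
        acc.fill(0);
        mac(a, b);
    }

    // C += A x B: accumulates onto the current result.
    void mac(const std::array<std::int16_t, M * K>& a,
             const std::array<std::int16_t, K * N>& b)
    {
        for (unsigned m = 0; m < M; ++m)
            for (unsigned n = 0; n < N; ++n)
                for (unsigned k = 0; k < K; ++k)
                    acc[m * N + n] += std::int64_t(a[m * K + k]) * b[k * N + n];
    }
};
```

In a blocked multiplication, mul would be called once for the first pair of tiles along the shared dimension and mac for every following pair, so that one object ends up holding a complete MxN tile of C.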

Template Parameters

M_Elems	Rows in matrix A.
K_Elems	Columns in matrix A / Rows in matrix B.
N_Elems	Columns in matrix B.
TypeA	Type of the elements in matrix A. It must meet ElemBaseType.
TypeB	Type of the elements in matrix B. It defaults to TypeA. It must meet ElemBaseType.
AccumTag	Type of the elements of the accumulator that contains the results to be written in matrix C. It must meet AccumElemBaseType. If not specified, the default accumulation type for multiplications of TypeA x TypeB is used.


Public Types
using	accum_type = typename mmul_impl::accum_type

using	mmul_impl = detail::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, detail::to_native_accum_bits_for_mul_types_tag< TypeA, TypeB, AccumTag >()>

Public Member Functions
	mmul()

	mmul(const accum_type &acc)

template<typename T>
	mmul(const vector< T, M*N > &v, int shift=0)

template<unsigned ElemsA, unsigned ElemsB>
void	mac(const vector< TypeA, ElemsA > &a, const vector< TypeB, ElemsB > &b)

template<unsigned ElemsA, unsigned ElemsB>
void	mul(const vector< TypeA, ElemsA > &a, const vector< TypeB, ElemsB > &b)

	operator accum_type() const

accum_type	to_accum() const

template<typename T>
vector< T, M*N >	to_vector(int shift=0) const

Static Public Attributes
static constexpr unsigned	K = K_Elems

static constexpr unsigned	M = M_Elems

static constexpr unsigned	N = N_Elems

static constexpr unsigned	size_A = M*K

static constexpr unsigned	size_B = K*N

static constexpr unsigned	size_C = M*N

Constructor & Destructor Documentation

◆ mmul() [1/3]

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::mmul ( )

inline

Constructor. Data is undefined.

◆ mmul() [2/3]

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::mmul ( const accum_type & acc )

inline

Constructor. Data is initialized from the given accumulator.

Parameters

acc	Accumulator the data is initialized from.

◆ mmul() [3/3]

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

template<typename T>

aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::mmul	(	const vector< T, M*N > &	v,
		int	shift = 0
	)

inline

Constructor. Data is initialized from the given vector.

Parameters

v	Vector the data is initialized from.
shift	Upshift in bits to be applied to input data. This parameter is ignored for floating-point types.

Member Function Documentation

◆ mac()

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

template<unsigned ElemsA, unsigned ElemsB>

void aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::mac	(	const vector< TypeA, ElemsA > &	a,
		const vector< TypeB, ElemsB > &	b
	)

inline

Multiply the two given matrices and add the product to the current result.

Parameters

a	Vector that represents the A input matrix. The number of elements must be M * K.
b	Vector that represents the B input matrix. The number of elements must be K * N.

◆ mul()

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

template<unsigned ElemsA, unsigned ElemsB>

void aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::mul	(	const vector< TypeA, ElemsA > &	a,
		const vector< TypeB, ElemsB > &	b
	)

inline

Initialize the result value with the multiplication of the two given matrices.

Parameters

a	Vector that represents the A input matrix. The number of elements must be M * K.
b	Vector that represents the B input matrix. The number of elements must be K * N.

◆ operator accum_type()

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::operator accum_type ( ) const

inline

Conversion operator to accumulator.

◆ to_accum()

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

accum_type aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::to_accum ( ) const

inline

Return the result of the multiplication as an accumulator.

◆ to_vector()

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

template<typename T>

vector< T, M*N > aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::to_vector ( int shift = 0 ) const

inline

Return the result of the multiplication as a vector.

Parameters

shift	Downshift in bits to be applied to output data. This parameter is ignored for floating-point types.
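
The upshift applied by the vector constructor and the downshift applied by to_vector can be pictured with the scalar sketch below. The two functions are hypothetical illustrations in standard C++, not AIE API calls. The sketch assumes integer data, floor rounding, and a result that fits the output type; the rounding and saturation behavior of the actual hardware is not described on this page and may differ.

```cpp
#include <cstdint>

// Hypothetical scalar model of the shift parameters (not the AIE API).

// Upshift: applied when accumulator data is initialized from vector data.
inline std::int64_t upshift(std::int16_t v, int shift)
{
    return std::int64_t(v) * (std::int64_t(1) << shift);
}

// Downshift: applied when accumulator data is converted back to a vector.
// Assumes floor rounding (arithmetic right shift, guaranteed since C++20 and
// the behavior of mainstream compilers before that) and no overflow of the
// 16-bit output.
inline std::int16_t downshift(std::int64_t acc, int shift)
{
    return static_cast<std::int16_t>(acc >> shift);
}
```

Using the same shift value on the way in and on the way out returns the original value in this model, which is why a fixed-point scale chosen for the inputs is typically reused when extracting the result.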

Member Data Documentation

◆ K

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::K = K_Elems

static constexpr

Number of columns in matrix A, and number of rows in matrix B.

◆ M

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::M = M_Elems

static constexpr

Number of rows in matrix A.

◆ N

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::N = N_Elems

static constexpr

Number of columns in matrix B.

◆ size_A

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::size_A = M*K

static constexpr

Number of elements in matrix A.

◆ size_B

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::size_B = K*N

static constexpr

Number of elements in matrix B.

◆ size_C

template<unsigned M_Elems, unsigned K_Elems, unsigned N_Elems, ElemBaseType TypeA, ElemBaseType TypeB = TypeA, AccumElemBaseType AccumTag = accauto>

constexpr unsigned aie::mmul< M_Elems, K_Elems, N_Elems, TypeA, TypeB, AccumTag >::size_C = M*N

static constexpr

Number of elements in matrix C.
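
The relationship between the shape parameters and the size members can be checked at compile time. The sketch below uses a hypothetical stand-in, mmul_shape, that mirrors only the static members documented here. It is not the AIE API type and carries no multiplication logic.

```cpp
// Hypothetical stand-in (not the AIE API) that mirrors the static members
// of aie::mmul for a given MxKxN shape.
template <unsigned M_Elems, unsigned K_Elems, unsigned N_Elems>
struct mmul_shape {
    static constexpr unsigned M = M_Elems;  // rows in A
    static constexpr unsigned K = K_Elems;  // columns in A / rows in B
    static constexpr unsigned N = N_Elems;  // columns in B

    static constexpr unsigned size_A = M * K;  // elements in one A tile
    static constexpr unsigned size_B = K * N;  // elements in one B tile
    static constexpr unsigned size_C = M * N;  // elements in one C tile
};

// Example with the 4x16x8 shape listed for 8b x 8b multiplication.
static_assert(mmul_shape<4, 16, 8>::size_A == 64,  "A tile holds M*K elements");
static_assert(mmul_shape<4, 16, 8>::size_B == 128, "B tile holds K*N elements");
static_assert(mmul_shape<4, 16, 8>::size_C == 32,  "C tile holds M*N elements");
```

These are the element counts that the input vectors passed to mul and mac, and the vector returned by to_vector, are expected to have for that shape.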
