Linear Algebra

gemm

template<typename AlphaType, typename OCMShapeA, typename OCMShapeB, typename OCMShapeC, typename BetaType>
void qmath::gemm(AlphaType alpha, OCMShapeA &ocmA, OCMShapeB &ocmB, BetaType beta, OCMShapeC &ocmC, OCMShapeC &ocmOut)

Calculate general matrix multiplication (gemm): out = alpha*A*B + beta*C. See https://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms#Level_3.

Template Parameters
  • AlphaType: Type for the scalar alpha

  • OCMShapeA: Type for the matrix A (In OCM)

  • OCMShapeB: Type for the matrix B (In OCM)

  • OCMShapeC: Type for the matrix C (In OCM)

  • BetaType: Type for the scalar beta

Parameters
  • alpha: scalar alpha

  • ocmA: matrix A

  • ocmB: matrix B

  • beta: scalar beta

  • ocmC: matrix C

  • ocmOut: output matrix
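The gemm semantics can be sketched as a plain scalar reference in C++. The function name `gemm_ref`, the row-major `std::vector<float>` layout, and the explicit m/k/n dimensions are assumptions for illustration only; the actual `qmath::gemm` operates on OCM tensor types.

```cpp
#include <vector>

// Hypothetical scalar reference for the gemm semantics: out = alpha*A*B + beta*C.
// Matrices are row-major: A is m x k, B is k x n, C and out are m x n.
std::vector<float> gemm_ref(float alpha,
                            const std::vector<float>& A,
                            const std::vector<float>& B,
                            float beta,
                            const std::vector<float>& C,
                            int m, int k, int n) {
    std::vector<float> out(m * n, 0.0f);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[i * k + p] * B[p * n + j];  // (A*B)[i][j]
            out[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
    return out;
}
```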

solve

template<typename ocmAType, typename ocmBType, typename ocmXType>
void qmath::solve(ocmAType &ocmA, ocmBType &ocmB, ocmXType &ocmX, bool useCholesky = false)

Solves multiple independent matrix equations of the form Ax=b, where A is a non-singular small square matrix (matrixSize <= 20).

*  +---------------+
*  |  ocmA Tensor  |
*  +---------------+
*                      A matrices for each problem are stacked in x direction
*                      ----------->
*                   |  +------------------------------+-------------------------------+
*                   |  |                              |                               |
*                   |  |                              |                               |
*   matrix rows in  |  |<---------------------------->|<----------------------------->|
*     y direction   |  |       # problems in Roi      |       # problems in Roi       |
*                   |  |     (numCores x numCores)    |     (numCores x numCores)     |
*                   v  |                              |                               |
*                      +------------------------------+-------------------------------+
*
*                    matrix columns are in z direction (into page) with alternating
*                    real (even indices) and imaginary (odd indices) components.
*                    Thus, z dimension is of length 2 * matrixSize.
*
*
*  +---------------+
*  |  ocmB / ocmX  |
*  +---------------+
*                      b / x vectors for each problem are stacked in x direction
*                      ----------->
*                   |  +------------------------------+-------------------------------+
*                   |  |                              |                               |
*                   |  |                              |                               |
*  vector elements  |  |<---------------------------->|<----------------------------->|
*   in y direction  |  |       # problems in Roi      |       # problems in Roi       |
*                   |  |     (numCores x numCores)    |     (numCores x numCores)     |
*                   v  |                              |                               |
*                      +------------------------------+-------------------------------+
*
*                    real (index = 0) and imaginary (index = 1) components are in z direction (into page).
*
*
*
*
*  For example, suppose we have 100 Ax=b problems to solve, where each A is a 4x4 complex matrix. We need
*  to store all 100 matrices into one tensor. First, consider the first matrix, A1. Suppose A1 looks
*  like this:
*
*       ┌──────┬──────┬──────┬──────┐
*       │ 1+17i│ 2+18i│ 3+19i│ 4+20i│
*       ├──────┼──────┼──────┼──────┤
*       │ 5+21i│ 6+22i│ 7+23i│ 8+24i│
*  A1 = ├──────┼──────┼──────┼──────┤
*       │ 9+25i│10+26i│11+27i│12+28i│
*       ├──────┼──────┼──────┼──────┤
*       │13+29i│14+30i│15+31i│16+32i│
*       └──────┴──────┴──────┴──────┘
*
*  We first have to separate the real and imaginary parts. We transform A1 into a 4 x 8 matrix that has double
*  the number of columns. The real and imaginary parts of each original column now become their own individual
*  columns, like so:
*
*      ┌────►z
*      │  Re Im Re Im Re Im Re Im
*      │ ┌──┬──┬──┬──┬──┬──┬──┬──┐
*      ▼ │ 1│17│ 2│18│ 3│19│ 4│20│
*      y ├──┼──┼──┼──┼──┼──┼──┼──┤
*        │ 5│21│ 6│22│ 7│23│ 8│24│
*  A1 =  ├──┼──┼──┼──┼──┼──┼──┼──┤
*        │ 9│25│10│26│11│27│12│28│
*        ├──┼──┼──┼──┼──┼──┼──┼──┤
*        │13│29│14│30│15│31│16│32│
*        └──┴──┴──┴──┴──┴──┴──┴──┘
*
*  The rows and columns of A1 will span the y and z directions of the final tensor that we will construct.
*
*  Now, we apply this transformation to each matrix A1 through A100 and stack the results in the x direction.
*  Thus, the final tensor has dimensions (z, y, x) = (8, 4, 100).
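The per-matrix interleaving step described above can be sketched in plain C++. The helper name `interleave_complex` and the flat row-major `std::vector` representation are assumptions for illustration; building the full (8, 4, 100) tensor would then stack one such slab per problem in the x direction.

```cpp
#include <complex>
#include <vector>

// Hypothetical helper for the interleaving step: one matrixSize x matrixSize
// complex matrix (row-major) becomes a matrixSize x (2*matrixSize) real matrix
// whose even z-indices hold real parts and odd z-indices imaginary parts.
std::vector<float> interleave_complex(const std::vector<std::complex<float>>& A, int n) {
    std::vector<float> out(n * 2 * n);
    for (int row = 0; row < n; ++row) {
        for (int col = 0; col < n; ++col) {
            out[row * 2 * n + 2 * col]     = A[row * n + col].real();  // even index: Re
            out[row * 2 * n + 2 * col + 1] = A[row * n + col].imag();  // odd index:  Im
        }
    }
    return out;
}
```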
*
*  Note: This implementation assumes the input matrices are non-singular; it does not verify this assumption,
*  nor does it notify the user if it is violated. A check or flag will be added in the future.
*
*

Parameters
  • ocmA: A matrices in Ax=b. Input tensor with dimensions 1 x (2[=real+imaginary]*matrixSize) x matrixSize x numProblems

  • ocmB: b vectors in Ax=b. Input tensor with dimensions 1 x 2[=real+imaginary] x matrixSize x numProblems

  • ocmX: x vectors in Ax=b. Output tensor with dimensions 1 x 2[=real+imaginary] x matrixSize x numProblems

  • useCholesky: boolean value to determine which solver method to use:

    • (1) false: uses Gaussian elimination to solve Ax = b. This works for general square complex matrices A.

    • (2) true: performs a Cholesky decomposition of A to form A = LL^H, then uses this decomposition to solve Ax=b. This only works for Hermitian positive-definite matrices. When applicable, compute cycles are roughly halved compared to the Gaussian elimination method.
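The useCholesky=false path can be sketched as a scalar reference solving a single system. The function name `solve_one` and the flat `std::complex<float>` storage are assumptions for illustration; the actual implementation solves numProblems such systems in parallel across the core array.

```cpp
#include <cmath>
#include <complex>
#include <utility>
#include <vector>

using cfloat = std::complex<float>;

// Hypothetical scalar sketch of the Gaussian elimination path: solve one
// n x n complex system Ax = b with partial pivoting. A is row-major.
std::vector<cfloat> solve_one(std::vector<cfloat> A, std::vector<cfloat> b, int n) {
    for (int k = 0; k < n; ++k) {
        // Partial pivoting: bring the row with the largest |pivot| to row k.
        int piv = k;
        for (int r = k + 1; r < n; ++r)
            if (std::abs(A[r * n + k]) > std::abs(A[piv * n + k])) piv = r;
        for (int c = 0; c < n; ++c) std::swap(A[k * n + c], A[piv * n + c]);
        std::swap(b[k], b[piv]);
        // Eliminate the entries below the pivot.
        for (int r = k + 1; r < n; ++r) {
            cfloat f = A[r * n + k] / A[k * n + k];
            for (int c = k; c < n; ++c) A[r * n + c] -= f * A[k * n + c];
            b[r] -= f * b[k];
        }
    }
    // Back substitution on the resulting upper-triangular system.
    std::vector<cfloat> x(n);
    for (int r = n - 1; r >= 0; --r) {
        cfloat s = b[r];
        for (int c = r + 1; c < n; ++c) s -= A[r * n + c] * x[c];
        x[r] = s / A[r * n + r];
    }
    return x;
}
```

The useCholesky=true path would replace the elimination phase with A = LL^H followed by two triangular solves, which roughly halves the work for Hermitian positive-definite A.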