MatMulInteger#

MatMulInteger - 10#

Version

  • name: MatMulInteger (GitHub)

  • domain: main

  • since_version: 10

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 10.

Summary

Matrix product that behaves like numpy.matmul: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.matmul.html. The production MUST never overflow. The accumulation may overflow if and only if in 32 bits.

Inputs

Between 2 and 4 inputs.

  • A (heterogeneous) - T1: N-dimensional matrix A

  • B (heterogeneous) - T2: N-dimensional matrix B

  • a_zero_point (optional, heterogeneous) - T1: Zero point tensor for input ‘A’. It’s optional and default value is 0. It could be a scalar or N-D tensor. Scalar refers to per tensor quantization whereas N-D refers to per row quantization. If the input is 2D of shape [M, K] then zero point tensor may be an M element vector [zp_1, zp_2, …, zp_M]. If the input is N-D tensor with shape [D1, D2, M, K] then zero point tensor may have shape [D1, D2, M, 1].

  • b_zero_point (optional, heterogeneous) - T2: Zero point tensor for input ‘B’. It’s optional and default value is 0. It could be a scalar or a N-D tensor, Scalar refers to per tensor quantization whereas N-D refers to per col quantization. If the input is 2D of shape [K, N] then zero point tensor may be an N element vector [zp_1, zp_2, …, zp_N]. If the input is N-D tensor with shape [D1, D2, K, N] then zero point tensor may have shape [D1, D2, 1, N].

Outputs

  • Y (heterogeneous) - T3: Matrix multiply results from A * B

Type Constraints

  • T1 in ( tensor(int8), tensor(uint8) ): Constrain input A data type to 8-bit integer tensor.

  • T2 in ( tensor(int8), tensor(uint8) ): Constrain input B data type to 8-bit integer tensor.

  • T3 in ( tensor(int32) ): Constrain output Y data type as 32-bit integer tensor.

Examples

default

node = onnx.helper.make_node('MatMulInteger',
    inputs=['A', 'B', 'a_zero_point', 'b_zero_point'],
    outputs=['Y'],)

A = np.array([[11, 7, 3],
    [10, 6, 2],
    [9, 5, 1],
    [8, 4, 0], ], dtype=np.uint8)

a_zero_point = np.array([12], dtype=np.uint8)

B = np.array([[1, 4],
    [2, 5],
    [3, 6], ], dtype=np.uint8)

b_zero_point = np.array([0], dtype=np.uint8)

output = np.array([[-38, -83],
    [-44, -98],
    [-50, -113],
    [-56, -128], ], dtype=np.int32)

expect(node, inputs=[A, B, a_zero_point, b_zero_point], outputs=[output],
       name='test_matmulinteger')