com.microsoft - DequantizeBFP

DequantizeBFP - 1 (com.microsoft)

Version

  • name: DequantizeBFP

  • domain: com.microsoft

  • since_version: 1

  • function:

  • support_level:

  • shape inference:

This version of the operator has been available since version 1 of domain com.microsoft.

Summary

The block floating point (BFP) dequantization operator. It consumes the raw BFP data together with metadata describing the original tensor (its shape and strides) and computes the dequantized tensor. More documentation on the BFP format can be found in this paper: https://www.microsoft.com/en-us/research/publication/pushing-the-limits-of-narrow-precision-inferencing-at-cloud-scale-with-microsoft-floating-point/
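To make the idea concrete, here is a minimal sketch of dequantizing a single block in a generic block floating point scheme, where all values in a block share one exponent and each value keeps its own small signed integer mantissa. This is an illustration only: the function name, its parameters, and the scaling convention are assumptions, and the sketch does not reproduce the byte layout this operator actually expects in `x`.

```python
def dequantize_block(shared_exponent, mantissas, mantissa_bits):
    """Dequantize one block of a generic BFP encoding (hypothetical helper).

    Assumed convention: each mantissa is a signed integer with
    `mantissa_bits` bits (sign included), and the block value is
    mantissa * 2 ** (shared_exponent - (mantissa_bits - 1)).
    """
    scale = 2.0 ** (shared_exponent - (mantissa_bits - 1))
    return [m * scale for m in mantissas]


# One block of four values sharing exponent 0, with 4-bit signed mantissas.
block = dequantize_block(shared_exponent=0, mantissas=[4, -2, 7, 0], mantissa_bits=4)
print(block)  # [0.5, -0.25, 0.875, 0.0]
```

Because the exponent is stored once per block rather than once per value, the per-element cost is only the mantissa, which is where the storage savings of BFP come from.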

Attributes

  • bfp_type (required): The type of BFP; must match the BFPType enum. Default value is ?.

  • block_dim: Each bounding box spans this dimension. Typically, the block dimension corresponds to the reduction dimension of the matrix multiplication that consumes the output of this operator. For example, for a 2D matrix multiplication A@W, QuantizeBFP(A) would use block_dim 1 and QuantizeBFP(W) would use block_dim 0. The default is the last dimension. Default value is ?.

  • dtype: The datatype to dequantize to. Default value is ?.
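The effect of block_dim can be sketched by listing which elements end up in the same bounding box. The helpers below are hypothetical and not part of any library; `box_size` is an assumed parameter (in practice the box size would presumably follow from bfp_type), and representing "the last dimension" as -1 is likewise an assumption.

```python
from itertools import product


def normalize_axis(block_dim, rank):
    """Map a possibly negative axis to the range [0, rank)."""
    if not -rank <= block_dim < rank:
        raise ValueError(f"block_dim {block_dim} out of range for rank {rank}")
    return block_dim % rank


def bounding_boxes(shape, block_dim=-1, box_size=2):
    """Return a list of boxes; each box is a list of multi-indices
    that are consecutive along block_dim."""
    axis = normalize_axis(block_dim, len(shape))
    outer_ranges = [range(n) for i, n in enumerate(shape) if i != axis]
    boxes = []
    for outer in product(*outer_ranges):
        for start in range(0, shape[axis], box_size):
            stop = min(start + box_size, shape[axis])
            box = []
            for k in range(start, stop):
                index = list(outer)
                index.insert(axis, k)  # put the block coordinate back in place
                box.append(tuple(index))
            boxes.append(box)
    return boxes


# A is (M=2, K=4) and W is (K=4, N=3); K is the reduction dimension of A@W.
print(bounding_boxes((2, 4), block_dim=1)[0])  # [(0, 0), (0, 1)]
print(bounding_boxes((4, 3), block_dim=0)[0])  # [(0, 0), (1, 0)]
```

In both cases each box runs along K, so the elements multiplied together in one dot product of A@W share exponents on both operands.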

Inputs

  • x (heterogeneous) - T1: 1-D, contiguous, raw BFP data to be dequantized.

  • shape (heterogeneous) - T2: shape of the original tensor.

  • strides (heterogeneous) - T2: strides of the original tensor.
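As a rough illustration of what the shape and strides inputs describe, the sketch below computes row-major strides for a contiguous tensor and uses them to locate an element. The helper names are hypothetical, and the sketch assumes strides are counted in elements rather than bytes, which this page does not specify.

```python
def contiguous_strides(shape):
    """Row-major (C-order) strides, in elements, for a contiguous tensor."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def element_offset(index, strides):
    """Linear offset of a multi-index, given per-dimension strides."""
    return sum(i * s for i, s in zip(index, strides))


shape = [2, 3, 4]
strides = contiguous_strides(shape)
print(strides)                             # [12, 4, 1]
print(element_offset((1, 2, 3), strides))  # 23
```

Passing both values lets the operator rebuild the original layout from the flat buffer in x.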

Outputs

  • y (heterogeneous) - T3: dequantized tensor.

Examples