.. _l-onnx-doccom.microsoft-QuantizeBFP:

===========================
com.microsoft - QuantizeBFP
===========================

.. contents::
    :local:


.. _l-onnx-opcom-microsoft-quantizebfp-1:

QuantizeBFP - 1 (com.microsoft)
===============================

**Version**

* **name**: `QuantizeBFP (GitHub) <https://github.com/onnx/onnx/blob/main/docs/Operators.md#com.microsoft.QuantizeBFP>`_
* **domain**: **com.microsoft**
* **since_version**: **1**
* **function**:
* **support_level**:
* **shape inference**:

This version of the operator has been available
**since version 1 of domain com.microsoft**.

**Summary**

The block floating point (BFP) quantization operator. It consumes a full precision tensor and computes a BFP tensor.
More documentation on the BFP format can be found in this paper: https://www.microsoft.com/en-us/research/publication/pushing-the-limits-of-narrow-precision-inferencing-at-cloud-scale-with-microsoft-floating-point/
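
The general idea behind block floating point can be illustrated with a minimal sketch: the values in a block share one exponent, and each value keeps only a small signed mantissa. The helper names ``quantize_block`` and ``dequantize_block`` and the ``mantissa_bits`` parameter are hypothetical and chosen for illustration. This is not the operator's actual bit layout or rounding mode, which depend on ``bfp_type``.

```python
import math


def quantize_block(values, mantissa_bits=4):
    """Quantize one block to signed integer mantissas with a shared exponent.

    Illustrative only: the real layout is defined by the BFPType enum.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0] * len(values)
    # frexp gives max_abs = m * 2**shared_exp with 0.5 <= m < 1.
    _, shared_exp = math.frexp(max_abs)
    # One bit is reserved for the sign; the rest hold the magnitude.
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return shared_exp, mantissas


def dequantize_block(shared_exp, mantissas, mantissa_bits=4):
    """Rebuild approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]


exp_a, mant_a = quantize_block([1.0, 0.5, -0.25, 0.0])
restored_a = dequantize_block(exp_a, mant_a)

# A block with one large value loses precision on its small values.
exp_b, mant_b = quantize_block([3.0, 0.1])
restored_b = dequantize_block(exp_b, mant_b)
```

In the second block the value ``0.1`` is rounded to zero because the shared exponent is set by the largest magnitude in the block.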

**Attributes**

* **bfp_type** (required):
  The type of BFP. Must match the BFPType enum. Default value is ``?``.
* **block_dim**:
  Each bounding box spans this dimension. Typically, the block
  dimension corresponds to the reduction dimension of the matrix
  multiplication that consumes the output of this operator. For example,
  for a 2D matrix multiplication A@W, QuantizeBFP(A) would use
  block_dim 1 and QuantizeBFP(W) would use block_dim 0. The default is
  the last dimension. Default value is ``?``.
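
The choice of ``block_dim`` for ``A@W`` can be sketched in plain Python. The helper ``blocks_along`` is hypothetical and only handles 2-D nested lists. For simplicity each slice spans the whole dimension, whereas the real bounding box size is determined by ``bfp_type``.

```python
def blocks_along(matrix, block_dim):
    """Return the 1-D slices of a 2-D nested list that run along block_dim."""
    if block_dim == 1:
        # Slices run along columns, so each row is one slice.
        return [list(row) for row in matrix]
    if block_dim == 0:
        # Slices run along rows, so each column is one slice.
        return [list(col) for col in zip(*matrix)]
    raise ValueError("this sketch only handles 2-D inputs")


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]   # shape (2, 3): M=2, K=3
W = [[0.1, 0.2],
     [0.3, 0.4],
     [0.5, 0.6]]        # shape (3, 2): K=3, N=2

a_blocks = blocks_along(A, block_dim=1)  # rows of A
w_blocks = blocks_along(W, block_dim=0)  # columns of W
```

Every slice in ``a_blocks`` and ``w_blocks`` has length ``K``, the dimension that ``A@W`` reduces over, so each dot product pairs one slice of ``A`` with one slice of ``W``.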

**Inputs**

* **x** (heterogeneous) - **T1**:
  N-D full precision input tensor to be quantized.

**Outputs**

* **y** (heterogeneous) - **T2**:
  1-D, contiguous BFP data.
* **shape** (heterogeneous) - **T3**:
  Shape of x.
* **strides** (heterogeneous) - **T3**:
  Strides of x.
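
Because ``y`` is flattened to 1-D, the ``shape`` and ``strides`` outputs carry the layout of the original tensor. The sketch below shows how strides relate to a shape for a contiguous row-major tensor. It assumes strides are counted in elements, which this page does not state, and the helper names ``contiguous_strides`` and ``flat_offset`` are hypothetical.

```python
def contiguous_strides(shape):
    """Element strides of a contiguous row-major tensor with this shape."""
    strides = [1] * len(shape)
    # Walk from the second-to-last axis back to the first.
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def flat_offset(index, strides):
    """Position of a multi-dimensional index in the flattened tensor."""
    return sum(i * s for i, s in zip(index, strides))


shape = [2, 3, 4]
strides = contiguous_strides(shape)
last = flat_offset([1, 2, 3], strides)
```

Here ``last`` is the offset of the final element, one less than the total element count of the tensor.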

**Examples**