.. _l-onnx-doccom.microsoft-QuantizeWithOrder:

=================================
com.microsoft - QuantizeWithOrder
=================================

.. contents::
    :local:


.. _l-onnx-opcom-microsoft-quantizewithorder-1:

QuantizeWithOrder - 1 (com.microsoft)
=====================================

**Version**

* **name**: `QuantizeWithOrder (GitHub) <https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#com.microsoft.QuantizeWithOrder>`_
* **domain**: **com.microsoft**
* **since_version**: **1**
* **function**:
* **support_level**:
* **shape inference**:

This version of the operator has been available
**since version 1 of domain com.microsoft**.

**Summary**

Quantizes the input matrix into a specific layout used by cuBLASLt.

**Attributes**

* **order_input** (required):
  cublasLt order of input matrix. ORDER_COL = 0, ORDER_ROW = 1,
  ORDER_COL32 = 2, ORDER_COL4_4R2_8C = 3, ORDER_COL32_2R_4R4 = 4.
  See
  https://docs.nvidia.com/cuda/cublas/index.html#cublasLtOrder_t for
  their meaning. Default value is ``?``.
* **order_output** (required):
  cublasLt order of output matrix. Default value is ``?``.
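
The integer codes listed for ``order_input`` can be kept readable in client
code with a small enumeration. This is only an illustrative sketch: the class
name ``CublasLtOrder`` and the ``attributes`` dictionary are hypothetical
helpers and are not part of ONNX Runtime. It also assumes that
``order_output`` accepts the same codes as ``order_input``.

.. code-block:: python

    from enum import IntEnum


    class CublasLtOrder(IntEnum):
        """Hypothetical helper mirroring the order codes listed above."""

        ORDER_COL = 0
        ORDER_ROW = 1
        ORDER_COL32 = 2
        ORDER_COL4_4R2_8C = 3
        ORDER_COL32_2R_4R4 = 4


    # Attribute values are plain integers, so convert the enum members
    # before handing them to whatever builds the node.
    attributes = {
        "order_input": int(CublasLtOrder.ORDER_ROW),
        "order_output": int(CublasLtOrder.ORDER_COL32),
    }
    print(attributes)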

**Inputs**

* **input** (heterogeneous) - **F**:
  TODO: input tensor of shape (ROWS, COLS). If it has fewer than 2
  dimensions, it is broadcast to (1, X). If it is 3-D, it is treated as
  (B, ROWS, COLS).
* **scale_input** (heterogeneous) - **S**:
  scale of the input
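
The shape rules described for ``input`` can be sketched as a small helper that
maps any accepted shape to a ``(B, ROWS, COLS)`` triple. The function name
``as_batch_rows_cols`` is hypothetical. Treating a scalar as ``(1, 1)`` and
rejecting shapes with more than 3 dimensions are assumptions made here; the
operator description does not state either.

.. code-block:: python

    def as_batch_rows_cols(shape):
        """Interpret a tensor shape as (B, ROWS, COLS)."""
        dims = tuple(shape)
        if len(dims) == 0:
            # Assumption: a scalar is treated as a 1 x 1 matrix.
            return (1, 1, 1)
        if len(dims) == 1:
            # Fewer than 2 dimensions: broadcast (X,) to (1, X).
            return (1, 1, dims[0])
        if len(dims) == 2:
            # A plain (ROWS, COLS) matrix is a batch of one.
            return (1, dims[0], dims[1])
        if len(dims) == 3:
            # Already (B, ROWS, COLS).
            return dims
        # Assumption: higher ranks are not supported.
        raise ValueError(f"unsupported rank {len(dims)} for shape {dims}")


    print(as_batch_rows_cols((5,)))       # (1, 1, 5)
    print(as_batch_rows_cols((3, 4)))     # (1, 3, 4)
    print(as_batch_rows_cols((2, 3, 4)))  # (2, 3, 4)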

**Outputs**

* **output** (heterogeneous) - **Q**:
  output tensor

**Examples**
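
The following is a minimal sketch of the arithmetic only, not the operator's
actual kernel. It assumes symmetric 8-bit quantization, in which each value is
divided by ``scale_input``, rounded to the nearest integer, and saturated to
the ``int8`` range. The page does not spell out the rounding and saturation
rules, so both are assumptions. The cuBLASLt memory layout selected by
``order_output`` is not modelled here, and the function name
``quantize_to_int8`` is hypothetical.

.. code-block:: python

    def quantize_to_int8(values, scale):
        """Map floats to int8 codes: round(value / scale), saturated to [-128, 127]."""
        if scale <= 0:
            raise ValueError("scale must be positive")
        codes = []
        for value in values:
            # Python's round() sends exact halves to the nearest even integer.
            rounded = round(value / scale)
            codes.append(max(-128, min(127, rounded)))
        return codes


    # One row of a matrix; 100.0 / 0.25 = 400 saturates to 127.
    row = [0.0, 0.5, -0.5, 100.0]
    print(quantize_to_int8(row, scale=0.25))  # [0, 2, -2, 127]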