.. _l-onnx-doccom.microsoft-QuantizeWithOrder:

=================================
com.microsoft - QuantizeWithOrder
=================================

.. contents::
    :local:

.. _l-onnx-opcom-microsoft-quantizewithorder-1:

QuantizeWithOrder - 1 (com.microsoft)
=====================================

**Version**

* **name**: `QuantizeWithOrder (GitHub) `_
* **domain**: **com.microsoft**
* **since_version**: **1**
* **function**:
* **support_level**:
* **shape inference**:

This version of the operator has been available
**since version 1 of domain com.microsoft**.

**Summary**

Quantize the input matrix to a specific layout used in cuBLASLt.

**Attributes**

* **order_input** (required):
  cublasLt order of the input matrix. ORDER_COL = 0, ORDER_ROW = 1,
  ORDER_COL32 = 2, ORDER_COL4_4R2_8C = 3, ORDER_COL32_2R_4R4 = 4.
  Please refer to
  https://docs.nvidia.com/cuda/cublas/index.html#cublasLtOrder_t
  for their meaning. Default value is ``?``.
* **order_output** (required):
  cublasLt order of the output matrix. Default value is ``?``.

**Inputs**

* **input** (heterogeneous) - **F**:
  TODO: input tensor of shape (ROWS, COLS). If it has fewer than 2
  dimensions, it is broadcast to (1, X). If it is 3-D, it is treated as
  (B, ROWS, COLS).
* **scale_input** (heterogeneous) - **S**:
  Scale of the input.

**Outputs**

* **output** (heterogeneous) - **Q**:
  Output tensor.

**Examples**
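
The following is a minimal pure-Python sketch of what the description above suggests, not the actual kernel. The order values and the shape rules come from the attribute and input descriptions. The quantization formula (symmetric int8, ``round(x / scale)`` clamped to ``[-128, 127]``) is an assumption, since the page does not state it. The helper names ``CublasLtOrder``, ``normalize_shape``, ``quantize_int8`` and ``row_to_col`` are hypothetical. Only the ORDER_ROW to ORDER_COL reordering is shown, because the tiled layouts (ORDER_COL32 and the others) are defined in the cuBLASLt documentation and are not reproduced here.

```python
from enum import IntEnum


class CublasLtOrder(IntEnum):
    """Order values as listed in the order_input attribute description."""
    ORDER_COL = 0
    ORDER_ROW = 1
    ORDER_COL32 = 2
    ORDER_COL4_4R2_8C = 3
    ORDER_COL32_2R_4R4 = 4


def normalize_shape(shape):
    """Apply the shape rules from the input description.

    Fewer than 2 dimensions: broadcast to (1, X).
    2-D is (ROWS, COLS); 3-D is (B, ROWS, COLS). Both are returned unchanged.
    """
    shape = tuple(shape)
    if len(shape) < 2:
        x = shape[0] if shape else 1
        return (1, x)
    if len(shape) in (2, 3):
        return shape
    raise ValueError("expected a tensor with at most 3 dimensions")


def quantize_int8(values, scale):
    """ASSUMPTION: symmetric int8 quantization, round(v / scale) with saturation.

    Python's round() rounds halves to the nearest even integer.
    """
    return [max(-128, min(127, round(v / scale))) for v in values]


def row_to_col(values, rows, cols):
    """Reorder a row-major (ORDER_ROW) buffer into column-major (ORDER_COL)."""
    return [values[r * cols + c] for c in range(cols) for r in range(rows)]


# Usage: a 2 x 3 row-major matrix, quantized and then stored column-major.
matrix = [0.5, -1.0, 100.0, 0.26, 0.0, -100.0]
quantized = quantize_int8(matrix, scale=0.5)
print(normalize_shape((6,)))        # (1, 6)
print(row_to_col(quantized, 2, 3))  # quantized values in ORDER_COL layout
```

Keeping quantization and reordering as separate steps makes each one easy to check in isolation. A GPU implementation would more likely fuse them into a single pass.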