MaxUnpool

MaxUnpool - 11

Version

  • name: MaxUnpool (GitHub)

  • domain: main

  • since_version: 11

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 11.

Summary

MaxUnpool essentially computes the partial inverse of the MaxPool op.

The input information to this op is typically the output information from a MaxPool op. The first input tensor X is the tensor that needs to be unpooled, which is typically the pooled tensor (first output) from MaxPool. The second input tensor, I, contains the indices of the (locally maximal) elements corresponding to the elements in the first input tensor X. Input tensor I is typically the second output of the MaxPool op. The third (optional) input is a tensor that specifies the output size of the unpooling operation.

MaxUnpool is intended to compute a ‘partial’ inverse of the MaxPool op. It is ‘partial’ because all the non-maximal values from the original input to MaxPool are set to zero in the output of the MaxUnpool op. Pooling the result of an unpooling operation should give back the original input to the unpooling op.
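The scatter behaviour described above can be sketched in NumPy. This is an illustration only, not the reference implementation: the helper name max_unpool_sketch is hypothetical, and it assumes the indices address the flattened output tensor directly.

```python
import numpy as np

def max_unpool_sketch(x, indices, output_shape):
    """Scatter the values of x into a zero tensor of output_shape.

    Assumption: indices holds flat (row-major) positions in the output.
    """
    out = np.zeros(int(np.prod(output_shape)), dtype=x.dtype)
    # Each pooled value goes back to the position it was taken from;
    # every other position keeps its zero.
    out[indices.ravel()] = x.ravel()
    return out.reshape(output_shape)

x = np.array([[[[1, 2], [3, 4]]]], dtype=np.float32)
idx = np.array([[[[5, 7], [13, 15]]]], dtype=np.int64)
y = max_unpool_sketch(x, idx, (1, 1, 4, 4))
print(y[0, 0])
```

Only the four indexed positions are non-zero in y, which is why the inverse is only partial.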

Several input sizes to MaxPool can produce the same pooled output size, which makes the unpooling op ambiguous. The third input argument, output_shape, is meant to disambiguate the op and produce an output tensor of known/predictable size.
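The ambiguity can be shown with a little arithmetic. The sketch below assumes the usual floor-mode pooling size formula with no dilation; pooled_size is a hypothetical helper, not part of ONNX.

```python
def pooled_size(in_size, kernel, stride, pad_total=0):
    """Size of one pooled axis, assuming floor mode and no dilation."""
    return (in_size + pad_total - kernel) // stride + 1

# Inputs of width 4 and width 5 pool to the same width with kernel 2, stride 2.
sizes = {n: pooled_size(n, kernel=2, stride=2) for n in (4, 5)}
print(sizes)
```

Since both widths map to the same pooled width, the pooled tensor alone cannot tell the unpooling op which width to restore.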

In addition to the inputs, MaxUnpool takes three attributes, namely kernel_shape, strides, and pads, which define the exact unpooling op. The attributes typically have the same values as those of the corresponding pooling op that the unpooling op is trying to invert.

Attributes

  • kernel_shape (required): The size of the kernel along each axis.

  • pads: Padding for the beginning and end of each spatial axis. Each value must be greater than or equal to 0 and represents the number of pixels added to the beginning or end of the corresponding axis. The format is [x1_begin, x2_begin, …, x1_end, x2_end, …], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 at the start and end of each spatial axis.

  • strides: Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.
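To make the pads layout and the attribute defaults concrete, here is a small sketch. Both helpers are hypothetical, and the size formula is an assumption (the inverse of floor-mode pooling), not a quotation from the specification.

```python
def split_pads(pads):
    """Split [x1_begin, x2_begin, ..., x1_end, x2_end, ...] into two lists."""
    n = len(pads) // 2
    return list(pads[:n]), list(pads[n:])

def default_unpooled_shape(spatial_in, kernel_shape, strides=None, pads=None):
    """Spatial output shape when output_shape is not given (assumed formula)."""
    rank = len(spatial_in)
    strides = strides or [1] * rank        # documented default: 1 per axis
    pads = pads or [0] * (2 * rank)        # documented default: 0 everywhere
    begins, ends = split_pads(pads)
    return [
        (d - 1) * s + k - b - e
        for d, k, s, b, e in zip(spatial_in, kernel_shape, strides, begins, ends)
    ]

begins, ends = split_pads([1, 2, 3, 4])
shape = default_unpooled_shape([2, 2], kernel_shape=[2, 2], strides=[2, 2])
print(begins, ends, shape)
```

With a 2x2 input, a 2x2 kernel and stride 2, the sketch yields a 4x4 spatial output.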

Inputs

Between 2 and 3 inputs.

  • X (heterogeneous) - T1: Input data tensor that has to be unpooled. This tensor is typically the first output of the MaxPool op. Dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non-image case, the dimensions are in the form of (N x C x D1 x D2 … Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE …].

  • I (heterogeneous) - T2: Input data tensor containing the indices corresponding to elements in the first input tensor X. This tensor is typically the second output of the MaxPool op. Dimensions must be the same as input tensor X. The indices are linear, i.e. computed by treating the tensor as a flattened 1-D tensor, assuming row-major storage. Also, the linear indices should not take padding into account, so the values in indices are in the range [0, N x C x D1 x … x Dn).

  • output_shape (optional, heterogeneous) - T2: The shape of the output can be set explicitly, which causes the pads values to be generated automatically. If ‘output_shape’ is specified, ‘pads’ values are ignored.
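The row-major linear indexing described for input I can be checked with NumPy. The (1, 1, 4, 4) shape below is only an illustrative choice.

```python
import numpy as np

shape = (1, 1, 4, 4)  # N x C x H x W of the unpadded tensor (illustrative)

# Coordinates (n, c, h, w) -> linear index in the flattened tensor.
flat = int(np.ravel_multi_index((0, 0, 1, 1), shape))

# Linear index -> coordinates.
coords = tuple(int(v) for v in np.unravel_index(13, shape))

print(flat, coords)
```

Every valid index lies in the range [0, 16) for this shape, matching the N x C x D1 x … x Dn bound.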

Outputs

  • output (heterogeneous) - T1: Output data tensor that contains the result of the unpooling.

Type Constraints

  • T1 in ( tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.

  • T2 in ( tensor(int64) ): Constrain index tensor to int64.
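A quick pre-flight check of these constraints on NumPy arrays might look like the following. check_max_unpool_inputs is a hypothetical helper, not part of the onnx package.

```python
import numpy as np

# dtypes allowed for T1 (X and output), per the constraints above
FLOAT_DTYPES = (np.dtype(np.float16), np.dtype(np.float32), np.dtype(np.float64))

def check_max_unpool_inputs(x, indices):
    """Return a list of constraint violations (empty if the inputs look valid)."""
    problems = []
    if x.dtype not in FLOAT_DTYPES:
        problems.append(f"X has dtype {x.dtype}, expected float16/float/double")
    if indices.dtype != np.dtype(np.int64):
        problems.append(f"I has dtype {indices.dtype}, expected int64")
    if x.shape != indices.shape:
        problems.append("X and I must have the same dimensions")
    return problems

x = np.ones((1, 1, 2, 2), dtype=np.float32)
good = check_max_unpool_inputs(x, np.zeros((1, 1, 2, 2), dtype=np.int64))
bad = check_max_unpool_inputs(x, np.zeros((1, 1, 2, 2), dtype=np.int32))
print(good, bad)
```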

Examples

_without_output_shape

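import numpy as np
import onnx
from onnx.backend.test.case.node import expect

# Unpool a 2x2 tensor with a 2x2 kernel and stride 2; no output_shape is given.
node = onnx.helper.make_node(
    'MaxUnpool',
    inputs=['xT', 'xI'],
    outputs=['y'],
    kernel_shape=[2, 2],
    strides=[2, 2]
)
# Pooled values (typically the first output of MaxPool).
xT = np.array([[[[1, 2],
                 [3, 4]]]], dtype=np.float32)
# Flat (row-major) index of each value in the 4x4 output.
xI = np.array([[[[5, 7],
                 [13, 15]]]], dtype=np.int64)
# Expected output: each value placed at its index, zeros elsewhere.
y = np.array([[[[0, 0, 0, 0],
                [0, 1, 0, 2],
                [0, 0, 0, 0],
                [0, 3, 0, 4]]]], dtype=np.float32)
expect(node, inputs=[xT, xI], outputs=[y], name='test_maxunpool_export_without_output_shape')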

_with_output_shape

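import numpy as np
import onnx
from onnx.backend.test.case.node import expect

# Same kernel and strides, but the output shape is passed as a third input.
node = onnx.helper.make_node(
    'MaxUnpool',
    inputs=['xT', 'xI', 'output_shape'],
    outputs=['y'],
    kernel_shape=[2, 2],
    strides=[2, 2]
)
# Pooled values (typically the first output of MaxPool).
xT = np.array([[[[5, 6],
                 [7, 8]]]], dtype=np.float32)
# Flat (row-major) indices (typically the second output of MaxPool).
xI = np.array([[[[5, 7],
                 [13, 15]]]], dtype=np.int64)
# Request a 5x5 spatial output explicitly.
output_shape = np.array((1, 1, 5, 5), dtype=np.int64)
# Expected output: a 5x5 tensor holding the four values, zeros elsewhere.
y = np.array([[[[0, 0, 0, 0, 0],
                [0, 5, 0, 6, 0],
                [0, 0, 0, 0, 0],
                [0, 7, 0, 8, 0],
                [0, 0, 0, 0, 0]]]], dtype=np.float32)
expect(node, inputs=[xT, xI, output_shape], outputs=[y], name='test_maxunpool_export_with_output_shape')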
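The summary states that pooling the result of an unpooling operation should give back the unpooling input. A NumPy sketch of that round trip, using the 4x4 output of the first example and a reshape-based 2x2, stride-2 max pool (an illustrative stand-in for MaxPool, not the ONNX operator itself):

```python
import numpy as np

# Output of the first example (unpooled 4x4 tensor).
y = np.array([[[[0, 0, 0, 0],
                [0, 1, 0, 2],
                [0, 0, 0, 0],
                [0, 3, 0, 4]]]], dtype=np.float32)

n, c, h, w = y.shape
# Group the tensor into non-overlapping 2x2 windows and take each window's max.
windows = y.reshape(n, c, h // 2, 2, w // 2, 2)
pooled = windows.max(axis=(3, 5))
print(pooled[0, 0])
```

The round trip recovers the original values here because they are all positive; with negative values the zeros written by the unpooling op would win the max.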

Differences

The only change between version 9 and version 11 is in the description of the strides attribute, which in version 11 adds: "If not present, the stride defaults to 1 along each spatial axis." All other text is identical in both versions.

MaxUnpool - 9

Version

  • name: MaxUnpool (GitHub)

  • domain: main

  • since_version: 9

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 9.

Summary

MaxUnpool essentially computes the partial inverse of the MaxPool op.

The input information to this op is typically the output information from a MaxPool op. The first input tensor X is the tensor that needs to be unpooled, which is typically the pooled tensor (first output) from MaxPool. The second input tensor, I, contains the indices of the (locally maximal) elements corresponding to the elements in the first input tensor X. Input tensor I is typically the second output of the MaxPool op. The third (optional) input is a tensor that specifies the output size of the unpooling operation.

MaxUnpool is intended to compute a ‘partial’ inverse of the MaxPool op. It is ‘partial’ because all the non-maximal values from the original input to MaxPool are set to zero in the output of the MaxUnpool op. Pooling the result of an unpooling operation should give back the original input to the unpooling op.

Several input sizes to MaxPool can produce the same pooled output size, which makes the unpooling op ambiguous. The third input argument, output_shape, is meant to disambiguate the op and produce an output tensor of known/predictable size.

In addition to the inputs, MaxUnpool takes three attributes, namely kernel_shape, strides, and pads, which define the exact unpooling op. The attributes typically have the same values as those of the corresponding pooling op that the unpooling op is trying to invert.

Attributes

  • kernel_shape (required): The size of the kernel along each axis.

  • pads: Padding for the beginning and end of each spatial axis. Each value must be greater than or equal to 0 and represents the number of pixels added to the beginning or end of the corresponding axis. The format is [x1_begin, x2_begin, …, x1_end, x2_end, …], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 at the start and end of each spatial axis.

  • strides: Stride along each spatial axis.

Inputs

Between 2 and 3 inputs.

  • X (heterogeneous) - T1: Input data tensor that has to be unpooled. This tensor is typically the first output of the MaxPool op. Dimensions for image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For non-image case, the dimensions are in the form of (N x C x D1 x D2 … Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE …].

  • I (heterogeneous) - T2: Input data tensor containing the indices corresponding to elements in the first input tensor X. This tensor is typically the second output of the MaxPool op. Dimensions must be the same as input tensor X. The indices are linear, i.e. computed by treating the tensor as a flattened 1-D tensor, assuming row-major storage. Also, the linear indices should not take padding into account, so the values in indices are in the range [0, N x C x D1 x … x Dn).

  • output_shape (optional, heterogeneous) - T2: The shape of the output can be set explicitly, which causes the pads values to be generated automatically. If ‘output_shape’ is specified, ‘pads’ values are ignored.

Outputs

  • output (heterogeneous) - T1: Output data tensor that contains the result of the unpooling.

Type Constraints

  • T1 in ( tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.

  • T2 in ( tensor(int64) ): Constrain index tensor to int64.