com.microsoft - RemovePadding
RemovePadding - 1 (com.microsoft)
Version
name: RemovePadding
domain: com.microsoft
since_version: 1
function:
support_level:
shape inference:
This version of the operator has been available since version 1 of domain com.microsoft.
Summary
Compresses transformer input by removing padding. It assumes that padding is on the right side of each sequence.
The padded input has shape (batch_size, sequence_length, hidden_size). The two main outputs are output, with shape (total_tokens, hidden_size), and token_offset, with shape (batch_size, sequence_length), where total_tokens is the number of non-padding tokens across the whole batch. The operator also emits cumulated_seq_len and max_seq_len, described under Outputs.
token_offset lists the offsets of all non-padding tokens first, followed by the offsets of all padding tokens. It is a list of batch_size * sequence_length elements, reshaped to 2D for the convenience of shape inference.
Inputs
input (heterogeneous) - T: Input tensor with shape (batch_size, sequence_length, hidden_size)
sequence_token_count (heterogeneous) - M: Number of non-padding tokens in each sequence with shape (batch_size).
Outputs
output (heterogeneous) - T: output tensor with shape (total_tokens, hidden_size)
token_offset (heterogeneous) - M: Offsets of non-padding tokens, followed by offsets of padding tokens. Its shape is (batch_size, sequence_length)
cumulated_seq_len (heterogeneous) - M: Cumulative sequence lengths. Its shape is (batch_size + 1)
max_seq_len (heterogeneous) - M: Max sequence length without padding. Its shape is (1)
Examples
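The following is a minimal NumPy sketch of the semantics described above, not the ONNX Runtime kernel. The function name `remove_padding_reference` is hypothetical. The sketch assumes that offsets index rows of the input flattened to (batch_size * sequence_length, hidden_size), that padding offsets appear in ascending order, that cumulated_seq_len starts at 0, and that type M is int32.

```python
import numpy as np


def remove_padding_reference(x, sequence_token_count):
    """Illustrative reference for RemovePadding (hypothetical helper, not the real kernel)."""
    batch_size, sequence_length, hidden_size = x.shape
    counts = np.asarray(sequence_token_count, dtype=np.int32)

    # Right-side padding: position p in sequence b is a real token iff p < counts[b].
    positions = np.arange(sequence_length, dtype=np.int32)
    is_token = (positions[None, :] < counts[:, None]).reshape(-1)

    # Offsets into the input flattened to (batch_size * sequence_length, hidden_size).
    flat_index = np.arange(batch_size * sequence_length, dtype=np.int32)
    token_idx = flat_index[is_token]
    pad_idx = flat_index[~is_token]  # assumed ordering: ascending

    # Non-padding offsets first, then padding offsets, reshaped to 2D.
    token_offset = np.concatenate([token_idx, pad_idx]).reshape(
        batch_size, sequence_length
    )

    # Gather only the real tokens: shape (total_tokens, hidden_size).
    output = x.reshape(-1, hidden_size)[token_idx]

    # Assumed layout: [0, n0, n0 + n1, ...], shape (batch_size + 1).
    cumulated_seq_len = np.concatenate([[0], np.cumsum(counts)]).astype(np.int32)
    max_seq_len = np.array([counts.max()], dtype=np.int32)
    return output, token_offset, cumulated_seq_len, max_seq_len


# Usage: batch_size=2, sequence_length=3, hidden_size=2, with 2 and 1 real tokens.
x = np.arange(12, dtype=np.float32).reshape(2, 3, 2)
output, token_offset, cumulated_seq_len, max_seq_len = remove_padding_reference(
    x, [2, 1]
)
```

In this usage, total_tokens is 3, so output keeps flattened rows 0, 1, and 3 of the input, and token_offset lists those three offsets before the padding offsets 2, 4, and 5.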