Scan
Scan - 16
Version
name: Scan
domain: main
since_version: 16
function: False
support_level: SupportType.COMMON
shape inference: True
This version of the operator has been available since version 16.
Summary
Scan can be used to iterate over one or more scan_input tensors, constructing zero or more scan_output tensors. It combines ideas from general recurrences, functional programming constructs such as scan, fold, map, and zip, and is intended to enable generalizations of RNN-like constructs for sequence-to-sequence processing. Other tensors (referred to as state_variables here) can be used to carry a state when iterating from one element to another (similar to hidden-state in RNNs, also referred to as loop-carried dependences in the context of loops). Many common usages involve a single scan_input tensor (where functionality similar to scan, fold and map can be obtained). When more than one scan_input is used, a behavior similar to zip is obtained.
The attribute body must be a graph, specifying the computation to be performed in every iteration. It takes as input the current values of the state_variables and the current iterated element of the scan_inputs. It must return the (updated) values of the state_variables and zero or more scan_output_element tensors. The values of the scan_output_element tensors are concatenated over all the iterations to produce the scan_output values of the scan construct (similar to the concatenated intermediate hidden-state values of RNN-like constructs). All the output tensors (state_variables as well as scan_output_element tensors) are required to have the same shape in each iteration of the loop (a restriction imposed to enable efficient memory allocation).
Note that the iterated element passed to the body subgraph does not have a sequence axis. It will have a rank one less than the rank of the corresponding scan_input.
The scan operation returns the final values of the state_variables as well as the scan_outputs.
The optional attribute scan_input_directions specifies the direction (forward or backward) for each scan input. If this attribute is omitted, all sequences are scanned in the forward direction. A bidirectional scan may be performed by specifying the same tensor input twice in the scan_inputs, once with a forward direction, and once with a backward direction.
The scan_output of the operation is produced by concatenating the scan_output_element values produced by the body in each iteration. The optional attribute scan_output_directions specifies the direction in which scan_output is constructed (by appending or prepending the scan_output_element to scan_output in each iteration) for each scan_output. If this attribute is omitted, the scan_output_element is appended to the scan_output in each iteration.
The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input. If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1. Note that scanning a non-zero axis may be less efficient than scanning axis zero.
The optional attribute scan_output_axes specifies the axis along which the scan_outputs are accumulated for each scan_output. For example, if axis 1 is the time axis (to be scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis value of 1.
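As a concrete illustration of the axis attributes (a minimal numpy sketch, not part of the specification): selecting axis k for scanning is equivalent to first moving that axis to the front and then scanning axis 0.

```python
import numpy as np

# x has axis 0 = batch, axis 1 = time; scan_input_axes = [1] iterates over axis 1
x = np.arange(24, dtype=np.float32).reshape(2, 3, 4)

# the element for iteration t has rank one less than x
t = 1
elt = x[:, t, :]  # shape (2, 4)

# equivalent view: move the scan axis to the front and index axis 0
assert np.array_equal(np.moveaxis(x, 1, 0)[t], elt)
```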
Note that because of the ONNX restriction that only the last parameter of an operator can be variadic, the initial-states and scan-inputs are listed together as one input parameter. Similarly, the final-states and scan-outputs are listed together as one output parameter. The attribute num_scan_inputs indicates the number M of scan-inputs.
The behavior of

    Scan <
        num_scan_inputs = m,
        body = loop-body,
        scan_input_axes = [axis_1, ..., axis_m]
    > (init_1, ..., init_n, scan_1, ..., scan_m)

is equivalent to the following pseudocode:

    // scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i
    // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j.
    sequence_length = scan_1.shape[axis_1];

    // initialize state-variables
    st_1 = init_1; ... st_n = init_n;
    // initialize scan-output variables: [] denotes an empty tensor
    scan_out_1 = []; ...; scan_out_k = [];
    // identify number of iterations:

    // execute loop
    for (int t = 0; t < sequence_length; ++t) {
        // generate the scan-input elements: the notation T<axis=k>[t] indicates the sub-tensor
        // of rank one less than T obtained by indexing T at position t along axis k.
        si_1 = scan_1<axis=axis_1>[t];
        ... ;
        si_m = scan_m<axis=axis_m>[t];
        // execute loop-body
        st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m)
        // accumulate the scan-output elements
        scan_out_1 = Concat<axis=0>(scan_out_1, so_1);
        ... ;
        scan_out_k = Concat<axis=0>(scan_out_k, so_k);
    }

    return st_1, ..., st_n, scan_out_1, ..., scan_out_k;
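The pseudocode above can be sketched as a plain numpy reference implementation (forward direction and axis 0 only, for brevity; scan_reference and sum_body are illustrative names, not part of ONNX):

```python
import numpy as np

def scan_reference(body, init_states, scan_inputs):
    """Reference semantics for Scan: forward direction, scan axis 0.

    body(states, elts) -> (new_states, out_elts), all lists of arrays.
    Returns the final states and the stacked scan outputs.
    """
    states = list(init_states)
    seq_len = scan_inputs[0].shape[0]
    outs = None
    for t in range(seq_len):
        elts = [x[t] for x in scan_inputs]      # rank one less than each scan_input
        states, out_elts = body(states, elts)   # execute loop-body
        if outs is None:
            outs = [[] for _ in out_elts]
        for acc, o in zip(outs, out_elts):      # accumulate scan-output elements
            acc.append(o)
    return states, [np.stack(acc, axis=0) for acc in outs]

# running-sum body: one state variable, one scan input, one scan output
def sum_body(states, elts):
    s = states[0] + elts[0]
    return [s], [s]

final, (z,) = scan_reference(
    sum_body,
    [np.zeros(2, dtype=np.float32)],
    [np.arange(1, 7, dtype=np.float32).reshape(3, 2)],
)
# final[0] is the running sum over all steps; z holds the partial sums
```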
Sample usage: Encoding RNN using a Scan
The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can be encoded as a Scan loop. Note that the loop-body is a nested graph, and it directly computes %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these values are computed in the outer graph, they need to be passed in as extra state_variables.
    graph rnn-encoding {
        %H_0 = ...
        %X = ...
        %Y_h, %Y = Scan[body = <graph rnn-cell-1>, num_scan_inputs=1](%H_0, %X)
        return %Y, %Y_h
    }

    graph rnn-cell-1 (
        %H_tminus1[FLOAT, tensor]
        %X_t[FLOAT, tensor]
    ) {
        %Wi = ...
        %Ri = ...
        %Wbi = ...
        %Rbi = ...
        %t1 = X_t * (Wi^T)
        %t2 = H_tminus1 * (Ri^T)
        %t3 = Add(%t1, %t2)
        %t4 = Add(%t3, %Wbi)
        %t5 = Add(%t4, %Rbi)
        %Ht = Tanh(%t5)
        %Accumulate = Identity(%Ht)
        return %Ht, %Accumulate
    }
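The body computation above can be checked numerically with numpy (a sketch only; the sizes and random weights are arbitrary assumptions, not values from the example):

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3

# hypothetical step inputs and weights, matching the shapes implied by the body graph
X_t = rng.standard_normal((1, input_size)).astype(np.float32)
H_prev = rng.standard_normal((1, hidden_size)).astype(np.float32)
Wi = rng.standard_normal((hidden_size, input_size)).astype(np.float32)
Ri = rng.standard_normal((hidden_size, hidden_size)).astype(np.float32)
Wbi = rng.standard_normal((hidden_size,)).astype(np.float32)
Rbi = rng.standard_normal((hidden_size,)).astype(np.float32)

# one step of the body graph: %t1 .. %t5 followed by Tanh
H_t = np.tanh(X_t @ Wi.T + H_prev @ Ri.T + Wbi + Rbi)
```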
Attributes
body (required): The graph run each iteration. It has N+M inputs: (loop state variables…, scan_input_elts…). It has N+K outputs: (loop state variables…, scan_output_elts…). Each scan_output is created by concatenating the value of the specified scan_output_elt at the end of each iteration of the loop. It is an error if the dimensions of these values change across loop iterations.
num_scan_inputs (required): An attribute specifying the number of scan_inputs M.
scan_input_axes: An optional list of M flags. The ith element of the list specifies the axis to be scanned (the sequence axis) for the ith scan_input. If omitted, 0 will be used as the scan axis for every scan_input. Negative value for an axis means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).
scan_input_directions: An optional list of M flags. The ith element of the list specifies the direction to be scanned for the ith scan_input tensor: 0 indicates forward direction and 1 indicates reverse direction. If omitted, all scan_input tensors will be scanned in the forward direction.
scan_output_axes: An optional list of K flags. The ith element of the list specifies the axis for the ith scan_output. The scan outputs are accumulated along the specified axis. If omitted, 0 will be used as the scan axis for every scan_output. Negative value for an axis means counting dimensions from the back. Accepted range is [-r, r-1].
scan_output_directions: An optional list of K flags, one for each scan_output. The ith element of the list specifies whether the ith scan_output should be constructed by appending or prepending a new value in each iteration: 0 indicates appending and 1 indicates prepending. If omitted, all scan_output tensors will be produced by appending a value in each iteration.
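A numpy sketch of how the direction attributes can be interpreted (illustrative only; the element-doubling body is a made-up example):

```python
import numpy as np

x = np.arange(1, 7, dtype=np.float32).reshape(3, 2)  # a sequence of 3 elements

# scan_input_directions = [1] (reverse): iteration t sees x[seq_len - 1 - t],
# which matches a forward scan over np.flip(x, axis=0)
reversed_elts = [np.flip(x, axis=0)[t] for t in range(3)]
assert np.array_equal(reversed_elts[0], x[2])

# scan_output_directions = [1] (prepend): the final accumulated output equals
# the append-order result reversed along the accumulation axis
appended = np.stack([2 * x[t] for t in range(3)], axis=0)  # body doubles each element
prepended = np.flip(appended, axis=0)
```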
Inputs
Between 1 and 2147483647 inputs.
initial_state_and_scan_inputs (variadic) - V: Initial values of the loop's N state variables followed by M scan_inputs
Outputs
Between 1 and 2147483647 outputs.
final_state_and_scan_outputs (variadic) - V: Final values of the loop's N state variables followed by K scan_outputs
Type Constraints
V in ( tensor(bfloat16), tensor(bool), tensor(complex128), tensor(complex64), tensor(double), tensor(float), tensor(float16), tensor(int16), tensor(int32), tensor(int64), tensor(int8), tensor(string), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint8) ): All Tensor types
Examples
_scan_8
# Given an input sequence [x1, ..., xN], sum up its elements using a scan
# returning the final state (x1+x2+...+xN) as well the scan_output
# [x1, x1+x2, ..., x1+x2+...+xN]
#
# create graph to represent scan body
sum_in = onnx.helper.make_tensor_value_info(
    "sum_in", onnx.TensorProto.FLOAT, [2]
)
next = onnx.helper.make_tensor_value_info("next", onnx.TensorProto.FLOAT, [2])
sum_out = onnx.helper.make_tensor_value_info(
    "sum_out", onnx.TensorProto.FLOAT, [2]
)
scan_out = onnx.helper.make_tensor_value_info(
    "scan_out", onnx.TensorProto.FLOAT, [2]
)
add_node = onnx.helper.make_node(
    "Add", inputs=["sum_in", "next"], outputs=["sum_out"]
)
id_node = onnx.helper.make_node(
    "Identity", inputs=["sum_out"], outputs=["scan_out"]
)
scan_body = onnx.helper.make_graph(
    [add_node, id_node], "scan_body", [sum_in, next], [sum_out, scan_out]
)
# create scan op node
no_sequence_lens = ""  # optional input, not supplied
node = onnx.helper.make_node(
    "Scan",
    inputs=[no_sequence_lens, "initial", "x"],
    outputs=["y", "z"],
    num_scan_inputs=1,
    body=scan_body,
)
# create inputs for batch-size 1, sequence-length 3, inner dimension 2
initial = np.array([0, 0]).astype(np.float32).reshape((1, 2))
x = np.array([1, 2, 3, 4, 5, 6]).astype(np.float32).reshape((1, 3, 2))
# final state computed = [1 + 3 + 5, 2 + 4 + 6]
y = np.array([9, 12]).astype(np.float32).reshape((1, 2))
# scan-output computed
z = np.array([1, 2, 4, 6, 9, 12]).astype(np.float32).reshape((1, 3, 2))
expect(
    node,
    inputs=[initial, x],
    outputs=[y, z],
    name="test_scan_sum",
    opset_imports=[onnx.helper.make_opsetid("", 8)],
)
_scan_9
# Given an input sequence [x1, ..., xN], sum up its elements using a scan
# returning the final state (x1+x2+...+xN) as well the scan_output
# [x1, x1+x2, ..., x1+x2+...+xN]
#
# create graph to represent scan body
sum_in = onnx.helper.make_tensor_value_info(
    "sum_in", onnx.TensorProto.FLOAT, [2]
)
next = onnx.helper.make_tensor_value_info("next", onnx.TensorProto.FLOAT, [2])
sum_out = onnx.helper.make_tensor_value_info(
    "sum_out", onnx.TensorProto.FLOAT, [2]
)
scan_out = onnx.helper.make_tensor_value_info(
    "scan_out", onnx.TensorProto.FLOAT, [2]
)
add_node = onnx.helper.make_node(
    "Add", inputs=["sum_in", "next"], outputs=["sum_out"]
)
id_node = onnx.helper.make_node(
    "Identity", inputs=["sum_out"], outputs=["scan_out"]
)
scan_body = onnx.helper.make_graph(
    [add_node, id_node], "scan_body", [sum_in, next], [sum_out, scan_out]
)
# create scan op node
node = onnx.helper.make_node(
    "Scan",
    inputs=["initial", "x"],
    outputs=["y", "z"],
    num_scan_inputs=1,
    body=scan_body,
)
# create inputs for sequence-length 3, inner dimension 2
initial = np.array([0, 0]).astype(np.float32).reshape((2,))
x = np.array([1, 2, 3, 4, 5, 6]).astype(np.float32).reshape((3, 2))
# final state computed = [1 + 3 + 5, 2 + 4 + 6]
y = np.array([9, 12]).astype(np.float32).reshape((2,))
# scan-output computed
z = np.array([1, 2, 4, 6, 9, 12]).astype(np.float32).reshape((3, 2))
expect(
    node,
    inputs=[initial, x],
    outputs=[y, z],
    name="test_scan9_sum",
    opset_imports=[onnx.helper.make_opsetid("", 9)],
)
Differences
The side-by-side diff with the previous version of this operator shows a single substantive change: the type constraint V now additionally allows tensor(bfloat16). The summary, pseudocode, attributes, inputs, and outputs are otherwise identical.
Scan - 11
Version
name: Scan
domain: main
since_version: 11
function: False
support_level: SupportType.COMMON
shape inference: True
This version of the operator has been available since version 11.
Summary
Scan can be used to iterate over one or more scan_input tensors, constructing zero or more scan_output tensors. It combines ideas from general recurrences, functional programming constructs such as scan, fold, map, and zip, and is intended to enable generalizations of RNN-like constructs for sequence-to-sequence processing. Other tensors (referred to as state_variables here) can be used to carry a state when iterating from one element to another (similar to hidden-state in RNNs, also referred to as loop-carried dependences in the context of loops). Many common usages involve a single scan_input tensor (where functionality similar to scan, fold and map can be obtained). When more than one scan_input is used, a behavior similar to zip is obtained.
The attribute body must be a graph, specifying the computation to be performed in every iteration. It takes as input the current values of the state_variables and the current iterated element of the scan_inputs. It must return the (updated) values of the state_variables and zero or more scan_output_element tensors. The values of the scan_output_element tensors are concatenated over all the iterations to produce the scan_output values of the scan construct (similar to the concatenated intermediate hidden-state values of RNN-like constructs). All the output tensors (state_variables as well as scan_output_element tensors) are required to have the same shape in each iteration of the loop (a restriction imposed to enable efficient memory allocation).
Note that the iterated element passed to the body subgraph does not have a sequence axis. It will have a rank one less than the rank of the corresponding scan_input.
The scan operation returns the final values of the state_variables as well as the scan_outputs.
The optional attribute scan_input_directions specifies the direction (forward or backward) for each scan input. If this attribute is omitted, all sequences are scanned in the forward direction. A bidirectional scan may be performed by specifying the same tensor input twice in the scan_inputs, once with a forward direction, and once with a backward direction.
The scan_output of the operation is produced by concatenating the scan_output_element values produced by the body in each iteration. The optional attribute scan_output_directions specifies the direction in which scan_output is constructed (by appending or prepending the scan_output_element to scan_output in each iteration) for each scan_output. If this attribute is omitted, the scan_output_element is appended to the scan_output in each iteration.
The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input. If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1. Note that scanning a nonzero axis may be less efficient than scanning axis zero.
The optional attribute scan_output_axes specifies the axis along which the scan_outputs are accumulated for each scan_output. For example, if axis 1 is the time axis (to be scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis value of 1.
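For intuition on what indexing along a non-zero axis yields, here is a 2-D list-of-lists stand-in (illustrative only; real inputs are tensors of arbitrary rank): scanning axis 1 (time) of a batch-by-time input produces per-timestep slices that keep the batch dimension.

```python
def subtensor_at(t, axis, i):
    # 2-D stand-in for T<axis=k>[i]: index position i along the given
    # axis, yielding a result of rank one less than t.
    if axis == 0:
        return t[i]                   # a row
    return [row[i] for row in t]      # a column

batch_by_time = [[10, 11, 12],   # batch item 0 over 3 time steps
                 [20, 21, 22]]   # batch item 1
```

With axis 1, iteration t passes `[batch_item_0[t], batch_item_1[t]]` to the body.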
Note that because of the ONNX restriction that only the last parameter of an operator can be variadic, the initial-states and scan-inputs are listed together as one input parameter. Similarly, the final-states and scan-outputs are listed together as one output parameter. The attribute num_scan_inputs indicates the number M of scan-inputs.
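A runtime can recover the two groups from the single variadic parameter with a simple split (a sketch; names are illustrative):

```python
def split_variadic(inputs, num_scan_inputs):
    # The single variadic parameter packs N initial states followed by
    # M scan inputs; num_scan_inputs (M) fixes the split point.
    n = len(inputs) - num_scan_inputs   # N = total - M
    return inputs[:n], inputs[n:]
```

The same arithmetic applies on the output side, where N final states are followed by K scan outputs (K is determined by the body graph).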
The behavior of

    Scan <
        num_scan_inputs = m,
        body = loop-body,
        scan_input_axes = [axis_1, ..., axis_m]
    > (init_1, ..., init_n, scan_1, ..., scan_m)

is equivalent to the following pseudo-code:
    // scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i
    // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j.
    sequence_length = scan_1.shape[axis_1];

    // initialize state-variables
    st_1 = init_1; ... st_n = init_n;
    // initialize scan-output variables: [] denotes an empty tensor
    scan_out_1 = []; ...; scan_out_k = [];
    // identify number of iterations:

    // execute loop
    for (int t = 0; t < sequence_length; ++t) {
        // generate the scan-input elements: the notation T<axis=k>[t] indicates the sub-tensor
        // of rank one less than T obtained by indexing T at position t along axis k.
        si_1 = scan_1<axis=axis_1>[t];
        ... ;
        si_m = scan_m<axis=axis_m>[t];
        // execute loop-body
        st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m)
        // accumulate the scan-output elements
        scan_out_1 = Concat<axis=0>(scan_out_1, so_1);
        ... ;
        scan_out_k = Concat<axis=0>(scan_out_k, so_k);
    }

    return st_1, ..., st_n, scan_out_1, ..., scan_out_k;
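The pseudo-code above can be transcribed almost line-for-line into Python. This is a sketch over nested lists with scan axis 0 and forward direction only, not the ONNX runtime API:

```python
def scan(loop_body, init_states, scan_inputs):
    # init_states: N initial state values; scan_inputs: M sequences of
    # equal length. Returns N final states followed by K scan outputs.
    sequence_length = len(scan_inputs[0])
    states = list(init_states)                  # st_1 ... st_n
    scan_outs = None                            # scan_out_1 ... scan_out_k
    for t in range(sequence_length):
        elems = [s[t] for s in scan_inputs]     # si_1 ... si_m
        results = loop_body(*states, *elems)
        states = list(results[:len(states)])    # updated st_1 ... st_n
        so = results[len(states):]              # so_1 ... so_k
        if scan_outs is None:
            scan_outs = [[] for _ in so]
        for acc, o in zip(scan_outs, so):       # Concat along axis 0
            acc.append(o)
    return states + (scan_outs or [])

# One state (a running sum), one scan input, one scan output
# (the state value *before* each update):
result = scan(lambda st, x: (st + x, st), [0], [[1, 2, 3]])
```

Here K is inferred from how many extra values the body returns, mirroring how the body graph's output count determines the number of scan_outputs.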
Sample usage: Encoding RNN using a Scan
The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can be encoded as a Scan-Loop. Note that the loop-body is a nested graph, and it directly computes %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these values are computed in the outer graph, they need to be passed in as extra state_variables.
    graph rnn-encoding {
        %H_0 = ...
        %X = ...
        %Y_h, %Y = Scan[body = <graph rnn-cell-1>, num_scan_inputs=1](%H_0, %X)
        return %Y, %Y_h
    }
    graph rnn-cell-1 (
        %H_tminus1[FLOAT, tensor]
        %X_t[FLOAT, tensor]
    ) {
        %Wi = ...
        %Ri = ...
        %Wbi = ...
        %Rbi = ...
        %t1 = X_t * (Wi^T)
        %t2 = H_tminus1*(Ri^T)
        %t3 = Add(%t1, %t2)
        %t4 = Add(%t3, %Wbi)
        %t5 = Add(%t4, %Rbi)
        %Ht = Tanh(%t5)
        %Accumulate = Identity(%Ht)
        return %Ht, %Accumulate
    }
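A scalar stand-in for this cell (the weights W, R and the biases below are illustrative numbers, not from the spec; the real cell uses matrices) shows how the hidden state threads through iterations and how %Accumulate collects each hidden state:

```python
import math

W, R, WB, RB = 0.5, 0.3, 0.1, 0.2   # illustrative scalar weights/biases

def rnn_cell(h_prev, x):
    # Scalar mirror of rnn-cell-1: t5 = x*W + h_prev*R + WB + RB,
    # Ht = tanh(t5); returns (%Ht, %Accumulate).
    h = math.tanh(x * W + h_prev * R + WB + RB)
    return h, h

def rnn_scan(xs, h0=0.0):
    h, ys = h0, []
    for x in xs:
        h, y = rnn_cell(h, x)
        ys.append(y)     # concatenated hidden states (the role of %Y)
    return h, ys         # final state (%Y_h) and scan output

h_final, hidden = rnn_scan([1.0, -1.0, 0.5])
```

The final state equals the last accumulated hidden state, just as %Y_h equals the last slice of %Y in the graph above.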
Attributes
body (required): The graph run in each iteration. It has N+M inputs: (loop state variables..., scan_input_elts...). It has N+K outputs: (loop state variables..., scan_output_elts...). Each scan_output is created by concatenating the specified scan_output_elt value at the end of each iteration of the loop. It is an error if the dimensions of these values change across loop iterations.
num_scan_inputs (required): An attribute specifying the number of scan_inputs M.
scan_input_axes: An optional list of M flags. The i-th element of the list specifies the axis to be scanned (the sequence axis) for the i-th scan_input. If omitted, 0 will be used as the scan axis for every scan_input. Negative value for an axis means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).
scan_input_directions: An optional list of M flags. The i-th element of the list specifies the direction to be scanned for the i-th scan_input tensor: 0 indicates forward direction and 1 indicates reverse direction. If omitted, all scan_input tensors will be scanned in the forward direction.
scan_output_axes: An optional list of K flags. The i-th element of the list specifies the axis for the i-th scan_output. The scan outputs are accumulated along the specified axis. If omitted, 0 will be used as the scan axis for every scan_output. Negative value for an axis means counting dimensions from the back. Accepted range is [-r, r-1].
scan_output_directions: An optional list of K flags, one for each scan_output. The i-th element of the list specifies whether the i-th scan_output should be constructed by appending or prepending a new value in each iteration: 0 indicates appending and 1 indicates prepending. If omitted, all scan_output tensors will be produced by appending a value in each iteration.
Inputs
Between 1 and 2147483647 inputs.
initial_state_and_scan_inputs (variadic) - V: Initial values of the loop’s N state variables followed by M scan_inputs
Outputs
Between 1 and 2147483647 outputs.
final_state_and_scan_outputs (variadic) - V: Final values of the loop’s N state variables followed by K scan_outputs
Type Constraints
V in ( tensor(bool), tensor(complex128), tensor(complex64), tensor(double), tensor(float), tensor(float16), tensor(int16), tensor(int32), tensor(int64), tensor(int8), tensor(string), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint8) ): All Tensor types
Differences
0  0  Scan can be used to iterate over one or more scan_input tensors,  Scan can be used to iterate over one or more scan_input tensors, 
1  1  constructing zero or more scan_output tensors. It combines ideas from general recurrences,  constructing zero or more scan_output tensors. It combines ideas from general recurrences, 
2  2  functional programming constructs such as scan, fold, map, and zip, and is intended to enable  functional programming constructs such as scan, fold, map, and zip, and is intended to enable 
3  3  generalizations of RNNlike constructs for sequencetosequence processing.  generalizations of RNNlike constructs for sequencetosequence processing. 
4  4  Other tensors (referred to as state_variables here) can be used to carry a state  Other tensors (referred to as state_variables here) can be used to carry a state 
5  5  when iterating from one element to another (similar to hiddenstate in RNNs, also referred  when iterating from one element to another (similar to hiddenstate in RNNs, also referred 
6  6  to as loopcarried dependences in the context of loops).  to as loopcarried dependences in the context of loops). 
7  7  Many common usages involve a single scan_input tensor (where functionality  Many common usages involve a single scan_input tensor (where functionality 
8  8  similar to scan, fold and map can be obtained). When more than one scan_input is used,  similar to scan, fold and map can be obtained). When more than one scan_input is used, 
9  9  a behavior similar to zip is obtained.  a behavior similar to zip is obtained. 
10  10 


11  11  The attribute body must be a graph, specifying the computation to be performed in  The attribute body must be a graph, specifying the computation to be performed in 
12  12  every iteration. It takes as input the current values of the state_variables and  every iteration. It takes as input the current values of the state_variables and 
13  13  the current iterated element of the scan_inputs. It must return the (updated) values  the current iterated element of the scan_inputs. It must return the (updated) values 
14  14  of the state_variables and zero or more scan_output_element tensors. The values of the  of the state_variables and zero or more scan_output_element tensors. The values of the 
15  15  scan_output_element tensors are concatenated over all the iterations to produce the  scan_output_element tensors are concatenated over all the iterations to produce the 
16  16  scan_output values of the scan construct (similar to the concatenated intermediate  scan_output values of the scan construct (similar to the concatenated intermediate 
17  17  hiddenstate values of RNNlike constructs). All the output tensors (state_variables as  hiddenstate values of RNNlike constructs). All the output tensors (state_variables as 
18  18  well as scan_output_element tensors) are required to have the same shape in each iteration  well as scan_output_element tensors) are required to have the same shape in each iteration 
19  19  of the loop (a restriction imposed to enable efficient memory allocation).  of the loop (a restriction imposed to enable efficient memory allocation). 
20  20 


21  21  Note that the iterated element passed to the body subgraph does not have a sequence  Note that the iterated element passed to the body subgraph does not have a sequence 
22  22  axis. It will have a rank one less than the rank of the corresponding scan_input.  axis. It will have a rank one less than the rank of the corresponding scan_input. 
23  23 


24  24  The scan operation returns the final values of the state_variables as well as the  The scan operation returns the final values of the state_variables as well as the 
25  25  scan_outputs.  scan_outputs. 
26  26 


27  27  The optional attribute scan_input_directions specifies the direction (forward or backward)  The optional attribute scan_input_directions specifies the direction (forward or backward) 
28  28  for each scan input. If this attribute is omitted, all sequences are scanned in the forward  for each scan input. If this attribute is omitted, all sequences are scanned in the forward 
29  29  direction. A bidirectional scan may be performed by specifying the same tensor input twice  direction. A bidirectional scan may be performed by specifying the same tensor input twice 
30  30  in the scan_inputs, once with a forward direction, and once with a backward direction.  in the scan_inputs, once with a forward direction, and once with a backward direction. 
31  31 


32  32  The scan_output of the operation is produced by concatenating the scan_output_element  The scan_output of the operation is produced by concatenating the scan_output_element 
33  33  values produced by the body in each iteration. The optional attribute scan_output_directions  values produced by the body in each iteration. The optional attribute scan_output_directions 
34  34  specifies the direction in which scan_output is constructed (by appending or prepending the  specifies the direction in which scan_output is constructed (by appending or prepending the 
35  35  scan_output_element to scan_output in each iteration) for each scan_output. If this attribute  scan_output_element to scan_output in each iteration) for each scan_output. If this attribute 
36  36  is omitted, the scan_output_element is appended to the scan_output in each iteration.  is omitted, the scan_output_element is appended to the scan_output in each iteration. 
37  37 


38  38  The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input.  The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input. 
39  39  If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the  If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the 
40  40  batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1.  batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1. 
41  41  Note that scanning a nonzero axis may be less efficient than scanning axis zero.  Note that scanning a nonzero axis may be less efficient than scanning axis zero. 
42  42 


43  43  The optional attribute scan_output_axes specifies the axis along which the scan_outputs  The optional attribute scan_output_axes specifies the axis along which the scan_outputs 
44  44  are accumulated for each scan_output. For example, if axis 1 is the time axis (to be  are accumulated for each scan_output. For example, if axis 1 is the time axis (to be 
45  45  scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis  scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis 
46  46  value of 1.  value of 1. 
47  47 


48  48  Note that because of the ONNX restriction that only the last parameter of an operator can  Note that because of the ONNX restriction that only the last parameter of an operator can 
49  49  be variadic, the initialstates and scaninputs are listed together as one input parameter.  be variadic, the initialstates and scaninputs are listed together as one input parameter. 
50  50  Similarly, the finalstates and scanoutputs are listed together as one output parameter.  Similarly, the finalstates and scanoutputs are listed together as one output parameter. 
51  51  The attribute num_scan_inputs indicates the number M of scaninputs.  The attribute num_scan_inputs indicates the number M of scaninputs. 
52  52 


53  53  The behavior of  The behavior of 
54  54 


55  55  Scan <  Scan < 
56  56  num_scan_inputs = m,  num_scan_inputs = m, 
57  57  body = loopbody,  body = loopbody, 
58  58  scan_input_axes = [axis_1, ..., axis_m]  scan_input_axes = [axis_1, ..., axis_m] 
59  59  > (init_1, ..., init_n, scan_1, ..., scan_m)  > (init_1, ..., init_n, scan_1, ..., scan_m) 
60  60 


61  61  is equivalent to the following pseudocode:  is equivalent to the following pseudocode: 
62  62 


63  63  // scan_i.shape[axis_i] denotes the (max) sequencelength of scan_i  // scan_i.shape[axis_i] denotes the (max) sequencelength of scan_i 
64  64  // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j.  // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j. 
65  65  sequence_length = scan_1.shape[axis_1];  sequence_length = scan_1.shape[axis_1]; 
66  66 


67  67  // initialize statevariables  // initialize statevariables 
68  68  st_1 = init_1; ... st_n = init_n;  st_1 = init_1; ... st_n = init_n; 
69  69  // initialize scanoutput variables: [] denotes an empty tensor  // initialize scanoutput variables: [] denotes an empty tensor 
70  70  scan_out_1 = []; ...; scan_out_k = [];  scan_out_1 = []; ...; scan_out_k = []; 
71  71  // identify number of iterations:  // identify number of iterations: 
72  72 


73  73  // execute loop  // execute loop 
74  74  for (int t = 0; t < sequence_length; ++t) {  for (int t = 0; t < sequence_length; ++t) { 
75  75  // generate the scaninput elements: the notation T  // generate the scaninput elements: the notation T 
76  76  // of rank one less than T obtained by indexing T at position t along axis k.  // of rank one less than T obtained by indexing T at position t along axis k. 
77  77  si_1 = scan_1  si_1 = scan_1 
78  78  ... ;  ... ; 
79  79  si_m = scan_m  si_m = scan_m 
80  80  // execute loopbody  // execute loopbody 
81  81  st_1, ..., st_n, so_1, ..., so_k = loopbody(st_1, ..., st_n, si_1, ..., si_m)  st_1, ..., st_n, so_1, ..., so_k = loopbody(st_1, ..., st_n, si_1, ..., si_m) 
82  82  // accumulate the scanoutput elements  // accumulate the scanoutput elements 
83  83  scan_out_1 = Concat  scan_out_1 = Concat 
84  84  }  } 
85  85 


86  86  return st_1, ..., st_n, scan_out_1, ..., scan_out_k;  return st_1, ..., st_n, scan_out_1, ..., scan_out_k; 
87  87 


88  88  *Sample usage: Encoding RNN using a Scan*  *Sample usage: Encoding RNN using a Scan* 
89  89 


90  90  The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi,  The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, 
91  91  recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hiddenstate %H_0 can  recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hiddenstate %H_0 can 
92  92  be encoded as a ScanLoop. Note that the loopbody is a nested graph, and it directly computes  be encoded as a ScanLoop. Note that the loopbody is a nested graph, and it directly computes 
93  93  %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these  %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these 
94  94  values are computed in the outer graph, they need to be passed in as extra state_variables.  values are computed in the outer graph, they need to be passed in as extra state_variables. 
95  95 


96  96  graph rnnencoding {  graph rnnencoding { 
97  97  %H_0 = ...  %H_0 = ... 
98  98  %X = ...  %X = ... 
99  99  %Y_h, %Y = Scan[body =  %Y_h, %Y = Scan[body = 
100  100  return %Y, %Y_h  return %Y, %Y_h 
101  101  }  } 
102  102 


103  103  graph rnncell1 (  graph rnncell1 ( 
104  104  %H_tminus1[FLOAT, tensor]  %H_tminus1[FLOAT, tensor] 
105  105  %X_t[FLOAT, tensor]  %X_t[FLOAT, tensor] 
106  106  ) {  ) { 
107  107  %Wi = ...  %Wi = ... 
108  108  %Ri = ...  %Ri = ... 
109  109  %Wbi = ...  %Wbi = ... 
110  110  %Rbi = ...  %Rbi = ... 
111  111  %t1 = X_t * (Wi^T)  %t1 = X_t * (Wi^T) 
112  112  %t2 = H_tminus1*(Ri^T)  %t2 = H_tminus1*(Ri^T) 
113  113  %t3 = Add(%t1, %t2)  %t3 = Add(%t1, %t2) 
114  114  %t4 = Add(%t3, %Wbi)  %t4 = Add(%t3, %Wbi) 
115  115  %t5 = Add(%t4, %Rbi)  %t5 = Add(%t4, %Rbi) 
116  116  %Ht = Tanh(%t5)  %Ht = Tanh(%t5) 
117  117  %Accumulate = Identity(%Ht)  %Accumulate = Identity(%Ht) 
118  118  return %Ht, %Accumulate  return %Ht, %Accumulate 
119  119  }  } 
120  120 


121  121  **Attributes**  **Attributes** 
122  122 


123  123  * **body** (required):  * **body** (required): 
124  124  The graph run each iteration. It has N+M inputs: (loop state  The graph run each iteration. It has N+M inputs: (loop state 
125  125  variables..., scan_input_elts...). It has N+K outputs: (loop state  variables..., scan_input_elts...). It has N+K outputs: (loop state 
126  126  variables..., scan_output_elts...). Each scan_output is created by  variables..., scan_output_elts...). Each scan_output is created by 
127  127  concatenating the value of the specified scan_output_elt value at  concatenating the value of the specified scan_output_elt value at 
128  128  the end of each iteration of the loop. It is an error if the  the end of each iteration of the loop. It is an error if the 
129  129  dimensions of these values change across loop iterations.  dimensions of these values change across loop iterations. 
130  130  * **num_scan_inputs** (required):  * **num_scan_inputs** (required): 
131  131  An attribute specifying the number of scan_inputs M.  An attribute specifying the number of scan_inputs M. 
132  132  * **scan_input_axes**:  * **scan_input_axes**: 
133  133  An optional list of M flags. The ith element of the list specifies  An optional list of M flags. The ith element of the list specifies 
134  134  the axis to be scanned (the sequence axis) for the ith scan_input.  the axis to be scanned (the sequence axis) for the ith scan_input. 
135  135  If omitted, 0 will be used as the scan axis for every scan_input.  If omitted, 0 will be used as the scan axis for every scan_input. 
136  Negative value for an axis means counting dimensions from the back.  
137  Accepted range is [r, r1] where r = rank(input).  
136  138  * **scan_input_directions**:  * **scan_input_directions**: 
137  139  An optional list of M flags. The ith element of the list specifies  An optional list of M flags. The ith element of the list specifies 
138  140  the direction to be scanned for the ith scan_input tensor: 0  the direction to be scanned for the ith scan_input tensor: 0 
139  141  indicates forward direction and 1 indicates reverse direction. If  indicates forward direction and 1 indicates reverse direction. If 
140  142  omitted, all scan_input tensors will be scanned in the forward  omitted, all scan_input tensors will be scanned in the forward 
141  143  direction.  direction. 
142  144  * **scan_output_axes**:  * **scan_output_axes**: 
143  145  An optional list of K flags. The ith element of the list specifies  An optional list of K flags. The ith element of the list specifies 
144  146  the axis for the ith scan_output. The scan outputs are accumulated  the axis for the ith scan_output. The scan outputs are accumulated 
145  147  along the specified axis. If omitted, 0 will be used as the scan  along the specified axis. If omitted, 0 will be used as the scan 
146  148  axis for every scan_output. 

149  counting dimensions from the back. Accepted range is [r, r1].  
147  150  * **scan_output_directions**:  * **scan_output_directions**: 
148  151  An optional list of K flags, one for each scan_output. The ith  An optional list of K flags, one for each scan_output. The ith 
149  152  element of the list specifies whether the ith scan_output should be  element of the list specifies whether the ith scan_output should be 
150  153  constructed by appending or prepending a new value in each  constructed by appending or prepending a new value in each 
151  154  iteration: 0 indicates appending and 1 indicates prepending. If  iteration: 0 indicates appending and 1 indicates prepending. If 
152  155  omitted, all scan_output tensors will be produced by appending a  omitted, all scan_output tensors will be produced by appending a 
153  156  value in each iteration.  value in each iteration. 
154  157 


155  158  **Inputs**  **Inputs** 
156  159 


157  160  Between 1 and 2147483647 inputs.  Between 1 and 2147483647 inputs. 
158  161 


159  162  * **initial_state_and_scan_inputs** (variadic)  **V**:  * **initial_state_and_scan_inputs** (variadic)  **V**: 
160  163  Initial values of the loop's N state variables followed by M  Initial values of the loop's N state variables followed by M 
161  164  scan_inputs  scan_inputs 
162  165 


163  166  **Outputs**  **Outputs** 
164  167 


165  168  Between 1 and 2147483647 outputs.  Between 1 and 2147483647 outputs. 
166  169 


167  170  * **final_state_and_scan_outputs** (variadic)  **V**:  * **final_state_and_scan_outputs** (variadic)  **V**: 
168  171  Final values of the loop's N state variables followed by K  Final values of the loop's N state variables followed by K 
169  172  scan_outputs  scan_outputs 
170  173 


171  174  **Type Constraints**  **Type Constraints** 
172  175 


173  176  * **V** in (  * **V** in ( 
174  177  tensor(bool),  tensor(bool), 
175  178  tensor(complex128),  tensor(complex128), 
176  179  tensor(complex64),  tensor(complex64), 
177  180  tensor(double),  tensor(double), 
178  181  tensor(float),  tensor(float), 
179  182  tensor(float16),  tensor(float16), 
180  183  tensor(int16),  tensor(int16), 
181  184  tensor(int32),  tensor(int32), 
182  185  tensor(int64),  tensor(int64), 
183  186  tensor(int8),  tensor(int8), 
184  187  tensor(string),  tensor(string), 
185  188  tensor(uint16),  tensor(uint16), 
186  189  tensor(uint32),  tensor(uint32), 
187  190  tensor(uint64),  tensor(uint64), 
188  191  tensor(uint8)  tensor(uint8) 
189  192  ):  ): 
190  193  All Tensor types  All Tensor types 
Scan - 9#
Version
name: Scan (GitHub)
domain: main
since_version: 9
function: False
support_level: SupportType.COMMON
shape inference: True
This version of the operator has been available since version 9.
Summary
Scan can be used to iterate over one or more scan_input tensors, constructing zero or more scan_output tensors. It combines ideas from general recurrences, functional programming constructs such as scan, fold, map, and zip, and is intended to enable generalizations of RNN-like constructs for sequence-to-sequence processing. Other tensors (referred to as state_variables here) can be used to carry a state when iterating from one element to another (similar to hidden-state in RNNs, also referred to as loop-carried dependences in the context of loops). Many common usages involve a single scan_input tensor (where functionality similar to scan, fold and map can be obtained). When more than one scan_input is used, a behavior similar to zip is obtained.
The attribute body must be a graph, specifying the computation to be performed in every iteration. It takes as input the current values of the state_variables and the current iterated element of the scan_inputs. It must return the (updated) values of the state_variables and zero or more scan_output_element tensors. The values of the scan_output_element tensors are concatenated over all the iterations to produce the scan_output values of the scan construct (similar to the concatenated intermediate hidden-state values of RNN-like constructs). All the output tensors (state_variables as well as scan_output_element tensors) are required to have the same shape in each iteration of the loop (a restriction imposed to enable efficient memory allocation).
Note that the iterated element passed to the body subgraph does not have a sequence axis. It will have a rank one less than the rank of the corresponding scan_input.
The scan operation returns the final values of the state_variables as well as the scan_outputs.
The optional attribute scan_input_directions specifies the direction (forward or backward) for each scan input. If this attribute is omitted, all sequences are scanned in the forward direction. A bidirectional scan may be performed by specifying the same tensor input twice in the scan_inputs, once with a forward direction, and once with a backward direction.
The scan_output of the operation is produced by concatenating the scan_output_element values produced by the body in each iteration. The optional attribute scan_output_directions specifies the direction in which scan_output is constructed (by appending or prepending the scan_output_element to scan_output in each iteration) for each scan_output. If this attribute is omitted, the scan_output_element is appended to the scan_output in each iteration.
The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input. If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1. Note that scanning a nonzero axis may be less efficient than scanning axis zero.
The optional attribute scan_output_axes specifies the axis along which the scan_outputs are accumulated for each scan_output. For example, if axis 1 is the time axis (to be scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis value of 1.
Note that because of the ONNX restriction that only the last parameter of an operator can be variadic, the initial-states and scan-inputs are listed together as one input parameter. Similarly, the final-states and scan-outputs are listed together as one output parameter. The attribute num_scan_inputs indicates the number M of scan-inputs.
The behavior of

    Scan <
        num_scan_inputs = m,
        body = loop-body,
        scan_input_axes = [axis_1, ..., axis_m]
    > (init_1, ..., init_n, scan_1, ..., scan_m)

is equivalent to the following pseudo-code:
    // scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i
    // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j.
    sequence_length = scan_1.shape[axis_1];

    // initialize state-variables
    st_1 = init_1; ... st_n = init_n;
    // initialize scan-output variables: [] denotes an empty tensor
    scan_out_1 = []; ...; scan_out_k = [];
    // identify number of iterations:

    // execute loop
    for (int t = 0; t < sequence_length; ++t) {
        // generate the scan-input elements: the notation T<axis=k>[t] indicates the sub-tensor
        // of rank one less than T obtained by indexing T at position t along axis k.
        si_1 = scan_1<axis=axis_1>[t];
        ... ;
        si_m = scan_m<axis=axis_m>[t];
        // execute loop-body
        st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m)
        // accumulate the scan-output elements
        scan_out_1 = Concat<axis=0>(scan_out_1, so_1);
        ... ;
        scan_out_k = Concat<axis=0>(scan_out_k, so_k);
    }

    return st_1, ..., st_n, scan_out_1, ..., scan_out_k;
Sample usage: Encoding RNN using a Scan
The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can be encoded as a Scan-Loop. Note that the loop-body is a nested graph, and it directly computes %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these values are computed in the outer graph, they need to be passed in as extra state_variables.
    graph rnn-encoding {
        %H_0 = ...
        %X = ...
        %Y_h, %Y = Scan[body = <graph rnn-cell-1>, num_scan_inputs=1](%H_0, %X)
        return %Y, %Y_h
    }
    graph rnn-cell-1 (
        %H_tminus1[FLOAT, tensor]
        %X_t[FLOAT, tensor]
    ) {
        %Wi = ...
        %Ri = ...
        %Wbi = ...
        %Rbi = ...
        %t1 = X_t * (Wi^T)
        %t2 = H_tminus1*(Ri^T)
        %t3 = Add(%t1, %t2)
        %t4 = Add(%t3, %Wbi)
        %t5 = Add(%t4, %Rbi)
        %Ht = Tanh(%t5)
        %Accumulate = Identity(%Ht)
        return %Ht, %Accumulate
    }
Attributes
body (required): The graph run in each iteration. It has N+M inputs: (loop state variables..., scan_input_elts...). It has N+K outputs: (loop state variables..., scan_output_elts...). Each scan_output is created by concatenating the specified scan_output_elt value at the end of each iteration of the loop. It is an error if the dimensions of these values change across loop iterations.
num_scan_inputs (required): An attribute specifying the number of scan_inputs M.
scan_input_axes: An optional list of M flags. The i-th element of the list specifies the axis to be scanned (the sequence axis) for the i-th scan_input. If omitted, 0 will be used as the scan axis for every scan_input.
scan_input_directions: An optional list of M flags. The i-th element of the list specifies the direction to be scanned for the i-th scan_input tensor: 0 indicates forward direction and 1 indicates reverse direction. If omitted, all scan_input tensors will be scanned in the forward direction.
scan_output_axes: An optional list of K flags. The i-th element of the list specifies the axis for the i-th scan_output. The scan outputs are accumulated along the specified axis. If omitted, 0 will be used as the scan axis for every scan_output.
scan_output_directions: An optional list of K flags, one for each scan_output. The i-th element of the list specifies whether the i-th scan_output should be constructed by appending or prepending a new value in each iteration: 0 indicates appending and 1 indicates prepending. If omitted, all scan_output tensors will be produced by appending a value in each iteration.
Inputs
Between 1 and 2147483647 inputs.
initial_state_and_scan_inputs (variadic) - V: Initial values of the loop’s N state variables followed by M scan_inputs
Outputs
Between 1 and 2147483647 outputs.
final_state_and_scan_outputs (variadic) - V: Final values of the loop’s N state variables followed by K scan_outputs
Type Constraints
V in ( tensor(bool), tensor(complex128), tensor(complex64), tensor(double), tensor(float), tensor(float16), tensor(int16), tensor(int32), tensor(int64), tensor(int8), tensor(string), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint8) ): All Tensor types
Differences
0  0  Scan can be used to iterate over one or more scan_input tensors,  Scan can be used to iterate over one or more scan_input tensors, 
1  1  constructing zero or more scan_output tensors. It combines ideas from general recurrences,  constructing zero or more scan_output tensors. It combines ideas from general recurrences, 
2  2  functional programming constructs such as scan, fold, map, and zip, and is intended to enable  functional programming constructs such as scan, fold, map, and zip, and is intended to enable 
3  3  generalizations of RNNlike constructs for sequencetosequence processing.  generalizations of RNNlike constructs for sequencetosequence processing. 
4  4  Other tensors (referred to as state_variables here) can be used to carry a state  Other tensors (referred to as state_variables here) can be used to carry a state 
5  5  when iterating from one element to another (similar to hiddenstate in RNNs, also referred  when iterating from one element to another (similar to hiddenstate in RNNs, also referred 
6  6  to as loopcarried dependences in the context of loops). All these tensors are required to 

7  have the same shape in each iteration of the loop (a restriction imposed to enable efficient  
8  7  memory allocation). Many common usages involve a single scan_input tensor (where functionality 

9  8  similar to scan, fold and map can be obtained). When more than one scan_input is used,  similar to scan, fold and map can be obtained). When more than one scan_input is used, 
10  9  a behavior similar to zip is obtained.  a behavior similar to zip is obtained. 
11  10 


12  11  The attribute body must be a graph, specifying the computation to be performed in  The attribute body must be a graph, specifying the computation to be performed in 
13  12  every iteration. It takes as input the current values of the state_variables and  every iteration. It takes as input the current values of the state_variables and 
14  13  the current iterated element of the scan_inputs. It must return the (updated) values  the current iterated element of the scan_inputs. It must return the (updated) values 
15  14  of the state_variables and zero or more scan_output_element tensors. The values of the  of the state_variables and zero or more scan_output_element tensors. The values of the 
16  15  scan_output_element tensors are concatenated over all the iterations to produce the  scan_output_element tensors are concatenated over all the iterations to produce the 
17  16  scan_output values of the scan construct (similar to the concatenated intermediate  scan_output values of the scan construct (similar to the concatenated intermediate 
17  hiddenstate values of RNNlike constructs). All the output tensors (state_variables as  
18  18  hiddenstate values of RNNlike constructs). 

19  of the loop (a restriction imposed to enable efficient memory allocation).  
19  20 


21  Note that the iterated element passed to the body subgraph does not have a sequence  
22  axis. It will have a rank one less than the rank of the corresponding scan_input.  
23 
 
20  24  The scan operation returns the final values of the state_variables as well as the  The scan operation returns the final values of the state_variables as well as the 
21  25  scan_outputs.  scan_outputs. 
22  26 


23  The operation supports batching, and the batchaxis is required to be 0.  
24  When multiple scan_input tensors are used, they must all have the same batchsize,  
25  and they must all have the same maximumsequencelength (the dimensionality of the  
26  sequence axis or scan axis). The sequence axis or scan axis is required to be 1.  
27 
 
28  The operation has an optional sequence_lens input (of shape [BATCH_SIZE]) to  
29  allow variable length sequences of length <= the maximumsequencelength. If this  
30  input is not specified, all sequences are assumed to be of length equal to  
31  maximumsequencelength. For variable length input sequences, the scan_outputs  
32  will consist of a sequence of same length as the input, padded to the  
33  27  maximumsequencelength. 

34 
 
35  The optional attribute directions can be used to scan a sequence in the reverse direction.  
36  28  If this attribute is omitted, all sequences are scanned in the forward direction. 

37  29  A bidirectional scan be performed by specifying the same tensor input twice in the 

38  30  scan_inputs, once with a forward direction, and once with a backward direction. 

39  31 


32  The scan_output of the operation is produced by concatenating the scan_output_element  
33  values produced by the body in each iteration. The optional attribute scan_output_directions  
34  specifies the direction in which scan_output is constructed (by appending or prepending the  
35  scan_output_element to scan_output in each iteration) for each scan_output. If this attribute  
36  is omitted, the scan_output_element is appended to the scan_output in each iteration.  
37 
 
38  The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input.  
39  If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the  
40  batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1.  
41  Note that scanning a nonzero axis may be less efficient than scanning axis zero.  
42 
 
43  The optional attribute scan_output_axes specifies the axis along which the scan_outputs  
44  are accumulated for each scan_output. For example, if axis 1 is the time axis (to be  
45  scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis  
46  value of 1.  
47 
 
40  48  Note that because of the ONNX restriction that only the last parameter of an operator can  Note that because of the ONNX restriction that only the last parameter of an operator can 
41  49  be variadic, the initialstates and scaninputs are listed together as one input parameter.  be variadic, the initialstates and scaninputs are listed together as one input parameter. 
42  50  Similarly, the finalstates and scanoutputs are listed together as one output parameter.  Similarly, the finalstates and scanoutputs are listed together as one output parameter. 
43  51  The attribute num_scan_inputs indicates the number M of scaninputs.  The attribute num_scan_inputs indicates the number M of scaninputs. 
44  52 


45  53  The behavior of  The behavior of 
46  54 


47  55  Scan <  Scan < 
48  56  num_scan_inputs = m,  num_scan_inputs = m, 
49  57  body = loopbody 

58  scan_input_axes = [axis_1, ..., axis_m]  
50  59  > (sequence_lengths, init_1, ..., init_n, scan_1, ..., scan_m) 

51  60 


52  61  is equivalent to the following pseudocode:  is equivalent to the following pseudocode: 
53  62 


54  // T.shape[0] denotes the batchsize of T  
55  // The batchsize of scan_1, ..., scan_m are all required to be equal  
56  batch_size = scan_1.shape[0];  
57 
 
58  63  // scan_i.shape[1] denotes the (max) sequencelength of scan_i 

59  64  // scan_i.shape[1] is required to be equal to scan_j.shape[1] for all i,j. 

60  65  max_sequence_length = scan_1.shape[1]; 

61  66 


62  for (int batch = 0; batch < batch_size; ++batch) {  
63  67  // initialize statevariables 

64  68  st_1 = init_1; ... st_n = init_n; 

65  69  // initialize scanoutput variables: [] denotes an empty tensor 

66  70  scan_out_1 = []; ...; scan_out_k = []; 

67  71  // identify number of iterations: 

72 
 
68  73  N = (sequence_lengths specified) ? sequence_lengths[batch] : max_sequence_length; 

69 
 
70  74  // execute loop 

71  for (int t = 0; t < N; ++t) {  
72  75  // generate the scaninput elements: the notation T 

73  76  // of rank one less than T obtained by indexing T at position t along axis k. 

74  77  si_1 = (scan_1 

75  78  ... ; 

76  79  si_m = (scan_m 

77  80  // execute loopbody 

78  81  st_1, ..., st_n, so_1, ..., so_k = loopbody(st_1, ..., st_n, si_1, ..., si_m) 

79  82  // accumulate the scanoutput elements 

80  83  scan_out_1 = Concat 

81  }  
82  // accumulate the outputs for this batch:  
83  bst_1[batch] = st_1; ..., bst_n[batch] = st_n;  
84  // Note scanoutputs will have size max_sequence_length, but only first N values will be meaningful.  
85  // The remaining values have an undefined value.  
86  b_scan_out_1[batch] = scan_out_1; ...; b_scan_out_k[batch] = scan_out_k;  
87  84  }  } 
85 
 
88  86  return bst_1, ..., bst_n, b_scan_out_1, ..., b_scan_out_k; 

89  87 


90  88  *Sample usage: Encoding RNN using a Scan*  *Sample usage: Encoding RNN using a Scan* 
91  89 


92  90  The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi,  The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, 
93  91  recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hiddenstate %H_0 can  recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hiddenstate %H_0 can 
94  92  be encoded as a ScanLoop. Note that the loopbody is a nested graph, and it directly computes  be encoded as a ScanLoop. Note that the loopbody is a nested graph, and it directly computes 
95  93  %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these  %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these 
96  94  values are computed in the outer graph, they need to be passed in as extra state_variables.  values are computed in the outer graph, they need to be passed in as extra state_variables. 
97  95 


98  96  graph rnnencoding {  graph rnnencoding { 
99  97  %H_0 = ...  %H_0 = ... 
100  98  %X = ...  %X = ... 
101  99  %Y_h, %Y = Scan[body = 

102  100  return %Y, %Y_h  return %Y, %Y_h 
103  101  }  } 
104  102 


105  103  graph rnncell1 (  graph rnncell1 ( 
106  104  %H_tminus1[FLOAT, tensor]  %H_tminus1[FLOAT, tensor] 
107  105  %X_t[FLOAT, tensor]  %X_t[FLOAT, tensor] 
108  106  ) {  ) { 
109  107  %Wi = ...  %Wi = ... 
110  108  %Ri = ...  %Ri = ... 
111  109  %Wbi = ...  %Wbi = ... 
112  110  %Rbi = ...  %Rbi = ... 
113  111  %t1 = X_t * (Wi^T)  %t1 = X_t * (Wi^T) 
114  112  %t2 = H_tminus1*(Ri^T)  %t2 = H_tminus1*(Ri^T) 
115  113  %t3 = Add(%t1, %t2)  %t3 = Add(%t1, %t2) 
116  114  %t4 = Add(%t3, %Wbi)  %t4 = Add(%t3, %Wbi) 
117  115  %t5 = Add(%t4, %Rbi)  %t5 = Add(%t4, %Rbi) 
118  116  %Ht = Tanh(%t5)  %Ht = Tanh(%t5) 
119  117  %Accumulate = Identity(%Ht)  %Accumulate = Identity(%Ht) 
120  118  return %Ht, %Accumulate  return %Ht, %Accumulate 
121  119  }  } 
122  120 


123  121  **Attributes**  **Attributes** 
124  122 


125  123  * **body** (required):  * **body** (required): 
126  124  The graph run each iteration. It has N+M inputs: (loop state  The graph run each iteration. It has N+M inputs: (loop state 
127  125  variables..., scan_input_elts...). It has N+K outputs: (loop state  variables..., scan_input_elts...). It has N+K outputs: (loop state 
128  126  variables..., scan_output_elts...). Each scan_output is created by  variables..., scan_output_elts...). Each scan_output is created by 
129  127  concatenating the value of the specified scan_output_elt value at  concatenating the value of the specified scan_output_elt value at 
130  128  the end of each iteration of the loop. It is an error if the  the end of each iteration of the loop. It is an error if the 
131  129  dimensions of these values change across loop iterations.  dimensions of these values change across loop iterations. 
130  * **num_scan_inputs** (required):  
132  131  * **directions**: 

132  * **scan_input_axes**:  
133  133  An optional list of M flags. The ith element of the list specifies  An optional list of M flags. The ith element of the list specifies 
134  the axis to be scanned (the sequence axis) for the ith scan_input.  
135  If omitted, 0 will be used as the scan axis for every scan_input.  
136  * **scan_input_directions**:  
137  An optional list of M flags. The ith element of the list specifies  
134  138  the direction to be scanned for the ith scan_input tensor: 0  the direction to be scanned for the ith scan_input tensor: 0 
135  139  indicates forward direction and 1 indicates reverse direction. If  indicates forward direction and 1 indicates reverse direction. If 
136  140  omitted, all scan_input tensors will be scanned in the forward  omitted, all scan_input tensors will be scanned in the forward 
137  direction.  
138  141  * **num_scan_inputs** (required): 

139  142  An attribute specifying the number of scan_inputs M. 

140 
 
141  143  **Inputs** 

142 
 
143  144  Between 2 and 2147483647 inputs. 

144 
 
145  * **sequence_lens** (optional, heterogeneous)  **I**:  
145  along the specified axis. If omitted, 0 will be used as the scan  
146  146  Optional tensor specifying lengths of the sequences in a batch. If 

147  147  this input is not specified, all sequences are assumed to be of the 

148  An optional list of K flags, one for each scan_output. The ith  
149  element of the list specifies whether the ith scan_output should be  
150  constructed by appending or prepending a new value in each  
151  iteration: 0 indicates appending and 1 indicates prepending. If  
152  omitted, all scan_output tensors will be produced by appending a  
153  value in each iteration.  
154 
 
155  **Inputs**  
156 
 
148  157  maximum sequence length (the dimension of the sequence axis of the 

149  scan_input tensors).  
158 
 
150  159  * **initial_state_and_scan_inputs** (variadic)  **V**:  * **initial_state_and_scan_inputs** (variadic)  **V**: 
151  160  Initial values of the loop's N state variables followed by M  Initial values of the loop's N state variables followed by M 
152  161  scan_inputs  scan_inputs 
153  162 


154  163  **Outputs**  **Outputs** 
155  164 


156  165  Between 1 and 2147483647 outputs.  Between 1 and 2147483647 outputs. 
157  166 


158  167  * **final_state_and_scan_outputs** (variadic)  **V**:  * **final_state_and_scan_outputs** (variadic)  **V**: 
159  168  Final values of the loop's N state variables followed by K  Final values of the loop's N state variables followed by K 
160  169  scan_outputs  scan_outputs 
161  170 


162  171  **Type Constraints**  **Type Constraints** 
163  172 


164  * **I** in (  
165  tensor(int64)  
166  ):  
167  Int64 tensor  
168  173  * **V** in (  * **V** in ( 
169  174  tensor(bool),  tensor(bool), 
170  175  tensor(complex128),  tensor(complex128), 
171  176  tensor(complex64),  tensor(complex64), 
172  177  tensor(double),  tensor(double), 
173  178  tensor(float),  tensor(float), 
174  179  tensor(float16),  tensor(float16), 
175  180  tensor(int16),  tensor(int16), 
176  181  tensor(int32),  tensor(int32), 
177  182  tensor(int64),  tensor(int64), 
178  183  tensor(int8),  tensor(int8), 
179  184  tensor(string),  tensor(string), 
180  185  tensor(uint16),  tensor(uint16), 
181  186  tensor(uint32),  tensor(uint32), 
182  187  tensor(uint64),  tensor(uint64), 
183  188  tensor(uint8)  tensor(uint8) 
184  189  ):  ): 
185  190  All Tensor types  All Tensor types 
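The scan_input_axes attribute described above can be illustrated with a toy slicing helper (hypothetical, for intuition only): scanning along axis k yields iteration elements of rank one less than the input, taken at each position along that axis.

```python
# Hypothetical helper: slice a 2-D nested list along axis 0 vs. axis 1.
def slices_along_axis(tensor, axis):
    if axis == 0:
        return [row for row in tensor]          # one slice per row
    return [list(col) for col in zip(*tensor)]  # axis 1: one slice per column

x = [[1, 2, 3],
     [4, 5, 6]]  # shape (2, 3)

axis0 = slices_along_axis(x, 0)  # 2 iterations over elements of shape (3,)
axis1 = slices_along_axis(x, 1)  # 3 iterations over elements of shape (2,)
```

With scan_input_axes = [1], the example above would run 3 iterations, each seeing a length-2 element, instead of 2 iterations over length-3 elements.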
Scan  8#
Version
name: Scan (GitHub)
domain: main
since_version: 8
function: False
support_level: SupportType.COMMON
shape inference: True
This version of the operator has been available since version 8.
Summary
Scan can be used to iterate over one or more scan_input tensors, constructing zero or more scan_output tensors. It combines ideas from general recurrences, functional programming constructs such as scan, fold, map, and zip, and is intended to enable generalizations of RNN-like constructs for sequence-to-sequence processing. Other tensors (referred to as state_variables here) can be used to carry a state when iterating from one element to another (similar to hidden-state in RNNs, also referred to as loop-carried dependences in the context of loops). All these tensors are required to have the same shape in each iteration of the loop (a restriction imposed to enable efficient memory allocation). Many common usages involve a single scan_input tensor (where functionality similar to scan, fold and map can be obtained). When more than one scan_input is used, a behavior similar to zip is obtained.
The attribute body must be a graph, specifying the computation to be performed in every iteration. It takes as input the current values of the state_variables and the current iterated element of the scan_inputs. It must return the (updated) values of the state_variables and zero or more scan_output_element tensors. The values of the scan_output_element tensors are concatenated over all the iterations to produce the scan_output values of the scan construct (similar to the concatenated intermediate hidden-state values of RNN-like constructs).
The scan operation returns the final values of the state_variables as well as the scan_outputs.
The operation supports batching, and the batch-axis is required to be 0. When multiple scan_input tensors are used, they must all have the same batch-size, and they must all have the same maximum-sequence-length (the dimensionality of the sequence axis or scan axis). The sequence axis or scan axis is required to be 1.
The operation has an optional sequence_lens input (of shape [BATCH_SIZE]) to allow variable-length sequences of length <= the maximum-sequence-length. If this input is not specified, all sequences are assumed to be of length equal to the maximum-sequence-length. For variable-length input sequences, the scan_outputs will consist of a sequence of the same length as the input, padded to the maximum-sequence-length.
The optional attribute directions can be used to scan a sequence in the reverse direction. If this attribute is omitted, all sequences are scanned in the forward direction. A bidirectional scan may be performed by specifying the same tensor input twice in the scan_inputs, once with a forward direction, and once with a backward direction.
Note that because of the ONNX restriction that only the last parameter of an operator can be variadic, the initial-states and scan-inputs are listed together as one input parameter. Similarly, the final-states and scan-outputs are listed together as one output parameter. The attribute num_scan_inputs indicates the number M of scan-inputs.
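The packing convention above can be sketched with a small helper (hypothetical, not part of any ONNX API) that splits the combined variadic argument list using num_scan_inputs:

```python
# With N = len(inputs) - M, the first N entries of the combined variadic
# input are the initial states and the last M are the scan inputs.
def split_scan_args(inputs, num_scan_inputs):
    n = len(inputs) - num_scan_inputs
    return inputs[:n], inputs[n:]

states, scans = split_scan_args(["h0", "c0", "X"], num_scan_inputs=1)
# states == ["h0", "c0"]; scans == ["X"]
```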
The behavior of
Scan <
    num_scan_inputs = m,
    body = loop-body
> (sequence_lengths, init_1, ..., init_n, scan_1, ..., scan_m)
is equivalent to the following pseudo-code:
// T.shape[0] denotes the batch-size of T
// The batch-size of scan_1, ..., scan_m are all required to be equal
batch_size = scan_1.shape[0];

// scan_i.shape[1] denotes the (max) sequence-length of scan_i
// scan_i.shape[1] is required to be equal to scan_j.shape[1] for all i,j.
max_sequence_length = scan_1.shape[1];

for (int batch = 0; batch < batch_size; ++batch) {
    // initialize state-variables
    st_1 = init_1; ... st_n = init_n;
    // initialize scan-output variables: [] denotes an empty tensor
    scan_out_1 = []; ...; scan_out_k = [];
    // identify number of iterations:
    N = (sequence_lengths specified) ? sequence_lengths[batch] : max_sequence_length;

    // execute loop
    for (int t = 0; t < N; ++t) {
        // generate the scan-input elements: the notation T<axis=k>[t] indicates the sub-tensor
        // of rank one less than T obtained by indexing T at position t along axis k.
        si_1 = (scan_1<axis=0>[batch])<axis=1>[t];
        ... ;
        si_m = (scan_m<axis=0>[batch])<axis=1>[t];
        // execute loop-body
        st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m)
        // accumulate the scan-output elements
        scan_out_1 = Concat<axis=0>(scan_out_1, so_1); ... ; scan_out_k = Concat<axis=0>(scan_out_k, so_k);
    }
    // accumulate the outputs for this batch:
    bst_1[batch] = st_1; ..., bst_n[batch] = st_n;
    // Note scan-outputs will have size max_sequence_length, but only the first N values will be meaningful.
    // The remaining values have an undefined value.
    b_scan_out_1[batch] = scan_out_1; ...; b_scan_out_k[batch] = scan_out_k;
}
return bst_1, ..., bst_n, b_scan_out_1, ..., b_scan_out_k;
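The batched pseudo-code above can be sketched in plain Python (lists stand in for tensors; axis 0 is the batch axis and axis 1 the sequence axis; helper names are made up). One simplification: unlike the spec, this sketch does not pad variable-length scan_outputs to max_sequence_length.

```python
# Plain-Python sketch of the batched Scan-8 semantics.
def scan8(body, init_states, scan_inputs, sequence_lens=None):
    n = len(init_states)
    batch_size = len(scan_inputs[0])
    max_len = len(scan_inputs[0][0])
    bst = [[] for _ in range(n)]  # per-batch final state variables
    b_out = None                  # per-batch scan outputs
    for b in range(batch_size):
        states = list(init_states)
        iters = sequence_lens[b] if sequence_lens else max_len
        outs = None
        for t in range(iters):
            res = body(*states, *[s[b][t] for s in scan_inputs])
            states = list(res[:n])
            if outs is None:
                outs = [[] for _ in res[n:]]
            for acc, e in zip(outs, res[n:]):
                acc.append(e)     # accumulate scan-output elements
        for acc, s in zip(bst, states):
            acc.append(s)         # accumulate final states for this batch
        if b_out is None:
            b_out = [[] for _ in outs]
        for acc, o in zip(b_out, outs):
            acc.append(o)
    return bst + b_out

def running_sum(st, x):
    return st + x, st + x

finals, outs = scan8(running_sum, [0], [[[1, 2, 3], [4, 5, 6]]])
# finals == [6, 15]; outs == [[1, 3, 6], [4, 9, 15]]
```

Passing sequence_lens shortens individual batches, matching the N = sequence_lengths[batch] branch in the pseudo-code.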
Sample usage: Encoding RNN using a Scan
The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can be encoded as a Scan-Loop. Note that the loop-body is a nested graph, and it directly computes %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these values are computed in the outer graph, they need to be passed in as extra state_variables.
graph rnn-encoding {
    %H_0 = ...
    %X = ...
    %Y_h, %Y = Scan[body = <graph rnn-cell-1>, num_scan_inputs=1]("", %H_0, %X)
    return %Y, %Y_h
}

graph rnn-cell-1 (
    %H_tminus1[FLOAT, tensor]
    %X_t[FLOAT, tensor]
) {
    %Wi = ...
    %Ri = ...
    %Wbi = ...
    %Rbi = ...
    %t1 = X_t * (Wi^T)
    %t2 = H_tminus1*(Ri^T)
    %t3 = Add(%t1, %t2)
    %t4 = Add(%t3, %Wbi)
    %t5 = Add(%t4, %Rbi)
    %Ht = Tanh(%t5)
    %Accumulate = Identity(%Ht)
    return %Ht, %Accumulate
}
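For intuition, the rnn-cell-1 body can be reduced to scalars in plain Python. The weight values below are made up, and the tensor matmuls become scalar multiplies; this is a sketch of the cell's dataflow, not a faithful RNN.

```python
import math

# Hypothetical scalar weights/biases standing in for %Wi, %Ri, %Wbi, %Rbi.
Wi, Ri, Wbi, Rbi = 0.5, 0.25, 0.1, 0.0

def rnn_cell(h_prev, x_t):
    t1 = x_t * Wi                 # %t1 = X_t * (Wi^T)
    t2 = h_prev * Ri              # %t2 = H_tminus1 * (Ri^T)
    h_t = math.tanh(t1 + t2 + Wbi + Rbi)  # %Ht = Tanh(%t5)
    return h_t, h_t               # updated state and scan_output_element

# Unrolled over a short sequence, accumulating like the Scan's scan_output:
h, acc = 0.0, []
for x in [1.0, -1.0, 0.5]:
    h, out = rnn_cell(h, x)
    acc.append(out)
```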
Attributes
body (required): The graph run each iteration. It has N+M inputs: (loop state variables…, scan_input_elts…). It has N+K outputs: (loop state variables…, scan_output_elts…). Each scan_output is created by concatenating the value of the specified scan_output_elt value at the end of each iteration of the loop. It is an error if the dimensions of these values change across loop iterations.
directions: An optional list of M flags. The ith element of the list specifies the direction to be scanned for the ith scan_input tensor: 0 indicates forward direction and 1 indicates reverse direction. If omitted, all scan_input tensors will be scanned in the forward direction.
num_scan_inputs (required): An attribute specifying the number of scan_inputs M.
Inputs
Between 2 and 2147483647 inputs.
sequence_lens (optional, heterogeneous)  I: Optional tensor specifying lengths of the sequences in a batch. If this input is not specified, all sequences are assumed to be of the maximum sequence length (the dimension of the sequence axis of the scan_input tensors).
initial_state_and_scan_inputs (variadic)  V: Initial values of the loop’s N state variables followed by M scan_inputs
Outputs
Between 1 and 2147483647 outputs.
final_state_and_scan_outputs (variadic)  V: Final values of the loop’s N state variables followed by K scan_outputs
Type Constraints
I in ( tensor(int64) ): Int64 tensor
V in ( tensor(bool), tensor(complex128), tensor(complex64), tensor(double), tensor(float), tensor(float16), tensor(int16), tensor(int32), tensor(int64), tensor(int8), tensor(string), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint8) ): All Tensor types