com.microsoft - SoftmaxCrossEntropyGrad
This version of the operator has been available since version 1 of domain com.microsoft.
reduction: Type of reduction to apply to the loss: none, sum, or mean (default). 'none': the output is the loss for each sample in the batch. 'sum': the per-sample losses are summed. 'mean': the sum of the per-sample losses is divided by batch_size.
dY (heterogeneous) - T: gradient of Y
log_prob (heterogeneous) - T: logsoftmax(logits), N-D input of shape (-1, num_classes).
label (heterogeneous) - T: The one-hot label, an N-D input with the same shape as logits.
d_logits (heterogeneous) - T: gradient of logits
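
The relationship between these inputs and d_logits can be sketched in NumPy. This is an illustrative reference derived from the standard softmax cross-entropy derivative (softmax(logits) - label), not the ONNX Runtime kernel itself. The function name `softmax_cross_entropy_grad` is hypothetical, and the sketch assumes 2-D inputs of shape (batch_size, num_classes), label rows that sum to 1, a scalar dY for 'sum' and 'mean', and a per-sample dY of shape (batch_size,) for 'none'.

```python
import numpy as np


def softmax_cross_entropy_grad(dY, log_prob, label, reduction="mean"):
    """Illustrative gradient of softmax cross-entropy w.r.t. logits.

    Assumptions (not taken from the operator spec): inputs are 2-D with
    shape (batch_size, num_classes) and each label row sums to 1.
    """
    # Recover softmax probabilities from the log-softmax input.
    prob = np.exp(log_prob)
    # Per-sample derivative of -sum(label * log_softmax(logits)).
    grad = prob - label

    if reduction == "none":
        # One upstream gradient per sample, broadcast across classes.
        return grad * np.asarray(dY).reshape(-1, 1)
    if reduction == "sum":
        # A single scalar upstream gradient scales every entry.
        return grad * dY
    if reduction == "mean":
        # As 'sum', additionally divided by the batch size.
        return grad * dY / log_prob.shape[0]
    raise ValueError(f"unsupported reduction: {reduction!r}")


# Example: uniform predictions over 3 classes for a batch of 2.
log_prob = np.log(np.full((2, 3), 1.0 / 3.0))
label = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
d_logits = softmax_cross_entropy_grad(1.0, log_prob, label, reduction="mean")
```

Because each row of softmax probabilities and each one-hot label row both sum to 1, every row of d_logits sums to zero under these assumptions, which is a quick sanity check for any implementation.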