External Data#
Loading an ONNX Model with External Data#
If the external data is in the same directory as the model (the default case), simply use onnx.load():
import onnx
onnx_model = onnx.load("path/to/the/model.onnx")
If the external data is in a different directory, first call onnx.load() with load_external_data=False, then use load_external_data_for_model() to specify the directory path and load the data:
import onnx
from onnx.external_data_helper import load_external_data_for_model
onnx_model = onnx.load("path/to/the/model.onnx", load_external_data=False)
load_external_data_for_model(onnx_model, "data/directory/path/")
# onnx_model now holds the external data loaded from the specified directory
Converting an ONNX Model to External Data#
import onnx
from onnx.external_data_helper import convert_model_to_external_data
onnx_model = ... # Your model in memory as ModelProto
convert_model_to_external_data(onnx_model, all_tensors_to_one_file=True, location="filename", size_threshold=1024, convert_attribute=False)
# Must be followed by save_model to save the converted model to a specific path
onnx.save_model(onnx_model, "path/to/save/the/model.onnx")
# The model's raw data is now stored as external data, saved alongside the model at the given path
Converting and Saving an ONNX Model to External Data#
import onnx
onnx_model = ... # Your model in memory as ModelProto
onnx.save_model(onnx_model, "path/to/save/the/model.onnx", save_as_external_data=True, all_tensors_to_one_file=True, location="filename", size_threshold=1024, convert_attribute=False)
# The model's raw data is now stored as external data, saved alongside the model at the given path
onnx.checker for Models with External Data#
Models with External Data (<2GB)#
The current checker supports checking models with external data. Pass either the loaded ONNX model or the model path to the checker.
Large models >2GB#
For models larger than 2GB, however, pass the model path to onnx.checker; the external data must be in the same directory as the model.
import onnx
onnx.checker.check_model("path/to/the/model.onnx")
# onnx.checker.check_model(loaded_onnx_model) will fail if given >2GB model
TensorProto: data_location and external_data fields#
Two fields in the TensorProto message type relate to external data.
data_location field#
The data_location field stores the location of data for this tensor. Value MUST be one of:
MESSAGE - data stored in type-specific fields inside the protobuf message.
RAW - data stored in raw_data field.
EXTERNAL - data stored in an external location as described by external_data field.
value not set - legacy value. Assume data is stored in raw_data (if set) otherwise in message.
external_data field#
The external_data field stores key-value pairs of strings describing the data location.
Recognized keys are:
"location" (required) - file path relative to the filesystem directory where the ONNX protobuf model was stored. Up-directory path components such as .. are disallowed and should be stripped when parsing.
"offset" (optional) - position of byte at which stored data begins. Integer stored as string. Offset values SHOULD be multiples of 4096 (page size) to enable mmap support.
"length" (optional) - number of bytes containing data. Integer stored as string.
"checksum" (optional) - SHA1 digest of the file specified under the "location" key.
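To illustrate how these keys fit together, the standard-library sketch below packs several byte payloads into one file at 4096-aligned offsets and reads one back. It does not use the onnx package; append_aligned, read_entry, and sha1_of_file are hypothetical helpers, and hex encoding of the digest is an assumption, since the text above does not specify an encoding.

```python
import hashlib
import os
import tempfile

PAGE_SIZE = 4096  # recommended offset alignment


def append_aligned(path, payload):
    """Append payload at the next multiple of PAGE_SIZE; return external_data-style entries."""
    size = os.path.getsize(path) if os.path.exists(path) else 0
    padding = (-size) % PAGE_SIZE
    with open(path, "ab") as f:
        f.write(b"\0" * padding)
        f.write(payload)
    # All values are strings, as the field stores string key-value pairs.
    return {
        "location": os.path.basename(path),
        "offset": str(size + padding),
        "length": str(len(payload)),
    }


def read_entry(base_dir, entry):
    """Read the bytes described by an entry; offset and length are optional."""
    with open(os.path.join(base_dir, entry["location"]), "rb") as f:
        f.seek(int(entry.get("offset", "0")))
        length = entry.get("length")
        return f.read(int(length)) if length is not None else f.read()


def sha1_of_file(path):
    """SHA1 digest of the whole file (hex encoding is an assumption)."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "weights.bin")
    first = append_aligned(data_path, b"a" * 10)
    second = append_aligned(data_path, b"b" * 5000)
    third = append_aligned(data_path, b"c" * 3)
    checksum = sha1_of_file(data_path)
    roundtrip = read_entry(tmp, second)
```

Padding each payload up to the next page boundary wastes a little space but lets a reader memory-map each tensor directly.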
After an ONNX file is loaded, all external_data fields may be updated with an additional key ("basepath"), which stores the path to the directory from which the ONNX model file was loaded.
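A reader can combine "basepath" and "location" into a concrete file path. The sketch below follows the wording above by stripping ".." components; resolve_location is a hypothetical helper, and rejecting absolute locations is an added assumption based on the requirement that the path be relative. An actual loader may choose to reject such paths outright instead of stripping them.

```python
import os
from pathlib import PurePosixPath


def resolve_location(basepath, location):
    """Hypothetical helper: join basepath and location, dropping any '..' components."""
    path = PurePosixPath(location)
    if path.is_absolute():
        raise ValueError("location must be a relative path")
    parts = [part for part in path.parts if part != ".."]
    if not parts:
        raise ValueError("location is empty after stripping '..' components")
    return os.path.join(basepath, *parts)


plain = resolve_location("/models/bert", "weights.bin")
stripped = resolve_location("/models/bert", "../../weights.bin")
```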
External data files#
Data stored in external data files uses the same binary byte-string format as the raw_data field in current ONNX implementations.
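As a small illustration of that format, the sketch below writes three float32 values to a file and decodes them again using only the standard library. It assumes the fixed-width, little-endian layout that raw_data uses for float tensors; the file name and values are hypothetical.

```python
import os
import struct
import tempfile

values = [1.0, 2.5, -3.0]

# Three float32 values, little-endian, 4 bytes each.
payload = struct.pack(f"<{len(values)}f", *values)

with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "tensor.bin")
    with open(data_path, "wb") as f:
        f.write(payload)
    with open(data_path, "rb") as f:
        stored = f.read()

# The file contents are exactly the bytes that raw_data would hold.
decoded = list(struct.unpack(f"<{len(stored) // 4}f", stored))
```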
Reference https://github.com/onnx/onnx/pull/678