Training an ONNX model using an external Python script file

An external Python script file may be used to train a model outside of Breeze. The result is then imported into Breeze and can be used like any model trained inside Breeze.

TIP On Windows, you can install Python and the dependencies needed to run the example files as part of the Breeze installation. This creates a local Python venv at BREEZE_INSTALL_LOCATION\python.

On Linux and macOS, create a Python venv manually at BREEZE_INSTALL_LOCATION/python. To use the built-in example scripts, install the dependencies listed in BREEZE_INSTALL_LOCATION/Runtime/ExampleCode/TrainPythonModel/requirements.txt.

Execution of script file

TEXT

[python executable|py|python] SCRIPT_FILE TRAINING_CSV TEST_CSV ONNX_DESTINATION TRAINING_INFORMATION_FILE

You can select a custom Python interpreter when training the model using the “Browse interpreter” option in the script file selection dialog. If no interpreter has been selected, py is used on Windows and python is used on macOS and Linux.
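To debug a script outside of Breeze, it can help to assemble the same command line by hand. The sketch below follows the documented argument order and interpreter fallback; the helper names `default_interpreter` and `build_command` are illustrative and not part of Breeze, and how Breeze itself launches the script internally is not described here.

```python
import platform


def default_interpreter():
    """Return the documented fallback interpreter for the current OS."""
    # "py" on Windows, "python" on macOS and Linux, as described above.
    return "py" if platform.system() == "Windows" else "python"


def build_command(script, train_csv, test_csv, onnx_dest, info_file, interpreter=None):
    """Assemble the argument list in the documented positional order."""
    return [
        interpreter or default_interpreter(),
        script,
        train_csv,
        test_csv,
        onnx_dest,
        info_file,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` with real file paths to run the script manually.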

The standard output is shown in a log window in Breeze during the execution of the script.
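Python may buffer standard output when it is not attached to a terminal, so progress messages can arrive late in a log view. A minimal sketch of a logging helper that flushes each line is shown below; the helper name `log` is hypothetical, and whether Breeze also displays standard error is not stated above.

```python
import time


def log(message):
    """Print a timestamped progress line and flush it immediately."""
    print(f"[{time.strftime('%H:%M:%S')}] {message}", flush=True)


log("Loading training data")
log("Training finished")
```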

Requirements

A valid default python environment (with all relevant dependencies installed)
Consumption of 4 positional arguments (see below)
Creation of 2 files
- ONNX model file
  - With valid opsets for all relevant dependencies
- JSON file in specific format (see below)
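As a sanity check at the end of a script, the two required files can be verified before the script exits. This is a minimal sketch using only the standard library; `check_outputs` is a hypothetical helper, and it only checks that the files exist and that the JSON has a non-empty "Results" list, not that the ONNX model is valid.

```python
import json
import os


def check_outputs(onnx_path, info_path):
    """Return a list of problems found with the two expected output files."""
    problems = []
    # The ONNX file must exist and contain data.
    if not os.path.isfile(onnx_path) or os.path.getsize(onnx_path) == 0:
        problems.append(f"ONNX model missing or empty: {onnx_path}")
    # The JSON file must be readable and contain at least one result.
    try:
        with open(info_path, "r", encoding="utf-8") as f:
            info = json.load(f)
        if not isinstance(info, dict) or not info.get("Results"):
            problems.append("JSON file has no 'Results' entries")
    except (OSError, ValueError) as exc:
        problems.append(f"Training information file unreadable: {exc}")
    return problems
```

An empty list means both files passed these basic checks.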

Positional arguments

TRAINING_CSV - The absolute path to a csv file containing the label and wavelength information for the training data set.
TEST_CSV - The absolute path to a csv file containing the label and wavelength information for the test data set.
ONNX_DESTINATION - The path of the .onnx file that the script must create during execution.
- Will be in Breeze Temp folder
- Does not exist when the script is executed and must be created
TRAINING_INFORMATION_FILE - The path of the JSON file to which the script must write the training results.
- Will be in Breeze Temp folder
- Does not exist when the script is executed and must be created
- See format below
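The arguments described above can be validated early so that a misconfigured run fails with a clear message. The sketch below is one possible approach; `BreezeArgs` and `parse_breeze_args` are hypothetical names. Only the training CSV is checked for existence, since it is not stated above what is passed as TEST_CSV when no test set is available.

```python
import os
from collections import namedtuple

BreezeArgs = namedtuple("BreezeArgs", "train_csv test_csv onnx_destination info_file")


def parse_breeze_args(argv):
    """Read the last four entries of argv as the documented positional arguments."""
    # argv[0] is the script itself, so at least five entries are needed.
    if len(argv) < 5:
        raise SystemExit(
            "usage: SCRIPT_FILE TRAINING_CSV TEST_CSV ONNX_DESTINATION TRAINING_INFORMATION_FILE"
        )
    args = BreezeArgs(*argv[-4:])
    if not os.path.isfile(args.train_csv):
        raise SystemExit(f"Training CSV not found: {args.train_csv}")
    return args
```

In a script this would typically be called as `args = parse_breeze_args(sys.argv)`.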

Training data

The training and test data are in CSV format with semicolon (;) separators. Below is an example of an SNV-treated spectrum: the row starts with the true class label (here 1), followed by the treated wavelength data.

TEXT

1;1.2151460647583008;1.9284714460372925;....;N

The format will be the same for the test data set.
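A minimal sketch of reading such a file with the standard library is shown below. It assumes there is no header row and that every value after the label is numeric; both are assumptions based on the example row. The function name `load_spectra` is hypothetical, and labels are kept as strings because their exact type is not specified here.

```python
import csv


def load_spectra(path):
    """Read a semicolon-separated file into (labels, spectra)."""
    labels, spectra = [], []
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter=";"):
            if not row:
                continue  # skip blank lines
            labels.append(row[0])                        # first column: class label
            spectra.append([float(v) for v in row[1:]])  # remaining columns: spectrum
    return labels, spectra
```

The same function can be used for both the training and the test file, since the format is identical.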

Example script file

Below is a simple neural network defined using TensorFlow and Keras (see attachments for complete script file).

The script requires Python version 3.7 to 3.10.

File	Modified
HSI-classification.py	2024-09-18
requirements.txt	2024-09-18
Utils.py	2024-09-18

Additional example scripts can be found in the Breeze Runtime install directory: Breeze\Runtime\ExampleCode\TrainPythonModel

The positional arguments are read into variables using:

PY

import sys

# The four arguments are always the last four entries of sys.argv
TRAIN_DATA = sys.argv[-4]
TEST_DATA = sys.argv[-3]
ONNX_DESTINATION = sys.argv[-2]
TRAINING_INFORMATION_FILE = sys.argv[-1]

The model definition must follow the import format (see Import of ONNX model).

PY

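from tensorflow import keras

# wls, number_of_classes, classes and the custom Lookup layer are defined
# elsewhere in the complete script (see attachments)
inputs = keras.Input(shape=(wls,), name="Features")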

x = inputs
x = keras.layers.Dense(units=16, activation="tanh")(x)
x = keras.layers.Dense(units=16, activation="tanh")(x)
x = keras.layers.Dense(units=16, activation="tanh")(x)
# x = keras.layers.Dropout(0.5)(x) # add noise
y1 = keras.layers.Dense(number_of_classes, activation="softmax", name="Score.output")(x)

# Custom layer for argmax value to Breeze class label
y2 = Lookup(classes)(y1)

model = keras.Model(inputs, [y1, y2])
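The second output maps the highest-scoring position of the softmax output to a class label. The plain Python sketch below only illustrates that mapping; it is not the implementation of the Lookup layer in the attached script, and the function name and example class labels are hypothetical.

```python
def scores_to_label(scores, classes):
    """Return the class label at the position of the highest score."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return classes[best_index]


# Hypothetical example: three classes with Breeze labels 1, 2 and 5
label = scores_to_label([0.1, 0.7, 0.2], [1, 2, 5])
print(label)  # prints 2
```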

Once the model is trained, the conversion to ONNX format should be done in the script.

PY

import tf2onnx
from tf2onnx import constants, utils

# Override the output shape of the lookup node
shape_dict = {
    "model/lookup/None_Lookup/LookupTableFindV2:0": [None, 1]
}

tf2onnx.convert.from_keras(
    model,
    opset=15,
    output_path=ONNX_DESTINATION,
    shape_override=shape_dict,
    extra_opset=[utils.make_opsetid(constants.AI_ONNX_ML_DOMAIN, 1)],
)

The training information is then written to the destination JSON file. The write_training_results function in the Utils script can be used for this.

PY

import json

# All values are written as strings
modelTrainingInfo = {
    "Accuracy": str(score_training_data[-1]),
    "AccuracyTest": str(score_test_data[-1]) if score_test_data is not None else "-1",
    "RuntimeInSeconds": str(end_time - start_time),
    "AlgorithmName": "External",
    "DisplayName": "Neural Network",
}
wrapper = {
    "Results": [modelTrainingInfo]
}

with open(TRAINING_INFORMATION_FILE, "w") as f:
    json.dump(wrapper, f)

Expected result

Breeze shows a model evaluation table with the results from TRAINING_INFORMATION_FILE. After the training outside of Breeze, the ONNX model file is imported and the test set (or the training set if there is no test set) is applied.

`TRAINING_INFORMATION_FILE` format

JSON

{
    "Results": [
        {
            "Accuracy": "0.9798425436019897",         # Training result
            "AccuracyTest": "-1",                     # Result from test set (if applicable else "-1")
            "CrossValidationResults": "-1",           # Result from cross validation (if applicable else "-1")
            "RuntimeInSeconds": "11.319475650787354", # Runtime in seconds
            "DisplayName": "Neural Network"           # Any non-empty string - to be displayed in table
        }
    ]
}

Any number of Results may be added to the JSON file. The first result will be marked as used in Breeze for the purpose of presentation.
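Writing several results could look like the sketch below. The helpers `make_result` and `write_results` are hypothetical and are not the write_training_results function from the Utils script; the keys follow the format described above, with "-1" used where a value does not apply.

```python
import json


def make_result(display_name, accuracy, runtime_seconds, accuracy_test=None, cv_result=None):
    """Build one result entry with all values stored as strings."""
    return {
        "Accuracy": str(accuracy),
        "AccuracyTest": str(accuracy_test) if accuracy_test is not None else "-1",
        "CrossValidationResults": str(cv_result) if cv_result is not None else "-1",
        "RuntimeInSeconds": str(runtime_seconds),
        "DisplayName": display_name,
    }


def write_results(path, results):
    """Write the results list wrapped in the expected top-level object."""
    for result in results:
        # DisplayName must be a non-empty string according to the format
        if not result.get("DisplayName"):
            raise ValueError("DisplayName must be a non-empty string")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"Results": list(results)}, f, indent=4)
```

Since the first result is the one marked as used, the preferred model should be placed first in the list.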