Generate Experiment Using Deep Network Designer
This example shows how to use Experiment Manager to tune the hyperparameters of a regression network trained in Deep Network Designer.
You can use Deep Network Designer to create a network, import data, and train the network. You can then use Experiment Manager to sweep through a range of hyperparameter values and find the optimal training options.
Load the digit sample data. The digits data set consists of 10,000 synthetic grayscale images of handwritten digits. Each image is 28-by-28-by-1 pixels and has been rotated by a certain angle. You can use deep learning to train a regression model to predict the rotation angle of each image.
Load the digit images and their corresponding angles of rotation.
[XTrain,~,anglesTrain] = digitTrain4DArrayData;
[XValidation,~,anglesValidation] = digitTest4DArrayData;
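To train a regression network using Deep Network Designer, the data must be in a datastore. Convert the images and angles to arrayDatastore objects. Because the images are stacked along the fourth dimension of XTrain, set IterationDimension to 4 so that each read returns one image.
adsXTrain = arrayDatastore(XTrain,IterationDimension=4);
adsAnglesTrain = arrayDatastore(anglesTrain);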
To pass the images and angles from both datastores into a deep learning network, combine them using the combine function. The input and target datastores are combined by horizontally concatenating the data returned by the read function of each datastore.
cdsTrain = combine(adsXTrain,adsAnglesTrain);
Repeat the data processing steps on the validation data.
adsXValidation = arrayDatastore(XValidation,IterationDimension=4);
adsAnglesValidation = arrayDatastore(anglesValidation);
cdsValidation = combine(adsXValidation,adsAnglesValidation);
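To see what a combined datastore hands to the network, you can read one observation from it. The following is a minimal sketch that uses small random arrays as stand-ins for the digit data, so it runs without the sample data set. The names XDemo, anglesDemo, and cdsDemo are hypothetical and do not appear in the example itself.

```matlab
% Hypothetical stand-in data: 10 random 28-by-28-by-1 "images" and 10 angles.
XDemo = rand(28,28,1,10);
anglesDemo = 90*rand(10,1) - 45;

% Iterate over the fourth dimension so that each read returns one image.
adsX = arrayDatastore(XDemo,IterationDimension=4);
adsAngles = arrayDatastore(anglesDemo);
cdsDemo = combine(adsX,adsAngles);

% Each read concatenates the outputs of the two underlying datastores
% horizontally, giving a 1-by-2 cell array: {image, angle}.
sample = read(cdsDemo);
disp(size(sample))
disp(size(sample{1}))
disp(sample{2})
```

The first cell holds the predictor (one image) and the second holds the response (one angle), which is the column layout that a regression network with a single image input expects from a datastore.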
Define Network Architecture
Define the network architecture. You can build this network interactively by dragging layers in Deep Network Designer. Alternatively, you can create this network at the command line and import it into Deep Network Designer.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,8,Padding="same")
    batchNormalizationLayer
    reluLayer
    averagePooling2dLayer(2,Stride=2)
    convolution2dLayer(3,16,Padding="same")
    batchNormalizationLayer
    reluLayer
    averagePooling2dLayer(2,Stride=2)
    convolution2dLayer(3,32,Padding="same")
    batchNormalizationLayer
    reluLayer
    convolution2dLayer(3,32,Padding="same")
    batchNormalizationLayer
    reluLayer
    dropoutLayer(0.2)
    fullyConnectedLayer(1)
    regressionLayer];
deepNetworkDesigner(layers)
Import the digits data into Deep Network Designer. Select the Data tab and click Import Data > Import Datastore. Select cdsTrain as the training data and cdsValidation as the validation data.
Import the data by clicking Import.
Specify the training options and train the network. On the Training tab, click Training Options. For this example, set the solver to adam and keep the other default settings. Set the training options by clicking Close.
Train the network using the imported data and the specified training options by clicking Train. The training progress plot shows the mini-batch loss and root mean squared error (RMSE) as well as the validation loss and error.
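For comparison, a similar training run can be approximated at the command line with trainingOptions and trainNetwork. This is a rough sketch, not the code that Deep Network Designer runs. It assumes Deep Learning Toolbox is installed, uses random stand-in data and a deliberately small network (XDemo, anglesDemo, cdsDemo, layersDemo, and netDemo are hypothetical names), and shortens training to one epoch so that it finishes quickly.

```matlab
% Hypothetical stand-in data so the sketch runs on its own.
XDemo = rand(28,28,1,64);
anglesDemo = 90*rand(64,1) - 45;
cdsDemo = combine(arrayDatastore(XDemo,IterationDimension=4), ...
    arrayDatastore(anglesDemo));

% A deliberately small regression network. To reproduce the example,
% substitute the layers array defined earlier.
layersDemo = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,8,Padding="same")
    reluLayer
    fullyConnectedLayer(1)
    regressionLayer];

% Adam solver, as selected in the app. MaxEpochs, MiniBatchSize, and
% Verbose are changed here only to keep the sketch short and quiet.
options = trainingOptions("adam", ...
    MaxEpochs=1, ...
    MiniBatchSize=16, ...
    Verbose=false);

% Train on the combined datastore, then predict one angle per image.
netDemo = trainNetwork(cdsDemo,layersDemo,options);
YPred = predict(netDemo,XDemo);
disp(size(YPred))
```

With real data you would typically also pass ValidationData and set Plots="training-progress" in trainingOptions to get a plot similar to the one in the app.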
Once training is complete, you can generate an experiment to sweep through a range of hyperparameter values to find the optimal training options.
To generate an experiment, on the Training tab, click Export > Create Experiment.
Deep Network Designer generates an experiment template using your network and imported data. The app then opens Experiment Manager. In Experiment Manager, you can choose to add the new experiment to a new project, an existing project, or the current project.
Experiments consist of a description, a table of hyperparameters, a setup function, and a collection of metric functions to evaluate the results of the experiment.
The Hyperparameters section specifies the strategy (Exhaustive Sweep) and hyperparameter values to use for the experiment. When you run the experiment, Experiment Manager trains the network using every combination of hyperparameter values specified in the hyperparameter table. By default, Deep Network Designer generates an experiment to sweep over a range of learning rates centered on the learning rate you used to train.
The Setup Function configures the training data, network architecture, and training options for the experiment. Deep Network Designer automatically configures the setup function to use your network and data. The input to the setup function is a structure params with fields from the hyperparameter table. To view or edit the setup function, under Setup Function, click Edit.
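The general shape of such a setup function can be sketched as follows. This is an illustration under stated assumptions, not the code that Deep Network Designer generates. The function name demoSetup, the hyperparameter name myInitialLearnRate, and the random stand-in data are all hypothetical, and the three-output signature assumes a built-in training experiment whose outputs are passed on to trainNetwork. It requires Deep Learning Toolbox.

```matlab
% Call the sketch with one hypothetical hyperparameter value, much as
% Experiment Manager would for a single trial.
params = struct("myInitialLearnRate",0.01);
[dsTrain,layers,options] = demoSetup(params);
disp(options.InitialLearnRate)

function [dsTrain,layers,options] = demoSetup(params)
% demoSetup is a hypothetical setup function. It returns the training
% data, the network layers, and the training options for one trial.

% Training data: random stand-in arrays in place of the digit data.
X = rand(28,28,1,32);
angles = 90*rand(32,1) - 45;
dsTrain = combine(arrayDatastore(X,IterationDimension=4), ...
    arrayDatastore(angles));

% Network architecture: a small regression network for illustration.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,8,Padding="same")
    reluLayer
    fullyConnectedLayer(1)
    regressionLayer];

% Training options: the learning rate comes from the params structure,
% so each trial can use a different value.
options = trainingOptions("adam", ...
    InitialLearnRate=params.myInitialLearnRate);
end
```

The point of the pattern is that anything you want to sweep is read from params rather than hard-coded, so the hyperparameter table alone determines how the trials differ.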
If your network contains custom layers or the training options contain a relative checkpoint path, Deep Network Designer generates supporting functions in the experiment setup script. You must check and edit these supporting functions before running the experiment.
In Experiment Manager, run the experiment by clicking Run. When you run the experiment, Experiment Manager trains the network defined by the setup function. Each trial uses one of the learning rates specified in the hyperparameter table.
While the experiment is running, click Training Plot to display the training plot and track the progress of each trial.
A table of results displays the RMSE and loss for each trial. When the experiment finishes, you can sort the trials by the RMSE or loss metrics to see which trial performs the best. In this example, trial 3 with an initial learning rate of 0.01 performs the best.
To add another hyperparameter to sweep over, you must add it to the Hyperparameters table and update the setup function.
Add another hyperparameter to the Hyperparameters table by clicking Add. For this example, add a new hyperparameter called mySolver with the values ["adam" "sgdm" "rmsprop"].
Next, edit the setup function to sweep over the new hyperparameter. To edit the setup function, under Setup Function, click Edit. In the Training Options section of the live script, change the first argument of the trainingOptions function from "adam" to params.mySolver. Click Save and close the setup function.
Run the updated experiment by clicking Run. Experiment Manager tries every combination of the learning rate and solver hyperparameters. In this example, trial 5 with an initial learning rate of 0.001 and an SGDM solver performs the best.
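To make "every combination" concrete, the sketch below enumerates the grid that an exhaustive sweep over two hyperparameters covers and builds training options for each row. The learning-rate values and the name myInitialLearnRate are hypothetical; only the solver names come from this example. The row order is not claimed to match the trial numbering in Experiment Manager. The trainingOptions call requires Deep Learning Toolbox.

```matlab
% Hypothetical learning-rate values; the solver names come from the text.
learnRates = [0.001 0.01 0.1];
solvers = ["adam" "sgdm" "rmsprop"];

% Build every (learning rate, solver) pair.
[rateIdx,solverIdx] = ndgrid(1:numel(learnRates),1:numel(solvers));
trials = table(reshape(learnRates(rateIdx),[],1), ...
    reshape(solvers(solverIdx),[],1), ...
    VariableNames=["myInitialLearnRate" "mySolver"]);
disp(trials)

% Each row plays the role of the params structure for one trial.
for k = 1:height(trials)
    params = table2struct(trials(k,:));
    % The solver is now the first argument, taken from params.
    options = trainingOptions(params.mySolver, ...
        InitialLearnRate=params.myInitialLearnRate);
end
```

With three learning rates and three solvers, the sweep has nine trials, so each hyperparameter you add multiplies the number of networks to train.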