[bug?] R2018a trainNetwork accuracy suddenly dropped with multi-gpu
When I use trainNetwork, the accuracy suddenly drops and never recovers. These are my training options:
options = trainingOptions('sgdm', ...
    'Momentum', 0.9, ...
    'InitialLearnRate', 1e-3, ...
    'L2Regularization', 0.0005, ...
    'MaxEpochs', 20000, ...
    'MiniBatchSize', 4, ...
    'Shuffle', 'every-epoch', ...
    'CheckpointPath', newdirstoragetraicheckpoint, ...
    'ExecutionEnvironment', 'multi-gpu', ...
    'Plots', 'training-progress', ...
    'VerboseFrequency', 2);
| Epoch | Iteration | Time Elapsed | Mini-batch Accuracy | Mini-batch Loss | Base Learning Rate |
| 2320 | 39426 | 10:23:50 | 82.76% | 0.2362 | 0.0010 |
| 2320 | 39428 | 10:23:52 | 83.29% | 0.2832 | 0.0010 |
| 2320 | 39430 | 10:23:54 | 25.52% | 3.1097 | 0.0010 |
| 2320 | 39432 | 10:23:56 | 27.04% | 3.0014 | 0.0010 |
| 2320 | 39434 | 10:23:58 | 23.22% | 2.9561 | 0.0010 |
I never had this issue in 2017b, so I suspect it has something to do with the new trainNetwork in 2018a. One thing I noticed is that 2017b did not offer 'multi-gpu' for 'ExecutionEnvironment'. Could this be the reason? I am currently rerunning the same script in 2017b with 'ExecutionEnvironment' set to 'gpu' to see whether the drop occurs there too.
2 comments
Joss Knight
on 14 Apr 2018
Nothing obvious changed in the multi-gpu training between R2017b and R2018a, although NCCL was upgraded. What happens when you take the most recent checkpoint saved before the loss jumped and pass the layers from that network back into training? Does the same thing happen?
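A minimal sketch of that checkpoint experiment, assuming the checkpoint is a MAT-file containing a variable named net (the form trainNetwork writes to 'CheckpointPath'). The file name and the training data variable imdsTrain are placeholders that do not appear in the original post; options and newdirstoragetraicheckpoint are the ones defined in the question.

```matlab
% Placeholder file name: substitute the last checkpoint written before the drop.
checkpointFile = fullfile(newdirstoragetraicheckpoint, 'net_checkpoint__XXXX.mat');

% Checkpoints written by trainNetwork store the network in a variable called "net".
S = load(checkpointFile, 'net');

% Restart training from the checkpointed layers, which carry the learned weights.
% imdsTrain is a hypothetical stand-in for the original training data.
% If S.net is a DAGNetwork, pass layerGraph(S.net) instead of S.net.Layers.
netResumed = trainNetwork(imdsTrain, S.net.Layers, options);
```

Restarting this way begins a fresh training run from the saved weights, so the epoch and iteration counters start over. If the accuracy collapses again shortly after the restart, that would suggest the instability is reproducible from this point rather than a one-off.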
This sort of behaviour isn't unheard of, because the loss landscape can be non-smooth near the solution and you can suddenly step to a bad solution with no means of escaping the local minimum. You may have been unlucky and this will never happen again. Try lowering the learn rate or use a learn rate drop schedule to ensure the learn rate is lower when you reach this unstable region.
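A learn rate drop schedule could be sketched as below. The options are those from the question plus the 'piecewise' schedule settings of trainingOptions; the drop factor and drop period are assumed values chosen only for illustration and would need tuning for the actual problem.

```matlab
options = trainingOptions('sgdm', ...
    'Momentum', 0.9, ...
    'InitialLearnRate', 1e-3, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.1, ...   assumed value: multiply the learn rate by 0.1 at each drop
    'LearnRateDropPeriod', 1000, ...  assumed value: number of epochs between drops
    'L2Regularization', 0.0005, ...
    'MaxEpochs', 20000, ...
    'MiniBatchSize', 4, ...
    'Shuffle', 'every-epoch', ...
    'CheckpointPath', newdirstoragetraicheckpoint, ...
    'ExecutionEnvironment', 'multi-gpu', ...
    'Plots', 'training-progress', ...
    'VerboseFrequency', 2);
```

With these assumed values the learn rate would be 1e-3 for the first 1000 epochs, 1e-4 for the next 1000, and so on, so it would already be 1e-5 by epoch 2320, where the drop was observed.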