TreeBagger using obscene amount of memory when run in parallel
Nicholas
on 12 Oct 2011
Commented: amanita
on 28 Feb 2014
Hi,
I'm experiencing issues when running TreeBagger on a cluster. I run this code on a large cluster with 64 processors and 128 GB of memory. However, when I try to use TreeBagger on my dataset (~200 MB in size) with 5000 trees, MATLAB errors out after a few hours with an out-of-memory error.
Here are my steps:
1. Send a batch job to the cluster via the Distributed Computing Toolbox and open a matlabpool with 32 workers.
2. options = statset('UseParallel', 'Always');
3. B = TreeBagger(ntrees, tsp, tsp_label, 'FBoot', fboot, 'Options', options); where ntrees = 5000 and fboot = 0.5.
I don't understand why TreeBagger is using so much memory (>128 GB). When I run the same job locally on my 16 GB computer, memory use does not exceed 16 GB. Am I doing something wrong?
Thanks for your help!
Accepted Answer
Steve
on 12 Oct 2011
Nicholas,
Each worker in the matlabpool is a separate MATLAB process with its own working memory. In the case of TreeBagger, each worker holds a separate copy of the TreeBagger data, which includes your full dataset and, eventually, all or most of the trees, plus any additional object contents. Thus, for TreeBagger, total memory consumption tends to grow roughly linearly with the size of the matlabpool.
If you run in serial mode on your own computer, there is only one copy of this memory. (Though if you run in parallel on K cores locally, there will be K copies of the data.)
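As a rough illustration of that scaling, here is a back-of-envelope estimate in plain MATLAB. The pool size, dataset size, tree count, and cluster memory come from the question; the per-tree size is an assumed placeholder, not a measured value, so the numbers only show how the totals multiply, not what this particular job actually used.

```matlab
% Back-of-envelope estimate of pool-wide memory for a parallel TreeBagger run.
% bytesPerTree is an ASSUMPTION for illustration; measure your own trees.
nWorkers     = 32;           % matlabpool size from the question
dataBytes    = 200 * 2^20;   % ~200 MB dataset, copied to every worker
nTrees       = 5000;         % ensemble size from the question
bytesPerTree = 1 * 2^20;     % ASSUMED: about 1 MB per grown tree
clusterBytes = 128 * 2^30;   % 128 GB of cluster memory from the question

% Worst case described above: every worker ends up holding the dataset
% plus all of the trees.
perWorkerBytes = dataBytes + nTrees * bytesPerTree;
totalBytes     = nWorkers * perWorkerBytes;

% Largest pool that would fit in cluster memory under the same assumptions.
maxWorkers = floor(clusterBytes / perWorkerBytes);

fprintf('Per worker: %.1f GB, pool total: %.1f GB, max workers: %d\n', ...
    perWorkerBytes / 2^30, totalBytes / 2^30, maxWorkers);
```

Under these assumed numbers the pool-wide total exceeds the 128 GB available, while a single copy fits comfortably in 16 GB, which is consistent with the behaviour reported in the question.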
You might try to run with a smaller matlabpool if total memory consumption across the matlabpool is a limiting factor.
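A minimal sketch of that suggestion, using the 2011-era syntax from the question. The pool size of 4 and the synthetic data are hypothetical stand-ins chosen for illustration; in later releases `matlabpool` was replaced by `parpool`, so the pool commands would need adjusting there.

```matlab
% Sketch: the same TreeBagger call as in the question, with a smaller pool.
% Synthetic data stands in for tsp / tsp_label; all sizes are HYPOTHETICAL.
rng(1);
tsp       = randn(500, 10);            % stand-in predictors
tsp_label = double(tsp(:, 1) > 0);     % stand-in binary labels
ntrees    = 50;                        % 5000 in the question; reduced here
fboot     = 0.5;                       % bootstrap fraction from the question

matlabpool open 4                      % fewer workers means fewer copies in memory
options = statset('UseParallel', 'always');
B = TreeBagger(ntrees, tsp, tsp_label, 'FBoot', fboot, 'Options', options);
matlabpool close
```

The trade-off is training time: fewer workers means fewer trees grown concurrently, so the pool size is worth tuning against the memory actually available per worker.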
Best,
Steve
2 comments
amanita
on 28 Feb 2014
I had the same problem. I had TreeBagger calls inside parfor loops. It was faster to run the outer loops in parallel with parfor and the TreeBagger calls in serial than to run the outer loops in serial and the TreeBagger calls in parallel. Nice to know why!
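The arrangement described in this comment might look roughly like the following sketch: parallelism is applied to the outer loop only, and each TreeBagger is grown serially on whichever worker runs that iteration. The data, the number of runs, and the train/test split are all hypothetical and only illustrate the loop structure.

```matlab
% Sketch: outer loop in parallel, TreeBagger in serial inside it.
% Synthetic stand-in data; every size below is HYPOTHETICAL.
rng(0);
X = randn(400, 6);
Y = double(X(:, 1) + X(:, 2) > 0);     % binary labels 0/1

nRuns   = 4;                           % number of outer iterations
nTrees  = 30;
nTrain  = 300;
testErr = zeros(nRuns, 1);

% No 'UseParallel' option is passed to TreeBagger, so each ensemble is
% grown serially by the worker that executes the iteration.
parfor k = 1:nRuns
    idx      = randperm(size(X, 1));   % random train/test split for this run
    trainIdx = idx(1:nTrain);
    testIdx  = idx(nTrain+1:end);
    B = TreeBagger(nTrees, X(trainIdx, :), Y(trainIdx));
    % For classification, predict returns a cell array of strings,
    % so convert back to numbers before comparing with the labels.
    yhat = str2double(predict(B, X(testIdx, :)));
    testErr(k) = mean(yhat ~= Y(testIdx));
end
```

Each worker still needs its own copy of the data, but it only has to hold the one ensemble it is currently growing.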