After I tried parfor (with default options) in MATLAB (R2021a), my parallel computations in RStudio, which use OpenMP, became much slower. My code is written in C++ and compiled through the Rcpp package. Now, whenever I use more than one thread in RStudio, the computations are much slower than they were before. Could MATLAB have changed some default settings in my compiler? If so, how can I reverse the changes? I am using Ubuntu 20.04 on an AMD processor.
In C++ I use a very simple loop with no interactions among threads (completely parallel):
omp_set_num_threads(nproc);
#pragma omp parallel for schedule(static)
for (int ii=0; ii<=nproc-1; ii++){
// command to send job to thread ii
}
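To make the symptom easier to reproduce, here is a minimal sketch of the same loop that records which OpenMP thread ran each iteration. The helper name count_threads_used is hypothetical (it is not part of my real code), and the `_OPENMP` guards are only there so the snippet still compiles when OpenMP is disabled:

```cpp
#include <set>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: runs the same static loop as above and returns how
// many distinct OpenMP threads actually executed at least one iteration.
int count_threads_used(int nproc) {
    std::vector<int> owner(nproc, 0);  // owner[ii] = id of the thread that ran iteration ii
#ifdef _OPENMP
    omp_set_num_threads(nproc);
#endif
    #pragma omp parallel for schedule(static)
    for (int ii = 0; ii < nproc; ii++) {
#ifdef _OPENMP
        owner[ii] = omp_get_thread_num();  // each iteration writes only its own slot
#endif
    }
    // Count the distinct thread ids that showed up.
    return static_cast<int>(std::set<int>(owner.begin(), owner.end()).size());
}
```

If this returns the requested number of threads while top still shows low total CPU use, the threads are presumably being created but not spread over separate cores.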
I suspect it has something to do with MATLAB creating a pool of 12 workers, which is what it did when I ran parfor. Since then, my parallel computations in RStudio behave very differently (and slowly). For example, if I specify 32 threads in my code (50% of the 64 I have), the total CPU use according to top is only around 20% (roughly 12 out of 64). If I specify only 5 threads (about 8%), the total CPU use is also around 20%. However, with 3 threads the total CPU use is 15%, and with one thread it is 9%.
Before I used MATLAB, the CPU use was proportional to the number of threads I specified. Now it is not. How can I undo the parallel computing settings made by MATLAB?
I checked the values of many internal OpenMP variables, and the problem does not appear to be there:
int f=omp_get_num_threads();
int ddd=omp_get_dynamic();
Rcout << "get thread " << f << std::endl;
Rcout << "get dyn " << ddd << std::endl;
Rcout << "several " << omp_get_thread_limit() << " " << omp_get_max_threads() << " " << omp_get_nested() << " " << omp_get_proc_bind() << " " << omp_get_default_device() << " " << omp_get_max_task_priority() << " " << omp_get_max_active_levels() << std::endl;
// omp_get_place_num_procs and omp_get_place_num must be called with parentheses, otherwise only the function address is streamed; querying place 0 here is an assumption
Rcout << "meeting " << omp_get_num_places() << " " << omp_get_place_num_procs(0) << " " << omp_get_place_num() << " " << omp_get_partition_num_places() << std::endl;
All these variables had the same values as on another computer where things work correctly.
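Besides the runtime queries, the standard OpenMP environment variables could also be compared between the two machines. The following is only a sketch: omp_env_snapshot is a hypothetical helper name, and I have no evidence that MATLAB touches any of these variables.

```cpp
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (name, value) pairs for a few standard OpenMP
// environment variables; variables that are not set are reported as "<unset>".
std::vector<std::pair<std::string, std::string>> omp_env_snapshot() {
    const char* names[] = {"OMP_NUM_THREADS", "OMP_THREAD_LIMIT", "OMP_DYNAMIC",
                           "OMP_PROC_BIND", "OMP_PLACES"};
    std::vector<std::pair<std::string, std::string>> out;
    for (const char* name : names) {
        const char* value = std::getenv(name);  // nullptr when the variable is not set
        out.emplace_back(name, value ? value : "<unset>");
    }
    return out;
}
```

Printing each pair with Rcout from inside the Rcpp function would show what the R session itself sees, which may differ from what a plain terminal reports.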