Is C++ MEX API significantly slower than the C MEX API?
MathWorks recommends "Whenever possible, choose C++ over C applications." However, I cannot find a way to match the performance of the old C API. Considering that the main reason we use MEX is to make our code run faster, what's the point of the new C++ API? Is it actually slower, or am I using it completely wrong?
I've run some tests (all code attached), and in every case the new C++ API comes out significantly slower, even when there is no data transfer and the function itself does nothing. I tried to rule out the C++ compiler being worse by compiling the same C code with a C++ compiler, and that does turn out similar to the C one. All MEX functions were compiled with the -O flag, plus -R2018a for the ProdSum ones.
nop() - functions that do absolutely nothing. "nothing" in the table below is literally nothing, simply tic/toc on 2 lines. nop() is a MATLAB function, and Cnop/Cppnop are C and C++ MEX functions. Cppnop2 is the C MEX function compiled with a C++ compiler.
nothing x100000 | 23.9487ms | 0.2395us/call | 1.000x
nop() x100000 | 31.8879ms | 0.3189us/call | 1.332x
Cnop() x100000 | 77.2739ms | 0.7727us/call | 3.227x - C API & C Compiler
Cppnop() x100000 | 607.6918ms | 6.0769us/call | 25.375x - C++ API & C++ Compiler
Cppnop2() x100000 | 85.2547ms | 0.8525us/call | 3.560x - C API & C++ Compiler
empty() - functions that return an empty result ([]). inline, empty() and @()[] are 3 different ways to achieve the same thing in MATLAB. CEmpty/CppEmpty are C and C++ MEX functions. CppEmpty2 is, again, a C MEX function compiled with a C++ compiler.
inline x100000 | 24.1816ms | 0.2418us/call | 1.000x
empty() x100000 | 29.1480ms | 0.2915us/call | 1.205x
@()[] x100000 | 33.1176ms | 0.3312us/call | 1.370x
CEmpty() x100000 | 116.2378ms | 1.1624us/call | 4.807x - C API & C Compiler
CppEmpty() x100000 | 784.0485ms | 7.8405us/call | 32.423x - C++ API & C++ Compiler
CppEmpty2() x100000 | 120.9537ms | 1.2095us/call | 5.002x - C API & C++ Compiler
The above functions mainly evaluate the overhead of just calling MEX without any data transfer. The C++ MEX API version comes out 6-8 times slower than the C API version, or around 5-7us extra per call (which is nothing really, but it can add up).
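For reference, the per-call and ratio columns in the tables above follow directly from the raw totals. Below is a minimal C++ sketch of that arithmetic. The helper names (usPerCall, ratioTo, extraUsPerCall) are made up for illustration and are not part of the attached test code.

```cpp
#include <cassert>

// Per-call time in microseconds, given a total in milliseconds.
double usPerCall(double totalMs, long calls) {
    assert(calls > 0);
    return totalMs * 1000.0 / static_cast<double>(calls);
}

// Slowdown of one measurement relative to another taken with the same
// number of calls (the last numeric column in the tables uses the first
// row of each table as the baseline).
double ratioTo(double totalMs, double baselineMs) {
    assert(baselineMs > 0.0);
    return totalMs / baselineMs;
}

// Extra cost per call, in microseconds, of a slower variant over a faster one.
double extraUsPerCall(double slowMs, double fastMs, long calls) {
    return usPerCall(slowMs, calls) - usPerCall(fastMs, calls);
}
```

Plugging in the totals quoted above, Cppnop versus Cnop works out to about 5.3us extra per call (about 7.9x), and CppEmpty versus CEmpty to about 6.7us extra per call (about 6.7x).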
ProdSum() - functions that calculate the product of all values in each double array stored in a cell array (the arrays have different lengths) and then sum those products. ProdSum and cellfun are the MATLAB options, while CProdSum/CppProdSum are again C and C++ MEX functions. CppProdSum2 is a C MEX entry function combined with C++ classes (so C++ using the C MEX API) and is as fast as, if not slightly faster than, the C MEX function.
ProdSum() x100 | 9.0598ms | 90.5980us/call | 1.000x | val = 931.56
cellfun x100 | 197.1907ms | 1971.9070us/call | 21.765x | val = 931.56
CProdSum() x100 | 3.8130ms | 38.1300us/call | 0.421x | val = 931.56 - C API & C Compiler
CppProdSum() x100 | 147.2003ms | 1472.0030us/call | 16.248x | val = 931.56 - C++ API & C++ Compiler
CppProdSum2() x100 | 3.6594ms | 36.5940us/call | 0.404x | val = 931.56 - C API & C++ Compiler
In this case the C++ API MEX function comes out roughly 40x slower than the C API one when there is some data transfer, and the time difference per call is a lot bigger than in the nop/empty cases. In the ProdSum test it is barely faster than the cellfun option, so it would be pretty much worthless compared to the standard MATLAB functions and the old C MEX API when performance matters.
Currently, it makes no sense to use the C++ API if you don't need to, as you can still combine C++ with the C API and achieve much better performance. Is there a way to make MEX functions using the C++ API as fast as the C API ones? Or is this a known limitation of the C++ API? And if so, is that going to be addressed in future releases?
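The attached sources are not reproduced in this thread, so the following is only a hedged sketch of the computation that the ProdSum variants are described as performing, written as plain C++ with no MEX headers. The DoubleSpan type and the function names are hypothetical. In a C MEX entry point like the one described for CppProdSum2, the pointer and length for each cell would presumably come from calls such as mxGetCell, mxGetDoubles and mxGetNumberOfElements, but that wiring is an assumption and is not shown here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical non-owning view of one cell's contents: a raw pointer to
// the doubles plus the element count. No data is copied.
struct DoubleSpan {
    const double* data;
    std::size_t   size;
};

// Product of all elements in one array. An empty array yields 1.0,
// which matches MATLAB's prod([]).
double product(const DoubleSpan& s) {
    double p = 1.0;
    for (std::size_t i = 0; i < s.size; ++i) {
        p *= s.data[i];
    }
    return p;
}

// Sum of the per-cell products.
double prodSum(const std::vector<DoubleSpan>& cells) {
    double total = 0.0;
    for (const DoubleSpan& c : cells) {
        total += product(c);
    }
    return total;
}
```

The core loop only ever touches raw pointers, so whichever entry point calls it only needs to supply one pointer and one length per cell.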
8 comments
William
on 18 Oct 2025 at 19:14
Edited: William
on 18 Oct 2025 at 19:16
Just found this thread after running into the issue myself. I have some older coordinate transform code (written in "C++", but it could compile as C and uses the C Matrix API) that has been quite good to me. I needed to implement a new transform and decided to use the C++ API, as I have been needing to learn more about modern C++ for another project. At first it was two orders of magnitude slower than the older transform functions: from ~20ms for 500 calls over large matrices (really an array of column vectors) to 4s. I wouldn't necessarily expect 1:1 performance because the transform is different, but this is ridiculous.
I removed the explicit use of ColumnMajorIterator<T> (foolish me thinking being explicit was good) and got a 10x speedup to about 270ms, but haven't been able to get much better by trying out other optimizations. I'm not aware of an easy way to hook into a MEX function and profile the function itself, and as the problem appears to be related to accessing the MATLAB data structures, I can't just separate out the real math part of the transform and profile that. I agree this is a real shame, both for customers in general and because I've found I actually quite like modern C++ constructs (heresy, I know) and some of the helpful API sugar MathWorks added, but it looks like it's back to the Matrix API for me.
William
on 23 Oct 2025 at 14:15
Update: I did end up making a C Matrix API version of the function and it does indeed run in about the exact same time as the old transforms, so it does seem like the C++ API just has a lot more overhead for data access or function calls in the first place since the actual algorithm didn't change at all.
Answers (0)