why is this Matlab Code faster than the C++ code below? I want to understand what Matlab internally does better and faster than C++

24 visualizaciones (últimos 30 días)
why is this Matlab Code
function primes = sieve_era2(N)
% sieve of Erathostenes without upper bound of search space (could theoretically run forever)
if nargin == 0
N = Inf;
end
primes.number(1) = 2;
primes.counter(1) = primes.number(1);
k = 2;
k1 = 2;
tic;
while k <= N
k = k + 1; % check next k if it is prime
if mod(k,100000) == 0
fprintf("numbers checked: %i, number of primes found: %i, largest prime found: %i, time: %.2f seconds \n", k, k1, primes.number(end), toc);
end
primes.counter = primes.counter - 1; % all counters reduced by 1
if min(primes.counter) == 0
primes.counter((primes.counter == 0)) = primes.number((primes.counter == 0));
continue; % current numer is not a prime
end
k1 = k1 + 1; % no counter was reduced to zero --> current number is a new prime
primes.number(k1-1) = k;
primes.counter(k1-1) = primes.number(k1-1);
end
end
faster than this C++ code:
// sieve_era2.cpp : sieve of Erathostenes without upper bound of search space (could theoretically run forever)
#include <iostream>
#include <vector>
#include <algorithm>
#include <time.h>
#include <chrono>
using namespace std;
using namespace std::chrono;
struct primes
{
std::vector<int> number {2};
std::vector<int> counter {2};
} p;
primes sieve_era2(int N)
{
int k = 2;
bool prime{ true };
while (k <= N)
{
k = k + 1;
for (int j = 0; j <= p.counter.size()-1; j++)
{
p.counter[j] = p.counter[j] - 1;
if (p.counter[j] == 0)
{
p.counter[j] = p.number[j];
prime = false;
}
}
if (prime == false)
{
prime = true;
continue;
}
p.number.push_back(k);
p.counter.push_back(p.number.back());
}
return p;
}
int main()
{
primes p;
int N = 200000;
unsigned __int64 tic = duration_cast<milliseconds>(std::chrono::system_clock::now().time_since_epoch()).count();
p = sieve_era2(N);
unsigned __int64 toc = duration_cast<milliseconds>(std::chrono::system_clock::now().time_since_epoch()).count();
cout << (toc - tic) / 1000 << " Seconds " << std::endl;
system("pause");
Matlab runs 12 sec, C++ about 55 sec.
I want to understand what Matlab internally might be doing better and faster than C++

Respuesta aceptada

Chris
Chris el 17 de Jun. de 2022
Editada: Chris el 17 de Jun. de 2022
I see an efficiency in primes.counter = primes.counter - 1;
Matlab uses LAPACK for matrix/vector operations, which I think should be faster than a for loop in C.
Same for the if block that follows--especially for the if block, since you're using if once per outer loop in Matlab, and many times per outer loop in C.
You could try timing those operations separately, a few thousand at a time. In Matlab, for instance:
counter = rand(10000,1);
timeit(@() counterTest(counter))
ans = 0.0034
function counterTest(counter)
for idx = 1:1000
counter = counter-1;
end
end
  4 comentarios
Chris
Chris el 19 de Jun. de 2022
Interesting. You're saying 32-bit integer operations should require equal time to 64-bit integers using BLAS functions?
It appears to me that the PetscBLASInt type allows BLAS to handle both data types separately.
https://petsc.org/release/docs/manualpages/Sys/PetscBLASInt.html
Thomas
Thomas el 19 de Jun. de 2022
This makes sense - thank you very much for your help! I will try the timing thing, you suggested.

Iniciar sesión para comentar.

Más respuestas (1)

Jan
Jan el 17 de Jun. de 2022
Editada: Jan el 19 de Jun. de 2022
Not an answer, but an improvement of the Matlab code, which run in 10.7 sec on my R2018b i5m instead of 13.0 sec of the original version for N=2e5:
function primes = sieve_era2m(N)
% sieve of Erathostenes without upper bound of search space (could theoretically run forever)
if nargin == 0
N = Inf;
end
number(1) = 2;
counter(1) = number(1);
k1 = 2;
show = 1e5;
tic;
for k = 3:N+1
if k == show
fprintf('checked: %i, primes found: %i, largest: %i, time: %.2f s\n', ...
k, k1, number(end), toc);
show = show + 1e5;
end
counter = counter - 1; % all counters reduced by 1
if all(counter) % current numer is not a prime
number(k1) = k;
counter(k1) = number(k1);
k1 = k1 + 1; % no counter was reduced to zero --> current number is a new prime
else
ncounter = ~counter;
counter(ncounter) = number(ncounter);
end
end
primes.number = number;
primes.counter = counter;
end
And with UINT32 and without output it runs in 7.7 sec:
function primes = sieve_era2i(N)
number(1) = uint32(2);
counter(1) = uint32(2);
one = uint32(1);
tic;
k1 = uint32(1);
for k = uint32(3):uint32(N)
counter = counter - one;
if all(counter)
k1 = k1 + one;
number(k1) = k;
counter(k1) = k;
else
for u = one:k1
if ~counter(u)
counter(u) = number(u);
end
end
end
end
primes.number = number;
primes.counter = counter;
end
EDITED: And a version taking 6.8 sec:
function primes = sieve_era2j(N)
piN = ceil(N / log(N));
number = zeros(1, piN, 'uint32'); % Pre-allocation
counter = zeros(1, piN, 'uint32'); % Pre-allocation
number(1) = uint32(2);
counter(1) = uint32(2);
one = uint32(1);
tic;
k1 = uint32(1);
for k = uint32(3):uint32(N)
new = 1;
for u = one:k1
counter(u) = counter(u) - one;
if counter(u) == 0
counter(u) = number(u);
new = 0;
break;
end
end
for v = u+1:k1 % Count the rest without setting [new] again
counter(v) = counter(v) - one;
if counter(v) == 0
counter(v) = number(v);
end
end
if new % New prime found:
k1 = k1 + one;
number(k1) = k;
counter(k1) = k;
end
end
primes.number = number(one:k1);
primes.counter = counter(one:k1);
end
Now to an answer: I cannot profile your C++ code. My guess is that this is the bottleneck:
p.number.push_back(k);
p.counter.push_back(p.number.back());
The iterative growing of arrays is expensive. It looks like Matlab strategies to reduce the effect is more powerful.

Categorías

Más información sobre Logical en Help Center y File Exchange.

Productos


Versión

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by