Using timeit() in the debugger causes function speedup until edits are made

I have two solvers that I have coded for solving linear equations. After experimenting with them, I can see that one is taking less time, even though it is computing more matrix-vector products (MVP) and more daxpys. The time difference is not significant for small matrices, but for large ones the discrepancy is noticeable and actually quite perplexing when one notices that the one that computes faster performs more MVPs and more daxpys. Bear in mind the systems are the same.
So I tried to get to the bottom of this. I then computed the average time for 25 MVPs as the programs ran over all the iterations, and what I noticed was that the "slow" solver was indeed computing MVPs on average slower than the "fast" one. Over thousands or millions of MVPs, the small difference in average time/MVP amounted to nearly the difference in overall run time. I then sought to understand why one is computing MVPs at a faster rate. I have not found that answer, but one thing I did notice was that the average time per MVP speeds up considerably after entering debug mode, running timeit(@() A*P), and then running out of debug mode. The "slow" code sped up considerably (faster than the "fast" code), but the "fast" code was unaffected. But, as soon as I made an edit to the "slow" code, it retured to the usual solving speeds.
Something is not making sense to me. I have seen suggestions in other forums that it could have to do with JIT, but I don't really know what that is and I'm skeptical that is the culprit. Any insight from someone more experienced than me in Matlab would be appreciated.
Here is some output. You can see where the average time per MVP was slow then sped up after doing the timeit(...) hack in debug mode like I described above.
>> traceTesting('h01_4444.mat',0.005*(4^4),175,100,[2,4],false)
Computing Hutchinson trace using Twin BiCG solver up to tolerance 1.280000e+00.
Twin BiCG: solving *first* system: it = 0, ||r1|| = 5.543e+01
Twin BiCG: solving *first* system: it = 50, ||r1|| = 1.020e+00
Twin BiCG: solving *first* system: it = 100, ||r1|| = 5.111e-04
Twin BiCG converged (first system): it = 121, ||r1|| = 7.151376e-06
Hutchinson (Twin BiCG) noise number 25, error = 5.592471e+00, avg time = 2.551564e-02, avg mvp time = 1.306117e-04
Hutchinson (Twin BiCG) noise number 50, error = 3.429886e+00, avg time = 2.431404e-02, avg mvp time = 1.355107e-04
Hutchinson (Twin BiCG) noise number 75, error = 3.078834e+00, avg time = 2.412139e-02, avg mvp time = 1.362848e-04
Hutchinson (Twin BiCG) noise number 100, error = 2.607429e+00, avg time = 2.410497e-02, avg mvp time = 1.366702e-04
Hutchinson (Twin BiCG) noise number 125, error = 2.382652e+00, avg time = 2.398005e-02, avg mvp time = 1.368071e-04
Hutchinson (Twin BiCG) noise number 150, error = 2.116026e+00, avg time = 2.441677e-02, avg mvp time = 1.372798e-04
Hutchinson (Twin BiCG) noise number 175, error = 1.917154e+00, avg time = 2.429972e-02, avg mvp time = 1.377248e-04
Hutchinson (Twin BiCG) noise number 200, error = 1.906488e+00, avg time = 2.387122e-02, avg mvp time = 1.377267e-04
Hutchinson (Twin BiCG) noise number 225, error = 1.804015e+00, avg time = 2.418052e-02, avg mvp time = 1.379283e-04
Hutchinson (Twin BiCG) noise number 250, error = 1.673230e+00, avg time = 2.390470e-02, avg mvp time = 1.377716e-04
Hutchinson (Twin BiCG) noise number 275, error = 1.591129e+00, avg time = 2.359043e-02, avg mvp time = 1.375608e-04
Hutchinson (Twin BiCG) noise number 300, error = 1.523522e+00, avg time = 2.346701e-02, avg mvp time = 1.372989e-04
Hutchinson (Twin BiCG) noise number 325, error = 1.481106e+00, avg time = 2.377756e-02, avg mvp time = 1.372047e-04
Hutchinson (Twin BiCG) noise number 350, error = 1.428140e+00, avg time = 2.372067e-02, avg mvp time = 1.371041e-04
Hutchinson (Twin BiCG) noise number 375, error = 1.387023e+00, avg time = 2.348483e-02, avg mvp time = 1.367644e-04
Hutchinson (Twin BiCG) noise number 400, error = 1.328517e+00, avg time = 2.355259e-02, avg mvp time = 1.366078e-04
Hutchinson (Twin BiCG) noise number 425, error = 1.289146e+00, avg time = 2.355508e-02, avg mvp time = 1.365266e-04
Hutchinson trace (Twin BiCG) converged on noise 435 with
--> 52756 MVP (7.200284e+00 s; 1.364828e-04 s/MVP),
--> 158147 daxpys (3.062910e+00 s; 1.936749e-05 s/daxpy),
--> 243 dot products,
--> error 1.278158e+00
Hutchinson trace time = 1.095617e+01 s
Hutchinson trace results saved in resultsH_alg24_h01_4444.mat
>> traceTesting('h01_4444.mat',0.005*(4^4),175,100,[2,4],false)
Computing Hutchinson trace using Twin BiCG solver up to tolerance 1.280000e+00.
Twin BiCG: solving *first* system: it = 0, ||r1|| = 5.543e+01
Twin BiCG: solving *first* system: it = 50, ||r1|| = 1.020e+00
Twin BiCG: solving *first* system: it = 100, ||r1|| = 5.111e-04
Twin BiCG converged (first system): it = 121, ||r1|| = 7.151376e-06
84 V = A*P;
K>> timeit(@() A*P)
ans =
2.0566e-04
>> traceTesting('h01_4444.mat',0.005*(4^4),175,100,[2,4],false)
Computing Hutchinson trace using Twin BiCG solver up to tolerance 1.280000e+00.
Twin BiCG: solving *first* system: it = 0, ||r1|| = 5.543e+01
Twin BiCG: solving *first* system: it = 50, ||r1|| = 1.020e+00
Twin BiCG: solving *first* system: it = 100, ||r1|| = 5.111e-04
Twin BiCG converged (first system): it = 121, ||r1|| = 7.151376e-06
Hutchinson (Twin BiCG) noise number 25, error = 5.592471e+00, avg time = 2.119070e-02, avg mvp time = 1.161429e-04
Hutchinson (Twin BiCG) noise number 50, error = 3.429886e+00, avg time = 1.983746e-02, avg mvp time = 1.158496e-04
Hutchinson (Twin BiCG) noise number 75, error = 3.078834e+00, avg time = 1.997454e-02, avg mvp time = 1.160166e-04
Hutchinson (Twin BiCG) noise number 100, error = 2.607429e+00, avg time = 1.982483e-02, avg mvp time = 1.159164e-04
Hutchinson (Twin BiCG) noise number 125, error = 2.382652e+00, avg time = 1.981113e-02, avg mvp time = 1.158252e-04
Hutchinson (Twin BiCG) noise number 150, error = 2.116026e+00, avg time = 1.982886e-02, avg mvp time = 1.157824e-04
Hutchinson (Twin BiCG) noise number 175, error = 1.917154e+00, avg time = 1.982694e-02, avg mvp time = 1.157253e-04
Hutchinson (Twin BiCG) noise number 200, error = 1.906488e+00, avg time = 1.993398e-02, avg mvp time = 1.156890e-04
Hutchinson (Twin BiCG) noise number 225, error = 1.804015e+00, avg time = 1.983914e-02, avg mvp time = 1.156711e-04
Hutchinson (Twin BiCG) noise number 250, error = 1.673230e+00, avg time = 1.982386e-02, avg mvp time = 1.156546e-04
Hutchinson (Twin BiCG) noise number 275, error = 1.591129e+00, avg time = 1.982529e-02, avg mvp time = 1.156422e-04
Hutchinson (Twin BiCG) noise number 300, error = 1.523522e+00, avg time = 1.981696e-02, avg mvp time = 1.156247e-04
Hutchinson (Twin BiCG) noise number 325, error = 1.481106e+00, avg time = 1.981202e-02, avg mvp time = 1.156083e-04
Hutchinson (Twin BiCG) noise number 350, error = 1.428140e+00, avg time = 1.984848e-02, avg mvp time = 1.156206e-04
Hutchinson (Twin BiCG) noise number 375, error = 1.387023e+00, avg time = 1.982129e-02, avg mvp time = 1.156112e-04
Hutchinson (Twin BiCG) noise number 400, error = 1.328517e+00, avg time = 1.982918e-02, avg mvp time = 1.156058e-04
Hutchinson (Twin BiCG) noise number 425, error = 1.289146e+00, avg time = 1.983626e-02, avg mvp time = 1.155995e-04
Hutchinson trace (Twin BiCG) converged on noise 435 with
--> 52756 MVP (6.098267e+00 s; 1.155938e-04 s/MVP),
--> 158147 daxpys (1.511571e+00 s; 9.558014e-06 s/daxpy),
--> 243 dot products,
--> error 1.278158e+00
Hutchinson trace time = 9.077659e+00 s
Hutchinson trace results saved in resultsH_alg24_h01_4444.mat

Respuestas (1)

Stephen23
Stephen23 hace alrededor de 14 horas
Editada: Stephen23 hace alrededor de 6 horas
This behaviour is almost certainly due to MATLAB's JIT (Just-In-Time) engine:
The clues in your description:
  1. The timeit hack triggers full JIT compilation. timeit works by calling the anonymous function many times in a tight loop. That repeated execution is exactly what prompts MATLAB's execution engine to fully JIT-compile the code path for A*P. Once compiled, that native code path stays active in memory, hence the speedup you observe for the remainder of the run.
  2. Editing the file invalidates the JIT cache. MATLAB tracks file modification timestamps. Any edit (even a trivial whitespace change) causes MATLAB to mark the cached compiled version as stale and fall back to interpretation on the next call. This perfectly explains why your "slow" code reverts to its original speed after an edit.
  3. The "fast" code is unaffected because it was presumably already hitting a well-compiled code path: either its structure is more amenable to early JIT optimisation, or it happens to share a code pattern MATLAB had already compiled from a prior call elsewhere in the session.
To confirm this hypothesis, you can add a warm-up call before your actual timed run (e.g., call A*P a handful of times in a loop) before starting your benchmarking. If the slow solver then runs at the post-timeit speed from the outset, JIT warm-up is the culprit.
"I have two solvers that I have coded for solving linear equations"
And from Steven Lord's comment:

1 comentario

From the mention of Twin BiCG I'm guessing one of the iterative methods on this documentation page might be better suited to the user's needs.

Iniciar sesión para comentar.

Productos

Versión

R2024b

Etiquetas

Preguntada:

hace alrededor de 18 horas

Editada:

hace alrededor de 6 horas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by