Problem 44383. Code breaker, Part III: Operation Xiangliu
Solution Stats
Problem Comments
-
6 Comments
I don't understand the computing of the certitude parameter.
In Example1: ...Twelve of the 51 characters have been matched.. Why 51 characters ?
In Example2:... Eight of the 37 characters have been matched: "My", " 's ", and "been". 7 not 8 characters ?
Could you be more specific ?
Thank you for these great problems !
Hello, Jean-Marie Sainthillier. Thanks for your feedback. Regarding Ex. 1, the key information is given in a preceding paragraph about the general calculation of 'certitude', which states: "[...] accounting for [...] of the characters (excluding spaces)". I excluded spaces because there are none in the 'dictionary', so they would skew the results. E.g. if the phrase "They didn't understand" were padded with one hundred spaces before encoding, then we would always get a low 'certitude' if spaces weren't excluded (viz. 28%), suggesting only low–moderate confidence in the decryption, despite decoding half of the matchable characters, which should have given us high confidence (viz. 100%) given the small size of our 'dictionary'. Arguably punctuation could have been excluded too, but I opted to _include_ punctuation, not least because apostrophes do occur in the 'dictionary'. That is why there are 8 matched characters in Ex. 2: the apostrophe is counted (in _'s_). In a bigger 'dictionary', hyphens and full stops could also occur. Some punctuation, such as commas and semicolons, would perhaps never occur in any practical 'dictionary', but their effect on the 'certitude' is relatively small, and segregating punctuation according to function is beyond the present scope!! —DIV
Thank you :)
I would like to suggest that when timing code for exercises that you only measure the function implemented by the user. In this case, your test suite code should be:
for i = 1 : qBig
% EDIT (2018-06-17). Ensure each case is unique.
characters = [' , .' char(randi([97,122], [1,23]))];
x(1+i) = {characters( randperm(numel(characters)) )};
% END EDIT (2018-06-17)
end;
t0 = clock;
t0_cpu = cputime;
% Loop
for i = 1 : qBig
y(1) = x(1);
y(2) = x(1+i);
solution = decode(y, bncWordlist);
end;
% Compute and display elapsed time.
dt = etime(clock, t0);
dt_cpu = (cputime - t0_cpu);
for dummy = 1 : 10, disp(' . '); end;
fprintf('Your wall time to decode %u message batches = %2.2f seconds.\n\r', qBig, dt)
Finally, I am not sure that cputime or clock are indeed good tools for measuring efficient code (this is not their purpose). Timeit is much more adequate since it runs our code several times and uses the median for removing noise. The function cputime measures the amount of work that the CPU is currently doing, which may include everything that is running on a MATLAB session.
This problem is way too hard. It is not enough to solve the test cases or do vectorized code, we must find the right combination of MATLAB functions that achieve the greatest speed, which is boring. This means we have to write many times the same code until the right combination of functions becomes obvious (I had to create 6 different codes or more, until I finally got a valid solution). Moreover it is not clear why the number 8 (seconds) is somehow a fair measurement of timing for this task since we must try all combinations and deal with strings. Operation Phoenix and Orthos were nice, good job. They had a limit of 3 seconds, but we didn't have to look up words in a dictionary (adding just 5 more seconds for searching in a dicionary of 100 words that must be used at every iteration doesn't seem fair). Even after converting the dictionary words to hash keys, I was able to barely pass the test suite.
Hello, Rafael S. T. Vieira.
Firstly, thank-you for your constructive feedback.
When I wrote the Test Suite — a few years ago — I explored a few options for the timing, including use of the timeit() function. I believe that there was a specific rationale for the choice I ultimately made, although I can't confidently recall all the details off the top of my head.
As for the implementation of the timing, I concede that currently it includes some overhead from generating some random strings. However, I doubt that the overhead is 'significant' — I would be *guessing* that it would account for less than 1% of the total time elapsed. So while your proposed amendment could be a nice improvement, it's probably best left as a tip for future Problems: I prefer not to change an existing Test Suite unless there is a compelling reason.
Next, let me share with you the basis for the 8-second benchmark. I wrote an unoptimised reference solution and added a small safety margin. My hypothesis was that if my unoptimised code could satisfy the Test, then most likely other (experienced) Cody players could also pass the Test if they avoid slow operations. When the Problem was published, indeed the Cody players who have solved it were able to pass the test using a variety of approaches. Indeed, some advanced Cody players were able to highly optimise their code and deliver impressively quick times.
You may notice that I built in an allowance for increasing computational power. This allowance also had a safety margin. In other words, the increased number of iterations is somewhat less than the projected increase (from historical data) that would maintain equity. Of course, even predictions based on data can be wrong, so I have checked my reference solution again (twice).
I got the following:
"Your wall time to decode 2370 message batches = 7.99 seconds."
"Your wall time to decode 2370 message batches = 7.62 seconds."
In both cases it passed. The first instance was amusingly close to the limit, but I mention again that this code was not optimised yet it still passed. I can assure you that it has no "hashes".
Finally, the subtext is that this is intended to be a difficult Problem: a challenge for experienced Cody players. Some will like that, and will even enjoy challenging themselves to optimise their own code. Others won't be interested, and there are thousands of other Problems on Cody for them to solve.
—DIV
Solution Comments
Show commentsProblem Recent Solvers11
Suggested Problems
-
5215 Solvers
-
336 Solvers
-
Longest run of consecutive numbers
4723 Solvers
-
575 Solvers
-
Finding an element in a vector
179 Solvers
More from this Author32
Problem Tags
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!