# integer strings decoding ... speed optimization

Michal Kvasnicka on 13 Nov 2017
Commented: Michal Kvasnicka on 15 Nov 2017
I have the following problem:
I need to decode integer sequences "c" into char string messages "m" using the following association:
numpos = 10 % ( = size(c,2)/2)
c = [3 4 1 1 4 2 5 2 3 3,1 1 1 1 2 2 2 3 3 3]
Each row of "c" holds 2*numpos integers. The first numpos integers are indices into
types = {'a' 'b@2' 'c@6' 'd@10' 'e@11'}
and the second numpos integers are inserted only where the selected type contains the character '@', like this:
m = ' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'
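To make the encoding explicit, here is a minimal reference decoder for a single row, written as a plain loop. It is only a sketch for clarity, not for speed; the variable names idx and num are my own and do not appear in the original code.

```matlab
% Straightforward reference decoder for one row (clarity, not speed).
types  = {'a' 'b@2' 'c@6' 'd@10' 'e@11'};
c      = [3 4 1 1 4 2 5 2 3 3, 1 1 1 1 2 2 2 3 3 3];
numpos = numel(c)/2;
idx    = c(1:numpos);        % first half: index into types (hypothetical name)
num    = c(numpos+1:end);    % second half: number placed before '@' (hypothetical name)
m = '';
for k = 1:numpos
    t = types{idx(k)};
    if any(t == '@')
        % e.g. 'c@6' with number 1 becomes 'c:1@6'
        t = strrep(t, '@', sprintf(':%d@', num(k)));
    end
    m = [m ' ' t]; %#ok<AGROW>
end
disp(m)
```

Types without '@' (here only 'a') are copied unchanged, and the matching entry in the second half of the row is simply ignored.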
My current solution is as follows:
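function m = c2m(c,types)
numpos = size(c,2)/2;
numlines = size(c,1); % number of rows (messages) in c
% Turn each type into a format spec with a leading space, e.g. 'c@6' -> ' c:%d@6'
F = cellfun(@(f) [' ' f], strrep(types,'@',':%d@'),'unif',0);
% Format every element separately with its own number
m = arrayfun(@(f,k) sprintf(f{1},k),F(c(:,1:numpos)),c(:,numpos+(1:numpos)),'unif', 0);
% Concatenate the pieces of each row into one char vector
m = arrayfun(@(i) horzcat(m{i,:}), (1:numlines)', 'unif', 0);
end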
and the testing code is as follows:
numlines = 10;
c = repmat([3 4 1 1 4 2 5 2 3 3,1 1 1 1 2 2 2 3 3 3],numlines,1);
types = {'a' 'b@2' 'c@6' 'd@10' 'e@11'};
m = c2m(c,types);
m =
10×1 cell array
{' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'}
{' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'}
{' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'}
{' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'}
{' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'}
{' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'}
{' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'}
{' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'}
{' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'}
{' c:1@6 d:1@10 a a d:2@10 b:2@2 e:2@11 b:3@2 c:3@6 c:3@6'}
The code is still too slow for me, and I am looking for any speed-up. Most of the CPU time is spent in the built-in function "sprintf".
Typical realistic problem sizes are:
numpos ~ 30 ... 60
numlines ~ 1e4 ... 1e5
Any idea?

Michal Kvasnicka on 15 Nov 2017
Edited: Michal Kvasnicka on 15 Nov 2017
Probably the fastest and simplest solution I have found so far uses newer MATLAB features (>= R2016b): the function insertBefore and the string data type.
function m = c2m(c,types)
types = string(types);
numpos = size(c,2)/2;
a = c(:,1:numpos);
b = c(:,(numpos+1):end);
m = types(a);
m = insertBefore(m,"@", ":" + b);
m = join(m,2);
end

Jan on 15 Nov 2017
Does this consider that some types, such as "a", do not get an element of b?
Michal Kvasnicka on 15 Nov 2017
I am not sure what you mean exactly. Please clarify your question.
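One way to check the point raised here is a small experiment with a type that contains no '@'. The sketch below assumes that insertBefore leaves an element unchanged when the pattern does not occur in it; the values of a and b are made up for illustration.

```matlab
% Double-quoted string literals need R2017a or later.
types = string({'a' 'b@2' 'c@6'});
a = [1 2 3];                 % indices into types (illustrative values)
b = [7 8 9];                 % numbers to insert before "@" (illustrative values)
m = insertBefore(types(a), "@", ":" + b);
disp(m)                      % first element has no "@" to insert before
disp(join(m, 2))             % one string per row, joined with spaces
```

If the assumption holds, the first element stays "a" and the 7 is silently dropped, which is the intended behaviour. Note that join puts a space only between elements, so the result has no leading space, unlike the output of the original sprintf-based version.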

Jan on 13 Nov 2017
Edited: Jan on 13 Nov 2017
[EDITED] Consider all rows of c:
function m = c2m(c,types)
[s1, s2] = size(c);
numpos = s2 / 2;
m = cell(s1, 1);
typesF = strrep(types, '@', ':%d@'); % types to format specifiers
hasNum = ~strcmp(types, typesF); % true if the type has a '%d'
for im = 1:s1
    c1 = c(im, 1:numpos);
    c2 = c(im, numpos+1:end);
    FmtSpec = sprintf(' %s', typesF{c1}); % Complete list of format specs
    m{im} = sprintf(FmtSpec, c2(hasNum(c1))); % All c2, if c1 has a number spec
end
end
UNTESTED - I have no Matlab currently.

Jan on 13 Nov 2017
But does it work correctly if c is a single row? It would be kind to provide this information, because I cannot run Matlab currently. The code is just written in the forum's interface.
I've edited the code to expand it to multi-row input.
% FmtSpec = sprintf(' %s', typesF{c1}); % Replace with:
FmtSpec = CStr2String(typesF{c1}, ' ', 'noTrail');
Michal Kvasnicka on 14 Nov 2017
FmtSpec = CStr2String(typesF{c1}, ' ', 'noTrail');
should be
FmtSpec = CStr2String(typesF(c1), ' ', 'noTrail');
But the speed-up with the MEX file is only a few percent.
Michal Kvasnicka on 14 Nov 2017
Jan, thanks a lot for your help. Your code is very good, especially because the for-loop can simply be turned into a parfor-loop to get some additional speed-up without any reprogramming.
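For reference, a sketch of what the parfor variant of Jan's loop might look like. The function name c2mPar is hypothetical, and the code assumes the Parallel Computing Toolbox is available (without it, parfor should simply run like an ordinary for-loop). Each row is first copied into a temporary variable so that c can be sliced across workers. Local functions in a script need R2016b or later.

```matlab
types = {'a' 'b@2' 'c@6' 'd@10' 'e@11'};
c = repmat([3 4 1 1 4 2 5 2 3 3, 1 1 1 1 2 2 2 3 3 3], 4, 1);
m = c2mPar(c, types);
disp(m{1})

function m = c2mPar(c, types)
% Hypothetical parfor version of the row-wise sprintf loop.
[s1, s2] = size(c);
numpos = s2 / 2;
m = cell(s1, 1);
typesF = strrep(types, '@', ':%d@');   % types to format specifiers
hasNum = ~strcmp(types, typesF);       % true if the type has a '%d'
parfor im = 1:s1
    row = c(im, :);                    % whole row, so c is a sliced variable
    c1 = row(1:numpos);
    c2 = row(numpos+1:end);
    FmtSpec = sprintf(' %s', typesF{c1});
    m{im} = sprintf(FmtSpec, c2(hasNum(c1)));
end
end
```

Whether this is actually faster depends on the number of workers and on numlines; for small inputs the overhead of starting the parallel pool may outweigh the gain.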