# Why fwrite is ~320x slower in the second interation and onwards when writing interleaved data?

14 views (last 30 days)
I'd like to write a binary file, which should contain four vectors of integers in the same length in an interleaved fashion.
For example, let si,j denote the sample from the i th channel at time step j. I can prepare a long vector just as above and then pass it to fwrite and that is easier.
But that design is not great, because when individual vectors are really long, and when I have to accomodate a lot of vectors rather than just four, it will enconter a memory issue.
So, a clever way to to this is to write one vector input at a time without keeping all of them in the memory.
The code like below works, but I found that the performace of fwrite is much worse in the second iteration onwards compared to the first iteration.
While t3 in the first loop was less than a second (0.281 sec), t3 took roughly 1 min 30 sec for the second to fourth iterations (~320 times slower). I wonder why this is so slow, and if there is a work around for this.
fid = fopen(newfile,'w');
for i = 1:4
data = load_data(filepath); % int16 vector
status = fseek(fid,(i-1)*2,'bof');
if status == -1
error('fseek failed')
end
t1 = datetime;
fwrite(fid, data, 'int16', 2*(n-1));
t2 = datetime;
t3 = duration(t2-t1) % why this is slower in second iteration and onwards?
end
fclose(fid);
t3 = duration
00:00:00.2810
t3 = duration
00:01:37.2600
t3 = duration
00:01:29.9700
t3 = duration
00:01:29.6490
90 (sec) / 0.281 (sec) == ~320 (times)
A simplified code for testing
The longer the vectors are the slower the second and third iterations become, suggesting the slowness has something to do with data before reaching the end of file rather than adding extra bytes at the end of the file.
len = 1000;
i1 = ones(len,1,'int16');
i2 = ones(len,1,'int16').*2;
i3 = ones(len,1,'int16').*3;
fid = fopen('temp.bin','w')
t1= datetime;
fwrite(fid,i1,'int16',2*2);
t2 = datetime;
t3(1,1) = duration(t2-t1);
t1= datetime;
fseek(fid,2, 'bof')
fwrite(fid,i2,'int16',2*2);
t2 = datetime;
t3(2,1) = duration(t2-t1);
t1= datetime;
fseek(fid,2*2, 'bof')
fwrite(fid,i3,'int16',2*2);
t2 = datetime;
t3(3,1) = duration(t2-t1);
fclose(fid);
t3.Format = 'dd:hh:mm:ss.SSSS'
len = 1000
t3 = 3×1 duration array
00:00:00.0070
00:00:00.0140
00:00:00.0190
2~3 times slower
len = 10000
t3 = 3×1 duration array
00:00:00.0060
00:00:00.0850
00:00:00.1050
14~18 times slower
len = 100000
t3 = 3×1 duration array
00:00:00.0280
00:00:00.9820
00:00:00.9080
35 times slower

Guillaume on 22 Jul 2019
I would suspect the reason for the much slower writes in iteration 2 and onward is that for the first iteration, since the file is new, you're just writing (your data, and 0s in between). From iteration 2, matlab must first read the relevant part of the file, insert the new content and rewrite that.
If you can't fit the whole interlaced data in memory in one go, then you should read the input data in chunk, interlace it in memory, then write it as one continuous block. Repeat for the next chunk of data.

Kouichi C. Nakamura on 23 Jul 2019
Thanks, Guillaume for very clear presentation.
I just received a reply from MathWorks and they consider editing the documentation.
He also said:
> I would like just to add that the response from Guillaume is quite accurate, and the main difference is that during the first loop all information can be saved directly as everything in between is simply padded with zeros, whilst for the following iteration the information in between needs to be preserved. The checks and verification that take place to avoid overwriting data are the ones causing the delay.
We don't know the buffer size, right? Do you have any suggestions for the chunk size in terms of bytes? The chunk size should be smaller than the buffer size. Then, I guess 1 GB might be too big, for example. But it's all guessing game, and I feel quite uncomfortable. One might say, you see this is the downside of using proprietary language like MATLAB.
Guillaume on 23 Jul 2019
No, if you're reading and writing in chunk, the buffer size doesn't matter. and for best performance, I would think you'd want to be above that buffer size. The chunk size should only depend on how much memory you want to use / have available.
So the code would be something like:
%filenames: cell array of input files
chunksize = 4000; %whatever you want, the actual size in bytes will be nfiles * chunksize * sizeof(datatype)
buffer = zeros(numel(filemames), chunksize, 'int16');
fidin = zeros(size(filenames));
%open the input files
for fileidx = 1:numel(filenames)
assert(fidin(fileidx) > 0, 'failure to open file %d', fileidx);
end
fidout = fopen(newfile, 'w');
while true
for fileidx = 1:numel(filenames)
end
fwrite(fidout, buffer, 'int16'); %since data has been read in rows and write is by column, the data is written interlaced
if numread(1) < chunksize || feof(fidin(1))
break;
end
end
fclose(fidout);
for fileidx = 1:numel(filenames)
fclose(fidin(fidin));
end
Kouichi C. Nakamura on 23 Jul 2019

Walter Roberson on 22 Jul 2019
Edited: Walter Roberson on 22 Jul 2019
fseek after end of file followed by fwrite, results in the data being written at eof. This is contrary to POSIX which requires that the gap be filled with 0 (possibly implicitly with a Demand Zero scheme). In MATLAB if you want to write at some point after eof you must write into the gap yourself.

#### 1 Comment

Kouichi C. Nakamura on 23 Jul 2019
> In MATLAB if you want to write at some point after eof you must write into the gap yourself.
In my example above, there are four iterations.
For i = 1, fwrite reaches the end of the file for the first time.
For i = 2 to 4, however, fwrite needs to add 2 extra bytes at the end of file for every iteration. But does this small addition takes 320x more time? Or have I completely missed your point here? I don't really know what POSIX means to be honest.
In oder to further examine the issue, I wrote a simplified test code, which I thought inherits all the key features of the original code. To my dismay, this didn't result in massive slow down. Mmm, I tested this on Mac, while the above was done on Windows.
i1 = ones(1000,1,'int16');
i2 = ones(1000,1,'int16').*2;
i3 = ones(1000,1,'int16').*3;
fid = fopen('temp.bin','w')
t1= datetime;
fwrite(fid,i1,'int16',2*2);
t2 = datetime;
t3(1,1) = duration(t2-t1);
t1= datetime;
fseek(fid,2, 'bof')
fwrite(fid,i2,'int16',2*2);
t2 = datetime;
t3(2,1) = duration(t2-t1);
t1= datetime;
fseek(fid,2*2, 'bof')
fwrite(fid,i3,'int16',2*2);
t2 = datetime;
t3(3,1) = duration(t2-t1);
fclose(fid);
t3.Format = 'dd:hh:mm:ss.SSSS'
t3 = 3×1 duration array
00:00:00.0665
00:00:00.0738
00:00:00.1156

R2019a

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!