Possible bug in H5D.write, truncation of VLEN strings

Hello,
I have found a potential bug, or at least some flaky behavior, in the low-level HDF5 write function. When I write a long string as a variable-length (VLEN) string, it gets truncated at 512 bytes (511 characters plus the terminating null). The same string writes correctly as a fixed-length string.
The minimal script below reproduces the error. I see this on R2012a on both Linux and Mac. Am I missing a parameter or function call that controls the VLEN buffer size, or is something improperly hard-coded in the underlying MEX function?
Cheers, Souheil
-------------
% Create a long string
str = repmat('Hello from matlab. ',[1 1000]);
fprintf('Size of string = %d\n',length(str));
% Create an HDF5 file
filename = 'vlen_string_bug.h5';
fid = H5F.create(filename,'H5F_ACC_TRUNC','H5P_DEFAULT','H5P_DEFAULT');
% Write to a dataset as a variable length string
VLstr_type = H5T.copy('H5T_C_S1');
H5T.set_size(VLstr_type,'H5T_VARIABLE');
space = H5S.create_simple(1, 1, []);
dset = H5D.create(fid, 'VLstr', VLstr_type, space, 'H5P_DEFAULT');
fprintf('Size of VLEN_BUF before = %d\n',H5D.vlen_get_buf_size(dset, VLstr_type, space));
H5D.write(dset, VLstr_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', {str});
fprintf('Size of VLEN_BUF after = %d\n',H5D.vlen_get_buf_size(dset, VLstr_type, space));
H5T.close(VLstr_type);
H5S.close(space);
H5D.close(dset);
% Write to a dataset as a fixed length string
Fstr_type = H5T.copy('H5T_C_S1');
H5T.set_size(Fstr_type, length(str));
space = H5S.create_simple (1, 1, []);
dset = H5D.create (fid, 'Fstr', Fstr_type, space, 'H5P_DEFAULT');
H5D.write(dset, Fstr_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', str);
H5T.close(Fstr_type);
H5S.close(space);
H5D.close(dset);
% Close the file
H5F.close(fid);
% Read the strings back in using the high level read function
t = h5read(filename,'/VLstr');
vlstr = t{1};
fprintf('Size of VLEN string on disk = %d\n',length(vlstr));
t = h5read(filename,'/Fstr');
fstr = t{1};
fprintf('Size of fixed string on disk = %d\n',length(fstr));

Accepted Answer

Souheil Inati on 27 Sep 2012
This looks like a bug in the R2012a MEX files on Mac and Linux. R2012b seems to resolve it. Thanks for everyone's input.
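To check whether a particular installation is affected, a round-trip test along the following lines may help. This is only a sketch built from the calls used in the original script plus H5D.read; the temporary file name and the variable names n_written and n_read are placeholders, not part of the original post.

```matlab
% Round-trip check: write a long variable-length string, read it back
% through the low-level interface, and compare lengths.
str = repmat('Hello from matlab. ', [1 1000]);
filename = [tempname '.h5'];   % placeholder scratch file

fid = H5F.create(filename, 'H5F_ACC_TRUNC', 'H5P_DEFAULT', 'H5P_DEFAULT');
vl_type = H5T.copy('H5T_C_S1');
H5T.set_size(vl_type, 'H5T_VARIABLE');
space = H5S.create_simple(1, 1, []);
dset = H5D.create(fid, 'VLstr', vl_type, space, 'H5P_DEFAULT');
H5D.write(dset, vl_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT', {str});

% VLEN strings come back as a cell array of character vectors
out = H5D.read(dset, vl_type, 'H5S_ALL', 'H5S_ALL', 'H5P_DEFAULT');

H5D.close(dset);
H5S.close(space);
H5T.close(vl_type);
H5F.close(fid);
delete(filename);

n_written = length(str);
n_read = length(out{1});
fprintf('Release %s: wrote %d chars, read back %d\n', ...
        version('-release'), n_written, n_read);
if n_read < n_written
    warning('VLEN string was truncated on this release.');
end
```

On an affected release the second number would presumably be 511; on a fixed release the two numbers should match.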

More Answers (1)

per isakson on 15 Sep 2012
Edited: per isakson on 15 Sep 2012
I ran the example h5ex_t_vlstring with your long string. Yes, it is truncated as you state.
However, the HDF5 User's Guide (page 228) says:
[...] a length and data buffer must be allocated.
I don't see how.
This is not much of an answer. However, could it be that 512 is a default value that needs to be replaced by an appropriate value?
4 Comments
Oleg Komarov on 15 Sep 2012
Edited: Oleg Komarov on 15 Sep 2012
I found a description of the fields for H5F.get_mdc_config at http://www.hdfgroup.org/HDF5/doc/RM/RM_H5F.html#File-SetMdcConfig, and the properties set_initial_size and initial_size may be relevant to the buffer.
However, I am unsure where to set those properties, at the File, dataset or property list level (H5F, H5D, H5P)...
I think it would be faster if you submitted a technical support request to TMW or to the HDFgroup.
Post any solution here (I am curious as well).
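The two fields mentioned in this comment can be inspected with a few lines like the ones below. This is a sketch only: the scratch file name is a placeholder, and the field names are taken from the comment and the linked reference page. Note that these settings describe the HDF5 metadata cache, so they may well turn out to be unrelated to the VLEN truncation.

```matlab
% Inspect the metadata cache configuration of a scratch HDF5 file.
filename = [tempname '.h5'];   % placeholder scratch file
fid = H5F.create(filename, 'H5F_ACC_TRUNC', 'H5P_DEFAULT', 'H5P_DEFAULT');

% Returns a struct describing the current metadata cache settings
config = H5F.get_mdc_config(fid);
fprintf('set_initial_size = %d\n', config.set_initial_size);
fprintf('initial_size     = %d bytes\n', config.initial_size);

H5F.close(fid);
delete(filename);
```

If a changed value were to be tried, the modified struct would go back through H5F.set_mdc_config(fid, config), which answers the question of level: these settings apply per open file.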
per isakson on 16 Sep 2012
Edited: per isakson on 16 Sep 2012
Here is a link to hdf-forum. A few Matlab related questions have been answered there. I cannot really contribute.
