Is there a way to identify group names in an H5 file programmatically without using h5info?

I am working with large (0.5-2 GB), complex H5 data files and trying to identify the top-level group names in them. The group names change from file to file, so I need to identify them programmatically. Below the top level, the file structure is consistent, so I can use h5read efficiently once I have these 5-10 group names. Using hdf5info works, but it is very slow because it scans the entire file and returns far more metadata than I need (each top-level group has thousands of nested lower-level groups, datasets, and attributes). The MATLAB-recommended h5info is, for some reason, much slower still. In fact, I have never let it run to completion; I usually give up after half an hour.
I have also tried setting the "ReadAttributes" option to false, which for some reason made hdf5info take even longer to run. Is there a more efficient way to identify only the top-level group names in the H5 file?
Thanks,

Accepted Answer

Jacob on 23 Nov 2022
I finally figured it out by using the low-level H5 functions built into MATLAB (H5G.get_objname_by_idx). This is far faster than running the full hdf5info.
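fid = H5F.open('test.H5');      % opens the file read-only by default
group_names = {};               % stays empty if no object names are found
idx = 0;                        % object indices are zero-based
while true
    this_name = H5G.get_objname_by_idx(fid, idx);
    if isempty(this_name)       % empty name: stop scanning
        break
    end
    group_names{idx+1,1} = this_name;
    idx = idx + 1;
end
num_grps = numel(group_names);
H5F.close(fid);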
1 Comment
John Wolter on 3 Jul 2024
Edited: John Wolter on 3 Jul 2024
Note that Jacob's routine does not test the object names it finds to see whether they are actually groups. I wrote a version of this to traverse the full group tree and discovered that H5G.get_objname_by_idx found groups, datasets, and datatypes in my file. There were no links in my file, so I don't know whether it would find those as well.
I used the routine below to distinguish between datasets and groups, but I wouldn't be surprised if there is a more elegant solution.
try
    did = H5D.open(fid, full_name);
    % dataset open succeeded; do action appropriate for a dataset
    % ...
    H5D.close(did);
catch
    try
        gid = H5G.open(fid, full_name);
        % group open succeeded; do action appropriate for a group
        % ...
        H5G.close(gid);
    catch
        % neither a dataset nor a group (for example, a named datatype)
        % ...
    end
end
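One possible alternative to the nested try/catch, offered as a sketch rather than a tested solution for files of this size: open each member of the root group generically with H5O.open and ask H5I.get_type what kind of object the identifier refers to. The file name 'example.h5' (the sample file that ships with MATLAB) and the variable names are placeholders, and the sketch assumes every link under the root resolves to an object; a dangling soft link would make H5O.open throw.

```matlab
% Sketch: collect only the names of groups directly under the root.
filename = 'example.h5';            % placeholder; substitute your own file
fid = H5F.open(filename);           % read-only by default
gid = H5G.open(fid, '/');           % root group

info = H5G.get_info(gid);           % info.nlinks = number of links in the root
group_type = H5ML.get_constant_value('H5I_GROUP');

group_names = cell(0, 1);
for idx = 0:double(info.nlinks) - 1
    % Name of the idx-th link in the root group, in increasing name order
    name = H5L.get_name_by_idx(fid, '/', 'H5_INDEX_NAME', ...
                               'H5_ITER_INC', idx, 'H5P_DEFAULT');
    % Open the object without knowing its kind, then query the kind
    oid = H5O.open(fid, name, 'H5P_DEFAULT');
    is_group = (H5I.get_type(oid) == group_type);
    H5O.close(oid);
    if is_group
        group_names{end+1, 1} = name; %#ok<SAGROW>
    end
end

H5G.close(gid);
H5F.close(fid);
```

Because the loop only touches the links of the root group, it should not need to walk the nested groups, datasets, and attributes underneath.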

Release

R2019a
