Why is the DatasetRef 'get' method faster with an index than with a name?
Marco
on 3 Jun 2025
Commented: Walter Roberson
on 16 Jul 2025
I would like to load some elements from a big dataset (output of a Simulink simulation).
I decided to use the Simulink.SimulationData.DatasetRef() function to avoid loading the entire dataset into my workspace. For example:
ref_data = Simulink.SimulationData.DatasetRef(saving_path, "logout");
Then I tried to use the get() method of the DatasetRef to load some elements. I noticed that if I pass the element name, the method is slow, whereas if I pass the element index, it is much faster.
Here is an example:
clear
saving_path = 'dataset.mat';
el_name = 'el_name';
ref_data = Simulink.SimulationData.DatasetRef(saving_path, "logout");
tic
a = ref_data.get(el_name).Values;
disp('Time with name:')
toc
tic
index = find(strcmp(ref_data.getElementNames, el_name));
b = ref_data.get(index).Values;
disp('Time with index:')
toc
if isequal(a,b)
disp('a and b are equal')
end
The result is:
Time with name:
Elapsed time is 4.327172 seconds.
Time with index:
Elapsed time is 0.035908 seconds.
a and b are equal
(Tested in MATLAB R2024b and MATLAB R2022b.)
Why does the call with the element name take much more time?
The solution with the index is simple and effective, but less readable.
3 Comments
Walter Roberson
on 4 Jun 2025
Interesting. I would expect minor differences, but nowhere near the difference that you see.
Accepted Answer
Ronit
on 16 Jul 2025
Edited: Ronit
on 16 Jul 2025
This slowdown happens because 'get(name)' does a linear search through all element names each time, which is slow for large datasets. In contrast, 'get(index)' directly accesses the element, making it much faster.
If you need to access elements by name but want better speed, I recommend building your own mapping at the start as a workaround:
names = ref_data.getElementNames;
name2idx = containers.Map(names, 1:numel(names));
idx = name2idx(el_name);
a = ref_data.get(idx).Values;
Setting up the map is also linear time, but you only pay this cost once. The big advantage is that all subsequent lookups by name are extremely fast (constant time), rather than slow linear searches every time. This makes a big difference if you need to access many elements by name repeatedly.
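As a minimal sketch of this pay-once, look-up-many-times pattern, the snippet below builds the map from a hypothetical list of names (standing in for the output of ref_data.getElementNames) and resolves several names to indices in one call. The element names and the 'wanted' list are invented for illustration, and the sketch assumes element names are unique, since a map holds only one index per name. The isKey check guards against names that are not in the dataset.

```matlab
% Hypothetical element names; in practice these would come from
% ref_data.getElementNames (assumed here to be unique).
names = {'speed', 'torque', 'current'};

% Build the name -> index map once (linear in the number of names).
name2idx = containers.Map(names, 1:numel(names));

% Names to load; 'voltage' is deliberately absent to show the guard.
wanted = {'current', 'speed', 'voltage'};

% Keep only the names that actually exist in the map.
found = wanted(isKey(name2idx, wanted));

% Resolve all remaining names to indices in a single call.
idx = cell2mat(values(name2idx, found));

% Each index could then be passed on, e.g. ref_data.get(idx(k)).Values.
disp(idx)
```

With a real DatasetRef, each entry of idx would be passed to get() in a loop, so the slow name-based call is never used.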
Please refer to the documentation page for 'containers.Map' for more details: https://www.mathworks.com/help/matlab/ref/containers.map.html
I hope this helps with your query!