Definition of "Frame" for Recurrent models

Tommaso on 27 Jan 2025
Answered: Wang Chen on 31 Jan 2025
Hi all,
I am profiling the performance of the MATLAB Deep Learning HDL Toolbox for LSTM-based models.
When estimating the processor's performance with the estimatePerformance function, I couldn't find a clear definition of the term "Frame" in the context of recurrent models.
Does it refer to a single element within the sequence to be processed (i.e. a timestep), or does it represent the entire sequence?
Since changing the number of timesteps in the model doesn't affect the processing time estimate for 1 frame, I am assuming it refers to a single element of the sequence. However, I would appreciate confirmation or clarification on this.
Thanks in advance for your help!

Answers (2)

Sreeram on 29 Jan 2025
Hi Tommaso,
The term "Frame" stands for the entire sequence here, and not a single element. This applies to all layers supported by the toolbox, including those used in recurrent models like LSTMs.
The "FrameCount" parameter of the "estimatePerformance" function specifies the number of such sequences (or frames) considered during performance estimation. By default, "FrameCount" is set to 1, meaning the performance metrics are calculated for a single sequence. Adjusting "FrameCount" lets you estimate performance for multiple sequences as needed.
Thanks!
  1 comment
Tommaso on 29 Jan 2025
Hi Sreeram,
Thank you for clarifying that the term "Frame" refers to the entire sequence.
However, I don't understand why the Frames/s metric (i.e., Sequences/s) does not depend on the number of timesteps (i.e., the length of the sequences).
For reference, here is the code I am running. I imported the model from TensorFlow. The input is a sequence of 235 words (number of words = "n_timesteps"), and the Embedding layer transforms each word into an array of 32 elements. Thus, the LSTM processes a 32-element array at each of the 235 timesteps.
clc
% Initialize model (code generated by the Deep Network Designer app after importing the TensorFlow model)
params = load("params_2025_01_29__16_09_42.mat");
net = dlnetwork;
% Number of timesteps (toggle comment to explore the three cases)
%n_timesteps = 1
n_timesteps = 235 % Original value imported from tf
%n_timesteps = 100000
tempNet = [
sequenceInputLayer(1,"Name","embedding_input","MinLength",n_timesteps)
wordEmbeddingLayer(32,10000,"Name","embedding","Weights",params.embedding.Weights)
lstmLayer(32,"Name","lstm","BiasInitializer","narrow-normal","InputWeightsInitializer","narrow-normal","RecurrentWeightsInitializer","narrow-normal")
fullyConnectedLayer(1,"Name","dense","Bias",params.dense.Bias,"Weights",params.dense.Weights)
sigmoidLayer("Name","dense_sigmoid")];
net = addLayers(net,tempNet);
% clean up helper variable
clear tempNet;
net = initialize(net);
plot(net);
% Initialize dlhdl processor with default configuration
hPC = dlhdl.ProcessorConfig
% Optimize configuration for the target network
optimizeConfigurationForNetwork(hPC, net)
hPC.estimateResources
hPC.estimatePerformance(net)
I need to estimate the total time needed for processing the "n_timesteps" elements given in input to the model.
In this example, whether I provide a sequence of 1, 235, or 100000 elements, the Frames/s value remains the same.
Could you please explain this behavior?
Thank you very much for your help!


Wang Chen on 31 Jan 2025
Hi Tommaso,
To clarify, when running an LSTM network, the term "Frame" in the estimatePerformance report refers to a single element of the sequence, not the entire sequence.
For example, the profiler report uses the same format as the estimatePerformance report.
In the context of the LSTM network:
  1. FramesNum is the sequence length (the number of elements in the sequence).
  2. Total Latency is the total number of FPGA clock cycles used to run the entire sequence.
  3. Frames/s is the number of sequence elements processed per second. From this Frames/s value and your sequence length, you can calculate the total time in seconds needed to process the entire sequence.
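As a rough sketch of the calculation in point 3, assuming one frame is one sequence element: the Frames/s value below is a made-up placeholder, not a number from any actual report, and the variable names are only illustrative. Substitute the Frames/s value from your own estimatePerformance output.

```matlab
% Hypothetical inputs for illustration only.
framesPerSecond = 5000;  % placeholder for the "Frames/s" value in your report
nTimesteps      = 235;   % sequence length, as in the question

% One frame is one sequence element, so processing the whole sequence
% takes (number of elements) / (elements per second).
secondsPerSequence = nTimesteps / framesPerSecond;

% Equivalent throughput expressed in whole sequences per second.
sequencesPerSecond = framesPerSecond / nTimesteps;

fprintf('Time per sequence: %.4f s\n', secondsPerSequence);
fprintf('Sequences per second: %.2f\n', sequencesPerSecond);
```

This would also explain why Frames/s does not change with n_timesteps: it is a per-element rate, so only the derived per-sequence time scales with the sequence length.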
Thanks,
Wang

Release

R2024a
