
qnn.LPAI

Interface to predict responses of deep learning model for QNN LPAI backend

Since R2025b

Description

The qnn.LPAI System object is an interface to predict responses of a deep learning model, represented as a QNN context binary, for the LPAI backend of Qualcomm® AI Engine Direct.

To create the interface to predict responses with the QNN LPAI backend:

  1. Create the qnn.LPAI object and set its properties.

  2. Call the object with arguments, as if it were a function.
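
For instance, a minimal sketch of these two steps (the file name is a hypothetical placeholder; on a Windows host, also specify QNNHostModel as described under Creation):

    % Step 1: Create the object and set its properties.
    qnnlpai = qnn.LPAI("BINARY",QNNContextBinary="mymodel.bin");

    % Step 2: Call the object with input data, as if it were a function.
    qnnresponse = qnnlpai(x);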

To learn more about how System objects work, see What Are System Objects?

The code generated using the qnn.LPAI System object can be deployed to a Qualcomm Android board, available under the Hardware board parameter in Configuration Parameters.

Creation

Description

Windows Host

qnnlpai = qnn.LPAI("BINARY",QNNHostModel="qnnhostmodel.dll",QNNContextBinary="qnncontextbinary.bin") creates an interface to predict responses of a QNN model (a .dll file for the host and a context binary file (.bin) for the target) for the LPAI backend.

qnnlpai = qnn.LPAI("BINARY",QNNHostModel="qnnhostmodel.dll",QNNContextBinary="qnncontextbinary.bin",DeQuantizeOutput=true) creates an interface similar to the previous syntax and additionally dequantizes the output.
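
For example, on a Windows host (the file names are illustrative placeholders):

    qnnlpai = qnn.LPAI("BINARY", ...
        QNNHostModel="qnnhostmodel.dll", ...
        QNNContextBinary="qnncontextbinary.bin");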

Linux Host

qnnlpai = qnn.LPAI("BINARY",QNNContextBinary="qnncontextbinary.bin") creates an interface to predict responses of a QNN model (a context binary file (.bin) for both the host and the target) for the LPAI backend.

qnnlpai = qnn.LPAI("BINARY",QNNContextBinary="qnncontextbinary.bin",DeQuantizeOutput=true) creates an interface similar to the previous syntax and additionally dequantizes the output.
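
For example, on a Linux host only the context binary is required (the file name is an illustrative placeholder):

    qnnlpai = qnn.LPAI("BINARY", ...
        QNNContextBinary="qnncontextbinary.bin", ...
        DeQuantizeOutput=true);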

Properties


Unless otherwise indicated, properties are nontunable, which means you cannot change their values after calling the object. Objects lock when you call them, and the release function unlocks them.

If a property is tunable, you can change its value at any time.

For more information on changing property values, see System Design in MATLAB Using System Objects.

This property is read-only.

The format of the deep learning network optimized to run on the LPAI backend on the target, specified as a string.

Data Types: string

QNNHostModel

Specify the QNN model (.dll) to perform inference on a Windows host. This property is not required for a Linux host. For details on creating a QNN model to run on device processors like LPAI, refer to the Qualcomm AI Engine Direct SDK documentation.

If the QNN model is not present in the Current Folder in MATLAB, specify the absolute path along with <filename>.dll.

QNNContextBinary

Specify the QNN context binary file (.bin) used on the target to perform inference. For details on creating a binary file to run on device processors like LPAI, refer to the Qualcomm AI Engine Direct SDK documentation.

If the QNN context binary file is not present in the Current Folder in MATLAB, specify the absolute path along with <filename>.bin.

DeQuantizeOutput

Set the value to true to dequantize the output after inference. With this setting, the output data type is always single, irrespective of the data type of the deep learning network's output layer.
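
For example, a minimal sketch of the effect of this property, assuming a quantized context binary (the file name quantizedmodel.bin is a placeholder):

    qnnlpai = qnn.LPAI("BINARY", ...
        QNNContextBinary="quantizedmodel.bin", ...
        DeQuantizeOutput=true);
    qnnresponse = qnnlpai(x);
    class(qnnresponse)   % returns 'single' whenever DeQuantizeOutput is true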

Usage

Description

qnnresponse = qnnlpai(x) predicts responses for the QNN LPAI backend using the qnnlpai System object, based on the input data, x.

Instead of calling the System object directly, you can also use the predict function to obtain the response.
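
For example, these two calls are equivalent:

    qnnresponse = qnnlpai(x);            % call the System object directly
    qnnresponse = predict(qnnlpai,x);    % use the predict object function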

Input Arguments


The input signal, specified as an N-dimensional array. The array must be the same size as the input layer of the QNN host model.

The System object supports multiple-input multiple-output tensors with a maximum of four dimensions, but the batch size must always be 1. For example, if the input layer of the original deep learning network is 128-by-128-by-3, the input signal dimension must be either 128-by-128-by-3 or 1-by-128-by-128-by-3.

If the leading dimensions are singleton dimensions (of size 1), you can remove them without affecting compatibility. For example, if the input layer of an AI model expects an input of size 1-by-1-by-128-by-3, you can specify an input of size 1-by-1-by-128-by-3 or 128-by-3. You can remove these dimensions because dimensions of size 1 broadcast to match the expected shape.
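
A short sketch of this shape flexibility, assuming qnnlpai is an interface object for a network whose input layer is 128-by-128-by-3:

    x3 = rand(128,128,3,'single');      % matches the input layer directly
    x4 = reshape(x3,[1 128 128 3]);     % same data with a leading singleton batch dimension
    % Both shapes are accepted; the singleton dimension broadcasts to the expected shape.
    qnnresponse = qnnlpai(x3);
    qnnresponse = qnnlpai(x4);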

The input data type must match the QNN network's input layer data type. Additionally, the input can be floating point even for a quantized QNN network.

Output Arguments


The response after computing predictions using the selected QNN model, represented as an N-dimensional array. The output data types match the data types of the QNN network's output layers. If you set DeQuantizeOutput to true, the output is always single.

The System object supports multiple-input multiple-output tensors with a maximum of four dimensions, but the batch size must always be 1.

Object Functions

To use an object function, specify the System object™ as the first input argument. For example, to release system resources of a System object named obj, use this syntax:

release(obj)


predict - Predict response based on given data using System objects created for QNN backends (HTP, CPU, or LPAI) or eNPU
release - Release resources and allow changes to System object property values and input characteristics
clone - Create duplicate System object

Examples


  1. Prepare the QNN model files for the host and the target, for the LPAI backend. To create the interface to the LPAI backend, it is recommended that you copy the files to the Current Folder in MATLAB. Alternatively, note the absolute paths of the files.

  2. Prepare the input data for inference. This example uses uniformly distributed random numbers of the single data type.

    x = rand(299,299,3,'single');
  3. Create the QNN LPAI interface object. This example first checks the operating system (Linux or Windows) to use the appropriate model files. The host and target model files must be present in the Current Folder in MATLAB.

    if isunix
        obj = qnn.LPAI("BINARY",...
            QNNContextBinary="inception.serialized.bin");
    else
        obj = qnn.LPAI("BINARY",...
            QNNHostModel="Inception.dll",...
            QNNContextBinary="inception.serialized.bin");
    end

  4. Predict the response by using the predict function.

    qnnresponse = obj.predict(x);

Version History

Introduced in R2025b