System Identification Overview

System identification is a methodology for building mathematical models of dynamic systems using measurements of the input and output signals of the system.

The process of system identification requires that you:

Measure the input and output signals from your system in time or frequency domain.
Select a model structure.
Apply an estimation method to estimate values for the adjustable parameters in the candidate model structure.
Evaluate the estimated model to see if the model is adequate for your application needs.

Dynamic Systems and Models

In a dynamic system, the values of the output signals depend on both the instantaneous values of the input signals and the past behavior of the system. For example, a car seat is a dynamic system: the seat shape (settling position) depends on both the current weight of the passenger (instantaneous value) and how long the passenger has been riding in the car (past behavior).

A model is a mathematical relationship between the input and output variables of the system. Models of dynamic systems are typically described by differential or difference equations, transfer functions, state-space equations, and pole-zero-gain models.

You can represent dynamic models in both continuous-time and discrete-time form.

An often-used example of a dynamic model is the equation of motion of a spring-mass-damper system. As the following figure shows, the mass moves in response to the force F(t) applied on the base to which the mass is attached. The input and output of this system are the force F(t) and displacement y(t), respectively.

Figure: Spring-mass-damper system, with the force vector and base on the left, the spring and damper in the middle, and the mass and output vector on the right.

Continuous-Time Dynamic Model Example

You can represent the same physical system as several equivalent models. For example, you can represent the mass-spring-damper system in continuous time as a second-order differential equation:

$m \frac{d^{2} y}{d t^{2}} + c \frac{d y}{d t} + k y (t) = F (t)$

Here, m is the mass, k is the stiffness constant of the spring, and c is the damping coefficient. The solution to this differential equation lets you determine the displacement of the mass, y(t), as a function of the external force F(t) at any time t for known values of the constants m, c, and k.

Consider the displacement y(t) and velocity $v (t) = \frac{d y (t)}{d t}$ as state variables:

$x (t) = \begin{bmatrix} y (t) \\ v (t) \end{bmatrix}$

You can express the previous equation of motion as a state-space model of the system:

$\begin{array}{l} \frac{d x}{d t} = A x (t) + B F (t) \\ y (t) = C x (t) \end{array}$

The matrices A, B, and C are related to the constants m, c, and k as follows:

$\begin{array}{l} A = \begin{bmatrix} 0 & 1 \\ - \frac{k}{m} & - \frac{c}{m} \end{bmatrix} \\ B = \begin{bmatrix} 0 \\ \frac{1}{m} \end{bmatrix} \\ C = \begin{bmatrix} 1 & 0 \end{bmatrix} \end{array}$

You can also obtain a transfer function model of the spring-mass-damper system by taking the Laplace transform of the differential equation:

$G (s) = \frac{Y (s)}{F (s)} = \frac{1}{(m s^{2} + c s + k)}$

Here, s is the Laplace variable.
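The equivalence of the state-space and transfer function forms can be checked numerically. The following is a minimal sketch in base MATLAB; the values of m, c, and k and the test frequencies are arbitrary assumptions chosen for illustration.

```matlab
% Illustrative constants (assumed values, not from a real system)
m = 2;      % mass
c = 0.5;    % damping coefficient
k = 8;      % spring stiffness

% State-space matrices for the state x = [y; v]
A = [0 1; -k/m -c/m];
B = [0; 1/m];
C = [1 0];

% Evaluate both model forms at s = i*w for a few test frequencies
w = [0.1 1 2 10];                      % rad/s
Gss = zeros(size(w));
Gtf = zeros(size(w));
for n = 1:numel(w)
    s = 1i*w(n);
    Gss(n) = C*((s*eye(2) - A)\B);     % C*(sI - A)^(-1)*B
    Gtf(n) = 1/(m*s^2 + c*s + k);      % transfer function form
end

maxDiff = max(abs(Gss - Gtf));         % expected to be at round-off level
```

Both forms give the same complex gain at every test frequency, which is what you expect from two representations of the same system.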

Discrete-Time Dynamic Model Example

Suppose you can observe only the input and output variables F(t) and y(t) of the mass-spring-damper system at discrete time instants t = nT_s, where T_s is a fixed time interval and n = 0, 1, 2, .... The variables are said to be sampled with sample time T_s. Then, you can represent the relationship between the sampled input-output variables as a second-order difference equation, such as

$y (t) + a_{1} y (t - T_{s}) + a_{2} y (t - 2 T_{s}) = b F (t - T_{s})$

Often, for simplicity, T_s is taken as one time unit, and the equation can be written as

$y (t) + a_{1} y (t - 1) + a_{2} y (t - 2) = b F (t - 1)$

Here, a₁, a₂, and b are the model parameters. The model parameters are related to the system constants m, c, and k, and the sample time T_s.
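As a sketch of how a₁ and a₂ depend on m, c, k, and T_s: if the continuous-time model is sampled exactly, each continuous-time pole p maps to the discrete-time pole e^(pT_s), and the denominator coefficients follow from those poles. The constants below are arbitrary assumptions. The numerator coefficient also depends on the assumed intersample behavior of the input and is not computed here.

```matlab
m = 2; c = 0.5; k = 8;     % assumed constants
Ts = 0.1;                  % sample time in seconds

A  = [0 1; -k/m -c/m];
pc = eig(A);               % continuous-time poles
pd = exp(pc*Ts);           % corresponding discrete-time poles

den = real(poly(pd));      % denominator coefficients [1 a1 a2]
a1  = den(2);
a2  = den(3);
```

With these assumptions, a₂ equals the product of the discrete-time poles, e^(–cT_s/m), so heavier damping or a longer sample time drives a₂ toward zero.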

This difference equation shows the dynamic nature of the model. The displacement value at the time instant t depends not only on the value of force F at a previous time instant, but also on the displacement values at the previous two time instants y(t–1) and y(t–2).

You can use this equation to compute the displacement at a specific time. The displacement is represented as a weighted sum of the past input and output values:

$y (t) = b F (t - 1) - a_{1} y (t - 1) - a_{2} y (t - 2)$

This equation shows an iterative way of generating values of the output y(t) starting from initial conditions y(0) and y(1) and measurements of input F(t). This computation is called simulation.

Alternatively, the output value at a given time t can be computed using the measured values of output at the previous two time instants and the input value at a previous time instant. This computation is called prediction. For more information on simulation and prediction using a model, see topics on the Simulation and Prediction page.
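The difference between simulation and prediction can be sketched in base MATLAB. The parameter values, the step input, and the small sinusoidal disturbance added to the "measured" output are all assumptions made for illustration. Because MATLAB indexing starts at 1, the first two samples play the role of the initial conditions.

```matlab
% Assumed model parameters and a short input record (illustrative only)
a1 = -1.5; a2 = 0.7; b = 0.2;
N  = 50;
F  = ones(N,1);                          % unit step force

% Reference response and a "measured" output with a small disturbance d
yTrue = filter([0 b], [1 a1 a2], F);
d     = 0.01*sin((1:N)');
yMeas = yTrue + d;

% Simulation: uses the input and previously SIMULATED outputs
ySim = zeros(N,1);
ySim(1:2) = yTrue(1:2);                  % initial conditions assumed known
for t = 3:N
    ySim(t) = b*F(t-1) - a1*ySim(t-1) - a2*ySim(t-2);
end

% One-step prediction: uses the input and previously MEASURED outputs
yPred = zeros(N,1);
yPred(1:2) = yMeas(1:2);
for t = 3:N
    yPred(t) = b*F(t-1) - a1*yMeas(t-1) - a2*yMeas(t-2);
end

simErr  = yMeas - ySim;                  % simulation error
predErr = yMeas - yPred;                 % one-step prediction error
```

The simulated output never sees the measured output, so its error is the full disturbance d. The predicted output is corrected by the two most recent measurements at every step.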

You can also represent a discrete-time equation of motion in state-space and transfer-function forms by performing transformations similar to those described in Continuous-Time Dynamic Model Example.

Use Measured Data in System Identification

System identification uses the input and output signals you measure from a system to estimate the values of adjustable parameters in a given model structure. You can build models using time-domain input-output signals, frequency response data, time-series signals, and time-series spectra.

To obtain a good model of your system, you must have measured data that reflects the dynamic behavior of the system. The accuracy of your model depends on the quality of your measurement data, which in turn depends on your experimental design.

Time-Domain Data

Time-domain data consists of the input and output variables of the system that you record at a uniform sampling interval over a period of time.

For example, if you measure the input force F(t) and mass displacement y(t) of the spring-mass-damper system illustrated in Dynamic Systems and Models at a uniform sampling frequency of 10 Hz, you obtain the following vectors of measured values:

$\begin{array}{l} u_{\mathrm{meas}} = [F (T_{s}), F (2 T_{s}), F (3 T_{s}), ..., F (N T_{s})] \\ y_{\mathrm{meas}} = [y (T_{s}), y (2 T_{s}), y (3 T_{s}), ..., y (N T_{s})] \end{array}$

Here, T_s = 0.1 seconds and NT_s is the time of the last measurement.

If you want to build a discrete-time model from this data, the data vectors u_meas and y_meas and the sample time T_s provide sufficient information for creating such a model.

If you want to build a continuous-time model, you must also know the intersample behavior of the input signals during the experiment. For example, the input can be piecewise constant (zero-order hold) or piecewise linear (first-order hold) between samples.

Frequency-Domain Data

Frequency-domain data represents measurements of the system input and output variables that you record or store in the frequency domain. The frequency-domain signals are Fourier transforms of the corresponding time-domain signals.

Frequency-domain data can also represent the frequency response of the system, which is a set of complex response values over a given frequency range. The frequency response describes the outputs to sinusoidal inputs. If the input is a sine wave with frequency ω, then the output is also a sine wave of the same frequency, whose amplitude is A(ω) times the input signal amplitude and whose phase is shifted by Φ(ω) with respect to the input signal. The frequency response is A(ω)e^(iΦ(ω)).

In the case of the mass-spring-damper system, you can obtain the frequency response data by using a sinusoidal input force and measuring the corresponding amplitude gain and phase shift of the response over a range of input frequencies.
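If the constants of the mass-spring-damper system were known, the amplitude gain and phase shift could be computed directly from the transfer function G(s) evaluated at s = iω. The following sketch assumes arbitrary values for m, c, and k and an arbitrary frequency grid.

```matlab
m = 2; c = 0.5; k = 8;            % assumed constants
w = logspace(-1, 1, 200);         % frequency grid in rad/s

G   = 1 ./ (m*(1i*w).^2 + c*(1i*w) + k);   % G(i*w)
amp = abs(G);                     % amplitude gain A(w)
ph  = angle(G);                   % phase shift Phi(w) in radians

% Response at the undamped natural frequency wn = sqrt(k/m)
wn = sqrt(k/m);
Gn = 1/(m*(1i*wn)^2 + c*(1i*wn) + k);
ampN = abs(Gn);                   % equals 1/(c*wn)
phN  = angle(Gn);                 % equals -pi/2 (90 degree lag)
```

In an experiment, you would obtain the same quantities by measuring the output amplitude and phase for each sinusoidal input frequency instead of evaluating a known formula.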

You can use frequency-domain data to build both discrete-time and continuous-time models of your system.

Data Quality Requirements

System identification requires that your data capture the important dynamics of your system. Good experimental design ensures that you measure the right variables with sufficient accuracy and duration to capture the dynamics you want to model. In general, your experiment must:

Use inputs that excite the system dynamics adequately. For example, a single step is seldom enough excitation.
Measure data long enough to capture the important time constants.
Set up a data acquisition system that has a good signal-to-noise ratio.
Measure data at appropriate sampling intervals or frequency resolution.

You can analyze the data quality before building the model using the functions and techniques described in Analyze Data. For example, you can analyze the input spectra to determine if the input signals have sufficient power over the bandwidth of the system. To get analysis and processing recommendations for your specific data, use the advice command.
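As a rough illustration of checking input power over a frequency band, the following base MATLAB sketch compares a single step with a swept sine (chirp) using the squared magnitude of the FFT. The signals, the 10 Hz sampling rate, and the band edge of 0.5 Hz are assumptions for illustration, and this is not the method the toolbox uses internally.

```matlab
Ts = 0.1;  N = 1000;                     % assumed sample time and length
t  = (0:N-1)'*Ts;
f  = (0:N-1)'/(N*Ts);                    % FFT frequency grid in Hz

uStep = [zeros(100,1); ones(N-100,1)];   % single step input

f0 = 0.1; f1 = 2; T = N*Ts;              % chirp sweeping 0.1 Hz to 2 Hz
uChirp = sin(2*pi*(f0 + (f1 - f0)*t/(2*T)).*t);

pos  = f > 0 & f <= 1/(2*Ts);            % positive frequencies, no DC
band = f >= 0.5 & f <= 1/(2*Ts);         % assumed band of interest

Pstep  = abs(fft(uStep)).^2;
Pchirp = abs(fft(uChirp)).^2;

% Fraction of non-DC input power that falls inside the band
fracStep  = sum(Pstep(band))  / sum(Pstep(pos));
fracChirp = sum(Pchirp(band)) / sum(Pchirp(pos));
```

The step concentrates almost all of its power at the lowest frequencies, while the chirp spreads power across the band, which is consistent with the remark that a single step is seldom enough excitation.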

You can also analyze your data to determine peak frequencies, input delays, important time constants, and indications of nonlinearities using nonparametric analysis tools in this toolbox. You can use this information for configuring model structures for building models from data.

Build Models from Data

Model Structure

A model structure is a mathematical relationship between input and output variables that contains unknown parameters. Examples of model structures are transfer functions with adjustable poles and zeros, state-space equations with unknown system matrices, and nonlinear parameterized functions.

The following difference equation represents a simple model structure:

$y (k) + a y (k - 1) = b u (k)$

Here, a and b are adjustable parameters.
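For this simple structure, the parameters can be estimated with linear least squares, because the equation is linear in a and b. The following base MATLAB sketch generates noise-free data from assumed parameter values and recovers them. The input signal and the parameter values are illustrative assumptions, and the sketch is not a description of the toolbox algorithms.

```matlab
% Generate noise-free data from known (assumed) parameter values
aTrue = -0.8;  bTrue = 0.5;
N = 100;
u = sign(sin(0.7*(1:N)'));         % deterministic square-wave-like input
y = zeros(N,1);
y(1) = bTrue*u(1);                 % zero initial condition assumed
for k = 2:N
    y(k) = -aTrue*y(k-1) + bTrue*u(k);
end

% Rewrite the model as y(k) = [-y(k-1), u(k)] * [a; b] and solve the
% overdetermined system in the least-squares sense
Phi   = [-y(1:N-1), u(2:N)];
theta = Phi \ y(2:N);
aEst  = theta(1);
bEst  = theta(2);
```

With noisy measurements, the estimates would no longer match the true values exactly, and the choice of estimation method and noise model becomes important.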

The system identification process requires that you choose a model structure and apply the estimation methods to determine the numerical values of the model parameters.

You can use one of the following approaches to choose the model structure:

You want a model that is able to reproduce your measured data and is as simple as possible. You can try various mathematical structures available in the toolbox. This modeling approach is called black-box modeling.
You want a specific structure for your model, which you might have derived from first principles, but do not know numerical values of its parameters. You can represent the model structure as a set of equations or as a state-space system in MATLAB® and estimate the values of its parameters from data. This approach is known as grey-box modeling.

Estimate Model Parameters

The System Identification Toolbox™ software estimates model parameters by minimizing the error between the model output and the measured response. The output y_model of the linear model is given by

y_model(t) = Gu(t)

Here, G is the transfer function.

To determine G, the toolbox minimizes the difference between the model output y_model(t) and the measured output y_meas(t). The minimization criterion is a weighted norm of the error, v(t), where

v(t) = y_meas(t) – y_model(t).

y_model(t) is one of the following:

Simulated response Gu(t) of the model for a given input u(t)
Predicted response of the model for a given input u(t) and past measurements of the output (y_meas(t-1), y_meas(t-2),...)

Accordingly, the error v(t) is called the simulation error or prediction error. The estimation algorithms adjust parameters in the model structure G such that the norm of this error is as small as possible.
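As one concrete instance of such a norm (an illustrative assumption; other weightings are possible), the unweighted quadratic criterion over N samples is

$V_{N} (\theta) = \frac{1}{N} \sum_{t = 1}^{N} v^{2} (t, \theta)$

Here, θ is a symbol introduced for this illustration that collects the adjustable parameters of G, and v(t, θ) is the simulation or prediction error obtained for those parameter values.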

Configure Parameter Estimation Algorithm

You can configure the estimation algorithm by:

Configuring the minimization criterion to focus the estimation in a desired frequency range, for example, to put more emphasis at lower frequencies and deemphasize higher frequency noise contributions. You can also configure the criterion to target the intended application needs for the model, such as simulation or prediction.
Specifying optimization options for iterative estimation algorithms.
The majority of estimation algorithms in this toolbox are iterative. You can configure an iterative estimation algorithm by specifying options, such as the optimization method and the maximum number of iterations.

For more information about configuring the estimation algorithm, see Options to Configure the Loss Function and the topics for estimating specific model structures.

Black-Box Modeling

Select Black-Box Model Structure and Order

Black-box modeling is useful when your primary interest is in fitting the data regardless of a particular mathematical structure of the model. The toolbox provides several linear and nonlinear black-box model structures, which have traditionally been useful for representing dynamic systems. These model structures vary in complexity depending on the flexibility you need to account for the dynamics and noise in your system. You can choose one of these structures and compute its parameters to fit the measured response data.

Black-box modeling is usually a trial-and-error process, where you estimate the parameters of various structures and compare the results. Typically, you start with a simple linear model structure and progress to more complex structures. You might also choose a model structure because you are more familiar with this structure or because you have specific application needs.

The simplest linear black-box structures require the fewest options to configure:

Transfer function, with a given number of poles and zeros
Linear ARX model, which is the simplest input-output polynomial model
State-space model, which you can estimate by specifying the number of model states

Estimation of some of these structures also uses noniterative estimation algorithms, which further reduces complexity.

You can configure a model structure using the model order. The definition of model order varies depending on the type of model you select. For example, if you choose a transfer function representation, the model order is related to the number of poles and zeros. For state-space representation, the model order corresponds to the number of states. In some cases, such as for linear ARX and state-space model structures, you can estimate the model order from the data.

If the simple model structures do not produce good models, you can select more complex model structures by:

Specifying a higher model order for the same linear model structure. A higher model order increases the model flexibility for capturing complex phenomena. However, an unnecessarily high order can make the model less reliable.
Explicitly modeling the noise by including the He(t) term, as shown in the following equation.
y(t) = Gu(t) + He(t)
Here, H models the additive disturbance by treating the disturbance as the output of a linear system driven by a white noise source e(t).
Using a model structure that explicitly models the additive disturbance can help to improve the accuracy of the measured component G. Furthermore, such a model structure is useful when your main interest is using the model for predicting future response values.
Using a different linear model structure.
See Linear Model Structures.
Using a nonlinear model structure.
Nonlinear models have more flexibility in capturing complex phenomena than linear models of similar orders. See Nonlinear Model Structures.

Ultimately, you choose the simplest model structure that provides the best fit to your measured data. For more information, see Estimating Linear Models Using Quick Start.

Regardless of the structure you choose for estimation, you can simplify the model for your application needs. For example, you can separate out the measured dynamics (G) from the noise dynamics (H) to obtain a simpler model that represents just the relationship between y and u. You can also linearize a nonlinear model about an operating point.

Use Nonlinear Model Structures

A linear model is often sufficient to accurately describe the system dynamics and, in most cases, a best practice is to first try to fit linear models. If the linear model output does not adequately reproduce the measured output, you might need to use a nonlinear model.

You can assess the need to use a nonlinear model structure by plotting the response of the system to an input. If you notice that the responses differ depending on the input level or input sign, try using a nonlinear model. For example, if the output response to an input step up is faster than the response to a step down, you might need a nonlinear model.

Before building a nonlinear model of a system that you know is nonlinear, try transforming the input and output variables such that the relationship between the transformed variables is linear. For example, consider a system that has current and voltage as inputs to an immersion heater, and the temperature of the heated liquid as an output. The output depends on the inputs through the power of the heater, which is equal to the product of current and voltage. Instead of building a nonlinear model for this two-input and one-output system, you can create a new input variable by taking the product of the current and voltage and building a linear model that describes the relationship between power and temperature.
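The heater example can be sketched in base MATLAB. The first-order temperature dynamics, the coefficient values, and the current and voltage signals below are all assumptions made for illustration. The point is only that, after the product is formed, a linear least-squares fit recovers the relationship.

```matlab
% Assumed heater dynamics driven by power p = current * voltage:
%   T(n) = 0.9*T(n-1) + 0.05*p(n-1)
N   = 200;
k   = (1:N)';
iIn = 2 + sin(0.05*k);             % current (illustrative signal)
vIn = 10 + 3*cos(0.11*k);          % voltage (illustrative signal)
p   = iIn .* vIn;                  % new input variable: electrical power

T = zeros(N,1);                    % temperature rise, starting at zero
for n = 2:N
    T(n) = 0.9*T(n-1) + 0.05*p(n-1);
end

% Linear least-squares fit of T(n) = th(1)*T(n-1) + th(2)*p(n-1)
Phi = [T(1:N-1), p(1:N-1)];
th  = Phi \ T(2:N);
```

A model that is linear in current and voltage separately could not reproduce this data exactly, whereas the model in terms of power can.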

If you cannot determine variable transformations that yield a linear relationship between input and output variables, you can use nonlinear structures such as nonlinear ARX or Hammerstein-Wiener models. For a list of supported nonlinear model structures and when to use them, see Nonlinear Model Structures.

Black-Box Estimation Example

You can use the System Identification app or commands to estimate linear and nonlinear models of various structures. In most cases, you choose a model structure and estimate the model parameters using a single command.

Consider the mass-spring-damper system described in Dynamic Systems and Models. If you do not know the equation of motion of this system, you can use a black-box modeling approach to build a model. For example, you can estimate transfer functions or state-space models by specifying the orders of these model structures.

A transfer function is a ratio of polynomials:

$G (s) = \frac{(b_{0} + b_{1} s + b_{2} s^{2} + ...)}{(1 + f_{1} s + f_{2} s^{2} + ...)}$

For the mass-spring-damper system, this transfer function is

$G (s) = \frac{1}{(m s^{2} + c s + k)}$

which is a system with no zeros and 2 poles.

In discrete-time, the transfer function of the mass-spring-damper system can be

$G (z^{- 1}) = \frac{b z^{- 1}}{(1 + f_{1} z^{- 1} + f_{2} z^{- 2})}$

where the model orders correspond to the number of coefficients of the numerator and the denominator (nb = 1 and nf = 2) and the input-output delay equals the lowest order exponent of z^–1 in the numerator (nk = 1).

In continuous time, you can build a linear transfer function model using the tfest command.

m = tfest(data,2,0)

Here, data is your measured input-output data, represented as an iddata object, and the model order is specified by the number of poles (2) and the number of zeros (0).

Similarly, you can build a discrete-time model with an Output-Error structure using the oe command.

m = oe(data,[1 2 1])

The model order is [nb nf nk] = [1 2 1]. Usually, you do not know the model orders in advance. Try several model order values until you find the orders that produce an acceptable model.

Alternatively, you can choose a state-space structure to represent the mass-spring-damper system and estimate the model parameters using the ssest or the n4sid command.

m = ssest(data,2)

Here, the second argument 2 represents the order, or the number of states in the model.

In black-box modeling, you do not need the equation of motion for the system — only a guess of the model orders.

For more information about building models, see Steps for Using the System Identification App and Model Estimation Commands.

Grey-Box Modeling

In some situations, you can deduce the model structure from physical principles. For example, the mathematical relationship between the input force and the resulting mass displacement in the spring-mass-damper system illustrated in Dynamic Systems and Models is well known. In state-space form, the model is given by

$\begin{array}{l} \frac{d x}{d t} = A x (t) + B F (t) \\ y (t) = C x (t) \end{array}$

where x(t) = [y(t);v(t)] is the state vector. The coefficients A, B, and C are functions of the model parameters:

A = [0 1; –k/m –c/m]

B = [0; 1/m]

C = [1 0]

Here, you fully know the model structure but do not know the values of its parameters—m, c, and k.

In the grey-box approach, you use the data to estimate the values of the unknown parameters of your model structure. You specify the model structure by a set of differential or difference equations in MATLAB and provide some initial guess for the unknown parameters specified.

In general, you build grey-box models by:

Creating a template model structure.
Configuring the model parameters with initial values and constraints (if any).
Applying an estimation method to the model structure and computing the model parameter values.

The following table summarizes the ways you can specify a grey-box model structure.

Grey-Box Structure Representation	Learn More
Represent the state-space model structure as a structured `idss` model object and estimate the state-space matrices A, B, and C. You can compute the parameter values, such as m, c, and k, from the state space matrices A and B. For example, m = 1/B(2) and k = –A(2,1)m.	Estimate State-Space Models with Canonical Parameterization; Estimate State-Space Models with Structured Parameterization
Represent the state-space model structure as an `idgrey` model object. You can directly estimate the values of parameters m, c, and k.	Grey-Box Model Estimation
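The relationships mentioned in the first table row can be written out in base MATLAB. The numerical values of A and B below are assumed stand-ins for an estimation result, not output from the toolbox.

```matlab
% Assumed estimation result for the structured state-space matrices
A = [0 1; -4 -0.25];
B = [0; 0.5];

m = 1/B(2);          % from B(2)   =  1/m
k = -A(2,1)*m;       % from A(2,1) = -k/m
c = -A(2,2)*m;       % from A(2,2) = -c/m
```

For these assumed matrices, the physical constants are m = 2, k = 8, and c = 0.5.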

Evaluate Model Quality

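After you estimate the model, you can evaluate the model quality by:

Comparing the model response to the measured response
Analyzing residuals
Analyzing model uncertainty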

Ultimately, you must assess the quality of your model based on whether the model adequately addresses the needs of your application. For information about other available model analysis techniques, see Model Analysis.

If you do not get a satisfactory model, you can iteratively improve your results by trying a different model structure, changing the estimation algorithm settings, or performing additional data processing. If these changes do not improve your results, you might need to revisit your experimental design and data gathering procedures.

Compare Model Response to Measured Response

Typically, you evaluate the quality of a model by comparing the model response to the measured output for the same input signal.

Suppose you use a black-box modeling approach to create dynamic models of the spring-mass-damper system. You try various model structures and orders, such as:

model1 = arx(data, [2 1 1]);   % ARX model with orders [na nb nk] = [2 1 1]
model2 = n4sid(data, 3);       % third-order state-space model

You can simulate these models with a particular input and compare their responses against the measured values of the displacement for the same input applied to the real system. The following figure compares the simulated and measured responses for a step input.

Figure: Plot with the simulated outputs of models 1 and 2 in blue and red, respectively, and the measured output in black. Visually, model 2 is a better fit to the measured data than model 1.

The figure indicates that model2 is better than model1 because model2 fits the data better (83% vs. 65%).

The fit percentage indicates the agreement between the model response and the measured output: 100 means a perfect fit, and 0 indicates a poor fit (that is, the model output has the same fit to the measured output as the mean of the measured output).
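The description above matches a fit measure based on the normalized root mean squared error. The following base MATLAB sketch computes such a measure for made-up data vectors; the formula is stated here as an assumption consistent with the two limiting cases in the text.

```matlab
yMeas  = [0; 1; 2; 3; 4];              % illustrative measured output
yModel = [0.1; 0.9; 2.2; 2.8; 4.0];    % illustrative model output

% Fit percentage: 100*(1 - ||yMeas - yModel|| / ||yMeas - mean(yMeas)||)
refNorm = norm(yMeas - mean(yMeas));
fitPct  = 100*(1 - norm(yMeas - yModel)/refNorm);

% Limiting cases described in the text
fitPerfect = 100*(1 - norm(yMeas - yMeas)/refNorm);        % model = data
fitMean    = 100*(1 - norm(yMeas - mean(yMeas))/refNorm);  % model = mean
```

With this definition, a model that is worse than the mean of the measured output produces a negative fit value.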

For more information, see topics on the Compare Output with Measured Data page.

Analyze Residuals

The System Identification Toolbox software lets you perform residual analysis to assess the model quality. Residuals represent the portion of the output data not explained by the estimated model. A good model has residuals uncorrelated with past inputs.
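One way to check this property is to compute the sample cross-correlation between the residuals and past inputs. The following base MATLAB sketch uses randomly generated signals as stand-ins for real residuals; the signal definitions, the lag range, and the approximate 99% bound are assumptions for illustration.

```matlab
N = 2000;
u = randn(N,1);                      % illustrative input
e = randn(N,1);                      % white noise, unrelated to u

eGood = e;                           % residuals of a hypothetical adequate model
eBad  = e + 0.5*[0; u(1:N-1)];       % residuals that still contain u(t-1)

maxLag = 20;
rGood  = zeros(maxLag+1,1);
rBad   = zeros(maxLag+1,1);
uc = u - mean(u);
for tau = 0:maxLag
    uPast = uc(1:N-tau);                       % u(t - tau)
    g = eGood(tau+1:N) - mean(eGood);          % residual at time t
    b = eBad(tau+1:N)  - mean(eBad);
    rGood(tau+1) = sum(g.*uPast)/(N*std(eGood,1)*std(u,1));
    rBad(tau+1)  = sum(b.*uPast)/(N*std(eBad,1)*std(u,1));
end

bound = 2.58/sqrt(N);                % approximate 99% confidence level
```

The values in rGood stay close to zero at all lags, while rBad shows a clear peak at lag 1, which indicates input dynamics that the model has not captured.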

For more information, see the topics on the Residual Analysis page.

Analyze Model Uncertainty

When you estimate the model parameters from data, you obtain their nominal values that are accurate within a confidence region. The size of this region is determined by the values of the parameter uncertainties computed during estimation. The magnitude of the uncertainties provides a measure of the reliability of the model. Large uncertainties in parameters can result from unnecessarily high model orders, inadequate excitation levels in the input data, and a poor signal-to-noise ratio in measured data.

You can compute and visualize the effect of parameter uncertainties on the model response in the time and frequency domains using pole-zero maps, Bode response plots, and step response plots. For example, in the following Bode plot of an estimated model, the shaded regions represent the uncertainty in amplitude and phase of the frequency response of the model, computed using the uncertainty in the parameters. The plot shows that the uncertainty is low only in the 5 to 50 rad/s frequency range, which indicates that the model is reliable only in this frequency range.

Figure: Amplitude (top) and phase (bottom) plots, with plot values shown in a dark purple trace and the uncertainty region shown as a light purple region surrounding the trace.

For more information, see Compute Model Uncertainty.

Resources

The System Identification Toolbox documentation provides you with the necessary information to use this product. Additional resources are available to help you learn more about specific aspects of system identification theory and applications.

The following book describes methods for system identification and physical modeling:

Ljung, Lennart, and Torkel Glad. Modeling of Dynamic Systems. Prentice Hall Information and System Sciences Series. Englewood Cliffs, NJ: PTR Prentice Hall, 1994.

These books provide detailed information about system identification theory and algorithms:

Ljung, Lennart. System Identification: Theory for the User. Second edition. Prentice Hall Information and System Sciences Series. Upper Saddle River, NJ: PTR Prentice Hall, 1999.
Söderström, Torsten, and Petre Stoica. System Identification. Prentice Hall International Series in Systems and Control Engineering. New York: Prentice Hall, 1989.

For information about working with frequency-domain data, see the following book:

Pintelon, Rik, and Johan Schoukens. System Identification: A Frequency Domain Approach. Hoboken, NJ: John Wiley & Sons, 2001. https://doi.org/10.1002/0471723134.

For information on nonlinear identification, see the following references:

Sjöberg, Jonas, Qinghua Zhang, Lennart Ljung, Albert Benveniste, Bernard Delyon, Pierre-Yves Glorennec, Håkan Hjalmarsson, and Anatoli Juditsky. “Nonlinear Black-Box Modeling in System Identification: A Unified Overview.” Automatica 31, no. 12 (December 1995): 1691–1724. https://doi.org/10.1016/0005-1098(95)00120-8.
Juditsky, Anatoli, Håkan Hjalmarsson, Albert Benveniste, Bernard Delyon, Lennart Ljung, Jonas Sjöberg, and Qinghua Zhang. “Nonlinear Black-Box Models in System Identification: Mathematical Foundations.” Automatica 31, no. 12 (December 1995): 1725–50. https://doi.org/10.1016/0005-1098(95)00119-1.
Zhang, Qinghua, and Albert Benveniste. “Wavelet Networks.” IEEE Transactions on Neural Networks 3, no. 6 (November 1992): 889–98. https://doi.org/10.1109/72.165591.
Zhang, Qinghua. “Using Wavelet Network in Nonparametric Estimation.” IEEE Transactions on Neural Networks 8, no. 2 (March 1997): 227–36. https://doi.org/10.1109/72.557660.

For more information about systems and signals, see the following book:

Oppenheim, Alan V., and Alan S. Willsky. Signals and Systems. Upper Saddle River, NJ: PTR Prentice Hall, 1985.

The following textbook describes numerical techniques for parameter estimation using criterion minimization:

Dennis, J. E., Jr., and Robert B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Upper Saddle River, NJ: PTR Prentice Hall, 1983.