## System Identification Overview

### What Is System Identification?

System identification is a methodology for building mathematical models of dynamic systems using measurements of the system’s input and output signals.

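The process of system identification requires that you measure input and output data from your system, select a model structure, estimate the adjustable parameters of that structure from the data, and evaluate whether the estimated model is adequate for your application. The sections that follow describe each of these steps.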

### About Dynamic Systems and Models

#### What Is a Dynamic Model?

In a dynamic system, the values of the output signals depend on both the instantaneous values of the input signals and the past behavior of the system. For example, a car seat is a dynamic system: the seat shape (settling position) depends on both the current weight of the passenger (instantaneous value) and how long the passenger has been riding in the car (past behavior).

A model is a mathematical relationship between a system’s input and output variables. Models of dynamic systems are typically described by differential or difference equations, transfer functions, state-space equations, and pole-zero-gain models.

You can represent dynamic models in both continuous-time and discrete-time form.

An often-used example of a dynamic model is the equation of motion of a spring-mass-damper system. As shown in the next figure, the mass moves in response to the force F(t) applied on the base to which the mass is attached. The input and output of this system are the force F(t) and displacement y(t) respectively.

Figure: Mass-Spring-Damper System Excited by Force F(t)

#### Continuous-Time Dynamic Model Example

You can represent the same physical system as several equivalent models. For example, you can represent the mass-spring-damper system in continuous time as a second order differential equation:

`$m\frac{{d}^{2}y}{d{t}^{2}}+c\frac{dy}{dt}+ky\left(t\right)=F\left(t\right)$`

where m is the mass, k is the spring’s stiffness constant, and c is the damping coefficient. The solution to this differential equation lets you determine the displacement of the mass, y(t), as a function of the external force F(t) at any time t for known values of the constants m, c, and k.

Consider the displacement, y(t), and velocity, $v\left(t\right)=\frac{dy\left(t\right)}{dt}$, as state variables:

`$x\left(t\right)=\left[\begin{array}{c}y\left(t\right)\\ v\left(t\right)\end{array}\right]$`

You can express the previous equation of motion as a state-space model of the system:

`$\begin{array}{l}\frac{dx}{dt}=Ax\left(t\right)+BF\left(t\right)\\ y\left(t\right)=Cx\left(t\right)\end{array}$`

The matrices A, B, and C are related to the constants m, c and k as follows:

`$\begin{array}{c}A=\left[\begin{array}{cc}0& 1\\ -\frac{k}{m}& -\frac{c}{m}\end{array}\right]\\ B=\left[\begin{array}{c}0\\ \frac{1}{m}\end{array}\right]\\ C=\left[\begin{array}{cc}1& 0\end{array}\right]\end{array}$`

You can also obtain a transfer function model of the spring-mass-damper system by taking the Laplace transform of the differential equation:

`$G\left(s\right)=\frac{Y\left(s\right)}{F\left(s\right)}=\frac{1}{\left(m{s}^{2}+cs+k\right)}$`

where s is the Laplace variable.
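As a quick numerical check that the state-space and transfer function forms describe the same system, the following sketch evaluates both at one arbitrary value of s. The values of m, c, and k are hypothetical and chosen only for illustration.

```matlab
% Hypothetical parameter values, chosen only for illustration.
m = 2;      % mass (kg)
c = 0.5;    % damping coefficient (N*s/m)
k = 8;      % spring stiffness (N/m)

% State-space matrices for the state x = [y; v].
A = [0 1; -k/m -c/m];
B = [0; 1/m];
C = [1 0];

% Evaluate both model forms at an arbitrary test point s.
s = 0.3 + 1.7i;
G_ss = C * ((s*eye(2) - A) \ B);     % C*(sI - A)^(-1)*B
G_tf = 1 / (m*s^2 + c*s + k);        % transfer function G(s)

fprintf('|G_ss - G_tf| = %.2e\n', abs(G_ss - G_tf));
```

The two values agree to within rounding error because C(sI – A)^(-1)B reduces algebraically to 1/(ms^2 + cs + k).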

#### Discrete-Time Dynamic Model Example

Suppose you can only observe the input and output variables F(t) and y(t) of the mass-spring-damper system at discrete time instants t = nTs, where Ts is a fixed time interval and n = 0, 1, 2, .... The variables are said to be sampled with sample time Ts. Then, you can represent the relationship between the sampled input-output variables as a second-order difference equation, such as:

`$y\left(t\right)+{a}_{1}y\left(t-{T}_{s}\right)+{a}_{2}y\left(t-2{T}_{s}\right)=bF\left(t-{T}_{s}\right)$`

Often, for simplicity, Ts is taken as one time unit, and the equation can be written as:

`$y\left(t\right)+{a}_{1}y\left(t-1\right)+{a}_{2}y\left(t-2\right)=bF\left(t-1\right)$`

where a1, a2, and b are the model parameters. The model parameters are related to the system constants m, c, and k, and the sample time Ts.

This difference equation shows the dynamic nature of the model. The displacement value at the time instant t depends not only on the value of force F at a previous time instant, but also on the displacement values at the previous two time instants y(t–1) and y(t–2).

You can use this equation to compute the displacement at a specific time. The displacement is represented as a weighted sum of the past input and output values:

`$y\left(t\right)=bF\left(t-1\right)-{a}_{1}y\left(t-1\right)-{a}_{2}y\left(t-2\right)$`

This equation shows an iterative way of generating values of output y(t) starting from initial conditions (y(0) and y(1)) and measurements of input F(t). This computation is called simulation.

Alternatively, you can compute the output value at a given time t using the measured values of the output at the previous two time instants and the input value at the previous time instant. This computation is called prediction. For more information on simulation and prediction using a model, see topics on the Simulation and Prediction page.
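The difference between simulation and prediction can be sketched with a few lines of base MATLAB code. The parameter values a1, a2, and b below are hypothetical, and the "measured" output is a stand-in generated from the same equation plus a small deterministic disturbance. Because MATLAB indexing starts at 1, the initial conditions are stored in the first two vector elements.

```matlab
% Hypothetical model parameters (not derived from specific m, c, k values).
a1 = -1.5;  a2 = 0.7;  b = 0.1;

N = 50;
F = ones(N,1);                       % input samples: a unit step

% Stand-in for measured data: the model response plus a small
% deterministic disturbance, so that the two computations differ.
yMeas = zeros(N,1);
for t = 3:N
    yMeas(t) = b*F(t-1) - a1*yMeas(t-1) - a2*yMeas(t-2) + 0.01*sin(t);
end

ySim  = zeros(N,1);  ySim(1:2)  = yMeas(1:2);   % initial conditions
yPred = zeros(N,1);  yPred(1:2) = yMeas(1:2);
for t = 3:N
    % Simulation: feeds back the model's own past outputs.
    ySim(t)  = b*F(t-1) - a1*ySim(t-1)  - a2*ySim(t-2);
    % One-step prediction: uses the measured past outputs.
    yPred(t) = b*F(t-1) - a1*yMeas(t-1) - a2*yMeas(t-2);
end

fprintf('Simulation error norm: %.4f\n', norm(yMeas - ySim));
fprintf('Prediction error norm: %.4f\n', norm(yMeas - yPred));
```

In this sketch, the one-step prediction error at each instant equals the disturbance added at that instant, while the simulation error also accumulates the effect of past disturbances through the model dynamics.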

You can also represent a discrete-time equation of motion in state-space and transfer-function forms by performing the transformations similar to those described in Continuous-Time Dynamic Model Example.

### System Identification Requires Measured Data

#### Why Does System Identification Require Data?

System identification uses the input and output signals you measure from a system to estimate the values of adjustable parameters in a given model structure.

Obtaining a good model of your system depends on how well your measured data reflects the behavior of the system. See Data Quality Requirements.

Using this toolbox, you build models using time-domain input-output signals, frequency response data, time series signals, and time-series spectra.

#### Time Domain Data

Time-domain data consists of the input and output variables of the system that you record at a uniform sampling interval over a period of time.

For example, if you measure the input force, F(t), and mass displacement, y(t), of the spring-mass-damper system at a uniform sampling frequency of 10 Hz, you obtain the following vectors of measured values:

`$\begin{array}{l}{u}_{meas}=\left[F\left({T}_{s}\right),F\left(2{T}_{s}\right),F\left(3{T}_{s}\right),...,F\left(N{T}_{s}\right)\right]\\ {y}_{meas}=\left[y\left({T}_{s}\right),y\left(2{T}_{s}\right),y\left(3{T}_{s}\right),...,y\left(N{T}_{s}\right)\right]\end{array}$`

where Ts = 0.1 seconds and NTs is the time of the last measurement.

If you want to build a discrete-time model from this data, the data vectors umeas and ymeas and the sample time Ts provide sufficient information for creating such a model.

If you want to build a continuous-time model, you should also know the intersample behavior of the input signals during the experiment. For example, the input may be piecewise constant (zero-order hold) or piecewise linear (first-order hold) between samples.
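As a minimal sketch, you can package sampled data like this in an `iddata` object, which stores the output, the input, the sample time, and the intersample behavior together. The signal values below are placeholders, not measurements.

```matlab
Ts = 0.1;                               % sample time in seconds (10 Hz)
t  = (1:100)' * Ts;                     % measurement instants Ts, 2Ts, ..., NTs
uMeas = double(sin(2*pi*0.3*t) >= 0);   % placeholder input samples (square wave)
yMeas = zeros(size(uMeas));             % placeholder output samples

data = iddata(yMeas, uMeas, Ts);        % output first, then input, then sample time
data.InterSample = 'zoh';               % input held constant between samples
```

Setting `InterSample` to `'foh'` instead would describe an input that is piecewise linear between samples.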

#### Frequency Domain Data

Frequency domain data represents measurements of the system input and output variables that you record or store in the frequency domain. The frequency domain signals are Fourier transforms of the corresponding time domain signals.

Frequency domain data can also represent the frequency response of the system, given as a set of complex response values over a frequency range. The frequency response describes the outputs to sinusoidal inputs. If the input is a sine wave with frequency ω, then the output is also a sine wave of the same frequency, with an amplitude that is A(ω) times the input signal amplitude and a phase shift of Φ(ω) with respect to the input signal. The frequency response is $A\left(\omega \right){e}^{i\Phi \left(\omega \right)}$.

In the case of the mass-spring-damper system, you can obtain the frequency response data by using a sinusoidal input force and measuring the corresponding amplitude gain and phase shift of the response, over a range of input frequencies.
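For the mass-spring-damper system, the amplitude gain and phase shift follow from evaluating G(s) at s = iω. The sketch below uses hypothetical values of m, c, and k and a frequency grid chosen for illustration.

```matlab
m = 2;  c = 0.5;  k = 8;            % hypothetical parameter values
w = logspace(-1, 2, 200);           % frequency grid in rad/s

G   = 1 ./ (k - m*w.^2 + 1i*c*w);   % G(i*w) from G(s) = 1/(m*s^2 + c*s + k)
amp = abs(G);                       % amplitude gain A(w)
phi = angle(G);                     % phase shift Phi(w) in radians

[peakGain, idx] = max(amp);
fprintf('Peak gain %.3f at %.2f rad/s\n', peakGain, w(idx));
```

With these values, the gain peaks near the undamped natural frequency sqrt(k/m) = 2 rad/s, and the phase shift is negative at every frequency, meaning the displacement lags the force.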

You can use frequency-domain data to build both discrete-time and continuous-time models of your system.

#### Data Quality Requirements

System identification requires that your data capture the important dynamics of your system. Good experimental design ensures that you measure the right variables with sufficient accuracy and duration to capture the dynamics you want to model. In general, your experiment must:

• Use inputs that excite the system dynamics adequately. For example, a single step is seldom enough excitation.

• Measure data long enough to capture the important time constants.

• Set up the data acquisition system to have a good signal-to-noise ratio.

• Measure data at appropriate sampling intervals or frequency resolution.

You can analyze the data quality before building the model using techniques available in the Signal Processing Toolbox software. For example, analyze the input spectra to determine if the input signals have sufficient power over the bandwidth of the system.
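One rough way to check input excitation, sketched here with the base MATLAB `fft` function rather than Signal Processing Toolbox functions, is to compute the fraction of input power that falls inside the frequency band you care about. The band limits and both test signals are assumptions made for illustration.

```matlab
Ts = 0.1;  N = 512;
t  = (0:N-1)' * Ts;
uStep  = [zeros(N/2,1); ones(N/2,1)];        % a single step in mid-record
uMulti = sin(2*pi*0.2*t) + sin(2*pi*1.0*t) + sin(2*pi*2.5*t);   % multisine

f = (0:N/2)' / (N*Ts);                       % one-sided frequency axis in Hz
band = f >= 0.1 & f <= 3;                    % assumed bandwidth of interest

P = abs(fft(uStep)).^2;   P = P(1:N/2+1);    % one-sided power of the step
fracStep = sum(P(band)) / sum(P);
P = abs(fft(uMulti)).^2;  P = P(1:N/2+1);    % one-sided power of the multisine
fracMulti = sum(P(band)) / sum(P);

fprintf('In-band power fraction: step %.3f, multisine %.3f\n', fracStep, fracMulti);
```

The step concentrates its power at and near zero frequency, so only a small fraction lands in the band, while the multisine places nearly all of its power there.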

You can also analyze your data to determine peak frequencies, input delays, important time constants, and indications of nonlinearities using nonparametric analysis tools in this toolbox. You can use this information to configure model structures for building models from data.

### Building Models from Data

#### System Identification Requires a Model Structure

A model structure is a mathematical relationship between input and output variables that contains unknown parameters. Examples of model structures are transfer functions with adjustable poles and zeros, state space equations with unknown system matrices, and nonlinear parameterized functions.

The following difference equation represents a simple model structure:

`$y\left(k\right)+ay\left(k-1\right)=bu\left(k\right)$`

where a and b are adjustable parameters.

The system identification process requires that you choose a model structure and apply the estimation methods to determine the numerical values of the model parameters.
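For the simple model structure above, estimation can be sketched as an ordinary least-squares problem in base MATLAB. This is an illustration under stated assumptions, not the algorithm the toolbox uses: the "true" values of a and b are hypothetical, the data is noise free, and the loop index is named `kk` to avoid confusion with the stiffness constant k.

```matlab
aTrue = -0.8;  bTrue = 0.5;          % hypothetical "true" parameter values
N = 200;
u = sign(sin(0.37*(1:N)'.^1.3));     % deterministic input that switches irregularly
y = zeros(N,1);
y(1) = bTrue*u(1);                   % assumes the output is zero before the record starts
for kk = 2:N
    y(kk) = -aTrue*y(kk-1) + bTrue*u(kk);
end

% Rewrite y(k) = -a*y(k-1) + b*u(k) as a linear regression and solve
% for [a; b] in the least-squares sense.
Phi   = [-y(1:N-1), u(2:N)];
theta = Phi \ y(2:N);
fprintf('Estimated a = %.4f, b = %.4f\n', theta(1), theta(2));
```

Because the data contains no noise, the estimates match the values used to generate it. With measured data, the estimates depend on the noise level and on how well the input excites the system.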

You can use one of the following approaches to choose the model structure:

• You want a model that is able to reproduce your measured data and is as simple as possible. You can try various mathematical structures available in the toolbox. This modeling approach is called black-box modeling.

• You want a specific structure for your model, which you may have derived from first principles, but do not know numerical values of its parameters. You can then represent the model structure as a set of equations or state-space system in MATLAB® and estimate the values of its parameters from data. This approach is known as grey-box modeling.

#### How the Toolbox Computes Model Parameters

The System Identification Toolbox™ software estimates model parameters by minimizing the error between the model output and the measured response. The output ymodel of the linear model is given by:

ymodel(t) = Gu(t)

where G is the transfer function.

To determine G, the toolbox minimizes the difference between the model output ymodel(t) and the measured output ymeas(t). The minimization criterion is a weighted norm of the error, v(t), where:

v(t) = ymeas(t) – ymodel(t).

ymodel(t) is one of the following:

• Simulated response (Gu(t)) of the model for a given input u(t)

• Predicted response of the model for a given input u(t) and past measurements of output (ymeas(t-1), ymeas(t-2),...)

Accordingly, the error v(t) is called simulation error or prediction error. The estimation algorithms adjust parameters in the model structure G such that the norm of this error is as small as possible.

#### Configuring the Parameter Estimation Algorithm

You can configure the estimation algorithm by:

• Configuring the minimization criterion to focus the estimation in a desired frequency range, for example, to put more emphasis on lower frequencies and deemphasize higher-frequency noise contributions. You can also configure the criterion to target the intended application of the model, such as simulation or prediction.

• Specifying optimization options for iterative estimation algorithms.

The majority of estimation algorithms in this toolbox are iterative. You can configure an iterative estimation algorithm by specifying options, such as the optimization method and the maximum number of iterations.

For more information about configuring the estimation algorithm, see Options to Configure the Loss Function and the topics for estimating specific model structures.

### Black-Box Modeling

#### Selecting Black-Box Model Structure and Order

Black-box modeling is useful when your primary interest is in fitting the data regardless of a particular mathematical structure of the model. The toolbox provides several linear and nonlinear black-box model structures, which have traditionally been useful for representing dynamic systems. These model structures vary in complexity depending on the flexibility you need to account for the dynamics and noise in your system. You can choose one of these structures and compute its parameters to fit the measured response data.

Black-box modeling is usually a trial-and-error process, where you estimate the parameters of various structures and compare the results. Typically, you start with the simple linear model structure and progress to more complex structures. You might also choose a model structure because you are more familiar with this structure or because you have specific application needs.

The simplest linear black-box structures, such as transfer functions, linear ARX models, and state-space models, require the fewest options to configure.

Estimation of some of these structures also uses noniterative estimation algorithms, which further reduces complexity.

You can configure a model structure using the model order. The definition of model order varies depending on the type of model you select. For example, if you choose a transfer function representation, the model order is related to the number of poles and zeros. For state-space representation, the model order corresponds to the number of states. In some cases, such as for linear ARX and state-space model structures, you can estimate the model order from the data.

If the simple model structures do not produce good models, you can select more complex model structures by:

• Specifying a higher model order for the same linear model structure. Higher model order increases the model flexibility for capturing complex phenomena. However, unnecessarily high orders can make the model less reliable.

• Explicitly modeling the noise:

y(t)=Gu(t)+He(t)

where H models the additive disturbance by treating the disturbance as the output of a linear system driven by a white noise source e(t).

Using a model structure that explicitly models the additive disturbance can help to improve the accuracy of the measured component G. Furthermore, such a model structure is useful when your main interest is using the model for predicting future response values.

• Using a different linear model structure.

• Using a nonlinear model structure.

Nonlinear models have more flexibility in capturing complex phenomena than linear models of similar orders. See Nonlinear Model Structures.

Ultimately, you choose the simplest model structure that provides the best fit to your measured data. For more information, see Estimating Linear Models Using Quick Start.

Regardless of the structure you choose for estimation, you can simplify the model for your application needs. For example, you can separate out the measured dynamics (G) from the noise dynamics (H) to obtain a simpler model that represents just the relationship between y and u. You can also linearize a nonlinear model about an operating point.

#### When to Use Nonlinear Model Structures?

A linear model is often sufficient to accurately describe the system dynamics and, in most cases, you should first try to fit linear models. If the linear model output does not adequately reproduce the measured output, you might need to use a nonlinear model.

You can assess the need to use a nonlinear model structure by plotting the response of the system to an input. If you notice that the responses differ depending on the input level or input sign, try using a nonlinear model. For example, if the output response to an input step up is faster than the response to a step down, you might need a nonlinear model.

Before building a nonlinear model of a system that you know is nonlinear, try transforming the input and output variables such that the relationship between the transformed variables is linear. For example, consider a system that has current and voltage as inputs to an immersion heater, and the temperature of the heated liquid as an output. The output depends on the inputs via the power of the heater, which is equal to the product of current and voltage. Instead of building a nonlinear model for this two-input and one-output system, you can create a new input variable by taking the product of current and voltage and then build a linear model that describes the relationship between power and temperature.
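The immersion heater idea can be sketched as follows. All signals and the first-order relationship between power and temperature are assumptions made for illustration. The point is only that the product of the two raw inputs becomes a single input to a linear model.

```matlab
N = 300;  n = (1:N)';
current = 2 + sin(0.05*n);           % hypothetical current samples (A)
voltage = 10 + 3*cos(0.11*n);        % hypothetical voltage samples (V)
power   = current .* voltage;        % new input variable (W)

% Generate a stand-in temperature record from an assumed first-order
% linear relationship between power and temperature.
aTrue = -0.95;  bTrue = 0.02;
temp = zeros(N,1);
for t = 2:N
    temp(t) = -aTrue*temp(t-1) + bTrue*power(t-1);
end

% Fit the linear model temp(t) + a*temp(t-1) = b*power(t-1).
theta = [-temp(1:N-1), power(1:N-1)] \ temp(2:N);
fprintf('Estimated a = %.4f, b = %.4f\n', theta(1), theta(2));
```

A linear model with current and voltage as separate inputs could not represent this data exactly, because the temperature depends on their product.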

If you cannot determine variable transformations that yield a linear relationship between input and output variables, you can use nonlinear structures such as Nonlinear ARX or Hammerstein-Wiener models. For a list of supported nonlinear model structures and when to use them, see Nonlinear Model Structures.

#### Black-Box Estimation Example

You can use the System Identification app or commands to estimate linear and nonlinear models of various structures. In most cases, you choose a model structure and estimate the model parameters using a single command.

Consider the mass-spring-damper system, described in About Dynamic Systems and Models. If you do not know the equation of motion of this system, you can use a black-box modeling approach to build a model. For example, you can estimate transfer functions or state-space models by specifying the orders of these model structures.

A transfer function is a ratio of polynomials:

`$G\left(s\right)=\frac{{b}_{0}+{b}_{1}s+{b}_{2}{s}^{2}+...}{1+{f}_{1}s+{f}_{2}{s}^{2}+...}$`

For the mass-spring-damper system, this transfer function is:

`$G\left(s\right)=\frac{1}{m{s}^{2}+cs+k}$`

which is a system with no zeros and 2 poles.

In discrete-time, the transfer function of the mass-spring-damper system can be:

`$G\left({z}^{-1}\right)=\frac{b{z}^{-1}}{1+{f}_{1}{z}^{-1}+{f}_{2}{z}^{-2}}$`

where the model orders correspond to the number of coefficients of the numerator and the denominator (`nb` = 1 and `nf` = 2) and the input-output delay equals the lowest order exponent of z–1 in the numerator (`nk` = 1).

In continuous-time, you can build a linear transfer function model using the `tfest` command:

`m = tfest(data,2,0)`

where `data` is your measured input-output data, represented as an `iddata` object, and the model order is specified by the number of poles (2) and the number of zeros (0).

Similarly, you can build a discrete-time model with the Output-Error structure using the following command:

`m = oe(data,[1 2 1])`

The model order is `[nb nf nk]` = `[1 2 1]`. Usually, you do not know the model orders in advance. You should try several model order values until you find the orders that produce an acceptable model.

Alternatively, you can choose a state-space structure to represent the mass-spring-damper system and estimate the model parameters using the `ssest` or the `n4sid` command:

`m = ssest(data,2)`

where the second argument, `2`, is the model order, which is the number of states in the model.

In black-box modeling, you do not need the system’s equation of motion—only a guess of the model orders.
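The following sketch puts the three commands together in one script. Since no measured data is available here, the script generates example data from a hypothetical mass-spring-damper system (m = 2, c = 0.5, k = 8) using an exact zero-order-hold discretization and adds a small amount of random noise. The input signal, noise level, and record length are all assumptions.

```matlab
% Hypothetical true system, used only to generate example data.
m = 2;  c = 0.5;  k = 8;
A = [0 1; -k/m -c/m];  B = [0; 1/m];  C = [1 0];

Ts = 0.1;  N = 500;
Ad = expm(A*Ts);                     % exact discretization for a
Bd = A \ ((Ad - eye(2)) * B);        % piecewise-constant (zero-order hold) input

u = sign(sin(0.05*(1:N)'.^1.5));     % input force that switches irregularly
x = [0; 0];  y = zeros(N,1);
for n = 1:N
    y(n) = C*x;                      % displacement at sample n
    x = Ad*x + Bd*u(n);              % state at the next sample
end
rng(0);                              % repeatable noise
y = y + 1e-3*randn(N,1);             % small measurement noise

data = iddata(y, u, Ts);             % package the data for estimation

sysTF = tfest(data, 2, 0);           % continuous-time: 2 poles, no zeros
sysOE = oe(data, [1 2 1]);           % discrete-time Output-Error, [nb nf nk]
sysSS = ssest(data, 2);              % state-space model with 2 states

compare(data, sysTF, sysOE, sysSS);  % plot model outputs against measured output
```

The estimation commands see only `data` and the model orders. The matrices A, B, and C are used solely to create the example data.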

For more information about building models, see Steps for Using the System Identification App and Model Estimation Commands.

### Grey-Box Modeling

In some situations, you can deduce the model structure from physical principles. For example, the mathematical relationship between the input force and the resulting mass displacement in the mass-spring-damper system is well known. In state-space form, the model is given by:

`$\begin{array}{l}\frac{dx}{dt}=Ax\left(t\right)+BF\left(t\right)\\ y\left(t\right)=Cx\left(t\right)\end{array}$`

where x(t) = [y(t); v(t)] is the state vector. The coefficients A, B, and C are functions of the model parameters:

A = [0 1; –k/m –c/m]

B = [0; 1/m]

C = [1 0]

Here, you fully know the model structure but do not know the values of its parameters—m, c and k.

In the grey-box approach, you use the data to estimate the values of the unknown parameters of your model structure. You specify the model structure by a set of differential or difference equations in MATLAB and provide some initial guess for the unknown parameters specified.

In general, you build grey-box models by:

1. Creating a template model structure.

2. Configuring the model parameters with initial values and constraints (if any).

3. Applying an estimation method to the model structure and computing the model parameter values.

You can specify a grey-box model structure in the following ways:

• Represent the state-space model structure as a structured `idss` model object and estimate the state-space matrices A, B, and C. You can compute the parameter values, such as m, c, and k, from the state-space matrices A and B. For example, m = 1/B(2) and k = –A(2,1)m.

• Represent the state-space model structure as an `idgrey` model object. You can directly estimate the values of parameters `m`, `c`, and `k`. See Grey-Box Model Estimation.
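As a sketch of the structured `idss` approach, the script below fixes the known entries of A, B, and C, frees only the entries that depend on m, c, and k, and recovers the physical parameters from the estimated matrices. The example data comes from a hypothetical system, and the initial guesses `m0`, `c0`, and `k0` are arbitrary. Whether the estimates land close to the generating values depends on the data and the initial guesses.

```matlab
% Example data from a hypothetical true system (m = 2, c = 0.5, k = 8).
mTrue = 2;  cTrue = 0.5;  kTrue = 8;
A = [0 1; -kTrue/mTrue -cTrue/mTrue];  B = [0; 1/mTrue];  C = [1 0];
Ts = 0.1;  N = 500;
Ad = expm(A*Ts);  Bd = A \ ((Ad - eye(2)) * B);   % zero-order-hold discretization
u = sign(sin(0.05*(1:N)'.^1.5));
x = [0; 0];  y = zeros(N,1);
for n = 1:N
    y(n) = C*x;
    x = Ad*x + Bd*u(n);
end
rng(0);
data = iddata(y + 1e-3*randn(N,1), u, Ts);

% Template structure built from initial guesses m0, c0, k0.
% Arguments: A, B, C, D, K, x0, Ts (Ts = 0 means continuous time).
m0 = 1;  c0 = 1;  k0 = 5;
init = idss([0 1; -k0/m0 -c0/m0], [0; 1/m0], [1 0], 0, zeros(2,1), [0; 0], 0);

% Free only the entries that depend on m, c, and k.
init.Structure.A.Free = [false false; true true];
init.Structure.B.Free = [false; true];
init.Structure.C.Free = [false false];
init.Structure.D.Free = false;
init.Structure.K.Free = [false; false];

est = ssest(data, init);             % estimate the free entries

% Recover the physical parameters from the estimated matrices.
mEst = 1/est.B(2);
kEst = -est.A(2,1)*mEst;
cEst = -est.A(2,2)*mEst;
fprintf('m = %.3f, c = %.3f, k = %.3f\n', mEst, cEst, kEst);
```

The `idgrey` alternative instead takes a function that maps m, c, and k to the state-space matrices, so the estimation works on the physical parameters directly.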

### Evaluating Model Quality

#### How to Evaluate and Improve Model Quality

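After you estimate the model, you can evaluate the model quality by:

• Comparing the model response to the measured response

• Analyzing residuals

• Analyzing model uncertainty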

Ultimately, you must assess the quality of your model based on whether the model adequately addresses the needs of your application. For information about other available model analysis techniques, see Model Analysis.

If you do not get a satisfactory model, you can iteratively improve your results by trying a different model structure, changing the estimation algorithm settings, or performing additional data processing. If these changes do not improve your results, you might need to revisit your experimental design and data gathering procedures.

#### Comparing Model Response to Measured Response

Typically, you evaluate the quality of a model by comparing the model response to the measured output for the same input signal.

Suppose you use a black-box modeling approach to create dynamic models of the mass-spring-damper system. You try various model structures and orders, such as:

```
model1 = arx(data, [2 1 1]);
model2 = n4sid(data, 3);
```

You can simulate these models with a particular input and compare their responses against the measured values of the displacement for the same input applied to the real system. The following figure compares the simulated and measured responses for a step input.

The figure indicates that `model2` is better than `model1` because `model2` better fits the data (83% vs. 65%).

The % fit indicates the agreement between the model response and the measured output: 100 means a perfect fit, and 0 indicates a poor fit (that is, the model output has the same fit to the measured output as the mean of the measured output).
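The description above matches a normalized root mean squared error (NRMSE) measure. Assuming that definition, the following sketch computes the percentage fit for short hypothetical output vectors and checks the two limiting cases.

```matlab
yMeas  = [0; 0.4; 0.9; 1.3; 1.1; 1.0];       % hypothetical measured output
yModel = [0; 0.5; 0.8; 1.2; 1.2; 1.0];       % hypothetical model output

% Percentage fit, assuming the NRMSE-based definition.
fitPercent = @(y, yhat) 100 * (1 - norm(y - yhat) / norm(y - mean(y)));

fitModel   = fitPercent(yMeas, yModel);                          % fit of the model
fitPerfect = fitPercent(yMeas, yMeas);                           % exact match
fitMean    = fitPercent(yMeas, mean(yMeas)*ones(size(yMeas)));   % mean value only

fprintf('Model fit: %.1f%%\n', fitModel);
```

Here `fitPerfect` is 100 and `fitMean` is 0. A model that does worse than the mean of the measured output gives a negative value.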

For more information, see topics on the Compare Output with Measured Data page.

#### Analyzing Residuals

The System Identification Toolbox software lets you perform residual analysis to assess the model quality. Residuals represent the portion of the output data not explained by the estimated model. A good model has residuals uncorrelated with past inputs.
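The correlation test can be sketched in base MATLAB as a sample cross-correlation between the residuals and delayed copies of the input. The input and residuals below are stand-ins: the residuals are constructed to contain an input effect delayed by three samples, as if the model had missed it.

```matlab
N = 400;
u = sign(sin(0.05*(1:N)'.^1.5));         % hypothetical input record
% Stand-in residuals from a model that missed a delayed input effect.
e = [zeros(3,1); 0.2*u(1:N-3)] + 0.05*cos(1.7*(1:N)');

maxLag = 10;
r = zeros(maxLag+1, 1);
for lag = 0:maxLag
    % Sample correlation between e(t) and u(t - lag).
    ee = e(lag+1:N) - mean(e);
    uu = u(1:N-lag) - mean(u);
    r(lag+1) = sum(ee .* uu) / (N * std(e, 1) * std(u, 1));
end
[~, idx] = max(abs(r));
fprintf('Largest correlation at lag %d\n', idx - 1);
```

A pronounced peak such as this one suggests that the model does not describe how the output depends on the input at that lag. For a good model, the correlations stay small at all lags.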

#### Analyzing Model Uncertainty

When you estimate the model parameters from data, you obtain their nominal values that are accurate within a confidence region. The size of this region is determined by the values of the parameter uncertainties computed during estimation. The magnitude of the uncertainties provides a measure of the reliability of the model. Large uncertainties in parameters can result from unnecessarily high model orders, inadequate excitation levels in the input data, and poor signal-to-noise ratio in measured data.
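For a model estimated by linear least squares, parameter uncertainty can be sketched with the standard covariance formula, which scales the inverse of the regressor Gram matrix by the residual variance. This is an illustration for a simple first-order structure with hypothetical parameter values and noise level, not a description of how the toolbox computes uncertainty for every model type.

```matlab
rng(0);                                  % repeatable random numbers
aTrue = -0.8;  bTrue = 0.5;  N = 200;    % hypothetical system and record length
u = sign(randn(N,1));                    % random binary input
y = zeros(N,1);
for kk = 2:N
    y(kk) = -aTrue*y(kk-1) + bTrue*u(kk) + 0.1*randn;   % noisy response
end

Phi   = [-y(1:N-1), u(2:N)];             % regressors for y(k) = -a*y(k-1) + b*u(k)
Y     = y(2:N);
theta = Phi \ Y;                         % least-squares estimate of [a; b]

res      = Y - Phi*theta;                            % residuals
sigma2   = sum(res.^2) / (numel(Y) - numel(theta));  % residual variance estimate
covTheta = sigma2 * ((Phi'*Phi) \ eye(2));           % parameter covariance
stdTheta = sqrt(diag(covTheta));                     % one standard deviation

fprintf('a = %.3f +/- %.3f, b = %.3f +/- %.3f\n', ...
    theta(1), stdTheta(1), theta(2), stdTheta(2));
```

Shortening the record or raising the noise level in this sketch increases `stdTheta`, which mirrors the causes of large uncertainty listed above.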

You can compute and visualize the effect of parameter uncertainties on the model response in time and frequency domains using pole-zero maps, Bode response, and step response plots. For example, in the following Bode plot of an estimated model, the shaded regions represent the uncertainty in the amplitude and phase of the model's frequency response, computed using the uncertainty in the parameters. The plot shows that the uncertainty is low only in the 5 to 50 rad/s frequency range, which indicates that the model is reliable only in this frequency range.

The following book describes methods for system identification and physical modeling:

• Ljung, L., and T. Glad. Modeling of Dynamic Systems. PTR Prentice Hall, Upper Saddle River, NJ, 1994.

These books provide detailed information about system identification theory and algorithms:

• Ljung, L. System Identification: Theory for the User. Second edition. PTR Prentice Hall, Upper Saddle River, NJ, 1999.

• Söderström, T., and P. Stoica. System Identification. Prentice Hall International, London, 1989.

For information about working with frequency-domain data, see the following book:

• Pintelon, R., and J. Schoukens. System Identification: A Frequency Domain Approach. Wiley-IEEE Press, New York, 2001.

For information on nonlinear identification, see the following references:

• Sjöberg, J., Q. Zhang, L. Ljung, A. Benveniste, B. Delyon, P. Glorennec, H. Hjalmarsson, and A. Juditsky, “Nonlinear Black-Box Modeling in System Identification: A Unified Overview.” Automatica. Vol. 31, Issue 12, 1995, pp. 1691–1724.

• Juditsky, A., H. Hjalmarsson, A. Benveniste, B. Delyon, L. Ljung, J. Sjöberg, and Q. Zhang, “Nonlinear Black-Box Models in System Identification: Mathematical Foundations.” Automatica. Vol. 31, Issue 12, 1995, pp. 1725–1750.

• Zhang, Q., and A. Benveniste, “Wavelet networks.” IEEE Transactions on Neural Networks. Vol. 3, Issue 6, 1992, pp. 889–898.

• Zhang, Q., “Using Wavelet Network in Nonparametric Estimation.” IEEE Transactions on Neural Networks. Vol. 8, Issue 2, 1997, pp. 227–236. 