# mfcc

Extract MFCC, log energy, delta, and delta-delta of audio signal

## Syntax

## Description

`coeffs = mfcc(___,Name=Value)` specifies options using one or more name-value arguments.

**Example:** `coeffs = mfcc(audioIn,fs,LogEnergy="replace")` returns mel frequency cepstral coefficients for the audio input signal sampled at `fs` Hz. The first coefficient in the `coeffs` vector is replaced with the log energy value.

`[coeffs,delta,deltaDelta,loc] = mfcc(___)` also returns the delta, delta-delta, and location of samples corresponding to each window of data. You can specify an input combination from any of the previous syntaxes.
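The following is a short usage sketch of the syntaxes described so far. It assumes Audio Toolbox is installed and a release that accepts the `Name=Value` form. The sample rate and the white-noise input are placeholder choices made for illustration only.

```matlab
fs = 16000;                      % sample rate in Hz (placeholder value)
audioIn = randn(fs,1);           % one second of white noise as a stand-in signal

% Replace the first coefficient with the log energy
coeffs = mfcc(audioIn,fs,LogEnergy="replace");

% Request every output using the default options
[coeffsAll,delta,deltaDelta,loc] = mfcc(audioIn,fs);

% Each row of the outputs corresponds to one analysis window
disp(size(coeffsAll))
disp(numel(loc))
```

The variable name `coeffsAll` is only there to keep the two calls apart; it has no special meaning to the function.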

`mfcc(___)` with no output arguments plots the mel frequency cepstral coefficients. Before plotting, the coefficients are normalized to have mean 0 and standard deviation 1.

If the input is in the time domain, the coefficients are plotted against time.

If the input is in the frequency domain, the coefficients are plotted against frame number.

If the log energy is extracted, it is also plotted.
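The normalization applied before plotting can be illustrated with plain array operations. This is a minimal sketch, not the function's plotting code: it assumes that each coefficient (column) is normalized across frames (rows), which the text does not state, and the small matrix is made-up data.

```matlab
% Hypothetical frames-by-coefficients matrix (made-up values)
coeffs = [1 2; 3 4; 5 9];

% Assumption: normalize each coefficient (column) across frames (rows)
mu    = mean(coeffs,1);          % per-column mean
sigma = std(coeffs,0,1);         % per-column standard deviation

z = (coeffs - mu)./sigma;        % zero mean, unit standard deviation per column
```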

## Examples

## Input Arguments

## Output Arguments

## Algorithms

Mel frequency cepstral coefficients are popular features extracted from speech signals for use in recognition tasks. In the source-filter model of speech, cepstral coefficients are understood to represent the filter (vocal tract). The vocal tract frequency response is relatively smooth, whereas the source of voiced speech can be modeled as an impulse train. As a result, the vocal tract can be estimated by the spectral envelope of a speech segment.

The motivating idea of mel frequency cepstral coefficients is to compress information about the vocal tract (smoothed spectrum) into a small number of coefficients based on an understanding of the cochlea. Although there is no hard standard for calculating the coefficients, the basic steps are outlined by the diagram.
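Because there is no hard standard, the sketch below shows one common textbook pipeline (window, DFT, mel filter bank, log, DCT) in base MATLAB. It is not the implementation of `mfcc`. Every parameter value is a hypothetical choice, and the mel mapping `2595*log10(1+f/700)` is a widely used formula assumed here for simplicity rather than the function's default filter bank.

```matlab
fs = 16000;
t  = (0:fs-1)'/fs;
x  = sin(2*pi*440*t);                      % synthetic test tone (placeholder input)

% Hypothetical analysis parameters
winLen = 400; hop = 160; nfft = 512;
numBands = 20; numCoeffs = 13;

% 1) Frame the signal and apply a Hamming-style window
n = (0:winLen-1)';
w = 0.54 - 0.46*cos(2*pi*n/(winLen-1));
numFrames = floor((numel(x)-winLen)/hop) + 1;
idx = (1:winLen)' + (0:numFrames-1)*hop;   % sample indices, one column per frame
frames = x(idx).*w;

% 2) One-sided power spectrum of each frame
X = fft(frames,nfft);
P = abs(X(1:nfft/2+1,:)).^2;

% 3) Triangular filters equally spaced on an assumed mel scale
hz2mel = @(f) 2595*log10(1 + f/700);
mel2hz = @(m) 700*(10.^(m/2595) - 1);
edges = mel2hz(linspace(hz2mel(0),hz2mel(fs/2),numBands+2));
f = (0:nfft/2)'*fs/nfft;                   % frequency of each DFT bin
H = zeros(numBands,numel(f));
for b = 1:numBands
    lo = edges(b); c = edges(b+1); hi = edges(b+2);
    up   = (f - lo)/(c - lo);              % rising edge of the triangle
    down = (hi - f)/(hi - c);              % falling edge of the triangle
    H(b,:) = max(0,min(up,down))';
end

% 4) Log of the band energies (eps guards against log(0))
E = log(H*P + eps);

% 5) DCT-II across bands, keeping the first numCoeffs coefficients
k = (0:numCoeffs-1)'; m = 0:numBands-1;
D = cos(pi*k.*(m + 0.5)/numBands);
coeffs = (D*E)';                           % numFrames-by-numCoeffs
```

The DCT step is what compresses the smoothed log spectrum into a small number of coefficients: only the first `numCoeffs` rows of the transform are kept.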

The default mel filter bank linearly spaces the first 10 triangular filters and logarithmically spaces the remaining filters.
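To illustrate the difference between the two spacing rules, the sketch below builds a set of filter center frequencies in which the first 10 are linearly spaced and the rest are logarithmically spaced. Only the count of 10 comes from the text. The starting frequency, step, ratio, and number of remaining filters are hypothetical values, not the function's defaults.

```matlab
numLinear = 10;        % from the text: the first 10 filters are linearly spaced
numLog    = 20;        % hypothetical number of remaining filters
fStart    = 133.33;    % hypothetical lowest center frequency in Hz
linStep   = 66.67;     % hypothetical linear spacing in Hz
logRatio  = 1.0712;    % hypothetical ratio between adjacent log-spaced centers

% Linear region: constant difference between neighboring centers
linCenters = fStart + (0:numLinear-1)*linStep;

% Logarithmic region: constant ratio between neighboring centers
logCenters = linCenters(end)*logRatio.^(1:numLog);

centers = [linCenters logCenters];
```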

The information contained in the zeroth mel frequency cepstral coefficient is often augmented with or replaced by the log energy. The log energy calculation depends on the input domain.

If the input (*audioIn*) is a time-domain signal, the log energy is
computed using the following equation:

$$\mathrm{log}E=\mathrm{log}(\text{sum}({x}^{2}))$$

If the input (*audioIn*) is a frequency-domain signal, the log energy
is computed using the following equation:

$$\mathrm{log}E=\mathrm{log}\left(\text{sum}\left({\left|x\right|}^{2}\right)/FFTLength\right)$$
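The two equations agree when the frequency-domain input is the full two-sided DFT of the time-domain frame, which follows from Parseval's relation. The sketch below checks this for a single frame. The frame values are made up, and treating `x` in the second equation as the two-sided DFT of length `FFTLength` is an assumption about the input format.

```matlab
% Hypothetical time-domain frame (made-up values)
x = [0.5; -0.25; 0.75; 0.1; -0.6; 0.3; 0.0; -0.45];
FFTLength = numel(x);

% Time-domain log energy
logE_time = log(sum(x.^2));

% Frequency-domain log energy from the two-sided DFT (assumed input format)
X = fft(x,FFTLength);
logE_freq = log(sum(abs(X).^2)/FFTLength);

% The difference is zero up to floating-point rounding
disp(abs(logE_time - logE_freq))
```

Dividing by `FFTLength` compensates for the scaling of the unnormalized DFT, so both domains report the same energy for the same frame.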

## References

[1] Rabiner, Lawrence R., and
Ronald W. Schafer. *Theory and Applications of Digital Speech
Processing*. Upper Saddle River, NJ: Pearson, 2010.

[2] Auditory Toolbox. https://engineering.purdue.edu/~malcolm/interval/1998-010/AuditoryToolboxTechReport.pdf

## Extended Capabilities

## Version History

**Introduced in R2018a**