identify
Syntax
Description
Examples
This example uses a 1.36 GB subset of the Common Voice data set from Mozilla [1]. The data set contains 48 kHz recordings of subjects speaking short sentences.
Download the data set if it doesn't already exist and unzip it into tempdir.
downloadFolder = matlab.internal.examples.downloadSupportFile("audio","commonvoice.zip");
dataFolder = tempdir;
unzip(downloadFolder,dataFolder);
Ingest the train set using audioDatastore and, to speed up this example, keep only 20% of each speaker's files.
trainTable = readtable(fullfile(dataFolder,"commonvoice","train","train.tsv"),FileType="text",Delimiter="tab");
adsTrain = audioDatastore(append(fullfile(dataFolder,"commonvoice","train","clips",filesep),trainTable.path,".wav"));
idx = splitlabels(trainTable.client_id,0.2);
adsTrain = subset(adsTrain,idx{1});
trainLabels = trainTable.client_id(idx{1});
Ingest the validation set using audioDatastore.
valTable = readtable(fullfile(dataFolder,"commonvoice","validation","validation.tsv"),FileType="text",Delimiter="tab");
valLabels = valTable.client_id;
adsVal = audioDatastore(append(fullfile(dataFolder,"commonvoice","validation","clips",filesep),valTable.path,".wav"));
Split the validation data set into enroll and test sets. Use two utterances per speaker for enrollment and the remaining utterances for the test set. Exclude any speakers with fewer than five utterances. Generally, the more utterances you use for enrollment, the better the performance of the system. However, most practical applications are limited to a small set of enrollment utterances.
labelCounts = countlabels(valLabels);
labelsToExclude = labelCounts.Label(labelCounts.Count<5);
idxs = splitlabels(valLabels,2,Exclude=labelsToExclude);
adsEnroll = subset(adsVal,idxs{1});
enrollLabels = valLabels(idxs{1});
adsTest = subset(adsVal,idxs{2});
testLabels = valLabels(idxs{2});
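The enroll/test split can be tried on a small, made-up label vector before running it on the full validation set. The following is a minimal sketch that assumes Signal Processing Toolbox is available; the speaker names and utterance counts are hypothetical and are not taken from the Common Voice data.

```matlab
% Hypothetical speaker labels: two speakers with six utterances each
% and one speaker with only three (illustrative values only)
labels = categorical([repmat("spk1",6,1);repmat("spk2",6,1);repmat("spk3",3,1)]);

% Find speakers with fewer than five utterances
labelCounts = countlabels(labels);
labelsToExclude = labelCounts.Label(labelCounts.Count<5);

% First set: two utterances per remaining speaker. Second set: the rest.
idxs = splitlabels(labels,2,Exclude=labelsToExclude);
toyEnrollLabels = labels(idxs{1});
toyTestLabels = labels(idxs{2});

fprintf("Enroll: %d utterances, test: %d utterances\n", ...
    numel(toyEnrollLabels),numel(toyTestLabels));
```

With these toy labels, the speaker with three utterances should not appear in either set.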
Create an i-vector system that accepts feature input.
fs = 48e3;
iv = ivectorSystem(SampleRate=fs,InputType="features");
Create an audioFeatureExtractor object to extract the gammatone cepstral coefficients (GTCC), the delta GTCC, the delta-delta GTCC, and the pitch from 50 ms periodic Hann windows with 45 ms overlap.
afe = audioFeatureExtractor( ...
    SampleRate=fs, ...
    Window=hann(round(0.05*fs),"periodic"), ...
    OverlapLength=round(0.045*fs), ...
    gtcc=true,gtccDelta=true,gtccDeltaDelta=true,pitch=true);
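The window and overlap durations translate into sample counts as follows. This is a small arithmetic sketch that uses only base MATLAB; the variable names are chosen for illustration and are not part of the example above.

```matlab
% Frame parameters matching the audioFeatureExtractor settings
fs = 48e3;                          % sample rate in Hz
windowLength = round(0.05*fs);      % 50 ms window, in samples
overlapLength = round(0.045*fs);    % 45 ms overlap, in samples

% The hop is the part of each window that does not overlap the next one
hopLength = windowLength - overlapLength;
hopDuration = hopLength/fs;         % in seconds

fprintf("Window: %d samples, overlap: %d samples, hop: %d samples (%g ms)\n", ...
    windowLength,overlapLength,hopLength,1e3*hopDuration);
```

Each row of the extracted feature matrix therefore corresponds to one 5 ms hop.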
Extract features from the train and enroll datastores.
xTrain = extract(afe,adsTrain);
xEnroll = extract(afe,adsEnroll);
Train both the extractor and classifier using the training set.
trainExtractor(iv,xTrain, ...
    UBMNumComponents=64, ...
    UBMNumIterations=5, ...
    TVSRank=32, ...
    TVSNumIterations=3);
Calculating standardization factors .....done.
Training universal background model ........done.
Training total variability space ......done.
i-vector extractor training complete.
trainClassifier(iv,xTrain,trainLabels, ...
    NumEigenvectors=16, ...
    PLDANumDimensions=16, ...
    PLDANumIterations=5);
Extracting i-vectors ...done.
Training projection matrix .....done.
Training PLDA model ........done.
i-vector classifier training complete.
To calibrate the system so that scores can be interpreted as a measure of confidence in a positive decision, use calibrate.
calibrate(iv,xTrain,trainLabels)
Extracting i-vectors ...done.
Calibrating CSS scorer ...done.
Calibrating PLDA scorer ...done.
Calibration complete.
Enroll the speakers from the enrollment set.
enroll(iv,xEnroll,enrollLabels)
Extracting i-vectors ...done.
Enrolling i-vectors ...................done.
Enrollment complete.
Evaluate the file-level prediction accuracy on the test set.
numCorrect = 0;
reset(adsTest)
for index = 1:numel(adsTest.Files)
    features = extract(afe,read(adsTest));
    results = identify(iv,features);
    trueLabel = testLabels(index);
    predictedLabel = results.Label(1);
    isPredictionCorrect = trueLabel==predictedLabel;
    numCorrect = numCorrect + isPredictionCorrect;
end
display("File Accuracy: " + round(100*numCorrect/numel(adsTest.Files),2) + " (%)")
"File Accuracy: 97.92 (%)"
References
Input Arguments
i-vector system, specified as an object of type ivectorSystem.
Data to identify, specified as a column vector representing a single-channel (mono) audio signal or a matrix of audio features.
If InputType is set to "audio" when the i-vector system is created, data must be a column vector with underlying type single or double.
If InputType is set to "features" when the i-vector system is created, data must be a matrix with underlying type single or double. The matrix must consist of audio features where the number of features (columns) is locked the first time trainExtractor is called and the number of hops (rows) is variable-sized.
Data Types: single | double
Scoring algorithm used by the i-vector system, specified as "plda", which corresponds to probabilistic linear discriminant analysis (PLDA), or "css", which corresponds to cosine similarity score (CSS).
To use "plda", you must train the PLDA model using trainClassifier. If the PLDA model has been trained, then scorer defaults to "plda". Otherwise, scorer defaults to "css".
Data Types: char | string
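To give a feel for what a cosine similarity score measures, the following sketch computes the cosine of the angle between two short vectors in base MATLAB. The vectors are toy stand-ins for i-vectors, and the calculation is a generic illustration, not the internal implementation used by ivectorSystem.

```matlab
% Two toy vectors standing in for i-vectors (illustrative values only)
enrolledVector = [1; 0];
probeVector = [1; 1];

% Cosine similarity: dot product divided by the product of the norms
cssScore = dot(enrolledVector,probeVector)/(norm(enrolledVector)*norm(probeVector));

fprintf("Cosine similarity: %.4f\n",cssScore);
```

The score depends only on the angle between the vectors, so scaling either vector leaves it unchanged.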
Number of candidates to return in tableOut, specified as a positive scalar.
Note
If you request a number of candidates greater than the number of labels enrolled in the i-vector system, then all candidates are returned. If unspecified, the number of candidates defaults to the number of enrolled labels.
Data Types: single | double
Output Arguments
Candidate labels and corresponding scores, returned as a table. The number of rows of tableOut is equal to N, the number of candidates. The candidates are sorted in order of confidence.
Data Types: table
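The candidate ordering and the cap on the number of candidates can be pictured with a small base MATLAB sketch. The labels, scores, and the Score variable name below are assumptions made for illustration; the sketch mimics the documented behavior on made-up data and does not call identify.

```matlab
% Hypothetical enrolled labels and scores (illustrative values only)
labels = ["speakerA";"speakerB";"speakerC"];
scores = [0.2;0.9;0.5];
N = 5;   % more candidates requested than labels enrolled

% Sort by descending score and cap N at the number of enrolled labels
[sortedScores,order] = sort(scores,"descend");
numCandidates = min(N,numel(labels));
candidates = table(labels(order(1:numCandidates)),sortedScores(1:numCandidates), ...
    VariableNames=["Label","Score"]);
disp(candidates)
```

Because N exceeds the number of labels here, the resulting table contains all three candidates, with the highest-scoring label in the first row.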
Version History
Introduced in R2021a
Starting in R2022a, the identify function issues a warning if the scores from the i-vector system are not calibrated. Use calibrate to calibrate the scores.
See Also
trainExtractor | trainClassifier | calibrate | unenroll | enroll | detectionErrorTradeoff | verify | ivector | info | addInfoHeader | release | ivectorSystem | speakerRecognition