release
Syntax
release(ivs)
Description
Examples
Train Environmental Sound Classification System
Download and unzip the environment sound classification data set. This data set consists of recordings labeled as one of 10 different audio sound classes (ESC-10).
loc = matlab.internal.examples.downloadSupportFile("audio","ESC-10.zip");
unzip(loc,pwd)
Create an audioDatastore object to manage the data. Call countEachLabel to display the distribution of sound classes and the number of unique labels.
ads = audioDatastore(pwd,IncludeSubfolders=true,LabelSource="foldernames");
countEachLabel(ads)
ans=10×2 table
        Label         Count
    ______________    _____
    chainsaw           40
    clock_tick         40
    crackling_fire     40
    crying_baby        40
    dog                40
    helicopter         40
    rain               40
    rooster            38
    sea_waves          40
    sneezing           40
Listen to one of the files.
[audioIn,audioInfo] = read(ads);
fs = audioInfo.SampleRate;
sound(audioIn,fs)
audioInfo.Label
ans = categorical
chainsaw
Split the datastore into training and test sets.
[adsTrain,adsTest] = splitEachLabel(ads,0.8);
Create an audioFeatureExtractor to extract all possible features from the audio.
afe = audioFeatureExtractor(SampleRate=fs, ...
    Window=hamming(round(0.03*fs),"periodic"), ...
    OverlapLength=round(0.02*fs));
params = info(afe,"all");
params = structfun(@(x)true,params,UniformOutput=false);
set(afe,params);
afe
afe =
  audioFeatureExtractor with properties:

   Properties
                     Window: [1323×1 double]
              OverlapLength: 882
                 SampleRate: 44100
                  FFTLength: []
    SpectralDescriptorInput: 'linearSpectrum'
        FeatureVectorLength: 862

   Enabled Features
     linearSpectrum, melSpectrum, barkSpectrum, erbSpectrum, mfcc, mfccDelta
     mfccDeltaDelta, gtcc, gtccDelta, gtccDeltaDelta, spectralCentroid, spectralCrest
     spectralDecrease, spectralEntropy, spectralFlatness, spectralFlux, spectralKurtosis, spectralRolloffPoint
     spectralSkewness, spectralSlope, spectralSpread, pitch, harmonicRatio, zerocrossrate
     shortTimeEnergy

   Disabled Features
     none

   To extract a feature, set the corresponding property to true.
   For example, obj.mfcc = true, adds mfcc to the list of enabled features.
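The window and overlap lengths shown in the display follow from the 30 ms and 20 ms durations passed to the extractor. The sketch below reproduces that arithmetic and estimates how many analysis frames a clip would yield. The 5-second clip duration is an assumption made for illustration, and the frame count assumes the usual convention that only complete frames are kept.

```matlab
% Frame arithmetic for the window and overlap used above.
fs = 44100;                              % sample rate shown in the afe display
winLength = round(0.03*fs);              % 30 ms window, in samples
overlapLength = round(0.02*fs);          % 20 ms overlap, in samples
hopLength = winLength - overlapLength;   % samples between frame starts

% Hypothetical clip duration (an assumption, not read from the data set).
clipDuration = 5;                        % seconds
numSamples = clipDuration*fs;

% Number of complete frames that fit in the clip.
numFrames = floor((numSamples - winLength)/hopLength) + 1;

fprintf("Window: %d samples, hop: %d samples (%.0f ms), frames: %d\n", ...
    winLength,hopLength,1000*hopLength/fs,numFrames)
```

Under these assumptions, each row of the extracted feature matrix corresponds to one 10 ms hop.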
Create two directories in your current folder: train and test. Extract features from the training and the test data sets and write the features as MAT files to the respective directories. Pre-extracting features can save time when you want to evaluate different feature combinations or training configurations.
if ~isfolder("train")
    mkdir("train")
    mkdir("test")
    outputType = ".mat";
    writeall(adsTrain,"train",WriteFcn=@(x,y,z)writeFeatures(x,y,z,afe))
    writeall(adsTest,"test",WriteFcn=@(x,y,z)writeFeatures(x,y,z,afe))
end
Create signal datastores to point to the audio features.
sdsTrain = signalDatastore("train",IncludeSubfolders=true);
sdsTest = signalDatastore("test",IncludeSubfolders=true);
Create label arrays that are in the same order as the signalDatastore files.
labelsTrain = categorical(extractBetween(sdsTrain.Files,"ESC-10"+filesep,filesep));
labelsTest = categorical(extractBetween(sdsTest.Files,"ESC-10"+filesep,filesep));
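The label extraction relies on the feature files keeping the folder layout of the original data set, with each label as the folder directly under ESC-10. A minimal sketch of the same pattern on made-up paths (the root folder and file names are hypothetical):

```matlab
% Hypothetical feature-file paths laid out as <root>/ESC-10/<label>/<file>.mat.
root = fullfile("work","train","ESC-10");
files = [fullfile(root,"dog","clip_001.mat")
         fullfile(root,"rain","clip_002.mat")
         fullfile(root,"dog","clip_003.mat")];

% Take the text between "ESC-10<sep>" and the next separator: the label folder.
labels = categorical(extractBetween(files,"ESC-10"+filesep,filesep));

disp(labels)
disp(categories(labels))
```

Because filesep is used for both delimiters, the same expression works with either forward-slash or backslash path separators.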
Create a transform datastore from the signal datastores to isolate and use only the desired features. You can use the output from info on the audioFeatureExtractor to map your chosen features to the index in the features matrix. You can experiment with the example by choosing different features.
featureIndices = info(afe)
featureIndices = struct with fields:
linearSpectrum: [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 … ]
melSpectrum: [663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694]
barkSpectrum: [695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726]
erbSpectrum: [727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769]
mfcc: [770 771 772 773 774 775 776 777 778 779 780 781 782]
mfccDelta: [783 784 785 786 787 788 789 790 791 792 793 794 795]
mfccDeltaDelta: [796 797 798 799 800 801 802 803 804 805 806 807 808]
gtcc: [809 810 811 812 813 814 815 816 817 818 819 820 821]
gtccDelta: [822 823 824 825 826 827 828 829 830 831 832 833 834]
gtccDeltaDelta: [835 836 837 838 839 840 841 842 843 844 845 846 847]
spectralCentroid: 848
spectralCrest: 849
spectralDecrease: 850
spectralEntropy: 851
spectralFlatness: 852
spectralFlux: 853
spectralKurtosis: 854
spectralRolloffPoint: 855
spectralSkewness: 856
spectralSlope: 857
spectralSpread: 858
pitch: 859
harmonicRatio: 860
zerocrossrate: 861
shortTimeEnergy: 862
idxToUse = [...
    featureIndices.harmonicRatio ...
    ,featureIndices.spectralRolloffPoint ...
    ,featureIndices.spectralFlux ...
    ,featureIndices.spectralSlope ...
    ];
tdsTrain = transform(sdsTrain,@(x)x(:,idxToUse));
tdsTest = transform(sdsTest,@(x)x(:,idxToUse));
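The column selection performed by the transform can be tried on a small stand-in matrix. In this sketch the struct fields and index values are illustrative only and do not match the real output of info(afe).

```matlab
% Toy stand-in for the struct returned by info(afe): feature name -> column(s).
% The index values here are illustrative, not the real mapping.
featureIndices = struct;
featureIndices.mfcc = 1:3;
featureIndices.spectralFlux = 4;
featureIndices.pitch = 5;
featureIndices.harmonicRatio = 6;

% Toy feature matrix: 4 frames (rows) by 6 feature columns.
features = reshape(1:24,4,6);

% Concatenate the column indices of the chosen features.
idxToUse = [featureIndices.harmonicRatio,featureIndices.spectralFlux];

% Keep every frame but only the selected columns, as the transform does.
selected = features(:,idxToUse);
disp(size(selected))
```

The columns of the result follow the order of idxToUse, so the same index vector must be used for the training and test datastores.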
Create an i-vector system that accepts feature input.
soundClassifier = ivectorSystem(InputType="features");
Train the extractor and classifier using the training set.
trainExtractor(soundClassifier,tdsTrain,UBMNumComponents=128,TVSRank=64);
Calculating standardization factors ....done.
Training universal background model .....done.
Training total variability space ......done.
i-vector extractor training complete.
trainClassifier(soundClassifier,tdsTrain,labelsTrain,NumEigenvectors=32,PLDANumIterations=0)
Extracting i-vectors ...done.
Training projection matrix .....done.
i-vector classifier training complete.
Enroll the labels from the training set to create i-vector templates for each of the environmental sounds.
enroll(soundClassifier,tdsTrain,labelsTrain)
Extracting i-vectors ...done.
Enrolling i-vectors .............done.
Enrollment complete.
Calibrate the i-vector system.
calibrate(soundClassifier,tdsTrain,labelsTrain)
Extracting i-vectors ...done.
Calibrating CSS scorer ...done.
Calibration complete.
Use the identify function on the test set to return the system's inferred label.
inferredLabels = labelsTest;
inferredLabels(:) = inferredLabels(1);
for ii = 1:numel(labelsTest)
    features = read(tdsTest);
    tableOut = identify(soundClassifier,features,"css",NumCandidates=1);
    inferredLabels(ii) = tableOut.Label(1);
end
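The "css" argument selects the cosine similarity scorer, the same scorer named in the calibration output. The sketch below is a rough illustration of the idea of picking the closest enrolled template by cosine similarity. It is not the internals of identify: the template values, the query vector, and the variable names are all made up, and no calibration is applied.

```matlab
% Toy "templates": one row per enrolled label (values are made up).
templateLabels = categorical(["dog";"rain";"rooster"]);
templates = [1 0 0;
             0 1 0;
             0 1 1];

% Toy vector to identify.
query = [0.1 0.9 0.2];

% Cosine similarity between the query and each template.
scores = (templates*query.') ./ (vecnorm(templates,2,2)*norm(query));

% The best candidate is the template with the highest score.
[~,best] = max(scores);
inferred = templateLabels(best);
disp(table(templateLabels,scores))
```

Requesting NumCandidates=1, as in the loop above, corresponds to keeping only the top-scoring row.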
Create a confusion matrix to visualize performance on the test set.
uniqueLabels = unique(labelsTest);
cm = zeros(numel(uniqueLabels),numel(uniqueLabels));
for ii = 1:numel(uniqueLabels)
    for jj = 1:numel(uniqueLabels)
        cm(ii,jj) = sum((labelsTest==uniqueLabels(ii)) & (inferredLabels==uniqueLabels(jj)));
    end
end
labelStrings = replace(string(uniqueLabels),"_"," ");
heatmap(labelStrings,labelStrings,cm)
colorbar off
ylabel("True Labels")
xlabel("Predicted Labels")
accuracy = mean(inferredLabels==labelsTest);
title(sprintf("Accuracy = %0.2f %%",accuracy*100))
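To check the row and column convention of the confusion matrix, the same counting loop can be run on a handful of toy labels where the expected counts are easy to verify by hand. The label values below are made up and stand in for labelsTest and inferredLabels.

```matlab
% Toy labels standing in for labelsTest and inferredLabels.
trueLabels = categorical(["dog";"dog";"rain";"rain";"rain"]);
predLabels = categorical(["dog";"rain";"rain";"rain";"dog"]);

% Rows index the true label, columns the predicted label.
classes = unique(trueLabels);
cm = zeros(numel(classes));
for ii = 1:numel(classes)
    for jj = 1:numel(classes)
        cm(ii,jj) = sum(trueLabels==classes(ii) & predLabels==classes(jj));
    end
end

% Accuracy is the fraction of counts on the diagonal.
accuracy = trace(cm)/sum(cm,"all");
disp(cm)
```

Here three of the five predictions are correct, so the diagonal sums to 3 and the accuracy is 0.6, which matches mean(predLabels==trueLabels).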
Release the i-vector system.
release(soundClassifier)
Supporting Functions
function writeFeatures(audioIn,info,~,afe)
    % Convert to single precision.
    audioIn = single(audioIn);

    % Extract features.
    features = extract(afe,audioIn);

    % Replace the file extension of the suggested output name with MAT.
    filename = strrep(info.SuggestedOutputName,".wav",".mat");

    % Save the extracted features to the MAT file.
    save(filename,"features")
end
Input Arguments
ivs — i-vector system
ivectorSystem object
i-vector system, specified as an object of type ivectorSystem.
Version History
Introduced in R2021a
See Also
trainExtractor | trainClassifier | calibrate | enroll | unenroll | detectionErrorTradeoff | verify | identify | info | addInfoHeader | ivector | ivectorSystem | speakerRecognition