What is a smart way to evaluate the performance of a kernel density estimation model?

Hi MATLAB coders,
In this example, I am using kernel density estimation (i.e., KSDENSITY) to classify each subject as a smoker or non-smoker based on their height. The classifier's performance is evaluated using the true positive rate and the false positive rate.
To achieve this, I wrote the code shown below.
Having written this code, two questions come to mind:
1) Code simplification: Can the code structure be simplified further by applying a built-in MATLAB function such as CLASSPERF or CLASSIFY?
Thanks in advance for the wisdom
Note: patients.mat is a built-in MAT-file in MATLAB R2017a.
s=load('patients'); % load the data into structure in preparation for...
pat=struct2table(s); % creating a table structure from it...
pat.Gender=categorical(pat.Gender); % just a sample cleanup to turn from cellstr to categorical
newHeigh (pat.Gender=='Female')=3;
indices = crossvalind('Kfold',s.Smoker,10);
c=1;
for i = 1:max(indices)
    test = (indices == i);
    Validate.Height = pat.Height(test);           % 10% of the data used for testing
    Train.Height = pat.Height(~test);             % 90% of the data used for training
    Validate.smokingStatus = pat.Smoker(test);    % 10% of the data used for testing
    Train.smokingStatus = pat.Smoker(~test);      % 90% of the data used for training
    %
    [fSmk,xSmk] = ksdensity(Train.Height(Train.smokingStatus), Validate.Height);    % fit the empirical density to smokers
    [fNSmk,xNSmk] = ksdensity(Train.Height(~Train.smokingStatus), Validate.Height); % and to non-smokers in turn
    EasyComp = [fSmk fNSmk];                      % Store in EASYCOMP to manually check the output
    % The next line compares fSmk and fNSmk element by element.
    % Each element of the validation vector is assigned to the class with the
    % highest density value.
    EasyComp(:,3) = EasyComp(:,1) > EasyComp(:,2); % 1: smoker, 0: non-smoker | Column 3 is the predicted smoking status
    EasyComp(:,4) = Validate.smokingStatus;        % Column 4 is the actual smoking status
    ConfMat = categorical(repmat({'TN'}, [size(EasyComp,1) 1]));  % True Negative (TN)  | Predicted NS, actual NS
    ConfMat(EasyComp(:,3)==0 & EasyComp(:,4)==1) = 'FN';          % False Negative (FN) | Predicted NS, actual S
    ConfMat(EasyComp(:,3)==1 & EasyComp(:,4)==0) = 'FP';          % False Positive (FP) | Predicted S, actual NS
    ConfMat(EasyComp(:,3)==1 & EasyComp(:,4)==1) = 'TP';          % True Positive (TP)  | Predicted S, actual S
    % Calculate the true positive rate and false positive rate, which can be used to plot the receiver operating characteristic
    perf.TPR = sum(ConfMat == 'TP') / (sum(ConfMat == 'TP') + sum(ConfMat == 'FN')); % TP / (TP+FN)
    perf.FPR = sum(ConfMat == 'FP') / (sum(ConfMat == 'FP') + sum(ConfMat == 'TN')); % FP / (FP+TN)
    EasyComp(:,5) = grp2idx(ConfMat);             % Deliberately assigned to column 5 for easy manual check
    result.TPR{c} = perf.TPR;
    result.FPR{c} = perf.FPR;
    c = c+1;
end
perf.averageTPR = mean(cell2mat(result.TPR)); % The average TPR over the 10 folds
perf.averageFPR = mean(cell2mat(result.FPR)); % The average FPR over the 10 folds
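
One possible simplification, offered as a rough sketch rather than a definitive answer: CLASSPERF can accumulate the results of each fold, so the manual TN/FN/FP/TP bookkeeping may not be needed. The sketch below assumes the Bioinformatics Toolbox is available (CROSSVALIND already requires it) and converts the logical smoking status to text labels. The names `names`, `truth`, and `predicted` are made up for illustration. Note that it pools the predictions of all folds before computing the rates, which is not the same as averaging the per-fold rates as in the code above.

```matlab
s = load('patients');                  % built-in sample data
height   = s.Height;
isSmoker = s.Smoker;                   % logical ground truth

% Hypothetical text labels; index 1 = non-smoker, index 2 = smoker
names = {'NonSmoker'; 'Smoker'};
truth = names(isSmoker + 1);           % 100x1 cell array of labels

indices = crossvalind('Kfold', isSmoker, 10);
cp = classperf(truth, 'Positive', 'Smoker', 'Negative', 'NonSmoker');

for k = 1:10
    test  = (indices == k);
    train = ~test;
    % Evaluate both class densities at the held-out heights
    fSmk  = ksdensity(height(train &  isSmoker), height(test));
    fNSmk = ksdensity(height(train & ~isSmoker), height(test));
    predicted = names((fSmk > fNSmk) + 1);  % class with the higher density
    classperf(cp, predicted, test);         % accumulate this fold's results
end

TPR = cp.Sensitivity;                  % TP / (TP + FN), pooled over folds
FPR = 1 - cp.Specificity;              % FP / (FP + TN), pooled over folds
```

If per-fold averages are preferred, the rates could be read from a fresh CLASSPERF object inside the loop instead of from the accumulated one.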
