Calculating the Akaike information criterion (AIC) for Neural Networks

15 views (last 30 days)
Isabelle Museck on 10 Feb 2025
Answered: Umar on 11 Feb 2025
Hello there, I am trying to calculate the AIC for my TCN models and I was wondering if this is correct? I am not sure whether I am calculating the log-likelihood correctly. Let me know if you have any guidance! Thanks in advance!
net = trainnet(trainingdataX,trainingdataY,net,"mse",options);
Predval = minibatchpredict(net,validationdataX,InputDataFormats="CTB");
TrueVal = validationdataY;
TrueValue = cell2mat(TrueVal);
Predvalue = {Predval};
PredictedValue = cell2mat(Predvalue);
% Calculate log-likelihood for each model
logLikelihood = calculateLogLikelihood(PredictedValue,TrueValue);
% Determine the number of parameters for each model
numParams = numFeatures
% Calculate AIC for each model
AIC = 2 * numParams - 2 * logLikelihood;
% Function to calculate log-likelihood
function logL = calculateLogLikelihood(PVal,TVal)
errors = TVal - PVal;
logL = -0.5 * sum(errors.^2); % Example for Gaussian errors
end

Answers (1)

Umar on 11 Feb 2025

Hi @Isabelle Museck,

I went through your code. To calculate AIC correctly, it's essential to first make sure the log-likelihood is computed accurately. In your code, you define a function `calculateLogLikelihood` that computes the log-likelihood assuming Gaussian errors, which is appropriate if the errors are normally distributed. Here's a breakdown of your code and some potential improvements:

1. Log-Likelihood Calculation: The formula you used,

   logL = -0.5 * sum(errors.^2);

assumes that the errors follow a Gaussian distribution with mean zero and unit variance, and it drops the constant normalization term. That makes it a proxy for the log-likelihood rather than the log-likelihood itself: if the noise variance is not 1, or differs between the models you are comparing, the resulting AIC values are not on a comparable scale. A common fix is to estimate the variance from the residuals and evaluate the full Gaussian log-likelihood; if your error distribution is not Gaussian at all, adjust the calculation accordingly.
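
For reference, here is a minimal sketch of that full Gaussian log-likelihood with the error variance estimated from the residuals by maximum likelihood (the function name is just an illustrative placeholder):

   % Full Gaussian log-likelihood with the error variance estimated
   % from the residuals (maximum-likelihood estimate of sigma^2)
   function logL = gaussianLogLikelihood(PVal, TVal)
       errors = TVal(:) - PVal(:);                % residuals as a column vector
       n = numel(errors);                         % number of observations
       sigma2 = mean(errors.^2);                  % MLE of the error variance
       logL = -0.5 * n * (log(2*pi*sigma2) + 1);  % log-likelihood evaluated at the MLE
   end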

2. Number of Parameters: You have:

   numParams = numFeatures;

Note that `numFeatures` is typically the number of input features, not the number of model parameters. For a neural network, the parameter count in the AIC formula should be the total number of learnable parameters, i.e. all weights and biases across all layers, which is usually far larger than the number of inputs.
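
Assuming `net` is the dlnetwork returned by `trainnet`, one way to obtain that count is to sum the sizes of the entries in its `Learnables` table, for example:

   % Total number of learnable parameters (weights and biases) in the network.
   % net.Learnables is a table whose Value column holds one parameter array per row.
   numParams = sum(cellfun(@numel, net.Learnables.Value));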

3. AIC Calculation: Your AIC formula:

   AIC = 2 * numParams - 2 * logLikelihood;

is the standard definition, AIC = 2k - 2*ln(L). Just make sure that `logLikelihood` holds an actual log-likelihood value (including the variance term discussed above) rather than a scaled error sum.

Here’s an example implementation considering the points above:

   % Assuming training and validation data are already defined
   net = trainnet(trainingdataX, trainingdataY, net, "mse", options);
   Predval = minibatchpredict(net, validationdataX, InputDataFormats="CTB");
   PredictedValue = Predval;                 % minibatchpredict already returns a numeric array
   TrueValue = cell2mat(validationdataY);    % concatenate the validation targets
   % Log-likelihood of the validation residuals (see the full Gaussian
   % version above if you want the variance estimated from the data)
   logLikelihood = calculateLogLikelihood(PredictedValue, TrueValue);
   % Number of learnable parameters (weights and biases), not input features
   numParams = sum(cellfun(@numel, net.Learnables.Value));
   % Akaike information criterion
   AIC = 2 * numParams - 2 * logLikelihood;
   % Log-likelihood assuming zero-mean, unit-variance Gaussian errors
   function logL = calculateLogLikelihood(PVal, TVal)
       errors = TVal - PVal;
       logL = -0.5 * sum(errors.^2);
   end

Here are some additional insights to consider:

1. Model Assumptions: Ensure that your model assumptions align with your data characteristics. If you're working with non-Gaussian errors or have a different distribution in mind, consider using other methods for calculating log-likelihood.

2. Cross-Validation: When calculating AIC for model selection, consider using cross-validation techniques to ensure that your results generalize well to unseen data.

3. Comparative Analysis: AIC values are meaningful primarily when comparing multiple models fitted to the same data; lower AIC indicates a better balance of fit and complexity.

4. Alternative Criteria: Besides AIC, consider the Bayesian information criterion (BIC = k*ln(n) - 2*ln(L)), which penalizes model size more strongly as the sample size grows, or adjusted R-squared, depending on your specific context and needs (a small sketch comparing AIC and BIC follows this list).
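
As a rough illustration of points 3 and 4, here is a minimal helper that computes both criteria for a set of candidate models; the function name and inputs are placeholders, and you would pass in the per-model log-likelihoods and parameter counts computed as shown earlier, plus the number of validation observations:

   % AIC and BIC for several candidate models evaluated on the same data.
   % logLs: vector of per-model log-likelihoods
   % ks:    vector of per-model learnable-parameter counts
   % n:     number of validation observations
   function [AICs, BICs] = informationCriteria(logLs, ks, n)
       AICs = 2*ks - 2*logLs;          % Akaike information criterion
       BICs = ks*log(n) - 2*logLs;     % Bayesian information criterion
   end

The model with the lowest value of either criterion is the preferred one under that criterion.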

In a nutshell, you are on the right track with your AIC calculation for your TCN models. Just make sure your assumptions about the error distribution are valid and that `numParams` counts the network's learnable parameters so that your model comparisons are accurate.

Hope this helps.
