Multivariate regression for constrained dependent variables

I have a dataset with 6 independent variables and 2 dependent variables. The equation for each dependent variable should be something like:
y1j = β11*x1j + β21*x2j + ... + β61*x6j
y2j = β12*x1j + β22*x2j + ... + β62*x6j
I want to find βs that satisfy y1j + y2j = 1.
Which methodology should I use?
I tried using mvregress, but I couldn't constrain the dependent variables. I am also thinking of using optimization, but I still couldn't figure out how to constrain the dependent variables there either.
Thank you so much!

8 comments

Torsten
Torsten on 10 Aug 2024
Edited: Torsten on 10 Aug 2024
And the condition y1j + y2j = 1 should hold for your complete database
x11j x21j ... x61j
x12j x22j ... x62j
?
Thus
(b11*x11j + b21*x21j + ... b61*x61j) + (b12*x12j + b22*x22j + ... + b62*x62j) = 1
for all j ?
Or do I misunderstand your regression problem ?
Ismail
Ismail on 10 Aug 2024
Edited: Ismail on 10 Aug 2024
Do you mean the current state of my data?
If that is your question then yes, currently y1j, y2j share the same x1j x2j ... x6j. So, for example if x(j) = [4 5 2 6 4 1] and y(j) = [0.4 0.6], then the equations would be
y1j = 4*β11 + 5*β21 + 2*β31 + 6*β41 + 4*β51 + 1*β61 = 0.4
y2j = 4*β12 + 5*β22 + 2*β32 + 6*β42 + 4*β52 + 1*β62 = 0.6
Thus, y1j+y2j=1.
Edit: Yes, your explanation is right. Currently the database is all like that.
Torsten
Torsten on 10 Aug 2024
Edited: Torsten on 10 Aug 2024
I still don't get your problem.
Is it correct that you want to estimate 12 parameters (the beta values) ?
Is it correct that you have row vectors [x11i,x12i,...,x16i], [x21i,x22i,...,x26i] (1 <= i <= N) and you want to estimate the 12 beta values such that
sum_i {x11i*beta11+x12i*beta12+...+x16i*beta16 + x21i*beta21+x22i*beta22+...+x26i*beta26 - 1}^2
is minimized ?
Hi @Torsten,
If @Ismail's answer to all the questions listed above is yes, then the least squares method can be used to estimate the 12 beta values that minimize the given expression.
Yes, the ideal result would be:
Hi @Umar,
Ah I see, so I don't need to put any constraint values in that case, right?
Also, does that mean that there is no function that could deliberately treat the dependent variables as constraints?
It is actually a problem unlike any other I can recall in many years.
The point is though, you cannot use an equality constraint on the sum of y1+y2, since you will have N data points. It will not be just ONE constraint, but one constraint for each data point.
This leaves you with too few degrees of freedom to solve the problem. It is why I showed how to implement it in the form of a goal, where you try to make the sum as close to 1 as possible.
That there is no tool out there to handle it directly speaks to the uncommon nature of your problem.
Hi @Ismail,
To address your comments: "Ah I see, so I don't need to put any constraint values in that case, right? Also, does that mean that there is no function that could deliberately treat the dependent variables as constraints?"
You are correct that if you choose not to impose constraints on your beta values, you allow greater flexibility in fitting your model. However, if you were to impose equality constraints on the dependent variables (like ensuring their sum equals a certain value), you would indeed face challenges. As you noted, with N data points this generates one constraint per data point, not just one constraint overall. Each constraint reduces your available degrees of freedom (the number of independent pieces of information you have), so the system quickly becomes over-constrained, which hinders your ability to find a unique solution for the beta values.
Given that traditional methods may not accommodate your specific needs, consider reformulating the problem as an optimization goal rather than applying strict equality constraints. For instance, you could minimize the deviation from 1 while allowing some tolerance; this can be expressed as a penalty function or through regularization techniques. In addition, if overfitting is a concern, or if you wish to control the size of the estimated parameters, consider regularization techniques such as Lasso or Ridge regression. I would also suggest taking @John D'Errico's advice into account.
John D'Errico
John D'Errico on 11 Aug 2024
Edited: John D'Errico on 11 Aug 2024
A ridge regression as suggested by @Umar may be a great idea here. Classically, ridge regression is a technique to bias a regression estimator, pushing the estimated coefficients towards zero for a singular or nearly singular problem. But you can also use different bias forms. One common one I have used is to bias coefficients towards smoothness, in the context of a curve fitting problem (think about splines, or low order piecewise functions here.)
In your context, I would build a scheme to bias the coefficients such that the expression (y1hat+y2hat-1) is always small. That is, write the target expression as a quadratic form. The nice thing is, you could implement that scheme without needing a nonlinear tool like lsqnonlin. Backslash would be sufficient in the end if you do it carefully.
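A minimal sketch of that scheme, with made-up data standing in for the real dataset and an arbitrary weight w that you would tune: stack the two fitting problems together with w-weighted rows targeting y1hat + y2hat = 1, and let backslash solve the combined linear least squares problem.

```matlab
% Combine the two regressions and the soft target X1*b1 + X2*b2 ~ 1
% into one overdetermined linear system solved by backslash.
N = 50;
X1 = randn(N, 6);                  % made-up predictors standing in for yours
X2 = randn(N, 6);
y1 = 0.4 + randn(N, 1)/100;
y2 = 0.6 + randn(N, 1)/100;
w = 10;                            % weight on the sum-to-one target rows
Z = zeros(N, 6);
A = [X1,   Z;                      % rows fitting y1
     Z,    X2;                     % rows fitting y2
     w*X1, w*X2];                  % rows pushing y1hat + y2hat toward 1
b = [y1; y2; w*ones(N,1)];
beta = A \ b;                      % 12-by-1 vector [b1; b2]
y1hat = X1*beta(1:6);
y2hat = X2*beta(7:12);
```

Increasing w trades fit quality on y1 and y2 for a smaller violation of the sum-to-one target, which is the same tradeoff the alpha weights control in the lsqnonlin formulation elsewhere in this thread.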


Accepted Answer

Umar
Umar on 10 Aug 2024
Hi @Ismail,
I would suggest a constrained optimization approach: formulate the problem as an optimization task where you minimize a cost function subject to linear equality constraints. To start, define your data by generating random data for independent variables X and dependent variables Y.
X = randn(10, 6); % Independent variables (10 samples, 6 features)
Y = randn(10, 2); % Dependent variables (10 samples, 2 outputs)
Then, define the equation constraint by specifying the coefficients matrix Aeq and the right-hand side beq of the equation constraint.
Aeq = [1, 2, 3, 4, 5, 6]; % Coefficients matrix for the equation constraint
beq = 10; % Right-hand side of the equation constraint
Afterwards, define the objective function by creating a function that represents the error you want to minimize. In this case, it can be the least squares error between the actual Y and the predicted values based on X and beta.
fun = @(beta) norm(Y(:,1) - X*beta)^2; % least-squares error for one output column (beta is 6x1)
Now, provide an initial guess for the coefficients beta.
beta0 = zeros(6, 1); % Initial guess for beta (6 coefficients)
Use fmincon function for constrained optimization with the specified equality constraint.
options = optimoptions('fmincon', 'Algorithm', 'interior-point');
beta = fmincon(fun, beta0, [], [], Aeq, beq, [], [], [], options);
Display optimal beta values by outputting the optimal values of beta that satisfy the equation constraint.
disp('Optimal beta values:');
disp(beta);
By implementing this method in your code, you should be able to use constrained optimization to find the βs that satisfy your equation constraint while considering the dependencies between the variables. This methodology also allows you to optimize the coefficients under the specified constraints, providing a solution to your problem. Please let me know if you have any further questions.

6 comments

Ismail
Ismail on 10 Aug 2024
Edited: Ismail on 10 Aug 2024
Thank you for your answer!
I'm still confused about the Aeq and beq values, since your example puts: Aeq = [1, 2, 3, 4, 5, 6]; % Coefficients matrix for the equation constraint
beq = 10; % Right-hand side of the equation constraint
As far as I understand, we're creating a function of beta where X and Y are the constants, right? If that's the case, then Aeq and beq would mean: 1*β1 + 2*β2 + 3*β3 + 4*β4 + 5*β5 + 6*β6 = 10
And that makes me a bit confused. Can you explain why you suggested that?
Hi @Ismail,
When setting up an optimization problem with equality constraints, the coefficients matrix Aeq defines the linear combination of the decision variables (in this case, the coefficients beta) that must equal the specified right-hand side beq; the equation represented by Aeq and beq constrains the feasible solutions to lie on a hyperplane. In my provided example,
Aeq = [1, 2, 3, 4, 5, 6]; % Coefficients matrix for the equation constraint
beq = 10; % Right-hand side of the equation constraint
This equation constraint can be interpreted as: 1 * β1 + 2 * β2 + 3 * β3 + 4 * β4 + 5 * β5 + 6 * β6 = 10
and represents a specific relationship that the coefficients beta must satisfy in addition to minimizing the least squares error between the actual Y and the predicted values based on X and beta. So the Aeq and beq values are not arbitrary; they are derived from the problem's constraints and requirements.
By incorporating the equality constraint defined by Aeq and beq, you guide the optimization algorithm to search for solutions that not only minimize the error function but also adhere to the specified linear relationship among the coefficients. This makes sure the optimized coefficients beta satisfy both the error-minimization objective and the defined equality constraint, and it provides a structured approach to finding solutions that meet all specified conditions.
I hope this clarifies the role of Aeq and beq in constrained optimization and their importance in ensuring that the solution aligns with the defined constraints. If you still have any further questions or need additional clarification, please feel free to ask.
Thank you for your answer!
May I know where the numbers come from?
Hi @Ismail,
To address your query, "May I know where the numbers come from?":
I defined those values manually in the example to demonstrate how to set up an optimization problem with equality constraints. In a real-world scenario, you would replace these example numbers with your own data or with values derived from your specific problem. I hope this clarifies the origin of the numbers used in the example. If you have any further questions or need additional clarification, please feel free to ask.
Thank you for your answer!
That explains a lot!
Hi @Ismail,
Have you noticed the comments shared by @Torsten and @John D'Errico? They have provided excellent examples for your queries. Please let us know if all your questions have been answered, and don't forget to click "Accept answer" and vote for my friends. They have years of experience and have helped tackle many problems more complex than this one.


More Answers (2)

Torsten
Torsten on 10 Aug 2024
Edited: Torsten on 10 Aug 2024
Define a Nx12 matrix A as
A = [x11,x12,x13,x14,x15,x16,x21,x22,x23,x24,x25,x26]
and a Nx1 vector b as
b = ones(N,1)
and solve for beta = [beta11;beta12;beta13;beta14;beta15;beta16;beta21;beta22;beta23;beta24;beta25;beta26] by using
beta = A\b
If you also have values for the y, your matrix A becomes 3*N x 12 as
A = [x11,x12,x13,x14,x15,x16,x21,x22,x23,x24,x25,x26;x11,x12,x13,x14,x15,x16,zeros(N,6);zeros(N,6),x21,x22,x23,x24,x25,x26];
your vector b becomes 3*N x 1 as
b = [ones(N,1);y1;y2]
and you solve for beta = [beta11;beta12;beta13;beta14;beta15;beta16;beta21;beta22;beta23;beta24;beta25;beta26] by using
beta = A\b
If it's the latter case you have to consider, you can also give weights to the three different error components as @John D'Errico does in his suggestion (alpha1, alpha2, alpha3).
John D'Errico
John D'Errico on 11 Aug 2024
Sorry. You CANNOT ensure that y1+y2 == 1, EXACTLY. At least not if there is any noise in your data.
All you can do is to use it as a target. And then that target will conflict with your goal of minimizing the errors in y1 and y2. As such, you might set it up as a problem of the form
alpha1*norm(1 - y1hat - y2hat)^2 + alpha2*norm(y1 - y1hat)^2 + alpha3*norm(y2 - y2hat)^2
where y1hat and y2hat are the predictions from the separate regression models, and alpha1, alpha2, and alpha3 are parameters you must choose to tradeoff the residuals from each piece of the problem.
An important point I want to make is that the lack of a constant term in your models may well be a significant problem.
Anyway, how might you do this? I suppose you could solve it using lsqnonlin, though since the model parameters enter linearly, a linear modeling tool must be an option too.
Let me see if I can give an example. I'll need to make up some data, since you have provided none that I see.
N = 50;
X1 = randn(N,6);
X2 = rand(N,6);
Y1 = 0.4 + randn(N,1)/100;
Y2 = 0.6 + randn(N,1)/100;
As you can see, the data does have the property that y1+y2 should be 1, but for the noise in the system.
If I model each process separately, I get this:
B1 = X1\Y1;
B2 = X2\Y2;
[B1,B2]
ans = 6x2
    0.0587    0.3051
   -0.0032    0.1956
    0.0102    0.1781
    0.0732    0.1198
    0.0657    0.2308
   -0.0492    0.1005
It is just random crap data, so I really don't care about how well the models fit. I expect the models will be pretty poor, in fact.
Y1hat = X1*B1;
Y2hat = X2*B2;
[norm(Y1 - Y1hat),norm(Y2 - Y2hat)]
ans = 1x2
    2.6661    1.0332
But what happens with the predictions? As done separately, how well does it do?
norm(1 - (Y1 + Y2))
ans = 0.1054
norm(1 - (Y1hat + Y2hat))
ans = 3.0541
So, with no target put on the parallel models, things actually got WAY worse in terms of the sum of the predictions being 1. Again, the lack of a constant term in your model is probably a huge factor here.
Anyway, having built the models separately, can we formulate this as a regression model, but with the sum target built in?
alph = [1/3 1/3 1/3];
B12start = randn(12,1);
[B12est,~,~,exitflag] = lsqnonlin(@(B12) obj(B12,X1,X2,Y1,Y2,alph),B12start);
Local minimum found. Optimization completed because the size of the gradient is less than the value of the optimality tolerance.
reshape(B12est,6,2)
ans = 6x2
    0.0599    0.4327
   -0.0137    0.2133
    0.0158    0.2106
    0.0582    0.1975
    0.0538    0.3339
   -0.0636    0.0829
Y1hatest = X1*B12est(1:6);
Y2hatest = X2*B12est(7:12);
[norm(Y1 - Y1hatest),norm(Y2 - Y2hatest)]
ans = 1x2
    2.6715    1.6373
norm(1 - (Y1hatest + Y2hatest))
ans = 2.1121
As you can see, the models are a little different. The coefficients have changed, but not by a lot. But now the target for the sum of the two models has had its error reduced.
function res = obj(B12,X1,X2,Y1,Y2,alph)
Y1hat = X1*B12(1:6);Y2hat = X2*B12(7:12);
res = [alph(1)*(1 - (Y1hat + Y2hat));alph(2)*(Y1-Y1hat);alph(3)*(Y2-Y2hat)];
end

Products

Version

R2024a

Asked:

on 10 Aug 2024

Edited:

on 11 Aug 2024
