Finding PDF for difference of two PDFs

Question

Micah Mungal on 30 Nov 2018

https://la.mathworks.com/matlabcentral/answers/432971-finding-pdf-for-difference-of-two-pdfs

Edited: Bruno Luong on 30 Sep 2021

Hi, I have data for two normal/Gaussian PDFs which overlap like this:

[Figure: two overlapping Gaussian PDFs, with the overlap region shaded green.]

Now I want to get the PDF of the green region using the two original PDFs. I was told I should use a convolution of the two data sets, but I am not getting that intersection. Any advice?

6 comments

John D'Errico on 1 Dec 2018

Edited: John D'Errico on 1 Dec 2018

Your question is confusing, partly because you talk about computing a pdf as a result. Are you looking to...

1) Compute the area of overlap, as a number? That is the integral of the minimum of the two PDFs as functions. (Not sure why you would want to, but that seems to be what you have drawn.)
2) Compute the overlap region as a function itself? It won't be a PDF, since it lacks the necessary properties (it does not integrate to 1), nor does it seem to make much sense to me as such.
3) Compute the distribution of the difference of two random variables, where each is defined by its respective (Gaussian) PDF? This you could arguably do as a convolution, but it is far simpler to just combine the means and variances, as one learns in a basic statistics class. That works since both PDFs are nominally for Gaussian random variables (and assuming the two variables are independent). So the mean of the difference is mu2 - mu1, and the variance of the difference is sigma1^2 + sigma2^2. As simple as that.
4) Something else?

So, I think perhaps you are a bit confused. But maybe you know exactly what you want to do, and have just posed a very confusing question, thereby leaving us also confused. ;-) It is difficult to know.

Micah Mungal on 1 Dec 2018

Yeah, I'm trying to compute the distribution of the difference of two random variables. So I chose to use the convolution approach. I did the convolution, and got an array of values. How do I use these values? Do I have to do a plot or something?

John D'Errico on 2 Dec 2018

See my answer, where I will add a comment that explains what you are trying to do, and how you might solve the problem.

Answer 1

Bruno Luong on 30 Nov 2018

https://la.mathworks.com/matlabcentral/answers/432971-finding-pdf-for-difference-of-two-pdfs#answer_349772

Edited: Bruno Luong on 30 Sep 2021

Not sure if it helps but the green area is

{ (x,y) : 0 <= y <= min(p1(x),p2(x)) }
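If what is wanted is the area of that region as a number, it is the integral of min(p1(x), p2(x)) over x. Here is a minimal sketch in base MATLAB, assuming two Gaussian PDFs with made-up parameters (mu1, s1, mu2, s2 and the helper gauss are illustrative names, not from the original post):

```matlab
% Gaussian pdf written out by hand so that no toolbox is required
gauss = @(x, mu, s) exp(-(x - mu).^2 ./ (2*s^2)) ./ (s*sqrt(2*pi));

% Illustrative parameters (assumptions, not taken from the question)
mu1 = 0;  s1 = 1;
mu2 = 2;  s2 = 1;
p1 = @(x) gauss(x, mu1, s1);
p2 = @(x) gauss(x, mu2, s2);

% The upper boundary of the green region is the pointwise minimum
overlapFun = @(x) min(p1(x), p2(x));

% Integrate over a range that covers both curves out to 8 standard deviations
lo = min(mu1 - 8*s1, mu2 - 8*s2);
hi = max(mu1 + 8*s1, mu2 + 8*s2);
overlapArea = integral(overlapFun, lo, hi);
```

For these particular values the two curves cross at x = 1, so the area should come out as P(|Z| > 1), roughly 0.317. Note that overlapFun itself is not a PDF, since its integral is less than 1.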

1 comment

Jones on 30 Sep 2021

Would you mind expanding on this? I have a problem similar to the one outlined in the original post, but I'm trying to find the green area. Thanks.

Answer 2

John D'Errico on 1 Dec 2018

https://la.mathworks.com/matlabcentral/answers/432971-finding-pdf-for-difference-of-two-pdfs#answer_350008

Edited: John D'Errico on 1 Dec 2018

Why in the name of god and little green apples would you use a convolution to compute the distribution of the difference of two normally distributed random variables?????? That is, if you have:

X1 ~ N(mu1,sigma1^2)

X2 ~ N(mu2,sigma2^2)

Then, provided X1 and X2 are independent, X2 - X1 is also normally distributed:

X2 - X1 ~ N(mu2 - mu1 , sigma1^2 + sigma2^2)

Using a convolution is silly to do here, when an analytical result exists. (Unless of course, you need to prove that result as a homework problem in a basic course in statistics.)

http://mathworld.wolfram.com/NormalDifferenceDistribution.html
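As a quick sanity check of that closed-form result, here is a small sketch that builds the PDF of X2 - X1 from the combined mean and variance and compares it against simulated differences. The numeric values are arbitrary illustrations, and independence of X1 and X2 is assumed:

```matlab
% Illustrative parameters (assumptions)
mu1 = 10;  sigma1 = 2;
mu2 = 14;  sigma2 = 3;

% Closed-form parameters of D = X2 - X1 for independent normals
muD    = mu2 - mu1;
sigmaD = sqrt(sigma1^2 + sigma2^2);

% PDF of the difference, usable directly for plotting or evaluation
pdfD = @(d) exp(-(d - muD).^2 ./ (2*sigmaD^2)) ./ (sigmaD*sqrt(2*pi));

% Monte Carlo check: simulate both variables and subtract
rng(0);
n = 1e6;
d = (mu2 + sigma2*randn(n, 1)) - (mu1 + sigma1*randn(n, 1));
fprintf('mean: %.3f (theory %.3f), std: %.3f (theory %.3f)\n', ...
        mean(d), muD, std(d), sigmaD);
```

The simulated mean and standard deviation should agree with muD and sigmaD to within sampling error.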

2 comments

Micah Mungal on 1 Dec 2018

I'm using a convolution because there are instances where one of the random variables does not follow a normal distribution.

But I'm still confused as to what to do with the values after I have found the convolution.

John D'Errico on 2 Dec 2018

You explicitly stated that you had two normal distributions. I hate chasing a moving target. Ugh. I also dislike consulting in the comments, and this is likely to get long. Double ugh.

What do you want at the end? I think this is a big part of your problem, that you don't totally understand what you are doing, and why you are even doing it. (Sorry, but that seems to be the case here.)

You are looking to compute the distribution of a function, a transformation of two random variables. In this case, it is a difference of the two variables. The realm of statistics that deals with these problems is sometimes called "statistical tolerancing". In fact, if you do a search on those words and my name, you will find several 30ish year old papers on that topic and on Modified Taguchi methods that I wrote with Nick Zaino.

https://www.jstor.org/stable/1269802?seq=1#page_scan_tab_contents

https://dl.acm.org/citation.cfm?id=58535

The problem is that the distribution of some transformation of a random variable is often not a simple one. Yes, you get lucky when you add or subtract two normal variables, since the result will still be normally distributed as I explained above. However, that is often not the case. The trick is often to use various methods to infer the distribution. This is something that pearsrnd and johnsrnd help you to do. (Both are in the stats toolbox. Though I thought I remembered a pearsfit utility too, in there at some point in the past. Not there now.)

A common source of simple problems like these is engineering design, where I recall phrases like "stack tolerancing". Here you might have a stack of gears and bushings that lie on a shaft, perhaps in a transmission, where each piece has given tolerances on its dimensions. Then you need to know the distribution of the sum of those thicknesses, because it all needs to fit inside a housing. If each individual part has a normal distribution, then again, the sum is simple to deal with. But sometimes the transformation is not as simple as taking the sum of terms, or the element distributions are not all normally distributed.
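For the simple all-normal case, the stack calculation can be sketched as follows. The part dimensions and the housing length are hypothetical numbers made up for illustration, and the parts are assumed to be independent:

```matlab
% Hypothetical stack of four parts on a shaft: nominal thickness and SD, in mm
mu = [12.0 3.5 8.0 3.5];
sd = [0.05 0.02 0.04 0.02];
housing = 27.15;               % hypothetical space available in the housing, mm

% Sum of independent normals: means add, variances add
muStack = sum(mu);
sdStack = sqrt(sum(sd.^2));

% Probability that the stack fits, i.e. P(sum <= housing),
% using the normal CDF written via erfc (base MATLAB, no toolbox)
z    = (housing - muStack) / sdStack;
pFit = 0.5 * erfc(-z / sqrt(2));
```

With non-normal parts or a nonlinear transformation this shortcut no longer applies, which is where the numerical methods discussed in this comment come in.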

In the papers that I reference above, we show how the problem can be solved using what were called Taguchi methods, which we then showed were actually implicit numerical integration methods, whereby we derive Modified Taguchi methods. Modified Taguchi methods use an implicit Gaussian integration to perform the integration far more accurately.
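To give a flavor of what an implicit Gaussian integration looks like, here is a sketch using the standard three-point Gauss-Hermite rule for a normal variable (nodes at mu and mu +/- sqrt(3)*sigma, weights 2/3, 1/6, 1/6). This is my assumption of the kind of rule meant; the papers' actual scheme may differ in detail. The function g and all parameter values are illustrative:

```matlab
% Three-point Gauss-Hermite rule for a standard normal variable
z = [-sqrt(3) 0 sqrt(3)];      % nodes
w = [1/6 2/3 1/6];             % weights (sum to 1)

% Illustrative inputs (assumptions): two independent normal variables
mu1 = 10;  s1 = 2;
mu2 = 14;  s2 = 3;

% Transformation whose distribution we want to characterize
g = @(x1, x2) x2 - x1;

% Tensor-product grid of nodes and the matching product weights
[Z1, Z2] = meshgrid(z, z);
W = w.' * w;
G = g(mu1 + s1*Z1, mu2 + s2*Z2);

% Approximate mean and variance of g(X1, X2) from 9 function evaluations
m = sum(W(:) .* G(:));
v = sum(W(:) .* (G(:) - m).^2);
```

For this linear g the rule is exact, so m and v reproduce mu2 - mu1 and sigma1^2 + sigma2^2. Swapping in a nonlinear g gives approximate moments instead, which could then be fed to something like pearsrnd to infer a distribution.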

Let me see if I can give an example. (It has been a long time since I did any serious work in this area..., so give me a few minutes to write something up...)

Answer 3

Jeff Miller on 1 Dec 2018

https://la.mathworks.com/matlabcentral/answers/432971-finding-pdf-for-difference-of-two-pdfs#answer_350038

1) What do you mean when you say "I did the convolution, and got an array of values"? It sounds a bit like you are applying MATLAB's 'conv' or 'conv2' functions to your data values, but that is not what you want. With random variables, 'convolution' refers to convolving their density functions, not their observed values.

2) In your first post you said that you had 'data', so maybe you have a set of values for each random variable, call them X and Y. Are X and Y independent? If not, the problem has just become a whole lot more complicated. If so, you can approximate the distribution of differences like this:

a) Form all possible pairs with one score from X and one score from Y.
b) Compute the difference for each pair.
c) Make a histogram of these differences. This histogram is an approximation of the distribution of differences that you said you wanted.
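The three steps above can be sketched like this. The samples x and y are simulated stand-ins for your own data (their distributions are arbitrary choices for illustration), and the outer-difference line relies on implicit expansion, so it assumes R2016b or later:

```matlab
% Stand-in data (assumptions): replace with your own observed samples
rng(1);
x = 50 + 5*randn(300, 1);          % sample from X
y = -20*log(rand(200, 1));         % sample from Y (exponential, mean 20)

% Steps 1 and 2: all pairwise differences x(i) - y(j), as a 300-by-200 matrix
D = x - y.';
d = D(:);

% Step 3: histogram normalized so that the bar areas sum to 1,
% which makes it an estimate of the pdf of X - Y
[pdfEst, edges] = histcounts(d, 50, 'Normalization', 'pdf');
centers = edges(1:end-1) + diff(edges)/2;

% bar(centers, pdfEst, 1) would display the estimate
```

Keep in mind that the pairwise differences reuse each observation many times, so they are not independent draws; the histogram is a reasonable estimate of the shape, but the effective sample size is closer to the size of the original samples than to numel(d).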

3) Perhaps you want to compute the distribution of differences from the PDFs in functional form (e.g., normal) rather than from observed data values. That can also be done if the random variables are independent. As John said, there is a direct solution for this if both RVs are normal. If not, you probably have to use numerical integration. There are some routines to do this in Cupid. For example, here are the commands to compute the distribution of differences X-Y between independent normal and gamma RVs, and of course many other distributions are also defined in Cupid:

X = Normal(50,5);
Y = RNGamma(6,.01);
Diff = Difference(X,Y);
Diff.PlotDens;  % This will plot the PDF and CDF of the difference distribution.
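If you would rather not depend on Cupid, the same kind of numerical integration can be sketched in base MATLAB. For independent X and Y, the pdf of D = X - Y is the integral over x of fX(x)*fY(x - d). The parameter values and their interpretation below (normal with mean 50 and SD 5, gamma with shape 6 and rate 0.01) are my own assumptions for illustration and are not meant to reproduce Cupid's parameterization:

```matlab
% Assumed example distributions (illustrative, see note above)
muX = 50;  sdX = 5;            % X ~ normal, mean 50, SD 5
k = 6;     rate = 0.01;        % Y ~ gamma, shape 6, rate 0.01 (mean 600)

fX = @(x) exp(-(x - muX).^2 ./ (2*sdX^2)) ./ (sdX*sqrt(2*pi));
% max(y,0) makes the gamma density evaluate to 0 for y <= 0
fY = @(y) rate^k .* max(y, 0).^(k-1) .* exp(-rate .* max(y, 0)) ./ gamma(k);

% pdf of D = X - Y at a single point d, integrating over the
% range where fX is non-negligible (8 SDs either side of the mean)
xLo = muX - 8*sdX;
xHi = muX + 8*sdX;
fD = @(d) integral(@(x) fX(x) .* fY(x - d), xLo, xHi);

% Evaluate on a grid and check that it behaves like a pdf
d     = linspace(-2500, 100, 521);
pdfD  = arrayfun(fD, d);
area  = trapz(d, pdfD);        % should be close to 1
meanD = trapz(d, d .* pdfD);   % should be close to muX - k/rate
```

Integrating over x rather than y keeps the integration range finite and centered on the narrow normal density, which tends to be easier for an adaptive integrator than a semi-infinite range.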

1 comment

Micah Mungal on 2 Dec 2018

Hi, thanks for the advice. Yes I was using MATLAB's conv function. My bad. I will try what you said in points (2) and (3) and see which gives me what I really want since X and Y are independent.

Answer 4

Bruno Luong on 2 Dec 2018

https://la.mathworks.com/matlabcentral/answers/432971-finding-pdf-for-difference-of-two-pdfs#answer_350090

Edited: Bruno Luong on 2 Dec 2018

For 2 independent variables A and B, the pdf of A+B is

pdf(A+B) = pdf(A) * pdf(B)

where "*" is the continuous convolution operator

(F*G)(t) = integral from -infinity to +infinity of F(tau) G(t-tau) dtau.

If you want the PDF of A-B, then just do

pdf(A-B) = pdf(A) * pdf(-B) = pdf(A) * flip(pdf(B))

where

flip(G)(x) := G(-x)

If you approximate the continuous convolution by a discrete convolution (sample both PDFs on a common grid with an appropriate step, and multiply the result by that step), that will give you one way to compute the pdf of (A-B).

Just follow that math, and you'll find how to derive an approximation of the PDF of A-B from a discrete convolution.
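Here is a minimal sketch of that recipe using conv. The two Gaussians are only placeholders chosen so the result can be checked against the known closed form; any PDFs sampled on grids with the same spacing dx would do. The variable names and parameter values are illustrative assumptions:

```matlab
% Common grid spacing for both sampled PDFs
dx = 0.01;
xA = -10:dx:20;
xB = -10:dx:20;

% Placeholder PDFs (assumptions): A ~ N(5, 1.5^2), B ~ N(2, 2^2)
gauss = @(x, mu, s) exp(-(x - mu).^2 ./ (2*s^2)) ./ (s*sqrt(2*pi));
pA = gauss(xA, 5, 1.5);
pB = gauss(xB, 2, 2.0);

% pdf(A-B) = pdf(A) * pdf(-B): flip pB, convolve, and scale by dx
% so that the discrete sum approximates the integral
pD = conv(pA, flip(pB)) * dx;

% The output has numel(pA)+numel(pB)-1 points; its axis starts at
% min(A) + min(-B) = xA(1) - xB(end) and uses the same step dx
xD = (xA(1) - xB(end)) + (0:numel(pD)-1) * dx;

% Checks: unit area, and agreement with the closed form for normals
area   = trapz(xD, pD);
pExact = gauss(xD, 5 - 2, sqrt(1.5^2 + 2.0^2));
maxErr = max(abs(pD - pExact));
```

So the "array of values" returned by conv is the sampled PDF of the difference, and it only becomes usable once it is scaled by dx and paired with the axis xD; plot(xD, pD) then shows the density. The grids must be wide enough that both PDFs are essentially zero at their ends.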
