Precision and Recall based on Matrix

Dear all,
I have 10 query images and 100 gallery images. Each query image has 10 gallery copies, each with different image noise. Using a similarity criterion, I have computed similarity scores in R, where R is a 100 x 10 matrix. Each column holds the similarity scores of one query image against all gallery images.
Suppose I sort each column of R by similarity score in descending order. Then in column 1, the first 10 images should be the correct images for query 1, so the indices in the first 10 rows of column 1 should lie in the range 1-10.
In column 2, the first 10 images should be the correct images for query 2, so the indices in the first 10 rows of column 2 should lie in the range 11-20, and so on.
Based on that, I want to compute precision and recall. Since there are 10 images for each query image, suppose that after sorting we get the following gallery indices in columns 1 and 2 of R (query 1 and query 2):
Rank   Column 1   Column 2
1      1          11
2      3          15
3      7          16
4      6          18
5      2          17
6      5          12
7      8          13
8      10         14
9      9          21
10     4          20
11     .          19
.      .          .
.      .          .
Ideally, for column 1 (query 1), the first 10 indices should be in the range 1-10. That is the case here, so precision is 1 at every recall step.
For query 2, the first 10 indices should ideally be in the range 11-20, but the 9th element (21) is greater than 20. In that case, at recall 0.9 the precision is 9/10, because the ninth correct image appears at rank 10, and at recall 1 the precision is 10/11.
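The arithmetic for query 2 can be checked with a short sketch. It assumes, as the question does, that gallery images 11-20 are the correct matches for query 2; the variable names are only illustrative.

```matlab
% Ranked gallery indices for query 2, copied from the example above
col = [11 15 16 18 17 12 13 14 21 20 19]';

% Assumption from the question: query 2 owns gallery images 11..20
hit = col >= 11 & col <= 20;

r = find(hit);            % ranks at which the correct images appear
k = (1:numel(r))';        % number of correct images retrieved so far
precision = k ./ r;       % precision at each recall step
recall = k / 10;          % 10 correct images per query

disp([recall precision])
```

The last two rows pair recall 0.9 with precision 9/10 and recall 1.0 with precision 10/11, matching the values worked out by hand.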
I have tried to explain this clearly; if anything is messy or unclear, please let me know.

2 comments

Walter Roberson on 25 Aug 2012
How do you define "precision" and "recall" for your purpose?
Aravin on 25 Aug 2012
Thanks Walter, I have updated the question. Please advise.


Accepted Answer

Junaid on 28 Aug 2012
As I understand it, you have 10 copies of each query image. I do this with a loop, based on some assumptions that I infer from your explanation. R contains similarity scores, and sorting in descending order puts the highest values at the top. Each column corresponds to one query, and the queries are in the same order as the gallery groups (column 1 matches gallery images 1-10, column 2 matches 11-20, and so on).
I assume R is 100-by-10. You are trying to calculate the precision at every recall level, where recall = [0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0].
Let's start.
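[~, ind] = sort(R, 1, 'descend');   % ind(k,i): gallery index at rank k for query i
upper = 10:10:100;                  % last gallery index belonging to each query
P = zeros(10, size(R,2));           % P(k,i): precision at recall k/10 for query i
for i = 1:size(R,2)
    col = ind(:,i);
    hit = col > upper(i) - 10 & col <= upper(i);   % true where a correct image is ranked
    r = find(hit);                  % ranks of the 10 correct images
    t = (1:numel(upper))';          % number of correct images found so far
    P(:,i) = t ./ r;
end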
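P now holds the precision values for each query: one column per query, one row per recall level. You can take the row-wise mean
mean(P,2)
to get the overall precision at each recall level.
To try this on a random R, define R before running the code above:
R = rand(100,10);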
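A random R only shows that the loop runs. As a sanity check, here is a minimal sketch with a synthetic R whose correct answer is known in advance. The score ranges are an arbitrary assumption, chosen so that every correct match outscores every incorrect one, which should give a precision of 1 at every recall level.

```matlab
% Hypothetical test matrix: 100 gallery images x 10 queries
nQuery = 10; nRel = 10;
R = 0.5 * rand(nQuery*nRel, nQuery);      % background scores below 0.5
for i = 1:nQuery
    rows = nRel*(i-1) + (1:nRel);          % gallery images belonging to query i
    R(rows, i) = 1 + rand(nRel, 1);        % correct matches score above 1
end

[~, ind] = sort(R, 1, 'descend');          % ranked gallery indices per query
P = zeros(nRel, nQuery);
for i = 1:nQuery
    col = ind(:, i);
    hit = col > nRel*(i-1) & col <= nRel*i;    % correct images for query i
    P(:, i) = (1:nRel)' ./ find(hit);          % precision at recall 0.1 ... 1.0
end

meanP = mean(P, 2);   % all ones for this ideal matrix
```

If any entry of P drops below 1 on this input, the index ranges or the sort direction are the first things to check.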

More Answers (1)

Walter Roberson on 25 Aug 2012

You do not have enough information to calculate precision or recall. In order to calculate those, you need information about which gallery image each query image "really" is. You do not know that (or at least you do not say that you know that.)
You wrote, "Then in column 1, first 10 images should be correct images of query 1". That may be the ideal case, but it involves assumptions that cannot be justified (not without more information.) If two of what you refer to as "query images" are very similar, then it is possible that because of the noise levels, the "better" match in practice is against the alternative image.
Your terminology about "query" and "gallery" seems nearly the opposite of what other posters have been using in the past. Posters who have discussed this subject in the past usually have stored "clean" images that they refer to as "gallery" images, and they then have images (possibly noisy, possibly with small variations from the gallery examples) that are to be queried ("query images") against the gallery to determine which gallery member it matches against. Notice the difference here about which set of images is noisy: the other people refer to the non-noisy images as being the gallery images.
There are, however, cases in which the original input consists of a number of clean images, each belonging to a known group, and then when the query images are presented, the question is which group the query matches against. For example, the original set might have examples of 10 different cat faces, 12 different dog profiles, 8 different mermaid statues, and 15 different cars, and then when presented with a "query" image that contains a number of items, the task would be to figure out which category is more predominant in the scene (or "none of the above" if it doesn't match anything.) This kind of image categorization is notably different than what you describe your needs as being, but it is the variety of analysis in which there might be multiple gallery images per category the way you describe.
If you had groups of "noisy" images all in one category, then the first task would usually be to de-noise and then arrive at a single consensus or median or average image that would be what was compared against. The gallery would not be 100; the gallery would be the number of groupings (10), one consensus image per grouping. The approach would be different from the case where you had 10 different cat faces that together are intended to capture the idea "cat".
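The point about needing to know which gallery image each query "really" is can be made explicit in code. The following is a minimal sketch, not a definitive method: it assumes a hypothetical label vector galleryLabel that records the ground truth, filled in here with the grouping the question takes for granted (images 1-10 belong to query 1, 11-20 to query 2, and so on). R is a random placeholder.

```matlab
% Assumed ground truth (hypothetical name): galleryLabel(g) is the query
% that gallery image g belongs to. Images 1-10 -> 1, 11-20 -> 2, ...
galleryLabel = kron((1:10)', ones(10, 1));    % 100 x 1

R = rand(100, 10);                            % placeholder similarity scores
[~, ind] = sort(R, 1, 'descend');             % ranked gallery indices per query

q = 3;                                        % query to evaluate
hit = galleryLabel(ind(:, q)) == q;           % true where the ranked image is relevant
found = cumsum(hit);                          % relevant images retrieved up to each rank
precisionAtRank = found ./ (1:100)';          % precision after each rank
recallAtRank = found / sum(galleryLabel == q);
```

Keeping the ground truth in a separate vector means a different grouping only requires changing galleryLabel, not the precision and recall computation.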

7 comments

Aravin on 27 Aug 2012
Hi Walter,
Let's think of it like this. I now have a matrix I that contains the indices. How these indices were obtained is beyond the scope of my question.
In that matrix, the first 10 indices in column one should be between 1 and 10, in column two they should be between 11 and 20, and so on.
I want to compute the precision on those. For column 1, I must first find the numbers 1-10 in column 1 and compute the precision at those positions. As shown in the example in the main question, the numbers (indices) 1-10 are all in the top 10 positions, so the precision is 1 throughout. But in the second column, the numbers 11-20 are not all in the first 10 positions. Row 9 of column two holds no correct index, so the next precision value is computed at row 10, where it should be 9/10 because the 9th number between 11 and 20 appears at position 10.
For row 11, the precision is 10/11.
That is all I want to compute in a few lines. I can do it with loops, but I want a smart solution.
Walter Roberson on 27 Aug 2012 (edited 27 Aug 2012)
Hint: repmat() and mod()
Hi Walter,
I tried but could not follow the hint. Could you please make it easier to understand?
I tried this:
b = 10:10:100;
b = repmat(b, 100, 1);
I < b;
but I could not proceed further. :-(
Walter Roberson on 27 Aug 2012 (edited 27 Aug 2012)
Actually, kron() is easier than repmat() for this purpose.
kron(1:10,ones(10,10))
Also, I should have said fix() instead of mod()
Aravin on 27 Aug 2012 (edited 27 Aug 2012)
Sorry, I could not follow your hint. :-(
Walter, could you explain a little more, please?
Walter Roberson on 27 Aug 2012 (edited 27 Aug 2012)
fix((R - 1)/10)+1 == kron(1:10,ones(10,10))
Thanks Walter,
but I get an error because the kron dimensions are not consistent. So I did the following instead:
fix((R - 1)/10)+1 == kron(1:10,ones(100,10));
but I am not getting the required answer. I hope I explained my problem clearly enough.
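One possible reading of the hint, offered as a sketch under assumptions rather than as what Walter intended: the comparison is applied to the sorted index matrix I (not to the scores in R), and the right-hand side must have the same 100 x 10 size as I. kron(1:10,ones(10,10)) is 10 x 100 and kron(1:10,ones(100,10)) is 100 x 100, so neither matches; repmat(1:10,100,1), or equivalently kron(1:10,ones(100,1)), does. R below is a random placeholder.

```matlab
R = rand(100, 10);                          % placeholder similarity scores
[~, I] = sort(R, 1, 'descend');             % I(k,j): gallery index at rank k for query j

group  = fix((I - 1) / 10) + 1;             % gallery index -> group number 1..10
target = repmat(1:10, 100, 1);              % column j holds j everywhere
hit    = (group == target);                 % 100 x 10 logical, true at correct matches

found = cumsum(hit, 1);                     % correct matches up to each rank
prec  = found ./ repmat((1:100)', 1, 10);   % precision at every rank

% Keep precision only at ranks where a correct match occurs (the recall steps).
% Each column has exactly 10 hits, so column-major indexing reshapes cleanly.
P = reshape(prec(hit), 10, 10);             % P(k,j): precision at recall k/10, query j
```

With this layout, P has the same shape and meaning as the matrix produced by the loop in the accepted answer, so mean(P,2) again gives the average precision at each recall level.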


Asked: 25 Aug 2012
