How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

Question

Leon el 19 de Oct. de 2021

0
Enlazar

Enlace directo a esta pregunta

https://la.mathworks.com/matlabcentral/answers/1567308-how-do-i-identify-unique-rows-based-on-multiple-columns-and-calculate-the-average-of-the-rest-of-the

Comentada: Leon el 19 de Oct. de 2021

I have a matrix as below:

A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7];

My goal is to identify rows with both identical Column 1 and identical Column 3 values, and then calculate the average for the rest of the columns, i.e., Column 2 and Column 4, within these duplicate rows. In this example, the duplicate rows would be Rows # 1, 3, and 5. My ending matrix would be:

B = [1 5 3 7; 4 5 6 9; 2 6 9 3];

This is a much simplified example. In reality, I have 35 columns that need to be averaged, and millions of rows. What is the most efficient way of handling this? Do I have to write a loop and process each of the unique rows individually?

Many thanks!

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

Kevin Holly el 19 de Oct. de 2021

1
Enlazar

Enlace directo a esta respuesta

https://la.mathworks.com/matlabcentral/answers/1567308-how-do-i-identify-unique-rows-based-on-multiple-columns-and-calculate-the-average-of-the-rest-of-the#answer_812193

Editada: Kevin Holly el 19 de Oct. de 2021

Abrir en MATLAB Online

A = [1 4 3 8; 4 5 6 9; 1 6 3 6; 2 6 9 3; 1 5 3 7]
A = 5×4
   4     3     8
   5     6     9
   6     3     6
   6     9     3
   5     3     7

I am going to assume that any row that columns 1 and 3 are identical, irregardless of what pair, you want to ignore those rows in the averaging of other columns.

Here is my approach:

t = table(A(:,1),A(:,3))
t = 5×2 table
    Var1    Var2
    ____    ____

     1       3  
     4       6  
     1       3  
     2       9  
     1       3  
[C, ia, ic] = unique(t,'rows')
C = 3×2 table
    Var1    Var2
    ____    ____

     1       3  
     2       9  
     4       6  
ia = 3×1
     1
     4
     2
ic = 5×1
     1
     3
     1
     2
     1
ic==1
ans = 5×1 logical array
   1
   0
   1
   0
   1
A(ic==1,:)
ans = 3×4
     1     4     3     8
     1     6     3     6
     1     5     3     7
mean(A(ic==1,:))
ans = 1×4
     1     5     3     7
B = [mean(A(ic==1,:));A(ic~=1,:)]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

Your answer:

B = [1 5 3 7; 4 5 6 9; 2 6 9 3]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

Without showing work:

t = table(A(:,1),A(:,3));
[~,~,ic] = unique(t,'rows');
B = [mean(A(ic==1,:));A(ic~=1,:)]
B = 3×4
     1     5     3     7
     4     5     6     9
     2     6     9     3

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Leon el 19 de Oct. de 2021

Many thanks for the solution!

Iniciar sesión para comentar.

How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuesta aceptada

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Más respuestas (0)

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

How do I identify unique rows based on multiple columns and calculate the average of the rest of the columns?

0 comentarios Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuesta aceptada

1 comentario Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Más respuestas (0)

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos