Why won't the unique function eliminate duplicate rows in a timetable?

Nicholas Gaug on 23 Feb 2022
Commented: Kevin Johnson on 19 Apr 2022
I have a table containing a large data set with many duplicate times, and I am trying to remove them. The documentation says that when the "unique" function is used on a timetable, it takes the row times and the row values into account independently. However, when I use this function, it returns exactly the same timetable, even though there are hundreds of duplicate times. I used dateshift to round the times to the nearest second, but this didn't help. In my code, "Date_time" is the name of the first column of the table, which contains the datetime values.
RadarTable = readtable('RADAR_DATA.xlsx');
RadarTable.Date_time = dateshift(RadarTable.Date_time,'start','second','nearest');
RadarTimeTable = table2timetable(RadarTable);
RadarTableFiltered = unique(RadarTimeTable);

Answers (1)

David Hill on 23 Feb 2022
RadarTable = readtable('RADAR_DATA.xlsx');
RadarTimeTable = table2timetable(RadarTable);
% Keep the first row for each distinct row time.
% 'Date_time' is the time column named in the question; adjust if yours differs.
[~,idx] = unique(RadarTimeTable.Date_time);
RadarTimeTable = RadarTimeTable(idx,:);
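To illustrate why indexing on the row times behaves differently from calling unique on the whole timetable, here is a minimal sketch on made-up data. The variable name Range and the sample values are hypothetical and are not taken from RADAR_DATA.xlsx. It assumes the duplicate times carry different measurements, in which case unique sees the rows as distinct and keeps all of them.

```matlab
% Hypothetical sample: two rows share the same time but hold different values.
t = datetime(2022,2,23,12,0,0) + seconds([0; 0; 1]);
Range = [10; 11; 12];          % illustrative variable, not from the question
tt = timetable(t, Range);

% unique on the timetable compares row times AND row values,
% so the two 12:00:00 rows are kept because their Range values differ.
allRows = unique(tt);

% Comparing the row times alone keeps one row per distinct time
% (the first occurrence, which is unique's default).
[~, idx] = unique(tt.Properties.RowTimes);
firstPerTime = tt(idx, :);

fprintf('unique(tt): %d rows, by row time: %d rows\n', ...
    height(allRows), height(firstPerTime));
```

Using tt.Properties.RowTimes avoids having to know what the time column is called. Note that this keeps only the first row for each time and silently discards the others, which may or may not be what is wanted when the values differ.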
3 Comments
David Hill on 23 Feb 2022
If the answer is acceptable, please accept it to close out your question.
Kevin Johnson on 19 Apr 2022
%David, I have a similar problem and this did not work for me.
%The original timetable looks like this:
%tt=
19-Apr-2022 11:50:00 6.9388 6.9402 6.9354 6.9364 12.308 NaN
19-Apr-2022 12:00:00 6.9365 6.9373 6.9346 6.9361 12.299 NaN
19-Apr-2022 12:10:00 6.9361 6.9368 6.9344 6.935 11.226 NaN
% Let's say for some reason I download the same data again into ttagain and
% concatenate it with the original data, then attempt to remove the duplicates
% as follows:
tt=[tt;ttagain];
[~,idx]=unique(tt);
newtt=tt(idx,:);
%the results look like this:
%newtt=
19-Apr-2022 11:50:00 6.9388 6.9402 6.9354 6.9364 12.308 NaN
19-Apr-2022 11:50:00 6.9388 6.9402 6.9354 6.9364 12.308 NaN
19-Apr-2022 12:00:00 6.9365 6.9373 6.9346 6.9361 12.299 NaN
19-Apr-2022 12:00:00 6.9365 6.9373 6.9346 6.9361 12.299 NaN
19-Apr-2022 12:10:00 6.9361 6.9368 6.9329 6.9338 11.966 NaN
19-Apr-2022 12:10:00 6.9361 6.9368 6.9329 6.9338 11.966 NaN
%Duplicate rows are not eliminated. Why? What alternate approach might I use?
%Thanks,
%Kevin
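One possible cause, offered as an assumption rather than a confirmed diagnosis: the last variable in the data above is NaN in every row, and unique treats NaN values as distinct, so two otherwise identical rows may not compare as equal. The sketch below uses made-up variable names (Price, Extra) and only the first few values from the comment. It deduplicates on the row times alone, so the NaN column never enters the comparison.

```matlab
% Hypothetical data shaped like the comment above; Price and Extra are made-up names.
t = datetime(2022,4,19,11,50,0) + minutes([0; 10; 20]);
Price = [6.9388; 6.9365; 6.9361];
Extra = [NaN; NaN; NaN];
tt = timetable(t, Price, Extra);

% Simulate downloading the same data twice and concatenating it.
ttagain = tt;
both = [tt; ttagain];

% Whole-row comparison includes the NaN column; inspect how many rows survive.
nWholeRow = height(unique(both));

% Compare the row times only, so NaN values play no part in the comparison.
[~, idx] = unique(both.Properties.RowTimes);
newtt = both(idx, :);

fprintf('whole-row unique: %d rows, by row time: %d rows\n', ...
    nWholeRow, height(newtt));
```

This is only appropriate when rows sharing a timestamp are known to be true duplicates, since any differing values at the same time are dropped.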

Release

R2020b
