Outerjoin: resulting table has repeated rows and has erased some data
28 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
Hi everyone!
I have two tables: table 1, and table 2 (sorry it's showing up as table 11 in the pictures). They each have one column in common - it's called "a". They are of different lengths: table 1 is a 96514x2 double, and table 2 is a 95646x2 double.
I want to combine these two tables. Any times they have in common, I'd like them to share the data and put it into one row, and any times they don't have in common, I'd like the table that doesn't have that time to just be NA.
I have used the function outerjoin. however, when I use this, I get the resulting table T, which is a 459,134x3 double. As you can see from the attached file, many of these are repeats, AND it's smaller than the parent two tables, so I lose a lot of data.
I don't know why this is happening, or what I should do instead.
and then finally, table T
here is my code:
%this is me making my tables (which are originally an array, haha, sorry) into actual tables
table1=array2table(table1)
table1.Properties.VariableNames(1:2)={'a', 'b'}
table2=array2table(table2)
table2.Properties.VariableNames(1:2)={'a','c'}
%now me using the outerjoin function
T=outerjoin(table1, table2, 'MergeKeys', true)
thank you for your help!!
2 comentarios
Dyuman Joshi
el 2 de Nov. de 2023
"I don't know why this is happening"
Because the values are not the same, though they might appear to be so.
Can you share the data you are working with? If they are too big to attach, a sample of it would be good enough.
Use the paperclip button to attach.
Respuestas (2)
Vatsal
el 17 de Nov. de 2023
Hi,
The “outerjoin” function in MATLAB performs an outer join operation on two tables based on one or more key variables. It includes all rows from both tables and fills the missing values with NaN (Not-a-Number) for the non-matching rows. However, it seems that the resulting table ‘T’ is not as expected.
One possible reason for this behaviour is that the key variable "a" might contain duplicate values in both tables. When performing an outer join, the duplicate values can cause the resulting table to have repeated rows.
To address this issue, you can try the following steps:
- Remove any duplicate rows from both tables using the unique function. This ensures that each key value appears only once in each table.
- Perform the outer join on the cleaned tables using the ‘outerjoin‘ function.
To learn more about the usage and syntax of ‘outerjoin’, you may refer to the MathWorks documentation links below:
I hope this helps!
Steven Lord
el 27 de Nov. de 2023
Rather than storing your data as tables with one variable containing serial date numbers, consider converting those serial date numbers to datetime and converting your tables to timetables using table2timetable. If you do that you could synchronize the timetables (which would allow you to change the time basis, synchronize them to times that appear in either the intersection or union of the times, etc.)
0 comentarios
Ver también
Categorías
Más información sobre Tables en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!