How do I merge/average several datasets (Time, "Y," "Z")?
2 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
Dear community,
I have a set of measured data (Time, Current, Voltage) and want to merge several measurments in one curve. As the data have different time lengths but specific segments, how can I merge the data since the specific segments have different lengths and time?
I want to merge the [TotTime, TotCurrent, TotVoltage] OR merge [TotTime, TotCurrent] and [TotTime, TotVoltage] separately but with same Time variable.
Please refer to the attached datasets.
Thanks in advance,
Kevin
0 comentarios
Respuestas (1)
Jaynik
el 18 de Abr. de 2024
Hi Kevin,
One method to merge the data is take a full outer join based on the 'time' column and take a mean for the same values of time.
The "outerjoin" function can be used to merge the tables as it combines tables based on a key variable ('Time'), and it includes all rows from each table, even if there are no matching rows in the other tables. The 'MergeKeys' option is set to true to indicate that the key column is not repeated in the merged table.
For each time point, the mean has been calculated for the current and voltage across the three datasets ignoring any 'NaN' values.
Following is the code for the same:
load NewCellTestData.mat
load NewCellTestData2.mat
load NewCellTestData3.mat
% Convert the arrays to tables
T1 = array2table([TestData.TotTime, TestData.TotCurrent, TestData.TotVoltage], 'VariableNames', {'Time', 'Current', 'Voltage'});
T2 = array2table([TestData2.TotTime, TestData2.TotCurrent, TestData2.TotVoltage], 'VariableNames', {'Time', 'Current', 'Voltage'});
T3 = array2table([TestData3.TotTime, TestData3.TotCurrent, TestData3.TotVoltage], 'VariableNames', {'Time', 'Current', 'Voltage'});
% Merge the tables using full outer join
mergedTable = outerjoin(T1, T2, 'Keys', 'Time', 'MergeKeys', true);
mergedTable = outerjoin(mergedTable, T3, 'Keys', 'Time', 'MergeKeys', true);
% Compute the mean current and voltage ignoring NaNs
mergedTable.MeanCurrent = mean([mergedTable.Current_T1, mergedTable.Current_T2, mergedTable.Current], 2, "omitnan");
mergedTable.MeanVoltage = mean([mergedTable.Voltage_T1, mergedTable.Voltage_T2, mergedTable.Voltage], 2, "omitnan");
% If MeanCurrent is NaN, choose Current_T1, if that is NaN, choose Current_T2, else choose Current_T3
mergedTable.MeanCurrent(isnan(mergedTable.MeanCurrent)) = mergedTable.Current_T1(isnan(mergedTable.MeanCurrent));
mergedTable.MeanCurrent(isnan(mergedTable.MeanCurrent)) = mergedTable.Current_T2(isnan(mergedTable.MeanCurrent));
mergedTable.MeanCurrent(isnan(mergedTable.MeanCurrent)) = mergedTable.Current(isnan(mergedTable.MeanCurrent));
% Same operation as above for MeanVoltage
mergedTable.MeanVoltage(isnan(mergedTable.MeanVoltage)) = mergedTable.Voltage_T1(isnan(mergedTable.MeanVoltage));
mergedTable.MeanVoltage(isnan(mergedTable.MeanVoltage)) = mergedTable.Voltage_T2(isnan(mergedTable.MeanVoltage));
mergedTable.MeanVoltage(isnan(mergedTable.MeanVoltage)) = mergedTable.Voltage(isnan(mergedTable.MeanVoltage));
% Remove the original current and voltage columns to obtain final 3 columns
mergedTable(:, {'Current_T1', 'Voltage_T1', 'Current_T2', 'Voltage_T2', 'Current', 'Voltage'}) = [];
Since data is broken and not consistent across the three dataset, this approach gives a lot of 'NaN' which are removed. You will need to clean the dataset for better results and use the appropriate statistics measure instead of mean based on your requirements.
You can read more about "outerjoin" here: https://www.mathworks.com/help/releases/R2021a/matlab/ref/outerjoin.html
I hope this helps!
0 comentarios
Ver también
Categorías
Más información sobre Cell Arrays en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!