Reshape matrix adding columns of zeros at specific index

Hello,
I am a beginner in MATLAB and right now I am a bit stuck on one part of my code. The code imports a baseline .txt data file that contains variables from a combustion engine. The baseline is A = 4x178: the first row holds the variable names and the other three rows hold numbers.
After some time (over the project duration) the number of variables has increased (new variables were added), and now I have .txt files made up of B = 4x204, with 26 new variables.
Without going into the details of the script that exports the .txt file in the right format, what I have so far is a vector of indices showing which variables in B are not in A.
What I am trying to do now is reshape A to the same size as B, with columns of zeros at those specific indices.
I would be really grateful if someone could point me in the right direction.
Thank you very much in advance.
if Start_comparison==0
if columns_baseline<columns_logs
index_new=~ismember(Variables_Logs,Variables_Baseline);
Numeric_index=find(double(index_new));
end
end

1 comment

Guillaume
Guillaume on 22 Nov 2018
Edited: Guillaume on 22 Nov 2018
What are A and B: tables, cell arrays, something else? They can't be matrices.
Does Variables_Logs correspond to B and Variables_Baseline to A? What are the actual variable names for A and B?
Note that in your code, the conversion to double (in the find) is completely unnecessary. It would work just as well, and faster, if you left index_new as a logical array. The find is probably unnecessary as well.
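As a minimal sketch of that last point (the names and values below are made up for illustration, not taken from the original files), the logical mask returned by ismember can be used directly as a column index, with no double or find:

```matlab
% Made-up names and values, for illustration only
Variables_Baseline = {'Speed', 'Torque'};
Variables_Logs = {'Speed', 'Boost', 'Torque'};
Log_Points = [1000 1.2 50; 2000 1.5 80];

% Logical mask: true where a log variable is not in the baseline
index_new = ~ismember(Variables_Logs, Variables_Baseline);

% The mask indexes names and columns directly
new_names = Variables_Logs(index_new);
new_values = Log_Points(:, index_new);
```

find is only needed when the numeric positions themselves are required, for example to display them.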


Accepted Answer

The simplest way to do what you want is to import the files as tables and let MATLAB do all the work of figuring out the format and missing columns for you. readtable with the correct import options can do it all. Interestingly, testing with your files let me find some strange quirks of readtable that I'll be reporting shortly to MathWorks.
Note that your demo files have a tab character at the end of each line. That's not normal, and even in Excel it results in an extra column of data being imported. You ought to fix your export code from PUMA. I assume it's not written by AVL. Thankfully, we can tell MATLAB not to import that last empty column.
I'm not reproducing the UI part of your code; the rest can be replaced by:
%these three come from the UI
path = pwd;
Baseline_TXT = 'Baseline.txt';
Log_TXT = 'Logs.Txt';
%import code
opts = detectImportOptions(fullfile(path, Log_TXT)); %detect format of text file
opts.ExtraColumnsRule = 'ignore'; %to ignore extra tab at the end of lines. Otherwise, default rule of 'addVars' would add an extra variable
%default of opts.MissingRule is 'fill' which is exactly what we want. Missing columns will be filled by FillValue (NaN by default)
%We could tell matlab to import the 2nd line as units. However, there's a quirk where it doesn't work with missing variables. Not critical anyway
%opts.VariableUnitsLine = 2; %bug!
log = readtable(fullfile(path, Log_TXT), opts);
baseline = readtable(fullfile(path, Baseline_TXT), opts);
Since I'm using the import options of the log file to load the baseline, it uses all the variables defined in the log file and, where one is not present, fills it with NaN because of the default 'fill' value of MissingRule. As you can see, in just 4 lines everything is done.

6 comments

Hi again,
Thank you for the answer. The data is exported directly from the PUC AVL system, so I believe this was programmed by AVL. I will ask them about it at the next service we have scheduled.
The code you provided gives an error that says:
??? Undefined function or method 'detectImportOptions' for input arguments of type 'char'.
Is this option only compatible with certain versions of MATLAB? I am using 2007b. It is a bit old and maybe that is the reason.
In any case, I really appreciate the answer.
A bit old! Tables were introduced 5 years ago, and detectImportOptions in 2016. You're missing out on a lot of improvements to MATLAB, particularly with regard to importing data.
With R2007b, you're indeed stuck with more or less the parsing you have. Possibly, you could simplify it with xlsread on a system with Excel, at the expense of speed. Reconciling the columns between the two files still wouldn't be complicated; you'd use setdiff and co.
Unfortunately, now that Star has deleted his answer with all your comments, I can't remember how you have the data stored. Can you repost your original description as a comment to the question?
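For reference, the setdiff-style reconciliation mentioned above might look something like the sketch below. The variable names are hypothetical, and only the basic setdiff and intersect call forms are used, which return their results in sorted order:

```matlab
% Hypothetical variable names, for illustration only
baseline_vars = {'Speed', 'Torque', 'Lambda'};
log_vars = {'Speed', 'Boost', 'Torque', 'Lambda', 'EGT'};

% Variables present in the log but not in the baseline (sorted alphabetically)
new_vars = setdiff(log_vars, baseline_vars);

% Variables common to both lists, with their positions in each list:
% common equals baseline_vars(ib) and also log_vars(il)
[common, ib, il] = intersect(baseline_vars, log_vars);
```

The index outputs ib and il give the column mapping needed to copy baseline values into a matrix laid out like the log.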
Hi again,
Yes, I know. It is what my company had licences available for, as they do not use MATLAB a lot. I need to provide something useful to request a newer version :(.
Thank you very much for your help.
This is how I did it:
%================================ Script==================================%
% This part of the script is used to import baseline data from ASCII file
% created by AVL Puma during Daily_Check.
% This provides three cell arrays to work with and export to excel:
% Name--> Daily_Check name.
% Variables_Names--> Provides the variable names. Not include Units as
% they are not required to the objective of this analysis.
% Log_Points--> Provides the values of the variables.
%======================= Selection of log type ===========================%
answer = questdlg('Is it baseline log?','Log type','Yes','No, select log point','Cancel','Cancel');
switch answer
case 'Yes'
process=1;
case 'No, select log point'
process=0;
case 'Cancel'
process=2;
end
%=========================================================================%
%================ Importing data for baseline logs =======================%
if process==1
[Baseline_TXT,path] = uigetfile('*.txt');
DELIMITER = '\t';
HEADERLINES = 1;
rawData1 = importdata(Baseline_TXT, DELIMITER, HEADERLINES);
[unused,name1] = fileparts(Baseline_TXT);
newData1.(genvarname(name1)) = rawData1;
delimiter = '\t';
startRow = 3;
columns_baseline=size(rawData1,2);
formatSpec=repmat('%s',1,columns_baseline);
fileID = fopen(Baseline_TXT,'r');
dataArray1 = textscan(fileID, formatSpec,'delimiter',delimiter, 'MultipleDelimsAsOne', true, 'HeaderLines' ,startRow-1);
fclose(fileID);
Numeric_Baseline = [dataArray1{:}];
Numeric_Baseline2=str2double(Numeric_Baseline);
Variables_Baseline=cellstr(rawData1);
Baseline_Points=Numeric_Baseline2;
elseif process==0
%=========================================================================%
%===================== Importing data for log points =====================%
[Log_TXT,path] = uigetfile('*.txt');
DELIMITER = '\t';
HEADERLINES = 1;
rawlogs = importdata(Log_TXT, DELIMITER, HEADERLINES);
[unused,name] = fileparts(Log_TXT);
newData1.(genvarname(name)) = rawlogs;
delimiter = '\t';
startRow = 3;
columns_logs=size(rawlogs,2);
formatSpec=repmat('%s',1,columns_logs);
fileID = fopen(Log_TXT,'r');
dataArray = textscan(fileID, formatSpec,'delimiter',delimiter, 'MultipleDelimsAsOne', true, 'HeaderLines' ,startRow-1);
fclose(fileID);
Numeric_Logs = [dataArray{:}];
Numeric_Logs2=str2double(Numeric_Logs);
Variables_Logs=cellstr(rawlogs);
Log_Points=Numeric_Logs2;
Start_comparison=0;
%=========================================================================%
elseif process==2
disp('Run the script again selecting the right option');
end
%============== Check number of variables (to be continued) ==============%
if Start_comparison==0
if columns_baseline<columns_logs
index_new=~ismember(Variables_Logs,Variables_Baseline);
Numeric_index=find(double(index_new));
elseif columns_baseline>columns_logs
end
end
"I need to provide something useful to request a newer version :(."
4 lines of robust code vs lots of lines of fragile code. Isn't that useful? 11 years' worth of improvements and bug fixes, isn't that useful?
Your import code is not very robust, and it is inefficient, reading each file twice (once with importdata, then with textscan). Here is a better version that should work in R2007b. Note that whenever you're repeating the same lines of code, you should extract those lines into a function.
function [variables, units, values] = parselog(logpath)
%PARSELOG Parse AVL PUMA logs.
% The logs are text files consisting of 2 lines of headers (variables, units) followed by lines of values. Variables, units and values are separated by tabs.
% The logs may contain extra tabs at the end of each row. These are ignored. Some variables may also contain invalid values such as ** which are converted to NaN.
% inputs:
% logpath: full path of the log file to open. char vector
% outputs:
% variables: the variables in the file. 1xN cell array of char vectors.
% units: the units of each variable. 1XN cell array of char vectors.
% values: the values for each variable. MxN matrix of double. Textual values are converted to NaN.
%
% (c) Guillaume de Sercey, 2018. Licensed under the BSD. This copyright must remain.
fid = fopen(logpath, 'rt');
assert(fid > 1, 'Parselog:OpenError', 'Failed to open %s for reading', logpath);
%oncleanup requires R2008a. We will be missing the ability to close the file on error
%cleanup = onCleanup(@() fclose(fid)); %ensure the file is closed even if the function errors
%R2007b doesn't have strsplit (requires R2013a) so use regexp instead
variables = regexp(fgetl(fid), '\t', 'split');
units = regexp(fgetl(fid), '\t', 'split');
values = textscan(fid, repmat('%s', 1, numel(variables)), 'Delimiter', '\t');
fclose(fid); %close the file explicitly since onCleanup is not used here
values = str2double([values{:}]);
%get rid of extra columns resulting from tabs at the end of the row
validcols = logical(cumprod(~cellfun('isempty', variables))); %extra tab result in empty columns at the end of variable. cumprod ensures we capture all the empty columns but only at the end
variables = variables(validcols);
units = units(validcols);
values = values(:, validcols);
end
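To see what the validcols line does, here is a small stand-alone sketch with a made-up header row. It assumes the only empty names are the ones produced by trailing tabs; note that the mask stops at the first empty name, so anything after it would be dropped:

```matlab
% Made-up header, as it would come out of regexp(..., '\t', 'split')
% for a line ending in two extra tabs
variables = {'Speed', 'Torque', 'Lambda', '', ''};

% 1 where a name is present, 0 where it is empty
nonempty = ~cellfun('isempty', variables);

% cumprod stays at 1 until the first 0, then is 0 for the rest of the row
validcols = logical(cumprod(nonempty));

% Keep only the leading run of named columns
variables = variables(validcols);
```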
Now to reconcile the two logs:
%these three come from the UI
path = pwd;
Baseline_TXT = 'Baseline.txt';
Log_TXT = 'Logs.Txt';
%import
[baseline_vars, baseline_units, baseline_values] = parselog(fullfile(path, Baseline_TXT));
[log_vars, log_units, log_values] = parselog(fullfile(path, Log_TXT));
%adding missing variables to baseline, filled with NaN.
%It is assumed that all variables in baseline are present in log
baselinefilled_vars = log_vars;
baselinefilled_units = log_units;
baselinefilled_values = nan(size(baseline_values, 1), numel(baselinefilled_vars));
[found, where] = ismember(baseline_vars, baselinefilled_vars);
assert(all(found), 'Some variables in baseline are not in log')
baselinefilled_values(:, where) = baseline_values;
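The question asked for columns of zeros rather than NaN. As a small worked sketch of the same ismember mapping, with made-up names and values (not the real log contents), preallocating with zeros instead of nan gives that layout:

```matlab
% Made-up data, for illustration only
baseline_vars = {'Speed', 'Torque', 'Lambda'};
baseline_values = [1000 50 1.0; 2000 80 0.9];
log_vars = {'Speed', 'Boost', 'Torque', 'Lambda', 'EGT'};

% Start from zeros, one column per log variable
filled = zeros(size(baseline_values, 1), numel(log_vars));

% where(k) is the column of log_vars that holds baseline_vars{k}
[found, where] = ismember(baseline_vars, log_vars);
assert(all(found), 'Some variables in baseline are not in log');

% Copy each baseline column into its matching position
filled(:, where) = baseline_values;
```

Here the Boost and EGT columns stay at zero. NaN is arguably the safer filler, since a zero can be mistaken for a real measurement.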
Santos Romero
Santos Romero on 24 Nov 2018
Edited: Santos Romero on 24 Nov 2018
Thank you very much for the function. I can tell that you do this every day. I only try when I really need to, and I cannot tell whether code is robust or fragile. I am happy if the code does what I am looking for, considering that the volume of data I need to process is not really high.
I totally agree that MATLAB has improved a lot over the years, but unfortunately in the company I work for Excel is mainly used.
I will try the code on Monday when I get to work and let you know. I will take the opportunity to learn some MATLAB from your code.
I really appreciate your help again.
Good morning,
I tried the code this morning and it works perfectly, achieving what I was looking for.
I am really grateful for your advice and help, and I have learnt from your code too!!!
Thanks again!!


Release: R2007b
Asked: 22 Nov 2018
Commented: 26 Nov 2018
