- the first column consists of exactly four characters (which may be spaces).
- the date vectors always start with asterisks, but no other lines do.
- no empty lines between the date vectors and the data matrices.
- the matrices contain numeric data only.
Using regexp for space-delimited strings in a text file
sermet
on 26 Jan 2016
I need to extract the lines that start with a repeated string from the attached text file. For example, the data file contains 2 lines that start with the string "P 1" (two spaces after P). I need to extract the 2nd to 4th columns of these lines, as follows:
array_P1=[ 6444.951599 -24080.372159 -8934.980576; 6645.371003 -22892.293251 -11497.619680];
I use the following code (from Stephen Cobeldick) when there is no space in the repeated string (for example P1):
fid = fopen('data_file.txt','rt');
str = fscanf(fid,'%c',Inf);
fclose(fid);
C = regexp(str,'^P1( +\S+)+\s+$','lineanchors','tokens');
C = regexp(vertcat(C{:}),'\S+','match');
N = str2double(vertcat(C{:}));
But this doesn't work if there are spaces in the repeated string, as in my example (P 1).
Accepted Answer
Stephen23
on 26 Jan 2016
Edited: Stephen23
on 26 Jan 2016
Try this:
% textscan options:
opt = {'MultipleDelimsAsOne',true,'CollectOutput',true};
% required arrays:
str = 'X';
dtv = [];
dat = {};
% open textfile:
fid = fopen('data.txt','rt');
while ischar(str)
    % skip lines until first char is '*' (date vector):
    while ~strcmp(str(1),'*')
        str = fgetl(fid);
    end
    % convert date vector to numeric:
    dtv(end+1,:) = str2double(regexp(str(2:end),'\S+','match')); %#ok<SAGROW>
    % get file position:
    pos = ftell(fid);
    % read first line of matrix:
    str = fgetl(fid);
    if ischar(str)
        % calculate how many columns in the matrix:
        N = numel(regexp(str(5:end),'\S+','match'));
        fmt = repmat('%f',1,N);
        % rewind one line:
        fseek(fid,pos,'bof');
        % read entire matrix:
        dat{end+1} = textscan(fid,['%4[^*]',fmt],opt{:}); %#ok<SAGROW>
    end
end
% close textfile:
fclose(fid);
% concatenate data in cell arrays:
dat = vertcat(dat{:});
mat = vertcat(dat{:,2});
This reads each data matrix (the lines between two date vectors) into a numeric matrix inside the cell array dat, and stores the date vectors in dtv. It automatically adjusts for the different numbers of columns in your matrices. It relies on some important assumptions about the file format, which are listed at the top of this page.
Have a look inside dat, and pick the data that you need:
>> cell2mat(cellfun(@(m)m(1,[1,2,3]),dat(:,2),'UniformOutput',false))
ans =
1.0e+04 *
0.6445 -2.4080 -0.8935
0.6645 -2.2892 -1.1498
I also concatenated the matrices into mat, which gives you all of the matrices in one. This might be easier to access:
>> mat([1,10],[1,2,3])
ans =
1.0e+04 *
0.6445 -2.4080 -0.8935
0.6645 -2.2892 -1.1498
I tested this code on both of the files that you have provided (this question, and your last question).
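For anyone wondering how the loop adapts to a varying number of columns, the column-counting step can be tried on its own. This is only a sketch: line1 is a hypothetical stand-in for the first line of a matrix, built from the values quoted in the question, and is not taken from the actual file.

```matlab
% Hypothetical first line of a data matrix (values from the question).
line1 = 'P  1   6444.951599 -24080.372159  -8934.980576';

% The first four characters are the row label, so count the
% whitespace-separated fields from character 5 onwards.
N = numel(regexp(line1(5:end), '\S+', 'match'));

% Build the textscan format: a 4-character label field that must not
% contain '*', followed by one %f per numeric column.
fmt = ['%4[^*]', repmat('%f', 1, N)];

disp(N)
disp(fmt)
```

Because the label field is defined as [^*], textscan should stop reading as soon as it reaches the next date vector, which is what lets the outer loop pick up from there.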
More Answers (1)
Guillaume
on 26 Jan 2016
This regex should work for you:
'^P\s*1( +\S+)+\s+$'
It simply adds 0 or more (the *) whitespace characters (the \s) between P and 1.
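As a rough illustration of the idea, here is a small self-contained sketch. Several things in it are my assumptions and not part of the original thread: the string str is a made-up stand-in for the file contents (only the "P 1" values come from the question), the pattern ends in \s* rather than \s+ so that lines without trailing whitespace also match, and it uses 'match' instead of 'tokens' so that whole lines are returned.

```matlab
% Made-up stand-in for the file contents; only the "P  1" values are
% taken from the question, everything else is hypothetical.
str = sprintf('%s\n', ...
    '*  2016  1 26  0  0  0.00000000', ...
    'P  1   6444.951599 -24080.372159  -8934.980576', ...
    'P  2   1111.000000   2222.000000   3333.000000', ...
    '*  2016  1 26  0 15  0.00000000', ...
    'P  1   6645.371003 -22892.293251 -11497.619680', ...
    'P  2   4444.000000   5555.000000   6666.000000');

% Variant of the pattern: optional whitespace between P and 1, and
% optional (not mandatory) trailing whitespace before the line end.
pat  = '^P\s*1( +\S+)+\s*$';
rows = regexp(str, pat, 'match', 'lineanchors');

% Split every matched line into whitespace-separated fields.
fields = regexp(rows, '\S+', 'match');
fields = vertcat(fields{:});   % needs the same field count on each line

% With a space inside the label, the first two fields are 'P' and '1',
% so the numeric columns start at the third field.
array_P1 = str2double(fields(:, 3:end));
disp(array_P1)
```

If the label in the real file had no space (P1), it would occupy a single field and the numeric columns would start at the second field instead.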