Skip until import data

vb on 9 Aug 2012
I have some questions about importing data. Here is an example of the data file to import:
!!!!!!
! text text
! stuff
0.1 2.53 2.5
0.2 2.59 2.43
0.3 2.5 2.54
0.4 2.48 2.53
0.5 2.52 2.48
1
ABC 0.123 123
DE
0.456 0.456 456
0.1 2.56 2.34 2.63
0.2 2.61 2.48 2.43
0.3 2.54 2.51 2.6
0.4 2.57 2.54 2.49
0.5 2.48 2.63 2.5
Here is the code I'm using to import this data:
Test=fopen('TestData.txt'); % open the file
for n=1
    mystruct(n).Header1 = fgetl(Test); % line 1 goes to Header1
    fgetl(Test); % skip line
    mystruct(n).Header2 = fgetl(Test);
    fgetl(Test);
    mystruct(n).Header3 = fgetl(Test);
    mystruct(n).meas = fscanf(Test, '%f', [3, 5])';
end
for n=2
    for j=1:6 % skips to the 6th line
        fgetl(Test);
    end
    mystruct(n).T = fscanf(Test, '%f', 1); % call out value for T
    for j=1:2 % skips 2 empty lines
        fgetl(Test);
    end
    mystruct(n).meas = fscanf(Test, '%f', [4, 5])';
end
fclose(Test); % close the file
I want to preserve the headers at the top of the file. I don't necessarily care about the mid-file headers, with the exception of my T value. My question: how can I import this so that it allows for a variable number of header lines at the top and in the middle of the file, without having to look through each data file by hand? This would be helpful since I have multiple data files with varying contents (mainly the headers). I think I need something like "skip until" that also skips empty lines and still allows individual treatment of the matrices, as I have it now. Any help is much appreciated, thanks!

Accepted Answer

per isakson on 9 Aug 2012
Edited: per isakson on 12 Aug 2012
I have deleted a sketchy outline, which was not helpful.
--- Working code ---
Purpose:
  1. learn to use Matlab
Approach:
  1. the data file consists of consecutive blocks of headers and data
  2. a data block is a number of consecutive rows containing an equal number of "numerical strings"
  3. a header block is a number of consecutive rows, which do not belong to a data block
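The three rules above can be sketched with a plain loop, as an alternative view of the vectorized code further down. This is only a minimal illustration under stated assumptions: the names txt, ncols and blocks are made up for the example, rows are classified with str2double rather than with the regular expressions used in cssm, and the file is assumed to be already read into a cell array of lines.

```matlab
% Toy input: one cell per line of the file (hypothetical content)
txt = { '! text'; ''; '0.1 2.53 2.5'; '0.2 2.59 2.43'; '1'; ...
        'ABC 0.123 123'; '0.1 2.56 2.34 2.63'; '0.2 2.61 2.48 2.43' };

% Step 1: number of numeric tokens per row; NaN marks a header row
ncols = nan( numel(txt), 1 );
for k = 1 : numel(txt)
    tokens = regexp( strtrim( txt{k} ), '\s+', 'split' );
    values = str2double( tokens );
    if ~isempty( values ) && all( ~isnan( values ) )
        ncols(k) = numel( values );
    end
end

% Step 2: collect runs of at least 2 consecutive rows with equal column count
blocks = zeros( 0, 2 );                 % each row: [first_line, last_line]
k = 1;
while k <= numel(ncols)
    if isnan( ncols(k) )
        k = k + 1;                      % header row, move on
        continue
    end
    last = k;
    while last < numel(ncols) && ncols(last+1) == ncols(k)
        last = last + 1;                % extend the run
    end
    if last - k + 1 >= 2                % same minimum as getblocks( ..., 2 )
        blocks(end+1, :) = [k, last];
    end
    k = last + 1;
end
```

For this toy input, rows 3 to 4 and rows 7 to 8 are reported as data blocks. The lone '1' is numeric but forms a run of only one row, so it stays with the headers.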
Implementation:
  1. cssm, main function
  2. getblocks, subfunction
Hopefully, the code works with more data files than the example above, cssm.txt
Example:
>> [ header_blocks, data_blocks ] = cssm()
header_blocks =
{6x1 cell}
{4x1 cell}
data_blocks =
[5x3 double]
[5x4 double]
.
Left as exercise:
  1. understand the code
  2. write comments
====
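function [ header_blocks, data_blocks ] = cssm()
    % Read the whole file into a cell array, one cell per line
    fid = fopen( 'cssm.txt' );
    cac = textscan( fid, '%s', 'Whitespace','', 'Delimiter','\n' );
    fclose( fid );
    % Count the decimal numbers on each line; integers without a decimal
    % point (e.g. '1' or '123') are not counted
    number_of_floats = cellfun( @(c) size(c,2) ...
        , regexp( cac{:}, '[+|-]?\d*\.\d+', 'match' ) ...
        , 'uni', true );
    % Intended to count anything on the line that is neither a number nor a space
    number_of_stuff = cellfun( @(c) size(c,2) ...
        , regexp( cac{:}, '[^([+|-]?\d*\.\d+) ]', 'match' ) ...
        , 'uni', true );
    % A data row holds at least one float and nothing else
    is_data = ( number_of_floats >= 1 & number_of_stuff == 0 );
    % Column count per row; NaN marks header rows so that they never form a run
    number_of_data_columns = number_of_floats;
    number_of_data_columns( not(is_data) ) = nan;
    % First and last row of every run of >= 2 rows with the same column count
    [ ~, ix1, ix2 ] = getblocks( number_of_data_columns, 2 );
    % Convert each run of rows to a numeric matrix
    data_blocks = cell(0);
    for ii = 1 : numel( ix1 )
        data_blocks = cat( 1, data_blocks ...
            , {str2num( char(cac{1}{ix1(ii):ix2(ii)}))} );
    end
    % Header block ii holds the rows between data block ii-1 and data block ii;
    % rows after the last data block are not returned
    ix3 = cat( 2, 1, ix2+1 );
    ix4 = cat( 2, ix1-1, size( cac{1}, 1 ) );
    header_blocks = cell(0);
    for ii = 1 : numel( ix1 )
        header_blocks = cat( 1, header_blocks ...
            , {cac{:}(ix3(ii):ix4(ii))} );
    end
end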
====
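function [ col, ix1, ix2 ] = getblocks( sequence, min_nrows )
    % Find runs of consecutive equal values in sequence.
    % ix1 and ix2 are the first and last index of each run, col the repeated value.
    % Pad with NaN so that runs touching either end are detected
    seq = cat( 2, nan, transpose( sequence(:) ), nan );
    % +1 where a run starts, -1 where it ends (NaN never equals NaN)
    change = diff( double( diff( seq ) == 0 ) );
    ix1 = strfind( change, +1 );
    ix2 = strfind( change, -1 );
    col = sequence( ix1 );
    if min_nrows >= 2
        % Keep only runs of at least min_nrows rows
        isg = ix2-ix1+1 >= min_nrows;
        col = col( isg );
        ix1 = ix1( isg );
        ix2 = ix2( isg );
    else
        % Also return the rows that belong to no run, then restore row order
        ix_sngl = find( not( logical( cumsum( change ) ...
            + double( change==-1 ) ) ) );
        ix1 = cat( 2, ix1, ix_sngl );
        ix2 = cat( 2, ix2, ix_sngl );
        col = cat( 2, col, sequence( ix_sngl ) );
        [~,ix] = sort( ix1 );
        ix1 = ix1( ix );
        ix2 = ix2( ix );
        col = col( ix );
    end
end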
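The question asked specifically for the T value from the mid-file header. A possible way to pull it out of the second header block is sketched below. It assumes that T is the first header line consisting of a single number, which holds for the example file but may not hold for other files. The variable names header_block and T are chosen for the illustration, and the cell array is typed in by hand here instead of being taken from cssm().

```matlab
% Second header block, copied from the example output (stand-in for header_blocks{2})
header_block = { '1'; 'ABC 0.123 123'; ' DE'; ' 0.456 0.456 456' };

T = nan;                                % stays NaN if no lone number is found
for k = 1 : numel( header_block )
    % str2double returns NaN unless the whole line is one scalar number
    value = str2double( strtrim( header_block{k} ) );
    if ~isnan( value )
        T = value;
        break
    end
end
```

With real data the loop would run on header_blocks{2} as returned by cssm(). Lines with several numbers, such as ' 0.456 0.456 456', are skipped because str2double does not convert them to a scalar.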
  15 Comments
vb on 14 Aug 2012
I think I have something that works! Thank you! I had to comment out number_of_data ( not(is_data) ) = nan; and data_blocks returned the two matrices. Not totally sure where the discrepancy is since it worked as is for you. Thanks again!
per isakson on 14 Aug 2012
Edited: per isakson on 14 Aug 2012
I cannot guess what problems you see. However, here is what I get when I run the code above:
with "% number_of_data ( not(is_data) ) = nan;" commented out
>> [ header_blocks, data_blocks ] = cssm()
header_blocks =
{0x1 cell}
{0x1 cell}
{4x1 cell}
data_blocks =
[]
[5x3 double]
[5x4 double]
>> header_blocks{:}
ans =
Empty cell array: 0-by-1
ans =
Empty cell array: 0-by-1
ans =
'1'
'ABC 0.123 123'
' DE'
' 0.456 0.456 456'
>> data_blocks{:}
ans =
[]
ans =
0.1000 2.5300 2.5000
0.2000 2.5900 2.4300
0.3000 2.5000 2.5400
0.4000 2.4800 2.5300
0.5000 2.5200 2.4800
ans =
0.1000 2.5600 2.3400 2.6300
0.2000 2.6100 2.4800 2.4300
0.3000 2.5400 2.5100 2.6000
0.4000 2.5700 2.5400 2.4900
0.5000 2.4800 2.6300 2.5000
>>
.
with "number_of_data ( not(is_data) ) = nan;" in place
>> [ header_blocks, data_blocks ] = cssm()
header_blocks =
{6x1 cell}
{4x1 cell}
data_blocks =
[5x3 double]
[5x4 double]
>> header_blocks{:}
ans =
'!!!!!!'
''
'! text text'
''
'! stuff'
''
ans =
'1'
'ABC 0.123 123'
' DE'
' 0.456 0.456 456'
>> data_blocks{:}
ans =
0.1000 2.5300 2.5000
0.2000 2.5900 2.4300
0.3000 2.5000 2.5400
0.4000 2.4800 2.5300
0.5000 2.5200 2.4800
ans =
0.1000 2.5600 2.3400 2.6300
0.2000 2.6100 2.4800 2.4300
0.3000 2.5400 2.5100 2.6000
0.4000 2.5700 2.5400 2.4900
0.5000 2.4800 2.6300 2.5000
>>
.
Comment
In the text file there should be an empty line between "ABC..." and " DE". Adding that blank line doesn't cause any problems. I get
...
ans =
'1'
'ABC 0.123 123'
''
' DE'
' 0.456 0.456 456'
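One plausible reading of the two runs above, offered as an interpretation rather than something stated in the thread: without the line number_of_data_columns( not(is_data) ) = nan; the six top header rows all keep a float count of 0. getblocks then sees six equal values in a row and reports them as a "data block" with no numbers in it, which would explain the empty matrix and the empty header cells. With NaN in place, header rows can never form a run, because NaN is not equal to NaN. A small check of that comparison, with made-up variable names:

```matlab
counts_zero = [0 0 0 0 0 0];            % six header rows, each with zero floats
counts_nan  = nan( 1, 6 );              % the same rows after the NaN assignment

% getblocks looks for neighbours whose difference is exactly zero
runs_zero = diff( counts_zero ) == 0;   % all true: looks like one 6-row block
runs_nan  = diff( counts_nan )  == 0;   % all false: NaN - NaN is NaN, not 0
```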
