Skip until import data

Question

vb on 9 Aug 2012

https://la.mathworks.com/matlabcentral/answers/45693-skip-until-import-data

I have some questions about importing data. Here is an example of the data file to import:

!!!!!!
! text text
! stuff
0.1      2.53  2.5
0.2  2.59  2.43
0.3  2.5  2.54
0.4  2.48  2.53
0.5  2.52  2.48
1
ABC 0.123 123 
   DE
    0.456 0.456 456
0.1  2.56  2.34  2.63
0.2  2.61  2.48  2.43
0.3  2.54  2.51  2.6
0.4  2.57  2.54  2.49
0.5  2.48  2.63  2.5

Here is the code I'm using to import this data:

Test=fopen('TestData.txt'); % open the file
for n=1
    mystruct(n).Header1 = fgetl(Test); % line 1 goes to Header1
    fgetl(Test); % skip line
    mystruct(n).Header2 = fgetl(Test);
    fgetl(Test);
    mystruct(n).Header3 = fgetl(Test);
    mystruct(n).meas = fscanf(Test, '%f', [3, 5])';
end
for n=2
    for j=1:6  % skips to the 6th line
        fgetl(Test);
    end
    mystruct(n).T = fscanf(Test, '%f', 1); % call out value for T
    for j=1:2  % skips 2 empty lines
        fgetl(Test);
    end
    mystruct(n).meas = fscanf(Test, '%f', [4, 5])';
end
fclose(Test); % Close the file

I want to preserve the headers at the top. I don't necessarily care about the mid-file headers, with the exception of my T value. How can I import this file so that the code allows for a variable number of header lines at the top and in the middle of the file, without my having to look through each data file? This would be helpful since I have multiple data files with varying contents (mainly the headers). I think I need something like a "skip until" that also skips empty lines and still allows individual treatment of the matrices, as I have it now. Any help is much appreciated, thanks!

Answer 1

per isakson on 9 Aug 2012

https://la.mathworks.com/matlabcentral/answers/45693-skip-until-import-data#answer_55872

Edited: per isakson on 12 Aug 2012

I have deleted a sketchy outline, which was not helpful.

--- Working code ---

Purpose:

learn to use Matlab

Approach:

the data file consists of consecutive blocks of headers and data
a data block is a number of consecutive rows containing an equal number of "numerical strings"
a header block is a number of consecutive rows, which do not belong to a data block
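The classification step of this approach can be sketched on a few in-memory rows. This is only an illustration, not the implementation given further down: it assumes a row counts as data when every whitespace-separated token converts with str2double, which is a stricter test than the regular expressions used in cssm. The sample rows and the variable names (rows, n_cols) are made up for the sketch.

```matlab
% Hypothetical sample rows resembling the file in the question
rows   = { '! stuff'; ''; '0.1  2.53  2.5'; '0.2  2.59  2.43'; 'ABC 0.123 123' };

n_cols = nan( size( rows ) );                 % NaN marks a header row
for k = 1 : numel( rows )
    txt = strtrim( rows{k} );
    if isempty( txt ), continue, end          % blank rows stay NaN (header)
    values = str2double( strsplit( txt ) );   % NaN for each non-numeric token
    if ~any( isnan( values ) )
        n_cols(k) = numel( values );          % data row: store its column count
    end
end
disp( n_cols.' )
```

Rows left as NaN belong to header blocks; consecutive rows with the same column count form a data block.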

Implementation:

cssm, main function
getblocks, subfunction

Hopefully, the code works with more data files than the example above, cssm.txt

Example:

>> [ header_blocks, data_blocks ] = cssm()
header_blocks = 
    {6x1 cell}
    {4x1 cell}
data_blocks = 
    [5x3 double]
    [5x4 double]
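The question also asks for the T value from the mid-file header. A possible way to pull it out of a header block is sketched here, under the loudly stated assumption that T sits on the first header row that holds a single number; the thread does not confirm which row carries T. The cell array header is typed in by hand to mirror the second header block of the example file, and the names header, values, ix and T are illustrative only.

```matlab
% Hypothetical copy of the second header block (for example header_blocks{2})
header = { '1'; 'ABC 0.123 123'; ' DE'; '  0.456 0.456 456' };

% Assumption: T is on the first header row that consists of one number only
values = str2double( strtrim( header ) );   % NaN for rows that are not a single number
ix     = find( ~isnan( values ), 1 );       % index of the first such row
T      = values( ix );
disp( T )
```

If T lives on a different row in the real files, only the selection rule (the find call) would need to change.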

Left as an exercise:

understand the code
write comments

====

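    function [ header_blocks, data_blocks ] = cssm()
        % Read the file as one string per line; empty 'Whitespace' keeps leading blanks
        fid = fopen( 'cssm.txt' );
        cac = textscan( fid, '%s', 'Whitespace','', 'Delimiter','\n' );
        fclose( fid );
        % Count the numbers written with a decimal point in each line
        number_of_floats = cellfun( @(c) size(c,2)          ...
            ,   regexp( cac{:}, '[+-]?\d*\.\d+', 'match' )  ...
            ,   'uni', true                                 );
        % Count matches of a pattern intended to flag text that is neither float nor blank
        number_of_stuff  = cellfun( @(c) size(c,2)                  ...
            ,   regexp( cac{:}, '[^([+|-]?\d*\.\d+) ]', 'match' )   ...
            ,   'uni', true                                         );
        % A data row has at least one float and no flagged text
        is_data = ( number_of_floats >= 1 & number_of_stuff == 0 );
        % Column count per row, NaN for header rows
        number_of_data_columns = number_of_floats;
        number_of_data_columns( not(is_data) ) = nan;
        % First and last rows of runs (2 or more rows) with equal column count
        [ ~, ix1, ix2 ] = getblocks( number_of_data_columns, 2 );
        % Convert the rows of each run to a numeric matrix
        data_blocks = cell(0);
        for ii = 1 : numel( ix1 )
            data_blocks = cat( 1, data_blocks               ...
                ,   {str2num( char(cac{1}{ix1(ii):ix2(ii)}))} );
        end
        % Header block ii is the rows between data block ii-1 and data block ii;
        % rows after the last data block are not returned
        ix3 = cat( 2, 1, ix2+1 );
        ix4 = cat( 2, ix1-1, size( cac{1}, 1 ) );
        header_blocks = cell(0);
        for ii = 1 : numel( ix1 )
            header_blocks = cat( 1, header_blocks       ...
                            ,   {cac{1}(ix3(ii):ix4(ii))} );
        end
    end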

====

    function  [ col, ix1, ix2 ] = getblocks( sequence, min_nrows )
    %   without comments
        seq     = cat( 2, nan, transpose( sequence(:) ), nan );
        change  = diff( double( diff( seq ) == 0 ) );
        ix1     = strfind( change, +1 );
        ix2     = strfind( change, -1 );    
        col     = sequence( ix1 );
        if min_nrows >= 2
            isg = ix2-ix1+1 >= min_nrows;
            col = col( isg );
            ix1 = ix1( isg );
            ix2 = ix2( isg );
        else
            ix_sngl = find( not( logical( cumsum( change )          ...
                                        + double( change==-1 ) ) ) );
            ix1     = cat( 2, ix1, ix_sngl );
            ix2     = cat( 2, ix2, ix_sngl );
            col     = cat( 2, col, sequence( ix_sngl ) );
            [~,ix]  = sort( ix1 );
            ix1     = ix1( ix );
            ix2     = ix2( ix );
            col     = col( ix );
        end
    end
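The run detection at the start of getblocks can be followed on a small vector. The sketch below is a simplified illustration, not a replacement for the function: the input sequence is made up, find is used in place of strfind on a numeric vector, and the min_nrows handling is left out.

```matlab
% Hypothetical column counts per row; NaN marks header rows
sequence = [ nan nan 3 3 3 nan 1 nan 4 4 ];

seq    = [ nan, sequence, nan ];        % pad so runs at either end are detected
same   = double( diff( seq ) == 0 );   % 1 where a row equals the row before it
change = diff( same );                  % +1 where a run starts, -1 where it ends
first  = find( change == +1 );          % first row of each run, indexes sequence
last   = find( change == -1 );          % last row of each run, indexes sequence
disp( [ first; last ] )
```

Comparisons involving NaN are never true, so header rows never join a run, and the single row with one column is not reported because a run needs at least two equal neighbours.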


vb on 14 Aug 2012

I think I have something that works! Thank you! I had to comment out number_of_data_columns( not(is_data) ) = nan; and then data_blocks returned the two matrices. Not totally sure where the discrepancy is, since it worked as is for you. Thanks again!

per isakson on 14 Aug 2012

Edited: per isakson on 14 Aug 2012

I cannot guess what problems you see. However, here is what I get when I run the code above:

with "number_of_data_columns( not(is_data) ) = nan;" commented out

>> [ header_blocks, data_blocks ] = cssm()
header_blocks = 
    {0x1 cell}
    {0x1 cell}
    {4x1 cell}
data_blocks = 
    []
    [5x3 double]
    [5x4 double]
>> header_blocks{:}
ans = 
   Empty cell array: 0-by-1
ans = 
   Empty cell array: 0-by-1
ans = 
    '1'
    'ABC 0.123 123'
    ' DE'
    '  0.456 0.456 456'
>> data_blocks{:}
ans =
     []
ans =
    0.1000    2.5300    2.5000
    0.2000    2.5900    2.4300
    0.3000    2.5000    2.5400
    0.4000    2.4800    2.5300
    0.5000    2.5200    2.4800
ans =
    0.1000    2.5600    2.3400    2.6300
    0.2000    2.6100    2.4800    2.4300
    0.3000    2.5400    2.5100    2.6000
    0.4000    2.5700    2.5400    2.4900
    0.5000    2.4800    2.6300    2.5000
>>

with "number_of_data_columns( not(is_data) ) = nan;" in place

>> [ header_blocks, data_blocks ] = cssm()
header_blocks = 
    {6x1 cell}
    {4x1 cell}
data_blocks = 
    [5x3 double]
    [5x4 double]
>> header_blocks{:}
ans = 
    '!!!!!!'
    ''
    '! text text'
    ''
    '! stuff'
    ''
ans = 
    '1'
    'ABC 0.123 123'
    ' DE'
    '  0.456 0.456 456'
>> data_blocks{:}
ans =
    0.1000    2.5300    2.5000
    0.2000    2.5900    2.4300
    0.3000    2.5000    2.5400
    0.4000    2.4800    2.5300
    0.5000    2.5200    2.4800
ans =
    0.1000    2.5600    2.3400    2.6300
    0.2000    2.6100    2.4800    2.4300
    0.3000    2.5400    2.5100    2.6000
    0.4000    2.5700    2.5400    2.4900
    0.5000    2.4800    2.6300    2.5000
>>

Comment

In the text file there should be an empty line between "ABC..." and " DE". Adding that blank line doesn't cause any problems. I get

...
ans = 
    '1'
    'ABC 0.123 123'
    ''
    ' DE'
    '  0.456 0.456 456'
