Reading in a document full of code

Question

Chris E. on 19 Oct 2013

Direct link to this question:

https://la.mathworks.com/matlabcentral/answers/90716-reading-in-a-document-full-of-code

Commented: Cedric on 22 Oct 2013

I'm building a program that reads input files for an accelerator physics code, and I really don't know how to start reading the files into MATLAB. Lines starting with "!" are comments, so when reading the file into MATLAB I just want to ignore any line that starts with "!". Each name starting with "&", like "&NEWRUN", begins a section with the information that I need. I can handle anything in between, but I'm not sure how to separate the sections, each of which starts with "&NAME" and ends with "/" (there are several of them). Any help, comments, or advice would be very welcome. Thank you!

Here is the document in a .in file format:

! header
!
&NEWRUN
  Version=3
  Head= 'IAC 44 MeV LINAC'
  RUN=1,
  Distribution = 'IAC_LASER_1p5ps70mmRISE5000.ini', Xoff=0.0, Yoff=0.0
! Qbunch=1.00
  XYrms=0.6900
! Trms=4.0e-3
  TRACK_ALL=T, PHASE_SCAN=F, AUTO_PHASE=.T
  Lmonitor=.F
  check_ref_part=.F
  H_max=0.001
  H_min=0.000
! MAX_STEP=5000000
  debunch=0.0
/
&SCAN
  LScan= T, 
Scan_para='MaxB(1)',
! Fine Scan
S_min= 0.100, S_max=0.200, S_numb=11
!Scan_para='XYrms',
!rms beam size
!S_min=0.6555, S_max=0.7245, S_numb=11
!Scan_para='MaxB(2)',
! Rough Scan
! S_min=0.0, S_max=0.01, S_numb=17
!Scan_para='MaxB(3)',
! Rough Scan
! S_min=0.02, S_max=0.1, S_numb=12
!Scan_para='S_pos(1)',
!Gun Solenoid Position Scanning
!S_min=0.00, S_max=0.1, S_numb=21
!Scan_para='C_pos(2)',
!1st Cavity Position Scanning
!S_min=2.00, S_max=3.00, S_numb=21
!Scan_para='MaxE(1)' ,
!1st Cavity Gradient
!S_min=38.00, S_max=42.00, S_numb=11,
!Scan_para='Phi(1)' ,
!Gun Phase
!S_min=-2.1147, S_max=-1.9133, S_numb=11,
!Scan_para='MaxE(2)' ,
!1st Cavity Gradient
!S_min=9.975, S_max=11.025, S_numb=11,
!Scan_para='Phi(2)' ,
!1st Linac Cavity Phase
!S_min=-1.00, S_max=1.00, S_numb=16,
!Scan_para='MaxE(3)' ,
!S_min=22.85, S_max=52.85, S_numb=31,
    FOM(1)='rms bunch length',
    FOM(2)='horizontal rms emittance',
    FOM(3)='vertical rms emittance',
    FOM(4)='longitudinal rms emittance',
    FOM(5)='horizontal rms spot size',
    FOM(6)='vertical rms spot size',
    FOM(7)='bunch charge',
    FOM(8)='mean beam energy',
    FOM(9)='horizontal rms emittance minus Z correlation',
    FOM(10)='horizontal rms beam divergence'
  /
! &ERROR
! /
&CHARGE
  Loop=F,
  LSPCH=.T,
  Nrad=15, Nlong_in=20
  Cell_var=2.0
  min_grid=0.0
  Max_scale=0.05
  Lmirror=.T
! N_min=100
/
&FEM
/
&CAVITY
  LEfield=.T
! 1.5 cells RF Gun with a symmetric coupler
  File_Efield(1) = 'ttf2rfgun.dat',     
  Nue(1)=1.300, MaxE(1)=40.00, Phi(1)=-2.115,  C_pos(1)=0.000000, 
! The 1st 2.00 m long L-band Linac Structure with 24 cells 
  FILE_EFIELD(2) = 'TWS_IAC_Lband_ASTRA.dat'
  Nue(2)=1.300, MaxE(2)=10.40, Phi(2)=0.8667, C_pos(2)=2.35, C_Numb(2)=24
! The 2nd 2.75 m long L-band Linac Structure with 33 cells
! Distance between 1st and 2nd L-band Linacs = 0.65 m
  FILE_EFIELD(3) = 'TWS_IAC_Lband_ASTRA.dat'
  Nue(3)=1.300, MaxE(3)=15, Phi(3)=0.0, C_pos(3)=5.15, C_Numb(3)=33
/
&SOLENOID
  LBfield=.T,
! Gun Main Solenoid
  File_Bfield(1)='TTF2solenoids.dat', MaxB(1)=0.1585,
  S_pos(1)=0.0E-3, S_xoff(1)=0.0, S_yoff(1)=0.0, S_Smooth(1)=150 
! 1st Solenoids in the 1st L-band Linac Structure
! distance between center of solenoid and 1st Linac head = 0.13795 m 
! distance between center of solenoids = 0.35 m 
! -> 2.35 m + 0.13795 m = 2.48795 m 
! File_Bfield(2)='IAC_SOLENOID_ASTRA.dat', MaxB(2)=0.04
  File_Bfield(2)='IAC_SOLENOID_ASTRA.dat', MaxB(2)=0.005
  S_pos(2)=2.48795, S_xoff(2)=0.0, S_yoff(2)=0.0, S_Smooth(2)=150
! 2nd Solenoid = 2.48795 m + 0.35 m = 2.83795 m
  File_Bfield(3)='IAC_SOLENOID_ASTRA.dat', MaxB(3)=0.005,
  S_pos(3)=2.83795, S_xoff(3)=0.0, S_yoff(3)=0.0, S_Smooth(3)=150
! 3rd Solenoid = 2.83795 m + 0.35 m = 3.18795 m
  File_Bfield(4)='IAC_SOLENOID_ASTRA.dat', MaxB(4)=0.005,
  S_pos(4)=3.18795, S_xoff(4)=0.0, S_yoff(4)=0.0, S_Smooth(4)=150
! 4th Solenoid = 3.18795 m + 0.35 m = 3.53795 m
  File_Bfield(5)='IAC_SOLENOID_ASTRA.dat', MaxB(5)=0.09,
  S_pos(5)=3.53795, S_xoff(5)=0.0, S_yoff(5)=0.0, S_Smooth(5)=150
! 5th Solenoid = 3.53795 m + 0.35 m = 3.88795 m
  File_Bfield(6)='IAC_SOLENOID_ASTRA.dat', MaxB(6)=0.09,
  S_pos(6)=3.88795, S_xoff(6)=0.0, S_yoff(6)=0.0, S_Smooth(6)=150
! 6th Solenoid = 3.88795 m + 0.35 m = 4.23795 m
  File_Bfield(7)='IAC_SOLENOID_ASTRA.dat', MaxB(7)=0.09,
  S_pos(7)=4.23795, S_xoff(7)=0.0, S_yoff(7)=0.0, S_Smooth(7)=150
! 1st - 8th Solenoids in the 2nd L-band Linac Structure
! distance between center of solenoid and 1st Linac head = 0.13795 m
! distance between center of solenoids = 0.35 m
! -> 5.15 m + 0.13795 m = 5.28795 m
    File_Bfield(8)='IAC_SOLENOID_ASTRA.dat', MaxB(8)=0.10,
    S_pos(8)=5.28795, S_xoff(8)=0.0, S_yoff(8)=0.0, S_Smooth(8)=150
! 2nd Solenoid = 5.28795 m + 0.35 m = 5.63795 m
  File_Bfield(9)='IAC_SOLENOID_ASTRA.dat', MaxB(9)=0.10,
  S_pos(9)=5.63795, S_xoff(9)=0.0, S_yoff(9)=0.0, S_Smooth(9)=150
! 3rd Solenoid = 5.63795 m + 0.35 m = 5.98795 m
  File_Bfield(10)='IAC_SOLENOID_ASTRA.dat', MaxB(10)=0.10,
  S_pos(10)=5.98795, S_xoff(10)=0.0, S_yoff(10)=0.0, S_Smooth(10)=150
! 4th Solenoid = 5.98795 m + 0.35 m = 6.33795 m
  File_Bfield(11)='IAC_SOLENOID_ASTRA.dat', MaxB(11)=0.10,
  S_pos(11)=6.33795, S_xoff(11)=0.0, S_yoff(11)=0.0, S_Smooth(11)=150
! 5th Solenoid = 6.33795 m + 0.35 m = 6.68795 m
  File_Bfield(12)='IAC_SOLENOID_ASTRA.dat', MaxB(12)=0.10,
  S_pos(12)=6.68795, S_xoff(12)=0.0, S_yoff(12)=0.0, S_Smooth(12)=150
! 6th Solenoid = 6.68795 m + 0.35 m = 7.03795 m
  File_Bfield(13)='IAC_SOLENOID_ASTRA.dat', MaxB(13)=0.10,
  S_pos(13)=7.03795, S_xoff(13)=0.0, S_yoff(13)=0.0, S_Smooth(13)=150
! 7th Solenoid = 7.03795 m + 0.35 m = 7.38795 m
  File_Bfield(14)='IAC_SOLENOID_ASTRA.dat', MaxB(14)=0.10,
  S_pos(14)=7.38795, S_xoff(14)=0.0, S_yoff(14)=0.0, S_Smooth(14)=150
! 8th Solenoid = 7.38795 m + 0.35 m = 7.73795 m
  File_Bfield(15)='IAC_SOLENOID_ASTRA.dat', MaxB(15)=0.10,
  S_pos(15)=7.73795, S_xoff(15)=0.0, S_yoff(15)=0.0, S_Smooth(15)=150
/
&QUADRUPOLE
/


Answer 1

Cedric on 19 Oct 2013

Direct link to this answer:

https://la.mathworks.com/matlabcentral/answers/90716-reading-in-a-document-full-of-code#answer_100182

Edited: Cedric on 20 Oct 2013

Simple answer below, and a more complete function in comment #3. Code is attached to the comment.

You can use a REGEXP to split blocks/sections, and then post-process each block of data:

 content = fileread( 'myFile.in' ) ;
 content = regexprep( content, '!.*?\r?\n', '' ) ;
 blocks  = regexp( content, '&(?<name>[^\n\r]+)(?<data>.+?)/', 'names' ) ;

This creates a struct array blocks with the following type of content:

 >> blocks(1)
 ans = 
      name: 'NEWRUN'
      data: [1x266 char]
 >> blocks(2)
 ans = 
      name: 'SCAN'
      data: [1x496 char]

where the field data stores the content of the section as a raw, unprocessed character array. Post-processing is somewhat section-specific, but here is an example:

 for bId = 1 : numel( blocks )
    blocks(bId).parsed = regexp( blocks(bId).data, ...
                                 '([^=\s,]+)\s*=\s*''?([^\s,\n\r'']+)', ...
                                 'tokens' ) ;
 end

which builds a new field named parsed for each block (you could overwrite blocks(bId).data with parsed data actually, to spare memory, but I kept both at this point for debugging). For example:

 >> blocks(1)
 ans = 
      name: 'NEWRUN'
      data: [1x266 char]
    parsed: {1x15 cell}

shows that 15 parameters were parsed (commented lines excluded) in block/section 1. Parsed parameters are stored as name/value pairs:

 >> blocks(1).parsed{1}
 ans = 
    'Version'    '3'
 >> blocks(1).parsed{4}
 ans = 
    'Distribution'    'IAC_LASER_1p5ps70mmRISE5000.ini'
 >> blocks(1).parsed{5}
 ans = 
    'Xoff'    '0.0'

This approach takes about 6 lines of code, which is compact for such a complex file. However, you have to be familiar enough with regular expressions to fine-tune the mechanism.
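One thing to watch with the value pattern above: it stops at the first whitespace, so a quoted value that contains spaces, such as Head= 'IAC 44 MeV LINAC', is cut to 'IAC'. Below is a minimal sketch of one possible variant, not a tested replacement. It assumes that every value is either a single-quoted string without embedded quotes or an unquoted run without spaces or commas. The sample text and the names data, pattern, and pairs are made up for illustration.

```matlab
% Sample block content (illustrative, taken from the NEWRUN section).
data = sprintf( ['  Version=3\n  Head= ''IAC 44 MeV LINAC''\n', ...
                 '  Distribution = ''IAC_LASER_1p5ps70mmRISE5000.ini'', Xoff=0.0\n'] ) ;

% Value token: either a whole quoted string (quotes included), or an
% unquoted run of characters without whitespace, comma, or quote.
pattern = '([^=\s,]+)\s*=\s*(''[^'']*''|[^\s,'']+)' ;
tokens  = regexp( data, pattern, 'tokens' ) ;

% Build an n-by-2 cell array of name/value pairs, stripping the
% surrounding quotes from quoted values.
pairs = cell( numel( tokens ), 2 ) ;
for k = 1 : numel( tokens )
    pairs{k,1} = tokens{k}{1} ;
    pairs{k,2} = regexprep( tokens{k}{2}, '^''|''$', '' ) ;
end
```

With this variant the quotes are matched as part of the value and removed afterwards, which keeps the pattern to a single value token and avoids relying on how unmatched tokens are reported.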

The second approach is simply to read the file line by line and build whatever structure or cell array of parameters you need. For example:

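 sections = {} ;
 sId = 0 ;
 fid = fopen( 'myFile.in', 'r' ) ;
 while ~feof( fid )
    tline = fgetl( fid ) ;
    if ~ischar( tline ),  break ;  end       % fgetl returns -1 at end of file.
    tline = strtrim( tline ) ;
    if isempty( tline ),  continue ;  end    % Skip blank lines.
    if tline(1) == '!',   continue ;  end    % Skip comment lines.
    if tline(1) == '&'
        % Start of a new section: store its name and an empty data list.
        sId = sId + 1 ;
        sections{sId}.name = tline(2:end) ;
        sections{sId}.data = {} ;
    elseif tline(1) ~= '/' && sId > 0
        % Parameter line inside the current section ('/' closes a section).
        sections{sId}.data = [sections{sId}.data; {tline}] ;
    end
 end
 fclose( fid ) ;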

This is simple, but some post-processing would still be needed:

 >> sections{1}
 ans = 
    name: 'NEWRUN'
    data: {11x1 cell}
 >> sections{1}.data
 ans = 
    'Version=3'
    'Head= 'IAC 44 MeV LINAC''
    'RUN=1,'
    'Distribution = 'IAC_LASER_1p5ps70mmRISE5000.ini', Xoff=0.0, Yoff=0.0'
    'XYrms=0.6900'
    'TRACK_ALL=T, PHASE_SCAN=F, AUTO_PHASE=.T'
    'Lmonitor=.F'
    'check_ref_part=.F'
    'H_max=0.001'
    'H_min=0.000'
    'debunch=0.0'


Cedric on 20 Oct 2013

Edited: Cedric on 20 Oct 2013

importInFile.m

I just meant that if you updated your code based on my answer between Friday and Saturday noon, you should update it again with the modifications that I made on Saturday.

I spent 10 more minutes making a more complete and commented example and wrapping it into a function. The M-file is attached to this comment; I'm not sure whether the attachment works, but if not, just copy/paste the code below into a new M-file named importInFile.m:

 function sections = importInFile( fileLocator )
    sections = struct() ;
    % Read file content.
    try
        content = fileread( fileLocator ) ;
    catch ME
        return ;
    end
    % First pass parser -> blocks.
    content = regexprep( content, '!.*?\r?\n', '' ) ;
    blocks  = regexp( content, '&(?<name>[^\n\r]+)(?<data>.+?)/', 'names' ) ;
    % Post process blocks -> sections.
    for bId = 1 : numel( blocks )
        % Second pass parser -> split parameters/values.
        parsed = regexp( blocks(bId).data, ...
            '([^=\s,]+)\s*=\s*''?([^\s,\n\r'']+)', 'tokens' ) ;
        % Iterate through parameters and parse them further.
        S = struct() ;
        for pId = 1 : numel( parsed )
            % Try conversion to numeric for value.
            value = str2double( parsed{pId}{2} ) ;
            if isnan( value ),  value = parsed{pId}{2} ;  end
            % Parse parameter name -> {name, id} if possible (arrays). 
            tokens = regexp( parsed{pId}{1}, '([^\(]+)\((\d+)\)', 'tokens' ) ;
            if isempty( tokens )
                % Not an array -> store .(name) = value.
                S.(parsed{pId}{1}) = value ;
            else
                % Array -> convert ID to numeric.
                vId = str2double(tokens{1}{2}) ;
                if ischar( value )
                    % Non-numeric value -> cell array .(name){vId} = value.
                    S.(tokens{1}{1}){vId} = value ;
                else
                    % Numeric value -> num array .(name)(vId) = value.
                    S.(tokens{1}{1})(vId) = value ;
                end
            end
        end
        sections.(blocks(bId).name) = S ;
    end
 end

It is basically the 6-line solution that you already saw, with an extra step that parses parameter names and values and builds arrays where relevant. I also changed the output from the struct array blocks to a simpler struct, sections. You can now use it as follows:

 >> sections = importInFile( 'myFile.in' )
 sections = 
        NEWRUN: [1x1 struct]
          SCAN: [1x1 struct]
        CHARGE: [1x1 struct]
           FEM: [1x1 struct]
        CAVITY: [1x1 struct]
      SOLENOID: [1x1 struct]
    QUADRUPOLE: [1x1 struct]
 >> sections.NEWRUN
 ans = 
           Version: 3
              Head: 'IAC'
               RUN: 1
      Distribution: 'IAC_LASER_1p5ps70mmRISE5000.ini'
              Xoff: 0
              Yoff: 0
             XYrms: 0.6900
         TRACK_ALL: 'T'
        PHASE_SCAN: 'F'
        AUTO_PHASE: '.T'
          Lmonitor: '.F'
    check_ref_part: '.F'
             H_max: 1.0000e-03
             H_min: 0
           debunch: 0
 >> sections.NEWRUN.Version
 ans =
     3
 >> sections.SCAN
 ans = 
        LScan: 'T'
    Scan_para: 'MaxB(1)'
        S_min: 0.1000
        S_max: 0.2000
       S_numb: 11
          FOM: {1x10 cell}
 >> sections.SCAN.FOM
 ans = 
    'rms'  'horizontal'  'vertical'  'longitudinal'  'horizontal'    
    'vertical'  'bunch'  'mean'  'horizontal'  'horizontal'

As you can see, it is now easier to navigate the structure of parameters, and numeric values have already been converted.
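The logical flags, however, are still stored as text ('T', 'F', '.T', '.F'). If you want them as MATLAB logicals, a small helper along these lines could be applied to each value after the numeric conversion. This is only a sketch: the name convertLogical is hypothetical, and it assumes that the file follows Fortran namelist conventions for logical values. Because the parser drops the quotes, a genuine string value equal to 'T' or 'F' would also be converted, so check whether that can happen in your files.

```matlab
function value = convertLogical( value )
    % Hypothetical helper (not part of importInFile): map Fortran-style
    % logical flags such as 'T', '.T', 'F', or '.F' to MATLAB true/false.
    % Any other input, including numeric values, is returned unchanged.
    if ~ischar( value ),  return ;  end
    key = upper( strtrim( value )) ;
    if any( strcmp( key, {'T', '.T', '.T.', '.TRUE.'} ))
        value = true ;
    elseif any( strcmp( key, {'F', '.F', '.F.', '.FALSE.'} ))
        value = false ;
    end
end
```

Saved as convertLogical.m, it could be called as value = convertLogical( value ) right after the str2double step in importInFile.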

Note that I give no guarantee that this code works correctly, so testing it is up to you. For example, there might be special cases to handle better than I did, e.g. when the first value of an array is not defined in a section. My code ends up with 0 for such elements, which might not be appropriate. To illustrate:

 >> sections.CAVITY
 ans = 
        LEfield: '.T'
    File_Efield: {'ttf2rfgun.dat'}
            Nue: [1.3000 1.3000 1.3000]
           MaxE: [40 10.4000 15]
            Phi: [-2.1150 0.8667 0]
          C_pos: [0 2.3500 5.1500]
    FILE_EFIELD: {[]  'TWS_IAC_Lband_ASTRA.dat'  'TWS_IAC_Lband_ASTRA.dat'}
         C_Numb: [0 24 33]

Here you can see that C_Numb(1) is 0, whereas it is not defined in your .in file.
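The 0 comes from MATLAB itself: when an assignment grows a numeric array past its current end, the skipped elements are filled with zeros. If you would rather mark undefined entries explicitly, one option is to pad with NaN. The helper below is a sketch under that assumption; the name setNumeric is hypothetical and it is not part of importInFile.

```matlab
function S = setNumeric( S, name, vId, value )
    % Hypothetical helper: store value at index vId of the numeric array
    % S.(name), padding any newly created gap with NaN instead of the 0
    % that MATLAB inserts by default when an array grows.
    if ~isfield( S, name )
        S.(name) = [] ;
    end
    n = numel( S.(name) ) ;
    if vId > n + 1
        S.(name)(n+1 : vId-1) = NaN ;   % Mark skipped elements as undefined.
    end
    S.(name)(vId) = value ;
end
```

In importInFile, the numeric branch S.(tokens{1}{1})(vId) = value could then become S = setNumeric( S, tokens{1}{1}, vId, value ), so that C_Numb(1) would show up as NaN rather than 0.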

Cheers,

Cedric

Chris E. on 22 Oct 2013

Thank you for looking out for me that way!!! I did see that mistake, I fixed it already. Thank you again, you have been really helpful!

Cedric on 22 Oct 2013

You're welcome!
