Convert chars into formatted numbers

Francesco on 21 Mar 2025
Commented: Star Strider on 21 Mar 2025
Hello everyone,
I am working on code that parses a .header file in order to interpret a large database stored in a .data file (for those familiar, HITRAN).
From the header file I can obtain where to split each line of the dataset into variables and which format each variable is in. Below is an example of the data:
% ... Parse the .header file to get variable names (.Names) and their numerical
% formatting (.Values) in C. Some of them are double, some of them are integer numbers
% related to quantum states. Note that Names and Values are not in the same
% order as the columns of the .data file.
FormatBlock.Names = {'a', 'gamma_air', 'gp', 'local_iso_id', 'molec_id', 'sw', 'local_lower_quanta', 'local_upper_quanta', 'gpp', 'elower', 'n_air', 'delta_air', 'global_upper_quanta', 'iref', 'line_mixing_flag', 'ierr', 'nu', 'gamma_self', 'global_lower_quanta'};
FormatBlock.Values = {'%10.3E', '%5.4f', '%7.1f', '%1d', '%2d', '%10.3E', '%15s', '%15s', '%7.1f', '%10.4f', '%4.2f', '%8.6f', '%15s', '%12s', '%1s', '%6s', '%12.6f', '%5.3f', '%15s'};
% ... Parse the .data file, dividing it into lines and separating values
% into columns keeping them as char. Here an example of one line
DataBlock.Names = {'molec_id', 'local_iso_id', 'nu', 'sw', 'a', 'gamma_air', 'gamma_self', 'elower', 'n_air', 'delta_air', 'global_upper_quanta', 'global_lower_quanta', 'local_upper_quanta', 'local_lower_quanta', 'ierr', 'iref', 'line_mixing_flag', 'gp', 'gpp'};
DataBlock.Columns = {' 1', '1', ' 2800.033883', ' 1.303E-29', ' 1.003E-04', '.0664', '0.298', ' 2705.1396', '0.65', '0.005780', ' 0 2 0', ' 0 1 0', ' 11 6 5 ', ' 10 1 10 ', '434233', '807294713152', ' ', ' 69.0', ' 63.0'};
The question is: assuming that I am able to reorganise Names and Values into the same order as the data file, how can I convert the DataBlock.Columns chars into numbers according to each FormatBlock.Values entry?
For example:
'molec_id' = ' 1' has formatting '%2d', hence 'molec_id' = 1
'global_lower_quanta' = ' 0 1 0' has formatting '%15s', hence 'global_lower_quanta' = [0 1 0]
'nu' = ' 2800.033883' has formatting '%12.6f', hence 'nu' = 2.800033883e3
etc...
Thank you in advance for your help!

Accepted Answer

Star Strider on 21 Mar 2025
I am not certain what result you want.
Try something like this:
% ... Parse the .header file to get variable names (.Names) and their numerical
% formatting (.Values) in C. Some of them are double, some of them are integer numbers
% related to quantum states. Note that Names and Values are not in the same
% order as the columns of the .data file.
FormatBlock.Names = {'a', 'gamma_air', 'gp', 'local_iso_id', 'molec_id', 'sw', 'local_lower_quanta', 'local_upper_quanta', 'gpp', 'elower', 'n_air', 'delta_air', 'global_upper_quanta', 'iref', 'line_mixing_flag', 'ierr', 'nu', 'gamma_self', 'global_lower_quanta'};
FormatBlock.Values = {'%10.3E', '%5.4f', '%7.1f', '%1d', '%2d', '%10.3E', '%15s', '%15s', '%7.1f', '%10.4f', '%4.2f', '%8.6f', '%15s', '%12s', '%1s', '%6s', '%12.6f', '%5.3f', '%15s'};
% ... Parse the .data file, dividing it into lines and separating values
% into columns keeping them as char. Here an example of one line
DataBlock.Names = {'molec_id', 'local_iso_id', 'nu', 'sw', 'a', 'gamma_air', 'gamma_self', 'elower', 'n_air', 'delta_air', 'global_upper_quanta', 'global_lower_quanta', 'local_upper_quanta', 'local_lower_quanta', 'ierr', 'iref', 'line_mixing_flag', 'gp', 'gpp'}
DataBlock = struct with fields:
Names: {1x19 cell}
DataBlock.Columns = {' 1', '1', ' 2800.033883', ' 1.303E-29', ' 1.003E-04', '.0664', '0.298', ' 2705.1396', '0.65', '0.005780', ' 0 2 0', ' 0 1 0', ' 11 6 5 ', ' 10 1 10 ', '434233', '807294713152', ' ', ' 69.0', ' 63.0'}
DataBlock = struct with fields:
Names: {1x19 cell}
Columns: {1x19 cell}
format shortG
DBC = cellfun(@(x)sscanf(x,'%g'),DataBlock.Columns,Unif=0);
disp(DBC)
Columns 1 through 13
{[1]} {[1]} {[2800]} {[1.303e-29]} {[0.0001003]} {[0.0664]} {[0.298]} {[2705.1]} {[0.65]} {[0.00578]} {3x1 double} {3x1 double} {3x1 double}
Columns 14 through 19
{3x1 double} {[434233]} {[8.0729e+11]} {0x0 double} {[69]} {[63]}
for k = 1:numel(DBC)
DBC{k}.'
end
ans =
1
ans =
1
ans =
2800
ans =
1.303e-29
ans =
0.0001003
ans =
0.0664
ans =
0.298
ans =
2705.1
ans =
0.65
ans =
0.00578
ans = 1×3
0 2 0
ans = 1×3
0 1 0
ans = 1×3
11 6 5
ans = 1×3
10 1 10
ans =
434233
ans =
8.0729e+11
ans = []
ans =
69
ans =
63
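The conversion above reads every column with '%g' and never consults FormatBlock. If the header formats should drive the conversion, a minimal sketch along the following lines may help. It assumes that every name in DataBlock.Names also appears in FormatBlock.Names, and that a '%s' field should stay as text only when it contains nothing numeric. The shortened field lists, the variables fmt and Record, and that '%s' rule are illustrative assumptions, not part of the original post.

```matlab
% Hypothetical, shortened subset of the fields from the question.
FormatBlock.Names  = {'nu', 'molec_id', 'global_lower_quanta', 'line_mixing_flag'};
FormatBlock.Values = {'%12.6f', '%2d', '%15s', '%1s'};
DataBlock.Names    = {'molec_id', 'nu', 'global_lower_quanta', 'line_mixing_flag'};
DataBlock.Columns  = {' 1', ' 2800.033883', '       0 1 0', ' '};

% Reorder the format specifiers so that they follow the data column order.
[found, idx] = ismember(DataBlock.Names, FormatBlock.Names);
assert(all(found), 'Every data column needs a format entry.');
fmt = FormatBlock.Values(idx);

% Convert each column and store it in a struct keyed by the variable name.
Record = struct();
for k = 1:numel(DataBlock.Names)
    txt  = DataBlock.Columns{k};
    vals = sscanf(txt, '%f').';      % row vector; empty if nothing numeric is found
    if fmt{k}(end) == 's' && isempty(vals)
        % Assumption: a string field with no numeric content stays as trimmed text.
        Record.(DataBlock.Names{k}) = strtrim(txt);
    else
        Record.(DataBlock.Names{k}) = vals;
    end
end
disp(Record)
```

Note that sscanf stops at the first token it cannot read as a number, so a field that mixes digits and letters would need extra handling that this sketch does not attempt.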
You can format them at your leisure. Use either sprintf or fprintf depending on what you want to do.
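As a small illustration of the sprintf route, the sketch below writes three converted values back to text using the format strings listed for them in the question's header. The variable names and the bracketed print layout are assumptions chosen for the example.

```matlab
% Hypothetical values, as they would look after numeric conversion.
nu       = 2800.033883;   % header format '%12.6f'
sw       = 1.303e-29;     % header format '%10.3E'
molec_id = 1;             % header format '%2d'

% sprintf returns the formatted text as a char vector.
nuTxt = sprintf('%12.6f', nu);
swTxt = sprintf('%10.3E', sw);
idTxt = sprintf('%2d', molec_id);

% fprintf writes to the Command Window (or to a file identifier) instead.
fprintf('[%s][%s][%s]\n', idTxt, nuTxt, swTxt);
```

Because each format carries a field width, the resulting strings are padded with leading spaces to 12, 10 and 2 characters respectively, which is what a fixed-width file layout requires.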
4 Comments
Francesco on 21 Mar 2025
Thank you so much!
Star Strider on 21 Mar 2025
As always, my pleasure!

Release: R2024a
