detect correct startRow in fopen before textscan

Question

Hello,

I have a text file containing 17 columns of data, with a variable-length text header above the data. The header contains several rows of strings, and the number of rows is not fixed.

The data I want to import start at a row defined by startRow, but the value of startRow depends on the number of header rows. That number is unknown after calling fopen, yet it must be known when calling textscan. So in between, I have to detect startRow automatically, whatever the header above the data.

This is an example of the text file.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

program_20181015.vi																			
line2
line3
line4															
t	T	V	off	F	I1	V1	I2	V2	Li1	Li2	X1	Y2	X3	Y4	V5	c			
6.357780E+2	2.999041E+2	3.500000E-3	0.000000E+0	1.100000E+0	5.000000E-8	1.999990E+101	-5.000000E-12	1.000000E-4	7.140000E-6	-9.620000E-6	2.395640E-1	-4.995750E-2	2.400520E-1	-5.032370E-2	-2.727684E-7	0.000000E+0	0.000000E+0	0.000000E+0	0.000000E+0

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

In this particular example, line 6 corresponds to the startRow that I want to detect.

The string 't T V off F I1 V1 I2 V2 Li1 Li2 X1 Y2 X3 Y4 V5 c' is always the same, whatever the content above this line. So it would be nice to detect this string, for example with the find function, because the data start right after this line.

Of course I can simply set startRow = 6 and the problem is solved for this file. But depending on the user, the number of header rows above the data differs, so I need to detect startRow automatically.

On the forum I found the try / catch construct, which may be useful for my purpose. If startRow = 1 (when it should be 6), textscan raises an error.

In that case I would like to set startRow = startRow + 1 and try again, repeating until no error occurs.

How can I do that?

startRow = 1;
delimiter = '\t';
formatSpec = '%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%[^\n\r]';
fileID = fopen(filename,'r');
try
    % the call that can fail belongs inside the try block
    dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'TextType', 'string', 'EmptyValue', NaN, 'HeaderLines' ,startRow-1, 'ReturnOnError', false, 'EndOfLine', '\r\n');
catch ME
    % wrong startRow: here I would like to increment it and try again
    startRow = startRow + 1;
end
fclose(fileID);

RAW = importdata(filename,'\t',startRow);
M = RAW.data(:,1:17);

Answer 1

Etsuo Maeda on 28 Nov 2018

Edited: Etsuo Maeda on 28 Nov 2018

A while loop will help you.

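k = 0;                                          % number of leading lines to skip
while exist('D', 'var') ~= 1                    % loop until dlmread succeeds and creates D
    try
        D = dlmread('yourfile.txt', '', k, 0);  % raises an error while text lines remain
    catch
        k = k + 1;                              % skip one more line and retry
    end
end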

HTH

Answer 2

laurent jalabert on 28 Nov 2018

Dear Maeda-san,

I tried to adapt your code as shown below, but I had to stop it with Ctrl-C: startRow had reached about 73458, whereas the correct value should be around 22.

startRow = 0;
while exist('dataArray') ~= 1
    try
        dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'TextType', 'string', 'EmptyValue', NaN, 'HeaderLines' ,startRow-1, 'ReturnOnError', false, 'EndOfLine', '\r\n');
        fclose(fileID);
    catch ME
        startRow = startRow+1;
    end
end

Answer 3

Etsuo Maeda on 29 Nov 2018

Edited: Etsuo Maeda on 29 Nov 2018

Hi laurent jalabert - san,

textscan behaves a little differently from dlmread.

With textscan, an output variable is created on every pass through the loop, even when it is empty.

With dlmread, the output is created only when it succeeds in reading numerical data, not text data.

So "exist('dataArray') ~= 1" cannot work well with your original code.

You can confirm how textscan and dlmread differ in their handling of unexpected input with the following code.

fid = fopen('yourfile.txt');
fspec = repmat('%f', [1, 16]);
D = textscan(fid, fspec, 'HeaderLines', 0);
fclose(fid);

and

D = dlmread('yourfile.txt', '', 0, 0);

with 'yourfile.txt' containing:

program_20181015.vi	
line2
line3
line4	
t	T	V	off	F	I1	V1	I2	V2	Li1	Li2	X1	Y2	X3	Y4	V5	c	
6.357780E+2	2.999041E+2	3.500000E-3	0.000000E+0	1.100000E+0	5.000000E-8	1.999990E+101	-5.000000E-12	1.000000E-4	7.140000E-6	-9.620000E-6	2.395640E-1	-4.995750E-2	2.400520E-1	-5.032370E-2	-2.727684E-7	0.000000E+0	0.000000E+0	0.000000E+0 0.000000E+0

I believe the dlmread code I suggested in my previous post will work for your data without any modification.

If you need to use the textscan function, I can suggest another way, using the "isempty" function.

fid = fopen('yourfile.txt');
fspec = repmat('%f', [1, 16]);
k = 0;
D{1, 1} = [];
while isempty(D{1, 1}) == 1
    D = textscan(fid, fspec, 'HeaderLines', k);
    k = k +1;
end
fclose(fid);

HTH

Answer 4

laurent jalabert on 30 Nov 2018

Dear Maeda-san

Thank you very much for your help. I tried to implement your code, but it did not work. If it worked, k would be 6 in my example.

With your code I get k = 1, and therefore an error.

Basically, I want to detect the value of startRow in order to import the data from startRow onward, and the header up to startRow-1.

I understand your code like this.

At first, D{1,1} = [], so for k = 0, isempty(D{1, 1}) == 1. Then D = textscan(...) runs and k = k+1. Then I get an error because textscan(...) does not have the correct HeaderLines value.

So I am sorry if I have misunderstood, but this code might not work.

The data are always located after this line :

t T V off F I1 V1 I2 V2 Li1 Li2 X1 Y2 X3 Y4 V5 c

So how can I simply detect the row number of this line in my text file?

Yours

Laurent
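
Editor's sketch (not part of the original post): one way to find the row of that column-name line is to read the file line by line with fgetl and compare each line with the known string. The demo file, the variable names (expectedHeader, lineNo, tline) and the whitespace normalization are assumptions made for illustration; only the column-name string and the meaning of startRow come from the question.

```matlab
% Demo file modelled on the example in the question (placeholder content).
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, 'program_20181015.vi\nline2\nline3\nline4\n');
fprintf(fid, 't\tT\tV\toff\tF\tI1\tV1\tI2\tV2\tLi1\tLi2\tX1\tY2\tX3\tY4\tV5\tc\n');
fprintf(fid, [repmat('%g\t', 1, 16) '%g\n'], 1:17);
fclose(fid);

% Column-name line that always precedes the data (taken from the question).
expectedHeader = 't T V off F I1 V1 I2 V2 Li1 Li2 X1 Y2 X3 Y4 V5 c';

startRow = 0;                  % stays 0 if the column-name line is never found
lineNo = 0;
fid = fopen(filename, 'r');
tline = fgetl(fid);
while ischar(tline)            % fgetl returns -1 (not a char) at end of file
    lineNo = lineNo + 1;
    % Collapse tabs and repeated spaces to single spaces, drop trailing blanks.
    normalized = strtrim(regexprep(tline, '\s+', ' '));
    if strcmp(normalized, expectedHeader)
        startRow = lineNo + 1; % the data begin on the following line
        break
    end
    tline = fgetl(fid);
end
fclose(fid);
```

If the line is found, startRow-1 can be passed as the 'HeaderLines' value in the textscan call from the question; a result of startRow = 0 would indicate that the file does not contain the expected column names.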

Answer 5

Etsuo Maeda on 30 Nov 2018

Edited: Etsuo Maeda on 30 Nov 2018

Hello Laurent - san,

I analyzed your code and finally found a bug in the "textscan" function when it is used in a "while" loop!

The reproduction steps are as follows, using this test file (test.txt):

aaaaa
bbbbb
ccccc
dddd
eeee
1	2	3
4	5	6

and

clear; close all; fclose all;
fid = fopen('test.txt', 'r');
fspec = '%f%f%f';
k = 0;
while exist('D') ~= 1
    try
        D = textscan(fid, fspec, 'HeaderLines', k, 'ReturnOnError', false); % k = 3 NO error EMPTY D
    catch
        k = k +1;
        disp(k)
    end
end
D = textscan(fid, fspec, 'HeaderLines', k, 'ReturnOnError', false); % k = 3 NO error EMPTY D
fclose(fid);
fid = fopen('test.txt', 'r');
D = textscan(fid, fspec, 'HeaderLines', k, 'ReturnOnError', false); % k = 3 error!!!
fclose(fid)

"k" should be 5, but the while loop stops at 3.

"D" exists, but it is empty.

When textscan is called again before fclose, it also runs without error, but D is still empty.

After fclose and a second fopen, textscan raises an error with k = 3 and D is not created.

With your file, textscan returns strange numbers and stops at k = 4.

This is unexpected behavior of textscan.

I will report your case to the development team in the US.
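
Editor's sketch (not part of the original post): one possible explanation, offered as an assumption rather than a confirmed diagnosis, is that textscan called on an open file identifier resumes from the current file position, so each retry skips k more lines from wherever the previous attempt stopped instead of from the top of the file. Under that assumption, rewinding with frewind before each attempt should make the loop count header lines from the first line. The demo file, maxHeaderLines and found are hypothetical names introduced here.

```matlab
% Demo file matching the reproduction example (placeholder path).
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, 'aaaaa\nbbbbb\nccccc\ndddd\neeee\n1\t2\t3\n4\t5\t6\n');
fclose(fid);

fspec = '%f%f%f';
maxHeaderLines = 100;      % hypothetical safety limit so the loop always terminates
k = 0;
D = {};
found = false;
fid = fopen(filename, 'r');
while ~found && k <= maxHeaderLines
    frewind(fid);          % every attempt starts again at the top of the file
    try
        D = textscan(fid, fspec, 'HeaderLines', k, 'ReturnOnError', false);
        found = ~isempty(D{1});   % an empty result does not count as success
    catch
        % the first unskipped line is not numeric; fall through and retry
    end
    if ~found
        k = k + 1;         % skip one more header line on the next attempt
    end
end
fclose(fid);
```

The explicit found flag replaces the exist('D') test, so an empty result returned at the end of the file is not mistaken for a successful read.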

As a workaround, could you please use my first code to find the row number?

k = 0;
while exist('D') ~= 1
    try
        D = dlmread('yourfile.txt', '', k, 0);
    catch
        k = k + 1;
    end
end

The numerical data start at the 6th line.

"D" will contain numerical data.

"k" will be 5.

Thank you very much for your question and patience.

HTH

Answer 6

laurent jalabert on 30 Nov 2018

Dear Maeda-san

The code below works well and solved my problem. Thank you very much for your answer and for the time you spent helping me. It could be useful to many people who want to read data files with a variable number of header lines.

k = 0;
while exist('D') ~= 1
    try
        D = dlmread('yourfile.txt', '', k, 0);
    catch
        k = k + 1;
    end
end

Now, to retrieve the headers, usually I use this kind of code,

RAW = importdata(filename,'\t',startRow);
M= RAW.data(:,1:21);
header = (RAW.textdata(1:startRow-1))';

In one data file, I found k = 21 and size(D) = [69735 21], which means there are 21 columns of data and 69735 lines. The headers are located from row 1 to row 20. How can I get the headers?

With importdata, it is quite easy, as shown above with the RAW.textdata field.

How can I do this with dlmread? I guess I should use textscan from row 1 to row 21, shouldn't I?

Yours

Laurent

Answer 7

Etsuo Maeda on 30 Nov 2018

Dear Laurent - san,

"dlmread" is a function for reading numbers, not characters.

So it is impossible to read your header text using "dlmread".

As you mentioned, calling "textscan" or "importdata" again with the determined k is one workaround.
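
Editor's sketch (not part of the original post): once k is known, another simple option is to read the first k lines as text with fgetl. The demo file and the names header and i are placeholders; k = 2 is set by hand here, whereas in practice it would come from the dlmread loop.

```matlab
% Demo file with two header lines (placeholder content).
filename = [tempname '.txt'];
fid = fopen(filename, 'w');
fprintf(fid, 'aaaaa\nbbbbb\n1\t2\t3\n4\t5\t6\n');
fclose(fid);

k = 2;                       % number of header lines, e.g. as found by the dlmread loop

header = cell(k, 1);         % one cell per header line
fid = fopen(filename, 'r');
for i = 1:k
    header{i} = fgetl(fid);  % text of line i, without the newline character
end
fclose(fid);
```

Each element of header is a character vector, so the last one, header{k}, can be split on tabs if it holds the column names.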

I think "readtable" is a good tool for you if you know the number of variables.

(The original question was about try-catch, so I used try-catch statements in my earlier answers.)

clear; close all;
% R2016b and later
filename = 'yourfile.txt';
numOfVariables = 21;
opts = detectImportOptions(filename, 'Delimiter', '\t', 'NumVariables', numOfVariables);
T = readtable(filename, opts)
T.Properties.VariableNames

If you do not know the number of variables, the following code may work, but I cannot promise anything.

clear; close all;
% R2016b and later
filename = 'yourfile.txt';
opts = detectImportOptions(filename, 'Delimiter', '\t'); % remove NumVariables
T = readtable(filename, opts)
T.Properties.VariableNames

HTH
