How can I write code that compares documents pair by pair and collects all the information I want?

Hi, I have code that compares two text documents and then plots the result with geoscatter. Afterwards I read off the values of the variables I want and save the figure. Because there are a lot of documents, I would like to know whether I can write code that imports the documents I want from each folder, runs my comparison, saves the numbers I want, and saves the image. I know it sounds a little complicated, so I am going to post the code and explain where each thing I mentioned comes from.
First, I have two folders, AIS 1 and AIS 2. Each one contains a subfolder for each day, and each day has 24 text documents with names like these:
2021033020AIS
2021033021AIS
2021033022AIS
2021033023AIS
As you can see, only the last 2 digits change, running from 00 to 23. The data has to be stored as strings in order to be compared.
My first problem is importing, from each folder, the document whose name ends the same way as its counterpart in the other folder. Once both are imported, this code runs:
%% PART ONE
AIS1(strlength(AIS1) < 15) = [];
AIS2(strlength(AIS2) < 15) = [];
% AIS1 = unique(AIS1);
% AIS2 = unique(AIS2);
N=size(AIS1,1); % Important: must come after the filtering above, otherwise the code gave an error
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once'); % the whole message except the last 4 digits
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once'); % extract the last 4 digits
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
Time_AIS1 = duration(strcat('00:',extractBefore(t1,3),':',extractAfter(t1,2))); % convert to hh:mm:ss format
Time_AIS1 = Time_AIS1+hours(cumsum([0;diff(Time_AIS1)<0])); % add one hour each time mm:ss wraps around
Time_AIS2 = duration(strcat('00:',extractBefore(t2,3),':',extractAfter(t2,2)));
Time_AIS2 = Time_AIS2+hours(cumsum([0;diff(Time_AIS2)<0]));
mask1 = ismissing(msg_AIS1) | ismissing(Time_AIS1);
mask2 = ismissing(msg_AIS2) | ismissing(Time_AIS2);
origi_AIS1 = (1:length(msg_AIS1));
origi_AIS2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; Time_AIS1(mask1) = []; origi_AIS1(mask1) = [];
msg_AIS2(mask2) = []; Time_AIS2(mask2) = []; origi_AIS2(mask2) = [];
[H1, M1, S1] = hms(Time_AIS1); % split the time into components; M1 is used to build the ranges
[H2, M2, S2] = hms(Time_AIS2);
msg_match = cell(N, 1);
complete_match_AIS = [];
%% ENTER THE COMPARISON LOOP
for K = 1:1:N
all_match_AIS = find(msg_AIS1(K) == msg_AIS2); % find identical messages
if isempty(all_match_AIS) % fprintf to write data to a text file
% fprintf('No matches for line #%d -> "%s"\n', origi_AIS1(K), msg_AIS1(K)); % '%s' for a string
continue;
end
% fprintf('potential match #%d -> "%s", checking times\n', origi_AIS1(K), msg_AIS1(K));
% disp(K), disp(all_match_AIS)
if H1(K)== H2(all_match_AIS)
% build the matching range in minutes
complete_match_AIS = all_match_AIS(M1(K) == M2(all_match_AIS) | M1(K) == M2(all_match_AIS) - 1 | M1(K) == M2(all_match_AIS) + 1); % range of +-1 minute around each message
msg_match{K} = msg_AIS1(complete_match_AIS);
Time_msg_match{K} = complete_match_AIS;
end
if isempty(complete_match_AIS)
% fprintf('line %#d -> "%s" text matches but time does not\n', origi_AIS1(K), msg_AIS1(K));
else
% fprintf ('line %#d -> "%s" time matches as well. The results are:\n', origi_AIS1(K), msg_AIS1(K));
msg_AIS2(complete_match_AIS) % IMPORTANT
end
end
%% AFTER THE LOOP, REMOVE EMPTY CELLS AND CHANGE THE DATA TYPE OF THE VARIABLES WE WANT
%# find the empty cells (create the mask variables)
emptyCells = cellfun(@isempty,msg_match);
emptyCells2 = cellfun(@isempty,Time_msg_match);
%# remove the empty cells
msg_match(emptyCells) = [];
Time_msg_match(emptyCells2) = [];
% the message positions are stored in a cell array; they have to be
% converted to a format that can be used for indexing.
Time_msg_match = Time_msg_match';
% Pull the strings out of the cell (cat) --> to concatenate them
Matching_msg = cellstr(cat(1, msg_match{:}));
Matching_msg = string(Matching_msg);
% REMOVE the doubles from inside the cells of Time_msg_match2
numCells = numel(Time_msg_match);
Time_msg_match2 = zeros(numCells+10000, 1);
vector2Index = 1;
for k = 1 : numCells
len = length(Time_msg_match{k});
if len == 1
Time_msg_match2(vector2Index) = Time_msg_match{k};
vector2Index = vector2Index + 1;
else
fprintf('Row %d has %d elements in it.\n', k, length(Time_msg_match{k}));
for k2 = 1 : len
thisVector = Time_msg_match{k};
Time_msg_match2(vector2Index) = thisVector(k2);
vector2Index = vector2Index + 1;
end
end
end
if vector2Index < numCells
Time_msg_match2 = Time_msg_match2(1 : vector2Index - 1);
end
fprintf('Original Time_msg_match had %d rows.\n', numCells) % reports how the row count changes once the empty cells are removed
fprintf('Afterwards Time_msg_match had %d rows.\n', numel(Time_msg_match2))
% remove the remaining zeros
Time_msg_match2(Time_msg_match2 == 0) =[];
% create variables for AIS1 and AIS2 that hold the index of each
% line, so that setdiff can be used to get the positions of the
% messages and look at the range of hours we are interested in.
P = length(msg_AIS2);
AIS2_MSG = (1:P); % the variable P has to be created first
AIS2_MSG = AIS2_MSG';
AIS1_MSG = (1:N);
AIS1_MSG = AIS1_MSG';
% NOW TAKE THE ACTUAL MESSAGES THAT ARE NOT IN THE COMPARISON
% IMPORTANT: first take the unmatched messages, then their
% times.
NoMatchAIS1 = setdiff(AIS1_MSG,Time_msg_match2);
NoMatchAIS2 = setdiff(AIS2_MSG,Time_msg_match2);
% WE CAN SEE THAT WHETHER MESSAGES FAIL TO MATCH DOES NOT DEPEND ON THE TIME
% OF DAY. SO WHAT DOES IT DEPEND ON?
% Build time variables for the unmatched messages of both AIS sets
Time_NoMatchAIS1 = Time_AIS1(NoMatchAIS1);
Time_NoMatchAIS2 = Time_AIS2(NoMatchAIS2);
% CREATE THE VARIABLES HOLDING THE UNMATCHED MESSAGES OF EACH AIS
msg_NoMatchAIS1 = msg_AIS1(NoMatchAIS1);
msg_NoMatchAIS2 = msg_AIS2(NoMatchAIS2);
disp 'Ya ha acabado de comparar'
L = size(msg_NoMatchAIS1,1);
J = size(msg_NoMatchAIS2,1);
% To visualize the ships from both AIS sets for each hour
lat1 = [];
lon1 = [];
for i=1:1:L
seq1 = msg_NoMatchAIS1(i);
linia=convertStringsToChars(seq1);
if linia(13)=='A' && linia(15)=='1'
sequencia = ais_to_bit(linia(15:44));
s_longitud=sequencia(62:89);
longitud = bin2dec(num2str(s_longitud))/600000; % in degrees
lon1 = [lon1, longitud];
s_latitud=sequencia(90:116);
latitud = bin2dec(num2str(s_latitud))/600000; % in degrees
lat1 = [lat1, latitud];
end
end
lat2 = [];
lon2 = [];
for j=1:1:J
seq2 = msg_NoMatchAIS2(j);
linia=convertStringsToChars(seq2);
if linia(13)=='A' && linia(15)=='1'
sequencia = ais_to_bit(linia(15:44));
s_longitud = sequencia(62:89);
longitud = bin2dec(num2str(s_longitud))/600000; % in degrees
lon2 = [lon2, longitud];
s_latitud = sequencia(90:116);
latitud = bin2dec(num2str(s_latitud))/600000; % in degrees
lat2 = [lat2, latitud];
end
end
figure(1)
geoscatter(lat1, lon1)
hold on
geoscatter(lat2,lon2,'filled')
legend('AIS1','AIS2')
hold off
X = sprintf('Mensajes AIS1 %.f no Match %.f y mensajes AIS2 %.f no Match %.f \n',K,L,P,J);
disp(X)
As you can see, it compares the two documents and then makes a figure. How can I save this figure, and how can I save the values of the variables K, L, P and J at the end?
I am losing a lot of time because I do not know how to do this, so if anybody knows, please let me know. Thank you in advance.

12 Comments

You can use saveas() to save a figure to file, and you can use save() to save variables to file.
For importing the files, how are you doing it now? And can you share an example document file here?
I need to save those figures and variables each time the loop ends. I've uploaded three documents from two different folders, so you can save them in different folders and try to code something.
What code are you using to import the files now? How about you do that at the beginning of the loop?
And, yes, you can go ahead and call save and saveas as necessary at the bottom of the loop.
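As a minimal sketch of what the bottom of the loop could look like: the output folder, the file names, and the values of K, L, P and J below are placeholders, and plot stands in for the geoscatter calls from the question.

```matlab
% Placeholder values standing in for the results of the comparison code.
K = 120; L = 15; P = 110; J = 9;

% Assumed output folder; replace with the folder you actually want.
outdir = fullfile(tempdir, 'ais_results');
if ~exist(outdir, 'dir')
    mkdir(outdir);
end

% Stand-in figure (the real code would call geoscatter here).
fig = figure('Visible', 'off');
plot(1:10);

% Save the figure as an image and the four variables as a MAT-file.
pngFile = fullfile(outdir, 'example.png');
matFile = fullfile(outdir, 'example.mat');
saveas(fig, pngFile);
save(matFile, 'K', 'L', 'P', 'J');
close(fig);
```

Inside a loop, the two file names would need to change on every iteration (for example, by building them from the name of the input file), otherwise each pass overwrites the previous result.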
I am not using any code, I am just importing them one by one.
But I do not know how I could import two documents from different folders at the same time.
Where do the variables AIS1 and AIS2 come from? Are you importing each file by doing something like "Variables > Import Data" in the MATLAB window?
i put Import data and imported with the name AIS 1 AND AIS 2 YEAH
Can you do
save('vars.mat','AIS1','AIS2');
and then attach the file vars.mat here? This way I can use the files you shared and the corresponding imported variables to attempt to reproduce the import process with some code that you could in turn use in your code.
I already uploaded you some data some messages before You have to save them in different folders thats it
I have the files, yes. But I don't have your variables AIS1 and AIS2 like they exist in your workspace, and I don't know what options you used to import them.
The code you shared uses some variables AIS1 and AIS2. The files you shared are what these variables are based on. But there is no way for me to go from files to variables in the workspace so I can run and/or modify your code. Does that make sense? "You have to save them in different folders thats it" "i put Import data and imported with the name AIS 1 AND AIS 2 YEAH" is not a sufficient description of the process for me to reproduce it and be able to run your code.
I just want a loop that imports the documents I uploaded one by one and then runs my code, so that I get a figure each time for the different documents.
OK, I don't think I can help you. Maybe someone else can figure it out. Good luck!


Accepted Answer

Based on a comment you made on another related question (https://www.mathworks.com/matlabcentral/answers/1453769-ismember-and-ways-to-implement-it#comment_1737439), I gather that the variables AIS1 and AIS2 are string arrays with each string element corresponding to one line of their respective input .txt files. (I can't say for certain because I don't have access to those variables, just the text files.) Based on that working assumption, I put together a function that will "import" these files, that is, read the file and return a string array for use in your code above.
The function (import_AIS) is defined at the bottom of the script below. In calling it, you would replace fn1 and fn2 with the full paths to the two files you want to compare. Then use what it returns as AIS1 and AIS2 in your code above. Wrap the whole thing in a loop if you want to go through 2 directories and compare each pair of same-named files.
% Specify the two files (using two you attached previously for demonstration):
fn1 = fullfile(pwd(),'2021030102AIS.txt');
fn2 = fullfile(pwd(),'2021030102AIS (2).txt');
% call import_AIS with the two file names:
AIS1 = import_AIS(fn1);
AIS2 = import_AIS(fn2);
% to demonstrate what the function returned:
display(size(AIS1)); display(AIS1([1 end])); display(size(AIS2)); display(AIS2([1 end]));
     1       18973
  1×2 string array
    "!AIVDM,1,1,,A,H33mw2Q>uV0luHTpN3800000000,2*080000"    "!AIVDM,1,1,,A,13F:b60P0909tF8GbLMbH?wl0@Ra,0*0F5959"
     1       13091
  1×2 string array
    "!AIVDM,1,1,,A,H33mw2Q>uV0luHTpN3800000000,2*080000"    "!AIVDM,1,1,,B,33=Orb1000P:0:tGa?779Qon0000,0*225959"
function AIS = import_AIS(fn)
fid = fopen(fn);
if fid == -1
AIS = string();
return
end
data = fread(fid,'*char');
fclose(fid);
AIS = string(strsplit(data.',newline()));
if strlength(AIS(end)) == 0
AIS(end) = [];
end
end
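One thing that may be worth checking: strsplit returns a row vector, so the function above returns a 1-by-N string array (the sizes displayed above are 1 18973 and 1 13091), whereas the code in the question uses size(AIS1,1) as the number of messages. A minimal sketch of forcing a column orientation, using a made-up three-line text in place of a real file:

```matlab
% Made-up content standing in for the text read from a file.
raw = sprintf('line one\nline two\nline three\n');

% Same splitting steps as in import_AIS.
AIS = string(strsplit(raw, newline()));
if strlength(AIS(end)) == 0
    AIS(end) = [];          % drop the empty piece after the final newline
end

% Force an N-by-1 column so that size(AIS,1) is the number of lines.
AIS = AIS(:);
N = size(AIS, 1);
```

Whether this is needed depends on how AIS1 and AIS2 were oriented when they were imported by hand, which is an assumption here.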

19 Comments

If I want to get the files from those two folders, how should I do it? I attached a photo.
I have files with the same names in both directories, and I want to compare all of them, not just one pair. Only the last 2 digits of the file names change, which is why I attached 3 files from each folder before.
folder1 = 'DIAS AIS 1';
folder2 = 'DIAS AIS 2';
dinfo = dir(folder1);
dinfo([dinfo.isdir]) = []; %discard folders including . and ..
filenames = {dinfo.name};
nfiles = length(filenames);
for K = 1 : nfiles
thisfile = filenames{K};
thisfile1 = fullfile(folder1, thisfile);
thisfile2 = fullfile(folder2, thisfile);
if ~exist(thisfile2, 'file')
error('file "%s" does not exist in folder "%s"', thisfile, folder2);
end
AIS1 = import_AIS(thisfile1);
AIS2 = import_AIS(thisfile2);
% now compare the data
% ... your comparison code goes here ...
end
That should give me all the files that are in the folders? and if I write my code to compare is going to compare the first file from one folder with the same one file that is in the other folder? I do not really understand this part:
for K = 1 : nfiles
thisfile = filenames{K};
thisfile1 = fullfile(folder1, thisfile);
thisfile2 = fullfile(folder2, thisfile);
if ~exist(thisfile2, 'file')
error('file "%s" does not exist in folder "%s"', thisfile, folder2);
end
AIS1 = import_AIS(thisfile1);
AIS2 = import_AIS(thisfile2);
% now compare the data
% ... your comparison code goes here ...
end
I want to load one by one in each folder and run my code for the whole day
folder1 = 'DIAS AIS 1';
dinfo = dir(folder1);
dinfo([dinfo.isdir]) = []; %discard folders including . and ..
filenames = {dinfo.name};
nfiles = length(filenames);
That fetches all of the file names within the first folder, and also counts the number of files.
for K = 1 : nfiles
Loop over the number of files we know we have in the first folder
thisfile = filenames{K};
Pull out the next filename we extracted from the first folder
thisfile1 = fullfile(folder1, thisfile);
Create the full name of that file, within the first folder.
thisfile2 = fullfile(folder2, thisfile);
Create the full name that same file would have within the second folder, under the assumption that the file names are the same and just the folder names differ.
if ~exist(thisfile2, 'file')
error('file "%s" does not exist in folder "%s"', thisfile, folder2);
end
Test to see whether the file really does exist within the second folder, and give an error message if it does not. If you get that error message then there is a file in the first folder that does not exist in the second folder.
AIS1 = import_AIS(thisfile1);
Read from the file in the first folder
AIS2 = import_AIS(thisfile2);
Read from the corresponding file in the second folder.
At that point you can do any comparing that you need to do.
"That should give me all the files that are in the folders?"
No, the code ignores any files that exist in the second folder but not in the first folder. It does, however, process all of the files that exist in both folders.
"and if I write my code to compare is going to compare the first file from one folder with the same one file that is in the other folder?"
Yes, after the import_AIS calls, the data for the first file is in AIS1 and the data for the second file is in AIS2, ready for you to compare them however is appropriate.
"I want to load one by one in each folder and run my code for the whole day"
This code will load one file from the first folder, and the corresponding file from the second folder. Then you compare them. Then the code will load the second file from the first folder and the corresponding second file from the second folder. And so on, each iteration loading one file from the first folder and the corresponding file from the second folder.
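If files that are present in only one of the folders should be skipped rather than raise an error, one option is to intersect the two lists of names first. A minimal sketch, using temporary folders and empty files that are created only for this illustration (all names are made up):

```matlab
% Build two temporary folders with partly overlapping (empty) files.
root = tempname;
folder1 = fullfile(root, 'A');
folder2 = fullfile(root, 'B');
mkdir(folder1);
mkdir(folder2);
names1 = {'2021030100AIS.txt', '2021030101AIS.txt', '2021030102AIS.txt'};
names2 = {'2021030101AIS.txt', '2021030102AIS.txt', '2021030103AIS.txt'};
for k = 1:numel(names1)
    fclose(fopen(fullfile(folder1, names1{k}), 'w'));
end
for k = 1:numel(names2)
    fclose(fopen(fullfile(folder2, names2{k}), 'w'));
end

% List the .txt files in each folder and keep the names found in both.
d1 = dir(fullfile(folder1, '*.txt'));
d2 = dir(fullfile(folder2, '*.txt'));
common  = intersect({d1.name}, {d2.name});
onlyIn1 = setdiff({d1.name}, {d2.name});
onlyIn2 = setdiff({d2.name}, {d1.name});

% Loop over the pairs; the real code would call import_AIS on each file.
for k = 1:numel(common)
    file1 = fullfile(folder1, common{k});
    file2 = fullfile(folder2, common{k});
    fprintf('Would compare %s with %s\n', file1, file2);
end
fprintf('%d file(s) only in folder 1, %d only in folder 2\n', ...
    numel(onlyIn1), numel(onlyIn2));
```

Reporting the unmatched names instead of stopping with error() lets the loop finish for a whole day even when one hour is missing on one side.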
I ran the code and did not get any result. Do you know why?
Could the problem be in the function? I am uploading my code.
fn1 = fullfile(pwd(),'2021030100AIS.txt');
fn2 = fullfile(pwd(),'2021030100AIS.txt');
% call import_AIS with the two file names:
AIS1 = import_AIS(fn1);
AIS2 = import_AIS(fn2);
function AIS = import_AIS(fn)
fid = fopen(fn);
if fid == -1
AIS = string();
return
end
data = fread(fid,'*char');
fclose(fid);
AIS = string(strsplit(data.',newline()));
if strlength(AIS(end)) == 0
AIS(end) = [];
end
end
I am getting this problem now:
>> prova
Error using prova (line 12)
file "2021030100AIS.txt" does not exist in folder "AIS2 DIA 1"
The file does exist, so I do not understand why it says it does not.
Maybe it is because both files have the same name but different content, and since they aren't identical it does not work.
Please show the output of
dir('DIAS AIS 1')
dir('DIAS AIS 2')
Your attached code uses directories DIAS AIS 1 and DIAS AIS2 but the error message is about folder AIS2 DIA 1 so the error message does not match the code.
Yes, because the folder is AIS 2 DIA 1; I changed it because I had another error. What I mean is that the problem is that it does not find the name, even though the name exists in folder2, so... Did you try it?
Yes, I tried it, but since I do not have your files, it does not find anything at all when I run the code.
I uploaded some files from both folders earlier. In about an hour I am going to try to figure out why it does not work.
I've made two changes and it worked. How could I modify this part of the code so that the figure it produces each time is saved to a folder of my choice, named with the last four digits of the file name? I would also like to save some of my variables so that I can work with them in Excel afterwards. How could I do that?
figure(1)
geoscatter(lat1, lon1)
hold on
geoscatter(lat2,lon2,'filled') % save these figures in a folder, named with the last four digits of the file name
legend('AIS1','AIS2')
hold off
X = sprintf('Mensajes AIS1 %.f no Match %.f y mensajes AIS2 %.f no Match %.f \n',K,L,P,J); % save these 4 variables on each loop iteration
disp(X)
See exportgraphics()
[~, basename, ~] = fileparts(thisfile);
last4 = basename(end-3:end);
outname = fullfile(output_directory_name_goes_here, last4 + ".png");
exportgraphics(gcf, outname);
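For the Excel part of the question, one possible approach is to collect the four counts for every file in a table and write it out once after the loop. In the sketch below, the file names and counts are made up, and the column names (File, K, L, P, J) and the output file name are assumptions. Note also that for names like 2021033020AIS the last four characters are "0AIS"; the regular expression below is one way to take the four digits just before "AIS" instead.

```matlab
% Made-up base names; in the real loop these would come from thisfile.
basenames = {'2021033020AIS', '2021033021AIS'};
nFiles = numel(basenames);

% Preallocate one row per file.
tag  = strings(nFiles, 1);
Kcol = zeros(nFiles, 1);
Lcol = zeros(nFiles, 1);
Pcol = zeros(nFiles, 1);
Jcol = zeros(nFiles, 1);

for n = 1:nFiles
    % The four digits just before "AIS" (day and hour in these names).
    tag(n) = regexp(basenames{n}, '\d{4}(?=AIS$)', 'match', 'once');
    % Stand-in numbers; the real values come from the comparison code.
    Kcol(n) = 100 + n;
    Lcol(n) = 10 + n;
    Pcol(n) = 90 + n;
    Jcol(n) = 5 + n;
end

% One row per file, written to a CSV file that Excel can open.
results = table(tag, Kcol, Lcol, Pcol, Jcol, ...
    'VariableNames', {'File', 'K', 'L', 'P', 'J'});
outfile = fullfile(tempdir, 'ais_summary.csv');
writetable(results, outfile);
```

writetable picks the format from the file extension, so giving the file an .xlsx extension writes a spreadsheet directly instead of a CSV file.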


Asked: 29 Dec 2021

Commented: 31 Dec 2021
