Importing .dat data and creating a matrix
Hello
I have slightly more than one hundred .dat files. I'm attaching one of them here (converted to .txt because the site wouldn't let me upload a .dat file). For now I'm trying to import just one at a time.
Generally the format is a row of date: yyyy mm dd and below are rows and columns of data. What I'm trying to do is import that data and limit it by four points - creating a matrix - as follows:
(11,5).....(21,5)
(11,13)....(21,13)
I want to create this matrix for each of the 365 days, so I need to import the data and associate each block with its date somehow. I've tried the built-in Import Tool and the readtable function, but I can't get either to work. Does someone know a good way to do that?
Thanks
3 comments
OcDrive
on 15 Aug 2023
Star Strider
on 15 Aug 2023
To use readtable with a ‘.dat’ file, use the name-value pair 'FileType','text' listed under Text Files (no direct link to it), since they appear to be text files.
To upload one or more of them here (as .dat files), use the zip function and then upload the .zip file.
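A minimal sketch of both suggestions, assuming a small tab-delimited stand-in file (the name demo.dat, its contents, and the zip name are made up for illustration; the real files have date rows that this toy file does not):

```matlab
% Write a tiny tab-delimited stand-in file (hypothetical name and contents)
fname = fullfile(tempdir, 'demo.dat');
writematrix(magic(4), fname, 'FileType', 'text', 'Delimiter', 'tab');

% Tell readtable to treat the .dat file as delimited text
T = readtable(fname, 'FileType', 'text', 'Delimiter', '\t', ...
              'ReadVariableNames', false);
A = table2array(T);                 % plain numeric matrix

% Bundle the .dat file into a .zip archive for upload
zipName = fullfile(tempdir, 'datfiles.zip');
zip(zipName, fname);
```

The 'Delimiter' and 'ReadVariableNames' options are there only to make the toy case unambiguous; readtable may detect both on its own.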
OcDrive
on 15 Aug 2023
Accepted Answer
More Answers (1)
"...that data and limit it by four points - creating a matrix - as follows:
(11,5).....(21,5)
(11,13)....(21,13)"
I don't follow what the above is intended to represent. Do you mean you want to keep only four elements out of each day's worth of data, given by those four indices, or the array data(11:21,5:13) from each day?
Either is relatively trivial, just need to know what, specifically, is intended.
Also, the file shows up with an extra linefeed or two in the first rows, at least in my browser; the day in the y,m,d date is split into single characters on the first record. One presumes/hopes that isn't real...
data=readlines('data.dat.txt'); % see what content is as string
data(1:5,:)
Ah, so, looks to be tab-delimited...
nnz(double(data{1})==9) % count the tab characters (ASCII 9) in line 1
nnz(double(data{2})==9) % ...and in line 2
But, they're not same number each...readmatrix may not work as desired, let's see about that...
data=readmatrix('data.dat.txt');
whos data
data(1:5,1:7), data(1:5, end-4:end)
data(end-4:end,1:10)
Well, readmatrix does seem able to handle it; let's see about finding the dates using the NaN that the leading tab produces...
ixDate=isnan(data(:,1)); % logical vector records starting with a NaN
nnz(ixDate) % how many are there?
Aha! The number we would have expected for each day of a non-leapyear year...that's most excellent!
We can then get the dates easily enough; one presumes the data size is consistent for each...but let's check that out...
dataSize=diff(find(ixDate)); % the distance between date records
N=unique(dataSize) % easy way to see if all the same and what is
And, they are all same size, with 17 data lines between...but that isn't commensurate with what would appear to be 21 rows requested above?
But, it's simple enough then to reshape the file however wanted...
dates=datetime(data(ixDate,[2 4 6])); % convert y,m, d to datetime
dates([1:3 end-2:end]) % and see if got it right
Looks ok; goes from first to last day of year.
ERRATUM: LOOK MORE CLOSELY, THE LAST THREE AREN'T RIGHT!!! There's a fuller explanation at bottom, the quick fixup corrections are..
dateData=data(ixDate,[2 4:6]); % columns of y, m, 10s,1s day
dateData(isnan(dateData))=0; % fixup the initial 10s day column
dates=datetime(dateData(:,1),dateData(:,2),10*dateData(:,3)+dateData(:,4)); % fix day and convert
dates([1:3 end-2:end])
OK, so now that does look more better...@OcDrive, you'll need to verify this behavior in the real file(s); would be a good thing to fix at the source if possible.
So, now, get rid of the date records and convert to 3D by the number records/day...
data=data(~ixDate,:); % remove the dates (keep not date)
whos data
data=mat2cell(data,repmat(N-1,1,numel(dates)),size(data,2)); % by records, width to a cell array
whos data
data=cat(3,data{:});
whos data
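The same stacking can also be done without the intermediate cell array. The sketch below uses synthetic data; the sizes nRec, nCol and nDays are assumptions chosen for illustration, not values read from the real file:

```matlab
% Synthetic stand-in: nDays blocks of nRec rows each, stacked vertically
nRec = 17; nCol = 25; nDays = 365;      % assumed sizes, illustration only
flat = rand(nRec*nDays, nCol);

% Transpose so each record is a column, split by day, then swap dims back
cube = permute(reshape(flat.', nCol, nRec, nDays), [2 1 3]);

% Same result via the mat2cell/cat route used above
c     = mat2cell(flat, repmat(nRec, 1, nDays), nCol);
cube2 = cat(3, c{:});
same  = isequal(cube, cube2);           % both give nRec-by-nCol-by-nDays
```

Either way, page k of the 3D array holds the records for dates(k).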
3 comments
Chetan Bhavsar
on 15 Aug 2023
I guess the OP actually wanted a sub-matrix of the main matrix, written as
(Col,Row)
(11,5).....(21,5)
(11,13)....(21,13)
but looking at data it should be
(Row,Col)
(5,11).....(5,21)
. .
. .
. .
(13,11)....(13,21)
dpb
on 15 Aug 2023
I couldn't interpret that desire so left as "exercise for Student" to select whatever range is desired...that would be some sort of colon indexing operation.
Your last presumption would simply be
r1= 5; r2=13;
c1=11; c2=21;
data=data(r1:r2,c1:c2,:);
presuming the wish is intended as inclusive.
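A sketch of that selection on a synthetic 3D array, including pulling out one day's block by its date. The array sizes and the year 2023 are assumptions for illustration only:

```matlab
% Synthetic records-by-columns-by-days array (assumed sizes)
nRec = 17; nCol = 25; nDays = 365;
cube  = reshape(1:nRec*nCol*nDays, nRec, nCol, nDays);
dates = datetime(2023,1,1) + days(0:nDays-1).';   % hypothetical year

% Inclusive row/column limits, applied to every day at once
r1 =  5; r2 = 13;
c1 = 11; c2 = 21;
sub = cube(r1:r2, c1:c2, :);            % 9-by-11-by-365

% Look up the block belonging to a particular date
blk = sub(:, :, dates == datetime(2023,2,11));
```

Keeping dates as a separate vector aligned with the third dimension is one simple way to associate each block with its day.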
Actually, the dates above aren't correct for the end...looks like maybe there isn't always a blank column there after some point? Mayhaps have to do something more there...
data=readmatrix('data.dat.txt'); % re-read the file as a numeric matrix
ixDate=isnan(data(:,1)); % logical vector records starting with a NaN
nnz(ixDate) % how many are there?
dateData=data(ixDate,:); % extract those records to inspect more carefully
dateData([1:5 end-4:end],:)
Yeah, well that sucks...the format does change somewhere from beginning to end.
Oh! Actually, the day is in two fields, the original NaN should be zeros in column 5 and then column 5:6 should be interpreted as one instead. Not sure if that is real or a figment of having uploaded the file; OP will have to determine what's going on there for sure.
To fix this as it was interpreted would be something like
dateData=data(ixDate,[2 4:6]); % pick out the date data columns
dateData(isnan(dateData))=0; % convert the initial nan-->0
dateData(:,3)=10*dateData(:,3)+dateData(:,4); % fixup the day number from the two columns
dates=datetime(dateData(:,1:3)); % convert y,m, d to datetime
dates([1:3 end-2:end]) % and see if got it right
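To check the fixup in isolation, here is a sketch on three hand-made date rows laid out the way readmatrix appeared to return them. The column layout and the year 2023 are assumptions; the real files need to be verified:

```matlab
% Assumed layout: [NaN, year, NaN, month, tens-of-day, ones-of-day]
rows = [NaN 2023 NaN  1 NaN 1;    % 1 Jan:  tens digit missing, read as NaN
        NaN 2023 NaN  1 NaN 9;    % 9 Jan
        NaN 2023 NaN 12   3 1];   % 31 Dec: day split into 3 and 1

dd = rows(:, [2 4:6]);            % year, month, tens, ones
dd(isnan(dd)) = 0;                % a missing tens digit means zero
dayNum = 10*dd(:,3) + dd(:,4);    % rebuild the day of month
dates  = datetime(dd(:,1), dd(:,2), dayNum);
```

Taking columns [2 4 6] alone would read the last row as 1 Dec rather than 31 Dec, which is the error seen at the end of the year.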