Import a non-rectangular string file using textscan
Hello all!
I'm currently working on social networks and have a text file of names that I have to import in order to create an adjacency matrix. The format of the file is as follows:
Author1 Recipient1,Recipient2,Recipient3 "Date"
Author2 Recipient4,Recipient3 "Date"
Author3 Recipient5 "Date"
Author2 Recipient3,Recipient4,Recipient6 "Date"
etc
Using the code below, I have no problem importing the author or date lists.
fid = fopen('txtfiles.txt');
% 'CollectOutput' is a name-value option, so it needs an explicit value
C = textscan(fid, '%s %s %q %*[^\n]', 'CollectOutput', true);
fclose(fid);
The trouble is with the recipient lists, since I have no way of knowing in advance the maximum number of recipients per line. I'd like to have a rectangular cell array (CA) so that I can read them straight away. What I get at the moment is a CA like this:
Recipient1,Recipient2,Recipient3
Recipient4,Recipient3
Recipient5
Recipient3,Recipient4,Recipient6
So I run textscan on that CA once more and get a nested CA like this:
3x1cell
2x1cell
'Recipient5'
3x1cell
At the moment, I search for the commas using:
findCommas = strfind(mentions, ',');
emptyCell = cellfun(@isempty, findCommas);
commaPos = find(~emptyCell);
In addition, I use a for-loop to expand the nested cells described above. As you can imagine, all this takes forever when I have to process 2M entries. Is there anything I can do to get a CA of strings for the recipients, rather than a CA of mixed-format data?
Thanking you in anticipation,
Dinos
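One possible way to get the rectangular CA without a per-row loop is sketched below. It is only a sketch under stated assumptions: the variable `recipients` is a hypothetical stand-in for the recipients column returned by textscan, rows with fewer recipients are padded with empty strings, and `repelem` is available (R2015a or later). It has not been timed on 2M entries.

```matlab
% Hypothetical sample standing in for the recipients column from textscan.
recipients = {'Recipient1,Recipient2,Recipient3'; ...
              'Recipient4,Recipient3'; ...
              'Recipient5'; ...
              'Recipient3,Recipient4,Recipient6'};

% Split every row on commas in one call; parts{k} is a 1-by-n_k cell row.
parts  = regexp(recipients, ',', 'split');
counts = cellfun(@numel, parts);        % recipients per row (column vector)
nRows  = numel(parts);
maxLen = max(counts);

% Preallocate the rectangular array, padded with empty strings.
padded = repmat({''}, nRows, maxLen);

% Flatten all tokens in row order and compute their (row, column) subscripts.
allTokens = [parts{:}];                             % 1-by-sum(counts) cell
rowIdx    = repelem((1:nRows).', counts);           % row of each token
offsets   = repelem(cumsum([0; counts(1:end-1)]), counts);
colIdx    = (1:numel(allTokens)).' - offsets;       % position within its row

% Scatter the tokens into place with a single indexed assignment.
padded(sub2ind([nRows, maxLen], rowIdx, colIdx)) = allTokens(:);
```

For the sample above, `padded` is a 4-by-3 cell array of strings whose unused slots hold `''`. The `rowIdx` vector also records which input line each recipient came from, which may be useful later when building the adjacency matrix.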