Traversing Text Document Matlab
3 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
Please provide guidance on this particular inquiry. All responses are highly valued and will be used to further knowledge(not just looking for a copy and paste solution). I am attempting to read a Microsoft Word dictionary into Matlab. From here I would like to be able to traverse it and extract words of a specific length, say four letter words, and put them into an array. Then I would like to select random words from the array and put them into a matrix. ?
0 comentarios
Respuestas (1)
Adam Danz
el 17 de Nov. de 2019
Editada: Adam Danz
el 17 de Nov. de 2019
Reading from word doc
Here's the general approach to reading a Microsoft word document.
directory = 'C:\Users\AOC\Documents\MATLAB';
file = 'myDocFile.docx';
% Full path to the MS Word file
filePath = fullfile(directory,file);
% Read MS Word file using actxserver function
word = actxserver('Word.Application');
wdoc = word.Documents.Open(filePath);
txt = wdoc.Content.Text;
Quit(word)
delete(word)
The variable txt is a char array containing the text in your document.
Extracting 4-letter words
There are several approaches you could use. This one is fast and doesn't require segementing each word and counting each word-length. Instead, it uses a regular expression to search for this pattern:
[non-letter],[4-letters],[non-letter]
It also uses strtrim() to remove the leading and trailing white space.
% Extract 4-letter words.
s = strtrim(regexp(txt, '([^a-zA-Z])[a-zA-Z]{4}([^a-zA-Z])', 'match'));
s is a 1xn cell array of 4-letter words at character arrays.
Randomly select words
You can't put non-numeric values into a matrix but you can put them into a cell array. This example below chooses n random values from the extracted words.
n = 10;
if n > numel(s)
error('There are only %d words available. You selected %d words.' numel(s), n)
end
randIdx = randi(numel(s),1,n);
randWords = s(randIDx); % Here is your random selection
Ver también
Categorías
Más información sobre Text Files en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!