Strfind to contain complex pattern

1 view (last 30 days)
I have created the following program to search for sentences. I want to include those that only begin with vowels.
a = 'John played volleyball. I love Anna. Are you there?'
b = strfind(a,'.')
y{1} = a(1:b(1))
for i=2:length(b)
y{i} = a(b(i-1)+1:b(i))
y{i+1} = a((b(i)+1):end)

Accepted Answer

Cedric Wannaz
Cedric Wannaz on 3 Jan 2018
Edited: Cedric Wannaz on 3 Jan 2018
Here is an approach:
str = 'John played volleyball. I love Anna. Are you there?' ;
buffer = strtrim( strsplit( str, {'.', '?', '!'} )) ;
for k = numel( buffer ) : -1 : 1
if isempty( buffer{k} ) || ~any( upper(buffer{k}(1)) == 'AEIOUY' )
buffer(k) = [] ;
Running this, you get:
>> buffer
buffer =
1×2 cell array
{'I love Anna'} {'Are you there'}
If you don't understand, evaluate the following expression independently, and analyze their output:
strsplit( 'John played volleyball. I love Anna. Are you there?', {'.', '?', '!'} )
buffer = strtrim( strsplit( 'John played volleyball. I love Anna. Are you there?', {'.', '?', '!'} ))
upper(buffer{1}(1)) == 'AEIOUY'
any( upper(buffer{1}(1)) == 'AEIOUY' )
upper(buffer{2}(1)) == 'AEIOUY'
any( upper(buffer{2}(1)) == 'AEIOUY' )
PS: this could also be done using regular expressions, but more classic approaches (like the above) should be understood first.
PS2: your first attempt is good actually. You try to implement STRSPLIT and it works to some extent; it was good training, but it would have to be extended to support multiple delimiters. If you run the expressions above, you will realize that STRSPLIT does the split (outputs a cell array). You may have to use another approach if you need to keep the delimiters (.?!) though (see PS3). STRSPLIT leaves the leading and trailing spaces, which is a problem if you want to test the first character, hence the call to STRTRIM. The output of STRTRIM is a cell array of "clean" sentences. The loop removes all entries that don't start with a vowel. It needs to go backwards from the end of the array, otherwise indexing is messed up.
PS3: If you need to keep the punctuation, replace the call to STRSPLIT with a call to REGEXP, as follows:
buffer = strtrim( regexp( str, '.*?[\.!\?]', 'match' )) ;

More Answers (0)


Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by