Regular Expression to match strings after a certain number of words that do not contain a keyword

I'm trying to use regular expressions to retrieve the middle of a string. In the default case I need to match everything after the first two words; in the non-default case I need to match everything after the first two words that do not contain a keyword.
An example input character array is:
defaultCase = '1.2.3.4 Hello\ - my name is Bob'
This is fairly easy to handle: the regex below looks behind for two whitespace-delimited words that each contain at least one word character (\w), then matches everything from the next word character onward.
%Returns 'my name is Bob'
matchedString = regexpi(defaultCase,'(?<=(\S*\w\S*\s[\s\W]*){2})\w.*','match','once')
Harder Case
nonDefault1 = '1.2.3.4 Hello Matlabbers - my name is Bob'
nonDefault2 = '1.2.3.4 Matlabbers - Hello - my name is Bob'
In this case I want the lookbehind to skip the word Matlabbers when counting, so that the output is still 'my name is Bob'.
The best I've come up with is something like the following:
%Returns 'my name is Bob'
regexpi(nonDefault2,'(?<=[\d\.]+\s+(Matlabbers)?\W*(?!Matlabbers)\S*\w\s\W+)\w[^\(]*\w','match','once')
This works for nonDefault2, but not in general. Does anyone know of a robust way to do this?

Accepted Answer

Matthew on 2 Jan 2018
I ended up building this up piece by piece.
%Demonstration Cases
defaultCase = '1.2.3.4 Hello\ - my name is Bob';
nonDefault1 = '1.2.3.4 Hello Matlabbers - my name is Bob';
nonDefault2 = '1.2.3.4 Matlabbers - Hello - my name is Bob';
%Word to skip during counting
skipBasic = 'Matlabbers';
%Set up the regular expression
word = '(\S*[a-zA-Z0-9]+\S*)';
space = '(\s[\W\s_]*)';
skipWord = ['(\S*' skipBasic '\S*)'];
skipWordSpace = ['(',skipWord space '?)'];
wordSpace = ['(',word space '?)'];
nonSkipWord = ['(\<(?!' skipWord ')' word '\>)'];
pairedWord = ['(' skipWordSpace '*' nonSkipWord ')'];
firstTwoPairedWords = ['^(' pairedWord space '){2}'];
unwantedFirstPart = ['(' firstTwoPairedWords,skipWordSpace,'*)'];
wantedPart = ['(?<=' unwantedFirstPart ')' nonSkipWord space wordSpace '*'];
%Create the parser
endString = @(inputString) regexpi(inputString,wantedPart,'match','once');
%Apply the parser to the examples
disp(endString(defaultCase))
disp(endString(nonDefault1))
disp(endString(nonDefault2))
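
For comparison, the same counting rule can be sketched without one large pattern by splitting the input into whitespace-delimited tokens and counting them in a loop. This is only an illustrative alternative, not the approach used in the answer above: the function name afterTwoWords and its arguments are hypothetical, it assumes the function is saved in its own file afterTwoWords.m, and it assumes a "word" is any token containing at least one character from [a-zA-Z0-9]. It relies on contains, which requires R2016b or later.

```matlab
function rest = afterTwoWords(str, skipWord)
%AFTERTWOWORDS Illustrative sketch (hypothetical helper, not from the answer).
%   Returns the part of STR that starts at the third counted word, where
%   tokens containing SKIPWORD (case-insensitive) are never counted.

% Split on whitespace, keeping the start index of every token
[tokens, starts] = regexp(str, '\S+', 'match', 'start');

counted = 0;   % number of non-skipped words seen so far
rest = '';     % returned unchanged if fewer than three words qualify

for k = 1:numel(tokens)
    % Index of the first alphanumeric character in this token, or []
    firstAlnum = regexp(tokens{k}, '[a-zA-Z0-9]', 'once');
    if isempty(firstAlnum)
        continue   % pure punctuation such as '-' is not a word
    end

    % Tokens that contain the keyword are ignored entirely
    if ~isempty(skipWord) && contains(tokens{k}, skipWord, 'IgnoreCase', true)
        continue
    end

    if counted == 2
        % Third counted word: return from its first alphanumeric character
        rest = str(starts(k) + firstAlnum - 1 : end);
        return
    end
    counted = counted + 1;
end
end
```

Under those assumptions, afterTwoWords(defaultCase, 'Matlabbers'), afterTwoWords(nonDefault1, 'Matlabbers') and afterTwoWords(nonDefault2, 'Matlabbers') each return 'my name is Bob'. The loop trades the compactness of a single regex for a counting rule that is easier to read and change.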
