regexprep incorrect multiple replacement
Paolo
on 5 Jun 2018
Commented: Paolo
on 5 Jun 2018
Let's say we have the following char vector as input:
str = 'abc(1,2,3)';
I would like to replace '1','2' and '3' with different numbers.
Let's say I want to replace the numbers with the following numbers:
rep = {'5';'8';'3'};
My desired output is:
str = 'abc(5,8,3)';
The format for using regexprep is:
regexprep(str,expression,replace)
I have tried to solve the problem in two ways:
- One expression.
expression = '\d';
replace = {'5';'2';'3'};
regexprep(str,expression,replace)
ans = 'abc(3,3,3)'
The output is incorrect, despite the documentation stating:
If replace is a cell array of N character vectors and expression is a single character vector, then regexprep attempts N matches and replacements.
- Multiple expressions.
expression = {'\d';'\d';'\d'};
replace = {'5';'2';'3'};
regexprep(str,expression,replace)
ans = 'abc(3,3,3)'
The output for the second case is incorrect, despite the documentation stating:
If both replace and expression are cell arrays of character vectors, then they must contain the same number of elements. regexprep pairs each replace element with its corresponding element in expression.
In both cases regexprep is replacing all three matches using only the last value from the replace cell array, rather than all three.
What am I missing?
2 comments
Stephen23
on 5 Jun 2018
Edited: Stephen23
on 5 Jun 2018
"The output is incorrect, despite the documentation stating:..."
"What am I missing?"
The output is correct in both cases. The documentation states that it "...attempts N matches and replacements": so it matches the digits and replaces them with cell one, then it starts afresh and matches the digits and replaces them with cell 2, then it starts afresh and matches the digits and replaces them with cell 3. Which is exactly the output you are getting.
Each time, regexprep starts parsing the string from the beginning again, whereas you assumed that it continues from where the previous replacement finished. To get the behavior you want, you will have to add a dynamic expression of some kind.
Accepted Answer
Walter Roberson
on 5 Jun 2018
regexprep(S, {A, B}, {P, Q})
is the same as
regexprep(regexprep(S, A, P), B, Q)
That is, the first pair is applied to the entire string, and the second pair is applied to the string that results.
It appears to you that only the third replacement was done because the text inserted by each earlier replacement is itself matched by the second and third patterns and gets replaced again.
The 'once' option will not solve the problem.
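A minimal sketch of that equivalence, using the values from the question. The loop and the variable names out and outOnce are only for illustration; the loop spells out the nested calls one pass at a time, and the second loop shows why 'once' does not help: each pass still rescans from the start of the string.

```matlab
str = 'abc(1,2,3)';
replace = {'5';'2';'3'};

% Apply each replacement to the result of the previous pass,
% which is what the nested regexprep calls amount to.
out = str;
for k = 1:numel(replace)
    out = regexprep(out, '\d', replace{k});
    disp(out)   % abc(5,5,5), then abc(2,2,2), then abc(3,3,3)
end

% Same loop with 'once': only the first digit is replaced per pass,
% but every pass starts again at the beginning of the string,
% so the first digit is overwritten three times.
outOnce = str;
for k = 1:numel(replace)
    outOnce = regexprep(outOnce, '\d', replace{k}, 'once');
end
disp(outOnce)   % abc(3,2,3)
```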
3 comments
Walter Roberson
on 5 Jun 2018
str = 'abc(1,2,3)';
regexprep(str, '\d+(\D+)\d+(\D+)\d+', '5$18$23')
The $1 in the replacement refers to the text captured by the first () group, and $2 to the text captured by the second. The pattern matches one or more digits, captures the run of non-digits that follows, matches another run of digits, captures the run of non-digits that follows, and then matches a final run of digits. The whole match is replaced with the fixed text 5, the first captured run of non-digits, the fixed text 8, the second captured run of non-digits, and the fixed text 3.
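If the number of matches is not fixed in advance, one alternative (not from this thread, just a sketch) is to split the string at the digit runs with regexp and interleave the replacements with the pieces in between. The names parts and pieces are made up for the example, and it assumes there is exactly one replacement per match.

```matlab
str = 'abc(1,2,3)';
rep = {'5';'8';'3'};

% 'split' returns the text between the matches: {'abc(', ',', ',', ')'}
parts = regexp(str, '\d+', 'split');

% Assumption of this sketch: one replacement for each match.
assert(numel(parts) == numel(rep) + 1, 'Expected one replacement per match.');

% Stack the pieces over the replacements (padded with '' at the end);
% reading the 2-by-N cell column by column interleaves them.
rep = reshape(rep, 1, []);
pieces = [parts; [rep, {''}]];
out = [pieces{:}];
disp(out)   % abc(5,8,3)
```

Because each match is paired with its own replacement by position, text that has already been inserted is never rescanned.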