Removing parentheses around digits using regular expressions

Dear all, I am slowly making progress in learning regular expressions. At the moment, I am trying to solve the following problem: replace all occurrences of (n) with n, where n is a number, provided that no alphabetical letter occurs immediately before the opening parenthesis. As an example,
str='(2)+p_5*(3)-(0.3)'
would become
2+p_5*3-0.3
I wrote the following
regexprep(str,'(\W)(\()([.0123456789]+)(\))','$1$3')
but it does not solve the problem if one of the expressions to change occurs at the beginning, as in the example above. More concretely, the answer I get from running this is
(2)+p_5*3-0.3
which is not the expected result.
Thanks in advance for any help
Pat.

Accepted Answer

Matt Fig on 5 Oct 2012
Edited: Matt Fig on 5 Oct 2012
regexprep(str,'(\()([\d*\.]+)(\))','$2')

10 Comments

Hi Matt,
Your solution does not work on this example
(2)+p_5*(3)-(0.3)+exp(3)
No change should occur if the opening parenthesis is preceded by an alphabetical letter.
Thanks,
Pat
Matt Fig on 5 Oct 2012
Edited: Matt Fig on 5 Oct 2012
Try this one:
regexprep(str,'(?<![\w])(\()([\d*\.]+)(\))','$2')
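Note: (?<![\w]) is an example of a negative look-behind. The match is allowed only if the character just before the opening parenthesis is not a word character.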
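To see why the look-behind helps, the following sketch compares it with a pattern that consumes the preceding character, as the original attempt did. The variable names and the test string are chosen here for illustration only, and [\d.]+ is used as a simplified number pattern.

```matlab
% Illustrative comparison; variable names are hypothetical.
str = '(2)+p_5*(3)-(0.3)+exp(3)';

% (\W) must consume a real character before "(", so the "(2)" at the
% very start of the string has no preceding character and is skipped.
consumed = regexprep(str, '(\W)\(([\d.]+)\)', '$1$2');

% (?<!\w) only asserts that the previous character is not a word
% character. It consumes nothing, so it also succeeds at the string start.
lookbehind = regexprep(str, '(?<!\w)\(([\d.]+)\)', '$1');

disp(consumed)    % (2)+p_5*3-0.3+exp(3)
disp(lookbehind)  % 2+p_5*3-0.3+exp(3)
```

Because the look-behind consumes nothing, the replacement string only needs the number token, not the preceding character.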
Yes, and it seems to work!
str='(2)+p_5*(3)-(0.3)+sin(0)-cos(6.7)+(3)^(2-3) + (55.55)'
regexprep(str,'(?<![\w])(\()([\d*\.]+)(\))','$2')
ans =
2+p_5*3-0.3+sin(0)-cos(6.7)+3^(2-3) + 55.55
This is very helpful. Actually, the period (.) should occur at most once, so a more accurate expression would be
regexprep(str,'(?<![\w])(\()(\d*\.?\d*)(\))','$2')
Thanks a lot Matt.
Pat.
(\d*\.?\d*) matches a single "."
(\d*\.?\d*) also matches the empty string.
It does not, however, match infinity, or any irrational number.
"matches the empty string", what does that mean with regular expressions? In what type of cases could matching an empty string cause an effect?
For example, in 1+()*2 the empty string inside the () would match, because all parts of \d*\.?\d* are optional, so the expression would be converted to 1+*2
Also, I note that the problem description does not disallow numeric characters before the (), so 1+Henkel2(7)*5 would be converted to 1+Henkel27*5
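The effect of the empty match can be checked directly. The sketch below is only an illustration: the variable names are made up, and the pattern is the one from the comment above with the redundant groups dropped.

```matlab
% Pattern in which every part of the number is optional.
loose = '(?<!\w)\((\d*\.?\d*)\)';

% "()" matches with an empty token, so the parentheses simply vanish.
r1 = regexprep('1+()*2', loose, '$1');

% "(.)" matches too: zero digits, one period, zero digits.
r2 = regexprep('1+(.)*2', loose, '$1');

% A real number is still handled as intended.
r3 = regexprep('1+(2.5)*2', loose, '$1');

disp(r1)  % 1+*2
disp(r2)  % 1+.*2
disp(r3)  % 1+2.5*2
```

Requiring at least one digit somewhere in the number pattern is one way to avoid the first two results.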
per isakson on 5 Oct 2012
Edited: per isakson on 6 Oct 2012
Yes.
Assuming the requirement is to match numbers only, the second expression below seems closer to a working one. However, what expression is taught in the book?
>> regexprep( '(12),(.12),(12.),(1.2),(.),()' ...
, '\((\d*\.?\d*)\)', '#$1' )
ans =
#12,#.12,#12.,#1.2,#.,#
>> regexprep( '(12),(.12),(12.),(1.2),(.),()' ...
, '\(((\d+\.?\d*)|(\d*\.\d+))\)', '#$1' )
ans =
#12,#.12,#12.,#1.2,(.),()
\w takes care of the Henkel2 case, whether by intent or not. However, are there any cases in which "(" should be replaced when preceded by a digit?


More Answers (2)

Walter Roberson on 5 Oct 2012
Consider using a "look-behind"

2 Comments

Patrick Mboma on 5 Oct 2012
Edited: Patrick Mboma on 5 Oct 2012
Hi Walter,
My knowledge of regular expressions is not that good yet... I don't know yet how to implement "look behinds".


per isakson on 5 Oct 2012
Edited: per isakson on 5 Oct 2012
>> regexprep( str, '\(([\d.]+)\)', '$1' )
ans =
2+p_5*3-0.3
str =
(2)+p_5*(3)-(0.3)+exp(3)
>> regexprep( str, '\(([\d.]+)\)', '$1' )
ans =
2+p_5*3-0.3+exp3
  • \( represents "("
  • \) represents ")"
  • [\d.]+ represents one or more digits and periods, e.g. "....0" and "2"
  • (expr) "Group regular expressions and capture tokens." The token may be referred to in the replacement string as $1, with "1" because it is the first group
Every substring in the string that matches this expression is replaced, i.e. a number enclosed by parentheses is replaced by the number.
--- in response to a comment ---
>> regexprep( str, '(?<!\w)\(([\d.]+)\)', '$1' )
ans =
2+p_5*3-0.3+exp(3)
Better:
>> regexprep( str, '(?<![a-zA-Z])\(([\d.]+)\)', '$1' )
because \w also includes digits (and the underscore).
  • (?<![a-zA-Z]) "Look behind from current position and test if expr is not found." Where expr evaluates to a letter. Thus, if preceded by a letter there is no match.
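The two look-behind variants differ only when the character before "(" is a digit or an underscore. The sketch below illustrates that trade-off. The test string and variable names are assumptions made for this example, not part of the original posts.

```matlab
% Hypothetical test string with a function name that ends in a digit.
str = '(2)+exp(3)+atan2(9)';

% \w covers letters, digits and underscore, so "atan2(9)" is left alone.
wordGuard = regexprep(str, '(?<!\w)\(([\d.]+)\)', '$1');

% [a-zA-Z] covers letters only: the "2" before "(9)" is not a letter,
% so the parentheses after atan2 are removed as well.
letterGuard = regexprep(str, '(?<![a-zA-Z])\(([\d.]+)\)', '$1');

disp(wordGuard)    % 2+exp(3)+atan2(9)
disp(letterGuard)  % 2+exp(3)+atan29
```

Which variant is preferable depends on whether a parenthesized number directly after a digit should ever be unwrapped.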

7 Comments

Hi Per, This would not work. Here is a counter-example
'(2)+p_5*(3)-(0.3)+exp(3)'
your solution would give
'2+p_5*3-0.3+exp3'
per isakson on 5 Oct 2012
Edited: per isakson on 5 Oct 2012
Yes, it does. What should it give? OK: no change when preceded by a letter.
What about atan2()?
The solution I posted in the comments to my answer handles atan2.
str='(2)+p_5*(3)-(0.3)+cos(5.7)+(3)^(2-3) + (.55) + atan2(9)';
regexprep(str,'(?<![\w])(\()([\d*\.]+)(\))','$2')
ans =
2+p_5*3-0.3+cos(5.7)+3^(2-3) + .55 + atan2(9)
per isakson on 5 Oct 2012
Edited: per isakson on 5 Oct 2012
Yes, and that is because "\w" stands for "[a-zA-Z_0-9]". However, it is difficult to know whether including "0-9" might have any unintended side effects.
I find it difficult to construct robust expressions. However, I have never tried to learn regular expressions in a systematic way.
I am currently doing just that, with you guys' help.
Thanks to everybody.
Pat.

