how to extract strings between two newline characters using regexp

then the \n at the end "eats" the \n after '1 2', leaving the remaining string as '3 4\n'. That string then does not match the pattern, which begins with \n.

Consider

a = regexp(S, '(?<=\n)\d[^\n]*', 'match')

This leaves the \n in place: the (?<=\n) part requires that a \n precede the digit, but does not include that \n in the output.
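A minimal sketch of the difference, assuming the test string from this thread. The "consuming" pattern below is a hypothetical stand-in for the kind of pattern that begins and ends with \n; it is not necessarily the exact pattern from the question.

```matlab
% Test string from the thread: newline, '1 2', newline, '3 4', newline.
S = sprintf('\n1 2\n3 4\n');

% Hypothetical consuming pattern: both newlines are part of the match,
% so the newline after '1 2' is used up and '3 4' has no leading \n left.
consumed = regexp(S, '\n\d[^\n]*\n', 'match');

% Lookbehind version: the \n must be present but is not consumed,
% so it is still available as the delimiter for the next line.
kept = regexp(S, '(?<=\n)\d[^\n]*', 'match');

disp(numel(consumed));   % number of matches with the consuming pattern
disp(kept);              % matches with the lookbehind pattern
```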

Walter Roberson on 13 Jul 2017

Overlapping matches in regular expressions typically require zero-width assertions.

Some programming languages support an "overlapped" switch; the newer Python regex module is one example.

In some programming languages, such as Perl, there are tricks that essentially evaluate code in the middle of a match, but that code has to get a bit complicated to handle backtracking properly. See for example http://www.perlmonks.org/?node_id=463461
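As an illustration of the zero-width approach in MATLAB, the sketch below wraps the line content in a lookbehind and a lookahead, so that the same \n can close one match and open the next. The pattern is an assumption for illustration, not one posted in this thread.

```matlab
S = sprintf('\n1 2\n3 4\n');

% (?<=\n) and (?=\n) only test for a newline; neither consumes it.
% The newline between '1 2' and '3 4' therefore serves both matches.
between = regexp(S, '(?<=\n)[^\n]+(?=\n)', 'match');

disp(between);
```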

Li Xue on 13 Jul 2017

I see. Many thanks for your help.

Answer 2

Sayam Ganguly on 12 Jul 2017

From your question I understand that you have the string '\n1 2\n3 4\n' and want to extract '1 2' and '3 4' into a 1-by-2 cell array. I would like to suggest a different regexp that should do what you need.

a=regexp(S, '.*','match','dotexceptnewline')

Here '.*' matches any run of characters, but because of the 'dotexceptnewline' option the dot does not match '\n', so each line is matched separately and you get a 1-by-2 cell array with the desired result. With your approach, the entire pattern matched only once and was not repeated.
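A short sketch comparing the default behavior with 'dotexceptnewline', assuming the same test string. It relies on regexp's default of not returning empty matches, which is why no empty cells appear in the result.

```matlab
S = sprintf('\n1 2\n3 4\n');

% Default: the dot also matches newline, so '.*' takes the whole string at once.
whole = regexp(S, '.*', 'match');

% With 'dotexceptnewline' the dot stops at each \n, so each line is its own match.
perLine = regexp(S, '.*', 'match', 'dotexceptnewline');

disp(numel(whole));
disp(perLine);
```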

Li Xue on 13 Jul 2017

Edited: Li Xue on 13 Jul 2017

Yes, Perl can achieve quite complicated tasks. But the rules of Perl regular expressions are very simple; for a programmer they are easy to remember and use. MATLAB regular expressions, on the other hand, have a lot of tricks, such as 'dotexceptnewline'. They look simpler than Perl's, but in practice it takes more time to figure them out.

Walter Roberson on 13 Jul 2017

I programmed in Perl for a few years. The rules are not easy to remember. There are multiple books explaining Perl regular expressions, for example O'Reilly's "Mastering Regular Expressions" (http://shop.oreilly.com/product/9780596528126.do), which is over 500 pages.

Most authors of Perl regular expressions make mistakes even on comparatively simple tasks such as matching valid floating-point numbers. Hardly anyone gets tasks such as balancing brackets right (a task that is not possible with true regular expressions or with Perl basic regular expressions; it requires Perl extended regular expressions).

Answer 3

Jan on 12 Jul 2017

S = sprintf('\n1 2\n3 4\n')
C = strsplit(S, '\n');

Li Xue on 13 Jul 2017

strsplit will create two additional empty cells.

Jan on 13 Jul 2017

Then add:

C(cellfun('isempty', C)) = [];
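Put together, the strsplit approach from this answer might look like the sketch below. It assumes the same test string and that strsplit interprets '\n' as a newline, which is its default handling of escape sequences in the delimiter.

```matlab
S = sprintf('\n1 2\n3 4\n');

% Splitting on newline yields an empty cell before the first \n
% and another after the last \n, in addition to the two lines.
C = strsplit(S, '\n');
nBefore = numel(C);

% Remove the empty cells, keeping only the non-empty lines.
C(cellfun('isempty', C)) = [];

disp(nBefore);
disp(C);
```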
