Regex for string match
ML_Analyst
on 24 Sep 2023
Commented: ML_Analyst
on 25 Sep 2023
I have a huge string array of size 1x50000, like below:
Stock_field1_img
Sys_tim_valt98.qaf.rat.app.gui
Enable1.HSB_setblcondition.Enable_logic.ui
P_k12.delay.init_func_delay_update.Sys
#fat_11ks.ergaa.ths.dell
$thispt.dynmem11.ide.gra
.....
.....
I am looking for a regex that can search this array based on user input. For example:
if the user gives st*, it should return all strings starting with "st",
if the user gives *st, it should return all strings ending with "st",
if the user gives *st*, it should return all strings that contain "st" anywhere between the start and the end,
the user can also give *st*app.*sys*, in which case it should return all strings that contain "st", followed later by "app.", followed later by "sys".
I tried the following, along with several other combinations,
expression = ['\w*', signal, '\w*'];
a = regexp(str_array, expression, 'match', 'ignorecase');
but it doesn't work as intended. Could someone help with this?
Accepted Answer
Voss
on 24 Sep 2023
I think it may be tricky to get this to work for every possible expression the user may enter, because each character that is special in regexp has to be handled in the user-input expression before it is passed to regexp. You want * to represent any character sequence, which in regexp is .*, so every * has to be replaced with .* first. Any other special character that you want to treat literally has to be escaped by prepending \, so that . becomes \. and $ becomes \$, and so on. The function get_matches defined below does this replacement explicitly for a few special characters, anchors the result with ^ and $ so that the whole string must match, and returns the matching strings, ignoring case. You can add more special characters to it as needed.
str = [
"Stock_field1_img"
"Sys_tim_valt98.qaf.rat.app.gui"
"Enable1.HSB_setblcondition.Enable_logic.ui"
"P_k12.delay.init_func_delay_update.Sys"
"#fat_11ks.ergaa.ths.dell"
"$thispt.dynmem11.ide.gra"
];
user_input = "st*"; % return any string starting with st
matched_str = get_matches(str,user_input)
user_input = "*.sys"; % ending with .sys
matched_str = get_matches(str,user_input)
user_input = "*del*"; % containing del
matched_str = get_matches(str,user_input)
user_input = "$*"; % starting with $
matched_str = get_matches(str,user_input)
user_input = "*.*d*.*"; % containing d somewhere between two .s
matched_str = get_matches(str,user_input)
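function a = get_matches(str,user_input)
% Translate the wildcard * to .* and escape ., $ and ^ so they match literally.
regex = replace(user_input,["*",".","$","^"],[".*","\.","\$","\^"]);
% Anchor the pattern to the whole string, match ignoring case, and drop the non-matches.
a = rmmissing(regexpi(str,"^"+regex+"$",'match','once'));
end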
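The compound pattern from the question, *st*app.*sys*, is not among the examples above, so here is a small sketch of what the same replacement step produces for it. None of the six sample strings match that pattern, so the two strings in candidates are hypothetical and only there for illustration.

```matlab
% Build the pattern the same way get_matches does, for the compound input
user_input = "*st*app.*sys*";
regex = replace(user_input, ["*",".","$","^"], [".*","\.","\$","\^"]);
pattern = "^" + regex + "$";   % anchored so the whole string must match
disp(pattern)

% Hypothetical test strings, not taken from the original array
candidates = ["Test_app.module.Sys"; "Stock_field1_img"];
matched = rmmissing(regexpi(candidates, pattern, 'match', 'once'))
```

Each * turns into .* and the literal dot after app turns into \., so the order st, then app., then sys is enforced while anything may appear in between.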
3 Comments
Stephen23
on 25 Sep 2023
Note that regexptranslate can be used to escape all special characters:
https://www.mathworks.com/help/matlab/ref/regexptranslate.html
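One possible way to apply this suggestion, as a rough sketch rather than a tested replacement for get_matches: escape the whole user input with regexptranslate, then turn the escaped wildcard \* back into .* before matching. The variable names escaped, pattern and is_match are made up for this example, and the sketch assumes that * is the only wildcard the user may type.

```matlab
str = [
    "Stock_field1_img"
    "Sys_tim_valt98.qaf.rat.app.gui"
    "Enable1.HSB_setblcondition.Enable_logic.ui"
    "P_k12.delay.init_func_delay_update.Sys"
    "#fat_11ks.ergaa.ths.dell"
    "$thispt.dynmem11.ide.gra"
    ];
user_input = '$*';   % wildcard expression typed by the user

% Escape every regexp metacharacter, including * (which becomes \*)
escaped = regexptranslate('escape', user_input);
% Turn the escaped wildcard \* back into "any character sequence"
pattern = ['^', strrep(escaped, '\*', '.*'), '$'];

% Keep the strings whose full text matches, ignoring case
is_match = ~cellfun(@isempty, regexpi(cellstr(str), pattern, 'once'));
matched_str = str(is_match)
```

Compared with listing characters by hand, this leaves the choice of what to escape to regexptranslate, so characters such as +, ( or [ in the user input should also be treated literally.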