Replacing only certain instances of text within a MATLAB character array

I have a large character array in MATLAB, 'lineDataA', containing many different numbers.
I would like to replace every instance of the number '6002' with '0', apart from the very first instance.
lineData = replace(lineDataA, '6002', '0');
This replaces all instances.
And
where6002 = strfind(lineDataA, '6002');
gives the positions of all the instances. However, I am not sure how to replace all the instances except the first.
Many thanks for your help,
Rob

Accepted Answer

Stephen23 on 20 Jan 2017
Edited: Stephen23 on 20 Jan 2017
Method One: split the string
>> str = '___6002__6002___6002___6002__';
>> idx = regexp(str,'6002','once','end');
>> [str(1:idx),strrep(str(idx+1:end),'6002','0')]
ans =
___6002__0___0___0__
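One edge case worth guarding against: if '6002' does not occur at all, idx comes back empty and both slices are empty, so the whole string is lost. Here is a minimal sketch of Method One with that guard added. The variable names pat, head, tail and out are my own, not from the answer, and pat is still treated as a regular expression by regexp.

```matlab
str = '___6002__6002___6002___6002__';
pat = '6002';                                % text to search for (assumed to contain no regex metacharacters)
idx = regexp(str, pat, 'once', 'end');       % end index of the first match, [] if there is none
if isempty(idx)
    out = str;                               % no match: leave the text untouched
else
    head = str(1:idx);                       % everything up to and including the first match
    tail = strrep(str(idx+1:end), pat, '0'); % replace only in the remainder
    out  = [head, tail];                     % plain concatenation keeps trailing whitespace,
                                             % which strcat would strip from char inputs
end
disp(out)
```

If pat could contain characters such as . or *, it would probably need escaping first, for example with regexptranslate('escape', pat).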
Method Two: use a placeholder
>> str = '___6002__6002___6002___6002__';
>> str = regexprep(str,'6002','\b','once');
>> str = strrep(str,'6002','0');
>> regexprep(str,'\b','6002')
ans =
___6002__0___0___0__
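Note that the original string must not already contain the placeholder, i.e. the backspace character char(8), which is what \b denotes in MATLAB regular expressions.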
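The placeholder requirement can be checked before doing anything. The following is only a sketch of Method Two with that check added. The names ph, tmp and out are assumptions of mine, and any other character known to be absent from the data would work equally well as the placeholder.

```matlab
str = '___6002__6002___6002___6002__';
ph  = char(8);                             % backspace, used as the placeholder
if any(str == ph)
    error('The placeholder already occurs in the text; pick another one.');
end
tmp = regexprep(str, '6002', ph, 'once');  % hide the first occurrence
tmp = strrep(tmp, '6002', '0');            % replace all remaining occurrences
out = strrep(tmp, ph, '6002');             % put the first occurrence back
disp(out)
```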
Method Three: dynamic regular expression
>> str = '___6002__6002___6002___6002__';
>> regexprep(str,'(.*?6002)(.*)','$1${strrep($2,''6002'',''0'')}')
ans =
___6002__0___0___0__
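For completeness, the strfind positions from the question can also be used directly. This is not one of the three methods above, just a sketch of an index-based alternative, with old and new as hypothetical variable names. It walks the matches from back to front so that the earlier indices stay valid while the string shrinks.

```matlab
lineDataA = '___6002__6002___6002___6002__';
old = '6002';
new = '0';
where6002 = strfind(lineDataA, old);   % start index of every occurrence
lineData  = lineDataA;
for k = numel(where6002):-1:2          % back to front, stopping before the first occurrence
    s = where6002(k);
    lineData = [lineData(1:s-1), new, lineData(s+numel(old):end)];
end
disp(lineData)
```

Note that strfind also reports overlapping matches. That cannot happen with '6002', but it could with a pattern such as 'aa', so this loop is only safe for patterns that cannot overlap themselves. For a very large array with many matches, rebuilding the string on every pass may be slower than the methods above.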
2 Comments
John Leal on 16 Oct 2017
I have a similar problem. I need to replace some words with others in a large array. I have the code, but it is too slow. Can you help me find a way to make it faster?
textData = regexprep(textData, '[@$/#.-:-&*+=[]?!(){},''">_<;%]|', ' ');
% Remove anything that is not a letter or a space
textData = regexprep(textData, '[^a-zA-Zñ ]', '');
textData = regexprep(textData, '[0-9]+', ' ');
textData = regexprep(textData, '<[^<>]+>', ' ');
textData = regexprep(textData, 'á', 'a');
textData = regexprep(textData, 'é', 'e');
textData = regexprep(textData, 'í', 'i');
textData = regexprep(textData, 'ó', 'o');
textData = regexprep(textData, 'ú', 'u');
textData = regexprep(textData, 'ñ', 'n');
textData = regexprep(textData, 'x', 's');
textData = regexprep(textData, 'cc', 'c');
textData = regexprep(textData, 'ci', 'si');
% deletedWords = ["helllo","hello";"moter","mother"] ... 50000 rows
% excludedWords = ["father","three","tree"] ... words I don't want to replace
% textData = ["my mother lives with my father";"hello Word"] ... 2 million rows
m = size(deletedWords, 1);
for idx = 1:m
    w_new = deletedWords{idx,1};   % misspelled word
    w_ok = deletedWords{idx,2};    % corrected word
    f = find(excludedWords == w_new, 1);
    % only if it is not in excludedWords
    if isempty(f)
        % Replace EXACT word match
        textData = regexprep(textData, "(?<![\w])" + w_new + "(?![\w])", w_ok);
    end
end
John Leal on 16 Oct 2017
The main idea is to correct misspelled words in SPANISH. It is like a handmade stemmer adjusted to my specific data. deletedWords contains each misspelled word and its correct form. These pairs are extracted from the same textData, using Jaro-Winkler similarity to map a less frequent word to a more frequent word that is more than 95% similar.
Ty
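No reply to this comment appears in the thread, so what follows is only a possible direction and not a tested fix. The loop above makes one regexprep pass over all of textData for every dictionary row, which is about 50000 passes over 2 million rows. One way that might reduce the work is to split each row into words once and look each word up in a containers.Map. The sketch below uses tiny stand-in data, assumes the text has already been cleaned so that words are separated by single spaces, and has not been timed on data of the real size. The names keep, map and words are mine.

```matlab
% Small stand-in data; the real arrays are far larger.
deletedWords  = ["helllo","hello"; "moter","mother"];
excludedWords = ["father","three","tree"];
textData      = ["my moter lives with my father"; "helllo Word"];

% Drop dictionary rows whose misspelling is on the exclusion list.
keep = ~ismember(deletedWords(:,1), excludedWords);
map  = containers.Map(cellstr(deletedWords(keep,1)), cellstr(deletedWords(keep,2)));

for k = 1:numel(textData)
    words = split(textData(k), " ");   % words of one row, as a string column
    for j = 1:numel(words)
        w = char(words(j));
        if isKey(map, w)               % whole-word lookup, so only exact matches are replaced
            words(j) = map(w);
        end
    end
    textData(k) = join(words, " ");    % reassemble the row
end
disp(textData)
```

Because the lookup is per word, this only matches the lookaround pattern in the original loop when words are delimited by spaces, which should hold after the cleaning steps but is worth checking on the real data.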
