# Cleaning string and reaction time data

Corey Magaldino on 24 Aug 2022
Commented: Corey Magaldino on 25 Aug 2022
I've got data that is in two columns.
The first column is string data that houses animal names. The second column is numeric data that houses reaction times to produce said animal names.
However, there are instances where a participant provides an out-of-bounds response (a non-animal, for instance). All out-of-bounds responses have been recoded to read OTHER.
I've been working on code to eliminate the OTHERs from the free responses and then recalculate the reaction times from the last in-bound item to the next in-bound item (essentially, each in-bound item's reaction time plus the sum of the reaction times of all out-of-bounds items immediately before it).
For instance, example data could look something like this: https://imgur.com/czO5gub
Eliminating the OTHERs in the first column was pretty straightforward with:
DAT_STR = DAT_STR(cellfun('isempty',strfind(DAT_STR,'OTHER')));
However, I am struggling to clean the reaction time data.
My first attempt was to create a logical array to determine whether 'OTHER' was present in a given row using:
logic_array = strcmp(DAT_STR,'OTHER')
From here I can find the corresponding row numbers:
row_nos = find(logic_array == 1);
And then I was going to loop through them in a for loop that looked something like this:
for k = 1:length(row_nos)
    DAT_RT((row_nos(k))+1) = DAT_RT(row_nos(k)) + DAT_RT((row_nos(k))+1);
end
This loop takes the RT for each OTHER response and adds it to the RT of the next item, which is the intended behavior. The loop works really well for one-off 'OTHER' responses; however, it does a terrible job when there are consecutive 'OTHER' responses.
I've been beating my head against the wall trying to figure this out lol. My next attempt was to create 'start' and 'stop' values when there are consecutive 'OTHER' responses. Below is my attempt at that (warning: it doesn't work lol, the logic is off)
for k = 1:length(row_nos)
    if DAT_STR((row_nos(k))+1) == 'OTHER'
        logconsec = diff(row_nos)==1;
        D = diff([0,logconsec',0]);
        first1 = row_nos(D>0);
        last1 = row_nos(D<0);
        for j = 1:length(first1)
            DAT_RT((last1(j))+1) = sum(DAT_RT((first1(j)):(last1(j))));
        end
    else
        DAT_RT((row_nos(k))+1) = DAT_RT(row_nos(k)) + DAT_RT((row_nos(k))+1);
    end
end
The thought behind this section was to look ahead one row and if the next row == 'OTHER', then treat it as consecutive OTHERS and use the first/last values. Else, it should do the typical addition that works well in the one-off cases.
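For reference, the start/stop idea described here can be sketched roughly as follows. This is only a sketch under assumptions: the sample data is hypothetical (it is not the data from the linked image), DAT_STR is assumed to be a column cell array of character vectors, DAT_RT a numeric column of the same length, and the names isOther, runStart, runEnd, cleanRT and cleanSTR are made up for illustration. It uses strcmp rather than == to test for 'OTHER', and it adds each run's total to the next item's own reaction time instead of overwriting it.

```matlab
% Hypothetical example data, for illustration only
DAT_STR = {'dog'; 'OTHER'; 'OTHER'; 'cat'; 'OTHER'; 'bird'};
DAT_RT  = [1; 2; 3; 4; 5; 6];

% Logical mask of out-of-bounds rows (built before anything is removed)
isOther = strcmp(DAT_STR, 'OTHER');

% Pad with zeros so runs touching the first or last row are detected too
d = diff([0; isOther; 0]);
runStart = find(d == 1);        % first row of each run of OTHERs
runEnd   = find(d == -1) - 1;   % last row of each run of OTHERs

cleanRT = DAT_RT;
for j = 1:numel(runStart)
    nextRow = runEnd(j) + 1;    % the in-bound row that follows the run
    if nextRow <= numel(cleanRT)
        % Add the run's total time to the following in-bound item
        cleanRT(nextRow) = cleanRT(nextRow) + sum(DAT_RT(runStart(j):runEnd(j)));
    end
    % A run at the very end has no following item, so its time is dropped
end

% Remove the OTHER rows from both columns with the same mask
cleanRT  = cleanRT(~isOther);
cleanSTR = DAT_STR(~isOther);
```

Because single OTHER rows are just runs of length one, no separate one-off branch should be needed in this version.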
I feel like I'm spinning my wheels and overcomplicating things without really making any progress, so any guidance or insight is greatly appreciated!!
Corey Magaldino on 25 Aug 2022
Sorry I couldn't figure out how to embed the images directly in the post. Thanks anyway.

Steven Lord on 25 Aug 2022
I suspect that the standardizeMissing and/or fillmissing functions will be of interest to you.
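As a possible alternative to the functions suggested above, here is a minimal sketch that uses cumsum and diff instead. It rests on assumptions: the sample data is hypothetical, DAT_STR is assumed to be a column cell array of character vectors, DAT_RT a numeric column of the same length, and the names isOther, cumRT, cleanRT and cleanSTR are made up for illustration. The mask has to be built before the OTHER rows are removed from DAT_STR, otherwise there is nothing left to match.

```matlab
% Hypothetical example data, for illustration only
DAT_STR = {'dog'; 'OTHER'; 'OTHER'; 'cat'; 'OTHER'; 'bird'; 'OTHER'};
DAT_RT  = [1; 2; 3; 4; 5; 6; 7];

% Logical mask of out-of-bounds rows, built from the unfiltered strings
isOther = strcmp(DAT_STR, 'OTHER');

% Running total of every reaction time, in-bound or not
cumRT = cumsum(DAT_RT);

% Sample the running total at the in-bound rows only; the difference
% between consecutive samples is the time elapsed since the previous
% in-bound item, which includes any OTHER rows in between
cleanRT  = diff([0; cumRT(~isOther)]);
cleanSTR = DAT_STR(~isOther);
```

With this approach, OTHER rows before the first in-bound item are folded into that first item, and OTHER rows after the last in-bound item are dropped. Since the result comes from differences of a running sum, non-integer reaction times may differ from a direct sum by floating-point rounding.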
