Cleaning string and reaction time data

Corey Magaldino on 24 Aug 2022
Commented: Corey Magaldino on 25 Aug 2022
I have data in two columns.
The first column is string data holding animal names. The second column is numeric data holding the reaction times to produce those names.
However, there are instances where a participant gives an out-of-bounds response (a non-animal, for instance). All out-of-bounds responses have been recoded to read OTHER.
I've been working on code to remove the OTHERs from the free responses and then recalculate the reaction times from the last in-bounds item to the next in-bounds item (essentially adding the reaction times of all intervening out-of-bounds items to the next in-bounds item).
For instance, example data could look something like this: https://imgur.com/czO5gub
Eliminating the OTHERs in the first column was pretty straightforward with:
DAT_STR = DAT_STR(cellfun('isempty',strfind(DAT_STR,'OTHER')));
However, I am struggling to clean the reaction time data.
My first attempt was to create a logical array to determine whether 'OTHER' was present in a given row using:
logic_array = strcmp(DAT_STR,'OTHER')
From here I can find the corresponding row numbers:
row_nos = find(logic_array == 1);
And then I was going to loop through them in a for loop that looked something like this:
for k = 1:length(row_nos)
    DAT_RT((row_nos(k))+1) = DAT_RT(row_nos(k)) + DAT_RT((row_nos(k))+1);
end
This loop basically takes the RT for the OTHER response and adds it to the next in-bounds item, which is intended. The loop works really well for one-off 'OTHER' responses; however, it does a terrible job when there are consecutive 'OTHER' responses.
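One way to make this carry-forward idea cope with runs of OTHER is to walk the rows once and keep a running total. The following is only a minimal sketch on made-up data: the animal names and millisecond values are hypothetical (not taken from the linked image), and `carry`, `CLEAN_STR`, and `CLEAN_RT` are names introduced here for illustration. It assumes DAT_STR is a cell array of character vectors and DAT_RT is a numeric column of the same length, and it assumes a trailing OTHER with no following in-bounds item should simply be dropped.

```matlab
% Hypothetical example data (reaction times in ms)
DAT_STR = {'dog'; 'OTHER'; 'cat'; 'OTHER'; 'OTHER'; 'bird'; 'OTHER'};
DAT_RT  = [1000; 500; 2000; 300; 400; 1500; 900];

% Build the mask BEFORE removing anything from DAT_STR
isOther = strcmp(DAT_STR, 'OTHER');

% Carry the RT of each OTHER row forward until an in-bounds row absorbs it
carry = 0;
for k = 1:numel(DAT_RT)
    if isOther(k)
        carry = carry + DAT_RT(k);      % accumulate across consecutive OTHERs
    else
        DAT_RT(k) = DAT_RT(k) + carry;  % add the accumulated time to this item
        carry = 0;                      % reset for the next run
    end
end

% Remove the OTHER rows from both columns using the same mask
CLEAN_STR = DAT_STR(~isOther);
CLEAN_RT  = DAT_RT(~isOther);
```

Note that the mask has to be computed from the unfiltered DAT_STR; if the OTHER entries are removed from DAT_STR first, the rows no longer line up with DAT_RT.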
I've been beating my head against the wall trying to figure this out lol. My next attempt was to create 'start' and 'stop' values when there are consecutive 'OTHER' responses. Below is my attempt at that (warning: it doesn't work lol, the logic is off)
for k = 1:length(row_nos)
    if DAT_STR((row_nos(k))+1) == 'OTHER'
        logconsec = diff(row_nos)==1;
        D = diff([0,logconsec',0]);
        first1 = row_nos(D>0);
        last1 = row_nos(D<0);
        for j = 1:length(first1)
            DAT_RT((last1(j))+1) = sum(DAT_RT((first1(j)):(last1(j))));
        end
    else
        DAT_RT((row_nos(k))+1) = DAT_RT(row_nos(k)) + DAT_RT((row_nos(k))+1);
    end
end
The thought behind this section was to look ahead one row and if the next row == 'OTHER', then treat it as consecutive OTHERS and use the first/last values. Else, it should do the typical addition that works well in the one-off cases.
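The start/stop idea can also be expressed directly on the logical mask rather than on the row numbers, which treats a single OTHER as a run of length one and removes the need for the if/else branch. This is a rough sketch under the same assumptions as the question (cellstr DAT_STR, numeric column DAT_RT); the data values and the names `runStart`, `runEnd`, and `nextRow` are hypothetical and introduced only for illustration.

```matlab
% Hypothetical example data (reaction times in ms)
DAT_STR = {'dog'; 'OTHER'; 'cat'; 'OTHER'; 'OTHER'; 'bird'; 'OTHER'};
DAT_RT  = [1000; 500; 2000; 300; 400; 1500; 900];

isOther = strcmp(DAT_STR, 'OTHER');

% Pad with zeros so runs touching the first or last row are detected too
d = diff([0; isOther; 0]);
runStart = find(d == 1);        % first row of each run of OTHERs
runEnd   = find(d == -1) - 1;   % last row of each run of OTHERs

for j = 1:numel(runStart)
    nextRow = runEnd(j) + 1;    % the in-bounds row that follows the run
    if nextRow <= numel(DAT_RT) % skip a run at the very end of the data
        DAT_RT(nextRow) = DAT_RT(nextRow) + sum(DAT_RT(runStart(j):runEnd(j)));
    end
end

% Remove the OTHER rows from both columns
DAT_STR = DAT_STR(~isOther);
DAT_RT  = DAT_RT(~isOther);
```

Unlike the attempt above, the sum over the run is added to the following item's own RT instead of overwriting it.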
I feel like I'm spinning my wheels and overcomplicating things without really making any progress, so any guidance or insight is greatly appreciated!!
4 comments
dpb on 25 Aug 2022
We can do nothing with images and aren't going elsewhere to look for stuff...post in the forum itself; use the toolset provided.
Corey Magaldino on 25 Aug 2022
Sorry I couldn't figure out how to embed the images directly in the post. Thanks anyway.


Accepted Answer

Steven Lord on 25 Aug 2022
I suspect that the standardizeMissing and/or fillmissing functions will be of interest to you.
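The answer does not spell out how these functions would be applied, so the following is only one possible reading, sketched on hypothetical data. It uses standardizeMissing to recode 'OTHER' as the standard missing value for a cell array of character vectors (''), and ismissing to flag those rows. The grouping step with cumsum and accumarray is an addition made here, not something the answer suggests, and the names `STR_STD`, `inBound`, `grp`, `CLEAN_STR`, and `CLEAN_RT` are introduced for illustration.

```matlab
% Hypothetical example data (reaction times in ms)
DAT_STR = {'dog'; 'OTHER'; 'cat'; 'OTHER'; 'OTHER'; 'bird'; 'OTHER'};
DAT_RT  = [1000; 500; 2000; 300; 400; 1500; 900];

% Recode 'OTHER' as missing ('' for a cell array of character vectors)
STR_STD = standardizeMissing(DAT_STR, 'OTHER');
inBound = ~ismissing(STR_STD);

% Assign every row to the group of the next in-bounds row:
% the group number increases by one AFTER each in-bounds row.
grp = cumsum([0; inBound(1:end-1)]) + 1;

% Sum the reaction times within each group
sums = accumarray(grp, DAT_RT);

% Keep one sum per in-bounds row; a trailing run of OTHERs forms an
% extra group with no in-bounds item and is discarded here (assumption)
CLEAN_RT  = sums(1:nnz(inBound));
CLEAN_STR = DAT_STR(inBound);
```

This avoids an explicit loop, at the cost of being a little less obvious to read than a row-by-row accumulation.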
