Transform NaN into number

Question

0 votes

Hi everyone,

I have data that is organized in structures which look like this:

rating.pretest.s1
rating.pretest.s2
...

and in those structures on the last level (s1, s2...) I have numbers or NaN where a participant failed to push a button in time. Since this counts as a wrong answer I would like to assign all the NaNs the number 1. What I tried is

rating(isnan(rating)) = 1

but I got the error message Undefined function "isnan" for input arguments of type "struct". I have read some answers on similar questions but none of them seemed applicable since I'm using a structure (or were too complicated for me to understand since I'm still a beginner). Is there an alternative I could use? Thank you in advance!

2 comments

Walter Roberson on 23 May 2019

Is rating a scalar or non-scalar structure? Is rating.pretest a scalar or non-scalar structure?

For example is there a possibility of a rating(3).pretest(7).s1 ?

RP on 23 May 2019

I think it's not but I'm not quite familiar with that term. It looks like this: rating is a 1x1 structure that contains two 1x1 structures, pretest and posttest. Each of those contains several fields (s1, s2, ...) which contain 56 numbers each

Answer 1

Steven Lord on 23 May 2019

Edited: Steven Lord on 23 May 2019

2 votes

For this application, I'd use fillmissing. It will do the same task as filling in locations using logical indexing with the output of isnan, but in the code I find it makes the intent clearer to the reader. In the example below I'm going to assign the output of fillmissing into a different struct array so you can compare the "before and after", but you could assign the output back into the struct you processed if you want. Let's define some sample data:

rating.pretest.s1 = [4;1;NaN;3];
rating.pretest.s2 = [5;2;0;NaN];

Now let's fill in the missing values with the constant 1. I used 'UniformOutput', false to return the output as a struct array with the same fields as the input.

fillingfunction = @(x) fillmissing(x, 'constant', 1);
rating2.pretest = structfun(fillingfunction, rating.pretest, ...
                            'UniformOutput', false);

Compare the before and after.

rating.pretest.s1
rating2.pretest.s1
rating.pretest.s2
rating2.pretest.s2
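If you would rather overwrite the original struct and handle both the pretest and posttest blocks in one go, a minimal sketch could look like the following. The posttest values are made-up sample data, and fillWithOne and blocks are names chosen only for this illustration.

```matlab
% Hypothetical sample data: a scalar struct with pretest and posttest blocks.
rating.pretest.s1  = [4; 1; NaN; 3];
rating.pretest.s2  = [5; 2; 0; NaN];
rating.posttest.s1 = [NaN; 2; 2; 5];
rating.posttest.s2 = [3; NaN; NaN; 1];

% Replace missing values (NaN for numeric data) with the constant 1.
fillWithOne = @(x) fillmissing(x, 'constant', 1);

% Loop over the top-level fields (pretest, posttest) and overwrite each in place.
blocks = fieldnames(rating);
for k = 1:numel(blocks)
    rating.(blocks{k}) = structfun(fillWithOne, rating.(blocks{k}), ...
                                   'UniformOutput', false);
end
```

This assumes every field on the last level holds a plain numeric array. It would need adjusting if the vectors are wrapped in cell arrays.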

FYI you might want to use table arrays to store your data.

pretestScores = table(rating.pretest.s1, rating.pretest.s2, ...
    'VariableNames', {'Test1', 'Test2'}, ...
    'RowNames', {'Alice', 'Bob', 'Charlie', 'Doug'})
posttestScores = table(rating.pretest.s1+2, rating.pretest.s2+3, ...
    'VariableNames', {'Test1', 'Test2'}, ...
    'RowNames', {'Alice', 'Bob', 'Charlie', 'Doug'})

You can even store one or more table arrays inside another table!

allScores = table(pretestScores, posttestScores)
allScores.Properties.RowNames = pretestScores.Properties.RowNames

Though if you were to build that allScores table you probably wouldn't want to put the student names as the RowNames of the inner table arrays as well, as that looks kind of funny. You can trim them if you want.

allScores.pretestScores.Properties.RowNames = {}
allScores.posttestScores.Properties.RowNames = {}

3 comments (1 older comment not shown)

Steven Lord on 23 May 2019

One nice feature of table arrays is the ability to use names to index into them rather than numbers. The following works to check Alice's score on test 1 regardless of the order of the rows. I don't need to keep a separate list of student names, check which entry in rating.pretest.s1 is Alice's, and retrieve it.

pretestScores{'Alice', 'Test1'}

As for building the struct before building the table, not necessarily. I built them because I believe someone else did. [Assuming you're using a sufficiently new release the VariableNames and RowNames can be either cell arrays containing char arrays or string arrays.]

scores1 = [4; 1; NaN; 3];
scores2 = [5; 2; 0; NaN];
studentnames = ["Alice"; "Bob"; "Charlie"; "Doug"];
pretestScores = table(scores1, scores2, ...
    'VariableNames', {'Test1', 'Test2'}, ...
    'RowNames', studentnames)

If you do this, filling in the NaN values is even easier. The fillmissing function can accept a table array as its first input.

pretestScoresFilled = fillmissing(pretestScores, 'constant', 1)

RP on 23 May 2019

That was very helpful and well explained, thanks a lot for taking the time!

Answer 2

Jos (10584) on 23 May 2019

1 vote

This function recursively looks at all fields of the structure and replaces any NaNs by a value. Also works for structure arrays:

function  S = structNaN2num(S, value)
if isstruct(S)
    % recursively look at all fields of structure array S
    for k=1:numel(S)
        S(k) = structfun(@(x) structNaN2num(x, value), S(k), 'UniformOutput', false);
    end
else
    if isnumeric(S)
        S(isnan(S)) = value ; % replace NaNs with a value
    end
end

Use it like this:

clear a 
a.x1 = 1 ; a.x2 = NaN ; a.x3.y1 = NaN ; a.x3.y2 = [1 2 NaN]
a(2) = a(1) % structure array
a(2).x3.y2 = [NaN 2 3]
b = structNaN2num(a, 999)

1 comment

RP on 23 May 2019

Thanks a lot, this works perfectly!

Answer 3

Stephen23 on 23 May 2019

Edited: Stephen23 on 23 May 2019

1 vote

rating.pre.67.mat

Your original description failed to mention several things, including the size of the numeric vector and also that the numeric vector is (pointlessly) nested inside a scalar cell array. But once we know the exact data structure, it is very easy to loop over those fields and change the NaN values to whatever you want:

>> load('rating.pre.67.mat')
>> fld = fieldnames(rating.pre)
fld = 
    's63'
    's64'
    's65'
    's66'
    's67'
>> rating.pre.s64{1}
ans =
     5
   NaN
     5
     5
     5
 ... more lines here
     4
     5
     5
>> for k = 1:numel(fld), idx = isnan(rating.pre.(fld{k}){1}); rating.pre.(fld{k}){1}(idx) = 1; end
>> rating.pre.s64{1}
ans =
     5
     1
     5
     5
     5
     5
     5
 ... more lines here
     4
     5
     5
>>

Your data structure is far too complex, e.g.:

those scalar cell arrays appear to be entirely superfluous,
forcing the meta-data into fieldnames makes your code slower and more complex,
nested structures are not very convenient to work with.

I would recommend looking at using much simpler and more efficient data organisation, e.g. with a simple non-scalar structure, or a table.

2 comments

RP on 23 May 2019

This works as well, thanks a lot for the suggestion!

By forcing meta-data into field names you mean the fact that I have the participant numbers in the structure? Unfortunately I didn't know how else to do that, I'd like to have an array but I thought I won't be able to see the participant number anywhere then. Could you tell me why nested structures are not good? Or maybe you know what I could look up in the Matlab documentation in order to learn how to create these variables in a better way?

Stephen23 on 23 May 2019

Edited: Stephen23 on 23 May 2019

"By forcing meta-data into field names you mean the fact that I have the participant numbers in the structure?"

Yes: data and code are two separate paradigms that are best kept separated. Using (meta-)data for fieldnames makes your code fragile because it is susceptible to any external changes in those IDs (e.g. consider what your code would do if the ID format changes to include characters that are invalid for fieldnames). Read this for a detailed explanation:

https://www.mathworks.com/matlabcentral/answers/225435-save-variable-as-string-from-user-input

"Unfortunately I didn't know how else to do that, I'd like to have an array but I thought I won't be able to see the participant number anywhere then."

You can easily store the ID in a cell array, or a non-scalar structure, e.g.:

S(1).data = [...];
S(1).ID = 's63';
S(2).data = [...];
S(2).ID = 's64';
...
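To make that concrete, here is a small sketch with made-up ratings for two participants (the values and the variable names idx and s64data are only for illustration). It replaces the NaNs in every element and then looks a participant up by ID:

```matlab
% Hypothetical participants and ratings, one struct element per participant.
S(1).ID = 's63';  S(1).data = [5; NaN; 4];
S(2).ID = 's64';  S(2).data = [NaN; 5; 5];

% Replace the NaNs in every element with 1.
for k = 1:numel(S)
    S(k).data(isnan(S(k).data)) = 1;
end

% Look up a participant by comparing IDs as data, not by field name.
idx = strcmp({S.ID}, 's64');
s64data = S(idx).data;
```

Because the ID is ordinary data here, it can contain any characters and the loop never has to build field names dynamically.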

If the ID is actually a numeric value (and you just added the 's' to make it a valid fieldname) then simply storing it in a numeric array (using indexing) would be by far the simplest and most efficient solution.

Remember that you do not have to put all of the meta-data and test-data into the same array, it might be more convenient to use two (or more) arrays of classes that better suit the meta-data and the test-data (but of course using exactly the same indexing, so there is no ambiguity about how their elements correspond). For example:

data = [... all numeric data... ]
IDs = {... cell array of IDs ... }
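As a rough sketch of that layout, assuming every participant has the same number of trials so the ratings fit into one matrix (the numbers below are invented, and col and s64ratings are illustrative names):

```matlab
% Hypothetical data: one column per participant, one row per trial.
IDs  = {'s63', 's64', 's65'};
data = [5   NaN 4;
        5   5   NaN;
        NaN 4   5];

% One logical-indexing statement replaces every NaN in the matrix.
data(isnan(data)) = 1;

% The position of an ID in IDs selects the matching column of data.
col = strcmp(IDs, 's64');
s64ratings = data(:, col);
```

With a plain numeric matrix the original idea of indexing with isnan works directly, because isnan is defined for numeric arrays but not for structs.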

"Could you tell me why nested structures are not good?"

Simply because accessing their contents leads to quite bulky code which is often not easy to follow, as generally they require lots of loops to process. If the nested structures are not a good representation of how the data are actually related and arranged, then this makes the code much more complex than it needs to be (and thus slower, buggier, etc.).

They can often be replaced by simpler data arrangements using indexing (e.g. non-scalar structures or tables).

"Or maybe you know what I could look up in the Matlab documentation in order to learn how to create these variables in a better way?"

It comes with practice and lots of reading.

A general rule of thumb is to use the simplest data class that will hold your data.
