How to define categorical within factors in fitrm?

Jan on 17 Nov 2014
Commented: Stephane on 16 Mar 2016
I've observed unexpected behaviour of categorical factors in fitrm. Two ways to produce the same table of within factors produce different fitrm output. Why is this?
I've got a completely within-subjects design with 8 observations on each of 20 subjects. I first specified the two within factors by using array2table on the design matrix, with categorical indices for the factor levels:
within_fact = categorical(fullfact([nr_cond, nr_sessions]));
within_tbl1 = array2table(within_fact,'VariableNames',{'Condition','Session'});
The second way was to make the two factors only categorical after transforming them into a table:
within_fact = fullfact([nr_cond,nr_sessions]);
within_tbl2 = array2table(within_fact,'VariableNames',{'Condition','Session'});
within_tbl2.Condition = categorical(within_tbl2.Condition);
within_tbl2.Session = categorical(within_tbl2.Session);
According to the isequal function, the two factor tables are identical:
isequal(within_tbl1, within_tbl2)
ans = 1
However, they produce different outcomes when used as within design in a rm anova:
% M = 20x8 double matrix
data = array2table(M, 'VariableNames', {'S1C1', 'S1C2', 'S2C1', 'S2C2',...
'S3C1', 'S3C2', 'S4C1', 'S4C2'});
rm1 = fitrm(data,'S1C1-S4C2 ~ 1','WithinDesign',within_tbl1);
ranovatbl1 = ranova(rm1, 'WithinModel', 'Condition*Session');
rm2 = fitrm(data,'S1C1-S4C2 ~ 1','WithinDesign',within_tbl2);
ranovatbl2 = ranova(rm2, 'WithinModel', 'Condition*Session');
Unexpectedly, rm1 produces an ANOVA table with df=3 for both Condition (which only has 2 levels) and Session (which has 4). rm2 has the correct dfs (1 for Condition, 3 for Session). I'm confused as to why they produce different outcomes.
(MATLAB R2014b on Windows 7)
Stephane on 16 Mar 2016
Thanks for this question/answer, it helped a lot. Indeed, the categorical function spans all the matrix elements (there is no dependence on rows or columns).
I overcame this problem by directly creating a within table from cell arrays of strings, as in
within_table = table({'c1','c2','c1','c2'}', {'L1','L1','L2','L2'}')
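The same idea can be scaled to the 2 x 4 design from the question. This is a minimal sketch, not the commenter's exact code: the level labels ('C1', 'S1', ...) are hypothetical, and the rows are ordered so that Condition varies fastest, matching the data columns S1C1, S1C2, S2C1, and so on. Naming the workspace variables Condition and Session gives the table variables those names.

```matlab
% Hypothetical level labels for a 2 (Condition) x 4 (Session) design
Condition = repmat({'C1'; 'C2'}, 4, 1);                           % C1,C2,C1,C2,...
Session   = reshape(repmat({'S1','S2','S3','S4'}, 2, 1), [], 1);  % S1,S1,S2,S2,...

% One row per repeated measurement; variable names come from the inputs
within_table = table(Condition, Session);
```

Because each column is built on its own, no column can inherit levels from the other.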


Accepted Answer

Tom Lane on 19 Nov 2014
When you create within_fact, you are defining a matrix with categories from 1 up to max(nr_cond, nr_sessions). So both columns of within_tbl1 are defined to have that many categories. It may be, say, that the first column has the notion of category 5 but doesn't actually contain any data in that category.
When you create the variables in within_tbl2, each categorical variable is defined separately so it only has as many categories as actually appear in that column. This is what you almost certainly want.
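The difference can be checked by counting the categories each variable carries. The sketch below assumes nr_cond = 2 and nr_sessions = 4 (inferred from the question's 8 measurements; the original values were not posted) and uses the standard categories function.

```matlab
% Assumed design sizes, inferred from the question (2 conditions x 4 sessions)
nr_cond = 2;
nr_sessions = 4;
design = fullfact([nr_cond, nr_sessions]);   % 8x2 matrix of level indices

% Approach 1: convert the whole matrix at once, then make a table
within_tbl1 = array2table(categorical(design), ...
    'VariableNames', {'Condition','Session'});

% Approach 2: make the table first, then convert each column separately
within_tbl2 = array2table(design, 'VariableNames', {'Condition','Session'});
within_tbl2.Condition = categorical(within_tbl2.Condition);
within_tbl2.Session   = categorical(within_tbl2.Session);

% Count the categories attached to Condition in each table
ncat1 = numel(categories(within_tbl1.Condition));  % whole-matrix conversion
ncat2 = numel(categories(within_tbl2.Condition));  % per-column conversion
fprintf('Condition categories: tbl1 = %d, tbl2 = %d\n', ncat1, ncat2);
```

Under these assumptions Condition carries 4 categories in within_tbl1 but only 2 in within_tbl2, even though the displayed values are the same.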
Perhaps the way fitrm deals with this condition could be improved. I'll look into it.
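If the whole-matrix conversion is kept, one possible repair is to drop the unused categories from each table variable with removecats. This is a sketch under the same assumed sizes (nr_cond = 2, nr_sessions = 4); it only shows that the category sets shrink, and whether the ranova output then matches rm2 has not been verified here.

```matlab
% Assumed design sizes (2 conditions x 4 sessions), as in the question
nr_cond = 2;
nr_sessions = 4;
within_tbl1 = array2table(categorical(fullfact([nr_cond, nr_sessions])), ...
    'VariableNames', {'Condition','Session'});

% Drop categories that never occur in each variable
within_tbl1.Condition = removecats(within_tbl1.Condition);
within_tbl1.Session   = removecats(within_tbl1.Session);

% Categories that remain after the cleanup
cond_levels    = categories(within_tbl1.Condition);
session_levels = categories(within_tbl1.Session);
```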
Jan on 19 Nov 2014
I see, yes, that makes sense! I had interpreted the categorical function as merely putting a label 'categorical' on each value in within_fact and consequently did not expect this behaviour. But if the definition of these categorical labels depends on the entire dataset that you give the function, the outcome will be different for a matrix or a combination of two columns (as in my example). Thanks for the explanation!
