Cell and double in datatable

Question

Jean-Marie Sainthillier el 4 de Oct. de 2024

1
Enlazar

Enlace directo a esta pregunta

https://la.mathworks.com/matlabcentral/answers/2157465-cell-and-double-in-datatable

Movida: Walter Roberson el 7 de Oct. de 2024

I work with Excel xlsx files where matrix represent scores (integers between 0 and 3). Each column corresponds to a visit and each row to a patient (with size around 100x200).

The data may contains MD and NA code, or empty cells (sometimes a few, sometime a lot), in function of the visits and the progress of the study.

With readtable (without option), I obtain double and cell.

I don't understand how MATLAB makes its choice. Sometimes the column is in cell while there are only integers (?), sometime the column is in cell (or in double) for only few MD/NA codes, sometimes the colum is in double with a majority of nan (for empty cells or MD/NA codes).

In PowerQuery everything is in integer and only column with full empty cells are coded in text, so I don't think it's a problem with EXCEL.

Which rules MATLAB uses when importing ?

Do you have any suggestion ?

Thank you in advance.

SAINTHILLIER Jean Marie

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

dpb el 4 de Oct. de 2024

0
Enlazar

Enlace directo a esta respuesta

https://la.mathworks.com/matlabcentral/answers/2157465-cell-and-double-in-datatable#answer_1526960

readtable and friends do their best to figure out what the data types are by pre-reading and analyzing the initial content of the file -- what causes the difference without using additional input parameters or an explicit import object is where in the file are the missing data; if they are relatively early, then the algorithm for data types will be influenced and may, as you've discovered, return what are thought to be numeric columns as strings.

Unfortunately, the exact conditions used to determine which is which are not documented, and they are subject to change from release to release as Mathworks continues to try to improve the algorithms, so there is no set value of "N" intact rows that can be stated will ensure the correct interpretation.

The best and most reliable choice when reading a number of files of a given structure is to create and save an import object that defines precisely how you want the data to be interpreted; then it won't matter where or how many missing data are there. You can also control what is done with missing data although not with total granularity. Note that the import object does not have to be created each time for each file; having one that describes the specific data format and using it for all files is the most efficient method. While you can control a lot of the options by using the optional named-parameter pairs, specifying the data type by variable/column is not one that can be done except by the import options object.

You can read more at <How MATLAB imports data> and then the doc itself for detectImportOptions.

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Jean-Marie Sainthillier el 7 de Oct. de 2024

Movida: Walter Roberson el 7 de Oct. de 2024

Thank you.

Iniciar sesión para comentar.

Cell and double in datatable

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuesta aceptada

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Más respuestas (0)

Ver también

Etiquetas

Productos

Versión

Community Treasure Hunt

Cell and double in datatable

0 comentarios Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuesta aceptada

1 comentario Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Más respuestas (0)

Ver también

Etiquetas

Productos

Versión

Community Treasure Hunt

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos