Error converting Python DataFrame to table

I have used the following commands to load a Python .pkl file:
fid = py.open("data.pkl");
data = py.pickle.load(fid);
T = table(data);
This loads a pandas DataFrame object. Newer versions of MATLAB can convert this object to a table using the table command, which I tried, but I encountered the error below:
Error using py.pandas.DataFrame/table
Dimensions of the key and value must be the same, or the value must be scalar.
What does this error mean? I'm guessing it's because the DataFrame object in the .pkl contains a couple of nested fields. Most of the fields are simply 1xN numeric vectors, but a couple are 1xN objects that have their own fields.
How can I convert this DataFrame object to something usable in MATLAB? I was given this data file and did not generate it, and I am much more proficient in MATLAB than in Python, so I would rather solve this within MATLAB than create a Python script or change how the file is created.

Accepted Answer

Umar
about 17 hours ago
Edited: about 17 hours ago

1 vote

Hi @David, thanks for writing in. You've actually already diagnosed this correctly, so let me confirm it and get you moving. The table() conversion (available since R2024a) only handles one level of DataFrame nesting. The 1xN object columns in your file that carry their own sub-fields push past that limit, and that is exactly what throws the dimension mismatch. Your flat numeric columns are completely fine; it's only the nested ones tripping it up.

Before anything else, run this to see what you're working with:

fid = py.open("data.pkl", "rb");
data = py.pickle.load(fid);
py.print(data.dtypes);
py.print(data.head(int32(3)));

For your flat columns, pull them out directly:

flat_cols = {"col1", "col2", "col3"};
arrays = cellfun(@(c) double(data{c}.values), flat_cols, 'UniformOutput', false);
T = array2table(cell2mat(arrays), 'VariableNames', flat_cols);

For the nested ones, you don't need a separate Python script. Call pandas.json_normalize inline from MATLAB; it flattens nested fields into dot-separated columns (e.g. sensor.value becomes a normal flat column), and after that table() will convert without issue:

records = data.to_dict("records");
flat_data = py.pandas.json_normalize(records);
T = table(flat_data);

If that still gives you trouble, take a look at the PandasToMatlab utility on File Exchange (https://www.mathworks.com/matlabcentral/fileexchange/111770-pandastomatlab). The df2t() function there handles more edge cases than the built-in path and works entirely in memory. Full type-conversion details are in the docs if you want to check what maps to what: https://www.mathworks.com/help/matlab/matlab_external/python-pandas-dataframes.html

Hope this helps!
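In case it helps to see the flattening outside MATLAB first, here is a minimal pure-pandas sketch of what json_normalize does to one level of nesting. The column names (time, sensor) are invented for illustration, not from your actual file:

```python
import pandas as pd

# A toy DataFrame with one flat column and one nested (dict-valued) column,
# mimicking the structure that trips up MATLAB's table() conversion.
df = pd.DataFrame({
    "time": [0.0, 1.0],
    "sensor": [{"value": 3.5, "unit": "V"}, {"value": 4.1, "unit": "V"}],
})

# to_dict("records") turns each row into a dict; json_normalize then
# expands nested dicts into dot-separated columns like "sensor.value".
flat = pd.json_normalize(df.to_dict("records"))
print(list(flat.columns))  # ['time', 'sensor.value', 'sensor.unit']
```

Every column in the result is a plain 1-D column, which is exactly the shape that table() can handle.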

2 comments

David K
about 5 hours ago
This is very helpful, thank you! I was able to use the json_normalize method to make it easily convertible to a table.
A couple of other things: would you mind formatting your answer? It is very difficult to read as is. Also, your line

arrays = cellfun(@(c) double(data{c}.values), flat_cols, 'UniformOutput', false);

gives the error "Brace indexing is not supported for variables of this type."
Umar
about 1 hour ago
Glad json_normalize worked out, David! Apologies for the messy formatting; here is a cleaner version of the full answer.

Step 1: Inspect what you're working with:

fid = py.open("data.pkl", "rb");
data = py.pickle.load(fid);
py.print(data.dtypes);
py.print(data.head(int32(3)));

Step 2: For flat numeric columns only:

flat_cols = {"col1", "col2", "col3"};
arrays = cell(1, numel(flat_cols));
for i = 1:numel(flat_cols)
    arrays{i} = double(data{py.str(flat_cols{i})}.values);
end
T = array2table(cell2mat(arrays), "VariableNames", flat_cols);

Step 3: For nested columns (the one that solved your problem):

records = data.to_dict("records");
flat_data = py.pandas.json_normalize(records);
T = table(flat_data);

On the brace-indexing error: {} is a MATLAB cell-array operation, and a py.pandas.DataFrame is not a cell array, so MATLAB rejects it. The fix is wrapping the column name in py.str() so MATLAB passes a proper Python string key to the DataFrame. I've also swapped cellfun for a plain loop since it is more reliable across MATLAB versions. That said, since json_normalize already handles both flat and nested columns in one shot, you probably won't need the flat-column path at all. Hope that clears things up!
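If you ever want to sanity-check the whole pipeline outside MATLAB, the pickle-load-then-flatten sequence can be reproduced in a few lines of plain Python. This is a sketch with an invented nested column (meta.id) and a temporary file standing in for your data.pkl:

```python
import os
import pickle
import tempfile

import pandas as pd

# Build a DataFrame with one flat and one nested column, then pickle it,
# mimicking how a file like data.pkl could have been produced.
df = pd.DataFrame({"t": [1, 2], "meta": [{"id": 7}, {"id": 8}]})
path = os.path.join(tempfile.mkdtemp(), "data.pkl")
with open(path, "wb") as fid:
    pickle.dump(df, fid)

# Load it back (the Python-side equivalent of py.pickle.load in MATLAB)
# and flatten the nested column before any table conversion.
with open(path, "rb") as fid:
    loaded = pickle.load(fid)
flat = pd.json_normalize(loaded.to_dict("records"))
print(flat.columns.tolist())  # ['t', 'meta.id']
```

Note the "rb" mode on open: pickle files are binary, which is also why the MATLAB snippet passes "rb" to py.open.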


More Answers (0)

Version

R2025b

Asked:

21 Apr 2026 at 18:28

Commented:

about 4 hours ago
