Reading large CSV files

Sanchit Sharma on 2 Apr 2022
Answered: Esha Chakraborty on 5 Apr 2022
Hello,
I have a large 28 GB .csv file that I am trying to read. I have tried readmatrix() and readtable(). Both functions give me the error below:
Caught "std::exception" Exception message is:
Failed to convert character code.
Could you please suggest a solution?
Thanks
  3 comments
Sanchit Sharma on 2 Apr 2022
Edited: Sanchit Sharma on 2 Apr 2022
Yes, I have 512 GB of RAM.
per isakson on 2 Apr 2022
Edited: per isakson on 2 Apr 2022
Probably your file contains some unexpected characters, e.g. ones used to indicate missing data. The error message points in that direction. One way to find the position in the file that causes the error is
textscan( __________ , 'ReturnOnError',false )
It produces a better error message.
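To illustrate the suggestion above, here is a minimal sketch. The file contents, the '%f%f%f' format, and the comma delimiter are assumptions made only for this example; they are not taken from the original 28 GB file. With 'ReturnOnError' set to false, textscan raises an error instead of silently stopping at the first field it cannot convert, and the error message can then be inspected.

```matlab
% Write a tiny CSV with one malformed field (hypothetical stand-in data).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '1,2,3\n4,x?,6\n7,8,9\n');
fclose(fid);

% Read it back as three numeric columns (format is an assumption).
fid = fopen(fname, 'r');
try
    % ReturnOnError=false makes textscan throw on a conversion failure
    % instead of returning the data read so far.
    C = textscan(fid, '%f%f%f', 'Delimiter', ',', 'ReturnOnError', false);
    msg = '';
catch err
    msg = err.message;   % describes where the read failed
end
fclose(fid);
disp(msg)
```

For a real file, replace fname and the format specifier with ones that match the actual columns.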


Answers (1)

Esha Chakraborty on 5 Apr 2022
Hi Sanchit,
I understand that you get the message 'Failed to convert character code' when you try to read a large CSV file. One possible reason is that the read buffer is too large and too much data is being read at once. I suggest reducing the amount of data loaded at a time and checking whether the error persists.
Here are a few ways to import a large CSV file:
  1. You can split the file into smaller sections with any reliable third-party file-splitting tool before importing it into MATLAB.
  2. You can check whether the datastore feature suits your use case. A datastore is an object for reading a single file or a collection of files or data. It acts as a repository for data that has the same structure and formatting. See the MATLAB documentation on datastore for more details.
  3. You can also check whether MapReduce is an option in your use case. MapReduce is a programming technique for analyzing data sets that do not fit in memory. See the MATLAB documentation on mapreduce for more details.
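As a rough illustration of option 2, the sketch below reads a CSV file in chunks with tabularTextDatastore and accumulates a running sum. The file, its column names (a, b), and the ReadSize value are hypothetical and chosen only to keep the example small; a real 28 GB file would need its own column names and a much larger chunk size.

```matlab
% Create a small stand-in CSV (hypothetical data; point fname at the real file instead).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'a,b\n');
fprintf(fid, '%d,%d\n', [1:10; 11:20]);   % rows: 1,11 / 2,12 / ... / 10,20
fclose(fid);

% A datastore reads the file piece by piece instead of all at once.
ds = tabularTextDatastore(fname);
ds.ReadSize = 4;          % rows per chunk; tiny here only for demonstration

total = 0;
nRows = 0;
while hasdata(ds)
    T = read(ds);                 % table holding the next chunk
    total = total + sum(T.a);     % process the chunk, then let it go
    nRows = nRows + height(T);
end
fprintf('rows: %d, sum of column a: %d\n', nRows, total);
```

Because only one chunk is held in memory at a time, a failure on a particular chunk also narrows down which part of the file contains the problematic characters.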

Release

R2022a
