
I have a Large csv file that I want to plot

Leen Almadani
Leen Almadani on 25 Jul 2018
Commented: Sushant Mahajan on 1 Aug 2018
I have a large csv file; when I try to open it in MATLAB to plot it, I run out of memory. I tried tabularTextDatastore, but I don't know how to plot the data after selecting the variable names and formats. It seems like a datastore is only meant for reading and displaying the data, because I can't find anything on changing the data, let alone plotting it. The problem is that I have 6 columns of endless data and I'm plotting a 4D graph. The original idea was to interpolate, but since the data is very dense I won't need to interpolate; I can just create a mesh grid and call griddedInterpolant. How can I plot each point from my csv file without running out of memory?
Update: I looked into tall arrays and it seems doable, BUT I don't think griddedInterpolant supports tall arrays.
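For reference, one way to keep memory bounded is to read the file in chunks with tabularTextDatastore and plot each chunk as it arrives, rather than loading everything at once. A minimal sketch, assuming six numeric columns and the file name mentioned elsewhere in this thread:

```matlab
% Sketch: chunked reading with tabularTextDatastore so the whole file
% never has to be in memory at once. ReadSize controls rows per chunk.
ds = tabularTextDatastore('data_632.0_43ddd.csv', 'ReadSize', 100000);
ds.SelectedFormats = repmat({'%f'}, 1, 6);   % six numeric columns

figure; hold on
while hasdata(ds)
    T = read(ds);                            % one chunk as a table
    % Example "4D" view: 3-D scatter colored by a fourth variable
    scatter3(T{:,1}, T{:,2}, T{:,3}, 6, T{:,4}, '.');
end
hold off
```

Each chunk is discarded after plotting, so peak memory stays near one chunk plus the accumulated graphics objects.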
  6 comments
Walter Roberson
Walter Roberson on 26 Jul 2018
Which MATLAB release are you using? And are you using 32 bit or 64 bit? How much RAM do you have?
fid = fopen('data_632.0_43ddd.csv','rt');
data = cell2mat(textscan(fid, '%f%f%f%f%f%f', 'HeaderLines', 1, 'Delimiter', ',', 'CollectOutput', true));
fclose(fid);
For your 50 megabyte file, the result would be on the order of 45 megabytes of data storage.
You cannot use single precision for your first column without losing some of the information you have stored. For example in single precision, 0.994271503 would become approximately 0.994271517
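Walter's precision point is easy to check directly; a minimal sketch:

```matlab
% Single precision keeps only ~7 significant decimal digits, so the
% trailing digits of the quoted value change on conversion.
x  = 0.994271503;              % stored as double
xs = single(x);                % single-precision copy
fprintf('double: %.9f\n', x);
fprintf('single: %.9f\n', double(xs));
```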
Leen Almadani
Leen Almadani on 26 Jul 2018
I'm using the latest version (R2018a); I have access to other versions if needed. My laptop is 64-bit and I have 15.9 GB of RAM.


Answers (2)

Sushant Mahajan
Sushant Mahajan on 26 Jul 2018
Edited: Sushant Mahajan on 26 Jul 2018
The fact that your data file is 9 MB after zipping tells me that a significant amount of RAM is being wasted here. I can suggest ways to reduce your RAM usage so that more of it is available for plotting:
Read the .csv file into MATLAB. You can read it line by line with fopen() and fgets() if you run out of memory while reading the file itself. Then store all the variables you need in binary format on your hard drive (see fwrite()).
This binary file should be several times smaller than your .csv. Restart MATLAB to reduce its memory usage ("clear all" does not actually release all the RAM MATLAB is using). Load the variables from the binary file and then try plotting again.
Other considerations for reducing memory usage:
1. When you create the binary file, consider storing the data in single precision instead of double precision (the default). This alone will cut your memory usage in half. Use this only if that extra level of precision does not matter for your plots/results.
2. Open your Task Manager (on Windows) or System Monitor (on Linux) and check which apps are using the most RAM; most likely it is your web browser. Free as much RAM as you can by closing unnecessary apps, then restart MATLAB and try plotting again.
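The binary round trip described above can be sketched as follows. This is a sketch only, assuming the CSV has already been read into an N-by-6 double matrix called data; the file name is illustrative:

```matlab
% Write the data in single precision (half the bytes of double).
% fwrite stores the matrix column-major.
fid = fopen('data.bin', 'w');
fwrite(fid, single(data), 'single');
fclose(fid);

% ...later, e.g. after restarting MATLAB:
fid  = fopen('data.bin', 'r');
vals = fread(fid, Inf, 'single=>single');   % read back, keep single
fclose(fid);
data = reshape(vals, [], 6);                % restore N-by-6 shape
```

reshape fills column-major, matching the order fwrite used, so the original column layout is recovered.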
  3 comments
Walter Roberson
Walter Roberson on 26 Jul 2018
I calculate that the bin file would be about 89% of the size of the csv for your sample data. I don't think it would be worth going that route.
Sushant Mahajan
Sushant Mahajan on 1 Aug 2018
OK, can you let us know the size of your .csv file and how many data points you are trying to plot?
Am I correct in understanding that you can load the complete data file into memory, and only run out of memory when you try to interpolate? Could you also post the error message you get?



KSSV
KSSV on 26 Jul 2018
Are you looking for something like this?
T = readtable('data_632.0_43ddd.csv') ;
idx  = ~isnan(T.(1)) ;   % keep rows where the first column is not NaN
t    = T.(1)(idx) ;
p    = T.(2)(idx) ;
uoph = T.(3)(idx) ;
% Triangulate the (t,p) locations and plot the surface
dt = delaunayTriangulation(t,p) ;
tri = dt.ConnectivityList ;
trisurf(tri,t,p,uoph) ; view(2) ; shading interp
I have taken the first and second columns as the location.
  1 comment
Walter Roberson
Walter Roberson on 26 Jul 2018
Note that they indicated they were running out of memory. It turns out that the table version requires about 3 1/3 times as much storage as keeping the data purely numeric, such as with the textscan call I posted.

