How to place multiple csv files along coloum, side by side, in a single file.

Question

0 votos

Hi I want to place all csv files, under a folder, into a single sheet as single csv file alone coloum direction, side by side.

For example, first file- col A to C, second file- col D to F, and so on, according to file hierarchy (order) of the input folder. I Have studied some existing tutorial in mathwork but still can not solve my problem. All files have equal col and row dimension. I have attached 2 sample file here out of 4000 files. Note that all csv files have single sheet only.

Thanks in advance for solution.

2 comentarios
Mostrar Ninguno Ocultar Ninguno

per isakson el 18 de Dic. de 2015

Do you want a csv-file with 4000x3 columns?

hosse el 18 de Dic. de 2015

Yes. if 4000*3 is not possible. 1000*3 will work.

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Follow Question

Answer 1

Guillaume el 18 de Dic. de 2015

Editada: Guillaume el 18 de Dic. de 2015

Abrir en MATLAB Online

1 voto

csv is a text format, and as all text formats it is written to file line by line. Therefore to do what you want, you have no choice but to hold in memory all 4000x3 columns of at least the row you are writting. So, depending on how much memory you have available you have two choices.

a) To hold 4000 files made of 3 columns by 47772 rows in memory, you need around 4.3 GB of memory. If you have the memory, the simplest thing is to read all 4000 files into a cell array of matrices, concatenate those matrices into one matrix and write that matrice in one go. Something similar to

%warning code is completely untested, there may be bugs
folder = 'somefolder';
files = dir(fullfile(folder, '*.csv')); %assuming you want the files in dir order
filedata = cell(1, numel(files));
for filecount = 1:numel(files)
   filedata{filecount} = csvread(fullfile(folder, files(filecount).name));
end
mergeddata = [filedata{:}];
csvwrite(fullfile(folder, 'somename.csv'), mergeddata);

b) You don't have enough memory to hold all the files at once in memory. You would have to read the first few rows of the 4000 files, merge them and write it to your destination, then read the next few rows, merge them, and append that to the file, and so on.

Option b) is bound to be much slower than a), but even a) is not going to be fast, since matlab will have to parse 4000 files.

6 comentarios
Mostrar 4 comentarios más antiguos Ocultar 4 comentarios más antiguos

Guillaume el 18 de Dic. de 2015

Abrir en MATLAB Online

Thinking more about this, there is option c) that does not involve parsing the files, so is bound to be more efficient: Simply read the files as text strings, merge the strings adding a comma in between and write to destination file.

Option c1): read the entirety of the files one by one, merge, save:

%warning code is completely untested, there may be bugs
folder = 'somefolder';
files = dir(fullfile(folder, '*.csv')); %assuming you want the files in dir order
filedata = arrayfun(@(file) fileread(fullfile(folder, file.name)), files, 'UniformOutput', false); %reading 4000 files is going to take a while
filedata = regexp(filedata, '\n', 'split'); %split each file into lines, may take a while
mergeddata = vertcat(filedata{:})';
fid = fopen(fullfile(folder, 'outputname.csv', 'wt');
for row = 1:size(mergeddata, 1)-1
   fwrite(fid, strjoin(mergeddata(row, :), ',');
   fwrite(fid, char(10));
end
fclose(fid);

option c2): read files one (or a few) line at a time, merge lines, save one (or a few) line at a time. One issue is you can't have more than 512 files open at a time (at least on Windows) so you'll be constantly opening and closing files, which is going to slow down an already slow process. Use fopen and fclose to open and close the files, fgetl and fwrite to read and write the lines.

Seung-Eon Roh el 17 de Jun. de 2019

Thanks, Hosse and Guillaume. It also worked for me.

Afshin Goodarzi el 14 de Sept. de 2020

hi Guillaume

in my case the first rows in csv files are names, not numbers. when trying to run the "Option c1" code, i get the error below, could you help me please?

Error using dlmread (line 147)

Mismatch between file and format character vector.

Trouble reading 'Numeric' field from file (row number 1, field number 1) ==>

lat,lan,pre,pre2\n

Error in csvread (line 48)

m=dlmread(filename, ',', r, c);

Error in Untitled6 (line 7)

filedata{filecount} = csvread(fullfile(folder, files(filecount).name));

Iniciar sesión para comentar.

Answer 2

Ingrid el 18 de Dic. de 2015

Abrir en MATLAB Online

0 votos

this can be easily achieved with the following code

listing = dir(nameFolder);
N = numel(listing);
data = [];
for ii = 3:N
  fid = fopen(listing{ii}));
  newData = textscan(fid,'%f%f%f);
  data = [data, newData];
  fclose(fid);
end

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Guillaume el 18 de Dic. de 2015

I'd qualify that easily. Doing 4000 reallocations of data due to the resizing on each file, to end up with 4.3 GB matrix is going to take a long time.

Iniciar sesión para comentar.

How to place multiple csv files along coloum, side by side, in a single file.

2 comentarios
Mostrar Ninguno Ocultar Ninguno

Respuesta aceptada

6 comentarios
Mostrar 4 comentarios más antiguos Ocultar 4 comentarios más antiguos

Más respuestas (1)

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Categorías

Etiquetas

Community Treasure Hunt

How to place multiple csv files along coloum, side by side, in a single file.

2 comentarios Mostrar Ninguno Ocultar Ninguno

Respuesta aceptada

6 comentarios Mostrar 4 comentarios más antiguos Ocultar 4 comentarios más antiguos

Más respuestas (1)

1 comentario Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Categorías

Etiquetas

Ver también

Community Treasure Hunt

2 comentarios
Mostrar Ninguno Ocultar Ninguno

6 comentarios
Mostrar 4 comentarios más antiguos Ocultar 4 comentarios más antiguos

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos