Finding similar folders (size, number of files and file names)

mohammad on 19 Sep 2011
A specified folder contains several subfolders, for example named a, b, c, d, e, f, ... . Each of these subfolders contains some .xls files. Sometimes several subfolders contain the same .xls files. For example, the .xls files inside 'a' may all be inside 'd' as well (exactly the same names, sizes and number of .xls files).
I need to find the folders whose contents are the same, rename one of them to '(number of .xls files)' and delete the other duplicates. For example, if 'a','f','e','w','f','c' all contain the same 16 .xls files, then 'f','e','w','f','c' must be deleted and 'a' renamed to '(16)'.
These similar folders could also be identified by their total size, and I think this is the simplest way (but it is not accurate, because two folders may have the same size and still differ in content).
2 Comments
Jan on 19 Sep 2011
What exactly is the size of a folder? Are you looking for similar or equal files? Would a checksum of the files, or of all files inside a folder (and its subfolders?), help?
mohammad on 19 Sep 2011
I am looking for equal folders, meaning that all files inside one folder are the same as (equal to) all files inside another folder.


Answers (1)

Jan on 19 Sep 2011
Usually files are compared using checksums, e.g. FEX: CalcMD5.
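For readers who want a content checksum without downloading anything, here is a minimal sketch that calls Java's java.security.MessageDigest from MATLAB. The helper name fileMD5 is hypothetical (it is not the FEX CalcMD5 function), and the sketch assumes MATLAB is running with the JVM enabled. It reads the whole file into memory, so it suits modest .xls files rather than very large ones.

```matlab
function hash = fileMD5(fileName)
% fileMD5  Hex MD5 digest of a file's bytes (hypothetical helper name).
% Assumes MATLAB is running with the JVM enabled.
fid = fopen(fileName, 'r');
if fid < 0
    error('fileMD5:open', 'Cannot open file: %s', fileName);
end
closer = onCleanup(@() fclose(fid));    % close the file even on error
data = fread(fid, Inf, '*uint8');       % whole file as a uint8 column

md = java.security.MessageDigest.getInstance('MD5');
if ~isempty(data)
    md.update(data);                    % feed the bytes to the digest
end
% digest() comes back as int8; reinterpret as uint8 before formatting
hash = sprintf('%02x', typecast(md.digest(), 'uint8'));
end
```

Two files with equal digests can be treated as equal for this purpose. Hashing every file is slower than looking at sizes only, but it also catches files of equal size with different contents.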
[EDITED]: You can use FEX: DataHash on the struct returned by DIR:
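aDir = dir(FolderName);
isFile = not([aDir.isdir]);          % skip '.', '..' and subfolders
fileSize = [aDir(isFile).bytes];     % DIR stores the file size in the field "bytes"
Hash = DataHash(sort(fileSize));     % sorting makes the hash independent of the file order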
This takes the number of files and the sizes of the individual files into account. It would be more accurate to include the names and dates as well:
aDir = dir(FolderName);
isFile = not([aDir.isdir]);
Hash = DataHash(aDir(isFile));
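One caveat when hashing the whole struct: newer MATLAB releases (R2016b and later, to my knowledge) add a "folder" field to the output of DIR, which would make every folder hash differently, and file dates can change when files are copied. A cautious sketch therefore keeps only the names and sizes. The helper name folderSignature is hypothetical, and a plain string stands in for the DataHash value so that the code runs without File Exchange downloads.

```matlab
function sig = folderSignature(folderName)
% folderSignature  String built from sorted file names and byte sizes.
% Hypothetical helper; ignores dates and the folder path on purpose.
aDir = dir(folderName);
aDir = aDir(~[aDir.isdir]);              % drop '.', '..' and subfolders
[names, order] = sort({aDir.name});      % order-independent listing
sizes = [aDir(order).bytes];
parts = cell(1, numel(names));
for k = 1:numel(names)
    parts{k} = sprintf('%s|%d', names{k}, sizes(k));
end
% '/' is not allowed inside file names, so it is a safe separator
sig = sprintf('%d files/%s', numel(names), strjoin(parts, '/'));
end
```

Two folders with the same file names and sizes then give the same string, which can be compared with strcmp or passed to DataHash if a fixed-length value is preferred.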
Now you can store the Hash values that have already occurred in a cell string. If any(strcmp(Hash, HashList)) is TRUE, delete the folder; if it is FALSE, rename the folder and append Hash to the list.
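The bookkeeping described above could look roughly like the following sketch. The function name removeDuplicateFolders and the dryRun flag are hypothetical, and the key is a plain string of file names and sizes used in place of the DataHash value. With dryRun left at its default, nothing is deleted or renamed, which is worth keeping until the reported lists look right.

```matlab
function [kept, removed] = removeDuplicateFolders(parentDir, dryRun)
% removeDuplicateFolders  Keep one subfolder per distinct .xls listing.
% Hypothetical helper. With dryRun = true (the default) it only reports
% what it would do; nothing is deleted or renamed.
if nargin < 2, dryRun = true; end
sub   = dir(parentDir);
isSub = [sub.isdir] & ~ismember({sub.name}, {'.', '..'});
subNames = sort({sub(isSub).name});          % fixed processing order
keyList  = {};                               % signatures seen so far
kept = {};  removed = {};
for k = 1:numel(subNames)
    folder = fullfile(parentDir, subNames{k});
    files  = dir(fullfile(folder, '*.xls'));
    files  = files(~[files.isdir]);
    if isempty(files), continue; end         % leave folders without .xls alone
    [names, order] = sort({files.name});
    pairs = [names; num2cell([files(order).bytes])];
    key   = sprintf('%s|%d/', pairs{:});     % stands in for the DataHash value
    if any(strcmp(key, keyList))             % same listing seen before
        removed{end+1} = subNames{k};        %#ok<AGROW>
        if ~dryRun, rmdir(folder, 's'); end
    else
        keyList{end+1} = key;                %#ok<AGROW>
        kept{end+1} = subNames{k};           %#ok<AGROW>
        target = fullfile(parentDir, sprintf('(%d)', numel(files)));
        % Skip the rename if another group already took this name
        if ~dryRun && ~exist(target, 'dir')
            movefile(folder, target);
        end
    end
end
end
```

A call such as [kept, removed] = removeDuplicateFolders(FolderName) lists the candidates, and passing false as the second argument applies the changes. Note that two different groups with the same number of files would both want the name '(16)'; this sketch leaves the second one under its old name. It also compares names and sizes only, so files of equal size with different contents are not detected.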
4 Comments
mohammad on 19 Sep 2011
Thanks. No, I am not sure ;)
I suggested that because the speed of comparing these folders is very important for me.
Thanks a lot.
Let me check your answer.
Jan on 19 Sep 2011
In general, validity is more important than speed. Creating a wrong result at high speed will lead to trouble; creating a correct result slowly will only increase the consumption of coffee.
