Will I be able to hash a DICOM image?
I have a DICOM image as input and would like to hash it using any hash algorithm. Can I do this with the SHA-1 algorithm? If so, can somebody help me with the MATLAB code?
Answers (2)
Walter Roberson
on 28 Feb 2018
Yes, you can hash any data that can be represented in binary. At worst, use typecast on the numeric array to convert to uint8 and hash that.
If I recall correctly, SHA implementations are available on the File Exchange.
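A minimal sketch of the typecast-then-hash idea, assuming MATLAB is running with its bundled JVM so that java.security.MessageDigest is available. This is one possible route, not the File Exchange code; the small array is only a stand-in for pixel data from dicomread(). Note that hashing the bytes alone ignores the array's size and class.

```matlab
% Stand-in for pixel data; in practice this would come from dicomread().
img = uint16([0 1; 256 65535]);

% Reinterpret the array as raw bytes (column-major order, native byte order).
bytes = typecast(img(:), 'uint8');

% Hash the bytes with SHA-1 through the JVM.
md = java.security.MessageDigest.getInstance('SHA-1');
md.update(bytes);
digest = typecast(md.digest(), 'uint8');   % Java returns signed bytes (int8)

% Format the 20-byte digest as a 40-character lowercase hex string.
sha1hex = lower(reshape(dec2hex(digest, 2).', 1, []));
disp(sha1hex);
```

Swapping 'SHA-1' for 'MD5' or 'SHA-256' changes the algorithm without touching the rest.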
25 comments
Darsana P M
on 3 Mar 2018
Walter Roberson
on 3 Mar 2018
Darsana P M
on 4 Mar 2018
Walter Roberson
on 4 Mar 2018
Yes.
Depending on what you are doing, you might want to use the 'bin' input option of DataHash(), and you will probably want the 'hex' output format as well. But are you trying to hash just the image itself, or the entire file including all of the headers? What are you going to do with the hash?
Darsana P M
on 4 Mar 2018
Walter Roberson
on 4 Mar 2018
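% MD5 over the raw bytes of the array ('bin' ignores its size and class)
opt = struct('Method', 'MD5', 'Input', 'bin');
% I is the image array, e.g. as returned by dicomread()
L = DataHash(I, opt);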
Walter Roberson
on 4 Mar 2018
By the way, there is no point in calling dicomread() twice there: both calls will return the same thing.
Darsana P M
on 5 Mar 2018
Walter Roberson
on 5 Mar 2018
Write the data in binary to a file and use an external program to calculate the hash. If the data is not uint8 then make sure you know what byte order was used in the calculations and what byte order you are writing into the binary file.
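To illustrate the byte-order point, here is a minimal sketch, assuming 16-bit data, that fixes the byte order on disk through the machinefmt argument of fwrite() instead of relying on the native order. The data and the temporary file name are placeholders.

```matlab
data = uint16([1 258]);                      % 258 is 0x0102
fname = [tempname(), '.bin'];                % placeholder file name

fid = fopen(fname, 'w');
fwrite(fid, data, 'uint16', 0, 'ieee-le');   % force little-endian on disk
fclose(fid);

% Read the file back byte by byte to see what an external hasher would see.
fid = fopen(fname, 'r');
onDisk = fread(fid, [1, inf], '*uint8');
fclose(fid);
delete(fname);

disp(onDisk);   % 1 0 2 1 (low byte first for each value)
```

With 'ieee-be' the same values would land on disk as 0 1 1 2, and the hash would differ.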
Darsana P M
on 5 Mar 2018
Darsana P M
on 5 Mar 2018
Walter Roberson
on 5 Mar 2018
"How to do that. I find programming tough."
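fid = fopen('ToBeHashed.bin', 'w');   % open for binary writing
% typecast() needs a vector, so flatten the array with (:) first
fwrite(fid, typecast(TheDataToBeWritten(:), 'uint8'));
fclose(fid);
% run the external hash program; whatever it prints is returned in result
[status, result] = system('C:\Users\Darsana\HashUtilities\md5_hash.exe ToBeHashed.bin');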
Now analyze the character vector in result to see if it matches. You will need to strip out trailing end of line characters at the very least and you might need other manipulations depending on the format the external program returns.
C:\Users\Darsana\HashUtilities\md5_hash.exe should not be taken literally: you need to find a program that does md5 hashes and install it somewhere on your system and use the path to it.
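A sketch of the comparison step, assuming the external program prints the hash followed by the file name and a newline (the layout md5sum uses). The result string is simulated here rather than taken from a real program, so adjust the parsing to whatever your tool actually prints.

```matlab
% Simulated output of a hypothetical external tool: hash, spaces, file name, newline.
result = sprintf('D41D8CD98F00B204E9800998ECF8427E  ToBeHashed.bin\n');
expected = 'd41d8cd98f00b204e9800998ecf8427e';   % MD5 of empty input

% Strip the trailing end of line, then keep only the first whitespace-delimited token.
hashText = strtok(strtrim(result));

% Hex digits may be printed in either case, so compare case-insensitively.
isMatch = strcmpi(hashText, expected);
disp(isMatch);
```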
I do not recommend hashing the DICOM header separately from the DICOM file. The DataHash contribution has an option to apply the hash to an entire file: apply it to the .dcm file.
The reason I do not recommend hashing the dicom header separately is that DICOM headers are structured binary data that are a nuisance to format to text in unambiguous ways, especially as it is entirely within the standard for people to write additional fields using "private" tags that are only intended to be decodable by the same software. The order of the fields is not fixed.
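If DataHash is not at hand, hashing the whole file can be sketched without it, again assuming the JVM is available. The temporary file below, containing only the bytes of 'abc', is a stand-in for a real .dcm file so the snippet can run on its own.

```matlab
% Create a small stand-in file; in practice fname would be the path to the .dcm file.
fname = [tempname(), '.dcm'];
fid = fopen(fname, 'w');
fwrite(fid, uint8('abc'));
fclose(fid);

% Read every byte of the file exactly as stored on disk.
fid = fopen(fname, 'r');
fileBytes = fread(fid, [1, inf], '*uint8');
fclose(fid);
delete(fname);

% MD5 over the file bytes; headers and pixel data are covered together.
md = java.security.MessageDigest.getInstance('MD5');
md.update(fileBytes);
digest = typecast(md.digest(), 'uint8');
md5hex = lower(reshape(dec2hex(digest, 2).', 1, []));
disp(md5hex);   % 900150983cd24fb0d6963f7d28e17f72 for the bytes 'abc'
```

Because the hash covers the bytes as stored, any external MD5 tool run on the same file should report the same value.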
At this point I should probably ask why you want to take this hash. What are you going to do with it?
Darsana P M
on 5 Mar 2018
Edited: Walter Roberson
on 5 Mar 2018
Walter Roberson
on 5 Mar 2018
I think it is incorrect to proceed that way. I think you should just encrypt the entire file rather than the pixel data separately from the headers.
Because of export restrictions on strong encryption technology, we are not allowed to publish code for AES encryption. You can find many related discussions: https://www.mathworks.com/matlabcentral/answers/?term=AES. E.g.: https://www.mathworks.com/matlabcentral/answers/28999-anyone-has-d-code-for-aes-in-matlab#answer_37399
Darsana P M
on 6 Mar 2018
Walter Roberson
on 6 Mar 2018
I guess the algorithm is what it is. However, I do not think it is a good algorithm, because it does not specify clearly what "dicom header" is intended to be, and it only encrypts the pixel data whereas the sensitive part of a dicom header is the patient identification.
You know what might make more sense procedurally? Take a copy of the DICOM image, replace all of the pixel values with a constant such as 0, write it to a file, and use the bytes of the resulting file as the keying material. Then go back, replace the constant pixel values with the encrypted pixel values, and write the file again. But this still has problems. The DICOM standard does not require that the output tags be written in any particular order when you write out a DICOM file in full. And if you update the pixel values in place rather than writing a new file, the old block could potentially be marked as unused and the new pixels written at the end; or the new pixels could be logically moved to the end, with everything after the original pixel values "falling down" to occupy the now unused space.
You have an original DICOM header, but unless you specify an order in which to extract the attributes for the purpose of keying, and unless you specify a representation for the purpose of keying, including byte ordering, character width and encoding, you would have to send the original unencrypted DICOM over the channel to be sure that you got the same content to use for keying on the other end.
DICOM is not one of those container formats where you just slap on a magic number at the beginning and then have fixed sized pixel data at a fixed offset: instead the attributes can be stored in a wide variety of orders and the reading software is supposed to run through the file and build an index rather than relying on fixed offsets.
Darsana P M
on 6 Mar 2018
Walter Roberson
on 7 Mar 2018
dicomfilename = '....'; %as appropriate
dinfo = dicominfo(dicomfilename);
curimg = dicomread(dinfo);
zimg = zeros(size(curimg), 'like', curimg);
temp_filename = [tempname(), '.dcm'];
%deliberately do not verify header information written to file
%we are not "improving" the existing file, we need to work with
%what we have
dicomwrite(zimg, temp_filename, dinfo, 'CreateMode', 'copy');
fid = fopen(temp_filename, 'r');
raw_zfile = fread(fid, [1, inf], '*uint8');
fclose(fid);
delete(temp_filename);
Now raw_zfile contains a copy of the bytes of the dicom file as written on disk in which all of the pixels have been replaced by 0. This is not exactly the same as "the dicom header" (which is something that has an undefined structure) but it is independent of the original pixel values.
You need to understand here that a different version of MATLAB, or the same version running on a different class of operating system (Windows, Linux, Mac), might write out that DICOM file differently, and that if you are trying to communicate with a completely different system that is not running MATLAB, then that other system will have no way of doing the same DICOM write to create this kind of file for verification.
Hmmmm, perhaps the situation is not as bad as I thought. I see from https://www.leadtools.com/sdk/medical/dicom-spec1 that within any one data set, "The Data Elements in a Data Set shall be ordered by increasing Data Element Tag Number"
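The ordering rule quoted above can be illustrated with a small check. This is only a sketch over a hand-written list of (group, element) tags, not a DICOM parser; the three tags are the standard Study Date, Patient's Name and Pixel Data tags, picked as examples.

```matlab
% Example (group, element) pairs, one row per data element.
tags = uint32([hex2dec('0008') hex2dec('0020');    % Study Date
               hex2dec('0010') hex2dec('0010');    % Patient's Name
               hex2dec('7FE0') hex2dec('0010')]);  % Pixel Data

% Combine group and element into one 32-bit tag number: group in the high 16 bits.
tagNumber = tags(:, 1) * 65536 + tags(:, 2);

% Within one data set the tag numbers should be strictly increasing.
inOrder = all(diff(double(tagNumber)) > 0);
disp(inOrder);
```

Pixel Data (7FE0,0010) has a high tag number, so under this rule it sits near the end of the top-level data set.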
Darsana P M
on 7 Mar 2018
Darsana P M
on 7 Mar 2018
@Darsana: If DataHash treats the input in 'bin' mode, it considers only the contents of the variable. This works only for floating-point, integer, char or logical arrays. If you provide a cell, there is no single block of data that can be treated as a byte stream.
You need either the input mode 'array', or you have to provide the data as a single array of the mentioned types.
This is explained in the help section of DataHash also.
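One way to get "a single array" out of a cell of numeric arrays is to convert each element to bytes and concatenate them. This is only a sketch with made-up contents: it drops the cell boundaries and the classes, so different cells can produce the same bytes, and typecast() does not accept char or logical input.

```matlab
% Made-up cell holding numeric arrays of different classes.
C = {uint8([1 2 3]), uint16([10 20]), single(1.5)};

% Convert each element to a row of raw bytes (native byte order).
parts = cellfun(@(x) typecast(x(:).', 'uint8'), C, 'UniformOutput', false);

% Concatenate into one uint8 row vector: 3 + 4 + 4 = 11 bytes.
bytes = [parts{:}];
disp(numel(bytes));

% bytes is now a plain uint8 array that a byte-oriented hash can consume.
```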
Darsana P M
on 8 Mar 2018
Jan
on 8 Mar 2018
@Darsana: The question is still not clear to me, even after 23 comments. You could call DataHash with the option 'Input', 'array'. But then the "hash over the header" is something very specific that can be reproduced only with DataHash. It might be much better to get the "DICOM header" as a byte stream and calculate the hash over that. This would also be reproducible without MATLAB, whereas applying DataHash to the imported header information is very specific. For example, the hash would change if MathWorks decided to use string objects instead of char vectors in the future.
Consequently, I cannot suggest a specific method that solves your problem reliably, and I have the impression that the problem is not yet defined exactly.
Darsana P M
on 8 Mar 2018
Jan
on 5 Mar 2018