How to import hex values from text file?

Hi, I have a text file of hex values in the format '3D 3D 3D 3E 41' etc., and I want to import it into a matrix so that I can use imshow(Matrix) to reconstruct the image from the data. Each row has 1280 values and there are 1024 rows (a 1280 px x 1024 px image). I have tried fscanf and dlmread, but neither works as I expect: dlmread does not recognize the first value, and fscanf does not read the data the way I want when I use [1280 1024]. Examples below:
3D 3D 3D 3C 3E 3A 40 3C 3E 3C 3D 3B 3B 3C 3D 3C 3C 3C 3C 3D 3D 3B 3B 3D 3F 3B 3D 3E 3D 3C 3F 3C 3D 3C 3E 3E 3F 3C 40 3E 40 3D 3E 3D 3E 3D 3E 3B 3C 3B 3D 3D 3D 3D 40 3B 3D 3D 42 3D 3C 3B 3D 3C 3D 3B 3C 3B 3B 3C 3B 3E 3F 3D 3E 3D 3E 3E 40 3C 3E 3D 3D 3C 3D 3C 3D 3D 3F 3B 3A 3B 3C 3D 3F 3B 3C 3D 3D 3B 3B 3A 3B 3C 3B 3B 39 3B 3C 3C 3B 3D 3E 3D 3D 3A 3C 3D 3F 3D 3D 3D 3B 3C 3E 3A 3C 3E 3D 3B 3D 3B 3E 3C 3B 3D 3C 3C 3C 3A 3D 3C 3C 3B 3D 3C 3B 3C 3B 3C 3C 3C 3D 3C 3D 3D 3A 3B 40 3C 3D 3B 3E 3D 3B 3A 3D 3A 3E 3E 3D 3B 3E 3D 3C 3B 3E 3B 3E 3E 3D 3D 3B 3A 3E 3C 3D 3C 3D 3C 3B 3A 3E 3A 3C 3D 3C 3E 3D 3E 3D 3D 3D 3B 3B 3A 3D 3A 3C 3B 3D 38 3A 39 3B 3B 3E 3A 3D 3D 3D 3A 3C 3C 3E 3A 3D 3A 3C 3A 3B 38 3C 3A 3D 3C 3D 3B 39 3B 3A 3B 3D 3A 3B 39 3F 3A 3E 3C 3D 3B 3B 3C 38 39 3A 39 3B 39 3D 3C 3A 3A 3E 3B 3D 3B 3E 3B 3D 3D 3A 3B 3E 3E 3E 39 3E 3B 3A 39 3E 3C 39 39 3D 39 3C 3A 38 3A 3A 3E 3B 3B 3C 3B 3A 3A 3C 3B 3D 3B 3D 3B 3A 39 3C 3A 3C 3C 3A 3A 3B 3A 3C 38 3D 3C 3C 3A 3B 3A 3B 3B 3B 3A 3D 3A 3B 3A 3B 3C 3B 38 3B 39 3B 39 3E 39 3C 3B 3A 3B 3B 39 3C 3C 3A 3A 3A 3A 3A 38 3E 39 3C 3B 3D 3E 3C 39 39 39 3E 3B 3C 3A 39 39 3B 37 39 3B 3C 3C 3D 37 3A 3A 38 3C 3B 3A 3A 3A 3A 39 39 39 3C 38 3C 3B 3D 3C 39 3B 3C 3A 3A 3B 3C 3A 3A 3A 3A 37 3B 38 39 39 39 3A 39 39 39 3A 3A 3B 3C 3B 3D 39 3E 3A 3B 39 3A 3B 3C 3C 3C 37 39 3A 3C 35 3B 39 39 39 39 3B 3C 3A 39 3A 3A 3C 3D 3A 3C 3A 3B 3C 3C 3A 3B 3B 39 39 39 3C 3D 3A 3A 39 3D 3B 3B 3A 3A 38 37 3A 39 3A 3A 37 3A 3A 3B 37 3A 3A 3A 37 3D 3A 3C 39 3D 38 3C 37 3A 39 3B 3A 3B 3A 3C 3A 3B 39 3C 3C 3A 3A 39 39 38 3A 3A 3A 3B 38 37 3B 39 39 3A 39 37 3B 3A 3A 39 38 39 39 3C 3A 3A 3A 3A 38 3B 3C 38 3A 39 3A 3A 3B 38 38 39 3A 3B 3A 39 38 3A 39 3B 36 3A 37 3A 3A 3A 38 39 37 38 38 3B 38 3B 39 3B 38 3A 38 37 36 3B 38 39 3B 39 3A 39 39 3A 3B 3C 38 3A 38 38 39 38 39 3A 39 39 39 3B 3B 3D 39 39 38 38 38 3A 37 39 39 38 38 3B 39 3A 38 38 34 39 36 39 38 39 38 39 38 3A 37 39 37 39 35 38 37 37 39 39 37 38 38 38 39 38 37 37 37 39 39 38 37 39 39 39 37 3A 37 39 38 3A 36 38 38 
39 39 3A 37 3C 39 39 39 36 37 36 37 37 37 37 37 37 37 37 37 39 37 37 37 3B 36 38 36 39 36 39 38 38 38 39 34 37 38 3A 37 38 39 3A 38 39 36 39 37 37 38 3A 37 39 39 3B 38 39 38 37 36 37 37 38 36 39 36 36 35 36 36 37 35 36 36 37 35 36 37 37 36 3A 37 36 35 35 37 36 37 37 38 36 35 35 36 37 35 37 37 38 38 37 39 3A 35 37 36 39 37 38 37 36 35 37 35 35 35 37 35 37 38 37 35 37 36 37 34 3A 34 36 39 3A 37 37 36 38 34 35 35 36 34 37 36 34 37 36 35 37 34 35 34 36 37 38 35 35 35 34 34 38 37 35 36 37 33 35 32 33 34 36 34 36 37 35 34 35 34 36 35 34 34 37 36 35 35 34 33 37 34 35 32 35 35 38 33 35 34 34 33 38 34 32 34 34 32 36 34 36 33 35 35 33 33 36 31 36 33 36 34 34 33 36 31 33 33 35 33 34 33 36 34 33 33 37 33 35 34 34 34 34 31 33 32 34 33 34 33 35 33 34 31 32 33 33 33 35 35 34 32 34 32 34 34 30 32 32 30 34 32 34 31 33 33 34 30 34 32 33 33 32 32 31 31 33 32 33 31 34 32 2F 33 32 2F 32 31 33 2F 34 32 33 31 32 2E 30 31 32 31 31 31 34 31 34 32 32 30 33 2E 31 2F 31 30 33 31 31 30 31 30 30 30 32 31 33 31 32 2F 30 30 32 31 2F 2E 30 30 32 30 31 2F 30 2F 31 30 32 2F 2F 32 31 2F 2F 2F 31 2F 30 2E 2F 2F 2F 30 30 30 30 2F 2D 2F 30 2C 2E 2C 30 2D 2F 2F 31 2F 2F 2E 2F 2E 2E 2F 31 31 2F 2C 30 2F 31 2C 30 31 2D 2D 2E 2F 2D 2D 2D 2D 2F 2D 2E 2B 2E 2D 2D 2D 2E 2D 2E 2C 2A 2D 2E 2E 2D 2C 2D 2B 2C 2E 2C 2B 2D 2A 2A 2D 2C 2B 2E 2B 2B 29 2D 2B 2B 2B 2D 2B 2B 2C 2C 2D 2B 2C 2C 2A 2C 2B 2C 2C 2C 2A 2B 29 2B 2A 2B 29 2A 29 2A 29 2C 2A 2A 27 2B 27 2C 28 2A 29 2B 29 2B 2A 29 2B 2A 2B 2A 28 2A 2B 29 2A 2A 27 2A 27 2A 2B 2B 28 2A 28 29 27 29 28 2A 27 29 28 28 28 27 29 29 26 27 29 2A 28 2A 28 29 26 28 27 28 26 27 26 2A 28 28 27 26 28 28 26 28 2A 27 27 27 28 27 27 26 27 26 27 27 27 29 25 27 25 27 24 26 26 28 28 26 25 27 24 28 25 27 25 25 25 26 24 26 24 26 25 25 24 25 24 25 22 23 25 24 23 25 23 25 24 23 22 23 22 25 21 24 23 23 22 23 22 22 22 25 21 24 23 23 21
3D 3D 3D etc...
This is one row of the data in hex. About 20 spaces follow the end of each row, and the next row begins on the line right below.
fscanf I was using:
fid = fopen('cb130.dat');
cb130mat = fscanf(fid,'%x');
fclose(fid);
dlmread I was using:
cb130mat = dlmread('cb130.dat');
This is the error that showed:
Mismatch between file and format string. Trouble reading number from file (row 1u, field 1u)
Thanks for the help.

13 comments

Stephen23 on 6 Aug 2015
Please upload your file using the paperclip button and then clicking both the Choose file and Attach file buttons.
Walter Roberson on 6 Aug 2015
With the test data you show here, the '%x' format works. We will need a sample file to test with.
You will not be able to use dlmread for this: dlmread() can only handle numeric values.
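To illustrate the point about '%x', here is a minimal sketch that parses a short string with sscanf, which accepts the same conversion specifiers as fscanf. The string is just the first few values quoted in the question; the variable names are only for illustration.

```matlab
% Parse space-separated hex tokens with the '%x' conversion.
% Whitespace (spaces, line breaks) between tokens is skipped automatically.
str  = '3D 3D 3D 3E 41';
vals = sscanf(str, '%x');   % column vector of doubles
disp(vals.')                % values: 61 61 61 62 65
```

If this works on a pasted sample but not on the real file, the file probably contains something other than hex tokens and whitespace.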
dpb on 6 Aug 2015
Edited: Walter Roberson on 6 Aug 2015
But the data are numeric, Walter!!! :)
The limitation on dlmread is more restrictive than just "numeric"; it's also base 10. (Pedant mode off) <vbg>
Walter Roberson on 7 Aug 2015
Looking at the hex converted to decimal, I doubt the data is "numeric"; it looks to me to be either character or graphic represented as numeric.
dpb on 7 Aug 2015
What the data represent is immaterial to the point I was making (albeit it is somewhat pedantic, hence the smile/grin/chuckle) that if the %X format string works it can be considered numeric, just in base 16. Of course, all data are numeric in memory representation; it's only our interpretation of them that makes them otherwise...
I've wondered in the past why there wasn't more flexibility in some of the routines such as dlmread. Although there are ways around them, it would sometimes be convenient if they were slightly more general than they are...
Walter Roberson on 7 Aug 2015
But pedantically, the data is not numeric; the data is represented by something that represents something that represents numeric values (and I suspect there is at least one more layer of representation in there too).
Cedric on 7 Aug 2015
I wanted to mention this Wiki about the DIKW pyramid before I pull a philosopher of science from the department next door into the discussion.
dpb on 8 Aug 2015
And all this to an engineering major who thought Moby Dick was just a good story of a fishing trip ... :)
Cedric on 8 Aug 2015
Edited: Cedric on 8 Aug 2015
Wait, Moby Dick was not just about a fishing trip ?! ;-)
Walter Roberson on 8 Aug 2015
No-one knows; no-one has actually read Moby Dick, only skimmed some of the pages. (Except, of course, Laurie Anderson)
dpb on 9 Aug 2015
From the comments, looks like her reading may have been even more cursory... ;) I think it was roughly age 14 I first read it; can recall at least once more in HS and then the frosh American Lit prof who didn't much appreciate my trying to cast into a Swift-like satirical commentary... :)
Samuel Davies on 12 Aug 2015
Okay csv file uploaded because .dat was not supported. I hope the structure is preserved. It's all in hex so the lettered values are out of alignment.
Cedric on 12 Aug 2015
My code should work on your file.


Accepted Answer

Cedric on 7 Aug 2015
Edited: Cedric on 12 Aug 2015
EDIT: I updated my answer with comments and made it a little more efficient.
buf = fileread( 'chromebackground_0_150810-144439.csv' ) ;
% - Keep only relevant characters.
buf = buf(buf>47) ;
% - Cast to double (makes the following faster).
buf = buf + 0 ;
% - Map '0'-'9' to 0-9 and 'A'-'F' to 10-15.
idNum = buf < 58 ;
buf(idNum) = buf(idNum) - 48 ;
buf(~idNum) = buf(~idNum) - 55 ;
% - Build dec value from components and shape as rectangular array.
img = reshape( [16, 1] * reshape(buf, 2, []), 1280, [] ) ;
=====[ FORMER ]=========================================================
Does the following work?
buffer = fileread( 'cb130.dat' ) ;
buffer(buffer<48) = [] ;
idNum = buffer < 58 ;
buffer(idNum) = buffer(idNum) - 48 ;
buffer(~idNum) = buffer(~idNum) - 55 ;
img = reshape( [16, 1] * reshape( buffer, 2, [] ), 1280, [] ) ;
You may have to transpose the output or to reshape with 1024 rows because it is not very clear from your question how it is ordered in the file.
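The reason a transpose may be needed is that MATLAB fills arrays column by column, while the file is written row by row. A minimal sketch with a hypothetical miniature image (2 rows of 3 pixels instead of 1024 rows of 1280 pixels; the values are made up):

```matlab
% Values in the order they appear in the file: row 1 first, then row 2.
vals  = [11 12 13 21 22 23];
nCols = 3;                          % pixels per file row (1280 in the question)

% reshape fills column by column, so each file row lands in a column;
% transposing turns those columns back into image rows.
img = reshape(vals, nCols, []).';
disp(img)
```

With the real data, the same pattern would be reshape(vals, 1280, []).' to obtain a 1024 x 1280 array, assuming the file really holds one image row per line.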
Note that it's worrisome if it works, because you are using 3 times the space needed to store the data that define your image.

3 comments

Samuel Davies on 12 Aug 2015
Edited: Samuel Davies on 12 Aug 2015
The raw .dat file is ordered in 1280 little hex values making up a block and there are 1024 blocks. LIKE SO --->
xxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxx
xxxxxxx BLOCK 1
xxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxx
xxxxxxx BLOCK 2
etc...
Each block therefore represents one row, since my image is 1024 x 1280. When I run your answer (the top one), I get an error saying that buffer is missing an argument, something like x = buffer(M,N); see doc buffer. I can open the file if I go to its path and open the .dat directly; it opens in the same format as the CSV but with fixed spacing. So I can open the file in the format I want, but it takes an inordinately long time to open, and when I import it as a matrix all the values containing letters are missing.
* The bottom one worked, but I'll have to check that the values are what they should be.
* They are what they should be.
Could you explain what the program does? I have looked at the buffer doc but can't really follow it. I did end up transposing img and flipping the rows, but I think it works. I am then using M = mat2gray(img) to normalize the data, so the steps are hex to decimal, then normalization. Your program works, I think.
Cedric on 12 Aug 2015
Edited: Cedric on 12 Aug 2015
I think what we don't understand is the format of your original data (the .dat file).
Hexadecimal is a way to represent numbers (in base 16). One byte (= 8 bits) that codes for, e.g., levels of grey can code for 2^8 = 256 levels. It can be represented using an unsigned integer from 0 to 255, eight binary symbols (0 or 1), two hex symbols, etc. The hex representation is generally used because it is compact (two symbols versus eight for binary or up to three for decimal integers). You attached a CSV file to your question, which stores a hex representation of the data. Each byte is represented by two hex symbols, and successive bytes are separated by commas (D3,AA,24,FF,..). Each row ends with two invisible characters: a carriage return (one byte, ASCII code 13) and a new line (one byte, ASCII code 10). This means that you are using three characters per byte to represent your data (= 3 bytes on disk per data byte), plus a little extra for the new line characters.
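The storage overhead can be worked out directly. This is a rough sketch that assumes each CSV row holds exactly 1280 two-symbol values, 1279 commas, and a CR/LF pair, with no trailing comma or padding; the real file may differ slightly.

```matlab
nPerRow = 1280;   % values per row, from the question
nRows   = 1024;   % number of rows, from the question

rawBytes  = nPerRow * nRows;                          % one byte per pixel
textBytes = nRows * (2*nPerRow + (nPerRow - 1) + 2);  % hex symbols + commas + CR/LF

fprintf('raw: %d bytes, text: %d bytes, ratio: %.4f\n', ...
        rawBytes, textBytes, textBytes / rawBytes);
```

Under these assumptions the text form takes 3,933,184 bytes against 1,310,720 bytes for the raw pixels, i.e. very close to three times the space.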
Now we all asked questions related to what was the actual format of your data and whether hex was just a representation or really the aforementioned format. The reason is that some people open files with hex editors, see hex code and say that they have hex data, but they don't; the hex editor is just representing bytes in hex.
Looking at the CSV file that you provided, I see that it contains the hex representation stored as characters (internally, of course, these are bytes, but we read them and interpret them as ASCII codes, and we display the hex representation). I guessed that it was the case before having the CSV, and I proposed a solution which reads the file as characters, eliminates all characters which are not in {'0',..'9','A',..,'F'}, converts ASCII codes to numbers, and aggregates them in base 16. This seems to be working (?) as it outputs the following image:
Now I still have no clue about the content of the .dat file that you mention, and whether it stores the bytes of the image represented in hex, or whether the CSV is a hex representation that you made yourself from the bytes of the .dat file.. ?
Finally, I apologize about buffer. I am not using the function of the Signal Processing Toolbox, but I am using it as a variable name. It is a bad habit that I picked up at a time when I didn't have this toolbox.. and bad habits are difficult to lose, even when we finally have the toolbox ;-) I changed it to buf.
PS: if you don't understand the detail of what my solution does, try on a small example:
>> buf = sprintf( '08,D3,FF,12\r\n27,1B,00,7C\r\n' )
buf =
08,D3,FF,12
27,1B,00,7C
This is a string, where we insert a carriage return and a new line at the end of each line. Let's check that it is a string (type/class = char):
>> class( buf )
ans =
char
Now MATLAB stores the ASCII codes of characters, which are numbers, and knowing that the class is char, it is able to display the characters that correspond to the ASCII codes. There are many ways to display the ASCII codes; one way is to convert to numeric (double) by adding 0 (a number) to the string. Let's test with the first upper case letters ABC:
>> class( 'ABC' )
ans =
char
>> 'ABC' + 0
ans =
65 66 67
>> class( 'ABC' + 0 )
ans =
double
This shows that the 'ABC' is a string (char), that ASCII codes for these letters are respectively 65, 66, 67, and that they are numeric (double). So ASCII codes of upper case letters go from 65 to 90. What about characters '012' etc?
>> '012' + 0
ans =
48 49 50
So ASCII codes of characters '0' to '9' go from 48 to 57. Now let's see what ASCII codes of characters in buf are:
>> buf + 0
ans =
48 56 44 68 51 44 70 70 44 49 50 13 10 50 55 44 49 66 44 48 48 44 55 67 13 10
Here you recognize 48 for the first '0', etc. You see that commas are 44, carriage returns \r are 13, and new lines \n are 10. So our first move, if we want to keep the hex characters only, is to pick all elements whose ASCII code is above 47 (which eliminates commas, carriage returns, and new lines):
>> buf = buf(buf>47) ;
>> buf + 0
ans =
48 56 68 51 70 70 49 50 50 55 49 66 48 48 55 67
Now you see ASCII codes for '0', '8', 'D', '3', etc. And then we convert buf to numeric, subtract the ASCII code of '0' from the characters that were coding digits (this remaps 48-57 into 0-9), etc.
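The remaining steps on the same small example can be sketched as follows. This assumes upper case hex letters only (lower case 'a'-'f' would need a different offset); the names isDigit, vals and check are introduced here for illustration, and hex2dec is used only as a cross-check.

```matlab
buf = sprintf('08,D3,FF,12\r\n27,1B,00,7C\r\n');
buf = buf(buf > 47);                   % drop commas, CR and LF
buf = buf + 0;                         % char -> double (ASCII codes)

isDigit = buf < 58;                    % '0'..'9' have codes 48..57
buf(isDigit)  = buf(isDigit)  - 48;    % '0'..'9' -> 0..9
buf(~isDigit) = buf(~isDigit) - 55;    % 'A'..'F' (65..70) -> 10..15

% Pair the symbols up: first symbol times 16 plus second symbol.
vals = [16, 1] * reshape(buf, 2, []);

% 4 values per file row here (1280 in the real file); the transpose puts
% one file row in each image row.
img = reshape(vals, 4, []).';
disp(img)
%      8   211   255    18
%     39    27     0   124

% Cross-check against the built-in conversion.
check = hex2dec({'08';'D3';'FF';'12';'27';'1B';'00';'7C'}).';
disp(isequal(vals, check))
```

The product [16, 1] * reshape(buf, 2, []) works because each column of the reshaped array holds the two symbols of one byte, so the row vector [16, 1] weights them as base-16 digits.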


Asked: 6 Aug 2015

Commented: 13 Aug 2015
