Issue with native2unicode and windows-1252 encoding
Hi all,
I'm trying to encode some bytes into a character set using the windows-1252 encoding and I've checked that native2unicode
1 comment
Answers (3)
Walter Roberson
on 14 Jan 2022
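source = char(0:511)   % every code point from 0 to 511
bytes = unicode2native(source, 'windows-1252')   % encode to windows-1252 bytes
backport = char(bytes)   % reinterpret each byte value as a code point (no decoding)
% 1-based indices where the byte value differs from the original code point
whichdiffer = find(source(1:256) ~= backport(1:256) )
source(whichdiffer)
bytes(whichdiffer)
backport(whichdiffer)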
What this tells us is that Unicode code points 129 to 141 are not represented in Windows-1252.
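bytes2 = uint8(129:141)   % raw byte values to decode
encodes_as = native2unicode(bytes2, 'windows-1252')   % decode the bytes as windows-1252
double(encodes_as)   % show the resulting Unicode code points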
Looks about right.
2 comments
Walter Roberson
on 17 Jan 2022
Code point 26 (SUB) is the standard value substituted for code points that cannot be represented.
https://en.m.wikipedia.org/wiki/Substitute_character
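To see where a substitution has happened in a particular string, one option is to encode it, decode it again, and compare the result with the original. This is only a minimal sketch: the sample string and the variable names are made up for illustration, and it assumes each character encodes to exactly one byte, which should hold for a single-byte encoding such as windows-1252.

```matlab
% Illustrative sample: plain ASCII, the euro sign, a C1 control code point,
% and a character outside windows-1252 (all chosen for this sketch only).
str = ['abc' char(8364) char(129) char(257)];

bytes     = unicode2native(str, 'windows-1252');    % encode to bytes
roundtrip = native2unicode(bytes, 'windows-1252');  % decode back to characters

% true wherever the character did not survive the round trip
lost = (str ~= roundtrip);

double(str(lost))     % code points that were changed by encoding
double(bytes(lost))   % byte values the encoder produced for them
```

Positions where lost is true are the ones to inspect; if the encoder substituted, the corresponding byte values show what it used.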
Borja Heriz
on 17 Jan 2022
1 comment
Rik
on 17 Jan 2022
This is an answer, but it looks like a comment. Please use the comment sections to post comments. The order of answers can change, which will make reading back confusing.
Please post this as a comment and delete the answer.
When you do, I (or Walter) will post something along these lines:
Why do you think 153 and 156 are encoded as the same character? They are displayed as the same character, but that is probably due to a limitation in the display, as this could very well encode a control character without a proper symbol.
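One way to check whether two values really are the same character, and not just rendered alike, is to compare their numeric code points instead of what the Command Window displays. The sketch below assumes that 153 and 156 are byte values passed to native2unicode; the comment being replied to is not shown here, so that is a guess, and the variable names are illustrative.

```matlab
% Assumption: 153 and 156 are raw byte values decoded as windows-1252.
bytes = uint8([153 156]);
chars = native2unicode(bytes, 'windows-1252');

codes = double(chars)          % numeric code points, independent of the display font
same  = (codes(1) == codes(2)) % true only if both bytes decode to the same code point
```

Comparing the numbers sidesteps any limitation of the display, which may show different code points with the same placeholder glyph.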