Issue with native2unicode and windows-1252 encoding
Hi all,
I'm trying to encode some bytes into a character set using the windows-1252 encoding and I've checked that native2unicode
1 comment
Rik
on 14 Jan 2022
Most of your question seems to be missing.
Answers (3)
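source = char(0:511)                                   % code points U+0000 to U+01FF
bytes = unicode2native(source, 'windows-1252')         % encode each character as windows-1252 bytes
backport = char(bytes)                                 % reinterpret each byte value as a code point
whichdiffer = find(source(1:256) ~= backport(1:256) )  % indices where the value changed
source(whichdiffer)
bytes(whichdiffer)
backport(whichdiffer)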
What this tells us is that Unicode code points 129 to 141 have no representation in windows-1252.
bytes2 = uint8(129:141)
encodes_as = native2unicode(bytes2, 'windows-1252')
double(encodes_as)
Looks about right.
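The check above starts from characters. A complementary sketch, assuming you would rather start from byte values, round-trips every possible byte through the encoding. The variable names (allBytes, decoded, reencoded, remapped, lost) are my own and not from the original question, and I have not listed the output because it can depend on the conversion tables your MATLAB release uses.

```matlab
% Round-trip every possible byte value through windows-1252.
allBytes  = uint8(0:255);
decoded   = native2unicode(allBytes, 'windows-1252');  % bytes -> characters
reencoded = unicode2native(decoded, 'windows-1252');   % characters -> bytes

% Bytes whose decoded code point differs numerically from the byte value.
remapped = allBytes(double(decoded) ~= double(allBytes));

% Bytes that do not come back unchanged after decode followed by encode.
lost = allBytes(reencoded ~= allBytes);

disp(remapped)
disp(lost)
```

Anything listed in remapped is a byte that windows-1252 assigns to a code point other than its own numeric value, which is why char(bytes) and native2unicode(bytes, 'windows-1252') are not interchangeable in that range.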
2 comments
Borja Heriz
on 17 Jan 2022
Walter Roberson
on 17 Jan 2022
Code point 26 (the SUB control character, 0x1A) is the standard value substituted for code points that cannot be represented in the target encoding.
https://en.m.wikipedia.org/wiki/Substitute_character
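A minimal sketch of how you could check the substitution yourself. The choice of U+4E2D (a CJK character) is only my example of something with no windows-1252 equivalent, and the variable names are hypothetical. I am assuming the encoder emits one byte per unrepresentable character; whether that byte equals 26 is what the isSub test reports.

```matlab
% A character that windows-1252 cannot represent (illustrative choice).
unrepresentable = char(hex2dec('4E2D'));
subByte = unicode2native(unrepresentable, 'windows-1252');

% True if the encoder replaced the character with SUB (decimal 26).
isSub = all(subByte == 26);

% For contrast, the euro sign U+20AC does have a windows-1252 byte.
euroByte = unicode2native(char(8364), 'windows-1252');

disp(double(subByte))
disp(isSub)
disp(double(euroByte))
```

Because every unrepresentable character collapses to the same substitute byte, the original text cannot be recovered from the encoded bytes afterwards.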
Borja Heriz
on 17 Jan 2022
1 comment
Rik
on 17 Jan 2022
This is an answer, but it looks like a comment. Please use the comment sections to post comments. The order of answers can change, which will make reading back confusing.
Please post this as a comment and delete the answer.
When you do, I (or Walter) will post something along these lines:
Why do you think 153 and 156 are encoded as the same character? They are displayed as the same character, but that is probably due to a limitation in the display, as this could very well encode a control character without a proper symbol.
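One way to tell whether two results are really the same character is to compare their numeric code points instead of how they are displayed. This is a sketch only, assuming that 153 and 156 are byte values decoded with native2unicode. If they are Unicode code points instead, apply the same comparison to the output of unicode2native. The variable names are mine.

```matlab
% Decode the two byte values and look at the numbers, not the glyphs.
chars = native2unicode(uint8([153 156]), 'windows-1252');
codePoints = double(chars);

% True only if both bytes decode to the same code point.
sameCharacter = (codePoints(1) == codePoints(2));

% Print each code point in the usual U+XXXX notation.
fprintf('U+%04X\n', codePoints);
disp(sameCharacter)
```

In the standard windows-1252 table these two bytes are the trade mark sign and the lowercase oe ligature, so if both show up as the same box or blank in the Command Window, that points to the display font rather than the encoding.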