£ is still just a byte

johannes1234321 · on June 2, 2023

When using latin-1/latin-9/iso-8859-1/iso-8859-15/cp1252 that statement is true. With utf-8 it is two bytes (c2 a3); if software uses utf-16 or ucs-2 it is also two bytes (00 a3), and with utf-32 it is four.
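A quick way to check the byte counts is to encode the character and look at the result. This is only a sketch using Python's built-in codecs; the `ENCODINGS` list and the `encoded_lengths` helper are names made up for illustration, and the `-le` variants are chosen so that no byte order mark is added to the count.

```python
# Encode the pound sign (U+00A3) under several encodings and compare sizes.
POUND = "\u00a3"

# Illustrative selection; "-le" variants avoid a BOM being prepended.
ENCODINGS = ["latin-1", "iso-8859-15", "cp1252", "utf-8", "utf-16-le", "utf-32-le"]

def encoded_lengths(ch=POUND):
    """Map each encoding name to the number of bytes it uses for ch."""
    return {name: len(ch.encode(name)) for name in ENCODINGS}

for name, size in encoded_lengths().items():
    # Print the raw bytes as hex next to the length.
    print(f"{name:12} {POUND.encode(name).hex()}  ({size} bytes)")
```

The single-byte code pages all give `a3`, UTF-8 gives `c2a3`, and the wider encodings pad with zero bytes.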

edent · on June 2, 2023

And yet it is reasonably common to see "Â£" when the UTF-8 is misinterpreted.
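For anyone wondering where the extra character comes from, here is a minimal sketch that reproduces it: encode as UTF-8, then decode the same bytes with a single-byte code page. The helper name `misread_utf8_as` is hypothetical, and cp1252 is just one plausible wrong guess (latin-1 gives the same result for this character).

```python
def misread_utf8_as(text, wrong_encoding="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)

# U+00A3 becomes the bytes c2 a3; each byte is then read as its own character.
garbled = misread_utf8_as("\u00a3")
print([hex(ord(c)) for c in garbled])  # ['0xc2', '0xa3']

# As long as nothing was lost, reversing the two steps recovers the original.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == "\u00a3")  # True
```

Byte c2 is "Â" in those code pages and a3 is "£", which is exactly the pair that shows up.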

josefx · on June 2, 2023

Not in any modern encoding and certainly not in ASCII either. Having the highest order bit set makes that kind of problematic.
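To make the high-bit point concrete, here is a small sketch. The helpers `is_ascii_byte` and `decodes_as_utf8` are names invented for this example; the only assumption is that the "single byte" in question is 0xA3, the value used by the latin-1 family.

```python
BYTE = 0xA3  # the latin-1 value for the pound sign

def is_ascii_byte(b):
    """ASCII only covers 0x00-0x7F, i.e. the top bit must be clear."""
    return b < 0x80

def decodes_as_utf8(raw):
    """Return True if raw is a valid UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(is_ascii_byte(BYTE))             # False
print(decodes_as_utf8(bytes([BYTE])))  # False: a lone continuation byte
print(decodes_as_utf8(b"\xc2\xa3"))    # True: lead byte plus continuation
```

In UTF-8, a byte with the top bit set is always part of a multi-byte sequence, so a3 on its own is rejected.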

rightbyte · on June 2, 2023

'(u32_pound & 0xff) == u32_pound' happens to be true, yes. That doesn't make it a byte. You need the leading 0s.
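Roughly what I mean, sketched in Python rather than whatever language the variable name suggests. `fits_in_low_byte` is a made-up helper, and the 4-byte big-endian layout is an assumption standing in for a 32-bit code unit. The parentheses around the mask are spelled out because in C `==` binds tighter than `&`.

```python
def fits_in_low_byte(code_point):
    """True if masking to the low 8 bits leaves the value unchanged."""
    return (code_point & 0xFF) == code_point

u32_pound = ord("\u00a3")  # code point U+00A3

print(fits_in_low_byte(u32_pound))         # True
# Stored as a 32-bit unit it still takes four bytes, leading zeros included.
print(u32_pound.to_bytes(4, "big").hex())  # 000000a3
# The euro sign (U+20AC) does not pass the same check.
print(fits_in_low_byte(ord("\u20ac")))     # False
```

The value fitting in 8 bits is a property of the code point, not of how it is stored.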

mjburgess · on June 2, 2023

My mistake

tuukkah · on June 2, 2023

Nope, says UTF-8.