
UCS-2 is a bad choice -- it fails to represent most Unicode characters. If you meant UTF-16, that's also a bad choice, because UTF-16 is also a variable-width encoding, forcing programmers to use some form of "extra-wide char".

I'm of the opinion that wchar_t should become an alias for char32_t.
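To make the UCS-2 vs. UTF-16 point concrete, here is a small sketch (Python, standard library only; the helper name utf16_units is made up for illustration). It lists the 16-bit code units UTF-16 uses for a code point inside the Basic Multilingual Plane and for one outside it. The second needs a surrogate pair, which is exactly what a fixed 16-bit UCS-2 unit cannot hold.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units for text as a list of ints (hypothetical helper)."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))

bmp = utf16_units("A")              # U+0041, fits in one 16-bit unit
astral = utf16_units("\U0001F600")  # U+1F600, needs a surrogate pair

print([hex(u) for u in bmp])
print([hex(u) for u in astral])
```

So a 16-bit wchar_t either drops everything above U+FFFF (UCS-2) or stops being one unit per code point (UTF-16).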



Yes, I meant the 31-bit code point value (more than 16, anyway). It is the most useful width for doing things with wide characters.


UTF-32 is also a variable-width encoding; e.g. 00000044 00000308, aka "D̈".
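A quick way to check this, as a sketch using only Python's standard library (the variable names are mine): the sequence U+0044 U+0308 displays as one character but occupies two UTF-32 code units, and NFC normalization cannot collapse it because Unicode has no precomposed "D with diaeresis".

```python
import unicodedata

text = "D\u0308"  # LATIN CAPITAL LETTER D + COMBINING DIAERESIS

# Each UTF-32 code unit is 4 bytes, one per code point.
utf32_units = len(text.encode("utf-32-le")) // 4

# NFC composes base + mark only when a precomposed code point exists.
nfc_length = len(unicodedata.normalize("NFC", text))

# A non-zero canonical combining class marks U+0308 as a combining character.
second_is_combining = unicodedata.combining(text[1]) != 0

print(utf32_units, nfc_length, second_is_combining)
```

UTF-32 is fixed-width per code point, but not per user-perceived character.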


I thought it was strictly one character per 32-bit code. Anyway, whatever it is called it is what wchar_t should be.


There are no fixed-width encodings with a range of encodable characters anywhere near that of Unicode.


It's too bad Unicode wasn't designed around the concept of easily-recognizable grapheme clusters and "write-only" [non-round-trip] forms that are normalized in various ways. A text layout engine shouldn't have to have detailed knowledge of rules that are constantly subject to change. If there were a standard representation for a Unicode string where all grapheme clusters are marked and everything is listed in left-to-right order, and an OS function was available to convert a Unicode string into such a form, a text layout engine using that OS routine would be able to accommodate future additions to the character set and glyph-joining rules without having to know anything about them.


You can't do that without committing to not supporting pathological text; otherwise you're stuck adding new special cases to the layout engine every update anyway.

I do have some ideas for a better encoding (like, I assume, anyone competent with sufficient free time and interest in text encoding), but there's a lot of reluctance to put effort into something that's already completely eclipsed by a technically inferior but not completely unusable alternative, so I've had it mostly shelved.



