nhmall
2015-01-09 01:43:05 UTC
What are people's thoughts on the best approach to achieving the
following goals in the base nethack code:
- Allow players to use any language to name themselves, their pets, and
their possessions, and have those names used in the game messages.
- Display Unicode characters on the map for various things.
While a lot (the majority?) of projects have chosen UTF-8, with its
variable-length sequences of 8-bit code units, it would also be possible
for nethack to stray from the pack and internally use UTF-32, with
strings of fixed-size 32-bit characters, converting them to UTF-8 only
for input and output. The code is already set up that way.
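To make the "UTF-32 inside, UTF-8 at the edges" idea concrete, here is a rough sketch of what the output conversion could look like. The names utf8_encode and utf32_to_utf8 are made up for illustration and are not existing nethack functions; the sketch also assumes internal strings are terminated by a zero code point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a nethack function): encode one code point as
 * UTF-8 into out, which must have room for 4 bytes. Returns the number of
 * bytes written, or 0 if cp is a surrogate or lies above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0; /* surrogates are not valid scalar values */
    if (cp < 0x10000) {
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char) (0xF0 | (cp >> 18));
        out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: convert a zero-terminated UTF-32 string into a
 * NUL-terminated UTF-8 string in dst. Returns the number of bytes written
 * (not counting the NUL), or (size_t) -1 if the input contains an invalid
 * code point or dst is too small. */
size_t utf32_to_utf8(const uint32_t *src, char *dst, size_t dstsize)
{
    size_t used = 0;

    for (; *src != 0; src++) {
        unsigned char tmp[4];
        size_t i, n = utf8_encode(*src, tmp);

        /* need room for n bytes plus the terminating NUL */
        if (n == 0 || dstsize < n + 1 || used > dstsize - n - 1)
            return (size_t) -1;
        for (i = 0; i < n; i++)
            dst[used++] = (char) tmp[i];
    }
    if (used >= dstsize)
        return (size_t) -1;
    dst[used] = '\0';
    return used;
}
```

With something like this, a name would be held internally as an array of uint32_t and passed through utf32_to_utf8 only when it is about to be written to the terminal or a file. The input direction would need a matching decoder, which is not shown.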
The obvious argument against using UTF-32 internally is the wasted
bytes, as in any other project. On the other hand, the challenge with
UTF-8 might be converting the nethack code to handle variable-length
character strings everywhere.
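As a rough illustration of why variable-length strings touch so much code: with UTF-8, the byte count from strlen() no longer equals the character count, and finding the Nth character means scanning from the start. The helpers below (utf8_length, utf8_index) are hypothetical names, not nethack code, and they assume the input is already valid UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: count code points in a NUL-terminated UTF-8 string
 * by counting every byte that is not a continuation byte (10xxxxxx).
 * Assumes s is valid UTF-8. */
size_t utf8_length(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/* Hypothetical helper: return a pointer to the first byte of the i-th
 * code point (0-based), or NULL if the string has fewer than i + 1 code
 * points. This is a linear scan, unlike s[i] on a fixed-width string. */
const char *utf8_index(const char *s, size_t i)
{
    for (; *s != '\0'; s++) {
        if (((unsigned char) *s & 0xC0) != 0x80) {
            if (i == 0)
                return s;
            i--;
        }
    }
    return NULL;
}
```

Any place in the code that assumes one array element per character, such as truncating a name to a fixed width or indexing into a message buffer, would need to go through helpers along these lines under UTF-8, whereas under UTF-32 plain array indexing still lands on whole code points.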
What do you think?
Some potentially related reading:
http://en.wikipedia.org/wiki/UTF-32
http://en.wikipedia.org/wiki/UTF-8
http://www.joelonsoftware.com/articles/Unicode.html
http://stackoverflow.com/questions/496321/utf8-utf16-and-utf32
http://utf8everywhere.org/
http://programmers.stackexchange.com/questions/102205/should-utf-16-be-considered-harmful