Variable Width Fonts and space efficiency
Disclaimer: I'm not even remotely proficient in Japanese, but I deal with a lot of Japanese text both in my hobbies and at work, so I happen to have some observations on it.
Preface
A recent study I saw making the rounds on social media commented on how the rate of information conveyed over time is fairly similar across spoken languages, despite vastly different approaches (different syllable counts, general rate of speech, etc.). There are some obvious criticisms, of course. For example, a language tends to bias towards specific cultural requirements, so there may be contexts where it excels and, conversely, ones where it falls short... but that aside, it did get me thinking a bit about written information density, especially when it comes to displaying things on fairly low-spec systems (yes, I mean the Gameboy).
I've done a fair amount of reverse engineering on old GB stuff to enable translations from Japanese. I think I'm generally rather slow overall, but I do have a process:
1. Identify what's writing the text out
   - Text is usually represented as a set of indices into a larger tileset
2. Use the knowledge from (1) to figure out where the text itself is stored
   - Intuitively, text is probably being written out as part of a larger scripting system, but we generally don't need all of it to just translate text
   - Sometimes it's embedded in the script itself (like most old GBDK titles), which means we have to figure out the scripts to an extent anyway...
3. (Optionally) Dump the tileset to PNG and allow for rebuilding with it cleanly
   - Sometimes, like with Sakura Wars GB2, the text system already supports an English font, so I just reuse it
4. Extract the text into some common format (usually CSV) and rebuild with it cleanly (text extraction + reinsertion)
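As a rough illustration of the extraction step, here is a minimal Python sketch of decoding tile-index text into CSV. Everything specific in it is an assumption for demonstration: the `CHARMAP` table, the `0xFF` terminator, and the sample bytes are made up, and every real game needs its own mapping and control codes.

```python
import csv
import io

# Hypothetical tile-index -> character table; each real game has its own.
CHARMAP = {0x01: "あ", 0x02: "い", 0x03: "う", 0xF0: "\n"}
TERMINATOR = 0xFF  # assumed end-of-string marker, not taken from any real game


def decode_string(data: bytes, offset: int) -> tuple[str, int]:
    """Decode one terminated string; return (text, offset after the terminator)."""
    chars = []
    while offset < len(data) and data[offset] != TERMINATOR:
        index = data[offset]
        # Keep unknown indices visible so nothing is silently lost on reinsertion.
        chars.append(CHARMAP.get(index, f"<{index:02X}>"))
        offset += 1
    return "".join(chars), offset + 1


def dump_to_csv(data: bytes, start: int, count: int) -> str:
    """Decode `count` consecutive strings into CSV rows of (offset, text)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["offset", "text"])
    offset = start
    for _ in range(count):
        text, next_offset = decode_string(data, offset)
        writer.writerow([f"0x{offset:04X}", text])
        offset = next_offset
    return out.getvalue()


# Made-up sample data: two strings, the second containing an unmapped index.
rom = bytes([0x01, 0x02, 0xFF, 0x03, 0x7E, 0xFF])
print(dump_to_csv(rom, 0, 2))
```

Writing unmapped indices as `<7E>`-style placeholders is one way to keep control codes round-trippable until you figure out what they do.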
Generally after text extraction/insertion, the next big task to overcome is... how do we actually fit English text in the space originally meant for Japanese text? ...The answer is usually "it depends" from game to game and I won't bore you with the details (for now).
Instead, with the above context, I'd rather dig a bit into Japanese text itself and how exactly it compares to the translated English. Japanese text in old games tends to be fairly efficient in both screen space and storage. Since it's New Year's Day and I'm tired from all the festivities, I'll just briefly talk about space efficiency here.
Space Efficiency in Dragon Warrior 3
Randomly, on Cohost this year, a user @Bek0ha (bsky) went and found and recreated a cool font that caught my eye:
Bekoha even went so far as to provide me an 8x8 version of it:
Anyway, I figured I'd see how it would look in Dragon Warrior 3, which I had disassembled and gotten text reinsertion working for a while back for a re-localization project.
From left to right: The original Japanese, the original English, Kinema 8-bit, and Kinema 8-bit narrow
Let's look at these lines (note that things surrounded by brackets are a single tile):
Line 1
*「[・・・][・・・]はあ はあ.
*「ねえ! オルテガさんの 子供が
生まれたんですって!?
[*:] Pant, pant…
Is it true that
Ortega had a baby?
Line 2
*「そうとも! すごい元気な
赤ちゃんだそうだ.
[*:] That['s] right!
I hear the baby['s]
really lively too.
Line 3
*「アリアハンのゆうしゃ オルテガの
子どもなら[・・・]
*「きっと りっぱな
戦士になるぞ!
[*:] Any child of
Ortega, Aliahan['s]
hero, is sure to
become a great
warrior.
Line 4
*「[・・・][・・・]そうよね.
*「さあ 早く 赤ちゃんのかおを
見せてもらいましょう!
[*:] That['s] true.
We should go see
the baby!
(Did you notice how the Japanese actually has enough leeway to indent lines around the asterisk? The English definitely didn't...)
I suspect if you have some common variable-width font for English enabled, the English text actually looks like it takes significantly less space than the fixed-width Japanese:
ねえ! オルテガさんの 子供が 生まれたんですって!?
Is it true that Ortega had a baby?
So let's see some stats as measured on the Gameboy:
Line | JP Character Count | EN Character Count | JP Pixel Width | EN (Original) Pixel Width | EN (Kinema) Pixel Width | EN (Kinema Narrow) Pixel Width |
---|---|---|---|---|---|---|
1 | 10 + 17 + 13 = 40 | 13 + 15 + 18 = 46 | 80 + 136 + 104 = 320 | 104 + 120 + 144 = 368 | 74 + 132 + 68 = 274 | 56 + 124 + 24 = 204 |
2 | 14 + 11 = 25 | 14 + 16 + 18 = 48 | 112 + 88 = 200 | 112 + 128 + 144 = 384 | 88 + 140 + 58 = 286 | 66 + 136 + 16 = 218 |
3 | 18 + 8 + 10 + 9 = 45 | 14 + 16 + 16 + 14 + 8 = 68 | 144 + 64 + 80 + 72 = 360 | 112 + 128 + 128 + 112 + 64 = 544 | 126 + 130 + 108 + 48 = 412 | 140 + 140 + 36 = 316 |
4 | 9 + 16 + 13 = 38 | 13 + 16 + 9 = 38 | 72 + 128 + 104 = 304 | 104 + 128 + 72 = 304 | 80 + 122 + 34 = 236 | 60 + 120 = 180 |
Total | 148 | 200 | 1184 | 1600 | 1208 | 918 |
(The GIFs are direct 2x scale, so I just split the frames and used those to measure the pixels :D)
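For anyone who wants to double-check the totals, here is a small Python sketch that recomputes them from the per-line sums in the table above and compares each English pixel width against the Japanese one. The numbers are copied from the table; the variable names are just mine.

```python
# Per-line sums copied from the table above.
# Columns: JP chars, EN chars, JP px, EN (original) px, EN (Kinema) px, EN (Kinema Narrow) px
rows = {
    1: (40, 46, 320, 368, 274, 204),
    2: (25, 48, 200, 384, 286, 218),
    3: (45, 68, 360, 544, 412, 316),
    4: (38, 38, 304, 304, 236, 180),
}
labels = ("JP chars", "EN chars", "JP", "EN (original)", "EN (Kinema)", "EN (Kinema Narrow)")

# Sum each column across the four lines.
totals = [sum(column) for column in zip(*rows.values())]
jp_chars, en_chars, jp_px, en_px, kinema_px, narrow_px = totals

# Fixed-width text is exactly one 8-pixel-wide tile per character.
assert jp_px == jp_chars * 8
assert en_px == en_chars * 8

print(f"EN uses {en_chars / jp_chars:.2f}x the characters of JP")
for label, px in zip(labels[2:], totals[2:]):
    print(f"{label}: {px} px, {px / jp_px:.0%} of the Japanese width")
```

The fixed-width English comes out at about 135% of the Japanese pixel width, Kinema at about 102%, and Kinema Narrow at about 78%.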
Conclusions
It's not quite apples-to-apples, but English taking 200 characters to convey what Japanese does in 148 is indicative of a larger trend, one that causes a lot of trouble for the actual storage of English text when memory is limited... This is a storage problem though, and not one of actual visual space. If we're willing to spend the computation to render English text dynamically, then a variable-width font lets us fit about as much information in the same visual space (1208 pixels with Kinema versus 1184 for the Japanese), if not more (918 pixels with Kinema Narrow)!
Now if we could just figure out how to do it without needing to draw text dynamically, while maintaining the same text storage efficiency... I suppose that's a deep dive for another time, or research for someone smarter than me to get into. For now, at least, I'll focus on shoving variable-width fonts (VWF) into as many projects as I can.
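To give a feel for what "rendering dynamically" costs, here is a minimal Python sketch of the core of a variable-width font routine: glyphs of different widths get shifted into a shared bitmap and then cut back into 8x8 tiles. It is heavily simplified and not how any particular game does it. The glyph shapes and widths in `GLYPHS` are invented, and it uses 1 bit per pixel where real Gameboy tiles use 2.

```python
TILE = 8  # tiles are 8x8 pixels

# Hypothetical glyphs: (width in pixels, 8 rows of pixel bits that are `width` bits wide).
GLYPHS = {
    "i": (2, [0b10, 0b00, 0b10, 0b10, 0b10, 0b10, 0b10, 0b00]),
    "t": (4, [0b0100, 0b1110, 0b0100, 0b0100, 0b0100, 0b0100, 0b0010, 0b0000]),
    " ": (3, [0] * TILE),
}


def render_vwf(text, glyphs=GLYPHS):
    """Pack glyphs side by side; return (pixel_width, tiles).

    Each tile is a list of 8 row bytes (1bpp, most significant bit = leftmost pixel).
    """
    rows = [0] * TILE  # one arbitrarily wide integer bitmap per pixel row
    width = 0
    for ch in text:
        glyph_width, glyph_rows = glyphs[ch]
        for y in range(TILE):
            # Shift what we have so far to the left and append the new glyph's row.
            rows[y] = (rows[y] << glyph_width) | glyph_rows[y]
        width += glyph_width

    # Pad the right edge with blank pixels up to a whole number of tiles.
    tile_count = -(-width // TILE)  # ceiling division
    pad = tile_count * TILE - width
    rows = [row << pad for row in rows]

    # Cut the wide bitmap into 8-pixel-wide tiles, left to right.
    tiles = []
    for t in range(tile_count):
        shift = (tile_count - 1 - t) * TILE
        tiles.append([(row >> shift) & 0xFF for row in rows])
    return width, tiles


width, tiles = render_vwf("it it")
print(f"{width} px in {len(tiles)} tiles; fixed-width would need {len('it it')} tiles")
```

With these made-up widths, "it it" is 15 pixels wide and fits in two tiles, where fixed-width text would need five. The catch is that the tiles have to be generated at runtime instead of simply being looked up from a static tileset.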
Some final personal notes
One of my New Year's resolutions was to write a bit more and to get generally better at articulating my thoughts. It has a few other nice benefits too...
- I finally get to live out the HTML/CSS dream I missed because I scoffed at MySpace and personal blogs in my teenage years
- It's quite therapeutic to put things to paper and/or keyboard
- Writing things down in this format really lets me dig into topics I wouldn't really spend as much time on, especially with things like gathering stats!
Also, since I'm starting to forget more things recently, I figured it might be a good time to start really investing in archiving whatever knowledge I've accumulated... if for no one else's benefit but my own in the future.
So that being said, thanks for reading!