Score Keeping on the NES

Sometimes I feel like I should put a content warning here when the technical level of a post is higher than usual. This one would probably be a five out of five for geekery. It’s a video from NESHacker on counting score on the Nintendo Entertainment System. But I don’t want to discourage you from watching it! It’s nine minutes long, and it contains a definition of the term double dabble.

Human-readable numbers are tracked by computers in a number of different ways. Nowadays we basically just do a printf or some version of it, but on a 1 megahertz platform, optimization really matters. It’s easy to think of computers as being impossibly fast, but in truth speed only ever counts relative to the efficiency of the algorithm you use. Computers are fast, but they aren’t all that fast.

One of the big tradeoffs in processor design is, fewer complex instructions that do a lot but take a lot of cycles, and processor complexity, to execute, or many simple instructions, each doing little and being relatively simple, and not needing a complex processor design to implement.

The 6502 microprocessor generally follows the latter design philosophy. It made some important tradeoffs to keep costs down. For example, it doesn’t have hardware that can multiply arbitrary numbers together. It relies on the programmer, or else a library author, to use the instructions given to code their own multiplication algorithm, if they need one. The result is going to be slower, probably, that if the chip had the circuits to do this automatically in silicon, but it reduced the cost of the chip, basically allowing more to be made, or else increasing the profits for the manufacturer.

Personally I’m a fan of just storing the score as a series of digits that match up to their positions in the character set. Gain 1,000 points? Just bump the 1000s-place up by one, and if it goes past 9, subtract 10 and bump the 10,000s place. That’s a tried-and-true system that many games use, and works well if all you ever have to do is add numbers. Comparing values, like for detecting extra life award levels, make things slightly more complex, but not by much. There’s sometimes other factors involved though, and that may explain why Super Mario Bros. uses different systems for its counters, as explained by NESHacker.

Computing Pi on a NES

Today’s link is to a madperson who explains how to compute digits of pi on a NES’s 6502 to an arbitrary length. As you do. Along the way it explains how to multiply and divide in binary on a processor without hardware support. It’s around nine minutes long, but if you want a machine to get to the end of pi it’ll probably take a tad bit longer.

We link to such a variety of things here. Sometimes we post light videos where someone has Kirby do funny things. Sometimes we show explainers that explain how to do arithmetic on old processors. I presume that you’ll take from these what you want, and leave the rest to the crazy people, by definition the people who are not you. I understand.

Kaze Emanuar’s Adventures in Mario 64 Optimization: Calculating Sine

I’ve mentioned Kaze Emanuar’s efforts to make the best Mario 64 there can possibly be on its native hardware. He’s compiled it with optimization flags turned on, made its platforming engine much more efficient, and worked hard to minimize cache misses, which was a major source of slowdowns in the game’s code. Under his efforts, he’s gotten the engine running at 60fps (although not yet in a playable version of the original). While these optimizations are not the kind of thing that can keep being found indefinitely, he’s bound to run out of ways to tune up the code, currently he’s still finding new ways to speed it up.

I hope you’re ready for some F-U-N (approximation FUNctions)

He made a Youtube video detailing his most recent optimization find: getting the game’s trigonometric functions executing at their speediest. What is interesting is that the Mario 64 code already uses a couple of tricks to get sine and cosine results in a rapid manner: the game only uses 4096 discrete angles of movement direction, and contains a lookup table that covers each of those angles. But it turns out that this optimization is actually a mis-optimization, because the RAM bus hits incurred to read the values into the cache are actually more expensive than just figuring out the values in code on the N64’s hardware!

The video starts out decently comprehensible, but eventually descends into the process of figuring out sine and cosine on the fly, and the virtues of the various ways this can be done, so you can’t be faulted for bailing before the end, possibly at the moment the dreaded words “Taylor series” are mentioned. But it’s a fairly interesting watch until then!