GB Studio, by Chris Maltby, is fairly well-known now, isn’t it? It’s a free and open source solution to fairly easily making Gameboy roms on your own, that are properly termed not romhacks but homebrew. It has its own website and it’s available on itch.io. It was what Grimace’s Birthday, which we linked to last year, was made with.
Now there’s a heavily-modified version of GB Studio, called BB Studio, that produces NES roms in a similar manner! It’s made by Michel Iwaniec, and can be gotten from Github here. It’s recommended that you be familiar with GB Studio first, and to read the list of caveats on the page. Particularly, the NES supports fewer sprites per scanline than the Gameboy hardware does, and runs at a slower clock speed. BB Studio is also “early alpha software,” meaning, it might or might not work well for you at the moment.
While we’re on the topic I should also mention NES Maker, which isn’t free, but it also isn’t “early alpha software,” and at $36 isn’t expensive either, and is custom-built for generating runnable NES games.
Sometimes I feel like I should put a content warning here when the technical level of a post is higher than usual. This one would probably be a five out of five for geekery. It’s a video from NESHacker on counting score on the Nintendo Entertainment System. But I don’t want to discourage you from watching it! It’s nine minutes long, and it contains a definition of the term double dabble.
Human-readable numbers are tracked by computers in a number of different ways. Nowadays we basically just do a printf or some version of it, but on a 1 megahertz platform, optimization really matters. It’s easy to think of computers as being impossibly fast, but in truth speed only ever counts relative to the efficiency of the algorithm you use. Computers are fast, but they aren’t all that fast.
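For the curious, double dabble is a way of turning a binary number into decimal digits using only shifts and small additions, with no division at all. Here’s a minimal Python sketch of the general technique. The function name and the 16-bit default are my own choices for illustration, and I’m not claiming this matches the code shown in the video or used in any particular game.

```python
def double_dabble(value, bits=16):
    """Convert an unsigned integer to a list of decimal digits
    (most significant first) using only shifts and add-3 corrections."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given bit width")
    # Enough decimal digits to hold the largest value of this width.
    num_digits = len(str((1 << bits) - 1))
    digits = [0] * num_digits  # digits[0] is the most significant digit

    for i in range(bits - 1, -1, -1):
        # "Dabble": any digit of 5 or more gets 3 added before the shift,
        # so that doubling it carries correctly into the next digit.
        digits = [d + 3 if d >= 5 else d for d in digits]
        # "Double": shift the whole digit register left by one bit,
        # bringing in the next bit of the binary value at the bottom.
        carry = (value >> i) & 1
        for j in range(num_digits - 1, -1, -1):
            shifted = (digits[j] << 1) | carry
            carry = shifted >> 4       # bit that spills out of this digit
            digits[j] = shifted & 0xF
    return digits


print(double_dabble(255, bits=8))
```

Each digit lives in four bits, which is why a digit of 5 or more needs the nudge: doubled, it would no longer fit in a single decimal place.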
One of the big tradeoffs in processor design is between having a few complex instructions, each of which does a lot but takes many cycles and a complicated processor to execute, and having many simple instructions, each of which does little but needs only a simple processor design to implement.
The 6502 microprocessor generally follows the latter design philosophy. It made some important tradeoffs to keep costs down. For example, it doesn’t have hardware that can multiply arbitrary numbers together. It relies on the programmer, or else a library author, to use the instructions given to code their own multiplication algorithm, if they need one. The result is going to be slower, probably, than if the chip had the circuits to do this automatically in silicon, but it reduced the cost of the chip, basically allowing more to be made, or else increasing the profits for the manufacturer.
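To give an idea of what “code their own multiplication algorithm” means, here’s a sketch of the classic shift-and-add approach, written in Python instead of assembly so it’s easier to read. It uses only the kinds of operations the 6502 does have (shifts, adds, and bit tests). The function name is made up for this example, and real 6502 routines vary a lot in how they arrange the details.

```python
def multiply_8bit(a, b):
    """Multiply two unsigned 8-bit values using only shifts and adds,
    returning the 16-bit product."""
    if not (0 <= a <= 0xFF and 0 <= b <= 0xFF):
        raise ValueError("operands must fit in one byte")
    product = 0
    multiplicand = a
    for _ in range(8):           # one pass per bit of the multiplier
        if b & 1:                # lowest bit set? add the shifted value
            product += multiplicand
        multiplicand <<= 1       # next bit is worth twice as much
        b >>= 1                  # move on to the next multiplier bit
    return product & 0xFFFF


print(multiply_8bit(13, 11))
```

It’s long multiplication as taught in school, just in base 2, where every “times one digit” step is either adding the number or skipping it.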
Personally I’m a fan of just storing the score as a series of digits that match up to their positions in the character set. Gain 1,000 points? Just bump the 1000s-place up by one, and if it goes past 9, subtract 10 and bump the 10,000s place. That’s a tried-and-true system that many games use, and works well if all you ever have to do is add numbers. Comparing values, like for detecting extra life award levels, makes things slightly more complex, but not by much. There are sometimes other factors involved though, and that may explain why Super Mario Bros. uses different systems for its counters, as explained by NESHacker.
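Here’s roughly what I mean by the digit-per-place approach, as a small Python sketch. The function names and the six-digit score are assumptions for illustration, not taken from any specific game. Digits are stored ones place first, which makes the carry loop simple.

```python
def add_to_score(digits, place, amount=1):
    """Add `amount` (0-9) to one decimal place of the score.
    `digits` is a list of ints 0-9, with index 0 as the ones place."""
    if not 0 <= amount <= 9:
        raise ValueError("amount must be a single digit")
    digits[place] += amount
    # Ripple any carry upward: past 9 means subtract 10, bump the next place.
    while place < len(digits) and digits[place] > 9:
        digits[place] -= 10
        if place + 1 < len(digits):
            digits[place + 1] += 1   # a carry out of the top digit is dropped
        place += 1
    return digits


def score_at_least(digits, threshold):
    """Compare two equal-length digit lists, most significant digit first."""
    for d, t in zip(reversed(digits), reversed(threshold)):
        if d != t:
            return d > t
    return True


def score_text(digits):
    """Digits in display order, as they would appear on screen."""
    return "".join(str(d) for d in reversed(digits))


score = [0, 0, 9, 9, 0, 0]          # 009900
add_to_score(score, 3)              # gain 1,000 points
print(score_text(score))
```

The comparison is the “slightly more complex” part: you walk down from the highest place and stop at the first digit that differs.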
Today’s link is to a madperson who explains how to compute digits of pi on a NES’s 6502 to an arbitrary length. As you do. Along the way it explains how to multiply and divide in binary on a processor without hardware support. It’s around nine minutes long, but if you want a machine to get to the end of pi it’ll probably take a tad bit longer.
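Division without hardware support works a lot like long division on paper. Below is a minimal Python sketch of the shift-and-subtract method (sometimes called restoring division), using only shifts, compares and subtractions. The function name and the 16-bit width are my own assumptions, and I’m not claiming this is the exact routine the video uses.

```python
def divide_16bit(dividend, divisor):
    """Divide two unsigned 16-bit values using shifts and subtraction.
    Returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    if not (0 <= dividend <= 0xFFFF and 0 < divisor <= 0xFFFF):
        raise ValueError("operands must fit in 16 bits")
    quotient = 0
    remainder = 0
    for i in range(15, -1, -1):
        # Bring down the next bit of the dividend, highest bit first.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        # If the divisor fits, subtract it and record a 1 in the quotient.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


print(divide_16bit(1000, 7))
```

In binary each quotient digit is only ever 0 or 1, so “how many times does it go in?” becomes a single compare.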
We link to such a variety of things here. Sometimes we post light videos where someone has Kirby do funny things. Sometimes we show explainers that explain how to do arithmetic on old processors. I presume that you’ll take from these what you want, and leave the rest to the crazy people, by definition the people who are not you. I understand.
It’s only two episodes in, but this series from the Youtube channel What’s Ken Making is already really interesting, with episodes averaging at around 16 minutes each. The first part is titled “The Design of a Legend,” which doesn’t really grab me much, but the second is about the main processor, “The 6502 CPU,” which Ken admits near the start isn’t exactly accurate. The Famicom/NES’s processor isn’t precisely a MOS 6502; it’s a Ricoh 2A03 in NTSC territories, and a 2A07 in others. The 2A03 is licensed from MOS, but lacks the original’s Binary-Coded Decimal mode, and includes the Famicom/NES’s sound hardware on-die.
Episode 1 (15 minutes):
Episode 2 (17 minutes):
That removed BCD feature. Why? The video notes that the circuits are right there within the chip, but have been disabled by having five necessary traces severed. The video notes that the 6502’s BCD functionality was actually patented by MOS, and asks, was the feature disabled because of patent issues? Was Ricoh trying to avoid paying royalties?
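If you haven’t met Binary-Coded Decimal before: it stores one decimal digit in each four bits, so the byte $42 means forty-two. Here’s a small Python sketch of what adding two such bytes involves. It’s only meant to show the arithmetic, assuming both inputs are valid BCD; it isn’t a model of the 6502’s actual circuit or of how its flags behave.

```python
def bcd_add(a, b, carry_in=0):
    """Add two packed-BCD bytes (two decimal digits each).
    Returns (result_byte, carry_out). Assumes valid BCD inputs."""
    low = (a & 0x0F) + (b & 0x0F) + carry_in
    half_carry = 0
    if low > 9:                  # ones digit overflowed past 9
        low -= 10
        half_carry = 1
    high = (a >> 4) + (b >> 4) + half_carry
    carry_out = 0
    if high > 9:                 # tens digit overflowed past 9
        high -= 10
        carry_out = 1
    return (high << 4) | low, carry_out


result, carry = bcd_add(0x19, 0x28)
print(hex(result), carry)
```

Without a decimal mode, a program has to do these corrections itself, or avoid BCD entirely.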
EDIT: I got the name of the chip wrong, as xot pointed out in a comment. I knew the right one but I always get it mixed up. Corrections have been made, here is xot’s comment:
“The 65C02 is a low-power CMOS variant of the venerable 8-bit 6502 with minimal extra abilities. The 6502 successor used in the Apple IIGS is the 16-bit 65C816. It was designed by Western Design Center in collaboration with Apple, Inc. The story that Steve Jobs held back the IIGS in favor of the Mac is popular because it perpetuates Jobs’ mythic status of being a petty, conniving villain … but it isn’t true. The Apple IIGS was created atop a heap of questionable design decisions. No one decision doomed it but its CPU absolutely held it back. The very boring truth is that WDC could not reliably supply ‘816 processors at the speeds they promised (up to 14 MHz). The IIGS is limited to 2.8 MHz because Apple needed a stable product, which unfortunately was way slower than it should have been.”
Some of this slightly contradicts what was said in the video, but not that far. Whether Steve Jobs was petty and conniving or not I will leave to the ages, at least for now.
It had Apple’s first color point-and-click interface, and it ran on a 65C816.
It was the Apple IIGS. It was released two years after the original Macintosh, three after the Lisa, and it worked surprisingly well. It came with 256KB of memory stock but could be gotten with a whole megabyte, and could be expanded up to 8 MB–in 1986! It supported hard drives and devices could be attached to it via the Apple Desktop Bus. It ran at less than 3 MHz, but its processor was capable of going much faster, with the rumor being that it was a decision of Steve Jobs to limit its processor so it wouldn’t steal the Macintosh’s thunder. (Jobs had been forced out of the company by the time the GS was released, but these decisions are not so easily reversed?)
What’s more, the Apple IIGS was made to compete with the Amiga, and so it had considerable audio-visual advantages over the black-and-white Macintosh: 4096 colors and a sound chip designed by the people who had created the SID. And while it had a mode that made it compatible with Apple II software, it used an OS that looked and worked a whole lot like a Macintosh. It was surprisingly capable as a gaming machine; it took a long time, but in 1997 an Apple IIGS version of Wolfenstein 3D was made, although running at a pretty low frame rate:
The 65C816, a 16-bit version of the classic 6502, was used in a number of platforms but ultimately didn’t have the reach of its predecessor. But if Apple had thrown more weight behind the GS, we could well be living in a world where 6502 variants still saw use outside of embedded and hobbyist systems, instead of the Intel and ARM chips that dominate the market today.
I’m thinking along these lines because Vintage Geek made a video about the GS’s virtues, and it’s interesting to speculate about. It really was a kind of wonder machine, and the last gasp of the Apple II line. Here it is (15 minutes):
This is a 52-minute talk from 2010, from the 27th Chaos Communication Congress in Berlin, Germany (the talk is in English), presented by Michael Steil of Visual 6502, which successfully reverse engineered the venerable 6502 microprocessor, a chip used, in one capacity or another and in one form or another, in all the 8-bit Apple, Commodore and Atari microcomputers, the BBC Micro, the Atari 5200, in a modified form in the Atari 2600 and the NES, and in countless arcade games, as well as in other places.
The talk is intended for a technical audience… literally. When the speaker asks who in the audience has coded in assembly before, practically everyone raises their hands. It’s recognized that we at Set Side B veer wildly between the most surface-level populist material and in-depth treatments for those with gigantic capacities for technical discussion and the attention span of a Galapagos Giant Tortoise. We like to think this is charming, and will listen eagerly if you tell us that you agree.
Anyway, here is that talk. I already mentioned that it’s 52 minutes. If that’s too long, there’s a speed-up function on Youtube. If that’s too technical, well, I don’t know how to help there. Maybe a read through pagetable.com’s documentation on the 6502. Oops! I’ve made it worse, haven’t I? Well, if you like, you might console yourself that the 6502 is really a simple processor to learn to code in. I’ve done it myself! There’s no memory management, there are only three general-purpose registers, the stack is fixed in place, and all opcodes are one byte. It’s so simple that an extremely motivated child could learn it. Guess how I know?
Here’s another of those deep-dive NES internal videos from Behind the Code, possibly the most complex one they’ve done to date. Most game engines, when you examine their basic logic, are basically physics simulations, with some AI included to determine how actors behave.
Not so with the Punch-Out!! games. They are essentially entirely different kinds of games from that. You have certain things you can do moment to moment, and opposing boxers do too. Each of those opponents basically runs a big script, made out of byte code, that determines their behavior throughout each round of each fight. I am struck both by the simplicity (no need to simulate gravity) and the complexity (boxers take all kinds of things into account) of the system.
One of the interesting things shown is that the engine can affect more than just the boxers, but can also subtly affect the crowd, which explains the previously-revealed fact that a specific camera person in the crowd uses his flash right at the moment the player must counter Bald Bull’s charge move. It turns out that this isn’t the only instance of this happening in the game!
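To make the idea of a boxer “running a script made out of byte code” a little more concrete, here’s a toy interpreter in Python. I want to be loud about this: every opcode, operand and name below is invented for illustration. It is not Punch-Out!!’s real script format, just a sketch of the general shape such a system can take.

```python
# Hypothetical opcodes, invented for this example only.
OP_WAIT, OP_POSE, OP_PUNCH, OP_FLASH, OP_JUMP_IF_HIT, OP_END = range(6)

NAMES = {OP_WAIT: "wait", OP_POSE: "pose", OP_PUNCH: "punch", OP_FLASH: "flash"}


def run_script(script, player_was_hit=False, max_steps=100):
    """Run a tiny boxer script and return the list of events it produced.
    Every opcode except OP_END is followed by one operand byte."""
    events = []
    pc = 0                               # position within the script
    for _ in range(max_steps):
        op = script[pc]
        if op == OP_END:
            return events
        arg = script[pc + 1]
        pc += 2
        if op == OP_JUMP_IF_HIT:
            if player_was_hit:
                pc = arg                 # branch to another part of the script
        elif op in NAMES:
            events.append((NAMES[op], arg))
        else:
            raise ValueError(f"unknown opcode {op}")
    raise RuntimeError("script did not finish")


script = bytes([
    OP_POSE, 2,          # offset 0: wind up
    OP_FLASH, 0,         # offset 2: crowd member 0 fires a camera flash
    OP_PUNCH, 1,         # offset 4: throw the punch
    OP_JUMP_IF_HIT, 10,  # offset 6: skip the pause if the punch landed
    OP_WAIT, 30,         # offset 8: stand open for 30 frames
    OP_POSE, 0,          # offset 10: back to the idle pose
    OP_END,              # offset 12
])

print(run_script(script))
```

Notice there’s no physics anywhere in it. The script just says what happens next, and a crowd flash is one more instruction in the list.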
You don’t need to know 6502 assembly code to get what the narrator is talking about, but a lot of code is shown, so those of you who understand it may get a bit more out of it. Here are a few basics to help you follow along.
The 6502 has only three registers (bits of memory internal to the CPU that can be accessed quickly), the Accumulator (sometimes called just A), the X register, and the Y register. Each is only one byte long. The Accumulator is by far the most flexible, but all three are general-purpose registers. The most common instructions are Loads (LDA, LDX, LDY), Stores (STA, STX, STY), Transfers between registers (TAX, TAY), Incrementing and Decrementing (INX, INY, DEX, DEY), Adding (ADC), Subtracting (SBC), Comparing (CMP), Branches (some of them, Branch Not-Equal to Zero: BNE, Branch Equal to Zero: BEQ, Branch on Carry Set: BCS, Branch on Carry Clear: BCC), Jump (JMP), Jump to Subroutine (JSR), and Return from Subroutine (RTS). While some instructions are just one byte long, the longest any 6502 instruction can be is three bytes, and the opcode (the command itself) is always just one.
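If it helps to see the one-to-three-byte thing in action, here’s a short Python sketch that walks through a few bytes of machine code and splits them into instructions. The opcode values in the table are real 6502 encodings, but the table only covers the handful used here, and the little program and the function name are made up for this example.

```python
# opcode byte -> (mnemonic and addressing mode, total length in bytes)
# Only the few opcodes used below; a full table has many more entries.
OPCODES = {
    0xA2: ("LDX #imm", 2),   # load X with a constant
    0xCA: ("DEX", 1),        # decrement X
    0xD0: ("BNE rel", 2),    # branch if the last result was not zero
    0xAD: ("LDA abs", 3),    # load A from a 16-bit address
    0x8D: ("STA abs", 3),    # store A to a 16-bit address
    0x60: ("RTS", 1),        # return from subroutine
}


def split_instructions(code):
    """Return (offset, mnemonic, bytes-as-hex) for each instruction."""
    result = []
    pc = 0
    while pc < len(code):
        mnemonic, length = OPCODES[code[pc]]
        result.append((pc, mnemonic, code[pc:pc + length].hex(" ")))
        pc += length             # the opcode alone says how far to skip
    return result


program = bytes([
    0xA2, 0x0A,        # LDX #$0A
    0xCA,              # DEX
    0xD0, 0xFD,        # BNE back to the DEX (offset -3)
    0xAD, 0x00, 0x02,  # LDA $0200 (low byte of the address comes first)
    0x8D, 0x01, 0x02,  # STA $0201
    0x60,              # RTS
])

for line in split_instructions(program):
    print(line)
```

Because the first byte of every instruction tells you how many operand bytes follow, you can step through code like this without understanding what any of it does.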
(I wrote all of that from memory. I figured, I have all of this in my head from my coding youth, I might as well use some of it.)
The 6502 can only address 64K of memory, so often systems will use bank switching to connect various memories to it within that space. The great majority of NES/Famicom games had to do this. Punch-Out!! was unique on the NES in that it was the only game to use Nintendo’s MMC2 chip. (I wonder if the chip was designed ahead of time, and they made this game as an excuse to use it?) Punch-Out!! uses MMC2 to bank in each boxer’s large data script as needed.
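Bank switching is easier to picture with a toy model. The Python sketch below has one window of addresses whose contents can be swapped and one that always shows the same bank. The window addresses, bank size and names are all assumptions chosen for the example; this is a generic illustration, not a description of how MMC2 is actually wired or controlled.

```python
class BankedCartridge:
    """Toy mapper: one switchable 8 KB window plus one fixed 8 KB window.
    The layout is hypothetical and chosen only for illustration."""

    BANK_SIZE = 0x2000  # 8 KB per bank

    def __init__(self, banks):
        if any(len(bank) != self.BANK_SIZE for bank in banks):
            raise ValueError("every bank must be exactly BANK_SIZE bytes")
        self.banks = banks
        self.selected = 0

    def select_bank(self, number):
        """Choose which bank appears in the switchable window."""
        self.selected = number % len(self.banks)

    def read(self, address):
        """Read a byte as the CPU would see it at this address."""
        if 0x8000 <= address < 0xA000:       # switchable window
            return self.banks[self.selected][address - 0x8000]
        if 0xA000 <= address < 0xC000:       # fixed window: always last bank
            return self.banks[-1][address - 0xA000]
        raise ValueError("address is outside the cartridge windows")


# Four banks, each filled with its own number so reads are easy to tell apart.
cart = BankedCartridge([bytes([i]) * 0x2000 for i in range(4)])
print(cart.read(0x8000))   # bank 0 is visible at first
cart.select_bank(2)
print(cart.read(0x8000))   # same address, different data
```

The CPU never sees more than its 64K at once. It just asks the cartridge to change what’s behind one of the windows, which is the kind of thing that lets a game pull in a big chunk of data only when it’s needed.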
The Commodore 64 was, for its time, quite a wonder, an inexpensive home computer with 64K of RAM and excellent-for-its-time graphics and sound capabilities. Sadly, it came with one of the more limited versions of Microsoft BASIC out there.
Microsoft BASIC had its strengths, but many of them were not a good match for the C64’s hardware. The C64’s BASIC had no commands to take advantage of any of the machine’s terrific features. To do nearly anything on the machine besides PRINTing and manipulating data, you had to refer to a small number of cryptic-yet-essential commands: POKE for putting values into arbitrary memory addresses, PEEK for reading values out of them, READ and DATA to read in lists of numbers representing machine language routines, and SYS to activate them.
And getting the values to do those things required obtaining and poring over manuals and the venerable C64 Programmer’s Reference Guide. Even then, Microsoft BASIC was notably slow, especially when doing work with numbers, due to its dogged insistence on converting all values, including integers, into floating point before doing any math on them. So while BASIC supported integers, which required less memory to store, using them actually slowed the machine down due to the need to convert to and from floating point whenever an operation needed to be performed on them. This doesn’t even begin to get into the many inefficiencies of being an interpreted language.
Vision BASIC, an upcoming commercial compiled language for the Commodore 64, looks to remedy many of these faults. The above video is a nearly 40-minute explainer and demonstration of the system. It requires the purchase of a memory expansion unit in order to be used on a physical machine, but it can produce executable code that can be run on a stock C64 as it came out of the box.
It’s not free, and at $59 for the basic package it may seem a little high for a system for developing software on a 40-year-old computer, but that price includes the software on floppy disk and a USB drive. It’s certainly capable, and runs much faster than many other compiled languages on the system. It’s definitely something to look into for people looking to make games on the system without digging deep into assembly, and if you have a desire to do that it has a built-in assembler for producing in-line machine code too! It is an intriguing new option for Commodore development.