Improved Means for Achieving Deteriorated Ends

Written on 2012-06-22 17:49:15

Going into this project, I knew almost nothing about emulation. I still know very little. But I was tired of seeing emulators written in C and Java for performance or portability that wound up looking like a big switch statement sooner or later. My 6502 emulator is ANSI CL and should run anywhere that sbcl, clisp, ccl, or any other fine Common Lisp implementation will run. Granted, if a totally new processor architecture comes out, it's probably easier to port a 6502 emulator written in C than the CL compiler hosting cl-6502, but I digress. I wanted to write a fast, high-level emulator. The closest thing I found to an emulator design that I liked was py65, but even that was a bit less abstract than I would've hoped.

I'm still searching for ways to improve the core abstractions that I have (and no doubt there are many hardcore low-level hackers that would be disgusted by my work) but I'm enjoying the process and here are some preliminary thoughts...

Addressing modes!

Addressing modes, addressing modes, addressing modes. This is the biggest difference between assembly language and the languages I use day to day. Hence, it's a crucial abstraction to get correct. For the first time, Common Lisp's notion of "generalized places" has been a real boon. There were two nuances to addressing modes that I found I really needed to account for.

Some modes access CPU registers, others RAM. (This mostly just affects the way I need to use setf.)

Most opcodes use the byte /at/ an address rather than the address, but sometimes an opcode *does* need the address.

To solve these issues and abstract some code patterns, I wrote a macro called defaddress. It defines a method on the CPU that returns the address computed by the mode /and/ generates a setf function that sets either the register in the CPU struct or the byte in memory, depending on a cpu-reg keyword argument to the macro. Finally, to solve the issue that we most often want the byte, opcodes are defined as either being :raw or not (the default). If they're not, instead of passing the mode symbol to be funcalled, we pass a lambda that gets the byte at the address computed by the mode. So far, concerns seem nicely separated. Time will tell if I've hit on the right design here.
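To make the shape of that idea concrete, here is a minimal sketch of what a mode and its setf function might look like when written out by hand. This is not the actual output of defaddress: the struct slots, the `*ram*` layout, and the `byte-getter` helper are all simplified, hypothetical stand-ins.

```lisp
;; Hypothetical, simplified stand-ins; not the real cl-6502 definitions.
(defstruct cpu
  (pc 0 :type (unsigned-byte 16))   ; program counter
  (ar 0 :type (unsigned-byte 8)))   ; accumulator

(defparameter *ram*
  (make-array (expt 2 16) :element-type '(unsigned-byte 8)
                          :initial-element 0))

;; A memory mode: the reader returns the ADDRESS the mode computes.
;; Assumption: the PC already points at the operand byte.
(defun zero-page (cpu)
  (aref *ram* (cpu-pc cpu)))

;; The matching setf function writes the byte AT that address.
(defun (setf zero-page) (new-value cpu)
  (setf (aref *ram* (zero-page cpu)) new-value))

;; A register mode: there is no address, so read and write the register.
(defun accumulator (cpu)
  (cpu-ar cpu))

(defun (setf accumulator) (new-value cpu)
  (setf (cpu-ar cpu) new-value))

;; For non-raw opcodes: wrap a memory mode so the caller receives the
;; byte at the computed address instead of the address itself.
(defun byte-getter (mode)
  (lambda (cpu) (aref *ram* (funcall mode cpu))))
```

With these in place, `(setf (zero-page cpu) 42)` stores into RAM while `(setf (accumulator cpu) 42)` stores into the struct, and the calling code doesn't need to care which is which. That is the part generalized places buy you.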

Opcodes

What cl-6502 calls an opcode is really a mnemonic: a set of opcodes that encode the same language primitive for different addressing modes. As long as the foundation of addressing modes as funcallables is there to support the opcodes, you can write any opcode cleanly with a single body of code shared across all addressing modes. This has made me deeply happy as it seems to me to be _THE RIGHT THING_.

The implementation of defopcode is a bit hairy, particularly the EVAL-WHEN block to make sure metadata about the opcodes gets set in the *opcodes* array at load-time, but the supporting defins macro is fairly clean. The important thing is that opcode definitions wind up looking marvelous. For example, here's ASL and BCC:

    (defopcode asl (:docs "Arithmetic Shift Left" :raw t)
        ((#x06 5 2 'zero-page)
         (#x0a 2 1 'accumulator)
         (#x0e 6 3 'absolute)
         (#x16 6 2 'zero-page-x)
         (#x1e 7 3 'absolute-x))
      (update-flags (funcall mode cpu) '(:carry))
      (let ((result (wrap-byte (ash (funcall mode cpu) 1))))
        (update-flags result)
        (funcall setf-form result)))

    (defopcode bcc (:docs "Branch on Carry Clear" :track-pc nil)
        ((#x90 2 2 'relative))
      (branch-if (lambda () (zerop (status-bit :carry cpu))) cpu))

ASL defines 5 methods here each with a different addressing mode but sharing the same body. They update flags in the status register as expected, increment the program counter properly, and put metadata in *opcodes* to aid with dispatch and disassembly. Not bad, eh?
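For the curious, the "one body, many methods" trick can be sketched in a few lines. This is a toy illustration of the general technique, not the real defopcode: `define-mnemonic`, `run-opcode`, and `*opcode-table*` are hypothetical names, and the sketch skips flag handling and program counter updates entirely.

```lisp
;; Hypothetical metadata table: (name cycles bytes mode) per opcode byte.
(defvar *opcode-table* (make-array 256 :initial-element nil))

(defgeneric run-opcode (opcode cpu)
  (:documentation "Execute the instruction whose opcode byte is OPCODE."))

(defmacro define-mnemonic (name (&rest variants) &body body)
  "Define one EQL-specialized method per (OPCODE CYCLES BYTES MODE) variant.
BODY is shared by every variant and can refer to CPU and MODE."
  `(progn
     ,@(loop for (opcode cycles bytes mode) in variants
             ;; Record metadata for dispatch and disassembly.
             collect `(setf (aref *opcode-table* ,opcode)
                            (list ',name ,cycles ,bytes ,mode))
             ;; Stamp out the shared body, specialized on this opcode byte.
             collect `(defmethod run-opcode ((opcode (eql ,opcode)) cpu)
                        (let ((mode ,mode))
                          (declare (ignorable mode))
                          ,@body)))
     ',name))

;; A toy "instruction" that just reports which mode it was compiled for.
(define-mnemonic toy-op
    ((#x06 5 2 'zero-page)
     (#x0e 6 3 'absolute))
  (list mode cpu))
```

Calling `(run-opcode #x06 some-cpu)` then dispatches straight to the zero-page variant, and `(aref *opcode-table* #x06)` holds the metadata a disassembler would want.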

Objects, Functions, Macros, Whatever!

So far, I've mostly just used objects (read: CLOS) to define some conditions, two core methods on the CPU (step and execute), and the instructions themselves. The CPU itself is defined as a Struct rather than a Class. All the instructions are methods EQL-specialized on the opcode and the opcode alone which should make dispatch pretty speedy. The methods reference the *cpu* global directly and since Common Lisp has /usable global variables/ that look up the most recent binding in the current thread on reference, I should be able to run many instances safely in different threads on a single core. Just do something like...

    ;; Assumes bordeaux-threads (nickname bt); &body stands in for your code.
    (bt:make-thread
     (lambda ()
       (let ((*ram* (make-array (expt 2 16) :element-type '(unsigned-byte 8)))
             (*cpu* (make-cpu)))
         &body)) ;; and we're off to the races!
     :name "foo")
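If the rebinding part seems like magic, here is a thread-free sketch of the same mechanism. The names `*machine*`, `current-machine`, and `with-private-machine` are made up for illustration. The assumption that a let-binding of a special variable is local to the thread that made it holds in the threaded implementations I know of, but threads are outside the ANSI standard, so check your implementation.

```lisp
;; Hypothetical stand-in for the *cpu* global.
(defvar *machine* :global-machine)

(defun current-machine ()
  ;; A reference to a special variable sees the innermost dynamic binding.
  *machine*)

(defun with-private-machine (name)
  ;; The rebinding lasts only for the dynamic extent of this LET;
  ;; the global value is untouched once we return.
  (let ((*machine* name))
    (current-machine)))
```

Here `(with-private-machine :a)` sees `:a` even though `current-machine` never takes an argument, and the global value is back in place afterwards.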

What's next?

I have a bunch of potential ideas for what's next. I want to extend this work towards full NES emulation in the browser. There isn't a formal ordering of priorities or milestones yet (cause this is kind of an art project), but coming up with a sane, RESTful API to sit on top of cl-6502 is probably the next step. Hopefully I'll get some time to hack on that this weekend.

cl-6502: An assembler! Unit tests + bug fixing.

famiclon: An NES emulator backend built on romreader and cl-6502. Video, Sound. Input?

Qeng-Ho: Hunchentoot+ST-JSON, REST API wrapping cl-6502+famiclon, No persistence! Multiple CPUs! Internal private API for now.

Pham-Nuwen: Clojure+Clojurescript@Deepclouds.net. Persistence! Nice web interface. Graphics w/canvas! Etc...

cl-z80: DO THAT SHIT! (how hard could it be? just another 8bit thing. see emu-docs.org)

Unless otherwise credited all material by Brit Butler
