An Emulator Design Pattern

posted on 2012-06-22 17:49:15

Going into this project, I knew almost nothing about emulation. I still know very little. But I was tired of seeing emulators written in C and Java for performance or portability that wound up looking like a big switch statement sooner or later. My 6502 emulator is ANSI CL and should run anywhere that sbcl, clisp, ccl, or any other fine Common Lisp implementation will run. Granted, if a totally new processor architecture comes out, it's probably easier to cross-compile or port a 6502 emulator written in C than the CL compiler hosting cl-6502, but I digress. I wanted to write a fast, high-level emulator. The closest thing I found to an emulator design that I liked was py65, but even that was a bit less abstract than I would've hoped.

I'm still searching for ways to improve the core abstractions that I have (and no doubt there are many hardcore low-level hackers that would be disgusted by my work) but I'm enjoying the process and here are some preliminary thoughts...

Addressing modes!

Addressing modes, addressing modes, addressing modes. This is the biggest difference between assembly language and the languages I use day to day. Hence, it's a crucial abstraction to get correct. For the first time, Common Lisp's notion of "generalized places" has been a real boon. There were two nuances to addressing modes that I found I really needed to account for.

Some modes access CPU registers, others RAM. (This mostly just affects the way I need to use setf.)

Most opcodes use the byte /at/ an address rather than the address, but sometimes an opcode *does* need the address.

To solve these issues and abstract some code patterns, I wrote a macro called defaddress. It defines a method on the CPU that returns the address computed by the mode /and/ generates a setf function that sets either the register in the CPU struct or the byte in memory, depending on a cpu-reg keyword argument to the macro. Finally, since we most often want the byte rather than the address, opcodes are defined as either being :raw or not (the default). If they're not, instead of passing the mode symbol to be funcalled, we pass a lambda that gets the byte at the address computed by the mode. So far, concerns seem nicely separated. Time will tell if I've struck on the right design here.
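Here is a minimal sketch of that idea, not the actual defaddress from cl-6502. Every name in it (toy-cpu, *toy-ram*, def-toy-address, byte-via) is made up for illustration, and the zero-page mode is simplified to "the byte at the program counter is the address". It only shows the shape: one macro call yields both a reader and a matching setf function, and a non-raw opcode goes through a small wrapper that fetches the byte at the computed address.

```lisp
;; Toy sketch only. TOY-CPU, *TOY-RAM*, DEF-TOY-ADDRESS and BYTE-VIA are
;; hypothetical names, not cl-6502's actual definitions.
(defstruct toy-cpu
  (ar 0 :type (unsigned-byte 8))    ; accumulator
  (pc 0 :type (unsigned-byte 16)))  ; program counter

(defvar *toy-ram*
  (make-array (expt 2 16) :element-type '(unsigned-byte 8) :initial-element 0))

(defmacro def-toy-address (name (&key cpu-reg) form)
  "Define NAME as a function of CPU returning FORM, plus a (SETF NAME)
function that writes to the register (when CPU-REG is true) or to RAM."
  `(progn
     (defun ,name (cpu) ,form)
     (defun (setf ,name) (value cpu)
       ,(if cpu-reg
            `(setf ,form value)                            ; FORM is a register place
            `(setf (aref *toy-ram* (,name cpu)) value))))) ; FORM is an address

(def-toy-address toy-accumulator (:cpu-reg t)
  (toy-cpu-ar cpu))

(def-toy-address toy-zero-page ()
  (aref *toy-ram* (toy-cpu-pc cpu)))   ; operand byte at PC is the address

(defun byte-via (mode cpu)
  "Non-raw access: fetch the byte at the address that MODE computes."
  (aref *toy-ram* (funcall mode cpu)))

(let ((cpu (make-toy-cpu :pc 1)))
  (setf (aref *toy-ram* 1) #x10)   ; operand byte: zero-page address #x10
  (setf (toy-zero-page cpu) 42)    ; stores 42 in RAM at #x10
  (setf (toy-accumulator cpu) 7)   ; stores 7 in the accumulator
  (list (toy-zero-page cpu)                 ; the address
        (byte-via #'toy-zero-page cpu)      ; the byte at that address
        (toy-accumulator cpu)))
;; => (16 42 7)
```

The macro deliberately captures the symbols cpu and value, which keeps mode definitions short at the cost of hygiene. That trade-off is a choice of this sketch and says nothing about how cl-6502 handles it.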

Opcodes

Opcodes in cl-6502 are really mnemonics, a set of opcodes that encode the same language primitive but for different addressing modes. As long as the foundation of addressing modes as funcallables is there to support the opcodes, you can write any opcode cleanly with a single body of code shared across all addressing modes. This has made me deeply happy as it seems to me to be _THE RIGHT THING_.

The implementation of defopcode is a bit hairy, particularly the EVAL-WHEN block to make sure metadata about the opcodes gets set in the *opcodes* array at load-time, but the supporting defins macro is fairly clean. The important thing is that opcode definitions wind up looking marvelous. For example, here's ASL and BCC:

    (defopcode asl (:docs "Arithmetic Shift Left" :raw t)
        ((#x06 5 2 'zero-page)
         (#x0a 2 1 'accumulator)
         (#x0e 6 3 'absolute)
         (#x16 6 2 'zero-page-x)
         (#x1e 7 3 'absolute-x))
      (update-flags (funcall mode cpu) '(:carry))
      (let ((result (wrap-byte (ash (funcall mode cpu) 1))))
        (update-flags result)
        (funcall setf-form result)))

    (defopcode bcc (:docs "Branch on Carry Clear" :track-pc nil)
        ((#x90 2 2 'relative))
      (branch-if (lambda () (zerop (status-bit :carry cpu))) cpu))

ASL defines 5 methods here each with a different addressing mode but sharing the same body. They update flags in the status register as expected, increment the program counter properly, and put metadata in *opcodes* to aid with dispatch and disassembly. Not bad, eh?
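To make the "one body, many methods" trick concrete, here is a stripped-down sketch of a defopcode-like macro. It is an assumption-laden toy rather than the real thing: toy-execute, *toy-opcodes*, and def-toy-opcode are hypothetical names, the machine state is just a cons cell, and flag updates and program counter handling are left out entirely. It only demonstrates generating one EQL-specialized method per opcode byte from a shared body while recording metadata in a table.

```lisp
;; Hypothetical sketch; TOY-EXECUTE, *TOY-OPCODES* and DEF-TOY-OPCODE are
;; illustrative names, not cl-6502's real definitions.
(defvar *toy-opcodes* (make-array 256 :initial-element nil)
  "Opcode byte -> (mnemonic cycles size mode), for dispatch and disassembly.")

(defgeneric toy-execute (opcode state)
  (:documentation "Run the instruction identified by the byte OPCODE."))

(defmacro def-toy-opcode (name modes &body body)
  "Define one EQL-specialized method per (code cycles size mode) entry in
MODES, all sharing BODY. BODY may refer to MODE and STATE."
  `(progn
     ,@(loop for (code cycles size mode) in modes
             ;; Record metadata for this opcode byte.
             collect `(setf (aref *toy-opcodes* ,code)
                            '(,name ,cycles ,size ,mode))
             ;; Define the method that dispatches on the byte itself.
             collect `(defmethod toy-execute ((opcode (eql ,code)) state)
                        (let ((mode ',mode))
                          (declare (ignorable mode))
                          ,@body)))
     ',name))

(def-toy-opcode inc-counter
    ((#x01 2 1 by-one)
     (#x02 3 2 by-two))
  ;; STATE here is just a cons cell whose car is a counter.
  (incf (car state) (ecase mode (by-one 1) (by-two 2))))

(let ((state (list 0)))
  (toy-execute #x01 state)
  (toy-execute #x02 state)
  (values (car state) (aref *toy-opcodes* #x02)))
;; => 3, (INC-COUNTER 3 2 BY-TWO)
```

Unlike the defopcode examples above, this sketch takes the mode names unquoted and writes the metadata at load time with a plain setf, so it sidesteps the EVAL-WHEN question altogether.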

Objects, Functions, Macros, Whatever!

So far, I've mostly just used objects (read: CLOS) to define some conditions, two core methods on the CPU (step and execute), and the instructions themselves. The CPU itself is defined as a Struct rather than a Class. All the instructions are methods EQL-specialized on the opcode and the opcode alone which should make dispatch pretty speedy. The methods reference the *cpu* global directly and since Common Lisp has /usable global variables/ that look up the most recent binding in the current thread on reference, I should be able to run many instances safely in different threads on a single core. Just do something like...

    (make-thread (lambda ()
                   ;; Rebinding the specials inside the thread gives it private state.
                   (let ((*ram* (make-array (expt 2 16) :element-type '(unsigned-byte 8)))
                         (*cpu* (make-cpu)))
                     &body)) ;; and we're off to the races!
                 :name "foo")

What's next?

I have a bunch of potential ideas for what's next. I want to extend this work towards full NES emulation in the browser. There isn't a formal ordering of priorities or milestones yet (cause this is kind of an art project), but coming up with a sane, RESTful API to sit on top of cl-6502 is probably the next step. Hopefully I'll get some time to hack on that this weekend.

cl-6502: An assembler! Unit tests + bug fixing.

famiclon: An NES emulator backend built on romreader and cl-6502. Video, Sound. Input?

Qeng-Ho: Hunchentoot+ST-JSON, REST API wrapping cl-6502+famiclon, No persistence! Multiple CPUs! Internal private API for now.

Pham-Nuwen: Clojure+Clojurescript@Deepclouds.net. Persistence! Nice web interface. Graphics w/canvas! Etc...

cl-z80: DO THAT SHIT! (how hard could it be? just another 8bit thing. see emu-docs.org)

On Interactive Retrocomputing

posted on 2012-06-18 04:27:59

Lately, I've been working on an emulator for the MOS 6502 processor in Common Lisp that I've been boring enough to name cl-6502. The emulator is basically finished as is the disassembler. There are also pretty solid docs. An assembler should be added soon and hopefully a helper utility or two and unit tests. But why do this? Well, for a couple of reasons.

1) I never did enough Systems Programming. I never did any assembly in college and wrote an absolute paucity of C (granted, that's my fault). I never learned enough about the inner workings of Operating Systems or how to exploit memory hierarchies. I want to know more about how the machine works at a low level. Writing an emulator in a high-level language isn't a great way to do that but I wanted to anyway. Writing some assembly programs to run on this emulator might help though and I hope to do some of that later.

2) I was curious how concise, extensible, and performant an emulator written in Common Lisp could be. Most emulators are written in C/C++ for performance reasons. There are a few in Java (for portability?) or JavaScript (for what we now call portability), but even these are not terribly high-level from a design perspective. I don't have all the answers to this question yet but I'm excited by some of the work I've done so far. In particular, writing macros for Addressing Modes and Opcodes has been quite helpful and I expect CLOS's :before, :after, and :around methods to go a long way where extensibility is concerned. I'm hoping the EQL-specialized methods, SBCL, and perhaps some shrewd profiling can lead to good performance. Also, 45 lines of code for a 6502 disassembler doesn't seem bad to me. :P

3) ICU64/Frodo Redpill. I'm not wild about static types and I tend to write tests after the fact because I view programs as clay until they're ready to be fired in the kiln. Roly Perera's work on self-explaining computation interests me a lot as does Chris Granger's work on Light Table. Rapid feedback loops are important. Maintaining flow is important. As far as I'm concerned, all emulators should strive towards the sort of "peek/poke the machine" experience that ICU64/Frodo Redpill offers. Games are a very easy way to get people to engage with computers. Everybody likes games. And lots of folks, at some point in wondering about programming, ponder how much is involved in changing something about a game they like. With a system like ICU64/Frodo Redpill they could literally /see/ the answer. Add an integrated editor and you're in pretty interesting territory. Feel like changing something about the "hardware"? Feel like having breakpoints and step debuggers pop up on arbitrary memory accesses or instruction executions? You got it. But ICU64/Frodo Redpill has this all locked down on desktops. Why not do as much as possible with HTML5, <canvas>, and Clojurescript? I'm not going to be the guy to come up with the next Mother of all Demos ... but I hope to make something cool. And if I'm really lucky, I'll have something interesting to play with online by Strange Loop on September 23rd. I've already got a 6502 emulator. What's next?
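As an aside on the extensibility hope in point 2 and the breakpoint idea in point 3, here is a rough sketch of how CLOS auxiliary methods could layer that behavior onto a step function without touching it. Nothing here is taken from cl-6502: toy-machine, toy-step, and *toy-breakpoints* are invented for the example, and a "breakpoint" just records an event rather than opening a debugger.

```lisp
;; Hypothetical sketch; TOY-MACHINE, TOY-STEP and *TOY-BREAKPOINTS* are
;; illustrative names only.
(defstruct toy-machine
  (pc 0)
  (events '()))   ; most recent event first

(defvar *toy-breakpoints* '()
  "Program-counter values that should be reported before executing.")

(defgeneric toy-step (machine)
  (:documentation "Advance MACHINE by one instruction."))

;; Primary method: the emulator core knows nothing about debugging.
(defmethod toy-step ((machine toy-machine))
  (incf (toy-machine-pc machine)))

;; :before method: notice breakpoints ahead of execution.
(defmethod toy-step :before ((machine toy-machine))
  (when (member (toy-machine-pc machine) *toy-breakpoints*)
    (push (list :break (toy-machine-pc machine))
          (toy-machine-events machine))))

;; :around method: wrap the whole step to trace PC changes.
(defmethod toy-step :around ((machine toy-machine))
  (let ((old-pc (toy-machine-pc machine)))
    (prog1 (call-next-method)
      (push (list :stepped old-pc (toy-machine-pc machine))
            (toy-machine-events machine)))))

(let ((*toy-breakpoints* '(1))
      (machine (make-toy-machine)))
  (toy-step machine)
  (toy-step machine)
  (reverse (toy-machine-events machine)))
;; => ((:STEPPED 0 1) (:BREAK 1) (:STEPPED 1 2))
```

Under standard method combination the :around method runs outermost, then the :before method, then the primary method, which is why the break event for address 1 lands between the two step events.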

A tentative Strange Loop 2012 Schedule

posted on 2012-06-17 00:30:57

Strange Loop has posted their schedule for 2012 and my company has been kind enough to send me. Without further ado, here's my current thought on which talks I'll attend. I just can't wait for September. :)

;; Sunday, September 23 (Emerging Languages Preconf)
-- 7:30 flight? ZOMG WHAT WAS I THINKING?!?
09:30 Jeremy Ashkenas - Symbiotic Languages: Transpiling into Javascript
10:30 Ostap Cherkashin - Bandicoot: code reuse for the relational model
11:30 Hakan Raberg - Clever, Classless and Free?
12:40 Michael Fogus - The Reemergence of Datalog
13:20 Brian McKenna - Roy
14:40 David Herman - Rust
15:50 James Noble - Grace: an open source educational OO language
16:30 Jose Valim - Elixir: Modern Programming for the Erlang VM
17:10 David Pollak - Visi: Cultured & Distributed

-- STRAAAAANNNGE LOOOOP
;; Monday, September 24
09:00 Michael Stonebraker - In-Memory Databases
10:00 Dustin Getz - Monad Examples for normal people, in Python and Clojure
11:00 Pieter Hintjens - Software Architecture using ZeroMQ
-- (or Functional Design Patterns - Stuart Sierra)
12:20 Neil Milstead - Augmented Reality and CV
13:00 Craig Kerstiens - Postgres Demystified
14:00 Neha Narula - Executing Queries on a Sharded Database
-- (or Clojurescript by David Nolen)
15:30 Scott Vokes - Data Structures: The Code That Isn't There
-- (or Lessons from Erlang by Garrett Smith, Types vs Testing by Paul Snively and Amanda Laucher) GAAH
16:30 Rich Hickey - The Database as a Value
17:30 Lars Bak - Pushing the Limits of Web Browsers
20:00 Matthew Taylor - Humanity 2.0

;; Tuesday, September 25
09:00 Jeff Hawkins - Computing Like the Brain
10:00 Chris Granger - Behind the Mirror
11:00 Nathan Marz - Runaway complexity in Big Data...and a plan to stop it
12:20 Carlton Mills - Computer Architecture of the 1960s
13:00 Oleg Kiselyov - Guess lazily! Making a program guess and guess well
14:00 Cliff Moon - The Audubon Society for Partial Failures
15:30 Ola Bini - Expressing Abstraction, Abstracting Expression
-- (or Building visual, data-driven UIs with ClojureScript)
16:30 Bret Victor - ?
-- And then I'm out because of my 7:30 flight. Sorry Brendan Eich!

10 Great Hacking Albums

posted on 2012-06-06 18:23:52

My posts on this blog have leaned away from technical content for the last year, tending towards the introspective and music or poetry. I'm hoping to start shifting back in the other direction for a little bit and plan to write about my latest personal hacking project soon. In the interim, here are 10 favorite albums to write code to off the top of my head. I will note at the outset that I prefer ambient and instrumental music for hacking. Ambient stuff in particular seems to naturally encourage a state of "flow" for me. Also, here's a link to some C2 wiki discussion on flow as it relates to programming. Also, as long as we're throwing out great hacking music I might as well shill the mixtapes I've been working on the last two months. :) All three are pretty solid. And there's a one hour extended mix that mashes together I/Omega and Lost Without a Traceback that isn't on soundcloud. Ping me if you're interested. There are some track changeups and much improved transitions in the I./Omega half.

Hackerjams:
Tim Hecker - Harmony in Ultraviolet
Tim Hecker - An Imaginary Country
Tim Hecker - Ravedeath, 1972
Fennesz - Venice
Fennesz - Black Sea
Fuck Buttons - Tarot Sport
Rustie - Glass Swords
Araabmuzik - Electronic Dream
Four Tet - There Is Love In You
Amon Tobin - Supermodified

(Honorable Mentions: Tycho - Dive, Washed Out - Life of Leisure)

If you were to proceed through the above albums in sequence, you'd start with some fantastic downtempo ambient/noise stuff with Hecker and Fennesz, shift into uptempo "noisetronica" with the Fuck Buttons, transition into the over-the-top, in-your-face Rustie, ride those dancey vocals and synths into Araabmuzik, and then start winding down with the more midtempo vocals of There Is Love In You and the outstanding groove of Supermodified.