Written on 2013-07-05 11:44:00
This will be the last post about emulation that doesn't involve graphics or disassembly of old NES games, I promise. cl-6502 0.9.5 is out and, in my testing with SBCL, pretty snappy. The book has received updates and is also available on lulu. Below is the 'Lessons Learned - Common Lisp' chapter:
Structures are much more static than classes. They also enforce their slot types. When you have a solid idea of the layout of your data and really need speed, they're ideal.
CLOS, for single-dispatch at least, is really quite fast. When I redesigned the emulator to avoid a method call for every memory read/write, my benchmark only ran ~10% faster. I eventually chose to stick with the new scheme for several reasons, performance was only a minor factor.
My second big speedup came, indirectly, from changing the arguments to the
opcode lambdas. By having the opcode only take a single argument, the CPU, I
avoided the need to destructure the opcode metadata in
step-cpu. You don't
want to destructure a list in your inner loop, no matter how readable it is!
That is, the times I found myself using it always involved computing data at
compile-time that would be stored or accessed in a later phase. E.g. I used
it to ensure that the status-bit enum was created for use by
*mode-bodies* variable was bound in time for
try to go without it if possible.
DECLAIM is for global declarations and DECLARE is for local ones. Once you've
eked out as many algorithmic gains as possible and figured out your hotspots with
the profiler, recompile your code with
(declaim (optimize speed)) to see what
is keeping the compiler from generating fast code. Letting the compiler know the
FTYPE of your most called functions and inlining a few things can make a big