Lessons from cl-6502
Written on 2013-07-05 11:44:00
This will be the last post about emulation that doesn't involve graphics or disassembly of old NES games, I promise. cl-6502 0.9.5 is out and, in my testing with SBCL, pretty snappy. The book has received updates and is also available on lulu. Below is the 'Lessons Learned - Common Lisp' chapter:
Structures can be preferable to classes
Structures are much more static than classes. They also enforce their slot types. When you have a solid idea of the layout of your data and really need speed, they're ideal.
CLOS is fast enough
CLOS, for single-dispatch at least, is really quite fast. When I redesigned the emulator to avoid a method call for every memory read/write, my benchmark only ran ~10% faster. I eventually chose to stick with the new scheme for several reasons, performance was only a minor factor.
Destructuring is more expensive than you think
My second big speedup came, indirectly, from changing the arguments to the
opcode lambdas. By having the opcode only take a single argument, the CPU, I
avoided the need to destructure the opcode metadata in step-cpu
. You don't
want to destructure a list in your inner loop, no matter how readable it is!
Eval-when is about data more than code
That is, the times I found myself using it always involved computing data at
compile-time that would be stored or accessed in a later phase. E.g. I used
it to ensure that the status-bit enum was created for use by set-flags-if
and
the *mode-bodies*
variable was bound in time for defaddress
. Regardless,
try to go without it if possible.
Use DECLAIM (and DECLARE) wisely
DECLAIM is for global declarations and DECLARE is for local ones. Once you've
eked out as many algorithmic gains as possible and figured out your hotspots with
the profiler, recompile your code with (declaim (optimize speed))
to see what
is keeping the compiler from generating fast code. Letting the compiler know the
FTYPE of your most called functions and inlining a few things can make a big
difference.