JIT part 2

Success!

I have lock bytecode being compiled to native PowerPC code on demand. There's still a lot of work to do (so far this is just a proof of concept), so don't expect to see it in a Penn release any time soon, if ever. Among many other things, I need to do some benchmarking to find out whether there's any benefit to doing this at all.
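A minimal sketch of the compile-on-first-use idea, for readers who want the shape of it before the transcript. The names `boolexp_func` and the `fun`/`key` fields come from the gdb session below; everything else (the struct layout, `jit_compile_stub`, `interpret_stub`, `always_pass`, `compile_count`) is a hypothetical stand-in, not Penn's real code. The interpreter fallback is an assumption, suggested only by the `if (ll->fun)` test at line 832.

```c
#include <stddef.h>

typedef int dbref;
typedef int (*boolexp_func)(dbref player, dbref thing);

/* Hypothetical stand-in for the real lock structure. */
struct lock_list {
    const char *key;   /* stand-in for the lock's bytecode */
    boolexp_func fun;  /* cached native code, NULL until compiled */
};

static int compile_count = 0; /* counts how often the "JIT" runs */

/* Placeholder for generated native code. */
static int always_pass(dbref player, dbref thing)
{
    (void)player;
    (void)thing;
    return 1;
}

/* Stub compiler: returns NULL when it cannot handle the key. */
static boolexp_func jit_compile_stub(const char *key)
{
    compile_count++;
    return key ? always_pass : NULL;
}

/* Stub for the existing bytecode interpreter (assumed fallback). */
static int interpret_stub(const char *key, dbref player, dbref thing)
{
    (void)key;
    (void)player;
    (void)thing;
    return 0;
}

static int eval_lock_sketch(struct lock_list *ll, dbref player, dbref thing)
{
    if (ll->fun == NULL)                 /* first evaluation: compile */
        ll->fun = jit_compile_stub(ll->key);
    if (ll->fun)                         /* compiled: call native code */
        return ll->fun(player, thing);
    return interpret_stub(ll->key, player, thing);
}
```

The point of caching the pointer in the lock itself is that the compile cost is paid once per lock, not once per test.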

Transcript of a gdb session follows the break.

Script started on Sat Aug 11 08:18:24 2007
[shawnw@iDrone ~/src/penn/1.8.3/jit/game]$ gdb netmush 9169
GNU gdb 6.3.50-20050815 (Apple version gdb-573) (Fri Oct 20 15:54:33 GMT 2006)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "powerpc-apple-darwin"...Reading symbols for shared libraries ..... done

/Users/shawnw/src/penn/1.8.3/jit/game/9169: No such file or directory.
Attaching to program: `/Users/shawnw/src/penn/1.8.3/jit/game/netmush', process 9169.
Reading symbols for shared libraries . done
0x9001f988 in select ()
(gdb) break eval_lock
Breakpoint 1 at 0x8cd44: file lock.c, line 819.
(gdb) c
Continuing.

Breakpoint 1, eval_lock (player=2, thing=1, ltype=0x280e603 "USE") at lock.c:819
819	  struct lock_list *ll = getlockstruct(thing, ltype);
(gdb) n
822	  if (!ll)
(gdb) n
825	  if (!unparsing_boolexp)
(gdb) n
826	    log_activity(LA_LOCK, thing, unparse_boolexp(player, ll->key, UB_DBREF));
(gdb) n
829	  if (ll->fun == NULL)
(gdb) p ll->fun
$1 = (boolexp_func) 0
(gdb) n
830	    ll->fun = jit_compile_boolexp(ll->key);
(gdb) n
832	  if (ll->fun)
(gdb) p ll->fun
$2 = (boolexp_func) 0x2825400
(gdb)  disassemble 0x2825400 0x28254b4
Dump of assembler code from 0x2825400 to 0x28254b4:
0x02825400:	mflr    r0
0x02825404:	stw     r0,8(r1)
0x02825408:	stwu    r1,-96(r1)
0x0282540c:	stmw    r22,56(r1)
0x02825410:	mr      r23,r3
0x02825414:	mr      r22,r4
0x02825418:	mr      r30,r23
0x0282541c:	lis     r24,17
0x02825420:	lwz     r9,-25000(r24)
0x02825424:	cmpwi   r9,10
0x02825428:	blt-    0x2825458
0x0282542c:	mr      r3,r30
0x02825430:	lis     r24,0
0x02825434:	ori     r24,r24,55168
0x02825438:	mtctr   r24
0x0282543c:	bctrl
0x02825440:	li      r3,0
0x02825444:	lwz     r0,104(r1)
0x02825448:	mtlr    r0
0x0282544c:	lmw     r22,56(r1)
0x02825450:	addi    r1,r1,96
0x02825454:	blr
0x02825458:	li      r31,0
0x0282545c:	li      r9,1
0x02825460:	cmpwi   r9,0
0x02825464:	blt-    0x282549c
0x02825468:	lis     r24,16
0x0282546c:	lwz     r10,-13860(r24)
0x02825470:	cmpw    r9,r10
0x02825474:	bgt-    0x282549c
0x02825478:	lis     r10,15
0x0282547c:	ori     r10,r10,51672
0x02825480:	add     r10,r10,r9
0x02825484:	lwz     r11,56(r10)
0x02825488:	andi.   r24,r11,16
0x0282548c:	bne-    0x282549c
0x02825490:	subf    r24,r30,r9
0x02825494:	subfic  r31,r24,0
0x02825498:	adde    r31,r31,r24
0x0282549c:	mr      r3,r31
0x028254a0:	lwz     r0,104(r1)
0x028254a4:	mtlr    r0
0x028254a8:	lmw     r22,56(r1)
0x028254ac:	addi    r1,r1,96
0x028254b0:	blr
End of assembler dump.
(gdb) n
833	    return ll->fun(player, thing);
(gdb) quit
The program is running.  Quit anyway (and detach it)? (y or n) y
Detaching from program: `/Users/shawnw/src/penn/1.8.3/jit/game/netmush', process 9169 thread 0xd03.
[shawnw@iDrone ~/src/penn/1.8.3/jit/game]$ exit
Script done on Sat Aug 11 08:21:00 2007

The lock in question is =#1, about the simplest you can get. Most of the generated instructions are just checks that the lock refers to a valid, non-garbage object, plus standard function prologue and epilogue stuff. Then something really weird goes on at the end, where it should just be comparing two registers. I don't know enough PPC assembly to tell whether that's a clever trick or braindeadness in lightning (which is not very smart; the documentation repeats this several times. It's a tradeoff the developers made for speed).
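Traced by hand, the last three instructions (subf, subfic, adde) appear to be the usual branch-free way to turn "are these two registers equal?" into a 0 or 1 on PowerPC, so it looks more like a trick than a mistake. The C below is only an emulation of those three instructions on 32-bit values, written to check that reading; the function name `ppc_eq` is made up, and the register names in the comments are taken from the disassembly.

```c
#include <stdint.h>

/* Emulates:  subf r24,r30,r9 ; subfic r31,r24,0 ; adde r31,r31,r24 */
static uint32_t ppc_eq(uint32_t r30, uint32_t r9)
{
    /* subf r24,r30,r9: r24 = r9 - r30 (zero only when they are equal) */
    uint32_t r24 = r9 - r30;

    /* subfic r31,r24,0: r31 = ~r24 + 0 + 1, with the carry-out saved in CA.
       The carry is set only when r24 is zero. */
    uint64_t wide = (uint64_t)(uint32_t)~r24 + 1u;
    uint32_t r31 = (uint32_t)wide;
    uint32_t ca = (uint32_t)(wide >> 32);

    /* adde r31,r31,r24: r31 = r31 + r24 + CA.
       r31 is -r24 here, so the sum collapses to just CA. */
    r31 = r31 + r24 + ca;
    return r31;
}
```

So r31 ends up 1 when r30 and r9 hold the same value and 0 otherwise, without a compare and branch.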

While this works on 32-bit PowerPC, it's causing a bus error on 64-bit SPARC (maybe lightning only works with 32-bit SPARC?). I haven't tried x86 or x86_64 yet.

I didn't realize that every time a lock is tested, it gets decompiled and logged. It's supposed to help track down crashes, but man, that's a performance hit. Maybe it should be optional...
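One way "optional" could look, as a rough sketch: check a flag before doing the expensive decompile, so nothing is unparsed when logging is off. The flag `log_locks`, the counter `unparse_calls`, and both function names are hypothetical and exist only for this example; Penn's real logging and unparsing calls are not shown.

```c
#include <stdio.h>

static int log_locks = 0;      /* hypothetical option, off by default */
static int unparse_calls = 0;  /* counts how often the decompile runs */

/* Stand-in for the expensive decompile step. */
static const char *unparse_stub(const char *key)
{
    unparse_calls++;
    return key;
}

static void log_lock_check(const char *key)
{
    if (!log_locks)
        return;                /* skip the decompile entirely */
    fprintf(stderr, "lock check: %s\n", unparse_stub(key));
}
```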