Optimizing PennMUSH

Submitted by raevnos on Mon, 2007-01-29 11:33

Penn isn't typically a performance hog, especially on modern computers. Even older machines from the mid-90s are more than powerful enough to run a decent-sized game. The major constraints on those older computers are hard drive space and memory. This document has some hints for those times when performance is a factor, or for those who just like to tweak.

Generally, network transmission times are long enough that they're the major factor in lag or slowdown. You can tell if it's the mush process itself that's causing problems by looking at @uptime to see how much processor time the game is using, and, from a shell, using top to find out what's using the largest percentages of CPU time. Problems could very well be caused by an unrelated program and an OS with a scheduler that starves other processes of CPU cycles. The traceroute and ping programs can be used to time how long it takes network traffic to reach a host and where slow spots are.
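The network checks above might be wrapped up as follows. This is only a sketch: the host name mush.example.com and the function name net_check are placeholders, not anything Penn provides. top is interactive, so it only appears in a comment.

```shell
# Placeholder host: substitute the address players connect to.
MUSH_HOST="mush.example.com"

# CPU usage: run 'top' interactively and watch the %CPU column to see
# whether the mush process or something unrelated heads the list.

# Network: round-trip times first, then the path hop by hop.
net_check() {
    ping -c 5 "$1"      # five echo requests, with a round-trip summary at the end
    traceroute "$1"     # one line per hop, so a slow link stands out
}

# Example: net_check "$MUSH_HOST"
```

Running it from a player's machine toward the game host, rather than on the host itself, measures the path that actually matters for lag.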

Compiler options

If Penn itself turns out to be running slowly, the biggest step toward improving it is to recompile it with optimizations. Edit pennmush/Makefile and add optimization flags to the CCFLAGS line. -ON turns on basic optimizations, where N is 1 to 3. The higher the number, the more optimizations, but the longer it takes to compile. After editing, do a 'make clean; make' to rebuild the server.
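As a sketch of that edit, the shell function below appends flags to the CCFLAGS line and keeps a backup of the Makefile. The name add_ccflags is made up for illustration, and it assumes the Makefile has a single line beginning with CCFLAGS= that doesn't end in a backslash continuation. Check your own Makefile before relying on it.

```shell
# Append extra compiler flags to the CCFLAGS line of a Makefile.
# Usage: add_ccflags <makefile> <flags>
# Assumes a line starting with "CCFLAGS=" (spaces before "=" are allowed)
# and flags that contain no "|", "&" or "\" characters.
add_ccflags() {
    # -i.bak edits in place and leaves the original as <makefile>.bak
    sed -i.bak -E "/^CCFLAGS[[:space:]]*=/ s|\$| $2|" "$1"
}

# Only touch the real Makefile if it is where the text says it is.
if [ -f pennmush/Makefile ]; then
    add_ccflags pennmush/Makefile "-O2"
    (cd pennmush && make clean && make)
fi
```

If the rebuilt server misbehaves, copying Makefile.bak back over Makefile and rebuilding restores the previous flags.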

Other optimization options of note (assuming you're using gcc):

  • -march=X -mcpu=X

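    This tunes the generated binary for a particular processor type's instructions and scheduling: -march selects the instruction set and -mcpu the scheduling. For the typical x86 system, you can use 'pentium4', 'athlon-xp', and so on. (On x86 with newer GCCs, -mcpu is a deprecated synonym for -mtune, so use -mtune instead. PowerPC builds of gcc have no -march, so use -mcpu and -mtune there.) A binary built with -march may not run on processors older than the one named. The complete list of known processors is in the gcc documentation, as well as more architecture-dependent options.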

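  • If you have a processor with a large cache, -funroll-loops may help. Unrolled loops make the binary bigger, so the gain depends on the extra code still fitting in the instruction cache.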
  • The game doesn't have any floating-point intensive parts, so -ffast-math's benefit is limited. If you're using a processor with Intel's SSE instructions, try -mfpmath=sse. This is the default on x86_64.
  • If you don't care about debugging, remove the -g option, and add -fomit-frame-pointer to give gcc another register to work with. This is especially useful on 32-bit x86, which has few general-purpose registers.
  • With gcc 4 and higher, compile with -fprofile-generate, run the mush with normal activity for a few days, and recompile using -fprofile-use instead, with all other options the same. This'll give the scheduling and branch prediction algorithms real-world data to use to make better choices.
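The two-pass profile workflow in the last item can be tried on a toy program before committing to a multi-day run of the game. This is a sketch only: hot.c and pgo_demo are invented for illustration, and a real gcc is assumed to be on the PATH. For Penn itself the same two flags would go on the CCFLAGS line, with a 'make clean; make' after each change.

```shell
pgo_demo() {
    workdir=$(mktemp -d) || return 1
    cat > "$workdir/hot.c" <<'EOF'
#include <stdio.h>

int main(void) {
    unsigned long i, hits = 0;
    for (i = 0; i < 10000000UL; i++) {
        if (i % 16 != 0)  /* true 15 times out of 16; the profile records this */
            hits++;
    }
    printf("%lu\n", hits);
    return 0;
}
EOF
    (
        cd "$workdir" || exit 1
        # Pass 1: instrumented build, then a representative run that
        # writes profile data (.gcda files) next to the binary.
        gcc -O2 -fprofile-generate hot.c -o hot || exit 1
        ./hot > /dev/null || exit 1
        # Pass 2: all other options the same, but -fprofile-use instead,
        # so gcc reads the recorded data while optimizing.
        gcc -O2 -fprofile-use hot.c -o hot || exit 1
        ./hot
    )
    status=$?
    rm -rf "$workdir"
    return "$status"
}

# Example: pgo_demo
```

The point of the toy is the order of operations: instrumented build, representative run, rebuild with the profile. The quality of the result depends on how closely the profiled run resembles real use, which is why the advice for the mush is a few days of normal activity.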

Memory

If memory is an issue (older machines and/or huge databases), there are a few steps you can take.

  • Fiddle with the attribute compression choices in options.h. Huffman or word-based might be better for a particular game depending on the contents of the database.
  • Tweak the chunk swapfile options in game/mush.cnf. Infrequently-used attributes and mail messages get swapped out to disk. You can increase the size of the swapfile, and how much data gets moved every second. See the relevant section of the 'Hacking PennMUSH' book for details on this.
  • Increase the OS's swap space. Enlarging existing swap partitions or adding new ones can help.

Disk space/speed tradeoffs

When compressing databases, you have to choose between speed and compression efficiency. bzip2, for example, will produce the smallest database file, but takes the longest to run. Using no compression will result in the fastest dump, but the largest file on disk.

If you're using forking dumps, speed isn't an issue, of course: the dump is written by a child process, so the game keeps running while the database compresses.
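One way to see the tradeoff on your own data is to run each compressor over a copy of the database and compare. A minimal sketch: compare_compressors is an invented helper, the path in the usage line is an assumption about where your dump lives, and the timings are whole seconds from date, so they only mean something for large files.

```shell
# Report compressed size and elapsed time for each available compressor.
# Usage: compare_compressors <database-file>
compare_compressors() {
    db=$1
    for tool in cat gzip bzip2; do
        # Skip compressors that are not installed on this machine.
        command -v "$tool" >/dev/null 2>&1 || continue
        start=$(date +%s)
        # Each tool reads the file on stdin and writes to stdout;
        # cat stands in for "no compression". Nothing is written to disk.
        bytes=$("$tool" < "$db" | wc -c | tr -d ' ')
        end=$(date +%s)
        printf '%-6s %12s bytes %4s s\n' "$tool" "$bytes" "$((end - start))"
    done
}

# Example (path is an assumption; point it at your own dump file):
#   compare_compressors game/data/outdb
```

Because the output goes through a pipe rather than to a file, this measures compression cost only, not the time spent writing the result to disk.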

Other considerations

If you use SQL, take a close look at any SELECTs. Each one pauses the mush until it finishes, so doing complex joins or talking to a SQL server across a slow network will cause problems.

Internally, the developers try to use algorithms with good big-O performance (a measure of how running time grows as the amount of data increases). If you know of a better one than what we're using for something, let us know.