Hacking PennMUSH

Hacking PennMUSH javelin Tue, 2003-01-28 22:39

This is an update of the Hacking PennMUSH section of Javelin's Guide for PennMUSH Gods. This version pertains to PennMUSH 1.7.7 and later.

Adding commands

Adding commands raevnos Sat, 2007-06-23 13:44

This page talks about how to add new builtin commands.

Rule 1: If at all possible, do it via softcode and @command.

Registering a new command

Adding to the command table

To be written. See src/command.c for examples.

Using command_add()

See the example in cmdlocal.c

Writing the command

TBW, possibly as a child page. Variables available in COMMAND() blocks, etc.


Adding new switches

1.8.3p6 and later

Just include the switch name in the string of valid switches for the command.

1.8.3p5 and earlier

To add a new switch that doesn't already exist, add it, in all-caps, as a new line in src/SWITCHES. The first time you re-run make after that, hdrs/switches.h and src/switchinc.c will be automatically rebuilt to include the new switch. There's a limit on how many switches you can have -- to raise it, increase the NUM_BYTES #define in hdrs/command.h. Each time you increase it by 1, you get room for up to 8 more switches.

Testing switches

For switches included in src/SWITCHES: SW_ISSET(sw, SWITCH_FOO).

For switches not in the file (1.8.3p6 and later): SW_BY_NAME(sw, "FOO").

Adding local config options

Adding local config options talvo Thu, 2004-03-18 00:19

Many external coders need configurable settings that players can change without recompiling the server -- ideally something as simple as using the @config command. As of PennMUSH 1.7.7p28, adding your own is easy.

Adding your config

We will be working primarily with the add_config() function, but first we need to set up some global variables to hold the data. Outside of any function, usually near the top of a source file, we declare:

int config_happyserver = 1;
char config_favoriteadmin[256] = "grapenut";

Now that we have a place to store our variables, we can go into the local_config() function in src/local.c and add the following:

add_config("happyserver", cf_bool, &config_happyserver,
           sizeof config_happyserver, "cosmetic");
add_config("favoriteadmin", cf_str, config_favoriteadmin,
           sizeof config_favoriteadmin, "cosmetic");

Here's how the arguments break down. The first is the name of the option, i.e., what to look for in the .cnf file.

The second argument is the option's data type. Valid types are cf_int, cf_str, and cf_bool.

The third argument is a pointer to the variable that will hold the option's data. Notice that the happyserver int variable has a & in front. The favoriteadmin variable is already a pointer, so it doesn't need the & operator.

The fourth argument is the size in bytes of the option's variable.

The last argument is simply a string naming the category the option belongs to.

The final step in adding a config option is to check this line:

  hashinit(&local_options, 10, sizeof(CONF));

In this example we have enough space for 10 new config options. This number should be set at or above the number of config options you have added with add_config().
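Once registered, the options can be set in mush.cnf like any built-in option (the values below are just examples):

```
happyserver yes
favoriteadmin Javelin
```

Players with permission can then inspect them in-game with @config cosmetic, just like built-in options.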

Using new config options

Using local config options in softcode is just like using regular config options - the config() function is your friend.

To access the value of a local config option in hardcode, use the get_config() function (e.g. get_config("happyserver")). get_config() returns a pointer to a CONF structure, so you'll want to learn about them (reading the code for config_list_helper() can be useful here).

Note: get_config() wasn't properly exposed to hardcode files until 1.7.7p29, so upgrade to that version or later before expecting this to work.

Allocating and tracking memory

Allocating and tracking memory raevnos Tue, 2007-06-12 03:39

Penn includes a basic memory allocation debugger, enabled by setting mem_check to yes in mush.cnf and rebooting. However, to take advantage of this, you have to use special functions to allocate and free memory.

As of 1.8.3p4, it also features a slab allocator meant for space-efficient allocation of small fixed-size objects.


Slabs raevnos Fri, 2007-06-22 14:52

Each time you malloc() some memory, a bit more than what you requested is actually used -- the malloc system uses some bytes before or after (Or both) what it returns for its own bookkeeping. With large allocations, this doesn't matter much, but with lots of small allocations, it can cause problems. Some malloc implementations are better than others. A slab allocator can be used to reduce much of this overhead no matter how the underlying malloc implementation acts.

Before 1.8.3p4, attribute and lock structures were stored in their own, simple, slabs. In 1.8.3p4, a more general purpose and powerful allocator was added, and many areas of the source converted to use it.

The slab allocator is intended for small, fixed-size objects -- structures, basically. Strings are not suitable unless they are all the same length. Object sizes should be well under the page size of the host computer's virtual memory system -- usually 4K, sometimes 8K.

The allocator uses malloc() to allocate a page's worth of memory at a time, and crams as many objects into that as it can. It keeps track of how many objects in each page are being used and how many are free, so it can delete any unused pages.

When allocating objects from a slab, you can hint at which page should be looked at first for a free object. This is used to improve cache behavior (For example, when iterating through all attributes on a database object, it's faster if most of the structs are on the same page).


Coming later.


mush_malloc() and memcheck

mush_malloc() and memcheck raevnos Fri, 2007-06-22 14:40

Penn includes a basic memory allocation debugger, enabled by setting mem_check to yes in mush.cnf and rebooting. However, to take advantage of this, you have to use special functions to allocate and free memory.

The memcheck system uses reference counting, incrementing the count for a particular tag when memory is allocated with it and decrementing the count when that memory is freed. You can see the current counts with @list allocations as an admin character (@stats/tables and God for pre-1.8.3p4 versions).


A constantly increasing count where there shouldn't be one means you're not freeing memory after you're done. Attempts to free memory with a tag that has a 0 or negative count get logged in game/log/checkpt.log, along with where this was done in the source (The latter as of 1.8.3p4).

Tags are just strings describing what's been allocated. More specific tags, for example "fun_align.string", make tracking down leaks easier than more generic tags like "string", but increase the size of the data structures that store reference counts.


To associate memory with a tag, use the following wrapper functions, prototyped in externs.h. All take a tag argument, which is a string describing what's being allocated or freed. The same tag must be used to free a chunk of memory as was used when it was allocated.

void* mush_malloc(size_t bytes, const char *tag)
Wrapper for malloc(); allocates the given number of bytes and returns a pointer to them, or NULL.
void* mush_calloc(size_t count, size_t size, const char *tag)
Allocates an array of count elements of size bytes and returns a pointer to the first or NULL. The returned memory is filled with zero bytes. (Added in 1.8.3p3)
void* mush_realloc(void * restrict ptr, size_t size, const char * restrict tag)
Wrapper for realloc(): Allocates, resizes, or frees memory depending on the values of ptr and size. The tag and pointer cannot be the same string. (Added in 1.8.3p4)
char* mush_strdup(const char * restrict string, const char * restrict tag)
Returns a newly allocated copy of a C string (or NULL). The arguments cannot be the same string. Does not work with multibyte character strings.
void mush_free(void * restrict ptr, const char * restrict tag)
Frees memory. The arguments cannot be the same string.

Even if an allocation function returns NULL, it still increases the reference count of that tag.

If you have no control over allocation of some memory (Such as something returned by a foreign library), you can add a reference for it with void add_check(const char *tag) and decrement it with void del_check(const char *tag). Finally, void list_mem_check(dbref player) displays the current counts to a player and void log_mem_check(void) writes them to a log file.

See the Source documentation for more on the functions mentioned in the above paragraph.

Other options

Penn comes with a malloc implementation that includes optional memory debugging (CSRImalloc in debug mode; #define MALLOC_PACKAGE 2 in options.h). I've never used it.

(TODO: Links for the things mentioned below)

There are a variety of third-party tools for detecting memory leaks and other allocation-related problems, such as the MallocDebug program and library that come with the Mac OS X developer tools, the Boehm garbage collection library in leak-detection mode, and more.

The Chunk Memory Management System

The Chunk Memory Management System raevnos Fri, 2003-09-05 11:56


The chunk memory management system has three goals: to reduce overall memory consumption, to improve locality of reference, and to allow less-used sections of memory to be paged out to disk. These three goals are accomplished by implementing an allocation management layer on top of malloc(), with significantly reduced overhead and the ability to rearrange allocations to actively control fragmentation and increase locality.

Basic operation:

The managed memory pool is divided into approximately 64K regions. Each of these regions contains variable-size chunks representing allocated and available (free) memory. No individual allocation may be larger than will fit in a single region, and no allocation may be smaller than two bytes. Each chunk has between two and four bytes of overhead (indicating the used/free status, the size of the chunk, and the number of dereferences for the chunk), and each region has additional overhead of about 32 bytes.

Allocations are made with the chunk_create() call, which is given the size of the data, the data value to be stored, and an initial dereference count to be assigned to the chunk. Once created, the value of a chunk cannot be changed; the storage is immutable. chunk_create() returns an integral reference value that can be used to retrieve or free the allocation.

Allocations are accessed with the chunk_fetch(), chunk_len(), and chunk_derefs() calls. Each of these is given a reference (as returned by chunk_create()), and chunk_fetch() is additionally given a buffer and length to fill with the allocated value. Both chunk_fetch() and chunk_len() increment a chunk's dereference count (up to the maximum of 255), which is used in migration to improve locality.

Allocations are freed with the chunk_delete() call, which also requires a reference as input.

Finally, allocations are allowed to rearrange themselves with the chunk_migration() call. chunk_migration() takes an array of pointers to chunk references as input, and examines each of the indicated chunks to see which need to be moved to improve the distribution of allocations. If any allocations are moved, then the references to the moved allocations are updated in place (hence the array of pointers to references, instead of just an array of references). Migration may be done incrementally by submitting only a portion of the allocations with each call to chunk_migration(); however, _all_ allocations made with chunk_create() must eventually be submitted for migration in order to maintain the memory pool in a non-fragmented state.


Under normal conditions, extended use of this chunk allocation system would lead to a significantly fragmented datastore unless there were some means to defragment the storage arena. Calling chunk_migration() gives the allocator permission to move allocations around, both to defragment the arena and to improve locality of reference (by making sure that all the infrequently used chunks are segregated from the chunks in active use). Of course, moving all the allocated chunks at once would be a slow and painful process. Instead, migration may be done incrementally, giving permission to move a small number of chunks at any one time, and spreading out the cost of defragmenting the data store.

Just because you give permission to move a chunk doesn't mean that it will be moved. The chunk may be perfectly happy where it is, with no need to move it elsewhere. Chunks are only moved when their personal happiness would be improved by a move. In general, maximizing the happiness of individual chunks will improve the happiness of the whole.

There are several things that factor into a chunk's happiness. The things that make a chunk unhappy are:

  • Having free space both before and after the chunk in its region.
  • Having only one allocated neighbor (or worse, none). The edges of a region count as allocated neighbors.
  • Having a dereference count different from the region average. The greater the difference, the more unhappy the chunk is.
  • Being in a sparsely populated region. The more free space in a region, the more unhappy the chunks in it.
  • Being away from the other chunks migrated at the same time. If some of the other chunks allowed to migrate are in the same region as a chunk, then it is happier. (This is specifically to improve locality during dumps.)

None of these factors are absolute; all of them have different weights that add into a general unhappiness for the chunk. The lower the unhappiness, the better.

Over time and usage, the dereference counts for chunks will increase and eventually reach a maximum value of 255. (The count is limited by the fact that it's stored in a single byte for each chunk.) If this is left unchecked, eventually all chunks would have a dereference count of 255, and the counts would be useless for improving locality. To counteract this, when the average dereference count for a certain number of regions exceeds 128, the 'migration period' is incremented and all chunk dereference counts are halved. The critical number of regions is determined based on the cache size and the total number of regions. In general, period change should be controlled primarily by the frequency of database dumps (which end up incrementing the dereference count on all chunks, and thus all regions). Given a dump frequency of once per hour (the default), there should be a period change about every 2.6 days.


The chunk memory management system keeps several statistics about the allocation pool, both to maintain good operation through active encouragement of locality, and to satisfy the curiosity of people using the system (and its designer ;-)). These statistics are reported (in PennMUSH) through the use of the @stats command with the /chunks switch.

@stats/chunks generates output similar to this:

Chunks:         99407 allocated (   8372875 bytes,     223808 ( 2%) overhead)
                74413 short     (   1530973 bytes,     148826 ( 9%) overhead)
                24994 medium    (   6841902 bytes,      74982 ( 1%) overhead)
                    0 long      (         0 bytes,          0 ( 0%) overhead)
                  128 free      (   1319349 bytes,      23058 ( 1%) fragmented)
Regions:          147 total,       16 cached
Paging:        158686 out,     158554 in
Storage:      9628500 total (86% saturation)

Period:             1 (   5791834 accesses so far,       1085 chunks at max)
Migration:     245543 moves this period
               145536 slide
                   45 away
                30719 fill exact
                69243 fill inexact

First, the number of allocated chunks is given, along with their total size and overhead. Then, the allocated chunks are broken up by size-range; short chunks (2 to 63 bytes) with two bytes of overhead each, medium chunks (64 to 8191 bytes) with three bytes of overhead each, and long chunks (8192 to ~64K bytes) with four bytes of overhead each. Rounding out the individual chunk statistics is the number of free chunks, their total size, and the amount of fragmented free space (free space not in the largest free chunk for its region is considered fragmented).

Next comes statistics on regions: the number of regions in use and the number held in the memory cache. All regions not in the cache are paged out to disk. Paging statistics follow, listing the number of times a region has been moved out of or into memory cache. After that, the total amount of storage (in memory or on disk) used is given, along with the saturation rate (where saturation is indicated by what fraction of the used space is actually allocated in chunks).

Finally comes statistics on migration and the migration period. The period number is listed, along with the total number of dereferences in the period and how many chunks have the maximum dereference count of 255. Then the amount of migration movement is listed, both in total and broken up by category. Slides occur when an allocation is shifted to the other side of a neighboring free space. Away moves are made when an allocation is extremely unhappy where it is, and is pushed out to somewhere else. Fills are when an allocation is moved in order to fill in a free space; the space can be either exactly filled by the move, or inexactly filled (leaving some remaining free space).


The chunk memory management system can also display a few histograms about itself. These histograms are reported (in PennMUSH) through the use of the @stats command, with the /regions, /freespace, or /paging switches.

All of @stats/regions, @stats/freespace, and @stats/paging produce histograms vs. region average dereference count. The histograms use buckets four counts wide, so all regions from 0-3 will be in the first bucket, 4-7 in the second, etc., up to 252-255 in the last bucket. If the heights of the buckets are significantly different, then the highest spikes will be allowed to extend off the top of the histogram (with their real values labeled in parentheses next to them).

@stats/regions is a histogram of how many regions at each count currently exist. In a healthy game, there should be a large spike at some dereference count between 64 and 128 (representing the largely unused portion of the database), a lesser spike at 255 (representing the portion of the database that's used very frequently), and a smattering of regions at other counts, with either new areas of the database (below the large spike) or semi-frequently used areas (above the large spike). New migration periods occur when the large spike would pass 128, at which point everything is halved and the spike is pushed back down to 64.

@stats/freespace is a histogram of how much free space exists in regions at each dereference count. This histogram is included to aid in diagnosis of the cause for dropping saturation rates. Ideally, the free space in any bucket should be less than 120K.

@stats/paging is a histogram of the number of regions being paged in or out at each dereference count. As of this writing, a very unhealthy behaviour is observed, wherein the histogram shows a trapezoid between 64 and 128, drowning out most of the rest of the chart. This indicates that as time goes on, the attributes associated with a single low-use object are getting scattered randomly throughout all the low-use regions, and thus when dumps occur (with their linear enumeration of all attributes on objects) the low-use regions thrash in and out of cache. This can be very detrimental to dump performance. Something will have to be done to fix this tendency of migration; the unhappiness based on other chunks being migrated in the same region is the current attempt. Healthy behaviour will make some other pattern in the paging histogram which has not yet been determined.

Data structures

Data structures raevnos Thu, 2007-05-24 16:27

The pages in this chapter describe the various data structures used in Penn. They include hash tables, shared strings, unique-prefix lookup tables, and more.

Hash tables

Hash tables raevnos Mon, 2007-06-04 22:26

Hash tables are a (usually) fast lookup structure for (string key, value) tuples. They are used for things like looking up softcode functions. They support lookup by exact key name, and iteration through all elements in an arbitrary order.


To use hash tables, #include "htab.h" if it isn't already. Hash tables can then be declared with the HASHTAB type.

For example:

HASHTAB htab_sample;

Before using a hash table, it must be initialized with hash_init(). This takes three arguments: a pointer to the hash table, an initial size for the table, and a pointer to a function that is passed the data stored in an entry when it's deleted (Or NULL). hashinit() is a macro that passes a NULL cleanup function and only takes the first two arguments.

hash_init(&htab_sample, 256, NULL);

When done with a hashtable, call hash_free() on it.

Once you've initialized a table, you can add elements to it with hashadd(). It takes three arguments: the string to use as a key, a pointer to the data to store, and a pointer to the table. The key is copied; the data pointer is stored untouched. It is this pointer that's passed to the cleanup function given in hash_init(), if any.

hashadd("foo", &some_value, &htab_sample);

You can look up elements with hashfind(). Its arguments are the key to look up and a pointer to the table. It returns the data pointer for the associated key, or NULL if the key was not found. Hence, it's not a good idea to use a NULL data pointer.

if (hashfind("foo", &htab_sample)) { item_present(); }

Delete elements from the table with hashdelete(), which takes the key and pointer to table arguments just like hashadd().

Iterate through all data pointers in the hash table with hash_firstentry()/hash_nextentry(): the former returns the first data pointer, the latter each subsequent one, returning NULL after the last. hash_firstentry_key()/hash_nextentry_key() do the same thing for the keys. You cannot currently (1.8.3p3) get both key and data in one iteration; this might change in the future.


Hash tables use cuckoo hashing, which produces very dense memory-saving tables. They strongly favor usages where there are few insertions (Which can potentially be very costly), and many lookups (Which are very fast and cheap).

Perfect Hash Tables

Perfect Hash Tables raevnos Sat, 2011-04-16 02:04

For cases where we have to check to see if a string is one of a fixed set of values that will never change while a MUSH is running, Penn uses a tool called gperf to generate lookup functions that use a perfect hash table to quickly and efficiently check for a match.

For details on the format of a gperf file, see its documentation.

To include a gperf-generated table in Penn:

  1. Create a new foo.gperf gperf file. See the existing ones for templates.
  2. Look for the existing gperf entries in src/Makefile.in and add a new one following their template to generate foo.c.
  3. Run ./config.status to rebuild src/Makefile.
  4. In the source file that uses the lookup function, #include "foo.c"

Examples can be seen in src/funmath.c and src/markup.c, among others.

Integer Maps

Integer Maps raevnos Thu, 2008-01-31 14:53

Integer Maps map unsigned 32-bit integer keys to arbitrary pointers. They were added in 1.8.3p7. They're used in a number of places: for example, in the queue code to look up entries by process id (src/cque.c), and in looking up the structure representing a connected socket by its descriptor number (src/bsd.c).


To use integer maps, you have to #include "intmap.h". The type of an integer map is an intmap *. The type is opaque; you cannot directly access elements of the structure.

Basic functions

intmap * im_new(void)
Return a new, empty intmap structure.
bool im_insert(intmap *, uint32_t, void *)
Insert a new key,value pair into the map. Returns true on success, false on failure (Usually because of a duplicate key).
void * im_find(intmap *, uint32_t)
Return the pointer associated with the given key, or NULL if the key isn't present.
bool im_exists(intmap *, uint32_t)
Returns true if a key is present in the map.
int64_t im_count(intmap *)
Returns the number of keys stored in the map.
bool im_delete(intmap *, uint32_t)
Removes a mapping from the map. Returns true on success, false on failure (if the key didn't exist). Deletion of memory allocated in the data pointer is left up to the user.

Less common functions

void im_destroy(intmap *)
Delete an entire intmap. The intmap passed to this function becomes unusable. Memory allocated in the data pointers must be freed by the user.
void im_dump_graph(intmap *, const char *)
Dump a representation of the map to the given filename. The dump is in the dot language, which can be rendered by tools in the graphviz package, specifically dot(1). Useful for debugging the internal tree structure.


Integer maps use a slight modification of radix trees (AKA patricia trees), a binary tree where branching is based on the status of particular bits in the key.

Prefix tables

Prefix tables raevnos Thu, 2007-05-24 17:03

Prefix tables, aka ptabs, are a (string key,value) lookup structure used to find the entry for a name that is the unique prefix of a key in the table. For example, they're used by @set for flag names so that you don't have to get the full name of the flag.


To use a ptab, #include "ptab.h" if it isn't already in the source file. Prefix tables can then be declared using the PTAB type.

For example:

PTAB ptab_sample;

Before using any other functions on a new ptab, it must be initialized, with ptab_init(), which takes a pointer to a PTAB as its argument. Unless otherwise mentioned, all ptab functions take a pointer to a PTAB as their first argument.


When done with the table, call ptab_free() to release allocated memory.


Before looking up anything, you have to add (key,value) tuples to the table. There are three steps involved in this. First, you have to call ptab_start_inserts(), then add new entries with ptab_insert(ptab, key, data). Once done inserting items, call ptab_end_inserts(). Between the start and end calls you can only insert items; lookups are disallowed.


ptab_start_inserts(&ptab_sample);
for (n = 0; n < initarray_size; n++)
  ptab_insert(&ptab_sample, initarray[n].key, initarray[n].data);
ptab_end_inserts(&ptab_sample);

With version 1.8.3p3 and later, you can use ptab_insert_one() to insert a single new item. It's better to use ptab_insert() when inserting multiple entries at one time.

ptab_insert_one(&ptab_sample, "I AM A UNIQUE PREFIX", &stdin);

You can look up an entry with ptab_find(). If an entry matching the given key is found, that element's data pointer is returned. If it's not found, or more than one key has the given lookup key as a prefix, NULL is returned. Note that this means that if you use a NULL data pointer, there is no way to tell it apart from a failed lookup.

data = ptab_find(&ptab_sample, "I AM A");

You can also do a lookup testing for an exact match only with ptab_find_exact():

data = ptab_find_exact(&ptab_sample, "I AM A UNIQUE PREFIX");

Delete entries from a table with ptab_delete(). The key must be an exact match, and no cleanup is done of the data pointer. If it needs to be freed, that has to be done first.

ptab_delete(&ptab_sample, "I AM A UNIQUE PREFIX");

Finally, you can iterate through all entries in a table with the ptab_firstentry_new() and ptab_nextentry_new() functions. The former returns the data of the first element, and the latter subsequent data. It returns NULL once all elements have been returned. The second argument to each, if non-NULL, must be large enough to hold the longest key in the table. If you don't care about the key, use ptab_firstentry() and ptab_nextentry().

Internally, ptabs are implemented as a sorted array. Exact lookups are done with a binary search and thus have O(log N) performance. Prefix lookups use a modified binary search that finds the first entry with a prefix of the lookup key and checks the elements before and after to see if the prefix is unique.

When new entries are added, the key (A string) is copied. The calling function can safely free the memory used as the key argument if needed. The data pointer is merely stored.

ptab_start_inserts() marks the table as being unsorted. ptab_insert() adds new elements to the end of the array, and ptab_end_inserts() sorts it again. ptab_insert_one() does a single insertion into the proper place in the table.

See the documentation of ptab.c for more information.

Shared strings

Shared strings raevnos Thu, 2007-05-24 18:28

Shared string trees, aka StrTrees or just string trees, allow you to cache frequently-repeated strings instead of having dozens or hundreds of copies of the same one. They are used for things like attribute and object names.


Use string trees by putting #include "strtree.h" in your source if it isn't there already. String tree variables have the type StrTree. Unless otherwise mentioned, all string tree functions take a pointer to a StrTree as their last argument.

StrTree st_sample;

Before using one, it must be initialized by calling st_init().


New strings are added to the tree with st_insert(). The string it returns is the persistent shared copy, and is the one you should save.

mystruct->name = st_insert("FOO", &st_sample);

When done with a particular copy of a shared string, call st_delete().

st_delete(mystruct->name, &st_sample);

To test whether a given string is in the tree, use st_find(). It returns a pointer to the shared string if present, or NULL if not. This pointer should not be saved beyond the immediate function it's used in; use st_insert() for that.


The string tree is a red-black tree (Hence the name) of reference-counted strings. Every time a string is inserted, its count goes up, and when it's deleted, the count goes down. A string is removed from the tree when its count goes to 0. If a string's reference count goes to 127, it becomes a permanent entry and will never be deleted.

See the documentation of strtree.c for more information.

Makefile rules

Makefile rules raevnos Mon, 2007-06-04 00:57

This page describes some of the more useful Makefile rules for PennMUSH.

Most people use 'make update', 'make', and 'make install' and get by fine. There are a lot of other options, of which the more useful are mentioned here. Some have not been used in living memory and are due to get cleaned up.

make (the default rule)
Rebuilds hdrs/cmds.h, hdrs/funs.h, and other headers. Builds netmud and info_slave.
make update
Updates options.h from options.h.dist and game/mush.cnf from game/mushcnf.dist, to add new options or remove deleted ones.
make install
Makes symlinks from game/netmush to src/netmud and the same for info_slave.
make netmud
Builds just netmud.
make etags
Creates an etags file in src/TAGS for use with emacs (Move the cursor over a function name, and hit M-.)
make ctags
The same as etags but for vi.
make portmsg
Builds src/portmsg, the port announcer program that can be used when a mush is down or moved, to give people who try to connect a message.
make indent
Runs indent on Penn source to re-format it to project standards.
make distdepend
Rebuilds the file dependencies in src/Makefile.in that determine when files need to be recompiled. Requires the makedepend program, part of X11 or its own package depending on the OS distribution. Re-run ./config.status to generate a new src/Makefile if it's not done by make for you. (Pending feature as of this writing.)
make clean
Deletes all objects and executables in src/.
make distclean
Deletes a lot more files than make clean, including game/mush.cnf. Use with care.
make versions
Rebuilds the help-file change logs from the CHANGES.* files. Requires the Sort::Versions perl module.

And more.


Powers raevnos Sat, 2003-02-01 17:56

Generally, powers are used to give an object limited permissions for something specific without having to resort to giving an object the wizard flag.

You can add and otherwise manage powers as God from in a game using the commands documented in HELP @POWER2, but aside from softcode checks using haspower(), you'll have to modify hardcode to test for them.

Test for the presence of a power with:

has_power_by_name(object, "power name", NOTYPE)
object is the dbref of the object, and "power name" should be obvious (It's a string).

You can also create new powers in the source, by modifying src/flaglocal.c to include an appropriate add_power() call. See the example in that file.
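In spirit, has_power_by_name() is a case-insensitive lookup of a power name on the object. The toy version below only illustrates that kind of check; the table and helper names are invented, and Penn's real check also handles power aliases and object-type restrictions:

```c
#include <ctype.h>

/* Toy power check: a fixed table standing in for an object's powers.
 * Invented for illustration; not Penn's implementation. */
static const char *example_powers[] = { "Boot", "See_All", "Tport", 0 };

/* Case-insensitive string comparison. */
static int str_eq_nocase(const char *a, const char *b)
{
  while (*a && *b) {
    if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
      return 0;
    a++, b++;
  }
  return *a == *b;
}

/* Return 1 if the named power is present, 0 otherwise. */
int example_has_power(const char *name)
{
  int i;
  for (i = 0; example_powers[i]; i++)
    if (str_eq_nocase(example_powers[i], name))
      return 1;
  return 0;
}
```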

Working with attributes

Working with attributes raevnos Mon, 2007-06-04 00:40

This chapter contains information on working with attributes -- setting, getting, iterating over, etc.

When it's done. Which isn't now.

An attribute is a C struct, of type ATTR, defined in hdrs/attrib.h.

Common tasks related to attributes are described below:

TODO: Describe attribute struct fields

TODO: Child page on getting/setting
TODO: Child page on attribute flags
TODO: Child page on iteration

Adding new builtin attributes or changing existing ones

There are two ways to add a builtin attribute (One that can be set with @NAME as well as &NAME), or change the default permissions for a built-in one:

  1. Use @attribute from in the game. This is the preferred way, as it involves no source modifications. Changes made with @attribute are not (As of 1.8.3p2) persistent across reboots, so they should go in a @startup.
  2. Modify hdrs/atr_tab.h. Detailed instructions to follow.

Technical details
(TODO: Maybe move this to a child page?)

Attributes are stored (As of 1.8.3p2) as a sorted linked list, one per object. Names of all attributes in the database are held in a shared string tree. The text of attributes is stored in the chunk manager. Attribute flags are in a 32-bit int, one bit per flag. It's almost full.

See the attrib.c documentation for API details.
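The name-sorted linked list can be illustrated with a minimal sketch. The struct and function names here are invented; Penn's real nodes are ATTR structs with more fields:

```c
#include <stdlib.h>
#include <string.h>

/* Minimal sketch of a name-sorted attribute list. Invented for
 * illustration; not Penn's attrib.c. */
struct toy_attr {
  const char *name;
  struct toy_attr *next;
};

/* Insert a new node, keeping the list sorted by name. */
struct toy_attr *toy_insert(struct toy_attr *head, const char *name)
{
  struct toy_attr **ptr, *node;
  node = malloc(sizeof *node);
  node->name = name;
  /* Walk until we find the first name that sorts after ours. */
  for (ptr = &head; *ptr && strcmp((*ptr)->name, name) < 0; ptr = &(*ptr)->next)
    ;
  node->next = *ptr;
  *ptr = node;
  return head;
}

/* Look up an attribute by name; sorted order lets us stop early. */
struct toy_attr *toy_find(struct toy_attr *head, const char *name)
{
  int c;
  for (; head; head = head->next) {
    c = strcmp(head->name, name);
    if (c == 0)
      return head;
    if (c > 0)
      break; /* past where it would be; not present */
  }
  return NULL;
}
```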

Working with locks

Working with locks raevnos Thu, 2004-06-03 14:02

This section is an introduction to dealing with locks: Adding new built-in lock types, evaluating an object's locks, adding or changing individual locks, and so forth.


The code for locks is located in hdrs/lock.h and src/lock.c. All information about a specific lock is held in a lock_list struct. lock_lists are stored in a linked list sorted by lock name, in a way similar to attributes. The actual key part of a lock, called a boolexp, is handled by hdrs/boolexp.h and src/boolexp.c.

The fields of a lock_list should be accessed via the macros defined in hdrs/lock.h:

• L_FLAGS(lock): The flags of the lock.
• L_CREATOR(lock): The dbref of the player that set the lock.
• L_TYPE(lock): The name of the lock ("Basic", "Enter", etc.)
• L_KEY(lock): The boolexp.
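As a self-contained illustration of the accessor-macro pattern used for lock_lists (the struct and macro names below are a simplified mock, not copied from hdrs/lock.h):

```c
/* Simplified mock of the lock_list accessor-macro pattern; field and
 * macro names are illustrative, not Penn's. */
typedef int dbref;

typedef struct mock_lock {
  const char *type;   /* lock name: "Basic", "Enter", ... */
  const char *key;    /* the boolexp, as a string for this mock */
  dbref creator;      /* who set the lock */
  unsigned int flags; /* lock flags */
  struct mock_lock *next;
} mock_lock;

/* Accessors: callers use these rather than the fields directly, so the
 * struct layout can change without touching the rest of the code. */
#define ML_TYPE(l)    ((l)->type)
#define ML_KEY(l)     ((l)->key)
#define ML_CREATOR(l) ((l)->creator)
#define ML_FLAGS(l)   ((l)->flags)
```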

Adding new lock types

Adding new lock types raevnos Thu, 2004-06-03 17:06

Adding new lock types that can be set in-game with @lock/foo is very easy.

Near the top of src/lock.c is an array named lock_types. Near the end of the array is a comment "Add new lock types just before this line.". Just copy & paste one of the existing entries in the array to that spot, and change the name to that of your new lock type. You can also change the default flags of the lock if you wish (The LF_PRIVATE part).

If you want to test this new lock type in hardcode, you'll also want to add a new line to the section just above the lock_types array, where there's a comment "Define new lock types here.". Once again, use the existing ones as an example.
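The lock_types table is essentially an array of (name, default flags) pairs, searched by name. A mock of that shape (the entries, flag values, and helper name here are invented):

```c
#include <string.h>

/* Mock of a lock_types-style table: lock names with default flags.
 * Entries and flag values are invented for illustration. */
#define MOCK_LF_PRIVATE 0x1

struct mock_lockdef {
  const char *name;
  unsigned int flags;
};

static const struct mock_lockdef mock_lock_types[] = {
  { "Basic", MOCK_LF_PRIVATE },
  { "Enter", MOCK_LF_PRIVATE },
  { "Foo", MOCK_LF_PRIVATE },
  /* Add new lock types just before this line. */
  { NULL, 0 }
};

/* Find a lock definition by name, or NULL if unknown. */
const struct mock_lockdef *mock_get_lockdef(const char *name)
{
  const struct mock_lockdef *p;
  for (p = mock_lock_types; p->name; p++)
    if (strcmp(p->name, name) == 0)
      return p;
  return NULL;
}
```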


hdrs/lock.h also needs to be modified. If you added 'const lock_type Foo_Lock = "Foo"' in src/lock.c, then hdrs/lock.h needs to have 'extern const lock_type Foo_Lock;'. This new variable is used with the eval_lock() function described later. It's not necessary to use, but it's easier than re-typing the actual name of the lock everywhere you want to test it.


Boolexps raevnos Mon, 2007-06-04 22:53

A boolexp is the key part of a @lock: #1, sex:M*, canpass/1, etc. This page talks about their implementation and describes how to add a new type of key.

Well... a new key of a 'X^Y' sort. Adding new tokens is tricky for a couple of reasons: There just aren't that many possible ASCII characters left, and it's a lot more work than extending ^. (Which really should have been used to mean an exclusive-or to go along with & and |, but too late now. Too bad I didn't think of it back when...).


All boolexp-related code is in src/boolexp.c.

Keys are transformed into a parse tree by a hand-written recursive descent parser (The top function is parse_boolexp_E()). This parse tree is then compiled to bytecode. The bytecode itself is usually stored in the chunk manager; the boolexp field of the lock structure holds a chunk reference.

Some locks are checked very frequently, hence the use of bytecode to make evaluating a boolexp as fast as possible.

Adding a new key type

Adding a completely new key type involves extending the parser and "assembler", and the VM function that evaluates the bytecode, and the function that converts bytecode to a string for display, and and and...

Adding a new ^-type key involves several fewer steps (But still quite a few). This is a brief guide; look at the comments and existing code in src/boolexp.c for hints:

  1. Add a new opcode to the bvm_opcode enum.
  2. Add a new entry to the flag_locks array.
  3. Add a case for the new opcode to the switch in eval_boolexp(). The variable r is the result register; set it to 1 if the key succeeded, 0 if it failed. The value of (char*)(bytecode + arg) is the string on the right hand side of the ^.
  4. Add a case for the new opcode to the switch in unparse_boolexp() to convert the bytecode into a string. Use OP_TFLAG as a template.
  5. Add a case for the new opcode to the switch in emit_bytecode(). Just adding it to the same group as OP_TFLAG and letting it fall through is usually all that's needed here.
  6. Add a case for the new opcode to the switch in print_boolexp() to print out a pseudo-assembly instruction. This is used in debugging the boolexp compiler.
  7. Optionally add a case for the new opcode to the switch in check_lock() to carry out @warnings checks.
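To make the result-register idea from step 3 concrete, here is a toy evaluator in the same style: a switch over opcodes with a result register r. The opcodes and encoding are invented for illustration; see eval_boolexp() in src/boolexp.c for the real thing:

```c
/* Toy bytecode evaluator illustrating the result-register style of
 * eval_boolexp(). Opcodes and encoding are invented for illustration. */
enum toy_op { TOY_TRUE, TOY_FALSE, TOY_NOT, TOY_END };

/* Evaluate a sequence of opcodes, returning the final value of r. */
int toy_eval(const unsigned char *pc)
{
  int r = 0; /* the result register */
  for (;;) {
    switch (*pc++) {
    case TOY_TRUE:
      r = 1; /* this key succeeded */
      break;
    case TOY_FALSE:
      r = 0; /* this key failed */
      break;
    case TOY_NOT:
      r = !r;
      break;
    case TOY_END:
      return r;
    }
  }
}
```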


emit_bytecode() might move, on GCC, from a switch to an optimized jump table. Macros will be used to choose which version to use with only one evaluation function, following the same approach as the OCaml bytecode VM.

  • Testing locks

    Testing locks raevnos Thu, 2004-06-03 17:18

    Testing to see if one object passes another object's lock is, again, easy to do.

    The usual way to test is with the eval_lock() function. It takes three arguments: The dbref of the object that is trying to pass the lock, the dbref of the object whose lock is being tested, and the name of the lock type. This last argument is usually one of the Foo_Lock variables declared in hdrs/lock.h, or can be a string containing the name of the lock. Enter_Lock and "Enter" will test the same lock, but the former is the recommended usage. eval_lock() returns a true value if the lock is passed, and false otherwise. Remember that if the lock doesn't exist on the object in question, it always succeeds.

    If all you have is a boolexp (The 'key' part of a lock) and not an actual lock on an object (A channel lock, for example), it can be tested with the eval_boolexp() function. Once again, it takes three arguments: The dbref of the object trying to pass the lock, the boolexp, and the dbref of the object to pretend that the boolexp is on, for eval-locks, permission and visibility checks and so forth. If there is no object associated with the boolexp, such as in channel locks, use the same dbref that's trying to pass the lock.
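The rule that a missing lock always succeeds can be captured in a tiny mock (the names here are invented; this is not Penn's eval_lock()):

```c
#include <string.h>

/* Mock of eval_lock()'s default-success rule: if the named lock isn't
 * set on the object, the lock test succeeds. Names are invented. */
struct mock_locklist {
  const char *type;
  int key_result; /* what the boolexp would evaluate to: 1 or 0 */
  struct mock_locklist *next;
};

/* Return the key's result if the lock exists, else 1 (pass). */
int mock_eval_lock(const struct mock_locklist *locks, const char *type)
{
  for (; locks; locks = locks->next)
    if (strcmp(locks->type, type) == 0)
      return locks->key_result;
  return 1; /* no such lock: always passes */
}
```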

    When to hack

    When to hack javelin Tue, 2003-01-28 22:47

    What? There's a command that your MUSH really needs, a function not in standard PennMUSH, a bug you've discovered? In cases like this, the fact that you can "Use the Source, Luke" can greatly enhance your enjoyment of MUSH, your understanding of MUSH, and your MUSH. :)

    I should probably insert here the standard cautions about not adding loads of feeping creatures ("feeps") and kludgy hacks to your MUSH, but I won't. Feeps are fun and usually harmless, and as long as you're happy with your code, don't let anyone tell you otherwise.

    While there are no rules, then, there are a few guidelines I try to follow when deciding what *not* to hack:

    • Avoid hacks which permanently alter the db or maildb structure, unless you know that you'll always be willing to support the code and cope with possible future PennMUSH upgrades, or if you don't care about ever upgrading. The sort of thing I'm talking about here is adding a couple new variables to the db structure. If you think your hack is of general interest, let me know and if I agree, you'll be assigned an official db-flag for your hack, and we'll arrange for built-in db conversion code to turn your changes on or off (the way USE_WARNINGS works, for example). Another good approach is to create your own file for storing local properties, and use set_objdata and get_objdata to manipulate them once they're loaded (usually from local.c). This is discussed later.
    • Avoid hacks which require the MUSH to filter all outgoing text. ANSI and HTML filtering is bad enough. While I've seen some beautiful commands like @wrap (which word-wraps output to players), each time you add a filter to all the MUSH's output, you significantly increase CPU requirements. Leave client-functions to clients, if possible.
    • Avoid hacks with very limited functionality. If you've got spare time to hack, ask Javelin if there's something he's been stalling on doing...he keeps a big list.
    • If the hack is something that seems very useful, email the idea and/or the source code (a context diff) to pennmush-developers@pennmush.org. It may well appear in the next release!

    PennMUSH 1.8.0p8 and 1.8.1p3

    PennMUSH 1.8.0p8 and 1.8.1p3 javelin Thu, 2005-09-15 18:30

    Well, the recent 1.8.1p2 release revealed that several important Configure units from 1.7.7 were missing in the 1.8.x series, including the one for detecting mysql. These new releases fix Configure, and are highly recommended.

    Diffs, patches, and source code control

    Diffs, patches, and source code control javelin Tue, 2003-01-28 23:06

    Sure, you want to add your own hacks to PennMUSH, but you also want to keep up with the patchlevels that the developers release so you get the benefit of bugfixes and new features, right? In this section, we'll look at how that's done, and you'll learn about generating your own patches, applying patches, and more sophisticated mechanisms of source code control.

    I'll refer to the distributed PennMUSH source code and patches as the 'distributed branch' or 'dist'. We'll be looking at the situation where you have modified the dist to create your 'local branch' or 'local'. If you start out with the dist at patchlevel n (we'll call that dist[n]), and you add some hacks to it, you've created local[n] -- dist[n] + your local changes. Suddenly, the developers release dist[n+1] and a patch dist[n]->dist[n+1]. Now what?

    There are two main strategies to upgrade:

    1. Take your copy of local[n] and apply dist[n]->dist[n+1]. If it succeeds, you've probably got local[n+1]. If not, you'll have to fix the failed chunks by hand.
    2. Get a copy of dist[n] and a copy of local[n] and generate a patch dist[n]->local[n]. This patch essentially contains your changes to dist[n]. Then get a copy of dist[n+1], and apply the patch. If it succeeds, you've probably got local[n+1]. If not, you'll have to fix the failed chunks by hand.

    Each approach has pros and cons. The first method doesn't require you to keep many copies of the source code; you just use the patches as they come out. The downside is that if patch hunks fail, you'll have to apply them by hand, and if you didn't back up your source tree, you may not be able to get up and running if there's a serious failure.

    The second method requires you to keep two or three source trees, which is both more annoying and safer. On the plus side, if a patch fails, you'll be applying your hack back by hand -- code that you probably know quite well. And if you want to contribute your hack to PennMUSH (or to another user), you've already got a patch.

    In general, the first method works best when your local changes are small and well-contained (in fact, if they're all in the *local.c files, the first method is ideal, as those files are never patched directly). When your local changes are extensive, the second method wins.

    Even better, you can automate the second method by using source code control software, which we'll discuss later. First, a look at how to make a patch file, and how patches are applied.

    Making patches: the diff program

    Making patches: the diff program javelin Tue, 2003-01-28 23:18

    The "diff" program (a standard unix utility) can produce a patchfile, given the original and revised source code files. For example, if you revise player.c and save the older version as player.c.orig, you could make a patchfile like this:

        diff -c player.c.orig player.c > patchfile

    The "-c" switch indicates that you want a context diff, which is more detailed than an ordinary diff and better for patches. If you're going to publicly distribute the patch, be sure it's a context diff! (Another diff format, unified diff, is also appropriate. You can use "-u" to get a unified diff. Some people find them easier to read, and some -- like me -- don't.)

    The order of the files is important. The patchfile will be written to apply the differences between player.c.orig and player.c so that player.c will be the end result.

    If there's more than one source file changed, you can do this:

        diff -c player.c.orig player.c > patchfile
        diff -c game.c.orig game.c >> patchfile

    You may be able to quickly make a collection of
    diffs across the whole PennMUSH source tree by using David
    Cheatham's mkpatch shell script, which is
    available at http://ftp.pennmush.org/Accessories/mkpatch

    Applying patches: the patch program

    Applying patches: the patch program javelin Tue, 2003-01-28 23:36

    Changes and bugfixes to the MUSH code are often distributed as patches in context diff format. Diffs are files which describe how the source code should be changed. The program "patch" (by Larry Wall, distributed by the GNU project) automatically reads these files and makes the changes to your source code. If you don't have the patch program, ask your system administrator to get it! (A version for Win32 systems is available at http://unxutils.sourceforge.net)

    Typically, you can get patches to PennMUSH from the pennmush mailing list or FTP site. To apply a patch, place it in your top-level pennmush directory. Then READ THE PATCH FILE.

    PennMUSH patches contain instructions at the top of the file. These instructions can be very important, and can differ from patch to patch. Always read the patch file.

    That said, the most common instruction is to apply the patch by typing:

        patch -p1 < patchfile

    in order to process the patch.

    The patch program is very smart. The changes in the context diff are split into hunks. Each hunk represents a section of code that's been changed (and that's far enough away from any other code changes to be handled separately). A hunk includes information about the line numbers in the original file where the change was made, along with a few lines of context -- lines of code that haven't changed but surround the changed lines. Using this context, patch can apply the changes in the right place even if you've modified other sections of the program.

    Sometimes, however, you may have made changes that overlap the changes in a patch hunk. In this case, the hunk will fail to patch because the patch program won't find the appropriate context. When this happens, you need to read the failed hunk yourself and apply the changes by hand. Failed hunks are usually named for the file they were intended for, but end in .rej (e.g. "bsd.c.rej").

    Here's how to read a failed hunk (this is based on an email message by T. Alexander Popiel):

    • If the diff begins with the line "Prereq: ", then it means that in the file that follows, the string should be present within the first few lines, or the diff should not be applied. Patch checks for prerequisites to help ensure that you're patching the right version of the source code. For example, PennMUSH patches always start by patching the Patchlevel file, and use the contents of that file as a prerequisite to be sure you don't apply 1.7.7-patch08 to the 1.7.7p4 version by mistake.
    • For each file in the diff, there are two header lines, which look like this:
      *** filename1	date1
      --- filename2	date2

      This indicates that the diff is the difference between filename1 and filename2, and should usually be applied to filename2.

    • After the header lines, the diff will indicate which line numbers in filename1 (the source) are to be examined, what should be changed or deleted, what the resulting line numbers in filename2 (the destination) are, and what should
      be changed or added:
      • A '-' in the first column of the source part of the patch indicates a line deletion.
      • A '+' in the first column of the destination part of the
        patch indicates a line addition.
      • A set of '!'s in the first column in both the source and the destination indicate a line-group replacement; a group of consecutive lines in the source are replaced with a corresponding group of lines in the destination.

    The patch program can also "reverse" a patch, applying it backward to turn the new version back into the old version. To reverse a patch, you feed the patch to the patch program giving it the "-R" switch.

    Source code control

    Source code control javelin Tue, 2003-01-28 23:51

    You can go a long way with diff and patch, but if you're making serious changes, you need more powerful tools. If you hack at the PennMUSH source code long enough, you will eventually make a mistake, and want to go back to an earlier version of your work. Or you'll make a really good but extensive change, and want to keep up with patchlevels or produce your own patchfiles to distribute it to others.

    You can make your life a lot easier with some form of "source code control" or "revision management" software. Here are some common revision management approaches:

    1. Backups. Make a directory alongside your pennmush directory called "oldpenn", and put a copy of your source code into it. When you make changes, you can recover your old files from oldpenn, use it to produce patches, and eventually copy your new files into oldpenn when you're sure they work. Many people keep a "clean" source directory containing the original dist code of their version, in case they need it. Of course, if you need to go back more than one revision, you're in trouble unless you clutter your disk with many many oldpenn directories. A variant of this strategy involves storing older versions as compressed tar files.
    2. SCCS. SCCS (source code control system) is a more sophisticated way to manage source code. It stores changes from version to version in a subdirectory. You "check out" files to work on them, and "check in" files that you've hacked. You can revert to any revision at any time. This is good. Many major unix systems (Ultrix, SunOS, HP-UX) come with sccs installed. Read the man pages for info.
    3. RCS. RCS (revision control system) is the GNU project's free replacement for SCCS, available from
      http://www.gnu.org/software/rcs/. The commands are different from SCCS, and some things are easier to do. RCS is standard with Linux. RCS can be used to ease upgrading to a new patchlevel by preserving your hacks to the older patchlevel.
    4. CVS. CVS (also from GNU) is the "concurrent version system". It uses RCS to store revisions, but provides a higher-level concept of a project version (rather than just single files) and has better support for multiple programmers concurrently changing files (including making changes to the same file). Finally, CVS repositories can be made accessible over the Internet.
    5. prcs. prcs (project revision control system) is the PennMUSH devteam's current favorite. It's available from
      http://prcs.sourceforge.net. Like CVS, prcs uses RCS, but provides a high-level concept of a project rather than individual files. It also has excellent support for automatically merging code changes (such as new PennMUSH releases) into your locally modified version.
    6. Subversion. Subversion (svn) is a version control system built with the intention of being a compelling replacement for CVS. Available from http://subversion.tigris.org under an Apache/BSD-style license. It includes many of the key features of CVS, including the higher-level concept of projects rather than individual files. It also tracks meta-data for directories, renames, and files, has truly atomic commits, and cheap, easy branching and tagging. Repositories are accessible locally, through http(s) with WebDAV/DeltaV, and via its own svn server protocol for easy remote usage.

    If you choose to use a source code control system (and I can't recommend it highly enough), discipline yourself to always check in code after each revision, so that you can undo each step. If you have multiple people hacking (especially from different accounts on the machine), you can take advantage of the fact that RCS and SCCS will "lock" revisions so that only the person who checked it out can modify it and check it back in, preventing two people from making inconsistent changes. Or use CVS or prcs, which allow (and expect) multiple people to change things at once, and try to help deal with possibly conflicting changes.

    If you can't decide what software to use, CVS has a very large userbase who can probably be helpful, but prcs has the PennMUSH devteam who can advise you. Your call.

    When you get your first pennmush distribution, check in the entire source directory. With prcs, that's:

    prcs checkout pennmush
    prcs populate
    prcs checkin

    If you make some changes and then want to produce a diff of your changes:

    prcs diff > patchfile

    makes a diff from the last checked-in revision to the current version of the project. If you read the man page for prcs, you'll see that it can also make diffs between checked-in revisions, for single files, etc.

    prcs supports the notion of multiple branches. You can store a branch that tracks the distributed PennMUSH source code, and a second branch that tracks your locally hacked code. You can then produce diffs between them at any time using prcs diff, or merge changes to the dist into your local code using prcs merge. See the man page for details.

    #ifdef and #define

    #ifdef and #define javelin Tue, 2003-01-28 23:56

    You can save yourself a lot of hassle if you're careful in how you hack the PennMUSH code. When you decide to add new code, or change old code, add a #define into options.h which will turn your code change on or off. For example, if you're adding a new feature to change the WHO format, put something like this into options.h:

    /* If defined, the WHO commands will use a new format */
    #define NEWWHO

    Then, surround your additions with #ifdef NEWWHO...#endif pairs. For changes, use #ifdef NEWWHO...#else...#endif. For deletions from the original code, use #ifndef NEWWHO...#endif around the code to delete:

    #ifdef NEWWHO
      this is code that you've added
    #endif

    #ifdef NEWWHO
      this is code you've changed
    #else
      this is the original code
    #endif

    #ifndef NEWWHO
      this is original code you want deleted
    #endif

    This allows you to preserve the original PennMUSH coding, should you ever need to refer back to it (if, for example, you're trying to apply someone else's patch to something you've already changed), and allows you to turn on and off your feature as necessary.

    The PennMUSH code style

    The PennMUSH code style javelin Wed, 2003-01-29 00:09

    The PennMUSH devteam have adopted some coding conventions, which are listed below. If you use these conventions, your hacks are more likely to work on multiple systems, and will be easier for me to integrate into new patchlevels.

    • Use ANSI C style. All functions should be explicitly prototyped, and function definitions should use the ANSI C style. Although PennMUSH does contain some function definitions in the K&R style, they are being replaced by ANSI C definitions as the devteam rewrites sections of the code.
    • All source files should #include "config.h" before any other include file (except perhaps copyright.h) and should #include "confmagic.h" after all other include files. This lets the file take full advantage of the autoconfiguration script. You usually want to #include "conf.h" after "config.h" and any system header files, but before any other PennMUSH header file.
    • All header files should be idempotent. That's a fancy way of saying that they should be set up like this:
      #ifndef _MYFOO_H
      #define _MYFOO_H
      ...header file here...
      #endif  /* _MYFOO_H */

      This protects you if the file gets included twice by mistake.

    • If you're doing string handling, learn how to properly use the PennMUSH safe_* functions (e.g. safe_str, safe_chr, safe_integer, safe_format) to avoid buffer overflows.
    • Use ISO C 9899 functions when possible. For example, use memcpy in preference to bcopy, and strchr in preference to index.
    • Signal handlers return Signal_t, defined through autoconfiguration. If SIGNALS_KEPT is defined, you don't have to reset signals in the handler.
    • malloc returns Malloc_t, and calls to free should have their parameters cast as (Malloc_t). I.e.: free((Malloc_t)buff);
    • Use mush_malloc(size to malloc,"name of mem_check") and mush_free((Malloc_t) ptr to free, "name of mem_check") when possible. These are macros which are preprocessed to plain old malloc/free when MEM_CHECK is not defined, and which call a function in memcheck.c to add/delete a mem_check before doing the malloc/free when MEM_CHECK is defined. You still need to add MEM_CHECK's manually if you get your memory by strdup or safe_uncompress or something.
    • Use the UPCASE() and DOWNCASE() macros to get the uppercase or lowercase versions of a character. The autoconfig checks whether toupper() can accept only lower-case letters, and defines UPCASE to protect toupper. If toupper() is safe, UPCASE() is defined as toupper() to be more efficient.
    • Learn about Penn's typedefs and macros, and use them when you can. This requires reading header files and examples in the source code.
    • The code follows a standard indentation scheme, which is documented in src/Makefile under the 'indent' target. It requires GNU indent 1.9 or later. You can use 'make indent' to reindent your code. Please re-indent before making context diffs for patches.
    • Document your code. As we revise the PennMUSH code, we're moving toward using doxygen ( http://www.doxygen.org ) to help us generate html documentation of the code.
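The safe_* functions mentioned above all share one shape: they append into a fixed-size buffer through a moving pointer and refuse to overflow it. This mock shows the pattern; the names and buffer size are invented, and Penn's real versions live in the server source:

```c
/* Mock of the safe_* append pattern: write into a fixed buffer through
 * a moving pointer, refusing to overflow. Names are invented. */
#define MOCK_BUFFER_LEN 16

/* Append one char. Returns 0 on success, 1 if the buffer is full
 * (one byte is reserved for the trailing nul). */
int mock_safe_chr(char c, char *buff, char **bp)
{
  if (*bp - buff >= MOCK_BUFFER_LEN - 1)
    return 1;
  *(*bp)++ = c;
  return 0;
}

/* Append a string char by char, stopping at the buffer's edge. */
int mock_safe_str(const char *s, char *buff, char **bp)
{
  int ret = 0;
  for (; *s; s++)
    ret = mock_safe_chr(*s, buff, bp);
  return ret;
}
```

The caller keeps a buffer and a pointer into it; a nonzero return means output was truncated, never that memory was stomped.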

    Adding flags

    Adding flags javelin Sat, 2003-02-01 17:41

    N.B. This discussion applies to the PennMUSH flag system as of PennMUSH 1.7.7p5. If you're using an earlier version of PennMUSH, you need to find an older edition of Javelin's Guide for PennMUSH Gods

    The easiest way to add flags in PennMUSH is to log in as #1 and use @flag. If you don't need your flags to have any special hardcoded effects, this is definitely the way to go.

    So from here on out, I'll assume that you are coding your own hardcoded system (e.g. a space system) that you expect to distribute, and your system needs to automatically add a SPACE flag to the database and be able to test it in hardcode in order to implement special effects.

    Adding the flag

    Adding a new flag is as easy as adding a call to add_flag() in the local_flags() function of src/flaglocal.c, like this:

    add_flag("SPACE", '&', NOTYPE, F_ANY, F_ANY);

    The arguments, in order, are:

    1. The flag's name, in "quotes".
    2. The flag's character, in 'single quotes'. If you don't want to assign a character to the flag, use '\0'
    3. The set of object types to which the flag can be applied. NOTYPE means all. Otherwise you can use a bitwise-OR combination of TYPE_ROOM, TYPE_PLAYER, TYPE_THING, and TYPE_EXIT
    4. Permissions required to set the flag. See listing in hdrs/flags.h
    5. Permissions required to clear the flag. See listing in hdrs/flags.h

    Adding a macro

    To make the use of your flag in code easier, add a macro that checks to see if an object has the flag. In PennMUSH, these macros are usually defined in hdrs/dbdefs.h. If your flag only applies to a single object type, you can define it with the IS() macro, like Floating():

    #define Floating(x)     (IS(x, TYPE_ROOM, "FLOATING"))

    If your flag applies to multiple (or all) object types, use the has_flag_by_name() function instead, like this:

    #define Audible(x)      (has_flag_by_name(x, "AUDIBLE", NOTYPE))

    Introduction to the new flag system

    Introduction to the new flag system javelin Mon, 2003-01-27 12:39

    This posting should eventually make its way into the Guide for Gods, when that moves here as a collaborative book, but until it does, I thought I'd share some information about the new flag system - how it works and why. I'm going to try to keep this at a conceptual level, but some experience with programming is probably useful.

    The old way

    Until PennMUSH 1.7.7p6, an object's flags were stored as two 32-bit integers. One integer stored "generic flags", like WIZARD, that can be set on any kind of object. The other stored "toggles", flags that only applied to specific types of objects. Each flag and toggle had a specific bit position within their integer.

    This system provided room for 32 generic flags and 32 toggles for each object type. Object "type" itself was treated as a generic flag, so there were practically only 28 generic flags available. And PennMUSH used 25-27 of them itself, leaving little room for people to add their own flags. As a result, two things happened:

    • Tortuous hardcode approaches to quasi-generic flags by making toggles that could apply to multiple object types
    • Tortuous softcode approaches that usually involved using a visual attribute to store flags. (Actually, these approaches were and are pretty good, but did require more work of the softcode and the server)

    The primary goal of the flag system rewrite was to remove this limitation. A secondary goal was to make it easy for Gods to add new flags that would just be tested in softcode, and for pennhacks to add new flags with hardcode behaviors.

    The new way

    Under the new system, flags (and toggles - there is no longer a distinction) are internally stored as variable-length arrays of bytes (a byte is 8 bits). The system keeps track of how many bits are currently in use. When a new flag is added, the system checks to see if there's a bit available in the last byte of the flag array. If so, the flag gets assigned to that bit. If not, the array is grown by a byte, providing room for that flag (and 7 others).

    These bit assignments are arbitrary and temporary - they are not fixed positions, but are subject to change (typically, when the server is rebooted). They should never be referenced or manipulated outside of the functions in flags.c. Everywhere else in the code - and in the database when it's written to disk - flags are represented as a string of flag names, like "WIZARD TRANSPARENT". (This actually does impose a limit on the number of flags, as the typical MUSH string buffer is 8192 characters. If your flags average 10 characters long, you are limited to about 740 of them. That seems like enough for now.)
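The grow-by-a-byte scheme looks roughly like this. It is a simplified sketch of the idea, not flags.c itself, and all names are invented:

```c
#include <stdlib.h>

/* Simplified sketch of a growable bitfield like the new flag system's:
 * one bit per flag, growing a byte at a time. Not Penn's flags.c. */
struct toy_bits {
  unsigned char *bytes;
  int nbits; /* bits currently allocated (always a multiple of 8) */
};

/* Assign the next free bit, growing the array by one byte if full. */
int toy_alloc_bit(struct toy_bits *b, int used)
{
  if (used >= b->nbits) {
    int nbytes = b->nbits / 8 + 1;
    b->bytes = realloc(b->bytes, nbytes);
    b->bytes[nbytes - 1] = 0;
    b->nbits = nbytes * 8;
  }
  return used; /* the newly assigned bit position */
}

void toy_set_bit(struct toy_bits *b, int bit)
{
  b->bytes[bit / 8] |= (unsigned char) (1 << (bit % 8));
}

int toy_test_bit(const struct toy_bits *b, int bit)
{
  return (b->bytes[bit / 8] >> (bit % 8)) & 1;
}
```

Allocating the ninth flag grows the array from one byte to two, leaving room for seven more before the next growth.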

    As I mentioned above, toggles are no longer distinguished from generic flags. Each flag now includes information about which object types it can apply to. This eliminates the kludgery that used to be required to make a flag like LISTEN_PARENT, that applied to all types except PLAYER.

    Other information that goes with each flag is its associated flag character ('W' for "WIZARD"), any flag aliases (like "INHERIT" for "TRUST"), and the permissions required to set or clear the flag.

    Under the new system, flags can be created without associated flag characters; that's important, because it makes it possible to have more flags than characters. Flags without a character don't appear next to an object's dbref when you look at the object or in its flags() list, but do appear in examine and can be retrieved with the lflags() function. Two additional functions, andlflags() and orlflags(), allow for testing flag combinations using lists of full flag names.

    Of course, there's a trade-off for this flexibility. The new system is more memory-intensive and slower than the old system. Fortunately, computers have sped up considerably, and the difference is unlikely to be noticeable. To help matters a bit, an object's type (player, room, etc.) is no longer stored as a flag. Instead, it gets its own integer in the object structure, which speeds up type checking, a very frequent operation.

    The definitions of the flags themselves are now dumped to the same database that contains the objects.


    The @flag command

    Once flag bit assignment became dynamic, it was possible to allow God to add flags on the fly, from within the MUSH. The new @flag command provides this ability. It can also provide some interesting information about flags.

    Because flags are now stored in the database, @flag only needs to be run once - not at every startup - to add a flag. Another consequence, however, is that if your server should crash before it dumps, the new flag won't be in your database and may need to be re-added.

    @flag/add (and most other flag manipulation commands) is restricted to God, largely because you don't want Wizards to accidentally muck with the permissions on the WIZARD flag, etc.

    add_flag and flaglocal.c

    Pennhacks who write contributed patches and systems may also want to add flags to a game. For example, a space system may need a SHIP flag that it manipulates in the hardcode. One approach would be to require users to add the necessary flags with @flag, but it's also possible for the patches to introduce the flags themselves.

    The new flaglocal.c file (copied from flaglocal.dst) provides the local_flags() function, which is called after the flag table is set up from the database. From this function, patches can call the add_flag() function to add a new flag at startup (see example in flaglocal.dst). It is explicitly safe to call add_flag() when the same flag is already in the database, so you don't have to test if you've already added the flag on a previous startup - just try to add it.
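
    The idempotence guarantee is the key property here. The snippet below is a self-contained sketch of that behavior (the names are invented for illustration; the real add_flag() and its arguments live in flags.c and flaglocal.dst): a registration function that first looks the name up and returns the existing entry rather than adding a duplicate, so calling it again on every startup is harmless.

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <string.h>

    #define MAX_FLAGS 64

    static const char *flag_table[MAX_FLAGS];
    static int flag_count = 0;

    /* Register a flag by name, or return the existing slot if the
     * name is already present -- a second call is a safe no-op. */
    static int
    add_flag_sketch(const char *name)
    {
      int i;
      for (i = 0; i < flag_count; i++)
        if (!strcmp(flag_table[i], name))
          return i;              /* already registered */
      flag_table[flag_count] = name;
      return flag_count++;
    }

    int
    main(void)
    {
      int a = add_flag_sketch("SHIP");
      int b = add_flag_sketch("SHIP");  /* e.g. after a reboot */
      assert(a == b && flag_count == 1);
      printf("SHIP registered once, at slot %d\n", a);
      return 0;
    }
    ```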

    Some subtleties

    Flags are a critical part of the PennMUSH code and appear in many places. Here are a few of the intricacies involved in the flag rewrite.

    Converting old databases. We do this by remembering how the old flag system used to handle flags (this nostalgia lives in hdrs/oldflags.h and in the flag table in flags.c, which is no longer used for anything but conversion). If we're loading a database that doesn't include flag definitions, we initialize our flags from the old flag table. Thereafter, the db will be dumped with the flag definitions. If you had already hacked new flags into your flag table, this process will pick up and convert those flags too, as long as you move their bit definitions into hdrs/oldflags.h.

    Macros. Several of the common lower-level macros like IS(), and all of the flag macros (like Wizard()), had to change to use the new functions that look up a flag on an object by name (has_flag_by_name()). Examples abound in hdrs/dbdefs.h.

    Command restrictions. Because commands can be restricted to objects with certain flags, we now need to load the flag table, and therefore the db, before we initialize the command table and before we read any restrictions from mush.cnf. But we needed other mush.cnf directives loaded before we load the db, so we now take two passes through mush.cnf. In the first pass, before the db read, all directives except command_restrict, command_alias, function_restrict, function_alias, and attribute_alias are read. After the db is read and we have a flag table, we initialize the command and function tables and do a second pass to pick up those directives and apply them. You may notice the multiple passes in log/netmush.log.


    I hope this has been helpful in explaining some features of the new flag system. If you have any questions, you know where to find me.

    Adding new 'help'-like commands

    Adding new 'help'-like commands javelin Sat, 2003-02-01 17:49

    Let's say you want to add a new indexed text file to support a command called 'rumor'. rumor is to work just like help, news, or events, but is to be based on a file game/txt/rumor.txt which will be automatically generated from files in the game/txt/rumor directory with names like january.rumor, february.rumor, etc.

    This turns out to be pretty easy. You'll need to modify two files:

    • game/mush.cnf
    • game/txt/Makefile (for Unix-like systems)

    Here's the plan:

    1. Edit game/mush.cnf, and add this line:
      help_command rumor txt/rumor.txt
    2. Users of Unix-like systems (including Windows systems like msys that provide bash, perl, and make):
      • Edit game/txt/Makefile. Find this line:
        TXT=help.txt news.txt events.txt

        Add your new file name. Now it looks like this:

        TXT=help.txt news.txt events.txt rumor.txt

        Find these lines:

        rules.txt: rules/*.rules compose.sh
                ./compose.sh rules

        Make a copy of these lines right below them, and change 'rules' to 'rumor'. Now you have this:

        rules.txt: rules/*.rules compose.sh
                ./compose.sh rules
        rumor.txt: rumor/*.rumor compose.sh
                ./compose.sh rumor

        NOTE: The whitespace before './compose.sh' must be a single tab character, not spaces.

      • Create the game/txt/rumor directory and populate it with some files with names ending in .rumor (jan.rumor, feb.rumor, etc.). Each of these files should be in help file format (topic names beginning with &'s, followed by text, and the first line should be a topic name).
      • In the game/txt directory, type 'make', and you should see rumor.txt being created.
    3. Users of pure Windows systems without Unix-like shell environments should just create the file game/txt/rumor.txt and put all their entries in there.
    4. Shutdown and restart the MUSH and test out your new rumor command!
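
    For reference, here's a minimal sketch of what a file like game/txt/rumor/jan.rumor might contain. The topic names and text are invented for illustration; lines beginning with '&' mark topic entries, and consecutive '&' lines are commonly used to give one entry several names:

    ```
    & jan
    & january
    Rumors for January

      Word in the tavern is that the harbor will freeze over early
      this year.

    See 'rumor february' for next month.
    ```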

    Dealing with softcode datatypes in hardcode

    Dealing with softcode datatypes in hardcode raevnos Wed, 2003-06-25 23:19

    When adding a new softcode function to the Penn source, you'll quickly notice that all the arguments are passed as strings (in the args array), and that it returns values by appending them to another string. But in softcode, you have functions that expect numbers, dbrefs, etc. Penn provides a set of functions for use in the hardcode for converting strings to and from C types upon which real work can be done, and for appending values to the return string buffer.

    They're described below:

    Integer (C type: int)
      Type test:    is_integer(char *), is_strict_integer(char *)
      String to C:  parse_integer(char *)
      C to string:  unparse_integer(int)
      Append:       safe_integer(int, buff, bp)

    Unsigned integer (C type: unsigned int)
      Type test:    is_uinteger(char *)
      String to C:  parse_uinteger(char *)
      C to string:  unparse_uinteger(unsigned int)
      Append:       safe_uinteger(unsigned int, buff, bp)

    Floating-point number (C type: NVAL)
      Type test:    is_number(char *), is_strict_number(char *)
      String to C:  parse_number(char *)
      C to string:  unparse_number(NVAL)
      Append:       safe_number(NVAL, buff, bp)

    Dbref (C type: dbref)
      Type test:    is_dbref(char *)
      String to C:  parse_dbref(char *)
      C to string:  unparse_dbref(dbref)
      Append:       safe_dbref(dbref, buff, bp)

    Boolean (C type: int)
      Type test:    is_boolean(char *)
      String to C:  parse_boolean(char *)
      C to string:  unparse_boolean(int)
      Append:       safe_boolean(int, buff, bp)

    Character (C type: char)
      Append only:  safe_chr(char, buff, bp)

    String (C type: char *)
      Append only:  safe_str(string, buff, bp), safe_strl(string, length, buff, bp)
    • For adding to a string buffer, there's also safe_format(buff, bp, fmt, ...) where fmt is a printf()-like format.
    • safe_chr() is currently a macro, so avoid arguments with side-effects when using it.
    • For numeric types, the is_FOO() functions obey the tiny_math config option: if a string doesn't begin with a number, it's treated as 0. The is_strict_FOO() functions require properly formed numbers of the right type.
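
    The buff/bp convention used by all the safe_FOO() functions is worth seeing in miniature. Here is a self-contained, simplified sketch (my_safe_chr() and my_safe_str() are stand-ins written for this example, not the real PennMUSH functions): buff is the fixed-size output buffer, bp points at the current write position and is advanced by each append, and a nonzero return signals that the buffer is full.

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <string.h>

    #define BUFFER_LEN 8192  /* the standard MUSH string buffer size */

    /* Append one character, refusing to overflow the buffer. */
    static int
    my_safe_chr(char c, char *buff, char **bp)
    {
      if (*bp - buff >= BUFFER_LEN - 1)
        return 1;            /* no room left: report truncation */
      *(*bp)++ = c;
      return 0;
    }

    /* Append a whole string, stopping early on truncation. */
    static int
    my_safe_str(const char *s, char *buff, char **bp)
    {
      int ret = 0;
      while (*s && !(ret = my_safe_chr(*s++, buff, bp)))
        ;
      return ret;
    }

    int
    main(void)
    {
      char buff[BUFFER_LEN];
      char *bp = buff;       /* write position starts at the beginning */

      my_safe_str("Result: ", buff, &bp);
      my_safe_str("42", buff, &bp);
      *bp = '\0';            /* terminate the accumulated string */
      assert(strcmp(buff, "Result: 42") == 0);
      puts(buff);
      return 0;
    }
    ```

    Because bp travels with the data, successive safe_FOO() calls build up the return value piece by piece without any strcat()-style rescanning of the buffer.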