A treatise on MUSHcode

walker, Tue, 2009-10-13 23:14

I originally wrote this to describe mushcode to a few programming friends to help explain why I think learning MUSHcode as well as I did considerably aided my understanding of programming in many fields. It started off a simple summary, then an endless loop of: "I should expand on that ... I should explain this ... This could probably use clarification ..." and so ended up with this. =)

What's interesting is just how amazingly complicated it gets, and yet once you've got a solid grasp of all these ideas, they fairly directly apply to parallel and concurrent programming.

Premises: A mush is basically a large world with lots of objects (rooms, players, things, exits (that go from one room to another)) in it. On an active mush, a large number of those things will probably have commands in one or more of the queues. A number of issues are in force here:

* Permission: Does object A have permission to affect object B?
* Message passing: called 'pemitting' - sending text to other objects.
* Concurrency: If object A is doing something to object C, and object B destroys C (or otherwise makes something happen that invalidates A's operation), what happens? Or worse, B destroys A?
* Object A sends a message to object B. What happens then?

So there are three different language parsers in a mush. (Four if you count SQL, but we'll leave that out for now.) They are:

- The command parser.
- The function parser.
- The locks. (These handle permissions)

In addition to this, there is:

- Everything is a string.
- Attributes. MUSH's version of variables
- The command queues.
- Queue Entries.
- Command *lists*. Not related to the queue.
- The name matcher. (Given the string "Walker", find what is meant by that.)
- The command matcher. (Given the string "+who", what object has the code that runs it?)
- 'pemit' messages and "listening"
- Semaphores and @waits (kinda like a "sleep 10 ; echo hi" in bash)

(I may be leaving a few things out, but I've got the gist of it).

Complicated yet? Let's dive into those. I've known accomplished programmers to break down and go mad when faced with these things. (Granted, those are typically the ones who learned only one paradigm so they're lost without it. And it often helps them hugely with programming in other languages if they put time into learning mushcode.)

Basically: "Commands do things. Functions return things." For experienced programmers: Functions cannot have side effects. Commands can. (Reality: A number of functions do have side effects, but that isn't something I want to dwell on here.)

Because mush engines are single threaded, we have queues. Every time a command needs to be run, it gets inserted into one of the queues, and the queues are processed FIFO. Player queue entries have higher priority than object queue entries. In addition, there's a faux "socket queue" - commands given directly by a player are run directly, and not inserted into the queue. (Though that command may itself insert things into the queues.)

Some truisms:
* All players are objects.
* Functions are called to generate output that something else can use. No function is called without either a command or an event (that sends a message) being triggered.
* Commands are caused by four things: Other commands, a player, a trigger, or an event.
* Every command has an enactor (The object who caused the command to run) and an executor (The object who is executing the command). These are important for permissions.

Some conventions:
* Typical programming commands begin with "@" while world-interaction or simpler commands don't. e.g: "@switch" "@pemit" "@create", compared to "think", "say", "look". They are often called @-commands when discussed in a programming context.
* Command arguments are typically separated by = for a command with two arguments, and the 'right hand' side might be further separated by commas.
* Commands that are built into the server are called "hardcoded" commands. Commands made using mushcode are called "softcoded commands".
* Similarly: "softcode" is another name for mushcode. "hardcode" refers to C and hacking the server source.

Some caveats:
* A few things behave differently when executed as a command from the socket compared to a queued "object command". Most importantly: setting attributes with "&" - When run directly (socket), the value isn't evaluated. When run from the queue, the value *is* evaluated.
* Text received from the player socket is run directly as a command. It is not a command list, so ";"s do not split commands.

Everything is a string.

Yup. It's all in how you treat it. add(1,1) returns 2, but it's still a string. There's nothing stopping you from calling add(1,hello) other than that it'll complain at you for not using a number where you should be. There's no 'string' type that you designate with quotes. No float type you designate with 1.0f. etc.
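
For friends coming from typed languages, here's a rough Python model of what "everything is a string" means for add(). The name mush_add and the exact error text are my own assumptions for illustration; the point is only that both the result and the complaint come back as plain strings.

```python
def mush_add(*args):
    """Toy model of add(): strings go in, a string comes out."""
    total = 0
    for arg in args:
        try:
            total += int(arg.strip())
        except ValueError:
            # Assumed error text. Errors are ordinary strings too.
            return "#-1 ARGUMENTS MUST BE NUMBERS"
    return str(total)


print(mush_add("1", "1"))      # the string "2", not a number
print(mush_add("1", "hello"))  # an error string, not an exception
```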

Objects

An object has an identifier ('dbref'), name, attributes, locks, flags and a few relationships with other objects: Owner, parents (very similar to inheritance in programming), location, contents (it can contain objects), and 'home' (where it belongs).
Of most import here are the identifier, locks and attributes. Flags can be considered to be part of locks/permissions.

Identifiers are integers prepended by a #. "#0" "#1" "#2". (Interesting note: Error messages are of the form "#-1 ERROR MESSAGE". For true/false checking: A # followed by a negative is false. So "if(#-1 FALSE,this is true,this is false)" returns "this is false")
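
A minimal Python sketch of that true/false rule, for the curious. Only the "# followed by a negative is false" part comes from the note above; treating the empty string and "0" as false is my assumption, and is_true and mush_if are made-up names.

```python
import re


def is_true(value):
    """Toy truth test for a MUSH string."""
    # A '#' followed by a negative number (an error dbref) is false.
    if re.match(r"#-\d", value):
        return False
    # Assumption: the empty string and "0" are false, anything else is true.
    return value not in ("", "0")


def mush_if(condition, when_true, when_false=""):
    return when_true if is_true(condition) else when_false


print(mush_if("#-1 FALSE", "this is true", "this is false"))
```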

Name matcher
Objects have contents and neighbors. MUSH provides a name matcher. If you try to 'get' something, it will attempt to match neighbors. If you try to 'drop' something, it will attempt to match contents. Try to page a player, and it will attempt to match players (in particular, connected ones).

In addition, it understands a few English keywords: "here" - The location of the object doing the looking. "me" - the object doing the looking.

Attributes
Each object has any number of attributes. (PennMUSH effective limit: 2048). An attribute has a name, flags, and value.

An attribute whose value begins with "$" is checked for matching commands. An attribute whose value begins with "^" is checked for matching listen patterns.

You set attributes on an object using "&attributename object=value". You fetch them in several ways, the main ones being: "get(object/attrname)" to get an attribute on another object, "v(attrname)" to get an attribute on the executor (who is evaluating the function), "u(object/attrname)" to get and evaluate an attribute as a function - evaluation happens from the perspective of the object the attribute is on (that object becomes executor), and "u(attrname)" to obtain the evaluated result of an attribute on the executor.
To see attributes non-programmatically, use "ex" to examine them. (Kinda like "cat"ing or "grep"ing a file. ex object/attr*pat* will return attributes matching a pattern.)

Evaluating attributes on another object will change the caller. (described below)

Enactor, Executor and Caller

If object A causes object B to evaluate a function or command, then A is the enactor and the caller, while B is the executor. B is executing the code while A caused it to happen and was the immediate caller of it. If during the process of evaluating that function, B calls a function on C, then C is an executor for that code, B is the caller, and A is the enactor.

For shorthand, the function parser knows "%#" as the enactor, "%@" as the caller and "%!" as the executor. For ease of making cosmetic code for mudding, there are registers such as "%n" (the name of the enactor), "%l" (the location of the enactor)
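
If the A/B/C shuffle is hard to keep straight, this small Python sketch tracks it. Context and ufun are names I made up for illustration; the only rule it encodes is the one described above: the enactor stays put, the old executor becomes the caller, and the target becomes the executor.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Context:
    enactor: str   # %#
    caller: str    # %@
    executor: str  # %!


def ufun(ctx, target):
    """Context seen while evaluating an attribute on `target`."""
    return replace(ctx, caller=ctx.executor, executor=target)


# A causes B to evaluate something, then B calls a function on C.
on_b = Context(enactor="A", caller="A", executor="B")
on_c = ufun(on_b, "C")
print(on_c)
```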

The function parser
This basically evaluates expressions. It's both simpler and more complex than it sounds.

An expression that begins with a function, such as "reverse(hello)" evaluates the function with its arguments. So that generates "olleh". But if a function is not at the beginning of an expression, a new expression can be created using []s. []s are often used to chain functions in a row. "reverse(hello)reverse(goodbye)" evaluates to "ollehreverse(goodbye)". So add []s for "reverse(hello)[reverse(goodbye)]" or be a bit clearer with "[reverse(hello)][reverse(goodbye)]" to get "olleheybdoog"
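
Here's a toy evaluator in Python that follows just those two rules: a function is only recognised at the start of an expression, and []s start a new expression anywhere. It is a sketch under heavy assumptions (two functions, no escapes, no registers), not how any real server parses, and evaluate, _expr and FUNCTIONS are my own names.

```python
import re

FUNCTIONS = {
    "reverse": lambda s: s[::-1],
    "add": lambda *nums: str(sum(int(n) for n in nums)),
}
NAME = re.compile(r"([a-z]+)\(")


def _expr(text, pos, stops):
    """Evaluate from pos until a stop character; return (result, new_pos)."""
    out = []
    at_start = True
    while pos < len(text) and text[pos] not in stops:
        if text[pos] == "[":
            # Brackets open a fresh expression, wherever they appear.
            inner, pos = _expr(text, pos + 1, "]")
            pos += 1  # skip the closing ']'
            out.append(inner)
        elif at_start and (m := NAME.match(text, pos)) and m.group(1) in FUNCTIONS:
            # A function call only counts at the start of an expression.
            args = []
            pos = m.end()
            while pos < len(text):
                arg, pos = _expr(text, pos, ",)")
                args.append(arg)
                if pos >= len(text):
                    break
                closing = text[pos]
                pos += 1
                if closing == ")":
                    break
            out.append(FUNCTIONS[m.group(1)](*args))
        else:
            out.append(text[pos])
            pos += 1
        at_start = False
    return "".join(out), pos


def evaluate(text):
    return _expr(text, 0, "")[0]


print(evaluate("reverse(hello)reverse(goodbye)"))
print(evaluate("[reverse(hello)][reverse(goodbye)]"))
```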

There are registers that the function parser understands. They are denoted by a percent sign followed by a character. e.g: "%#" is the enactor. "%!" is the executor. Arguments passed to the evaluation are %0-%9.

[reverse(hello)] -> olleh
[switch(2,1,one,2,two,3,three)] -> two. (Flow control)
[elements(one two three four,3)] -> three. (Array manipulation)
[elements(one two three four five,2 2 5 3 2)] -> two two five three two. (Hey, neat!)
[randword(one two three four)] -> A random word from "one two three four"
[add(1,2)] -> 3
... etc

The Queues
There is a player queue and an object queue. These contain command lists that "need to be executed asap". They are processed in a FIFO order. Commands entered from the socket are processed immediately and are not queued.

Object queue: This is the default queue. When an object queues a command list, it is inserted here.

Player queue: The 'high priority' object queue. The only way to get something into this is: When a player enters a softcoded command, then the command list is placed into the player queue. This is a design decision: Players expecting results from softcoded commands always get them quickly, even if there's a large number of objects with entries in the queue.

When the game pumps the queues, it first looks for queue entries in the player queue, then the object queue.

There are two special queues that aren't processed immediately: The Wait Queue and the Semaphore Queue. The Wait Queue contains queue entries that will be popped at a certain time. The Semaphore Queue contains queue entries that are popped on receiving a notification - But may also have a timeout after which they pop. The semaphore queue is interesting, so we'll go in depth into that later, since we'll need some background in the command parser first.

When a command list is popped from the wait or semaphore queue, it is inserted into the object queue. So it might not run immediately when it is supposed to, if there's a packed object queue.
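
A rough Python model of the pump, to show the priority rule and why a waited entry can run late. The class name Queues, its methods, and the use of a plain number for time are all assumptions of mine; semaphores are left out.

```python
import heapq
from collections import deque


class Queues:
    """Toy model of the player, object and wait queues."""

    def __init__(self):
        self.player = deque()
        self.object = deque()
        self.waiting = []   # heap of (due_time, order, entry)
        self._order = 0

    def wait(self, due_time, entry):
        heapq.heappush(self.waiting, (due_time, self._order, entry))
        self._order += 1

    def pump(self, now):
        """Return the next entry to run at time `now`, or None."""
        # Entries whose time has come go to the BACK of the object queue.
        while self.waiting and self.waiting[0][0] <= now:
            self.object.append(heapq.heappop(self.waiting)[2])
        if self.player:
            return self.player.popleft()
        if self.object:
            return self.object.popleft()
        return None


q = Queues()
q.object.append("object entry")
q.player.append("player entry")
q.wait(2, "delayed entry")
first = q.pump(now=0)    # the player entry jumps ahead
second = q.pump(now=0)
q.object.append("late object entry")
third = q.pump(now=2)    # the delayed entry is due, but queues behind this one
fourth = q.pump(now=2)
print(first, second, third, fourth, sep=" | ")
```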

The command parser

A typical command is: [@]command
A typical action list is a series of commands separated by semicolons.

Please note: When I demonstrate commands, queue, and the like, I ignore the fact that socket commands will treat an entire string as a single command, instead of a command list. If you're seeing what looks like a single command happening where you should be seeing two, then put "@wait 0=" before the string to force evaluation as a command list.

> think reverse(olleh)
hello

In here, 'think' is a command. All it does is take its (evaluated) arguments and display them to the object running the command. Its argument is the expression, "reverse(olleh)". Since think evaluates its arguments, that invokes the function parser on "reverse(olleh)". Which turns it into "hello"

> &color me=blue ; think v(color)
Walker/COLOR - Set.
blue

There are two commands in this command list. "&color me=blue" and "think v(color)". They run in sequence - The first command completes before the next one runs. And so on. "v(color)", as stated in 'attributes' above, returns the value of an attribute. In this case, 'blue'. So the object thinks 'blue'.

> @wait 2=think reverse(olleh)
// (2 seconds pass)
hello

An example of a command that causes another command. There is one command here: "@wait". It has two arguments, separated by the =. The first is the # of seconds to wait, the second is the command that is inserted into the wait pool. In two seconds time, the right hand side will be added to the object queue, at which point it runs. The right hand side of @wait is not evaluated, but when it is dequeued, it is processed as normal. And 'think' evaluates its arguments.

> @switch 2=1,think one,2,think two,3,think three
two

@switch here is used as a control structure for commands. It evaluates the left side to get the test case (in this case, 2), then walks the *unevaluated* arguments on the right hand side in pairs: It evaluates the first of each pair, and if that matches the test case, the unevaluated argument after it is added to the queue, and is then evaluated when dequeued.

> think before @switch ; @switch 2=1,think one,2,think two,3,think three ; think after @switch
before @switch
after @switch
two

... Wait, wtf just happened? That's the sheer horror that is the queue. This whole thing is one command list. When executed, all the commands in this command list are executed in a sequence. That's three commands: think before, @switch, and think after. As they are executed: @switch inserts a command into the queue (it isn't run immediately). After this command list is complete, the next item in the queue is popped out and executed. In this case, it is "think two".
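
If it helps, here is the same thing as a toy Python simulation. It only knows 'think' and a bare-bones '@switch' with exact string matching, and the function name run is mine; it's a sketch of the ordering, nothing more.

```python
from collections import deque


def run(command_list):
    """Toy model: run a command list, letting @switch queue its matches."""
    output = []
    queue = deque([command_list])
    while queue:
        entry = queue.popleft()  # FIFO
        for command in (c.strip() for c in entry.split(";")):
            if command.startswith("think "):
                output.append(command[len("think "):])
            elif command.startswith("@switch "):
                test, _, rest = command[len("@switch "):].partition("=")
                parts = rest.split(",")
                # Cases come in (pattern, action) pairs.
                for pattern, action in zip(parts[0::2], parts[1::2]):
                    if pattern.strip() == test.strip():
                        queue.append(action)  # queued, NOT run right now
    return output


result = run("think before @switch ; "
             "@switch 2=1,think one,2,think two,3,think three ; "
             "think after @switch")
print(result)
```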

> think before ; @dolist one two three=think ## ; think after
before
after
one
two
three

Same queue thing here. @dolist is a command that: For each argument in the left hand, creates a new queue entry using the right hand, with all instances of '##' replaced with the current argument. (And '#@' replaced with the current position).
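
The substitution step on its own, sketched in Python. dolist_entries is a made-up name, and counting positions from 1 is my assumption; each string it returns stands for one new queue entry.

```python
def dolist_entries(items, action):
    """Build one queue entry per list element, as described for @dolist."""
    entries = []
    for position, item in enumerate(items.split(), start=1):
        # '##' becomes the current element, '#@' its position.
        # Assumption: positions count from 1.
        entries.append(action.replace("#@", str(position)).replace("##", item))
    return entries


print(dolist_entries("one two three", "think ##"))
```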

> think one ; @assert 0 ; think two
one

@assert (and its more negative companion, @break) halts execution of the current command list if its left hand argument evaluates to a false expression. (@break halts if it evaluates to a true one). If it is given a right hand argument, it replaces the command list with that, instead.

> think one ; @assert 0=think interrupted! ; think two
one
interrupted!

Like so.
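
A Python sketch of that halting rule. It handles only 'think', '@assert' and '@break', treats just the empty string and "0" as false (my assumption), and swaps in the right hand side as the rest of the command list. run_list is a name I made up.

```python
def run_list(command_list):
    """Toy model of think, @assert and @break inside one command list."""
    output = []
    pending = [c.strip() for c in command_list.split(";")]
    while pending:
        command = pending.pop(0)
        name, _, rest = command.partition(" ")
        if name == "think":
            output.append(rest)
        elif name in ("@assert", "@break"):
            condition, has_replacement, replacement = rest.partition("=")
            # Assumption: only "" and "0" count as false in this sketch.
            truthy = condition.strip() not in ("", "0")
            halt = truthy if name == "@break" else not truthy
            if halt:
                # Drop the rest of the list; run the replacement if given.
                pending = [replacement.strip()] if has_replacement else []
    return output


print(run_list("think one ; @assert 0 ; think two"))
print(run_list("think one ; @assert 0=think interrupted! ; think two"))
```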

Queue Entry / Process Entry
Every function that's evaluated, command that's executed, lock that's tested ... is within a queue entry. A queue entry contains the following information for the commands, functions and locks evaluated under its purview:

* Enactor. (%#)
* Caller. (%@)
* Executor. (%!)
* Arguments. (%0-%9)
* # of arguments. (%+)
* Regular Expression patterns from the current regexp match, if any. ($0-$9, $)
* The command given. (%c, %u)
* 36 Q-registers. These are temporary data registers that can be set and fetched by functions. Similar to thread-local variables. (%q0-%q9, %qa-%qz)
* Some other data useful only for information, debugging or similar purposes: Function call count so far, recursion depth, etc.

Some commands copy queue entries entirely. e.g: @wait, @switch, @dolist. The only things changed are the command given and the debugging information.

Other commands create entirely new queue entries. @force, @trigger.

Functions (or command side effects) that call and evaluate attributes on other objects or on themselves will modify caller and executor. e.g: if A calls u(B/foo), then within the confines of evaluating foo, %@ is A, while %! is B. If B/foo then calls u(bar) - evaluating 'bar' on B, then within the confines of evaluating bar, "%@" is B as well.

The $-command matcher

There are hardcoded commands: The ones the mush knows, and there are softcoded commands - That you can create. Softcoded commands, however, are second class citizens (in more than one way).

So how does it find commands: As I said above with attributes, an attribute beginning with "$" is a mushcoded command. (Typically called $-commands).

When a string is executed as a command, here's the flow:
1) If ] is the first character, then the whole command is rendered "no-eval", so steps (3) and (5) don't happen, and most hardcoded commands behave differently: Not automatically evaluating their arguments when they otherwise would. Advance to second character (remove ']' from the equation)
2) If the first character is special, a hardcoded command is executed. (eg: '"' for 'say') Skip the rest of this list.
3) The first word of the command is evaluated by the function parser.
4) If that first word matches a hardcoded command, execute that command and skip the rest of this list.
5) The rest of the expression you enter is evaluated by the function parser.
6) It searches the attributes of all objects in your contents, your vicinity (including yourself), then if it still receives no match, looks at 'global' objects for the commands. These global objects exist in a room (typically #2).
7) If no command matches, you get an error. (Huh? (Type "help" for help))

If it does find a matching $-command, it inserts its command list into the queue for processing. - The object queue by default, but if the player typed the $-command at the console, then it inserts into the player queue.

(For new mushcoders: make sure you're !no_command and permitted to run commands on yourself):
> @set me=!no_command
> @lock/use me==me
> @lock/command me==me

> &cmd_hello me=$hello:think Good morning!

That sets an attribute on you that will match the string "hello" (the entire string, and in any case). So with this, you can do:

> hello
Good morning!

You can use glob patterns and regexps for this, which will let you obtain user input.

> &cmd_reverse me=$reverse *:think The reverse of '%0' is '[reverse(%0)]'
> reverse hello
The reverse of 'hello' is 'olleh'
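
Roughly what the matcher has to do with a glob-style $-command, sketched in Python. It only understands '*', matches the whole string case-insensitively, and hands back the captured text that would become %0, %1 and so on. The helper names are mine, and real wildcard matching on a server has more rules than this.

```python
import re


def glob_to_regex(pattern):
    """Turn a '*' glob into an anchored, case-insensitive regex."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.*)".join(parts) + "$", re.IGNORECASE)


def match_command(attr_value, typed):
    """Return (action, args) if `typed` matches a '$pattern:action' value."""
    if not attr_value.startswith("$"):
        return None
    pattern, _, action = attr_value[1:].partition(":")
    m = glob_to_regex(pattern).match(typed)
    if m is None:
        return None
    return action, list(m.groups())  # args[0] plays the role of %0


cmd_reverse = "$reverse *:think The reverse of '%0' is '[reverse(%0)]'"
print(match_command(cmd_reverse, "reverse hello"))
```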

Regexp-style commands are available as well: They give you more power and more headaches.

And now we come to another reason for second-class citizenhood for $-commands: The dreaded queue.

> &cmd_hello me=$hello:think Good morning!
> &cmd_test me=$test:think before hello ; hello ; think after hello
> think before test ; test ; think after test
before test
after test
before hello
after hello
Good morning!

So what's happening here? "think before test ; test ; think after test" gets queued, and executed as a command list. During that time, 'test' searches for and finds the matching cmd_test on yourself, and queues "think before hello ; hello ; think after hello". After it completes the current command list, it pops out that one and runs 'think before hello', finds and queues "think Good morning!" then runs "think after hello". Then finally pops out "think Good morning!" and executes that. So be careful with the queue!
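
The two levels of deferral are easier to see in a toy Python model. Here softcode is just a dict from an exact command string to its command list (a big simplification of $-command matching), and run_with_softcode is a name I made up.

```python
from collections import deque


def run_with_softcode(command_list, softcode):
    """Toy model: 'think' runs now, a softcoded command queues its list."""
    output = []
    queue = deque([command_list])
    while queue:
        for command in (c.strip() for c in queue.popleft().split(";")):
            if command.startswith("think "):
                output.append(command[len("think "):])
            elif command in softcode:
                queue.append(softcode[command])  # runs after this list ends
    return output


softcode = {
    "hello": "think Good morning!",
    "test": "think before hello ; hello ; think after hello",
}
lines = run_with_softcode("think before test ; test ; think after test", softcode)
print("\n".join(lines))
```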

Pemits and listen patterns:

It wouldn't be a game if there wasn't anything to read, see, etcetera! So that's why we have pemit messages. You can send messages to any object. If that object is a connected player, it's further sent to the socket for your reading pleasure! Because of the frequency of this, players do not have listen patterns.

But objects *do* have listen patterns. You can look at it as a noisy version of the command matching program, as it is otherwise identical! Instead of using $s for commands, you use ^s to denote listen patterns.

> @create Walmart Greeter
> @set Walmart Greeter=monitor
> &listen_wave Walmart Greeter=^* says, "Hello":say Hello, [name(%#)]!
> say Hello
You say, "Hello"
Walmart Greeter says, "Hello, AndStateYourName!"

(Replace AndStateYourName as appropriate. ;) Again, this suffers from the same problems of the queue as $-commands do.

The Semaphore Queue

The Semaphore Queue is dependent on object/attributes. (What? Yup! Wow!). A semaphore attribute with a positive value has commands waiting. A 0 value is no commands waiting. A negative value is special - It means "When a semaphore entry is created for this attribute, pop it immediately!" When an entry to a semaphore queue is created, it adds one to the attribute. When it leaves, it removes one. If an @notify is sent to a semaphore attribute, then it causes one to pop. Rather than setting an attribute to 0, the semaphore queue will remove it. If no semaphore name is given, the default is used: 'semaphore' (Clever, eh?)

As such:

> @wait me/q1=think one
> ex me/q1
Q1 [#1ic+]: 1
> @wait me/q1=think two
> ex me/q1
Q1 [#1ic+]: 2
> @notify me/q1
one
> @notify me/q1
two
> ex me/q1
No such attribute.
> @notify me/q1
> ex me/q1
Q1 [#1ic+]: -1
> @wait me/q1=think one
one <-- This is executed immediately.
> ex me/q1
No such attribute.
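
The counting in that transcript can be modelled in a few lines of Python. This is a sketch of the bookkeeping only (no timeouts), with a count of 0 standing in for "attribute not set"; the class and method names are mine.

```python
from collections import deque


class Semaphore:
    """Toy model of one semaphore attribute and its waiting entries."""

    def __init__(self):
        self.count = 0        # 0 models "No such attribute."
        self.waiting = deque()
        self.ran = []         # stands in for entries handed to the object queue

    def wait(self, entry):
        self.count += 1
        if self.count <= 0:
            # A notify was banked earlier, so this pops immediately.
            self.ran.append(entry)
        else:
            self.waiting.append(entry)

    def notify(self):
        self.count -= 1
        if self.waiting:
            self.ran.append(self.waiting.popleft())


sem = Semaphore()
sem.wait("think one")
sem.wait("think two")      # count is now 2
sem.notify()               # pops "think one"
sem.notify()               # pops "think two", count back to 0
sem.notify()               # nothing waiting: count goes to -1
sem.wait("think one")      # runs immediately, count back to 0
print(sem.ran, sem.count)
```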

The Semaphore allows for some more fine grained control over programming. Remember @dolist above? What if you wanted some code to run when the list was finished processing? Adding the "/notify" switch to the @dolist causes it to notify the default semaphore queue when the list is completed processing.

> @wait me=think All done! ; think Starting dolist: ; @dolist/notify one two three=think ##
Starting dolist:
one
two
three
All done!

Permission
There are many different ways in which permission is needed and checked for. If you have permission to do one thing, you don't necessarily have it to do another.

General categories for permission:
- Default: Who can pick this object up or drop it. Or who can enter an exit.
- Use: This covers commands, listens, ufuns and the ability to 'use' an object.
- Examine: Who can see the attributes of an object?
- Control: Who can change things on an object?
- Evaluate: Who can evaluate things on an object? (using u())
- Command: Who can run the $-commands on an object?
- Listen: Who can trigger the listen patterns on an object?

Permission control flows like this:

1) Wizard objects and players control everything except God or other Wizards they don't own. (God actually has very few powers, but is able to set others Wizard. If he sets himself wizard, that's a different story) They have read-write access to everything.
2) Royalty flag and see-all power grant objects the ability to see everything, but not control.
3) Owners of an object control an object. An object has one owner, but can be zoned to a zone that has multiple owners, and thus have multiple owners.
4) An object set visual can have everything on it read.
5) You control an object if you pass its @lock/control (Caveat: Unless the object is wizard.)
6) You can read attributes on an object if you pass its @lock/examine
7) You can call and evaluate attributes on any object you control, or if the attribute is set visual.
8) An object set HALT cannot use the function parser and will not be checked for anything.

In addition, there are permission controls related to being able to run an object's $-commands and listen patterns.
To run an $-command: pass @lock/command, @lock/use and the object must not be set no_command.
To trigger a listen pattern: pass @lock/listen, @lock/command, @lock/use, and the object must be set monitor.

There are a large number of flags that impact permission in different areas. Really - Control, Permission and flags deserve almost a whole document all to themselves.

Locks:
"Locks" allow fine grained control over an object's permissions. They aren't used to determine what the object can control, but what can control the object. This is a very simple language that looks similar to logic statements, because that's all they are.

Syntax: @lock[/<locktype>] <object>=<key> - Sets the <locktype> lock on <object> to <key>

Sample locks:
=#dbref -- Only the object identified by #dbref can pass it.
!=#dbref - Everybody except the object identified by #dbref can pass it.
#dbref -- Only the object identified by #dbref, or somebody carrying it, can pass it.
=#dbref1 | =#dbref2 - The | is an 'or'.
FLAG^WIZARD - Only objects with the wizard flag can pass this lock
$#dbref - Only objects owned by #dbref can pass this lock.
FOO:1 - Only objects that have an attribute named "FOO" with the value "1" can pass this.
FOO:>1 - Only objects with attribute named "FOO" that have a numeric value higher than 1 can pass this.
FOO/1 - This is different. It evaluates attribute FOO on the object that the lock is on, and if it returns 1, then that's a pass.

Logical control structures: !, &, |, and ().
@lock womens bathroom=(job:janitor&sex:male)|sex:female
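
Since locks really are just logic statements, a tiny recursive-descent evaluator covers the core of them. This Python sketch handles !, &, |, (), "attr:value" and "=#dbref" only, binds ! tightest and | loosest, and compares attribute values case-insensitively; those choices, and the names passes and who, are my assumptions rather than the server's rules.

```python
import re

TOKEN = re.compile(r"\s*([()!&|]|[^()!&|\s][^()!&|]*)")


def passes(lock, who):
    """Does `who` (a dict with 'dbref' and attributes) pass the lock?"""
    tokens = [t.strip() for t in TOKEN.findall(lock)]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def check(key):
        if key.startswith("="):
            return who.get("dbref") == key[1:]
        attr, _, value = key.partition(":")
        return who.get(attr.lower(), "").lower() == value.lower()

    def parse_not():
        if peek() == "!":
            take()
            return not parse_not()
        if peek() == "(":
            take()
            result = parse_or()
            take()  # the closing ')'
            return result
        return check(take())

    def parse_and():
        result = parse_not()
        while peek() == "&":
            take()
            rhs = parse_not()  # always parse, even if result is already False
            result = result and rhs
        return result

    def parse_or():
        result = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()
            result = result or rhs
        return result

    return parse_or()


lock = "(job:janitor&sex:male)|sex:female"
print(passes(lock, {"dbref": "#7", "job": "janitor", "sex": "male"}))
print(passes(lock, {"dbref": "#8", "job": "clerk", "sex": "male"}))
```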