On Software Engineering for MUSHCode

Mark, Sun, 2007-06-24 16:55

Taken from my writeup on ES in response to a question on courses for the non-CS person.

--

Tough question. Software design, from my experience, comes down to learning how to apply methodologies and techniques based on the problem. The application is often as much art as it is a science so most academic courses appear useless until experience is gained. A bit of a catch-22 but the coursework is often useful later on when it finally dawns on you what the point was.

If I were to suggest something, it would be a general introduction to software engineering and object-oriented design. The latter doesn't always fit the bill for software. However, the techniques are useful for taking system requirements and dividing the system out into discrete components no matter what the final decision is on implementation.

My general thoughts on the topic include:

Preliminary Design

Before coding, I highly suggest a written description with a length matching the complexity of the system. It should not contain any discussion of how it works ... just what it needs to do. Then start listing the requirements of the system -- what it should do and what it should not do. Both are very important. For systems with multiple authority levels, such as globals, dividing the list into discrete sections for mortals and staff/wizards/etc. will help.

After you get that written, hand it off to others to review if you can. Since it is nothing more than a description you can give it to anyone -- even those that don't code. Use people with a good eye for details especially for the requirements review. Including the end-users of the system will greatly aid in the design process. They may run off on a tangent but hopefully they'll identify any weaknesses in the description or requirements you overlooked. Incorporate all the relevant comments and update the document. Repeat the process if significant changes are introduced.

To demonstrate the usefulness of doing up-front preliminary design, jot down what you'd do for a common +who or +where command. Compare it to an operational system. Those commands are not complex systems but do contain significantly different functionality depending on who executes the command -- a player or a staffer.

Detailed Design

After the basic design is complete, the detailed design begins. The system is subdivided into smaller elements or modules of functionality. This step is the beginning of the code but it is not functional code ... it is pseudo-code -- somewhat descriptive and somewhat code. List every element of the system and describe precisely how it will work. Refer back to the preliminary design to validate that it does what is intended and doesn't do things that were not intended. Watch for functionality that is repeated in several areas and move anything that is reused into a discrete function.

Write a detailed specification for every command based on its intended purpose and restrictions it imposes. Document any and all shared functionality ... these bits, even if trivial will become the building blocks of the system. Learn to build complex functionality based on smaller elements.

Detailed design is often skipped by experienced coders because they can mentally assemble what is needed and how it will work. It is nonetheless key to learning how to divide code into necessary elements and shared functionality. Making the decision to divide shared functionality into discrete elements is experience-based. If you see the same code more than once in a system, it is better to make it a discrete element.

Consider something as benign as a check to see if a user of a command is a wizard. While it is trivial code, if that function is separated, tomorrow you can expand the code to check if the user is a wizard or has the royalty or staff flag without much effort. A simple function to check for wizards is better off as checking for an authorized user rather than assuming wizards will be the only users of the functionality.
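As a minimal sketch of that idea, written in Python rather than MUSHcode so it can be run outside a game: the flag names and the example command below are hypothetical, not taken from any particular codebase.

```python
# Hypothetical flag names; a real game would define its own staff flags.
AUTHORIZED_FLAGS = {"wizard", "royalty", "staff"}


def is_authorized_user(user_flags):
    """Return True if the user holds at least one authorizing flag."""
    return any(flag.lower() in AUTHORIZED_FLAGS for flag in user_flags)


def do_shutdown_warning(user_flags):
    """Example command: it asks 'is this user authorized?', never 'is this a wizard?'."""
    if not is_authorized_user(user_flags):
        return "Permission denied."
    return "Warning sent."
```

Widening or narrowing who counts as authorized then means editing one set, and every command that calls the check picks up the change.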

Coding

Pick a style. Stick with it. Never deviate from the style within a system. Many articles have been written on this topic. If modifying a preexisting system, match the style of the original coder to the best of your ability to keep the system cohesive for the person after you.

Verbosity in naming is your friend. It is tedious when writing the initial code but is immensely useful when you need to fix something in a few months. Consider a function named like FN_USER_OK vs FN_AUTHORIZED_USER. The former might be a validation check or an authority check. The latter makes it clear what the function does.

Always prefer to refactor code as you write it. If you write the same block of code more than once or see an opportunity to cut and paste ... move that code into its own function -- and FIX the original block to use the new function. Do not allow yourself the luxury of cut and paste. Keep all code short and simple.

Along the same lines, if there are minor differences between several similar functions, consider how you can make the basic code more generic to support all the cases. Identify if passed arguments are moot or unique. Fix the earlier code.

Document the code to the best of your ability. It is difficult inline with MUSHcode but can be accomplished by using good names for attributes. Consider utilizing one of the many offline formatting techniques for larger systems. Do NOT mix online coding with offline coding. Eventually you will forget to merge a change and chaos will ensue.

Prefer to write obvious code rather than intricate optimizations unless absolutely necessary. Code that is readable is supportable. Most optimizations are neither necessary nor supportable when you think of them as you code. If the code is well written, optimization can be done as needed for scalability. The key is to make it work in a fashion that anyone can read it. Premature optimization is most likely to make it less maintainable by you and by others.

Do not make assumptions. If you do not know the answer on a requirement, find it before you code it. Avoiding assumptions makes the system flexible, maintainable and hopefully scalable. If you must make an assumption, parameterize that assumption independently of the code.

Testing

Test both code and the system. I tend to write code in increments and poke at functions and commands as they are completed. That ensures I have functional code that does what I expect but it does not ensure the code is correct for the system.

Incremental testing is extremely useful and avoids a lot of code problems. Test every bit of code you write. Make sure it does what is expected but also test with garbage to make sure it handles that as well.
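To make "test with garbage" concrete, here is a small sketch in Python (not MUSHcode). The function parse_ship_id is a hypothetical stand-in for any small piece of a system that takes user input; the point is the list of inputs tried against it, not the function itself.

```python
def parse_ship_id(text):
    """Return a positive integer ID, or None if the input is not one."""
    text = text.strip()
    # Accept plain ASCII digits only; anything else counts as garbage.
    if not (text.isascii() and text.isdigit()):
        return None
    value = int(text)
    return value if value > 0 else None


# Expected input first, then the garbage a player might actually type.
EXPECTED = {"123": 123, " 42 ": 42}
GARBAGE = ["", "   ", "-5", "0", "12abc", "#123", "1.5", "one"]

for text, wanted in EXPECTED.items():
    assert parse_ship_id(text) == wanted
for text in GARBAGE:
    assert parse_ship_id(text) is None
```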

Then test with the expected and unexpected. I find the latter hard to do but there are many individuals who are skilled at it. Ask other staffers and trusted players to do their best to break it. Let them find nuances you didn't expect.

Rinse, Wash, Repeat

None of this comes overnight. The capability increases with experience. Even after years of experience, you are unlikely to be skilled in all areas. You'll do better in all of them but will probably find yourself lacking in one or more. That's okay. Concentrate on improving that area and ask for help if you discover you aren't very good at it.

Code tip: Data Factory

Amberyl, Wed, 2009-01-14 06:57

Data handling is one of the most awkward things in MUSH. You want your data to be compact, so you want to try to avoid splattering it across a zillion individual attributes. But you also need your data model to be flexible, so that you can add fields to your data structure over time. If you shove the entirety of a data structure into a list, you can often end up with code that's hard to write and debug, because you're constantly trying to find and edit elements embedded within that list.

My belief is that one of the reasons that people find MUSHcode extremely time-consuming to write, as well as hard to maintain, is that their data models, and the way they handle, store, and manipulate data, simply aren't very good. Moreover, it is incredibly easy to write obfuscated MUSHcode.

My solution to this is a layer of what I call Data Factory code. What follows is an explanation plus the code for it.

The Data Factory Concept

At minimum, every data entity consists of a data structure type, its ID number (specific to type), the dbref of the object that it is stored on, the structure version number, an owner, the last object to edit it, and the last edit time. Every data entity also contains a set of named fields, specific to its type (and to the version number of that type).

So, for instance, I can have a data structure type called "ship". Version 1 of the ship structure might just contain "xpos ypos" as its two named additional elements, indicating the ship's coordinates. Version 2 might be "xpos ypos homeport"; using this system, I can make that modification and have the data migration to the new format be automatic and invisible.

In addition to a list of named fields, data structures can be customized with a set of labels for those fields (used for the default print function for that type), and a set of defaults to populate in those fields when a new instance of that type is created. I can also set a maximum number of IDs to store on a data object for entities of that type. If this is set, the MUSH automatically creates an additional data object every time that maximum is exceeded.

Every time I create or load a data entity, I can give it an arbitrary identifier, like "myship". All fields within the entity are accessed as identifier.field -- so, for instance, if identifier "myship" is of type "ship", then the global register myship.homeport is the value of its homeport field.

My generic MUSHcode SAVE_FN takes an identifier and saves it. Since the fields are just global registers, you can change them with setq() and friends, and the save will write them out automatically; since the entity carries its ID and datastore object on it as part of its fields, the programmer does not have to worry about the details. Similarly, when you call LOAD_FN, you just need to provide a type and the ID; it knows how to derive the data object it's stored on.

Because I'm working with named global registers, and each such register is very clearly associated with a particular entity (attacker.weapon is obviously different than defender.weapon, say), making any changes to data is just a matter of writing to a register, then saving the whole blob. It saves me a gigantic amount of time, and the resulting code is a lot easier to read, too.

So, to take the ship example, if I wanted to create a new ship, owned by me (the enactor), and give it initial coordinates of (4201, 6250) and a home port of Mars, and be able to readily refer to it as 'newship' in the remainder of this block of code, I'd do this sequence of functions (the setq calls can be combined if desired):

[u(NEW_FN, %#, newship, ship)]
[setq(newship.xpos, 4201)]
[setq(newship.ypos, 6250)]
[setq(newship.homeport, Mars)]
[u(SAVE_FN, newship)]

I can access the newly-saved ship's ID with r(newship.id) and that number is the way I can access that data later. So at some future point, if I know that this is ship ID 123, I could do [u(LOAD_FN, oldship, ship, 123)] to load it with the identifier "oldship", allowing me to access the data with r(oldship.xpos), and so on. If I want to save any modifications, it's as easy as calling [u(SAVE_FN, oldship)].

This kind of technique can not only save you a lot of coding time, but it can also make your code much more readable.

The Data Factory Code

This code is for TinyMUSH 3.1, but can be adapted for any current MUSH flavor. It's intended to be quoted through Adam Dray's Unformat. (There are places where %q substitutions have been done with r() instead, to make this easier to post in HTML format.)

If you make heavy use of this, instantiating lots of different entities in a single pass, you may need to bump up the number of global registers that an object is allowed to use.

For convenience, let's start by defining a couple of global named references, which allow us to assign names to dbrefs. #67 and #68 just happen to be the dbrefs on my database; substitute whatever objects for yours. If your codebase doesn't support nrefs, just use normal dbrefs; the nrefs simply allow us to use #__factory and #__meta to refer to the objects.

@reference _factory = #67
@reference _meta = #68

#__factory is the object that's going to contain our data factory code, along with our data definitions. #__meta is for our metadata; it's the object that is going to contain all the actual runtime data, like where the actual entity data is stored, and how many IDs of a particular type we've created. We're also going to use the metadata object as a storage container for the data objects. Both of these objects should be flagged HALT and SAFE.

We'll begin with the function that creates a brand-new entity. Every entity has an ID number, which is permanent; this is distinct from its identifier, which is the handle for the instance. We can normally allow the function to simply take care of where to store the data, but it can take an optional final parameter naming the data object to use. It will also populate the new entity with the defaults for that type.

# Call as: u(NEW_FN, owner_dbref, identifier, type)
# Returns: Nothing.
# Side-effects:
#   - Sets identifier.various_fields registers to type defaults.
#   - Increments the top ID of the type.
#   - May create a new data object, if we've exceeded a max-IDs breakpoint.
#
# %0 - owning player, %1 - string identifier
# %2 - data structure type, %3 - data object (optional)
#
&NEW_FN #__factory=
[setq(%1.owner,%0)]
[setq(%1.editor,#-1)]
[setq(%1.etime,secs())]
[setq(%1.type,%2)]
[setq(%1.obj,usetrue(%3,last(setr(lo,get(#__meta/%2_OBJ)))))]
[setq(%1.id,inc(get(#__meta/%2_TOP)))]
[set(#__meta,%2_TOP:[r(%1.id)])]
[nonzero(cand(notbool(%3),
              setr(ld,v(%2_MAX)),
              gte(r(%1.id),r(ld)),
              eq(mod(r(%1.id),r(ld)),0)
         ),
  [setq(%1.obj,create([capstr(lcstr(%2))] Data [inc(words(r(lo)))],10))]
  [set(#__meta,%2_OBJ:[r(lo)] [r(%1.obj)])]
  [set(r(%1.obj),halt)][set(r(%1.obj),safe)][tel(r(%1.obj),#__meta)]
)]
[qvars(iter(v(%2_DATA),%1.##),v(%2_DEF),`)]
-

Creating an entity doesn't actually save it permanently; the assumption is that you'll create, alter the fields as need be, and then save it. So our next thing is our save function, which we call with the entity's identifier, and the dbref of the player (or object) that we want to note is responsible for the change.

# Call as: u(SAVE_FN, identifier, editor_dbref)
# Returns: Nothing.
# Side-effects: Saves the entity to an object, as attr type_ID
#
# %0 - identifier, %1 - editor
#
&SAVE_FN #__factory=
[case(,
  r(%0.obj),#-1 NO OBJECT,
  r(%0.id),#-1 NO ID,
  set(r(%0.obj),
    [r(%0.type)]_[r(%0.id)]:
      [default([r(%0.type)]_V,1)]`[r(%0.owner)]`[usetrue(%1,%#)]`[secs()]`
      [iter([v([r(%0.type)]_DATA)],edit(r(%0.##),`,'),,`)]
  )
)]
-

As a word of warning, because the backtick ` is used to separate data fields, you need to make sure to clean all ` out of your data before saving it. The code automatically does this for you, at the moment, replacing ` with ' at save time. If you want to worry about doing that yourself, replace the line:

[iter([v([r(%0.type)]_DATA)],edit(r(%0.##),`,'),,`)]

with:

[iter([v([r(%0.type)]_DATA)],r(%0.##),,`)]

and keep your data clean by checking it before saving it.
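The stored format is easier to see outside MUSHcode. The following is a rough Python sketch of the same layout (version, owner, editor, edit time, then the type's fields, all joined by backticks); the function names are hypothetical and it only models the string handling, not attribute storage.

```python
SEPARATOR = "`"
HEADER = ["v", "owner", "editor", "etime"]


def pack_entity(version, owner, editor, etime, field_values):
    """Join the header and the field values into one backtick-separated string."""
    parts = [str(version), owner, editor, str(etime)]
    # Replace any backtick inside a value so it cannot be mistaken for a separator.
    parts += [str(value).replace(SEPARATOR, "'") for value in field_values]
    return SEPARATOR.join(parts)


def unpack_entity(packed, field_names):
    """Split a packed string back into a dict keyed by header and field names."""
    names = HEADER + list(field_names)
    return dict(zip(names, packed.split(SEPARATOR)))


packed = pack_entity(1, "#5", "#5", 1231916220, ["Bob`s Ship", 4201, 6250, "Mars"])
ship = unpack_entity(packed, ["name", "xpos", "ypos", "homeport"])
```

Without the replacement step, the stray backtick in the ship's name would add an extra field on the way back in and shift every value after it into the wrong slot.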

Now that we can save data, we need to be able to load it. We call this with the identifier we want to associate with this instance of the entity, the entity's type, and the entity's ID. We can normally allow it to just figure out what data object to read it from, but we can also specify it with an optional final parameter. Our load function is also able to automatically migrate data in an old format to the current version of that type. (Note that when we update the data format, we need to keep the attribute type_DATA_version on #__factory in order to know how to load that previous version.)

# Call as: u(LOAD_FN, identifier, type, ID)
# Returns: Nothing.
# Side-effects:
#   - Success: Sets identifier.various_fields registers to data.
#   - Failure: Sets identifier.various_fields registers to null.
#
# %0 - string identifier, %1 - data structure type, %2 - ID, %3 - data object
#
&LOAD_FN #__factory=
[nonzero(neq(words(setr(lo,usetrue(%3,get(#__meta/%1_OBJ)))),1),
  setq(lo,extract(r(lo),inc(div(%2,v(%1_MAX))),1))
)]
[nonzero(setr(ld,get(r(lo)/%1_%2)),
  nonzero(qvars(iter(v owner editor etime [v(%1_DATA)],%0.##),r(ld),`),
    /@@ read failed - wrong data version - upgrade automatically @@/
    [qvars(iter(v(%1_DATA),%0.##),v(%1_DEF),`)]
    [qvars(iter(v owner editor etime [v(%1_DATA_[first(r(ld),`)])],%0.##),
           r(ld),`
    )]
  ),
  /@@ no data, return empty @@/
  [setq(%0.v,-1)][setq(%0.owner,#-1)][setq(%0.editor,#-1)][setq(%0.etime,-1)]
  [null(iter(v(%1_DATA),setq(%0.##,)))]
)]
[setq(%0.type,%1)][setq(%0.id,%2)][setq(%0.obj,r(lo))]
-
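The migration step can be modeled in a few lines of Python. This is a sketch under stated assumptions, not a translation: the type definition below is hypothetical, and it picks the field list from the stored version number directly, whereas LOAD_FN first tries the current field list and only falls back to the older one when that read fails.

```python
# Hypothetical type definition, mirroring the two-version "ship" example.
FIELDS_BY_VERSION = {
    1: ["xpos", "ypos"],
    2: ["xpos", "ypos", "homeport"],
}
DEFAULTS = {"xpos": "100", "ypos": "100", "homeport": "Earth"}
HEADER = ["v", "owner", "editor", "etime"]


def load_entity(packed):
    """Unpack a stored record, filling in fields that were added in later versions."""
    values = packed.split("`")
    stored_version = int(values[0])
    names = HEADER + FIELDS_BY_VERSION[stored_version]
    if len(values) != len(names):
        raise ValueError("record does not match its declared version")
    # Start from the current defaults so newer fields are populated,
    # then overlay whatever the stored record actually contains.
    entity = dict(DEFAULTS)
    entity.update(zip(names, values))
    return entity


old_ship = load_entity("1`#5`#7`1231916220`4201`6250")
new_ship = load_entity("2`#5`#7`1231916220`4201`6250`Mars")
```

Here old_ship picks up the default home port, while new_ship keeps the one it was saved with.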

It's useful to have a wrapper function that does a load, and tells us whether the load succeeded or not. So we make a function that returns 0 or 1, indicating failure and success. We'll probably rarely call LOAD_FN directly, since we usually care about knowing whether or not we have an error to handle.

# Call as: u(OK_LOAD_FN, identifier, type, ID)
# Returns: 0 if the load failed, and 1 if the load succeeded.
# Side-effects:
#   - Success: Sets identifier.various_fields registers to data.
#   - Failure: Sets identifier.various_fields registers to null.
#
# %0 - string identifier, %1 - data structure type, %2 - ID, %3 - data object
#
&OK_LOAD_FN #__factory=[u(LOAD_FN,%0,%1,%2,%3)][gt(r(%0.v),0)]
-

In many cases, we'll want to check whether a particular entity exists or not, before attempting to do some operation on it. So we have a function that simply checks whether a given ID number of a specific type exists (where "exists" means "has been saved and exists as an attribute on the data object"). As usual, we can let the function just take care of finding the appropriate data object, but it can be specified as an optional final parameter if desired.

# Call as: u(EXISTS_FN, type, ID)
# Returns: 0 if ID of type does not exist, and 1 if it does.
# Side-effects: None.
#
# %0 - data structure type, %1 - ID, %2 - data object
#
&EXISTS_FN #__factory=
[nonzero(neq(words(setr(lo,usetrue(%2,get(#__meta/%0_OBJ)))),1),
  setq(lo,extract(r(lo),inc(div(%1,v(%0_MAX))),1))
)]
[hasattr(r(lo),%0_%1)]
-
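Both LOAD_FN and EXISTS_FN find the data object by dividing the ID by the per-object maximum. A minimal Python sketch of that arithmetic follows; the function name and the dbrefs in the usage lines are hypothetical.

```python
def data_object_for(entity_id, data_objects, max_ids):
    """Pick the data object that should hold entity_id.

    data_objects is the ordered list of storage objects for a type and
    max_ids is the per-object limit (the type_MAX setting).
    """
    if len(data_objects) == 1:
        # With a single data object there is nothing to choose between.
        return data_objects[0]
    # 0-based index; the MUSHcode uses inc(div(...)) because extract() is 1-based.
    index = entity_id // max_ids
    if index >= len(data_objects):
        return None  # no data object exists for this ID range
    return data_objects[index]


objects = ["#70", "#71"]
first = data_object_for(99, objects, 100)
second = data_object_for(100, objects, 100)
```

With a maximum of 100, IDs 1 through 99 land on the first object and IDs 100 through 199 on the second.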

One last piece of magic: Every data type can have up to 32 flags; "flags" must be one of the field names chosen in order to enable this. These flags are stored as a bitfield. So we need a couple of functions to manipulate flags.

We create a generic function that we use to set and unset flags, calling it with an identifier and a list of flags that we want to set or unset; to unset a flag, just precede its name with a !.

# Call as: u(FLAGMOD_FN, identifier, list_of_flags)
#          list_of_flags can contain flag and !flag lists
#          This is used to set and unset flags, respectively.
# Returns: Nothing.
# Side-effects: Modifies identifier.flags global register.
#
# %0 - identifier, %1 - flag list
#
&FLAGMOD_FN #__factory=
[setq(%0._f,v([r(%0.type)]_FLAGS))]
[setq(%0._d,elements(%1,matchall(%1,!*)))]
[setq(%0._u,setdiff(%1,r(%0._d)))]
[nonzero(r(%0._d),
  setq(%0.flags,
    bnand(r(%0.flags),
          ladd(iter(r(%0._d),
               iftrue(match(r(%0._f),delete(##,0,1)),power(2,dec(#$)),0)))
    )
  )
)]
[nonzero(r(%0._u),
  setq(%0.flags,
    bor(r(%0.flags),
          ladd(iter(r(%0._u),iftrue(match(r(%0._f),##),power(2,dec(#$)),0)))
    )
  )
)]
[setq(%0._f,,%0._d,,%0._u,)]
-

Then we need a function to check if an entity possesses a flag. We can just call it with the identifier and the flag we want to check for.

# Call as: u(FLAGGED_FN, identifier, flag_name)
# Returns: 0 if the entity doesn't have the flag, nonzero (the flag's bit value) if it does.
# Side-effects: None.
#
# %0 - identifier, %1 - flag to check for
#
&FLAGGED_FN #__factory=
[iftrue(match(v([r(%0.type)]_FLAGS),%1),
  band(r(%0.flags),power(2,dec(#$))),
  0
)]
-
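For readers less used to bitfields, here is the flag logic modeled in Python. It is a sketch, not a translation: the function names are hypothetical, and it applies changes in list order, whereas FLAGMOD_FN clears all the !flags first and then sets the rest. Each flag's bit comes from its position in the type's flag list.

```python
SHIP_FLAGS = ["needs_repair", "in_hyperspace", "stolen"]


def flag_bit(flag_names, flag):
    """Return the bit for a flag: 1 for the first name, 2 for the second, 4 for the third."""
    return 1 << flag_names.index(flag)


def modify_flags(bits, flag_names, changes):
    """Apply a list such as ['!in_hyperspace', 'needs_repair'] to a bitfield."""
    for change in changes:
        if change.startswith("!"):
            name = change[1:]
            if name in flag_names:
                bits &= ~flag_bit(flag_names, name)  # clear the bit
        elif change in flag_names:
            bits |= flag_bit(flag_names, change)  # set the bit
    return bits


def has_flag(bits, flag_names, flag):
    """Return True if the named flag is set; unknown flag names are never set."""
    return flag in flag_names and bool(bits & flag_bit(flag_names, flag))


flags = modify_flags(0, SHIP_FLAGS, ["in_hyperspace", "stolen"])
crashed = modify_flags(flags, SHIP_FLAGS, ["!in_hyperspace", "needs_repair"])
```

Because the bit depends on list position, new flag names should only ever be appended to the end of the flag list; reordering it would silently change the meaning of every saved bitfield.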

Finally, we want to have a quick-and-dirty way to display all data associated with an entity. We'll almost certainly write our own custom data views, but this is very handy for debugging purposes, and we'll try to make the format nice enough that it's a reasonable view until you get around to writing something nicer for a given data type. For a bit of customization without having to write something totally different, you can set the register identifier.show, which should be formatted text to show between separators, after the main body of data is shown.

# Call as: ulocal(SHOW_FN, identifier)
# Returns: Displays dump of data for an entity.
# Side-effects: None intended; call with ulocal(). 
#
&SHOW_FN #__factory=
[setq(f,iter(v([r(%0.type)]_DATA),capstr(##),%b,`))]
[setq(l,usetrue(v([r(%0.type)]_LABELS),%qf))]
[nonzero(setr(m,match(%qf,flags,`)),
  [setq(f,replace(%qf,%qm,flagwords,`))]
  [setq(%0.flagwords,
    iter2(setr(b,v([r(%0.type)]_FLAGS)),iter(%qb,power(2,dec(#@))),
      nonzero(band(r(%0.flags),#+),##)
    )
  )]
)]
[setq(w,add(2,lmax([strlen([r(%0.type)] ID)] [iter(%ql,strlen(##),`)])))]
[setq(r,sub(40,%qw))]
%xb[repeat(-,78)]%xn%r
[ljust(%xr[capstr(r(%0.type))] ID:%xn,%qw)] [ljust(r(%0.id),%qr)] /@@ @@/
[ljust(%xrEditor:%xn,11)] [Color(r(%0.editor))]%r
[ljust(%xrOwner:%xn,%qw)] [ljust(Color(r(%0.owner)),%qr)] /@@ @@/
[ljust(%xrEdit Time:%xn,11)] [convsecs(r(%0.etime))]%r
[iter2(%ql,%qf,
  [ljust(%xr##:%xn,%qw)] [r(%0.#+)],
  `,%r
)]%r
[nonzero(r(%0.show),
  %xb[repeat(=,78)]%xn%r[r(%0.show)]%r
)]
%xb[repeat(-,78)]%xn
-

And that's it. All we have to do now is to define data types.

Defining a Data Type

All information for data types is stored on #__factory. A type name is a single word; for convenience, it should probably be a short word, like "ship". The definitions consist of the following attributes:

type_DATA: a space-separated list of field names
type_LABELS: a `-separated list of user-friendly field labels
type_DEF: a `-separated list of field defaults
type_FLAGS: optional; a space-separated list of flag names
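type_MAX: optional; the maximum IDs to store on one data object
type_V: optional; the current version number of the structure (SAVE_FN uses 1 if this is unset)
type_DATA_version: for each older version, the field list of that version, kept so LOAD_FN can migrate old data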

All field and flag names should be lowercased. Also, make sure that no field name ever starts with an underscore _, because that's used for variables internal to the factory code.

An Example of Usage

Here's a ship example:

&SHIP_DATA #__factory = name xpos ypos homeport
-
&SHIP_LABELS #__factory = Ship Name`X Coord`Y Coord`Home Port
-
&SHIP_DEF #__factory = Unnamed Ship`100`100`Earth
-
&SHIP_FLAGS #__factory = needs_repair in_hyperspace stolen
-
&SHIP_MAX #__factory = 100
-

You also need to seed the data objects by creating a data object, and doing:

&SHIP_OBJ #__meta = dbref

Some very crude examples of use (#__globals is the global command object, which we @parent to #__factory), that allow us to create, display, change the home port of a ship, and violently take a ship out of hyperspace and flag it as needing to be repaired:

# Command: +ship/create ship_name for player
#
&DO_SHIP_CREATE #__globals = $+ship/create * for * : @pemit %#=
case(0,
  hasflag(%#,Wizard),Only wizards can create ships.,
  t(setr(0,num(*%1))),'%1' is not a valid player.,
  [u(NEW_FN,%q0,this,ship)]
  [setq(this.name,%0)]
  [u(SAVE_FN,this,%#)]
  New ship created for [name(%q0)]. ID number is [r(this.id)].
)
-

# Command: +ship/show ID
#
&DO_SHIP_SHOW #__globals = $+ship/show *: @pemit %#=
case(0,
  u(OK_LOAD_FN,this,ship,%0),That is not a valid ship ID.,
  controls(%#,r(this.owner)),Permission denied.,
  u(SHOW_FN,this)
)
-

# Command: +ship/port ID at port
#
&DO_SHIP_PORT #__globals = $+ship/port * at *: @pemit %#=
case(0,
  hasflag(%#,Wizard),Only wizards can change the home port of ships.,
  u(OK_LOAD_FN,this,ship,%0),That is not a valid ship ID.,
  [setq(this.homeport,%1)]
  [u(SAVE_FN,this,%#)]
  Home port of ship '[r(this.name)]' changed.
)
-

# Command: +ship/crash ID
#
&DO_SHIP_CRASH #__globals = $+ship/crash *: @pemit %#=
case(0,
  hasflag(%#,Wizard),Only wizards can crash ships.,
  u(OK_LOAD_FN,this,ship,%0),That is not a valid ship ID.,
  u(FLAGGED_FN,this,in_hyperspace),That ship is not in hyperspace.,
  [u(FLAGMOD_FN,this,!in_hyperspace needs_repair)]
  [u(SAVE_FN,this,%#)]
  You crash the ship '[r(this.name)]'.
)
-

Easy, yes? Hopefully you'll find this kind of approach useful in your own coding.