Changing the softcode parser

Submitted by raevnos on Thu, 2012-07-19 22:38

Every softcode function is implemented by a corresponding C function (here called a FUNCTION, after the macro that sets up its arguments). Arguments passed to the function are given in an array of strings, and it returns its result by appending to another string. For functions that work with numbers, dbrefs, or other things besides strings, there are a lot of conversions from strings to another type and then back. This seems like a waste.

I want to change this so that FUNCTIONs take arguments and return values as variant types. Call them pvs, short for penn values. If a function's arguments are already in the right data type, no conversion is needed. In add(mul(2,3), sub(6,3)), mul() and sub() would still have to convert their default string arguments, but add() would get two doubles as args instead of strings.
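
As a rough sketch of that example in C (not the actual design): the pv layout, the pv_as_double helper, the pv_conversions counter, and the fun_* signatures below are all made up for illustration, and pvs are passed by value only to keep it short. The counter is there just to show which calls end up parsing strings.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical two-type pv; a real one would cover every mush data type. */
typedef enum { PV_DOUBLE, PV_STRING } pv_type;

typedef struct {
  pv_type type;
  union {
    double d;
    const char *s;
  } u;
} pv;

/* Counts string-to-double parses, purely for illustration. */
int pv_conversions = 0;

pv pv_double(double d) {
  pv v;
  v.type = PV_DOUBLE;
  v.u.d = d;
  return v;
}

pv pv_string(const char *s) {
  pv v;
  assert(s != NULL);
  v.type = PV_STRING;
  v.u.s = s;
  return v;
}

/* Return the numeric value, parsing only when the pv is still a string. */
double pv_as_double(pv v) {
  if (v.type == PV_DOUBLE)
    return v.u.d; /* already the right type: no conversion */
  pv_conversions++;
  return strtod(v.u.s, NULL);
}

/* Stand-ins for the add(), mul() and sub() softcode functions. */
pv fun_add(pv a, pv b) { return pv_double(pv_as_double(a) + pv_as_double(b)); }
pv fun_mul(pv a, pv b) { return pv_double(pv_as_double(a) * pv_as_double(b)); }
pv fun_sub(pv a, pv b) { return pv_double(pv_as_double(a) - pv_as_double(b)); }
```

Evaluating add(mul(2,3), sub(6,3)) with these stand-ins parses the four literal arguments once each, and fun_add() does no parsing at all because both of its arguments arrive as doubles.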

For each mush data type (integers, doubles, strings, dbrefs, booleans, etc.), there would be a set of pv creation, manipulation, and predicate functions. For example, pv_new_int, pv_is_int, pv_to_int, and pv_int_val would respectively make a new tagged integer value, test if a pv is an int, convert any other pv type to a newly returned int one, and return the integer stored in the pv.
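
Here is a minimal sketch of how the integer set might look, assuming a simple tagged union. Only the four pv_*_int names come from the description above; the struct layout, the PV_* tags, pv_alloc, pv_new_double and pv_new_str are guesses made for illustration, and plain malloc stands in for whatever the garbage collector would provide.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type tags. */
typedef enum { PV_INT, PV_DOUBLE, PV_STRING } pv_type;

/* Hypothetical tagged-union layout for a penn value. */
typedef struct pv {
  pv_type type;
  union {
    int i;
    double d;
    char *s;
  } u;
} pv;

/* Allocate a pv with the given tag (a collector would own this memory). */
static pv *pv_alloc(pv_type type) {
  pv *v = malloc(sizeof *v);
  if (!v)
    abort();
  v->type = type;
  return v;
}

pv *pv_new_int(int n) {
  pv *v = pv_alloc(PV_INT);
  v->u.i = n;
  return v;
}

pv *pv_new_double(double d) {
  pv *v = pv_alloc(PV_DOUBLE);
  v->u.d = d;
  return v;
}

pv *pv_new_str(const char *s) {
  pv *v = pv_alloc(PV_STRING);
  size_t n = strlen(s) + 1;
  v->u.s = malloc(n);
  if (!v->u.s)
    abort();
  memcpy(v->u.s, s, n);
  return v;
}

/* Predicate: is this pv tagged as an integer? */
int pv_is_int(const pv *v) { return v->type == PV_INT; }

/* Accessor: only valid on integer pvs. */
int pv_int_val(const pv *v) {
  assert(pv_is_int(v));
  return v->u.i;
}

/* Conversion: always returns a new integer pv, leaving the input untouched. */
pv *pv_to_int(const pv *v) {
  switch (v->type) {
  case PV_INT:
    return pv_new_int(v->u.i);
  case PV_DOUBLE:
    return pv_new_int((int) v->u.d); /* truncates toward zero */
  case PV_STRING:
    return pv_new_int((int) strtol(v->u.s, NULL, 10));
  }
  return pv_new_int(0);
}
```

A FUNCTION that wants an integer would check pv_is_int() first and only fall back to pv_to_int() when the argument arrived as something else.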

I'm planning on making them immutable, which means that string pvs will only have enough space allocated to store one particular string, instead of the current 8k chunks that we throw around all over the place. Less memory! They'd also get garbage collected as needed, to avoid having to deal with keeping track of when they should be freed. Less programmer time!
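
One way the exact-size allocation could work is a C99 flexible array member, so the header and the characters share a single block. This is only a sketch under that assumption: the pv_str type, pv_str_size and pv_str_concat are hypothetical names, and nothing here is ever freed because that would be the collector's job.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical immutable string pv: length plus exactly enough bytes. */
typedef struct {
  size_t len;
  char data[]; /* flexible array member, sized per string */
} pv_str;

/* Copy s into a block just big enough to hold it and its terminator. */
const pv_str *pv_new_str(const char *s) {
  size_t len;
  pv_str *v;
  assert(s != NULL);
  len = strlen(s);
  v = malloc(sizeof *v + len + 1);
  if (!v)
    abort();
  v->len = len;
  memcpy(v->data, s, len + 1);
  return v;
}

/* Bytes this sketch allocates for the given string pv. */
size_t pv_str_size(const pv_str *v) { return sizeof *v + v->len + 1; }

/* "Changing" an immutable string means building a new one. */
const pv_str *pv_str_concat(const pv_str *a, const pv_str *b) {
  size_t len = a->len + b->len;
  pv_str *v = malloc(sizeof *v + len + 1);
  if (!v)
    abort();
  v->len = len;
  memcpy(v->data, a->data, a->len);
  memcpy(v->data + a->len, b->data, b->len + 1); /* includes the NUL */
  return v;
}
```

Since nothing can write into a string pv after it is built, the same pointer can be handed to any number of FUNCTIONs without defensive copies.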

Lists would be a pv type too, represented as an array of pvs. I'm toying with unboxed numeric arrays, for use with things like vector functions, as well. Instead of an array of pvs, it'd be an array of doubles that aren't wrapped in a pv struct. More space savings.
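
To make the boxed versus unboxed difference concrete, here is a sketch with both representations side by side. Every name in it (PV_LIST, PV_DVEC, pv_new_dvec, pv_new_list_of_doubles, pv_payload_bytes, pv_dvec_add) is invented for illustration, and the byte counts cover only the element storage in this particular layout.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { PV_DOUBLE, PV_LIST, PV_DVEC } pv_type;

/* Hypothetical pv with a boxed list and an unboxed double vector. */
typedef struct pv {
  pv_type type;
  union {
    double d;
    struct { size_t len; struct pv **items; } list; /* array of pvs */
    struct { size_t len; double *vals; } dvec;      /* raw doubles */
  } u;
} pv;

static void *xmalloc(size_t n) {
  void *p = malloc(n ? n : 1);
  if (!p)
    abort();
  return p;
}

pv *pv_new_double(double d) {
  pv *v = xmalloc(sizeof *v);
  v->type = PV_DOUBLE;
  v->u.d = d;
  return v;
}

/* Boxed: every element is its own pv. */
pv *pv_new_list_of_doubles(const double *vals, size_t len) {
  size_t i;
  pv *v = xmalloc(sizeof *v);
  v->type = PV_LIST;
  v->u.list.len = len;
  v->u.list.items = xmalloc(len * sizeof(pv *));
  for (i = 0; i < len; i++)
    v->u.list.items[i] = pv_new_double(vals[i]);
  return v;
}

/* Unboxed: one pv header, then plain doubles. */
pv *pv_new_dvec(const double *vals, size_t len) {
  size_t i;
  pv *v = xmalloc(sizeof *v);
  v->type = PV_DVEC;
  v->u.dvec.len = len;
  v->u.dvec.vals = xmalloc(len * sizeof(double));
  for (i = 0; i < len; i++)
    v->u.dvec.vals[i] = vals[i];
  return v;
}

/* Bytes spent on element storage in this sketch's layout. */
size_t pv_payload_bytes(const pv *v) {
  switch (v->type) {
  case PV_LIST:
    return v->u.list.len * (sizeof(pv *) + sizeof(pv));
  case PV_DVEC:
    return v->u.dvec.len * sizeof(double);
  default:
    return 0;
  }
}

/* Element-wise sum of two unboxed vectors, returned as a new pv. */
pv *pv_dvec_add(const pv *a, const pv *b) {
  size_t i;
  pv *r;
  assert(a->type == PV_DVEC && b->type == PV_DVEC);
  assert(a->u.dvec.len == b->u.dvec.len);
  r = pv_new_dvec(a->u.dvec.vals, a->u.dvec.len);
  for (i = 0; i < r->u.dvec.len; i++)
    r->u.dvec.vals[i] += b->u.dvec.vals[i];
  return r;
}
```

The loop in pv_dvec_add() touches only raw doubles, so a vector function written this way never has to check a tag or unwrap a pv per element.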

This is a big project; it requires fiddling with the Lovecraftian monstrosity that is process_expression(), and every single FUNCTION. Everybody with custom hardcode that adds functions would have to rewrite theirs too. It might be a PennMUSH 1.9 or 2.0 level change.

It might make sense to add Unicode support at the same time, since that requires touching even more stuff. At the very least, the changes will make it easier to convert later.