INTRODUCTION
------------
Extensible Script Language (ESL) is a general purpose compiled script language
which is designed to be extensible with new commands for target application
purposes. It provides the following features:

    32 (or 16) bit arithmetic with a full set of operators.
    String manipulation.
    Numeric and string variables.
    IF/ELSE, WHILE, DO/WHILE, SWITCH/CASE, GOTO/CALL/RETURN constructs.
    Easily expanded with additional opcodes/functions.
    Very small and fast run-time execution engine:
      Approx. 200 lines of C (+ extensions).
      Base engine uses only 10 "opcodes" (+ extensions).

A given ESL implementation will add its own extension keywords, opcodes and
handlers to provide the specific capabilities needed by the target application.


SCRIPT LANGUAGE NOTES
---------------------
All elements of the script language except for the content of quoted
strings are CASE INSENSITIVE. In other words, keywords, symbols and
numeric base specifiers may be entered in any combination of upper and
lower case.

The script compiler recognizes // as a comment indicator. Any text which
occurs after // on any line is ignored (except within quoted strings).

Statements are one per line; however, you may place multiple statements
on the same line by separating them with ';'. You do NOT need a trailing
';' after every statement, only after those that have additional
statements following on the same line.


ARITHMETIC
----------
Anywhere a numeric value is required, you can use a full expression
consisting of the following value elements:
    #               = Decimal number        digits: 0-9         eg: 1234
    %# or 0b#       = Binary number         digits: 0-1         eg: %1010
    @# or 0o#       = Octal number          digits: 0-7         eg: @377
    $# or 0x#       = heXadecimal number    digits: 0-9,A-F     eg: $2F3D
    name            = Named variable or constant
    name[expression]= Limited indexing capability is available
    &name           = Variable slot index (*1)
    [name,...]      = Mask of specified variable slots (*1 *2)
    [SECTION]       = Mask of all variable slots in current section (*1 *2)
    (expression)    = Brackets may be used to force subexpression grouping
Any of these value elements may be preceded by these monadic operators,
which have the same meaning as they do in C:
    -           = Negate
    !           = Logical complement
    ~           = Bitwise complement
Two or more value elements may be combined with these dyadic operators
which have the same meaning as they do in C:
    +           = Add
    -           = Subtract
    *           = Multiply
    /           = Divide
    %           = Modulus (remainder after division)
    &           = Bitwise AND
    |           = Bitwise OR
    ^           = Bitwise Exclusive-OR
    <<          = Shift Left
    >>          = Shift Right
    =           = Assignment to numeric variable
    ==          = Test for equal
    !=          = Test for NOT equal
    <           = Test for lower
    <=          = Test for lower or equal
    >           = Test for greater
    >=          = Test for greater or equal
The logical operators are compatible with the C versions, however they
differ slightly so as to provide increased functionality:
    a && b      = Return 0 if a==0, otherwise return b
    a || b      = Return a if a!=0, otherwise return b
Operations are done from left to right(*3) with no precedence. Operator
precedence can be forced with the use of brackets:
  5+2*10        = 70
  5+(2*10)      = 25

*1 These functions are not generally useful in normal arithmetic, however
   they may be useful with extensions which manipulate variable slots
   directly (such as PUSH/POP variable functions).

*2 Mask is a binary value which has a '1' bit corresponding to each
   applicable variable. It is therefore limited to the first 32 variable
   slots (16 in 16-bit implementations).

*3 The '=' operation groups right to left.
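
The left-to-right rule and the extended logical operators described above
can be illustrated with a small C sketch. This is NOT the ESL compiler's
code; the function names (esl_and, esl_or, esl_demo_eval) are hypothetical,
and the sketch handles only decimal numbers, brackets and the + - *
operators. It is meant only to show why 5+2*10 yields 70.

```c
#include <assert.h>
#include <ctype.h>

/* ESL-style logical operators: they return an operand, not just 0 or 1. */
long esl_and(long a, long b) { return a ? b : 0; }
long esl_or(long a, long b)  { return a ? a : b; }

static long eval_expr(const char **s);

/* Parse one value element: a decimal number or a bracketed subexpression. */
static long eval_value(const char **s)
{
    long v = 0;
    while (**s == ' ') ++*s;
    if (**s == '(') {
        ++*s;                       /* skip '(' */
        v = eval_expr(s);
        if (**s == ')') ++*s;       /* skip ')' */
    } else {
        while (isdigit((unsigned char)**s))
            v = v * 10 + (*(*s)++ - '0');
    }
    while (**s == ' ') ++*s;
    return v;
}

/* Apply operators strictly left to right, with no precedence. */
static long eval_expr(const char **s)
{
    long acc = eval_value(s);
    while (**s == '+' || **s == '-' || **s == '*') {
        char op = *(*s)++;
        long rhs = eval_value(s);
        if (op == '+')      acc += rhs;
        else if (op == '-') acc -= rhs;
        else                acc *= rhs;
    }
    return acc;
}

/* Convenience wrapper (hypothetical name). */
long esl_demo_eval(const char *text)
{
    return eval_expr(&text);
}
```

Because the accumulator is updated as each operator is met, 5+2 is computed
before the multiplication is seen. Brackets restart the evaluation for the
enclosed subexpression.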


VARIABLES:
----------
The script interpreter supports a set number of numeric variables (default
32). Variables must be predeclared with the "VARIABLE" command (see below).

There is a limited ability to perform indexed (array) operations on numeric
variables. Each variable is assigned a "slot" number - by indexing a variable, 
you can add to the "slot" number referenced. You must therefore be careful NOT
to reference such indexed slots by other variable names (unless you intend to).

NOTE: Variable slots 0-31 are more efficient to access than higher numbered
      slots.
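
The slot arithmetic behind indexing can be pictured with the following C
sketch. The names (ESL_NUMVAR, esl_index) are hypothetical and the bounds
check is this sketch's own choice; the source does not say how the runtime
handles an out-of-range index.

```c
#include <assert.h>
#include <stdint.h>

#define ESL_NUMVAR 32   /* default number of numeric variable slots */

/* Return a pointer to the slot reached by indexing from a base slot. */
int32_t *esl_index(int32_t vars[ESL_NUMVAR], unsigned slot, unsigned index)
{
    unsigned target = slot + index;     /* indexing adds to the slot number */
    assert(target < ESL_NUMVAR);        /* bounds check is an assumption */
    return &vars[target];
}
```

If one variable is declared at slot 4 and another at slot 5, indexing the
first by 1 reaches the storage of the second, which is why indexed slots
should not normally be given other names.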


STRINGS
-------
Some commands use a string operand value. String operands can be delimited
with any of the following characters, which must NOT occur within the
string: " ' ` / \
   'this is a string'
   "this is another string"
   /so is this/
Several strings can be concatenated together simply by placing them
adjacent on the same line:
   'Dave' " was " /Here/
A numeric value can be converted to its ASCII representation by enclosing
it in () on the string line. The closing ')' is followed by a character
which describes the type of output:
   (<value>)b   = insert Binary string
   (<value>)d   = insert Decimal string
   (<value>)u   = ""
   (<value>)x   = insert heXadecimal string
You can also specify a field width when you insert the string:
   (<value>)8x  = Inserts 8 heXadecimal digits
If a field width is specified, leading '0's are prepended if needed to
fill the specified width. eg:
   "/user" (USER)4d "/msg" (MESSAGE)8x

Strings may contain control characters in the form {n} where n is a number
between 1 and 127. eg: {13}
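
The (<value>) numeric conversions with a field width, described earlier in
this section, behave roughly like the C sketch below. The function name
esl_format is hypothetical. Treating the value as unsigned, using upper
case hex digits, and never truncating a value wider than the field are
assumptions of this sketch, not statements about the ESL runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write value into out in the given base (2, 10 or 16), zero-padded to
 * at least width digits. Returns the number of characters written,
 * excluding the terminating NUL. out must have room for 33 characters
 * or width+1 characters, whichever is larger. */
size_t esl_format(char *out, uint32_t value, unsigned base, unsigned width)
{
    static const char digits[] = "0123456789ABCDEF";
    char tmp[32];
    size_t n = 0, len = 0;

    assert(base == 2 || base == 10 || base == 16);
    do {                            /* digits, least significant first */
        tmp[n++] = digits[value % base];
        value /= base;
    } while (value != 0);
    while (n + len < width)         /* leading zeros fill the field */
        out[len++] = '0';
    while (n > 0)                   /* copy digits in display order */
        out[len++] = tmp[--n];
    out[len] = '\0';
    return len;
}
```

With this sketch, a value of 42 formatted as 4d gives 0042, and $2F3D
formatted as 8x gives 00002F3D.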

The script interpreter supports a set number of string variables (default
4), each of which may contain up to 127 characters. String variables must
be predeclared with the "STRING" command (see below). The content of a
string variable may be included in any other string simply by placing its
name (without string delimiters) in the string definition.


COMMANDS
--------
The following is a brief description of the basic commands which are
available in every ESL implementation:


INCLUDE <string>

  Includes the content of the file named by <string> as if it had been
  entered directly in the source file. This is a convenient way to keep
  predefined constants, variables or commonly used blocks of code.


VARIABLE <name> <slot>

  Defines a named numeric variable <name> which references the internal
  variable slot <slot>. Values may be assigned to the variable with the '='
  operator, and may be retrieved by name at any later time.
  Variables persist for the duration of script execution.

  Variable names must begin with 'A-Z' or '_', and may include those
  characters as well as the digits '0-9'.


CONSTANT <name> <const-value>

  Defines a named constant <name> which can be used within a numeric
  expression exactly as if you had entered the numeric value determined
  by the evaluation of <const-value>.
  NOTE: Expressions within <const-value> are evaluated at the time the
        constant is created, NOT when it is referenced.


STRING <name> <slot>

  Defines a named string variable <name> which references an internal
  string slot <slot>. Strings may be assigned to the variable with the
  SET command, and may be retrieved at any later time by using its name
  within a <string> element.
  String slots can hold a maximum of 127 characters each.


FIXED <name> <const-string>

  Defines a fixed string <name> which can be used within any other string
  exactly as if you had entered it directly into the string.


SET <string-variable> <string>

  Sets the string variable to contain the specified string.
  NOTE: Since <string> may include concatenated portions, string variables
        and numeric->ASCII conversions, all of these things may be included
        in the string assigned to the string variable.


IF <value>
 ....
[ELSE
 ....]
END

  Performs conditional execution of a script block. The (first) block
  is performed only if the evaluation of <value> results in a non-zero
  result. ELSE and the second block are optional.


WHILE <value>
 ....
END

  Performs a script block repeatedly as long as the <value> expression
  remains TRUE (non-zero).
  NOTE: <value> is tested at the top of the loop, so it is possible for
        the loop to not execute at all.


DO
 ...
WHILE <value>

  Performs the script block repeatedly as long as the <value> expression
  remains TRUE. 
  NOTE: <value> is tested at the bottom of the loop, so the loop will
        always execute at least once.
  NOTE: There is an inconsistency in the ESL language in that if you try
        to use a WHILE loop within a DO/WHILE loop, the compiler will
        see the WHILE as the end of the DO/WHILE. You can work around
        this by placing the inner WHILE loop inside a SECTION.


SWITCH <value>
 ... default ...
CASE <value1>
 ...
CASE <value2>
 ...
END

  Evaluates <value> and transfers script execution to the first block
  which has a matching <value>.
  NOTE: This differs from 'C's SWITCH in several significant ways:
   - The CASE <value> operands do NOT have to be constants. They may
     contain variables and expressions, which are tested at the time
     the SWITCH is performed. If <value#> would evaluate to the same
     quantity for more than one case statement at the time the SWITCH
     is performed, the first one found will be chosen.
   - There are no BREAK statements, and CASEs do NOT automatically fall
     through to the next one. At the end of a CASE block, script execution
     will automatically transfer to immediately after the END statement.
     If you need one CASE to transfer into another, you must use an
     explicit GOTO.
   - There is no explicit DEFAULT label. The script block immediately
     after the SWITCH (before the first CASE) is executed if no matching
     CASE is found.
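
  The selection rule described in the notes above (first match wins, no
  fall through, default block when nothing matches) can be sketched in C
  as follows. The function name esl_switch_select is hypothetical, and the
  CASE values are assumed to have already been evaluated, in source order,
  at the time the SWITCH is performed.

```c
#include <assert.h>
#include <stdint.h>

/* Return the index of the first CASE whose value matches, or -1 to
 * run the default block that follows the SWITCH line. */
int esl_switch_select(int32_t value, const int32_t *case_values, int count)
{
    int i;
    for (i = 0; i < count; ++i)
        if (case_values[i] == value)
            return i;       /* first match wins, later duplicates ignored */
    return -1;              /* no CASE matched: default block */
}
```

  Exactly one block is selected per SWITCH; execution then continues
  after the END.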


GOTO <label>

  Transfer script execution to the named label. Any script command may
  be labeled by placing a symbol name followed by ':' at the beginning
  of the line, eg:
     GODND: prompt "do not disturb"


CALL <label>
RETURN

  Similar to GOTO, except that current script position is saved onto an
  internal 8-level stack, allowing you to continue execution following
  the CALL by executing RETURN.


RESET <expression>

  Resets the internal 8-level call stack to the specified level. This
  effectively removes pending return locations. Note: use 0 to completely
  clear the stack.
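
  The interaction of CALL, RETURN and RESET with the 8-level stack can be
  sketched in C as shown below. The type and function names are
  hypothetical. Reporting overflow with -1 and ignoring a RESET level
  above the current depth are choices made for this sketch; the source
  does not describe how the runtime treats those cases.

```c
#include <assert.h>
#include <stdint.h>

#define ESL_CALL_DEPTH 8            /* the 8-level stack described above */

typedef struct {
    uint16_t ret[ESL_CALL_DEPTH];   /* saved script positions */
    unsigned sp;                    /* number of pending returns */
} EslCallStack;

/* CALL: save the return position. Returns 0 on success, -1 on overflow. */
int esl_call(EslCallStack *cs, uint16_t return_addr)
{
    if (cs->sp >= ESL_CALL_DEPTH)
        return -1;
    cs->ret[cs->sp++] = return_addr;
    return 0;
}

/* RETURN: fetch the most recent return position.
 * Returns -1 if nothing is pending (the caller would then stop). */
int esl_return(EslCallStack *cs, uint16_t *addr)
{
    if (cs->sp == 0)
        return -1;
    *addr = cs->ret[--cs->sp];
    return 0;
}

/* RESET <level>: drop pending returns above the given level. */
void esl_reset(EslCallStack *cs, unsigned level)
{
    if (level < cs->sp)
        cs->sp = level;
}
```

  After three nested calls, a reset to level 1 leaves only the outermost
  return location pending, and a reset to level 0 leaves none.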


STOP

  Stops execution of the script and returns to the invoking process.


SECTION
 ...
END

  Defines a logical section. Within a section, any variables, constants,
  string buffers, fixed strings or labels which are defined may duplicate
  and will supersede any same-named entities in outer sections. At the
  end of the section, all local symbols are destroyed.

  This is useful to allow you to access local symbols without conflict
  with other same-named symbols in the system. **NOTE: This DOES NOT
  protect the actual content of variable and string buffer slots, which
  must be preserved (by extensions) if you wish to use the same slots
  independently in nested sections.
  
 
<unrecognized command>

  Anything not recognized as a known command is evaluated as a numeric
  expression. This is normally used to assign values to numeric variables.
  eg: count = count + 1



LANGUAGE EXTENSIONS
-------------------
Language extensions are defined in the file ESL.ESL and provide the compiler
with details of the functions available in a specific ESL implementation.
These MUST match the extensions implemented in the corresponding ESL runtime.

EXT 'text'
  Sets the source file extension (default '.ES').

BEXT 'text'
 Sets the BINARY output file extension (default '.ESB').

CEXT 'text'
 Sets the C output file extension (default '.ESC').

SEXT 'text'
 Sets the String file extension (default '.ESS').

NAME 'text'
 Sets the name shown when the command runs.

NUMVAR <value>
 Sets the number of numeric variables (no maximum, default 32).

NUMSTR <value>
 Sets the number of string variables (maximum 32, default 4).

BIT16
 Causes the compiler to output 16-bit values instead of 32-bit
 values when encoding values greater than 8191 - this provides
 the ability to create 16-bit ESL run-time implementations.

CONSTANT <symbol> <value>
  Defines a constant symbol.

SYSVAR <symbol> <0-6>
  Define a system variable.

NVARIABLE <symbol> <0-...>
SVARIABLE <symbol> <0-31>
  Defines Numeric or String variable.


FUNCTION[(opcode)] <symbol> <operands>
  Encodes an extended language function.
  The next available opcode value will be generated, followed by
  the specified <operands>:
    VALUE       = A numeric expression
    STRING      = A text string
    NVARIABLE   = A numeric variable slot value
    SVARIABLE   = A string variable slot value
    LABEL       = A code transfer address (16 bit LE)
    SWITCH      = Case index table address will be output after
                  all other operands
        If 'SWITCH' is specified, the compiler will treat the
        function as a SWITCH, expecting CASE and END statements.
    MAP         = This opcode will be mapped (see MAPPING)
  If (opcode) is supplied, this numeric value (in brackets) will set
  the next available opcode value which will be generated for this
  function, and will increment for further function definitions.


MAPPING
-------
It is often desirable to place multiple scripts into a single file, and
provide a way for the host system to select which script will be executed
for a given function.

ESLC allows you to designate a function/opcode to be "mapped". This means
that ESLC will generate a table with the addresses of all occurrences of
the mapped opcode in the order in which they occur in the source.

For BINARY output, the table occurs at the beginning of the .ESB file.

For C output, an additional "unsigned" array with "_MAP" appended to the
symbol name is generated.

In either case, the table consists of a one-word (2-byte) entry for each
occurrence of the mapped opcode, followed by a final entry of 0x0000.

Note:
- Only one function/opcode can be defined with the "MAP" attribute.

- The address contained in the entry is the opcode address +1. In other
  words, the address indicates the address of first operand to the mapped
  opcode or the next instruction if there were no operands.

  This is done for a couple of reasons:

  - Mapping the first opcode in the file does not conflict with the 0x0000
    terminating entry.

  - Mapped opcodes are used to identify sections in the file, and are not
    normally executed. For example, with a function called "MAP":
        MAP             <- No operands, address is next instruction.
        MAP 10          <- Simple numeric ID, address is ID expression
        MAP "Section1"  <- Text ID, address is text string.
    Note that when an ID value is used as an operand to the mapped opcode,
    the host must parse it according to the expression/string encoding
    described below. This is very simple to do if you keep the ID a constant.

    Opcodes with the MAP attribute are NOT checked to be unreachable, and
    the opcode immediately following is always assumed to be reachable.
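
A host could select a script from the map table of a BINARY image along
the lines of the C sketch below. The function name esl_map_lookup is
hypothetical. The sketch ASSUMES each 2-byte entry is stored little-endian
(the text above gives only the entry size) and does not assume anything
about what the stored address is relative to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the address stored in entry 'index' of the map table at the
 * start of a binary image, or 0 if the table has fewer entries. */
uint16_t esl_map_lookup(const uint8_t *image, size_t size, unsigned index)
{
    size_t pos;
    unsigned n = 0;

    for (pos = 0; pos + 1 < size; pos += 2, ++n) {
        /* ASSUMPTION: entries are little-endian */
        uint16_t addr = (uint16_t)(image[pos] | (image[pos + 1] << 8));
        if (addr == 0)
            return 0;           /* 0x0000 terminates the table */
        if (n == index)
            return addr;
    }
    return 0;                   /* ran off the image without a terminator */
}
```

Because every stored address is the opcode address +1, a valid entry is
never 0x0000, so 0 can safely double as the "not found" result here.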


COMPILING SCRIPTS
-----------------
You must enter your script into a text file using the Extensible Script
Language described above and the extended functions and features defined
in the ESL.ESL file. This file has the extension .ES, eg: <name>.ES

Then, compile the .ES script into a .ESB binary file using the command:

   ESLC <name>

For a list of the command line options available to the compiler, run the
compiler with no operands:

   ESLC



DEBUG AIDS
----------
Most of the time scripts are very straightforward, and error messages from
the compiler tell you of any problems in your code. Occasionally with a
complex script, you may hit a runtime error that only gives you the hex
address of the offending instruction. Here are some tips on how to determine
the offending line of source code from that address.

The following ESLC command line options are useful:

  F=address             - Find line generating code at address
    NOTE: ESL values default to decimal, use $ or 0x to specify hexadecimal.
  /L[I]                 - Generate listing  [I=+Include files]
  /S [>file]            - Display all symbol names, types and values


SCRIPT DECOMPILER
-----------------
You can obtain a detailed listing of the script code contained in a .ESB
or .ESC file, showing the address of each instruction and the virtual stack
machine operations with the script decompiler:

   ESLD <name>

For a list of the command line options available to the decompiler, run the
decompiler with no operands:

   ESLD



SCRIPT ENCODING:
----------------
Compiled scripts are encoded in a compact binary format. The general form
is: <opcode> <operands>

Currently defined opcodes are:

STOP    0x00
  Cause script execution to terminate
EVAL    0x01 <expression>
  Evaluates expression (normally used for side-effect assignments)
SWITCH  0x02 <expression> <address>
  Performs a "SWITCH" function.
  Following the SWITCH is the code to be executed for the default condition.
  <address> points to a table which contains the address of each
  CASE section, ending with 0000 for the end of the table.
  Each case section begins with an <expression> containing the case
  match value, followed by the code to be executed if a match with
  that value occurs.
CALL    0x03 <address>
  Performs a subroutine call to the specified <address>.
BRANCH  0x04 <address>
  Transfers script execution to the specified <address>.
BFALSE  0x05 <expression> <address>
  Transfers script execution to the specified address only if the
  result of <expression> is ZERO.
BTRUE   0x06 <expression> <address>
  Transfers script execution to the specified address only if the
  result of <expression> is NOT ZERO.
RETURN  0x07
  Returns from a subroutine, picking up at the script instruction
  immediately following the CALL. RETURN at top level performs STOP.
RESET   0x08 <expression>
  Resets the internal subroutine stack pointer to the value of the
  supplied expression. Zero (0) effectively removes all pending CALL/RETURN
  addresses. One (1) removes all but the top level CALL/RETURN address etc.
SET     0x09 <string-slot> <string>
  Sets the indicated string variable slot to contain <string>.

Opcodes 0x0A (10) and higher are available for script extension opcodes.

Expression encoding:
--------------------
When a numeric expression is encoded in the script, it is represented
as follows:

 m00vvvvv = 5-bit constant or assigned variable
 m01vvvvv = 13-bit constant (next-byte << 5) | vvvvv
 m10vvvvv = Content of variable
 m1100000 = 32-bit constant followed by 32-bit little-endian value (*1)
 m1100sss = System register 1-7
 m1101000 = Monadic -       TOS = 0-TOS             (negate)
 m1101001 = Monadic !       TOS = TOS ? 0 : 1       (logical complement)
 m1101010 = Monadic ~       TOS = TOS ^ 0xFFFFFFFF  (bitwise complement)
 m1101011 = Monadic @       TOS = Var[TOS]          (monadic index)
 m1101100 = Dyadic +        TOS = l + r             (addition)
 m1101101 = Dyadic -        TOS = l - r             (subtraction)
 m1101110 = Dyadic *        TOS = l * r             (multiplication)
 m1101111 = Dyadic /        TOS = l / r             (division)
 m1110000 = Dyadic %        TOS = l % r             (remainder)
 m1110001 = Dyadic ==       TOS = (l == r) ? 1 : 0  (test equal)
 m1110010 = Dyadic !=       TOS = (l != r) ? 1 : 0  (test not equal)
 m1110011 = Dyadic <        TOS = (l < r) ? 1 : 0   (test less than)
 m1110100 = Dyadic >        TOS = (l > r) ? 1 : 0   (test greater than)
 m1110101 = Dyadic <=       TOS = (l <= r) ? 1 : 0  (test less or equal)
 m1110110 = Dyadic >=       TOS = (l >= r) ? 1 : 0  (test greater or equal)
 m1110111 = Dyadic <<       TOS = l << r            (shift left)
 m1111000 = Dyadic >>       TOS = l >> r            (shift right)
 m1111001 = Dyadic &        TOS = l & r             (bitwise AND)
 m1111010 = Dyadic |        TOS = l | r             (bitwise OR)
 m1111011 = Dyadic ^        TOS = l ^ r             (bitwise XOR)
 m1111100 = Dyadic &&       TOS = l ? r : 0         (logical AND)
 m1111101 = Dyadic ||       TOS = l ? l : r         (logical OR)
 m1111110 = Dyadic =        Var[l] = TOS = r        (assignment)
 m1111111 = Dyadic []       TOS = Var[l+r]          (dyadic index)
   'm' bit means "more" - Every expression opcode within a single
   expression will have this bit set except for the LAST opcode.
   In other words, keep processing opcodes until you have processed
   one that does not have the 'm' bit set.
 *1 - 16-bit in 16-bit implementations.

 Expressions are evaluated on an RPN stack:
 - The value types are placed on the stack directly.
   For 5-bit,  use bits 4-0 of the expression opcode.
   For 13-bit, also retrieve bits 12-5 from the following byte.
   For 32-bit, retrieve 31-0 from following 4 bytes (little endian). (*1)
 - The variable type should be fetched from the variable slot indicated
   by bits 4-0 of the expression opcode, and then placed on the stack.
 - Variables >31 are fetched by value and monadic-index.
 - The monadic operators modify the value on the top of the stack, and
   do not remove anything from the stack.
 - The dyadic operators remove the right operand from the top of the
   stack, and modify the left operand (which is now TOS):
   =   assigns the right value to the variable indicated by the left,
       then sets the left (TOS) to that same value.
   []  Adds the right value to the left, then retrieves the value in
       the variable slot indicated by the resulting value, and assigns
       it to the left (TOS).
 Always start processing an expression with an empty stack. When you
 have completely processed the expression, you should be left with a
 single value on the stack which will be the result. *2
 *1 16-bit/15-0/2 bytes in 16-bit implementations.
 *2 It is possible for the compiler to produce code which leaves more
    than one value on the stack if it is performing redundant code
    elimination for expressions in which the results are not actually
    used. For example, consider the statement: (a=1)+(b=2)
    The addition operation is redundant since the final value of the
    expression is not used, however the assignment side-effects
    must be kept. In this case the compiler will emit a warning and
    remove the redundant Dyadic+, which results in two values on the
    stack. This is OK since the result of this expression is never
    used, and the stack is always emptied before beginning a new
    expression.
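
 The evaluation rules above can be put together as a C sketch of a 32-bit
 expression evaluator. It is NOT the actual ESL run-time code. The names
 (EslVars, esl_eval, ESL_NUMVAR, ESL_DEPTH), the stack depth, the assert
 bounds checks, the result of 0 for division by zero and the masking of
 shift counts are all assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define ESL_NUMVAR 64       /* hypothetical slot count for this sketch */
#define ESL_DEPTH  16       /* hypothetical RPN stack depth */

typedef struct {
    int32_t var[ESL_NUMVAR];    /* numeric variable slots */
    int32_t sys[8];             /* system registers, 1-7 used */
} EslVars;

static unsigned esl_slot(int32_t n)     /* bounds check is an assumption */
{
    assert(n >= 0 && n < ESL_NUMVAR);
    return (unsigned)n;
}

static int32_t esl_dyadic(EslVars *v, unsigned op, int32_t l, int32_t r)
{
    switch (op) {
    case 0x6C: return l + r;
    case 0x6D: return l - r;
    case 0x6E: return l * r;
    case 0x6F: return r ? l / r : 0;    /* zero divisor: sketch's choice */
    case 0x70: return r ? l % r : 0;
    case 0x71: return l == r;
    case 0x72: return l != r;
    case 0x73: return l < r;
    case 0x74: return l > r;
    case 0x75: return l <= r;
    case 0x76: return l >= r;
    case 0x77: return (int32_t)((uint32_t)l << (r & 31));
    case 0x78: return l >> (r & 31);
    case 0x79: return l & r;
    case 0x7A: return l | r;
    case 0x7B: return l ^ r;
    case 0x7C: return l ? r : 0;                    /* logical AND */
    case 0x7D: return l ? l : r;                    /* logical OR */
    case 0x7E: return v->var[esl_slot(l)] = r;      /* assignment */
    default:   return v->var[esl_slot(l + r)];      /* 0x7F: dyadic index */
    }
}

/* Evaluate one encoded expression starting at pc. Stores the value left
 * on top of the stack in *result and returns the address of the byte
 * following the expression. */
const uint8_t *esl_eval(EslVars *v, const uint8_t *pc, int32_t *result)
{
    int32_t st[ESL_DEPTH];
    int sp = 0;                         /* always start with an empty stack */
    uint8_t byte;

    do {
        byte = *pc++;
        unsigned op = byte & 0x7F;      /* strip the 'm' (more) bit */
        if (op < 0x68) {                /* value elements are pushed */
            int32_t val;
            if (op < 0x20) {            /* 5-bit constant */
                val = (int32_t)op;
            } else if (op < 0x40) {     /* 13-bit constant */
                val = (int32_t)(op & 0x1F) | ((int32_t)*pc++ << 5);
            } else if (op < 0x60) {     /* content of variable 0-31 */
                val = v->var[op & 0x1F];
            } else if (op == 0x60) {    /* 32-bit little-endian constant */
                uint32_t u = (uint32_t)pc[0] | ((uint32_t)pc[1] << 8) |
                             ((uint32_t)pc[2] << 16) | ((uint32_t)pc[3] << 24);
                val = (int32_t)u;
                pc += 4;
            } else {                    /* system register 1-7 */
                val = v->sys[op & 7];
            }
            assert(sp < ESL_DEPTH);
            st[sp++] = val;
        } else if (op < 0x6C) {         /* monadic operators modify TOS */
            assert(sp >= 1);
            int32_t *t = &st[sp - 1];
            switch (op) {
            case 0x68: *t = (int32_t)(0u - (uint32_t)*t); break;
            case 0x69: *t = !*t; break;
            case 0x6A: *t = ~*t; break;
            default:   *t = v->var[esl_slot(*t)]; break;   /* 0x6B */
            }
        } else {                        /* dyadic: pop right, replace left */
            assert(sp >= 2);
            int32_t r = st[--sp];
            st[sp - 1] = esl_dyadic(v, op, st[sp - 1], r);
        }
    } while (byte & 0x80);              /* stop after the byte without 'm' */

    *result = sp > 0 ? st[sp - 1] : 0;
    return pc;
}
```

 As a usage example, the byte sequence 85 82 EC 8A 6E (hex) pushes 5 and
 2, adds them, pushes 10 and multiplies, giving 70. Whether the compiler
 emits exactly these bytes for 5+2*10 is not confirmed by this document;
 the sequence is hand-built from the table above.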

String encoding:
----------------
When a string value is encoded in the script, it is represented as
follows:

    0ccccccc                = ASCII character 'c'
    00000000                = End of string
    100sssss                = String variable 's'
    1ttwwwww <expression>   = Expression value to string, w=FillWidth-1
        01 = Special (Binary in default implementations)
        10 = Decimal
        11 = Hexadecimal
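
A host or decompiler could classify each byte of an encoded string as in
the C sketch below. The enum and function names are hypothetical. The
sketch reads the table above as follows: a set top bit with tt = 00 is a
string variable reference, and for the numeric forms the fill width is the
low five bits plus one. The <expression> that follows a numeric form is
not decoded here.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    ESL_STR_END,        /* 00000000 */
    ESL_STR_CHAR,       /* 0ccccccc */
    ESL_STR_VAR,        /* 100sssss */
    ESL_STR_SPECIAL,    /* 101wwwww (binary in default implementations) */
    ESL_STR_DECIMAL,    /* 110wwwww */
    ESL_STR_HEX         /* 111wwwww */
} EslStrKind;

/* Classify one string byte. *arg receives the character, the string
 * slot, or the fill width, depending on the kind returned. */
EslStrKind esl_str_classify(uint8_t b, unsigned *arg)
{
    if (b == 0) {
        *arg = 0;
        return ESL_STR_END;
    }
    if ((b & 0x80) == 0) {
        *arg = b;                       /* plain ASCII character */
        return ESL_STR_CHAR;
    }
    if (((b >> 5) & 3) == 0) {
        *arg = b & 0x1F;                /* string variable slot */
        return ESL_STR_VAR;
    }
    *arg = (b & 0x1Fu) + 1u;            /* w = FillWidth-1 */
    switch ((b >> 5) & 3) {
    case 1:  return ESL_STR_SPECIAL;
    case 2:  return ESL_STR_DECIMAL;
    default: return ESL_STR_HEX;
    }
}
```

Under this reading, the byte E7 (hex) introduces an 8-digit hexadecimal
conversion and C3 (hex) introduces a 4-digit decimal conversion.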
