Contact: zeng  @  zegraph.com      Last update: August 2014

ZeScript Language

Introduction

ZeScript is a small, simple, and embeddable scripting language with a C-like syntax. Its philosophy is to keep the core light and fast and to make extending the core very easy. The sources for the dynamic link libraries of CAIRO (cairo.cpp), SQLITE (sqlite.cpp), netCDF (netcdf.cpp), HDF (hdf.cpp), and wxWidgets (zsw.cpp) speak for themselves. Not only can you add primitive functions for user objects, you can also assign a primitive function to redefine operator behavior for user objects. The source code for the matrix library shows how to make operators work on data arrays of char, short, integer, float, and double, and how to define __get and __set functions to access matrix values through the array access expressions of ZeScript.

ZeScript uses re2c to generate its token scanner, resulting in fast script parsing.

Variable and Object Type

ZeScript variables are dynamic and are created or updated by assignment. The object type that a variable represents may be null, boolean, integer, real, string, hash array, class, or user object. A user object holds a pointer created by a user's primitive function. For example,

/******************************************************
 *   Like C, text between the slash-star and star-slash
 * markers is a comment, and a long comment like this
 * one may span multiple lines.
 *
 *  As shown below, anything after // becomes a
 * one-line comment.
 ******************************************************/

a = 135;                     // a is integer type
a = 0x07FF;                  // hexadecimal equivalent of 2047
a = 135.0;                   // a is now real type
b = true;                    // b is boolean type
c = false;                   // c is boolean type
d = e = f = null;            // d, e, f are set to null
s = "I love ZeScript.";      // s is string type

An expression must end with a semicolon ";".

Internally, a boolean true is equivalent to the integer 1 and false to 0. A real is a double-precision floating-point number and an integer is 64 bits wide. A string, i.e., text between a pair of single quote (') or double quote (") characters, may contain C escape sequences. In a single-quoted string, \' includes ' in the string, while " needs no escaping. Similarly, in a double-quoted string the " character requires escaping, but ' does not. Since \ is used for escaping, it must be written as \\ in a string. A string may span multiple lines, but must not exceed 8 KB. A string may also include \x?? to embed the character whose hexadecimal code is ??.

A hash array created by the operator [] contains a collection of key-value pairs, of which the key is a string and the value can be an object of any type.

a = ["name"="My Name", "Age"=5];  // , is used to separate array items

When the key is omitted, the string equivalent of position index, counting from zero, will be used, e.g.,

a = [1, 3, 5];
// is the same as
a = ["0"=1, "1" = 3, "2"=5];

Note that ["name"=..., ...] is not the same as [name=..., ...]: while "name" in the former is a string, name in the latter is a variable that must be defined previously.
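
The key rule for array literals can be modeled outside ZeScript. The following is a minimal Python sketch, not ZeScript code and not the interpreter's implementation: a dict with string keys stands in for a hash array, and the hypothetical helper names Keyed and make_array are introduced only for illustration. It assumes that the position counter advances for every item, keyed or not.

```python
from collections import namedtuple

# Hypothetical stand-in for the "key"=value form inside [ ... ].
Keyed = namedtuple("Keyed", ["key", "value"])

def make_array(*items):
    """Model a ZeScript array literal as a dict with string keys.

    Bare values are keyed by the string form of their position,
    counting from zero (assumed to count keyed items as well).
    """
    array = {}
    for position, item in enumerate(items):
        if isinstance(item, Keyed):
            array[str(item.key)] = item.value
        else:
            array[str(position)] = item
    return array

print(make_array(1, 3, 5))
print(make_array(Keyed("name", "My Name"), Keyed("Age", 5)))
```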

Assignment

An assignment expression sets or defines a variable. The left side of "=" receives a copy of the right side if the right side is null, boolean, integer, real, string, or array; it receives a reference to the right side for class and user types, unless a __copy function is defined for the class or registered as a primitive function for the user object. In that case, the __copy function is called with the user or class object as the first parameter and the object resulting from the right expression as the second parameter.

The array creation expression may be used for multiple variable assignment:

[a, b, c] = [1, 2, 3];        // a=1, b=2, c=3
[a, b, c] = [1, 2];           // a=1, b=2, c=null
[a, b, c] = [1, "a"=2, 3];    // a=1, b=null, c=3
[a, b, c] = 1;                // a=1, b=1, c=1

Assigning to multiple variables works as follows:

  1. Items in the left array list must be variable names. If the "::" operator is used to indicate a global variable, that variable must be defined in an ancestor in the hierarchy that contains the expression.
  2. If the right object is not an array, assign it to all variables on the left;
  3. Otherwise get objects in the right array using variables' positions as keys and assign obtained objects to corresponding variables in the left array.

This feature offers a convenient way to receive multiple values returned as an array from a function.
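
Rules 2 and 3 above can be checked against the four examples with a small model. This is a Python sketch under stated assumptions, not ZeScript code: a dict with string keys stands in for a hash array, None stands in for null, and assign_multiple is a hypothetical name.

```python
def assign_multiple(names, right):
    """Return {variable name: value} for `[names...] = right`."""
    if not isinstance(right, dict):
        # Rule 2: a non-array right side goes to every variable.
        return {name: right for name in names}
    # Rule 3: each variable's position, as a string, is the lookup key;
    # a missing key yields None (null).
    return {name: right.get(str(pos)) for pos, name in enumerate(names)}

# [a, b, c] = [1, "a"=2, 3];
print(assign_multiple(["a", "b", "c"], {"0": 1, "a": 2, "2": 3}))
```

In the last example, b ends up null because the right array has no item under the key "1".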

Array Access

The expressions expr.expr and expr[expr] are called array access or member access expressions.

Objects in an array may be set or retrieved through their keys, e.g.,

a.addr = "123 Street";      // update or set a key-value pair of an array

a["addr"] = "123 Street";   // same as above

b = a.addr;                 // b is now "123 Street"
b = a["addr"];              // same as above

When the key of an item is an integer, the integer's string equivalent is used as the key, e.g.,

a = [1, 3, 5];   // create array
b = a[0];        // b contains 1
a[1] = 10;       // now a contains [1, 10, 5]

Array access returns null if the array does not have the key.

When an array contains only numerical values and all keys are positional (indexed by integer starting from 0), it will be treated like a collection of numbers by operators and functions, e.g.,

a = [1, 2.1, 3.5, 5];
a++;             // a-array now contains [2, 3.1, 4.5, 6];
b = sin(a);      // b-array contains [sin(2), sin(3.1), sin(4.5), sin(6)]
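
The element-wise treatment of numeric arrays can be pictured with a short Python model. It is only a sketch: a dict with string keys stands in for the array, the helper names is_numeric_array and map_numeric are hypothetical, and what ZeScript does when the array is not purely numeric is not stated here, so the model simply raises an error in that case.

```python
import math

def is_numeric_array(array):
    """True if the keys are "0".."n-1" in order and all values are numbers."""
    keys_ok = list(array) == [str(i) for i in range(len(array))]
    values_ok = all(
        isinstance(v, (int, float)) and not isinstance(v, bool)
        for v in array.values()
    )
    return keys_ok and values_ok

def map_numeric(array, fn):
    """Apply fn to every element of a purely numeric positional array."""
    if not is_numeric_array(array):
        # Behavior for mixed arrays is an assumption of this model.
        raise TypeError("not a purely numeric positional array")
    return {key: fn(value) for key, value in array.items()}

a = {"0": 1, "1": 2.1, "2": 3.5, "3": 5}
incremented = map_numeric(a, lambda v: v + 1)   # models a++
sines = map_numeric(incremented, math.sin)      # models b = sin(a)
```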

Multiple keys are also allowed in getting and setting values:

a = [10, 11, 12, 13, "hi"];   // create an array
b = a[1,4];                   // b becomes an array containing [11,"hi"]
a[0,1] = [100,200];           // now a contains [100,200,12,13,"hi"]
a[2] = b;                     // now a contains [100,200,[11,"hi"],13,"hi"]

Multiple assignment to array items works as follows:

  1. Expressions inside [] of the left expression must produce strings or integers to represent array keys.
  2. If the right object is not an array, assign it to all items selected by the keys on the left;
  3. Otherwise, get objects in the right array using the positions of the keys as lookup keys and assign the obtained objects to the corresponding items in the left array.
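
The multi-key rules can be modeled as follows. This is a Python sketch, not ZeScript code: a dict with string keys stands in for the array, and get_items and set_items are hypothetical names. Treating a single key as plain access, so that an array on the right is stored whole, is an assumption inferred from the a[2] = b example.

```python
def get_items(array, keys):
    """Model b = a[k0, k1, ...]: a new array keyed by position."""
    return {str(pos): array.get(str(key)) for pos, key in enumerate(keys)}

def set_items(array, keys, right):
    """Model a[k0, k1, ...] = right, modifying array in place."""
    if len(keys) == 1:
        # Assumption: single-key assignment stores the right object as is.
        array[str(keys[0])] = right
        return
    for pos, key in enumerate(keys):
        if isinstance(right, dict):
            array[str(key)] = right.get(str(pos))   # rule 3
        else:
            array[str(key)] = right                 # rule 2

a = {"0": 10, "1": 11, "2": 12, "3": 13, "4": "hi"}
b = get_items(a, [1, 4])                     # b = a[1,4];
set_items(a, [0, 1], {"0": 100, "1": 200})   # a[0,1] = [100,200];
set_items(a, [2], b)                         # a[2] = b;
```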

Array Access for String

The array access expression a[...] may be used to get and set characters in a string, e.g.,

s = "ABCDEFG";    // create a string
a = s[1];         // a contains "B"
s[0] = a;         // now s contains "BBCDEFG"
b = s[1,3];       // b contains "BD";
b = s[1:3];       // b contains "BCD";
s[1,3] = "A";     // now s contains "BACAEFG"
s[1,3] = "XYZ";   // now s contains "BXCYEFG"
s[5] = 90;        // now s contains "BXCYEZG";

Assigning to characters of a string through the array access expression works as follows:

  1. Expressions inside [] of the left expression must produce integers in the range of 0 to the left string length to represent indices of characters; the positions of those expressions are used as indices for finding characters in the right string.
  2. The right object must be either a string or an integer. An integer is treated as the decimal ASCII code of a character.
  3. Find the character in the right string and assign it to the left string. If the index used to find a character reaches or exceeds the right string's length, it is taken modulo that length.
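
The three rules can be traced with a small Python model of the string examples. It is a sketch only, not ZeScript code; get_chars and set_chars are hypothetical names, and the behavior for an empty right string is not covered.

```python
def get_chars(text, indices):
    """Model b = s[i0, i1, ...]: collect the characters at the indices."""
    return "".join(text[i] for i in indices)

def set_chars(left, indices, right):
    """Model s[i0, i1, ...] = right and return the updated string."""
    if isinstance(right, int):
        right = chr(right)            # rule 2: integer is an ASCII code
    chars = list(left)
    for pos, index in enumerate(indices):
        # Rule 3: the position wraps around the right string's length.
        chars[index] = right[pos % len(right)]
    return "".join(chars)

s = "BBCDEFG"
s = set_chars(s, [1, 3], "A")      # "BACAEFG"
s = set_chars(s, [1, 3], "XYZ")    # "BXCYEFG"
s = set_chars(s, [5], 90)          # "BXCYEZG"
print(s)
```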

Array Access for User Object

For a user object, the expression

a = user.member; 

will call the __get primitive function registered for that user type, with the user object as the first parameter and a string object "member" as the second parameter. And

user.member = expr;

will call the __set primitive function with the user object as the first parameter, a string object "member" as the second parameter, and the object resulting from the right expression as the third parameter.

Similarly,

a = user[expr,...];

will call the __get primitive function with the user object as the first parameter, followed by the objects resulting from the expressions inside []. And

user[expr,...] = expr;

will call the __set primitive function with the user object as the first parameter, followed by the objects resulting from the expressions inside [], and the object from the right expression as the last parameter.

The Range Operator ":"

A range expression, such as a:b or a:b:c, produces an array of two or three numbers, e.g.,

r1 = 1:10;          // r1 is [1, 10]
r2 = 1:10:2;        // r2 is [1, 10, 2]

When a range expression is used in array creation, it fills the array with numbers from the first number to the second. The increment step is 1 for the a:b form and is the third number for the a:b:c form. For example,

a = [1:11];        // a is [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

When such an expression is used as an index for accessing elements of an array, string, or user object, it selects the elements with indices from the first number to the second, in increments of the third; all numbers resulting from the range expression must be integers. A null or * may replace the second number to mean the last index. For example,

b = a[0:*:2];      // b is [1, 3, 5, 7, 9, 11]
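
Both uses of the range expression can be modeled in a few lines of Python. This is a sketch under assumptions, not ZeScript code: the end of the range is taken as inclusive (as the examples suggest), only positive steps are handled, None stands in for null or *, and expand_range and slice_by_range are hypothetical names.

```python
def expand_range(first, last, step=1):
    """Model [first:last:step] in array creation (inclusive end)."""
    if step <= 0:
        raise ValueError("this model only handles positive steps")
    values = []
    value = first
    while value <= last:
        values.append(value)
        value += step
    return values

def slice_by_range(items, first, last=None, step=1):
    """Model items[first:last:step]; None for last means the last index."""
    if last is None:
        last = len(items) - 1
    return [items[i] for i in expand_range(first, last, step)]

a = expand_range(1, 11)            # a = [1:11];
b = slice_by_range(a, 0, None, 2)  # b = a[0:*:2];
print(b)
```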

The "+" Operator

In addition to its normal outcome when the operands are integer or real, the operator is extended to null, string, and array as follows:

  1. When an operand is null, the result is the other operand, whatever its type.
  2. When an operand is a string, the other operand is converted to a string and concatenated with it; the result is a string.
  3. When an operand is an array, the operator is applied to each array element according to rules 1 and 2; the result is an array.
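
The three rules can be modeled as a small Python function. It is a sketch only, not ZeScript code: None stands in for null, a Python list stands in for a positional array, zs_add is a hypothetical name, and number-to-string conversion uses Python's str(), which may differ from ZeScript's formatting. The case of two array operands is not described above, so the model leaves it out.

```python
def zs_add(a, b):
    """Model the extended "+" for null, string, and array operands."""
    if isinstance(a, list) and isinstance(b, list):
        raise NotImplementedError("array + array is not covered by this model")
    if isinstance(a, list):                       # rule 3
        return [zs_add(item, b) for item in a]
    if isinstance(b, list):                       # rule 3
        return [zs_add(a, item) for item in b]
    if a is None:                                 # rule 1
        return b
    if b is None:                                 # rule 1
        return a
    if isinstance(a, str) or isinstance(b, str):  # rule 2
        return str(a) + str(b)
    return a + b                                  # ordinary numeric addition

print(zs_add([1, None, "a"], 2))
```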

Loop and Control

ZeScript supports three loop structures (while, do, for):

n = 0;
while (n >= 0) {
    n++;
    if (n == 50) continue;        // skip this number
    if (n > 100) break;           // jump out of the loop
    csv(n);
}

/////////////////////////////////////

n = 0;
do {
    n++;
    if (n == 50) continue;        // skip this number
    if (n > 100) break;           // jump out of the loop
    csv(n);
} while (n >= 0);

/////////////////////////////////////

for (i = 0; i < 100; i++) {
    if (i == 10) continue;        // skip the rest when i == 10
    csv(i);
}

And three control/redirect structures (if, switch, goto):

a = 1;
b = 2;
c = 3;

/////////////////////////////////////
if (a > 0) {
    csv(a);
    // more expressions may follow.
}

/////////////////////////////////////

if (a > b) csv(a);        // {} is optional if there is only one expression
else       csv(b);        // after "if" or "else".

/////////////////////////////////////

if (a > b) {
    csv(b);
}
else if (a > c) {
    csv(c);
}
else {
    csv(a);
}

/////////////////////////////////////

switch (flag) {
    case C1:
        csv("C1");
        break;
    case C2:
    case C3:
        csv(flag);
        break;
    default:
        csv("default");
}

The switch argument and the expressions after case must evaluate to an integer or a string. Each case marks where execution starts when the switch argument matches the case value. The default label is mandatory inside a switch block.

Module and Function

ZeScript code in a file comprises a module. You may define variables and functions in a module. A module may import other modules to access functions defined in them, e.g.,

////////// hello.zs //////////////////


n = 100;


function hello(a, b, c)
{
    csv(a, b, c, n, "Hello!");
}

////////// main module ///////////////

import "hello.zs";

hello(1, 2, 3);    // call function in hello.zs

The import command is like the #include directive in C. Because it is processed by ZeScript at compile time, it may be placed anywhere in a script. You may also use the import() function to load a module dynamically. While a module imported by the import command persists throughout the existence of the importer, a module imported by the import() function is removed when its return variable goes out of scope.

Importing looks for script files in the current directory first, and then in subdirectories of lib, cls, or cgi; or other subdirectories included internally in the ZeScript virtual machine.

As shown above, a function is declared by the keyword "function" followed by the function name, arguments in (), and expressions in {}. An argument may have a default value. When an argument's default value is not set, null is assumed implicitly. In a function call, parameters are passed to arguments at corresponding positions, but when an assignment expression is used, the object resulting from the right side is assigned to the argument that has the same name as the left side.

function f1(a, b=1, c=2.1, d="hi!", e=[1, 2, 3])
{
    csv(a, b, c, d, e);
}

f1();   // call f1 with no arguments
        // output: null, 1, 2.100000, hi!, [0=1, 1=2, 2=3]

f1(a=1, 2, e="Hi!", d=[1, 2, 3]);    // call f1 with positional and assignment arguments
                                     // output: 1, 2, 2.100000, [0=1, 1=2, 2=3], Hi!
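
How parameters are matched to arguments in the second call can be modeled with a short Python sketch. It is not ZeScript code: Named and bind_arguments are hypothetical names, None stands in for null, and binding a positional parameter by its position in the call list is an inference from the example above. The text does not say what happens when an assignment names an unknown argument, so the model rejects it.

```python
from collections import namedtuple

# Hypothetical stand-in for the name=value form in a call.
Named = namedtuple("Named", ["name", "value"])

def bind_arguments(params, call_args):
    """params: list of (name, default); call_args: values or Named items."""
    names = [name for name, _ in params]
    bound = {name: default for name, default in params}
    for pos, arg in enumerate(call_args):
        if isinstance(arg, Named):
            if arg.name not in bound:
                raise ValueError("unknown argument: " + arg.name)
            bound[arg.name] = arg.value
        else:
            bound[names[pos]] = arg   # positional: same position in the call
    return bound

f1_params = [("a", None), ("b", 1), ("c", 2.1), ("d", "hi!"), ("e", [1, 2, 3])]
call = [Named("a", 1), 2, Named("e", "Hi!"), Named("d", [1, 2, 3])]
print(bind_arguments(f1_params, call))
```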

Objects of null, boolean, integer, real, string, and array are passed to functions by value; and objects of class and user are passed by reference.

Calling a script function recursively is allowed:

function Ack(m, n)
{
    if (m == 0) {
        return n + 1;
    }
    if (n == 0) {
        return Ack(m - 1, 1);
    }
    return Ack(m - 1, Ack(m, n - 1));
}


csv(Ack(3, 4));

A function may be declared inside another function. In that case, the inside function may access the private variables of the parent function. For example,

function f(a, b)
{
    a += 10;

    // more expressions here

    ff();    // call function defined internally

    function ff()
    {
        csv(a, b);  // access variable of parent function 
    }
}

In a sense, a module is like an anonymous function.

Primitive Functions

Primitive functions in a dynamic link library may be loaded to the global primitive function table of ZeScript by any module. For example:

load("my.dll");       // load primitive functions in my.dll

hello();              // suppose hello() is a primitive function in my.dll

Primitive functions take precedence over script functions in a function call. That is, when a loaded primitive function has the same name as a script function, the primitive function is used.

Object Method

The expression:

object.method(...);

calls the function that belongs to the object.

When the object is a class type (discussed later), a script function named "method" must be declared in the class.

When the object is of any other type, a primitive function named "method" must be registered for that type (see API reference) and the library containing the function must be loaded. The primitive function always receives the object as the first parameter. Because you can use the same function name for different types of objects, the expression is like calling class methods in C++.

Call Back

ZeScript implements a mechanism for calling script functions from a primitive function. See API reference for details.

Class

A class is a structured object declared at the module level and must be instantiated before being used. A class is like a special module in that class-level variables are shared by its internal functions, but protected from functions of other modules or classes. The difference is that each instance of a class has its own variable context while a module has only one variable context.

A very special feature of ZeScript's class is that operator functions may be defined to process such an expression as a + b. For example,

class Point {

    cx = 0;    // initialize class level variables
    cy = 0;
    cz = 0;

    function set(x, y, z)
    {
        ::cx = x; ::cy = y; ::cz = z;
    }

    function add(x, y, z)
    {
        cx += x; cy += y; cz += z;
    }

    function csv()
    {
        csv(cx, cy, cz);
    }

    // this is an operator function for +
    function __add(a, b)
    {
        c = new Point;
        c.cx = a.cx + b.cx;
        c.cy = a.cy + b.cy;
        c.cz = a.cz + b.cz;
        return c;
    }
}

a = new Point;          // create a point
a.set(10, 10, 10);      // call a's function

b = new Point;
b.cx += 5;              // access b's variable directly

c = a + b;              // because a is a class, call a's
                        // operator function for "+".
                        // c is now a class object.
c.csv();

For a binary operator, ZeScript calls the operator function of the operand with the higher rank in the order null, boolean, integer, real, string, array, class, and user. In the above example, if a is a number and b is a class, b's __add() function will be called with b as the first argument and a as the second.
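
The rank rule can be modeled as a lookup in an ordered list. This is a Python sketch, not ZeScript code: RANK and dispatch are hypothetical names, operands are represented as (type name, label) pairs, and choosing the left operand when both ranks are equal is an assumption based on the a + b example with two Point objects.

```python
# Operand types from lowest to highest rank, as listed above.
RANK = ["null", "boolean", "integer", "real", "string", "array", "class", "user"]

def dispatch(left, right):
    """Return (owner label, (first argument label, second argument label)).

    left and right are (type_name, label) pairs for the two operands.
    """
    left_rank = RANK.index(left[0])
    right_rank = RANK.index(right[0])
    if right_rank > left_rank:
        # The higher-ranked right operand owns the call and comes first.
        return right[1], (right[1], left[1])
    return left[1], (left[1], right[1])

print(dispatch(("integer", "a"), ("class", "b")))
```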

A class may redefine operator functions listed as follows:

__neg(a) <==> -a            __not(a) <==> !a           __cmpl(a) <==> ~a
__incr(a) <==> a++          __decr(a) <==> a--
__mul(a,b) <==> a * b       __add(a,b) <==> a + b
__div(a,b) <==> a / b       __div2(a,b) <==> b / a
__mod(a,b) <==> a % b       __mod2(a,b) <==> b % a
__sub(a,b) <==> a - b       __sub2(a,b) <==> b - a
__le(a,b) <==> a <= b       __lt(a,b) <==> a < b
__ge(a,b) <==> a >= b       __gt(a,b) <==> a > b
__eq(a,b) <==> a == b       __ne(a,b) <==> a != b
__and(a,b) <==> a & b       __or(a,b) <==> a | b        __xor(a,b) <==> a ^ b
__nn(a,b) <==> a && b       __oo(a,b) <==> a || b
__rsh(a,b) <==> a >> b      __lsh(a,b) <==> a << b
__mul_eq(a,b) <==> a *= b   __div_eq(a,b) <==> a /= b   __mod_eq(a,b) <==> a %= b
__add_eq(a,b) <==> a += b   __sub_eq(a,b) <==> a -= b
__rsh_eq(a,b) <==> a >>= b  __lsh_eq(a,b) <==> a <<= b
__and_eq(a,b) <==> a &= b   __or_eq(a,b) <==> a |= b    __xor_eq(a,b) <==> a ^= b

Overloading Operators

All operators of ZeScript may be redefined to act on any type of user object. Please refer to the API page and the source code of the matrix library to see how this is done.

load("matrix.dll");

A = double(10, 10);       // 10x10 double matrix
A.fill(1, 1);             // A contains numbers from 1 to 100
A *= 10;                  // A contains numbers from 10 to 1000
A += 1;                   // A contains numbers from 11 to 1001

Variable Scope

A variable defined in a module is accessible only to expressions, functions, and classes declared in that module; a variable defined in a class is accessible only to expressions and functions declared in that class; and a variable defined in a function is only accessible to expressions and functions declared in that function.

Each module has its own variable context; each instance of class has its own variable context; and each function executes in its own variable context.

When an expression reads a variable, the search starts in the function, class, or module that the expression belongs to and then, if that fails, continues in the owner of that function or class. When the :: operator is used, the search starts in the owner instead.

When an expression sets a variable that does not exist in the function, class, or module that the expression belongs to, a new variable is defined locally. When the :: operator is used, the expression instead looks for the variable in the owner of that function or class and sets the value there.

The following example shows the concept of variable scope:

a = 0;
b = 0;

function f()
{
    a = 10;
    ::a = 100;
    csv(a, b, ::a);
    ff();

    function ff()
    {
        a = 1;
        ::b = -1;
        csv(a, ::a, ::b);
    }
}

csv(a, b);
f();
csv(a, b);
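
The lookup rules can be traced through this example with a small Python model. It is a sketch of the rules as worded in this section, not ZeScript code and not verified against the interpreter: the Scope class and its method names are hypothetical, and raising an error for an undefined variable is an assumption of the model.

```python
class Scope:
    """One variable context (module, class instance, or function call)."""

    def __init__(self, owner=None):
        self.vars = {}
        self.owner = owner

    def get(self, name):
        """Plain read: search here first, then in the owners."""
        scope = self
        while scope is not None:
            if name in scope.vars:
                return scope.vars[name]
            scope = scope.owner
        raise NameError(name)

    def set(self, name, value):
        """Plain write: update or define the variable locally."""
        self.vars[name] = value

    def get_outer(self, name):
        """Read with ::, so the search starts in the owner."""
        return self.owner.get(name)

    def set_outer(self, name, value):
        """Write with ::, to the nearest owner that defines the name."""
        scope = self.owner
        while scope is not None:
            if name in scope.vars:
                scope.vars[name] = value
                return
            scope = scope.owner
        raise NameError(name)

module = Scope()
module.set("a", 0)
module.set("b", 0)

f = Scope(module)           # body of f()
f.set("a", 10)              # a = 10;
f.set_outer("a", 100)       # ::a = 100;

ff = Scope(f)               # body of ff()
ff.set("a", 1)              # a = 1;
ff.set_outer("b", -1)       # ::b = -1;

print(module.get("a"), module.get("b"))
```

Under this model, the module-level a and b end up as 100 and -1 after f() returns, while the local a of f stays 10.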

try {...} catch (error) {...}

In case of using ZeScript in a server environment, it may not be desirable to allow any ZeScript runtime error to interrupt the server service. The try-catch feature of ZeScript may be used to catch and process the error message, e.g.,

try {
    a++;
    ...
}
catch (error) {
    cgi_error(error);
}

Reserved Keywords

addpath   break     case      catch     class
continue  default   do        else      false
for       function  goto      if        import
new       null      return    switch    true
try       while

Tips

Use a++ instead of a = a + 1, a *= b instead of a = a * b, and so on for efficiency. Try to reuse variable names so that memory allocated for unused variables is released immediately.