ZeScript/ZeGraph

ZeScript Language

Introduction

ZeScript is a small, simple, embeddable scripting language with a C-like syntax. Its philosophy is to keep the language light and fast and to make extensions simple to add. The sources for the dynamic link libraries of CAIRO (cairo.cpp), SQLITE (sqlite.cpp), netCDF (netcdf.cpp), HDF (hdf.cpp), and wxWidget (zsw.cpp) show how this is done. Not only can you add primitive functions for user objects, you can also assign primitive functions that redefine operator behavior for user objects. The source code for the matrix library shows how to make operators work on data arrays of char, short, integer, float, and double, and how to define __get and __set functions so that matrix values can be accessed through ZeScript's array access expressions.

ZeScript uses re2c to generate its token scanner, which makes script parsing fast.

Variable and Object Type

ZeScript variables are dynamically typed and are created or updated by assignment. A variable may hold a null, boolean, integer, real, string, hash array, class, or user-defined object. For example,

/******************************************************
 *   C-Like multi-line comments.
 ******************************************************/

a = 135;                     // a is integer type (C-like single-line comment)
a = 0x07FF;                  // hexadecimal equivalent of 2047
a = 135.0;                   // a is now real type
b = true;                    // b is boolean type
c = false;                   // c is boolean type
d = e = f = null;            // d, e, f are set to null
s = "I love ZeScript.";      // s is string type

An expression must end with a semicolon ";".

Internally, a boolean true is equivalent to the integer 1 and false to 0. A real is a double-precision floating-point number and an integer is 64 bits wide. A string, i.e., text between a pair of single quote (') or double quote (") characters, may contain escape characters (as in C).

In a single-quoted string, \' can be used to include ', but " does not require escaping. Similarly, in a double-quoted string the " character requires escaping, but ' does not. Since \ is used for escaping, it must be written as \\ in a string. A string may also span multiple lines, but such a long string must not exceed 8 KB. A string may also include \x## to embed the character whose hexadecimal code is ##.

A hash array created by the operator [] contains a collection of key-value pairs, of which the key is a string and the value can be an object of any type.

a = ["name"="My Name", "Age"=5];

When the key is omitted, the string equivalent of position index (counting from zero) will be used, e.g.,

a = [1, 3, 5];
// is the same as
a = ["0"=1, "1" = 3, "2"=5];

Note that ["name"=..., ...] is not the same as [name=..., ...]: While "name" in the former is a string, name in the latter is a variable that must be defined somewhere before the statement.

Assignment

An assignment expression sets or defines a variable. If the right side of "=" is null, a boolean, an integer, a real, a string, or an array, the left side receives a copy of it. If the right side is a class or user object, the left side receives a reference to it, unless a __copy function is defined for the class or registered as a primitive function for the user object.

The array creation expression may be used for multiple variable assignment:

[a, b, c] = [1, 2, 3];        // a=1, b=2, c=3
[a, b, c] = [1, 2];           // a=1, b=2, c=null
[a, b, c] = 1;                // a=1, b=1, c=1

Assigning to multiple variables works as follows:

Items in the left array list must be variable names.
If the right object is not an array, it is assigned to every variable on the left.
Otherwise, each variable's position is used as a key to look up an object in the right array, and that object is assigned to the variable.

This feature offers a convenient way to receive multiple objects returned as an array by a function.
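The rules above can be modelled outside ZeScript. The following is a minimal Python sketch, not ZeScript's implementation: a hash array is represented by a dict with string keys, null by None, and the helper names destructure and array are hypothetical.

```python
def array(*values):
    """Build a positional hash array: keys are "0", "1", ..."""
    return {str(i): v for i, v in enumerate(values)}

def destructure(names, right):
    """Model `[a, b, c] = right` and return the resulting bindings."""
    if not isinstance(right, dict):
        # Non-array on the right: every variable receives the same value.
        return {name: right for name in names}
    # Array on the right: the variable at position i looks up key str(i);
    # a missing key yields None (ZeScript null).
    return {name: right.get(str(i)) for i, name in enumerate(names)}

print(destructure(["a", "b", "c"], array(1, 2, 3)))  # {'a': 1, 'b': 2, 'c': 3}
print(destructure(["a", "b", "c"], array(1, 2)))     # {'a': 1, 'b': 2, 'c': None}
print(destructure(["a", "b", "c"], 1))               # {'a': 1, 'b': 1, 'c': 1}
```

The three calls mirror the three assignments shown earlier in this section.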

Array Access

The expressions expr.expr and expr[expr] are called array access or member access expressions.

Objects in an array may be set or obtained through their keys, e.g.,

a.addr = "123 Street";      // update or set a key-value pair of an array

a["addr"] = "123 Street";   // same as above

b = a.addr;                 // b is now "123 Street"
b = a["addr"];              // same as above

When the key of an item is an integer, the integer's string equivalent is used as the key, e.g.,

a = [1, 3, 5];   // create array
b = a[0];        // b contains 1
a[1] = 10;       // now a contains [1, 10, 5]

Array access returns null if the array does not have the key.

When an array contains only numerical values and all keys are positional (indexed by integer starting from 0), it will be treated like a collection of numbers by operators and functions, e.g.,

a = [1, 2.1, 3.5, 5];
a++;             // a-array now contains [2, 3.1, 4.5, 6];
b = sin(a);      // b-array contains [sin(2), sin(3.1), sin(4.5), sin(6)]

Multiple keys are also allowed in getting and setting values:

a = [10, 11, 12, 13, "hi"];   // create an array
b = a[1,4];                   // b becomes an array containing [11,"hi"]
a[0,1] = [100,200];           // now a contains [100,200,12,13,"hi"]
a[2] = b;                     // now a contains [100,200,[11,"hi"],13,"hi"]

Multiple assignment to array items works as follows:

Expressions inside [] of the left expression must produce strings or integers to represent array keys.
If the right object is not an array, it is assigned to every item selected on the left.
Otherwise, the position of each key inside [] is used as a key to look up an object in the right array, and that object is assigned to the corresponding item on the left.
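As an illustration of multi-key access, here is a small Python model. It is a sketch under assumptions: a hash array is a dict with string keys, null is None, and get_items, set_items, and array are hypothetical helper names. The single-key case assigns the whole right object, as the a[2] = b example shows.

```python
import copy

def array(*values):
    """Build a positional hash array: keys are "0", "1", ..."""
    return {str(i): v for i, v in enumerate(values)}

def get_items(arr, *keys):
    """Model `arr[k]` (one key) and `arr[k1, k2, ...]` (several keys)."""
    if len(keys) == 1:
        return arr.get(str(keys[0]))       # a missing key yields None (null)
    return array(*(arr.get(str(k)) for k in keys))

def set_items(arr, keys, right):
    """Model `arr[k] = right` and `arr[k1, k2, ...] = right`."""
    if len(keys) == 1:
        arr[str(keys[0])] = copy.deepcopy(right)   # arrays are assigned by copy
        return
    for position, key in enumerate(keys):
        if isinstance(right, dict):
            arr[str(key)] = right.get(str(position))
        else:
            arr[str(key)] = right

a = array(10, 11, 12, 13, "hi")
b = get_items(a, 1, 4)                  # {"0": 11, "1": "hi"}
set_items(a, [0, 1], array(100, 200))
set_items(a, [2], b)
print(a)
```

Integer keys are converted with str() before use, matching the statement that an integer key is replaced by its string equivalent.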

Array Access for String

The array access expression a[...] may be used to get and set characters in a string, e.g.,

s = "ABCDEFG";    // create a string
a = s[1];         // a contains "B"
s[0] = a;         // now s contains "BBCDEFG"
b = s[1,3];       // b contains "BD";
b = s[1:3];       // b contains "BCD";
s[1,3] = "A";     // now s contains "BACAEFG"
s[1,3] = "XYZ";   // now s contains "BXCYEFG"
s[5] = 90;        // now s contains "BXCYEZG";

Accessing a substring through the array access expression works as follows:

Expressions inside [] of the left expression must produce integers to represent positions in the string.
The right object must be either a string or an integer. An integer is treated as the ASCII code of a character.
For each position on the left, the character at the matching index of the right string is assigned to the left string. If that index reaches beyond the end of the right string, it is taken modulo the right string's length.
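The wrap-around rule is easier to see in code. The sketch below models it in Python; assign_chars is a hypothetical helper, and Python strings being immutable, it returns a new string instead of changing s in place.

```python
def assign_chars(s, positions, right):
    """Model `s[positions...] = right` for a ZeScript string."""
    if isinstance(right, int):
        right = chr(right)          # an integer is treated as an ASCII code
    chars = list(s)
    for i, pos in enumerate(positions):
        # The i-th target position takes the i-th character of the right
        # string; the index wraps around when the right string is shorter.
        chars[pos] = right[i % len(right)]
    return "".join(chars)

s = assign_chars("BBCDEFG", [1, 3], "A")    # "BACAEFG"
s = assign_chars(s, [1, 3], "XYZ")          # "BXCYEFG"
s = assign_chars(s, [5], 90)                # "BXCYEZG"
print(s)
```

The three calls reproduce the last three assignments of the example above.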

The Range Operator ":"

When a range expression, such as "a:b" or "a:b:c", is used in array creation, it fills the array with numbers from the first number to the second. The increment step is 1 for the "a:b" form and the third number for the "a:b:c" form. For example,

a = [1:11];        // a is [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

When such an expression is used as an index to access elements of an array, string, or user object, it selects the elements with indices from the first number to the second, in increments of the third. A null or * may replace the second number to mean the last element. For example,

b = a[0:*:2];      // b is [1, 3, 5, 7, 9, 11]
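A rough Python model of both uses of the range operator follows. It assumes integer bounds and a positive step, treats both ends as inclusive (as the [1:11] example does), and uses the hypothetical helper names expand_range and index_range; None stands in for null or *.

```python
def expand_range(first, last, step=1):
    """Model `[first:last]` or `[first:last:step]` in array creation."""
    return list(range(first, last + 1, step))

def index_range(items, first, last=None, step=1):
    """Model `items[first:last:step]`; last=None plays the role of null or *."""
    if last is None:
        last = len(items) - 1
    return [items[i] for i in range(first, last + 1, step)]

a = expand_range(1, 11)
b = index_range(a, 0, None, 2)
print(a)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(b)  # [1, 3, 5, 7, 9, 11]
```

Note that, unlike Python's own slices, the second number is included in the result.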

The "+" Operator

In addition to its normal outcome when the operands are integer or real, the operator is extended to null, string, and array as follows:

When an operand is null, the result is the other operand, whatever its type.
When an operand is a string, the other operand is converted to a string and concatenated with it; the result is a string.
When an operand is an array, the operator is applied to each array element according to the two rules above; the result is an array.
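The three rules can be sketched as a small Python function. Several points are assumptions rather than documented behavior: the array rule is checked first, str() stands in for ZeScript's own text conversion (which may format reals differently), and adding two arrays is left out because the text does not describe it. None models null and a list models an array.

```python
def plus(left, right):
    """Model of the extended "+" rules for null, string, and array operands."""
    if isinstance(left, list) and isinstance(right, list):
        raise NotImplementedError("array + array is not covered by this sketch")
    # Assumption: the array rule wins over the string rule.
    if isinstance(left, list):
        return [plus(item, right) for item in left]
    if isinstance(right, list):
        return [plus(left, item) for item in right]
    if left is None:
        return right
    if right is None:
        return left
    if isinstance(left, str) or isinstance(right, str):
        return str(left) + str(right)   # placeholder for ZeScript's conversion
    return left + right

print(plus(None, 5))            # 5
print(plus("n=", 3))            # n=3
print(plus([1, "a", None], 2))  # [3, 'a2', 2]
```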

Loop and Control

ZeScript supports three loop structures (while, do, for):

n = 0;
while (n >= 0) {
    n++;
    if (n == 50) continue;        // skip this number
    if (n > 100) break;           // jump out of the loop
    csv(n);
}

/////////////////////////////////////

n = 0;
do {
    n++;
    if (n == 50) continue;        // skip this number
    if (n > 100) break;           // jump out of the loop
    csv(n);
} while (n >= 0);

/////////////////////////////////////

for (i = 0; i < 100; i++) {
    if (i == 10) continue;        // skip the rest when i == 10
    csv(i);
}

And three control/redirect structures (if, switch, goto):

a = 1;
b = 2;
c = 3;

/////////////////////////////////////
if (a > 0) {
    csv(a);
    // more expressions may follow.
}

/////////////////////////////////////

if (a > b) csv(a);        // {} is optional if there is only one expression
else       csv(b);        // after "if" or "else".

/////////////////////////////////////

if (a > b) {
    csv(b);
}
else if (a > c) {
    csv(c);
}
else {
    csv(a);
}

/////////////////////////////////////

switch (flag) {
    case C1:
        csv("C1");
        break;
    case C2:
    case C3:
        csv(flag);
        break;
    default:
        csv("default");
}

The switch argument and the expressions after case must evaluate to an integer or a string. A case marks where the switch starts execution, according to the value of the switch argument. The default label is mandatory inside a switch block.

Module and Function

The ZeScript code in a file constitutes a module. You may define variables and functions in a module. A module may import other modules to access functions defined externally, e.g.,

////////// hello.zs //////////////////


n = 100;


function hello(a, b, c)
{
    csv(a, b, c, n, "Hello!");
}

////////// main module ///////////////

import "hello.zs";

hello(1, 2, 3);    // call function in hello.zs

The "import" command is like the #include directive in C. Because it is processed by ZeScript at compile time, it may be placed anywhere in a script. You may also use the import() function to load a module dynamically. While a module imported by the import command persists throughout the existence of the importer, a module imported by the import() function is removed when its return variable goes out of scope.

Importing looks for script files in the current directory first, and then in subdirectories of lib, cls, or cgi; or other subdirectories included internally in the ZeScript virtual machine.

As shown above, a function is declared by the keyword "function" followed by the function name, arguments in (), and expressions in {}. An argument may have a default value; when no default is given, null is assumed. In a function call, parameters are passed to arguments at corresponding positions. When a parameter is written as an assignment expression, however, the value of its right side is bound to the argument whose name matches its left side.

function f1(a, b=1, c=2.1, d="hi!", e=[1, 2, 3])
{
    csv(a, b, c, d, e);
}

f1();   // call f1 with no argument
        // output: null, 1, 2.100000, hi!, [0=1, 1=2, 2=3]

f1(a=1, 2, e="Hi!", d=[1, 2, 3]);    // call f1 with positional and assignment arguments
                                     // output: 1, 2, 2.100000, [0=1, 1=2, 2=3], Hi!
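The binding of positional and assignment parameters can be modelled as follows. This is a Python sketch, not ZeScript's implementation; Named and bind_arguments are hypothetical names, and it assumes that a positional parameter is matched by its position in the call list, counting assignment parameters too, which is what the second call to f1 suggests.

```python
from collections import namedtuple

# Stands for a parameter written as `name=value` in the call.
Named = namedtuple("Named", ["name", "value"])

def bind_arguments(params, call_args):
    """params: list of (name, default); call_args: values or Named items."""
    bound = dict(params)                  # start from the declared defaults
    names = [name for name, _ in params]
    for position, arg in enumerate(call_args):
        if isinstance(arg, Named):
            bound[arg.name] = arg.value   # bind by name
        else:
            bound[names[position]] = arg  # bind by position in the call
    return bound

f1_params = [("a", None), ("b", 1), ("c", 2.1), ("d", "hi!"), ("e", [1, 2, 3])]

print(bind_arguments(f1_params, []))
print(bind_arguments(f1_params,
                     [Named("a", 1), 2, Named("e", "Hi!"), Named("d", [1, 2, 3])]))
```

The sketch does no error checking: a call with more positional parameters than declared arguments raises an IndexError here, and how ZeScript itself reacts in that case is not covered by the text.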

Objects of null, boolean, integer, real, string, and array are passed to functions by value; and objects of class and user are passed by reference.

Script functions may call themselves recursively:

function Ack(m, n)
{
    if (m == 0) {
        return n + 1;
    }
    if (n == 0) {
        return Ack(m - 1, 1);
    }
    return Ack(m - 1, Ack(m, n - 1));
}


csv(Ack(3, 4));

A function may be declared inside another function. In that case, the child function may access the private variables of the parent function. For example,

function f(a, b)
{
    a += 10;

    // more expressions here

    ff();    // call function defined internally

    function ff()
    {
        csv(a, b);  // access variable of parent function 
    }
}

In a sense, a module is like an anonymous function.

Primitive Functions

Primitive functions in a dynamic link library may be loaded to the global primitive function table of ZeScript by any module. For example:

load("my.dll");       // load primitive functions in my.dll

hello();              // suppose hello() is a primitive function in my.dll

Primitive functions take precedence over script functions in a function call. That is, when a loaded primitive function has the same name as a script function, the primitive function is used.

Object Method

The expression:

object.method(...);

calls the method function registered for the object. When the object is a class type (discussed later), a script function named "method" must be declared in the class. When the object is of any other type, a primitive function named "method" must be registered for that type (refer to the API reference). Because you can use the same function name for different types of objects, the expression is like calling class methods in C++.

Call Back

ZeScript provides a mechanism for calling script functions from a primitive function. Refer to the API reference for details.

Class

A class is a structured object declared at the module level and must be instantiated. A class is like a special module in that class-level variables are shared by its internal functions, but protected from functions of other modules or classes. The difference is that each instance of a class has its own variable context while a module has only one variable context.

A special feature of ZeScript's classes is that operator functions may be defined to process expressions such as a + b. For example,

class Point {

    cx = 0;    // initialize class level variables
    cy = 0;
    cz = 0;

    function set(x, y, z)
    {
        ::cx = x; ::cy = y; ::cz = z;
    }

    function add(x, y, z)
    {
        cx += x; cy += y; cz += z;
    }

    function csv()
    {
        csv(cx, cy, cz);
    }

    // this is an operator function for +
    function __add(a, b)
    {
        c = new Point;
        c.cx = a.cx + b.cx;
        c.cy = a.cy + b.cy;
        c.cz = a.cz + b.cz;
        return c;
    }
}

a = new Point;          // create a point
a.set(10, 10, 10);     // call a's function

b = new Point;
b.cx += 5;             // access b's variable directly

c = a + b;              // because a is a class, call a's
                        // operator function for "+".
                        // c is now a class object.
c.csv();

For a binary operator, ZeScript calls the operator function of the operand with the higher rank, in the order null, boolean, integer, real, string, array, class, and user. In the above example, if a is a number and b is a class, b's __add() function will be called with b as the first argument and a as the second.
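The rank rule amounts to a small selection step before the operator function runs. The Python sketch below models only that step; RANK and order_operands are hypothetical names, operands are written as (type name, value) pairs, and ties go to the left operand, as in the a + b example where both are class objects.

```python
RANK = ["null", "boolean", "integer", "real", "string", "array", "class", "user"]

def order_operands(a, b):
    """Return (first_arg, second_arg) for the operator function call.

    a and b are (type_name, value) pairs. The operator function of
    first_arg's type is the one that gets called.
    """
    if RANK.index(b[0]) > RANK.index(a[0]):
        return b, a     # the right operand outranks the left: swap
    return a, b         # otherwise the left operand handles the operator

number = ("integer", 5)
point = ("class", "Point instance")

print(order_operands(number, point))  # the class operand comes first
print(order_operands(point, number))  # already in order
```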

A class may redefine operator functions listed as follows:

__neg(a) <==> -a            __not(a) <==> !a           __cmpl(a) <==> ~a
__incr(a) <==> a++          __decr(a) <==> a--
__mul(a,b) <==> a * b       __add(a,b) <==> a + b
__div(a,b) <==> a / b       __div2(a,b) <==> b / a
__mod(a,b) <==> a % b       __mod2(a,b) <==> b % a
__sub(a,b) <==> a - b       __sub2(a,b) <==> b - a
__le(a,b) <==> a <= b       __lt(a,b) <==> a < b
__ge(a,b) <==> a >= b       __gt(a,b) <==> a > b
__eq(a,b) <==> a == b       __ne(a,b) <==> a != b
__and(a,b) <==> a & b       __or(a,b) <==> a | b        __xor(a,b) <==> a ^ b
__nn(a,b) <==> a && b       __oo(a,b) <==> a || b
__rsh(a,b) <==> a >> b      __lsh(a,b) <==> a << b
__mul_eq(a,b) <==> a *= b   __div_eq(a,b) <==> a /= b   __mod_eq(a,b) <==> a %= b
__add_eq(a,b) <==> a += b   __sub_eq(a,b) <==> a -= b
__rsh_eq(a,b) <==> a >>= b  __lsh_eq(a,b) <==> a <<= b
__and_eq(a,b) <==> a &= b   __or_eq(a,b) <==> a |= b    __xor_eq(a,b) <==> a ^= b

Overloading Operators

All operators of ZeScript may be redefined to act on any user object. Please refer to the API page and the source code of the matrix library to see how this is done.

load("matrix.dll");

A = double(10, 10);       // 10x10 double matrix
A.fill(1, 1);             // A contains numbers from 1 to 100
A *= 10;                  // A contains numbers from 10 to 1000
A += 1;                   // A contains numbers from 11 to 1001

Variable Scope

A variable defined in a module is accessible only to expressions, functions, and classes declared in that module; a variable defined in a class is accessible only to expressions and functions declared in that class; and a variable defined in a function is only accessible to expressions and functions declared in that function.

Each module has its own variable context; each instance of class has its own variable context; and each function executes in its own variable context.

When an expression reads a variable, the search starts in the function, class, or module that the expression belongs to and then continues in the owner of that function or class. When the :: operator is used, the search starts in the owner instead.

When an expression assigns to a variable that does not exist in the function, class, or module that the expression belongs to, a new variable is defined locally. When the :: operator is used, the expression instead looks for the variable in the owner of the function or class and sets the value there.

The following example shows the concept of variable scope:

a = 0;
b = 0;

function f()
{
    a = 10;
    ::a = 100;
    csv(a, b, ::a);
    ff();

    function ff()
    {
        a = 1;
        ::b = -1;
        csv(a, ::a, ::b);
    }
}

csv(a, b);
f();
csv(a, b);

try {...} catch (error) {...}

When ZeScript is used in a server environment, it may not be desirable to let a ZeScript runtime error interrupt the service. The try-catch feature of ZeScript may be used to catch and process the error message, e.g.,

try {
    a++;
    ...
}
catch (error) {
    cgi_error(error);
}

Reserved Keywords

addpath   break     case      catch     class     continue
default   do        else      false     for       function
goto      if        import    new       null      return
switch    true      try       while

Tips

Use a++ instead of a=a+1, a*=b instead of a=a*b, and so on for efficiency. Try to reuse variable names so that memory allocated for unused variables is released immediately.