Contact: zeng  @  zegraph.com      Last update: 10 May 2011

Z-Script Language

Introduction

Z-Script is a small, simple, embeddable, and thread-safe scripting language with C-like syntax. It is a by-product of the search for a scripting language for ZeGraph. Of the many wonderful languages out there, Lua and C-Talk have been used. Both have their merits but lack features that make it natural to call functions of, or apply operators to, user objects. For example, redefining operators for user objects is limited in Lua; and while more can be done in C-Talk, it does not have Lua's mechanism for calling functions as if they were methods of a user object.

Z-Script is light and fast -- the whole executable, including the standard library, is about 100 K. The language is easy to learn too, especially for those with C experience.

Z-Script is easy to extend. The sources for dynamic link libraries of Excel COM (excel.cpp), EXPAT (expat.cpp), SQLITE (sqlite.cpp), PostgreSQL (pgsql.cpp), MySQL (mysql.cpp), netCDF (netcdf.cpp), HDF (hdf.cpp), and wxWidget (wdget.cpp) speak for themselves. Not only can you add primitive functions that can only be called with a specified type of user object, you can also assign a primitive function to redefine operator behavior for any type of user object. The source code for the matrix library shows how to make operators work on data arrays of char, short, integer, float, and double; it also shows how to define __get and __set functions to access matrix values through the array access expressions of Z-Script.

Z-Script supports primitive object types of null, boolean, integer, real, string, hash array, and user. Upon those, a structured object type, i.e., a class, may be defined to encapsulate variables of primitive objects and script functions.

Z-Script uses re2c to generate its token scanner, resulting in very fast script compiling.

Variable and Object Types

Z-Script variables are dynamic and are created or updated by assignment. The object type that a variable represents may be null, boolean, integer, real, string, hash array, class, or a user object that holds a pointer created by a user's primitive function. For example,

/******************************************************
 *   As in C, text between slash-star and star-slash
 * is a comment, and a long comment like this one may
 * span multiple lines.
 *
 *   As shown below, anything after // becomes a
 * one-line comment.
 ******************************************************/

a = 135;                     // a is integer type
a = 0x07FF;                  // hexadecimal equivalent of 2047
a = 135.0;                   // a is now real type
b = true;                    // b is boolean type
c = false;                   // c is boolean type
d = e = f = null;            // d, e, f are set to null
s = "I love Z-Script.";      // s is string type

Internally, a boolean of true is equivalent to an integer of 1, and false to 0. A real is a double-precision floating-point number and an integer is 64 bits (32 bits prior to version 2.0). A string, i.e., text between a pair of double quotation marks, may contain C escape characters. Since " is used to delimit a string and \ is used for escaping, they must be written as \" and \\ respectively in a string. A string may also occupy multiple lines, but such a long string must not exceed 8 KB.
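
As a small illustration of the escaping rule, the following sketch builds strings that contain quotation marks and backslashes. The variable names and the file path are made up for the example; csv() is the output function used throughout this page.

```
quote = "She said \"hello\".";        // embedded double quotation marks
path  = "C:\\data\\run1.txt";         // each \\ stands for one backslash
lines = "first line\nsecond line";    // \n is the C escape for a new line

csv(quote);
csv(path);
csv(lines);
```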

Note that an expression must end with ";".

A hash array created by the operator [] contains a collection of key-value pairs, of which the key is a string and the value can be an object of any type.

a = ["name"="Jiye Zeng", "Age"="Secret"];  // , is used to separate array items

When the key is omitted, the string equivalent of the position index, counted from zero, will be used, e.g.,

a = [1, 3, 5];
// is the same as
a = ["0" = 1, "1" = 3, "2" = 5];

Note that ["name"=..., ...] is not the same as [name=..., ...]: while "name" in the former is a string, name in the latter is a variable that may represent a string or any other type of object.
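
The difference can be sketched as follows. This assumes that, when the variable holds a string, its value is used as the key, which is how the note above reads; the variable named key is hypothetical.

```
key = "name";                 // key is a variable holding a string
a = [key = "Jiye Zeng"];      // the key is taken from the variable
b = ["key" = "Jiye Zeng"];    // the key is the literal string "key"

csv(a["name"]);               // expected to find the value under "name"
csv(b["key"]);                // the value is stored under "key"
```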

Assignment

An assignment expression assigns the object on the right to the variable on the left, or defines a new variable if the left does not exist. The left gets a copy of the right if the right is of type null, boolean, integer, real, string, or array; it gets a reference to the right for class and user types unless the __copy function is defined for the class or registered as a primitive function for the user object. In that case, the __copy function will be called with the user object as the first parameter and the object resulting from the right expression as the second parameter.
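
A minimal sketch of the copy-versus-reference rule, assuming the Point class defined in the Class section of this page (which has no __copy function):

```
a = [1, 2, 3];
b = a;              // arrays are copied on assignment
b[0] = 100;         // changes b only; a[0] should still be 1

p = new Point;      // Point is the class from the Class section
q = p;              // class objects are assigned by reference
q.cx = 5;           // p.cx should now be 5 as well
```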

The array creation expression may be used for multiple variable assignment:

[a, b, c] = [1, 2, 3];        // a=1, b=2, c=3
[a, b, c] = [1, 2];           // a=1, b=2, c=null
[a, b, c] = [1, "a"=2, 3];    // a=1, b=null, c=3
[a, b, c] = 1;                // a=1, b=1, c=1

Assigning to multiple variables works as follows:

  1. Items in the left array list must be variable names. If the :: operator is used to indicate a global variable, that variable must be defined in an ancestor of the expression.
  2. If the right object is not an array, assign it to all variables.
  3. Otherwise, get objects from the array using the variables' positions as keys and assign the obtained objects to the corresponding variables.

This feature offers a convenient way to receive multiple values returned as an array from a function.
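
For instance, a function can hand back two results at once. The function name minmax and the variable names are hypothetical; the behavior in the comments follows from the rules listed above.

```
function minmax(x, y)
{
    if (x < y) return [x, y];
    return [y, x];
}

[lo, hi] = minmax(7, 3);       // lo=3, hi=7
[lo, hi, n] = minmax(7, 3);    // n=null: the array has no item at position 2
```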

Array Access

The expressions expr.expr and expr[expr] are called array access or member access.

Objects in an array may be set or obtained through their keys, e.g.,

a.addr = "123 Street";      // a new key-value pair is created
                            // if a is an array and the addr key
                            // does not exist.

a["addr"] = "123 Street";   // same as above

b = a.addr;                 // b is now "123 Street"
b = a["addr"];              // same as above

When the key of an item is an integer, its string equivalent is used as the key, e.g.,

a = [1, 3, 5];   // create array
b = a[0];        // b contains 1
a[1] = 10;       // now a contains [1, 10, 5]

Array accessing returns null if the array does not have the key being searched.
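
This makes it easy to test for a missing key. A short sketch (the key names are only for illustration):

```
a = ["name" = "Jiye Zeng"];
age = a.age;                  // no "age" key, so age is null
if (age == null) {
    a.age = "Secret";         // add the missing pair
}
csv(a.age);
```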

When an array contains only numerical values and all keys are positional, it will be treated like a collection of numbers by operators and functions, e.g.,

a = [1, 2.1, 3.5, 5];
a++;             // a-array now contains [2, 3.1, 4.5, 6];
b = sin(a);      // b-array contains [sin(2), sin(3.1), sin(4.5), sin(6)]

Multiple keys are also allowed in getting and setting values:

a = [10, 11, 12, 13, "hi"];   // create an array
b = a[1,4];                   // b becomes an array containing [11,"hi"]
a[0,1] = [100,200];           // now a contains [100,200,12,13,"hi"]
a[2] = b;                     // now a contains [100,200,[11,"hi"],13,"hi"]

Multiple assignment to array items works as follows:

  1. Expressions inside [] of the left expression must produce strings or integers to represent array keys.
  2. If the right object is not an array, assign it to all keys;
  3. Otherwise, get objects in the right array using positions of key expressions on the left as keys and assign the obtained objects to keys of the left array.
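
The rules above can be sketched with both integer and string keys. The array contents are made up for the example, and the comments state what the rules imply rather than recorded output.

```
a = [0, 0, 0, 0];
a[1, 3] = 9;                  // rule 2: a is now [0, 9, 0, 9]

p = ["x" = 0, "y" = 0];
p["x", "y"] = [10, 20];       // rule 3: "x" takes item 0, "y" takes item 1
csv(p.x, p.y);                // p.x is 10, p.y is 20
```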

Array Access for String

The array access expression a[...] may also be used to get and set the string characters, e.g.,

s = "ABCDEFG";    // create a string
a = s[1];         // a contains "B"
s[0] = a;         // now s contains "BBCDEFG"
b = s[1,3];       // b contains "BD"
s[1,3] = "A";     // now s contains "BACAEFG"
s[1,3] = "XYZ";   // now s contains "BXCYEFG"
s[5] = 90;        // now s contains "BXCYEZG"

Accessing sub-string through the array access expression works as follows:

  1. Expressions inside [] of the left expression must produce integers in the range of 0 to the left string length to represent indices of characters in the left string. The positions of those expressions are used as indices for finding characters in the right string.
  2. The right object must be either a string or an integer. An integer is treated as the decimal ASCII code of a character.
  3. Find the character in the right string and assign it to the left string. If an index used for finding characters is larger than the right string length, the index is taken modulo that length.
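
The wrap-around in rule 3 can be sketched as follows. The comments give the results the rules above imply; they are not recorded output.

```
s = "-------";                // seven dashes
s[0, 1, 2, 3, 4] = "ab";      // positions 0..4 wrap around "ab";
                              // s should now be "ababa--"
s[6] = 33;                    // 33 is the ASCII code of "!";
                              // s should now be "ababa-!"
```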

Array Access for User Object

For a user object, the expression

a = user.member;

will call the __get primitive function registered for that user type with the user object as the first parameter and a string object of "member" as the second parameter. And

user.member = expr;

will call the __set primitive function with the user object as the first parameter, a string object of "member" as the second parameter, and the object resulting from the right expression as the third parameter.

Similarly,

a = user[expr,...];

will call the __get primitive function with the user object as the first parameter followed by the objects resulting from the expressions inside []. And

user[expr,...] = expr;

will call the __set primitive function with the user object as the first parameter, followed by the objects resulting from the expressions inside [], and the object from the right expression as the last parameter. In addition, * may be used inside [] to represent null. That is, u[i,*] is equivalent to u[i,null]. Refer to the API reference for more details.
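
As an illustration, the matrix library used in the Overwriting Operators section registers __get and __set. The sketch below only shows which primitive call each expression maps to; how the library interprets the indices (for example as row and column) is an assumption and is not specified on this page.

```
load("matrix.dll");

A = matrix("double", 10, 10);       // A is a user object
A[2, 3] = 1.5;                      // calls __set(A, 2, 3, 1.5)
x = A[2, 3];                        // calls __get(A, 2, 3)
r = A[2, *];                        // calls __get(A, 2, null)
```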

Range and Option

A range expression, such as a:b or a:b:c, produces an array of two or three numbers, e.g.,

r1 = 1:10;          // r1 is [1, 10]
r2 = 1:10:2;        // r2 is [1, 10, 2]

When a range expression is used in array creation, it means to fill the array with numbers from the first number to the second. The increment step is 1 for the a:b form and is the third number for the a:b:c form. For example,

a = [1:11];        // a is [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

And when such an expression is used as an index in accessing elements of an array, string, or user object, it means to get elements with indices from the first number to the second with an increment of the third; all numbers resulting from the range expression must be integers. A null or * may be used in place of the second number to mean the last index. For example,

b = a[0:*:2];      // b is [1, 3, 5, 7, 9, 11]
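
A few more cases, sketched under the assumption that the end of a range is inclusive (as in the [1:11] example above) and that the step of the a:b:c form is its third number. The comments state expected values, not recorded output.

```
a = [10:50:10];     // filled with a step of 10: [10, 20, 30, 40, 50]
b = a[1:3];         // indices 1, 2, 3: [20, 30, 40]
c = a[1:*];         // index 1 through the last: [20, 30, 40, 50]

s = "ABCDEFG";
t = s[0:*:2];       // characters at indices 0, 2, 4, 6: "ACEG"
```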

A range expression may appear as the right operand of an option expression:

a = b > c ? b+1 : b+c;

This is equivalent to

if (b > c)
    a = b + 1;
else
    a = b + c;

Loop and Control

Z-Script supports three loop structures (while, do, for):

n = 0;
while (n >= 0) {
    n++;
    if (n == 50) continue;        // skip this number
    if (n > 100) break;           // jump out of the loop
    csv(n);
}

/////////////////////////////////////

n = 0;
do {
    n++;
    if (n == 50) continue;        // skip this number
    if (n > 100) break;           // jump out of the loop
    csv(n);
} while (n >= 0);

/////////////////////////////////////

for (i = 0; i < 100; i++) {
    if (i == 10) continue;        // skip the rest when i == 10
    csv(i);
}

And three control/redirect structures (if, switch, goto):

a = 1;
b = 2;
c = 3;

/////////////////////////////////////
if (a > 0) {
    csv(a);
    // more expressions may follow.
}

/////////////////////////////////////

if (a > b) csv(a);        // {} is optional if there is only one expression
else       csv(b);        // after "if" or "else".

/////////////////////////////////////

if (a > b) {
    csv(b);
}
else if (a > c) {
    csv(c);
}
else {
    csv(a);
}

/////////////////////////////////////

switch (flag) {
    case C1:
        csv(C1);
        break;
    case C2:
        csv(C2);
        goto 100;
    default:
        csv("default");
}

case 100:
    ....

The switch argument and the expressions after case and goto must evaluate to an integer or a string. A case marks where switch starts execution according to the value of the switch argument. A case may also be used in a function or module as a goto target. That is, the goto expression above makes execution jump to case 100. Although goto is not recommended in common programming practice, it nevertheless provides convenient redirection in certain situations.

Module and Function

Z-Script code in a file comprises a module. You may define variables and functions in a module. A module may import modules to get access to functions defined in imported modules, e.g.,

////////// hello.zs //////////////////


n = 100;


function hello(a, b, c)
{
    csv(a, b, c, n, "Hello!");
}

////////// main module ///////////////

import hello;

hello(1, 2, 3);    // call function in hello.zs

The import expression is like the include macro in C. Because it is processed by the Z-Script virtual machine at compile time, it may be placed anywhere in a script.

When a module name has any operator character, the module name must be quoted in importing. For example,

import "hello.zs";               // import hello.zs
import "my-module/say-hello";    // import say-hello.zs in the my-module subdirectory

Importing looks for script files in the current directory first, and then in subdirectories of lib, cls, or prog, or in other subdirectories included internally in the Z-Script virtual machine. You may also use the addpath macro in a script to add paths for searching script files. Like import, addpath is processed at compile time; therefore it may be put anywhere in a script, but you should put it before import.

As shown above, a function is declared by the keyword "function" followed by the function name, arguments in (), and expressions in {}. An argument may have a default value. When an argument's default value is not set, null is assumed implicitly. In a function call, parameters are passed to arguments at corresponding positions, but when an assignment expression is used, the object resulting from the right side is assigned to the argument that has the same name as the left side.

function f1(a, b=1, c=2.0, d="hi!", e=[1, 2, 3])
{
    csv(a, b, c, d, e);
}

f1();   // call f1 with no argument
        // output: null, 1, 2.000000, hi!, [0=1, 1=2, 2=3]

f1(a=1, 2, e="Hi!", d=[1, 2, 3]);    // call f1 with positional and assignment arguments
                                     // output: 1, 2, 2.000000, [0=1, 1=2, 2=3], Hi!

Objects of null, boolean, integer, real, and string are passed to functions by value; and objects of array, class, and user are passed by reference.
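
The following sketch contrasts the two passing modes. The function name bump and the variable names are hypothetical; the comments describe what the rule above implies.

```
function bump(n, items)
{
    n = n + 1;          // n is a copy; the caller's integer is untouched
    items[0] = 99;      // items refers to the caller's array
}

x = 1;
v = [1, 2, 3];
bump(x, v);
csv(x);                 // x is still 1
csv(v[0]);              // v[0] should be 99
```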

Calling a script function recursively is allowed:

function Ack(m, n)
{
    if (m == 0) {
        return n + 1;
    }
    if (n == 0) {
        return Ack(m - 1, 1);
    }
    return Ack(m - 1, Ack(m, n - 1));
}


csv(Ack(3, 4));

A function may be declared inside another function. In that case, the inside function may access the private variables of the parent function. For example,

function f(a, b)
{
    a += 10;

    // more expressions here

    ff();    // call function defined internally

    function ff()
    {
        csv(a, b);  // access variable of parent function 
    }
}

Primitive Functions

Primitive functions in a dynamic link library may be loaded to the global primitive function table by any module. For example:

load("my.dll");       // load primitive functions in my.dll

hello();              // suppose hello() is a primitive function in my.dll

Primitive functions take precedence over script functions in a function call. That is, when a loaded primitive function has the same name as a script function, the primitive function will be used.

Object Method

The expression:

object.method(...);

calls the function that belongs to the object.

When object is a class type (more discussions later), a script function named method must be declared in the class.

When object is any other type, a primitive function named method must be registered for that type (see API reference) and the library containing the function must be loaded. The primitive function will always get the object as the first parameter. Because you can use the same function name for different types of objects, the expression is like calling class methods in C++.

Call Back

Z-Script implements two call-back mechanisms: calling script functions from a primitive function (see API reference), and using a script function as an argument of another script function, as shown below:

function f1(x, y)
{
    return [x, y];
}

function f2(x, y)
{
    return [x*10, y*100];
}

function dynamic1(f, x, y)
{
    // call is a script built-in function
    // and f is a pointer to a script function
    csv(call(f, x, y));
}

function dynamic2(name, x, y)
{
    // func is a built-in function
    // and name is a script function name
    f = func(name);
    csv(call(f, x, y));
}

f = func("f1");
dynamic1(f, 1, 10);

f = func("f2");
dynamic1(f, 1, 10);

dynamic2("f1", 1, 10);

dynamic2("f2", 1, 10);

Class

A class is a structured object declared at the module level and must be instantiated before being used. A class is like a special module in that class-level variables are shared by inside functions, but protected from functions of other modules or classes. The difference is that each instance of a class has its own variable context while a module has only one variable context.

A very special feature of Z-Script's class is that operator functions may be defined to process such an expression as a + b. For example,

class Point {

    cx = 0;    // initialize class level variables
    cy = 0;
    cz = 0;

    function set(x, y, z)
    {
        ::cx = x; ::cy = y; ::cz = z;
    }

    function add(x, y, z)
    {
        cx += x; cy += y; cz += z;
    }

    function csv()
    {
        csv(cx, cy, cz);
    }

    // this is an operator function for +
    function __add(a, b)
    {
        c = new Point;
        c.cx = a.cx + b.cx;
        c.cy = a.cy + b.cy;
        c.cz = a.cz + b.cz;
        return c;
    }
}

a = new Point;          // create a point
a.set(10, 10, 10);     // call a's function

b = new Point;
b.cx += 5;             // access b's variable directly

c = a + b;              // because a is a class, call a's
                        // operator function for "+".
                        // c is now a class object.
c.csv();

For binary operators, Z-Script calls the operator function of the operand with the higher rank, in the order of null, boolean, integer, real, string, array, class, and user. In the above example, if a is a number and b is a class object, b's __add() function will be called with b as the first argument and a as the second. Redefinable operator functions are listed as follows:

__neg(a) <==> -a            __not(a) <==> !a           __cmpl(a) <==> ~a
__incr(a) <==> a++          __decr(a) <==> a--
__mul(a,b) <==> a * b       __add(a,b) <==> a + b
__div(a,b) <==> a / b       __div2(a,b) <==> b / a
__mod(a,b) <==> a % b       __mod2(a,b) <==> b % a
__sub(a,b) <==> a - b       __sub2(a,b) <==> b - a
__le(a,b) <==> a <= b       __lt(a,b) <==> a < b
__ge(a,b) <==> a >= b       __gt(a,b) <==> a > b
__eq(a,b) <==> a == b       __ne(a,b) <==> a != b
__and(a,b) <==> a & b       __or(a,b) <==> a | b        __xor(a,b) <==> a ^ b
__nn(a,b) <==> a && b       __oo(a,b) <==> a || b
__rsh(a,b) <==> a >> b      __lsh(a,b) <==> a << b
__mul_eq(a,b) <==> a *= b   __div_eq(a,b) <==> a /= b   __mod_eq(a,b) <==> a %= b
__add_eq(a,b) <==> a += b   __sub_eq(a,b) <==> a -= b
__rsh_eq(a,b) <==> a >>= b  __lsh_eq(a,b) <==> a <<= b
__and_eq(a,b) <==> a &= b   __or_eq(a,b) <==> a |= b    __xor_eq(a,b) <==> a ^= b
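
As a further sketch, a class may define a unary and a comparison operator function in the same way as __add in the Point example. The class name Vec2 and its variables vx and vy are hypothetical; only the operator function names come from the list above.

```
class Vec2 {

    vx = 0;
    vy = 0;

    // operator function for unary -
    function __neg(a)
    {
        c = new Vec2;
        c.vx = -a.vx;
        c.vy = -a.vy;
        return c;
    }

    // operator function for ==
    function __eq(a, b)
    {
        return a.vx == b.vx && a.vy == b.vy;
    }
}

p = new Vec2;
p.vx = 3;
p.vy = 4;

q = -p;             // calls __neg(p)
csv(q.vx, q.vy);
csv(p == q);        // calls __eq(p, q)
```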

Overwriting Operators

All operators of Z-Script may be redefined to act on any type of user object. Please refer to the API page and the source code of the matrix library on how to achieve that.

load("matrix.dll");

A = matrix("double", 10, 10);       // 10x10 double matrix
A.fill(1, 1);                       // A contains numbers from 1 to 100
A *= 10;                            // A contains numbers from 10 to 1000;
A += 1;                             // A contains numbers from 11 to 1001;

Variable Scope

A variable defined in a module is accessible only to expressions, functions, and classes declared in that module; a variable defined in a class is accessible only to expressions and functions declared in that class; and a variable defined in a function is only accessible to expressions and functions declared in that function.

Each module has its own variable context; each instance of a class has its own variable context; and each function executes in its own variable context.

When an expression tries to get the value of a variable, it starts searching for the variable in the function, class, or module that it belongs to and then, if that fails, continues searching in the ancestors of the function or class. But when the :: operator is used, the initial search starts in the owner of the function or class.

When an expression tries to set the value of a variable and the variable does not exist in the function, class, or module that the expression belongs to, a new variable is defined locally. But when the :: operator is used, the expression tries to find the ancestor of the function or class that defines the variable and then sets the value in it.

The following example shows the concept of variable scope:

a = 0;
b = 0;

function f()
{
    a = 10;
    ::a = 100;
    csv(a, b, ::a);
    ff();

    function ff()
    {
        a = 1;
        ::b = -1;
        csv(a, ::a, ::b);
    }
}

csv(a, b);
f();
csv(a, b);
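
A common consequence of these rules is that a function must use :: to update a module-level variable. The sketch below uses hypothetical names (count, tick, lost); the final comment states what the rules above imply, not recorded output.

```
count = 0;

function tick()
{
    ::count = ::count + 1;   // reads and updates the module-level count
}

function lost()
{
    count = count + 1;       // reads the module-level count, but the
                             // assignment defines a new local variable
}

tick();
tick();
lost();
csv(count);                  // expected to be 2
```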

try {...} catch (error) {...}

When Z-Script is used in a server, allowing a Z-Script runtime error to interrupt the service may not be desirable. The try-catch feature of Z-Script may be used to catch and process the error message, e.g.,

try {
    a++;
    ...
}
catch (error) {
    cgi_error(error);
}

Debugging

You can put csv(...) and return anywhere in a script to run execution up to that point and check variable values. Alternatively, you can use a variable as a flag and insert trace(flag) in your code. If flag=true, the trace function will display variable values and halt the execution until the Enter key is pressed. Here is a script example that evaluates user input interactively:

/*********************************************
 * Interactive program in z-script
 *********************************************/

exec("cls");

csv("Z-Script version 1.4 by Jiye Zeng\n");
csv("Commands:");
csv("    @run    -- execute script");
csv("    @who    -- show variables");
csv("    @reset  -- clear script and screen");
csv("    @clear  -- clear screen");
csv("    @script -- show script");
csv("\nA input not starting with @ is treated as script.\n");

script = "";

while(1) {
    try {
        s = input("zs>");
        // When control-C is used to interrupt
        // the program, s will not be string.
        // So checking s is necessary.
        if (isstring(s)) {
            switch(s) {
            case "@run":
                // execute script
                ret = eval(script, false);
                if (ret != null) csv(ret);
                break;
            case "@who":
                // execute and show variables
                ret = eval(script, true);
                if (ret != null) csv(ret);
                break;
            case "@reset":
                // reset script code and clear screen
                script = "";
                exec("cls");
                break;
            case "@clear":
                // clear screen only
                exec("cls");
                break;
            case "@script":
                // show script code
                csv(script);
                break;
            default:
                // script code;
                if (size(s) > 0) script += s;
                break;
            }
        }
    }
    catch(msg) {
        csv(msg);
    }
}

Executing the script brings up an interactive DOS window.

Reserved Keywords

addpath   break     case      catch     class     continue
default   do        else      false     for
function  goto      if        import    new
null      return    switch    true      try
while

Tips

Use a++ instead of a = a + 1, a *= b instead of a = a * b, and so on for efficiency. Try to reuse variable names so that memory allocated for unused variables is released immediately.