UserFunc-expression

expression

Describes a function with a string expression, e.g., sin(k*x).

A <UserFunc> block of kind = expression and an <Expression> block take all the same attributes, except that <Expression> blocks do not take the inputOrder attribute, nor <Input> blocks.

Generally, UserFuncs or Expressions are required by some object to compute a user-defined value; each object will specify whether it requires a UserFunc or an Expression. However, UserFuncs may also be defined at the top level of the input file; these UserFuncs may then be referenced in other UserFuncs throughout the input file. For example, one could define, at the top level, a <UserFunc rotate> that takes \((x,y,z)\) and performs a specific three-dimensional rotation, returning \((x',y',z')\). The rotate function can then be called conveniently from other UserFuncs and Expressions. Expressions may not be defined at the top level.

Expression Parameters

UserFuncs with kind = expression take several attributes in addition to the ones for general UserFuncs:

expression (string, required)

The function expression, e.g., \(x * y - sin(z)\). The following sections describe how to write an expression.

inputOrder (vector of strings: UserFunc only)

The order of the arguments, by name; e.g., inputOrder=[x y z numElephants]. Each element of inputOrder should be the name of an Input block. (This attribute is not given to <Expression> blocks.)

Input (code block, required in UserFuncs for each member of inputOrder)

For each input variable named in inputOrder, there must be a corresponding <Input> block with the same name as the variable; the <Input> block specifies the length and types of each input. For example, if inputOrder=[x y z numElephants] then one would also specify:

<Input x>
  kind = arbitraryVector
  types = [float]
</Input>
<Input y>
  kind = uniformVector
  types = float
  length = 3
</Input>
<Input z>
  kind = arbitraryVector
  types = [boolean integer]
</Input>
<Input numElephants>
  kind = arbitraryVector
  types = [integer]
</Input>

to tell Vorpal that x is a scalar floating point number, y is a 3-vector of floats, z is a vector of length 2, the first element a boolean, the second an integer, and numElephants is a scalar integer.

See Input Block.

UserFunc (code block, optional)

A local function, called by the expression. A local <UserFunc> (unlike a <Term>) can be called with different arguments in different places in the expression; if its result is not needed, it may not be evaluated at all.

See UserFunc Block and Local UserFuncs.

Term (code block, optional)

A local UserFunc that is evaluated with the same arguments as the containing <UserFunc>; the <Term> is evaluated exactly once each time the <UserFunc> is evaluated, and the result of the <Term> can be used in the expression like an input variable. The attributes of a <Term> block are identical to those of a <UserFunc>.

See Term Block.

Most users will never need to worry about the following optimization options, which should rarely be altered from their default values:

optimize (bool, default true)

The default value for the following optimizations. Unless one suspects a bug, only useMemory and shortCircuitUnneededArgs should ever be turned off; the other optimizations can only improve things.

useMemory (bool, default = optimize)

Whether functions that take a long time to evaluate (e.g., \(\sin\), but not \(+\)) should remember their previous argument and result, and reuse the result if the same argument is passed twice in a row. This optimization adds run-time overhead to each evaluation, but may save time if the same argument is passed for multiple evaluations. (Vorpal is semi-intelligent about this optimization: if the same argument doesn’t often appear twice in a row, it stops using the memory feature.)
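The idea behind useMemory can be sketched in Python as a wrapper that remembers only the most recent argument and result. This is an illustration of the concept, not Vorpal's implementation; the class name LastCallMemo and its evaluations counter are hypothetical.

```python
import math


class LastCallMemo:
    """Wrap a one-argument function and remember only the most recent call."""

    def __init__(self, func):
        self.func = func
        self.has_value = False
        self.last_arg = None
        self.last_result = None
        self.evaluations = 0  # counts real (non-cached) evaluations

    def __call__(self, x):
        if self.has_value and x == self.last_arg:
            return self.last_result  # same argument as last time: reuse
        self.evaluations += 1
        self.last_arg = x
        self.last_result = self.func(x)
        self.has_value = True
        return self.last_result


memo_sin = LastCallMemo(math.sin)
values = [memo_sin(x) for x in (0.5, 0.5, 0.5, 1.0, 0.5)]
print(memo_sin.evaluations)  # 3: only a repeat of the immediately preceding argument is reused
```

The last call with 0.5 is evaluated again because the remembered argument is then 1.0, which is why the feature only pays off when arguments repeat consecutively.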

shortCircuitUnneededArgs (bool, default = optimize)

Whether certain functions (such as if and multiplication) should bypass arguments that don’t matter. For example, if(x > y, sin(x), cos(y)) needs to evaluate sin(x) but not cos(y) when x > y. Similarly, H(x) * sin(x) needs to evaluate sin(x) only if \(x \geq 0\) (hence \(H(x) \neq 0\)). Like useMemory, this optimization may increase evaluation costs in some cases.
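As a rough Python sketch of short-circuiting (an illustration only, not how Vorpal implements it), unneeded arguments can be wrapped in thunks that are called only when their value matters. The helper names lazy_if, lazy_times, and traced are hypothetical.

```python
import math

calls = []  # records which functions were actually evaluated


def traced(name, func):
    """Return a version of func that logs each call under the given name."""
    def wrapper(x):
        calls.append(name)
        return func(x)
    return wrapper


sin = traced("sin", math.sin)
cos = traced("cos", math.cos)


def heaviside(x):
    """H(x) as described in the function table: 0, 1/2, or 1."""
    return 0.0 if x < 0 else (0.5 if x == 0 else 1.0)


def lazy_if(cond, then_thunk, else_thunk):
    """Evaluate only the branch selected by cond."""
    return then_thunk() if cond else else_thunk()


def lazy_times(left, right_thunk):
    """Skip the right factor when the left factor is exactly zero."""
    return 0.0 if left == 0 else left * right_thunk()


x, y = 2.0, 1.0
r1 = lazy_if(x > y, lambda: sin(x), lambda: cos(y))    # cos(y) is never called
r2 = lazy_times(heaviside(-1.0), lambda: sin(-1.0))    # H(-1) = 0, so sin is skipped
print(calls)  # ['sin']
```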

pruneConsts (bool, default = optimize)

Whether to replace constant sub-expressions with a single constant; e.g., whether to replace 3+4+cos(0) with 8.
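A minimal sketch of constant pruning, assuming the expression has already been parsed into nested tuples of the form (name, arg, ...), with strings standing for variables. The tuple representation and the FUNCS table are assumptions made for this illustration, not Vorpal's internal data structures.

```python
import math
import operator

# Hypothetical table of the functions this sketch understands.
FUNCS = {"+": operator.add, "*": operator.mul, "cos": math.cos}


def prune_consts(node):
    """Replace every sub-expression whose arguments are all numbers by its value."""
    if not isinstance(node, tuple):
        return node  # a number or a variable name: nothing to prune
    name, *args = node
    args = [prune_consts(a) for a in args]
    if all(isinstance(a, (int, float)) for a in args):
        return FUNCS[name](*args)
    return (name, *args)


# 3 + 4 + cos(0), parsed as (3 + 4) + cos(0)
print(prune_consts(("+", ("+", 3, 4), ("cos", 0))))  # 8.0
# x + 3*4: only the constant product is folded
print(prune_consts(("+", "x", ("*", 3, 4))))  # ('+', 'x', 12)
```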

pruneCSEs (bool, default = optimize)

Whether to perform common sub-expression elimination.

pruneIfs (bool, default = optimize)

Whether to bypass if-functions when the first argument can be determined (independent of the function arguments); for example, whether to replace if(0 < 1, x, y) with x.

pruneSelects (bool, default = optimize)

Whether to bypass select-functions when the indices (in the second argument) are constant and sequential.

Basic built-in scalar functions

UserFuncs specified at the top level of the simulation can be used in other expression UserFuncs (anywhere); and some UserFuncs recognize local functions (defined in the same UserFunc block as the expression). In addition, there are many built-in functions. The basic functions are implemented with the C++ standard library. The recognized unary scalar functions follow, along with descriptions of the non-standard ones:

identity identity(\(x\)) = \(x\)
inv inv(\(x\)) = \(1/x\)
sqr sqr(\(x\)) = \(x^2\)
cube cube(\(x\)) = \(x^3\)
sqrt  
sin  
cos  
tan  
exp  
asin result in [-pi/2, pi/2]
acos result in [0, pi]
atan result in [-pi/2, pi/2]; cf. binary function atan2(y,x)
sinh  
cosh  
ln  
H H(x) = 0 if x<0, 1/2 if x=0, 1 if x>0
J0 Bessel function \(J_0\)
J1 Bessel function \(J_1\)
J2 Bessel function \(J_2\)
J3 Bessel function \(J_3\)
abs  
erf  
rand rand(x) = a random number in [0,1) (x is irrelevant)
gauss gauss(x) = a random number gaussian-distributed with std. dev. x (and mean 0)
print the identity, but prints the result to stdout
ceil  
floor  
modmod modmod(x) = y in (-0.5,0.5] such that y = x (mod 1)
not the logical not operator: not(0) = 1, not(1) = 0
bool bool(x)=1 for all x, except bool(0)=0
int casts to an integer (floor and ceil are usually preferable)
assertBool the identity, but raises an exception at run-time if the argument is not 0 or 1
assertInt the identity, but raises an exception at run-time if the argument is not an integer
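For readers who want to check their understanding of the less standard entries, the following Python functions reproduce the definitions of H, modmod, and bool exactly as stated in the table above. They are a sketch for experimentation only; the name to_bool is used here merely to avoid shadowing Python's built-in bool.

```python
import math


def H(x):
    """Step function from the table: 0 if x<0, 1/2 if x=0, 1 if x>0."""
    return 0.0 if x < 0 else (0.5 if x == 0 else 1.0)


def modmod(x):
    """Return y in (-0.5, 0.5] such that y = x (mod 1)."""
    y = x - math.floor(x)           # y is in [0, 1)
    return y - 1.0 if y > 0.5 else y


def to_bool(x):
    """bool(x) from the table: 1 for all x except 0."""
    return 0 if x == 0 else 1


print(H(-2.0), H(0.0), H(3.0))        # 0.0 0.5 1.0
print(modmod(1.75), modmod(2.25))     # -0.25 0.25
print(modmod(0.5), modmod(-0.5))      # 0.5 0.5
print(to_bool(0), to_bool(-7.5))      # 0 1
```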

The following standard binary operators are recognized: ==, !=, >, >=, <, <=, + (also sum(x,y)), -, * (also prod(x,y)), / (N.B. this is always float division), ^ and ** (two notations for exponent). Other built-in binary functions are:

min  
max  
sum sum(x,y) = x+y (addition)
prod prod(x,y) = x*y (multiplication)
pow pow(x,y) = x^y
mod mod(x,y) = x mod y (N.B.: mod(-4,3)=-1, mod(4,-3)=1, mod(-4,-3)=-1.)
atan2 atan2(y,x) = arctan(y/x) in [-pi, pi]
gaussRand gaussRand(x,y) is a random number, gaussian-distributed with mean y and std. dev. x
and logical and
or logical or
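The sign convention listed for mod matches that of the C standard library's fmod (the result takes the sign of the first argument), which Python exposes as math.fmod. The sketch below compares it with Python's own % operator, which follows a different convention, and shows why atan2 is not the same as atan(y/x). That Vorpal's mod is literally fmod is an assumption; only the three listed values are taken from the table.

```python
import math


def mod(x, y):
    """Remainder with the sign of x, as in C's fmod."""
    return math.fmod(x, y)


print(mod(-4, 3), mod(4, -3), mod(-4, -3))  # -1.0 1.0 -1.0
print(-4 % 3, 4 % -3, -4 % -3)              # 2 -2 -1 (Python's % follows the divisor's sign)

# atan2 uses the signs of both arguments to pick the quadrant;
# atan(y/x) cannot distinguish (y, x) from (-y, -x).
y, x = 1.0, -1.0
print(math.atan2(y, x))  # about 2.356, i.e. 3*pi/4
print(math.atan(y / x))  # about -0.785, i.e. -pi/4
```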

Built-in functions generally have obvious argument and return types. For example, logical operators (such as not and and) take only boolean arguments and return booleans; floor will take a float and return an integer. Built-in functions are fairly smart about argument and return types; for example, the plus operator will return a float if either of its arguments is float-type, but will return an integer if both of its arguments have boolean or integer types.
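Python's own arithmetic happens to follow the same promotion rules described above for plus and floor, so it can serve as a quick mental model. This is an analogy only; it says nothing about how Vorpal represents types internally, and result_kind is a hypothetical helper.

```python
import math


def result_kind(value):
    """Name of the Python type of value, used here as a stand-in for the return type."""
    return type(value).__name__


print(result_kind(True + 2))          # int: boolean plus integer gives an integer
print(result_kind(2 + 3))             # int
print(result_kind(2 + 0.5))           # float: one float argument makes the result a float
print(result_kind(math.floor(2.7)))   # int: floor takes a float and returns an integer
```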

Built-in functions for use with GridBoundaries

There are a handful of special functions meant for dealing with results related to whether something (such as a point or a cell) is inside or outside a GridBoundary, or whether something is cut by the GridBoundary.

In the following, \(x\) and \(y\) are “interiorness” values, representing the “interiorness” of a point or a cell or some other region. See gridBoundaryFunc for sources of interiorness values. Interiorness values are integers that represent whether something is inside, outside, or on (or cut by) a boundary: one should use the following functions rather than interpreting the values, but typically, 1 is interior, 0 is on the boundary, and -1 is exterior.

Usually (when we remember) we use the language “inside” and “outside” when we want to separate objects into 2 disjoint categories; and we use “interior,” “exterior,” and “cut by the boundary” when we want to separate objects into 3 disjoint categories.

isInside\((x)\) returns 1/0: whether \(x\) is inside a boundary
isOutside\((x)\) returns not(isInside\((x)\))
isInterior\((x)\) returns 1/0: whether \(x\) is inside but not on or cut by boundary
isExterior\((x)\) returns 1/0: whether \(x\) is outside but not on or cut by boundary
isCutByBndry\((x)\) returns 1/0: whether \(x\) is on or cut by the boundary
onSameSideOfBndry\((x,y)\) returns 0/1 if \(x\) and \(y\) are on [different sides/the same side] of a boundary, according to isInside()
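The relationships among these functions can be sketched in Python under two loudly stated assumptions: interiorness takes the "typical" values 1, 0, -1 mentioned above, and a value on the boundary is assigned to the inside or the outside according to a flag, because this text does not say which side the two-category functions choose. The snake_case names and the boundary_is_inside parameter are hypothetical.

```python
# ASSUMPTION: the "typical" interiorness values; real code should not rely on them.
INTERIOR, ON_BOUNDARY, EXTERIOR = 1, 0, -1


def is_interior(v):
    return int(v == INTERIOR)


def is_exterior(v):
    return int(v == EXTERIOR)


def is_cut_by_bndry(v):
    return int(v == ON_BOUNDARY)


def is_inside(v, boundary_is_inside=True):
    # ASSUMPTION: which side a boundary value falls on is not stated in the text.
    return int(v == INTERIOR or (boundary_is_inside and v == ON_BOUNDARY))


def is_outside(v, boundary_is_inside=True):
    return 1 - is_inside(v, boundary_is_inside)  # not(isInside(x))


def on_same_side_of_bndry(x, y, boundary_is_inside=True):
    return int(is_inside(x, boundary_is_inside) == is_inside(y, boundary_is_inside))


for v in (INTERIOR, ON_BOUNDARY, EXTERIOR):
    # Three-way split: exactly one category applies.
    print(v, is_interior(v), is_cut_by_bndry(v), is_exterior(v))
```

Whatever the boundary convention, the two-category functions always partition the values (is_inside plus is_outside equals 1), while the three-category functions single out exactly one of interior, cut, or exterior.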

Special built-in functions

Some special functions are: if, vector, select, and len.

  • if(a, x, y) returns \(x\) if \(a==1\) (true) and \(y\) if \(a==0\) (false). \(x\) and \(y\) must have the same length, and \(a\) must be a scalar boolean value.

  • The vector function concatenates vectors (taking any number of arguments, each of any length and types):

    \[\textrm{vector}([x_0, \ldots, x_k], [y_0, \ldots, y_\ell], \ldots, [z_0, \ldots, z_m]) = [x_0, \ldots, x_k, y_0, \ldots, y_\ell, \ldots, z_0, \ldots, z_m] .\]

    The bracket notation [0,1,2] is not recognized by UserFunc expressions; we simply use it for convenience here. In an expression, one must write vector(0,1,2).

  • The select function selects certain elements from a vector; it takes 2 arguments, a vector of arbitrary types, and a vector of arbitrary (non-negative) integers:

    \[\textrm{select}([x_0, \ldots, x_k], [i_0, \ldots, i_n]) = [x_{i_0}, \ldots, x_{i_n}] .\]

  • The len function returns the (integer) length of an arbitrary vector.

    \[\textrm{len}([x_0, x_1, \ldots, x_{k-1}]) = k .\]
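The four special functions above can be mimicked with Python lists to check the formulas. This is a sketch of the stated semantics only; the trailing underscore in if_ and the name length are chosen to avoid clashing with Python keywords and built-ins, and zero-based indexing in select is inferred from the subscripts in the formula.

```python
def vector(*args):
    """Concatenate any number of scalars or vectors into one flat list."""
    out = []
    for a in args:
        out.extend(a if isinstance(a, (list, tuple)) else [a])
    return out


def select(values, indices):
    """Pick the elements of values at the given (zero-based) indices."""
    return [values[i] for i in indices]


def length(values):
    """Integer length of a vector."""
    return len(values)


def if_(a, x, y):
    """Return x if the scalar boolean a is 1 (true), y if it is 0 (false)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return x if a else y


print(vector(0, 1, 2))                          # [0, 1, 2]
print(vector([1, 2], [3], [4, 5]))              # [1, 2, 3, 4, 5]
print(select([10, 20, 30, 40], [3, 0, 0]))      # [40, 10, 10]
print(length(vector([1, 2], [3])))              # 3
print(if_(1, [1.0, 2.0], [3.0, 4.0]))           # [1.0, 2.0]
```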

Special treatment of scalar functions for vector arguments

Functions defined to be scalar (taking arguments of length 1 and returning a vector of length 1) are automatically altered to take non-scalar arguments in two very natural ways: threading and folding (terms from Mathematica).

Threading:
For an example of threading, we “thread” the scalar function \(\sin(x) = y\) through its vector argument:
\[\sin ([x_0, \ldots, x_n]) = [\sin(x_0), \ldots , \sin(x_n)] .\]

Multi-argument scalar functions can be threaded if each argument is a vector of the same length or length 1. Thus the scalar plus function, \(x+y=z\) can be used as follows:

\[[x_0, \ldots, x_n] + [y_0, \ldots, y_n] = [x_0 + y_0 , \ldots , x_n + y_n]\]

or

\[[x_0, \ldots, x_n] + [y] = [x_0 + y , \ldots , x_n + y] .\]
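The threading rule can be sketched in Python as follows, with vectors modeled as lists and scalars as lists of length 1. The helper name thread is hypothetical, and the sketch only illustrates the rule stated above (equal lengths, or length 1 stretched to match).

```python
import math
import operator


def thread(func, *args):
    """Apply a scalar function element-wise; length-1 arguments are repeated."""
    lengths = {len(a) for a in args if len(a) != 1}
    if len(lengths) > 1:
        raise ValueError("vector arguments must have the same length or length 1")
    n = lengths.pop() if lengths else 1
    stretched = [list(a) * n if len(a) == 1 else list(a) for a in args]
    return [func(*items) for items in zip(*stretched)]


print(thread(math.sin, [0.0, 1.0]))                   # [sin(0.0), sin(1.0)]
print(thread(operator.add, [1, 2, 3], [10, 20, 30]))  # [11, 22, 33]
print(thread(operator.add, [1, 2, 3], [10]))          # [11, 12, 13]
```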
Folding:
Folding allows any binary scalar function \(f(x,y)=z\) to take a single vector argument of arbitrary length and return a scalar value:
\[f([v,w,x,\ldots,y,z]) = f(f( \cdots f(f(v,w),x) \ldots ,y),z) .\]

By definition, a folded function applied to a single scalar is the identity:

\[f([x]) = [x] .\]

For example, the sum function is simply an alias for scalar plus: sum\((x,y)= x+y\). Therefore,

\[\textrm{sum}([x_0, \ldots, x_n]) = \sum_{i=0}^n x_i .\]

The prod function is an alias for scalar times, and so can be folded to calculate the product of all elements of a vector. Folding may also be especially useful with the and, or, min, and max functions.

Defining \(f(x)=x\) for a binary scalar function \(f\) and scalar argument \(x\) makes sense for binary functions that one typically wants to fold; for example, sum\((x)=x\), prod\((x)=x\), min\((x)=x\). Be forewarned, however, that this behavior may sometimes hide a mistake; if one has a (scalar) function of 2 scalar arguments \(f(x,y)\), and accidentally calls it with a single scalar argument, \(f(x)\), then Vorpal treats that as a folded function (which is the identity, \(f(x)=x\)), rather than giving an error message as usual when a function is called with the wrong number of arguments.
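Folding as defined above is a left fold, which Python provides as functools.reduce. The sketch below reproduces the definition, including the single-element case that can hide a missing argument; fold is a hypothetical helper name.

```python
from functools import reduce
import operator


def fold(func, values):
    """Left fold: f(f(...f(f(v, w), x)..., y), z); a single element is returned as-is."""
    return reduce(func, values)


print(fold(operator.add, [1, 2, 3, 4]))   # 10, like sum([...])
print(fold(operator.mul, [1, 2, 3, 4]))   # 24, like prod([...])
print(fold(min, [3, 1, 2]))               # 1
print(fold(operator.sub, [10, 3, 2]))     # 5, i.e. (10 - 3) - 2: the fold proceeds left to right

# The pitfall: a binary function given one scalar is never actually called.
print(fold(pow, [2.0]))                   # 2.0, not an error
```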

A word on optimization

Expression UserFuncs are pretty good at performing simple optimizations, so it is generally recommended to write expressions in a way that is simplest to read and understand, rather than trying to guess which of several equivalent expressions will evaluate fastest.

For example, constant expressions are evaluated: x + 3*4*0.1 + y will be reduced to x + 1.2 + y.

Common sub-expressions are evaluated only once, so the sin function is evaluated only once in the expression sin(x) + exp(sin(x)).
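One way to picture common sub-expression elimination is an evaluator that caches the value of each distinct sub-expression during a single evaluation. The sketch below does this for sin(x) + exp(sin(x)), using a nested-tuple representation that is an assumption of this illustration rather than Vorpal's actual mechanism.

```python
import math

# Hypothetical table of the functions this sketch understands.
FUNCS = {"+": lambda a, b: a + b, "sin": math.sin, "exp": math.exp}


def evaluate(node, env, cache, counts):
    """Evaluate a nested-tuple expression, computing each distinct subtree once."""
    if isinstance(node, str):
        return env[node]            # variable lookup
    if not isinstance(node, tuple):
        return node                 # numeric constant
    if node in cache:
        return cache[node]          # common sub-expression: reuse the result
    name, *args = node
    counts[name] = counts.get(name, 0) + 1
    result = FUNCS[name](*(evaluate(a, env, cache, counts) for a in args))
    cache[node] = result
    return result


tree = ("+", ("sin", "x"), ("exp", ("sin", "x")))
counts = {}
value = evaluate(tree, {"x": 0.5}, {}, counts)
print(counts["sin"])  # 1: sin(x) is computed once even though it appears twice
```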

Functions that take a relatively long time to evaluate (such as transcendental functions) will remember the previous argument and result, and avoid a new evaluation if the argument did not change. This memory feature tends to be less helpful when performing multiple simultaneous evaluations (see Speed and Multiple Evaluations of UserFuncs).

Also, the expression evaluator can sometimes determine when part of an expression need not be evaluated; in such a case, it can skip (or short-circuit) an evaluation whose result is unneeded. The most important example is the special if(a, expr1, expr2) function: if \(a=1\) then expr1 will be evaluated, but not expr2, and vice versa if \(a=0\). Again, this feature is less helpful when performing multiple simultaneous evaluations (see Speed and Multiple Evaluations of UserFuncs).