Yacas programming pitfalls

No programming language is free of pitfalls, and Yacas has its fair share.


All rules are global

All rules are global. As a consequence, rules can clash or silently shadow each other if the user defines two rules whose patterns and predicates overlap but whose bodies differ.

For example:

In> f(0) <-- 1
Out> True;
In> f(x_IsConstant) <-- Sin(x)/x
Out> True;

This can happen in practice if care is not taken. Here two transformation rules are defined with the same precedence (since their precedence was not set explicitly). In that case Yacas gets to decide which one to try first. Such problems can also occur when one transformation rule (possibly defined in some other file) has a wrong precedence and thus masks another transformation rule. Because the order in which transformation rules are applied is often important, it is necessary to think of a scheme for assigning precedences first.

In the above example, because Yacas gets to decide which rule to try first, it is possible that f(0) invokes the second rule, which would then mask the first so the first rule is never called. Indeed, in Yacas version 1.0.51,

In> f(0)
Out> Undefined;

The order in which the rules are applied is undefined if the precedences are the same. The precedences should only be the same if the order does not matter. This is the case if, for instance, the two rules apply to different argument patterns that could not possibly mask each other.

The solution could have been either:

In> 10 # f(0) <-- 1
Out> True;
In> 20 # f(x_IsConstant) <-- Sin(x)/x
Out> True;
In> f(0)
Out> 1;
or
In> f(0) <-- 1
Out> True;
In> f(x_IsConstant)_(x != 0) <-- Sin(x)/x
Out> True;
In> f(0)
Out> 1;

So either the rules should have distinct precedences, or they should have mutually exclusive predicates, so that they do not collide.
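
As a small sketch of mutually exclusive predicates, consider a function kind (a hypothetical name invented for this illustration) whose two rules test for different argument types. No argument can satisfy both predicates, so leaving both rules at the default precedence is harmless:

In> kind(x_IsInteger) <-- 1
Out> True;
In> kind(x_IsString) <-- 2
Out> True;
In> kind(5)
Out> 1;
In> kind("abc")
Out> 2;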


Objects that look like functions

An expression that looks like a "function", for example AbcDef(x,y), is in fact either a call to a "core function" or to a "user function", and there is a huge difference between the behaviors. Core functions immediately evaluate to something, while user functions are really just symbols to which evaluation rules may or may not be applied.

For example:

In> a+b
Out> a+b;
In> 2+3
Out> 5;
In> MathAdd(a,b)
In function "MathAdd" : 
bad argument number 1 (counting from 1)
The offending argument a evaluated to a
CommandLine(1) : Invalid argument

In> MathAdd(2,3)
Out> 5;

The + operator will return the object unsimplified if the arguments are not numeric. The + operator is defined in the standard scripts. MathAdd, however, is a function defined in the "core" that performs the numeric addition. It can only do this if the arguments are numeric, and it fails on symbolic arguments. (The + operator calls MathAdd after it has verified that the arguments passed to it are numeric.)

A core function such as MathAdd can never return unevaluated, but an operator such as "+" is a "user function" which might or might not be evaluated to something.
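
The same kind of guard can be imitated in a user function. The following is a minimal sketch, assuming a hypothetical name SafeAdd that is not part of Yacas. The predicates let the rule match only numeric arguments, so the core function is never handed a symbol, and the call stays unevaluated otherwise:

In> SafeAdd(x_IsNumber, y_IsNumber) <-- MathAdd(x,y)
Out> True;
In> SafeAdd(2,3)
Out> 5;
In> SafeAdd(a,b)
Out> SafeAdd(a,b);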

A user function does not have to be defined before it is used. A consequence of this is that a typo in a function name or a variable name can easily go unnoticed. For example:

In> f(x_IsInteger,y_IsInteger) <-- Mathadd(x,y)
Out> True;
In> f(1,2)
Out> Mathadd(1,2);
Here we made a typo: we should have written MathAdd, but wrote Mathadd instead. Yacas happily assumed that we meant a new and (so far) undefined "user function" Mathadd and returned the expression unevaluated.

In the above example it was easy to spot the error. But this feature becomes more dangerous when the mistake is made inside some procedure. A call that should have been made to an internal function passes silently without error and returns unevaluated if its name is mistyped. The real problem occurs if we meant to call a function that has side-effects and we do not use its return value. In this case we shall not immediately find that the function was not evaluated, but instead we shall encounter a mysterious bug later.
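
A sketch of how such a typo hides inside a procedure, using a hypothetical function ShowSquare invented for this illustration. The body was meant to call Echo, but the name is misspelled as Ecoh. The unevaluated Ecoh(...) expression is discarded inside the block, so nothing is printed and no error is reported:

In> ShowSquare(x) := [ Ecoh({"squaring ", x}); x^2; ];
Out> True;
In> ShowSquare(3)
Out> 9;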


Guessing when arguments are evaluated and when not

If your new function does not work as expected, a likely cause is that an argument you expected to be passed unevaluated was in fact evaluated, or vice versa.

For example:

In> p:=Sin(x)
Out> Sin(x);
In> D(x)p
Out> Cos(x);
In> y:=x
Out> x;
In> D(y)p
Out> 0;

Here the first argument to the differentiation function is not evaluated, so y is not evaluated to x, and D(y)p is indeed 0.
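
One way to differentiate with respect to the variable stored in y is to substitute the value of y into the expression before D sees it. The line below is a sketch that uses the backquote syntax, in which @y stands for the value of y. It is an untested illustration, so no output is shown; the intent is that @y is replaced by x, after which D(x)p is evaluated as in the first call above:

In> `(D(@y) p)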


The confusing effect of HoldArg

The problem of distinguishing evaluated and unevaluated objects becomes worse when we need to create a function that does not evaluate its arguments.

Since in Yacas evaluation starts from the bottom of the expression tree, all "user functions" will appear to evaluate their arguments by default. But sometimes it is convenient to prohibit evaluation of a particular argument (using HoldArg or HoldArgNr).

For example, suppose we need a function A(x,y) that, as a side-effect, assigns to the variable x the sum of x and y. This function will be called when x already has some value, so clearly the argument x in A(x,y) should be unevaluated. It is possible to make this argument unevaluated by putting Hold() on it and always calling A(Hold(x), y), but this is inconvenient and easy to forget. It would be better to define A so that it always keeps its first argument unevaluated.

If we define a rule base for A and declare HoldArg,
Function() A(x,y);
HoldArg("A", x);
then we shall encounter a difficulty when working with the argument x inside of a rule body for A. For instance, the simple-minded implementation
A(_x, _y) <-- (x := x+y);
does not work:
In> [ a:=1; b:=2; A(a,b);]
Out> a+2;
In other words, the x inside the body of A(x,y) did not evaluate to 1 when we called the function :=. Instead, it was left unevaluated as the atom x on the left hand side of :=, since := does not evaluate its left argument. It however evaluates its right argument, so the y argument was evaluated to 2 and the x+y became a+2.

The evaluation of x in the body of A(x,y) was prevented by the HoldArg declaration. So in the body, x will just be the atom x, unless it is evaluated again. If you pass x to other functions, they will just get the atom x. Thus in our example, we passed x to the function :=, thinking that it will get a, but it got an unevaluated atom x on the left side and proceeded with that.

We need an explicit evaluation of x in this case. It can be performed using Eval, or with backquoting, or by using a core function that evaluates its argument. Here is some code that illustrates these three possibilities:
A(_x, _y) <-- [ Local(z); z:=Eval(x); MacroSet(x, z+y); ]
(using explicit evaluation) or
A(_x, _y) <-- `(@x := @x + y);
(using backquoting) or
A(_x, _y) <-- MacroSet(x, x+y);
(using a core function MacroSet that evaluates its first argument).

However, beware of a clash of names when using explicit evaluations (see the section "Evaluating variables in the wrong scope"). In other words, the function A as defined above will not work correctly if we give it a variable also named x. The LocalSymbols call should be used to get around this problem.

Another caveat is that when we call another function that does not evaluate its argument, we need to substitute an explicitly evaluated x into it. A frequent case is the following: suppose we have a function B(x,y) that does not evaluate x, and we need to write an interface function B(x) which will just call B(x,0). We should use an explicit evaluation of x to accomplish this, for example
B(_x) <-- `B(@x,0);
or
B(_x) <-- B @ {x, 0};
Otherwise B(x,y) will not get the correct value of its first parameter x.


Special behavior of Hold, UnList and Eval

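When an expression is evaluated, all matching rules are applied to it repeatedly until no more rules match. Thus an expression is "completely" evaluated. There are, however, two cases when recursive application of rules is stopped at a certain point, leaving an expression not "completely" evaluated. First, the result of a call to a core function is not evaluated further, even if some rules would apply to it. Second, the value of a variable is not evaluated again when the variable is used, so a variable can hold an unevaluated expression as its value.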

The first possibility is mostly without consequence because almost all core functions return a simple atom that does not require further evaluation. However, there are two core functions that can return a complicated expression: Hold and UnList. Thus, these functions can produce arbitrarily complicated Yacas expressions that will be left unevaluated. For example, the result of
UnList({Sin, 0})
is the same as the result of
Hold(Sin(0))
and is the unevaluated expression Sin(0) rather than 0.

Typically you want to use UnList because you need to construct a function call out of some objects that you have. But you need to call Eval(UnList(...)) to actually evaluate this function call. For example:

In> UnList({Sin, 0})
Out> Sin(0);
In> Eval(UnList({Sin, 0}))
Out> 0;

In effect, evaluation can be stopped with Hold or UnList and can be explicitly restarted by using Eval. If several levels of un-evaluation are used, such as Hold(Hold(...)), then the same number of Eval calls will be needed to fully evaluate an expression.

In> a:=Hold(Sin(0))
Out> Sin(0);
In> b:=Hold(a)
Out> a;
In> c:=Hold(b)
Out> b;
In> Eval(c)
Out> a;
In> Eval(Eval(c))
Out> Sin(0);
In> Eval(Eval(Eval(c)))
Out> 0;

A function FullEval can be defined for "complete" evaluation of expressions, as follows:

LocalSymbols(x,y)
[
  FullEval(_x) <-- FullEval(x,Eval(x));
  10 # FullEval(_x,_x) <-- x;
  20 # FullEval(_x,_y) <-- FullEval(y,Eval(y));
];
Then the example above will be concluded with:
In> FullEval(c);
Out> 0;


Correctness of parameters to functions is not checked

Because Yacas does not enforce type checking of arguments, it is possible to call functions with invalid arguments. By convention, a Yacas function that cannot perform an action should return the expression unevaluated. A function should know when it is failing to perform a task. The typical symptoms are errors that seem obscure, but just mean that the function called should have checked that it can perform the action on the object.

For example:

In> 10 # f(0) <-- 1;
Out> True;
In> 20 # f(_n) <-- n*f(n-1);
Out> True;
In> f(3)
Out> 6;
In> f(1.3)
CommandLine(1): Max evaluation stack depth reached.

Here, the function f is defined to be a factorial function, but the function fails to check that its argument is a positive integer, and thus exhausts the stack when called with a non-integer argument. A better way would be to write
In> 20 # f(n_IsPositiveInteger) <-- n*f(n-1);
Then the function would have returned unevaluated when passed a non-integer or a symbolic expression.
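
Put together, a guarded version might look as follows. The name fac is hypothetical and is chosen here only to avoid clashing with the rules already defined for f in the session above (rules are global, so the unguarded rule for f would otherwise still be in effect):

In> 10 # fac(0) <-- 1;
Out> True;
In> 20 # fac(n_IsPositiveInteger) <-- n*fac(n-1);
Out> True;
In> fac(3)
Out> 6;
In> fac(1.3)
Out> fac(1.3);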


Evaluating variables in the wrong scope

There is a subtle problem that occurs when Eval is used in a function in combination with local variables. The following example illustrates it:

In> f1(x):=[Local(a);a:=2;Eval(x);];
Out> True;
In> f1(3)
Out> 3;
In> f1(a)
Out> 2;

Here the last call should have returned a, but it returned 2, because x was assigned the value a, and a was assigned locally the value of 2, and x gets re-evaluated. This problem occurs when the expression being evaluated contains variables which are also local variables in the function body. The solution is to use the LocalSymbols function for all local variables defined in the body.

The following illustrates this:

In> f2(x):=LocalSymbols(a)[Local(a);a:=2;Eval(x);];
Out> True;
In> f2(3)
Out> 3;
In> f2(a)
Out> a;

Here f2 returns the correct result. x was assigned the value a, but the a within the function body is made distinctly different from the one referred to by x (which, in a sense, refers to a global a), by using LocalSymbols.

This problem generally occurs when defining functions that re-evaluate one of their arguments, typically functions that perform a loop of some sort, evaluating a body at each iteration.