Essays on Yacas

Yacas: A do-it-yourself symbolic algebra environment

A much shorter and more polished version of this paper is published as: Ayal Z. Pinkus and Serge Winitzki, "YACAS: a do-it-yourself computer algebra system", Lecture Notes in Artificial Intelligence 2385, pp. 332-336 (Springer-Verlag, 2002).

by Ayal Zwi Pinkus and Serge Winitzki


Abstract

We describe the design and implementation of Yacas, a free computer algebra system currently under development. The system consists of a core interpreter and a library of scripts that implement symbolic algebra functionality. The interpreter provides a high-level weakly typed functional language designed for quick prototyping of computer algebra algorithms, but the language is suitable for all kinds of symbolic manipulation. It supports conditional term rewriting of symbolic expression trees, closures (pure functions) and delayed evaluation, dynamic creation of transformation rules, arbitrary-precision numerical calculations, and flexible user-defined syntax using infix notation. The library of scripts currently provides basic numerical and symbolic algebra functionality, such as polynomials and elementary functions, limits, derivatives and (limited) integration, solution of (simple) equations. The main advantages of Yacas are: free (GPL) software; a flexible and easy-to-use programming language with a comfortable and adjustable syntax; cross-platform portability and small resource requirements; and extensibility.


Introduction

Yacas is a computer algebra system (CAS) which has been in development since the beginning of 1999. The goal was to make a small system that makes it easy to prototype and research algorithms of symbolic mathematics. A secondary, longer-term goal is to evolve Yacas into a full-blown general-purpose CAS.

Yacas is primarily intended to be a research tool for easy exploration and prototyping of algorithms of symbolic computation. The main advantage of Yacas is its rich and flexible scripting language. The language is closely related to LISP [WH89] but has a recursive descent infix grammar parser [ASU86] which supports defining infix operators at run time, similarly to Prolog [B86], and includes expression transformation (term rewriting) as a basic feature of the language.

The Yacas language interpreter comes with a library of scripts that implement a set of computer algebra features. The Yacas script library is in active development and at the present stage does not offer the rich functionality of industrial-strength systems such as Mathematica or Maple. Extensive implementation of algorithms of symbolic computation is one of the future development goals.

Yacas handles input and output in plain ASCII, either interactively or in batch mode. (A graphical interface is under development.) There is also an optional plugin mechanism whereby external libraries can be linked into the system to provide extra functionality. Basic facilities are in place to compile Yacas scripts to C++ so they can be compiled into plugins.


Basic design

Yacas consists of a "core engine" (kernel), which is an interpreter for the Yacas scripting language, and a library of script code.

The Yacas engine has been implemented in a subset of C++ which is supported by almost all C++ compilers. The design goals for the Yacas core engine are: portability, self-containment (no dependence on extra libraries or packages), ease of implementing algorithms, code transparency, and flexibility. The Yacas system as a whole falls into the "prototype/hacker" rather than into the "axiom/algebraic" category, according to the terminology of Fateman [F90]. There are relatively few specific design decisions related to mathematics; instead, the emphasis is on extensibility.

The kernel offers sufficiently rich but basic functionality through a limited number of core functions. This core functionality includes substitutions and rewriting of symbolic expression trees, an infix syntax parser, and arbitrary-precision numerics. The kernel does not contain any definitions of symbolic mathematical operations and tries to be as general as possible, free of predefined notions or policies in the domain of symbolic computation.

The plugin inter-operability mechanism allows extension of the Yacas kernel and the use of external libraries, e.g. GUI toolkits or implementations of special-purpose algorithms. A simple C++ API is provided for writing "stubs" that make external functions appear in Yacas as new core functions. Plugins are on the same footing as the Yacas kernel and can in principle manipulate all Yacas internal structures. Plugins can be compiled either statically or dynamically as shared libraries to be loaded at runtime from Yacas scripts. In addition, Yacas scripts can be compiled to C++ code for further compilation into a plugin. On systems that do not support plugins, these modules can be linked in statically. The system can also be run without plugins, for debugging and development purposes, in which case the scripts are interpreted.

The script library contains declarations of transformation rules and of function syntax (prefix, infix etc.). The intention is that all symbolic manipulation algorithms, definitions of mathematical functions etc. should be held in the script library and not in the kernel. The only exception so far is for a very small number of mathematical or utility functions that are frequently used; they are compiled into the core for speed.


Portability

Yacas is designed to be as platform-independent as possible. All platform-specific parts have been clearly separated to facilitate porting. Even the standard C++ library is considered to be platform-specific, as there exist platforms without support for the standard C++ library (e.g. the EPOC32 platform).

The primary development platform is GNU/Linux. Currently Yacas runs under various Unix variants, Windows environments, Psion organizers (EPOC32), iPAQ PDAs, BeOS, and Apple iMacs. Creating an executable for another platform (including embedded platforms) should not be difficult.


A self-contained system

Yacas should work as a standalone package, requiring minimal support from other operating system components. As mentioned above, all input and output are in plain ASCII, and external libraries can be linked in through the optional plugin mechanism. The system comes with its own (unoptimized) arbitrary-precision arithmetic module but could be compiled to use another arbitrary-precision arithmetic library; currently, linking to gmp is experimentally supported.

Self-containment is a requirement if the program is to be easy to port. A dependency on libraries that might not be available on other platforms would reduce portability. On the other hand, Yacas can be compiled with a complement of external libraries on "production" platforms.


Ease of use

Yacas is used mainly by executing programs written in the Yacas script language. A design goal is to create a high-level language that allows the user to conveniently express symbolic algorithms. A few lines of user code should go a long way.

One major advantage of Yacas is the flexibility of its syntax. Although Yacas works internally as a LISP-style interpreter, all user interaction is through the Yacas script language which has a flexible infix grammar. Infix operators are defined by the user and may contain non-alphabetic characters such as "=" or "#". This means that the user interacts with Yacas using a comfortable and adjustable infix syntax, rather than a LISP-style syntax. The user can introduce such syntactic conventions as are most convenient for a given problem.

For example, the Yacas script library defines infix operators "+", "*" and so on with conventional precedence, so that an algebraic expression can be entered in the familiar infix form such as

(x+1)^2 - (y-2*z)/(y+3) + Sin(x*Pi/2)

Once such infix operators are defined, it is possible to describe new transformation rules directly using the new syntax. This makes it easy to develop simplification or evaluation procedures adapted to a particular problem.

Suppose the user needs to reorder expressions containing non-commutative creation and annihilation operators of quantum field theory. It takes about 20 lines of Yacas script code to define an infix operation "**" to express non-commutative multiplication with the appropriate commutation relations and to automatically normal-order all expressions involving these symbols and other (commutative) factors. Once the operator ** is defined (with precedence 40),
Infix("**", 40);
the rules that express distributivity of the operation ** with respect to addition may look like this:
15 # (_x + _y) ** _z <-- x ** z + y ** z;
15 # _z ** (_x + _y) <-- z ** x + z ** y;
Here, 15 # is a specification of the rule precedence, _x denotes a pattern-matching variable x, and the expression to the right of <-- is substituted for the matched expression on the left-hand side. Since all transformation rules are applied recursively, these two lines of code are enough for the Yacas engine to expand all brackets in any expression containing the infix operators ** and +.

Rule-based programming is not the only method that can be used in Yacas scripts; there are alternatives that may be more useful in some situations. For example, the familiar if / else constructs and the For and ForEach loops are defined in the script library for the convenience of users.
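
For instance, summing the elements of a list with a ForEach loop looks like this (a minimal sketch of the procedural style):

In> [ Local(sum); sum := 0; \
  ForEach(item, {1,2,3}) [ sum := sum + item; ]; \
  sum; ];
Out> 6;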

Standard patterns of procedural programming, such as subroutines that return values, with code blocks and temporary local variables, are also available. (A "subroutine" is implemented as a new "ground term" with a single rule defined for it.) Users may freely combine rules with C-like procedures or LISP-like list processing primitives such as Head(), Tail().
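
As a small illustration combining these elements (a sketch; the function name reverse is invented here), the following "subroutine" reverses a list using a code block, a local variable, and the list primitives:

reverse(list_IsList) <--
[
  Local(result);
  result := {};
  While(Length(list) != 0)
  [
    result := Concat({Head(list)}, result); // prepend the first element
    list := Tail(list);                     // and drop it from the input
  ];
  result;
];

After these lines are entered, reverse({1,2,3}) evaluates to {3,2,1}.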


Code clarity vs. speed

Speed is obviously an important factor. Wherever a choice had to be made between speed and clarity of code, clarity was chosen: Yacas is mainly a prototyping system and its future maintainability is more important.

This means that special-purpose systems designed for specific types of calculations, as well as heavily optimized industrial-strength computer algebra systems, will outperform Yacas. However, special-purpose or optimized external libraries can be dynamically linked into Yacas using the plugin mechanism.


Flexible, "policy-free" engine

The core engine of the Yacas system interprets the Yacas script language. The reason to implement yet another LISP-based custom language interpreter instead of taking an already existing one was to have full control over the design of the system and to make it self-contained. While most of the features of the Yacas script language are "syntactic sugar" on top of a LISP interpreter, some features not commonly found in LISP systems were added.

As explained above, all definitions of mathematical functions and all symbolic manipulation algorithms are kept in the script library, and only a handful of frequently used utility functions are compiled into the kernel for speed.

For example, the mathematical operator "+" is an infix operator defined in the library scripts. To the kernel, this operator is on the same footing as any other function defined by the user and can be redefined. The Yacas kernel itself does not store any properties for this operator. Instead it relies entirely on the script library to provide transformation rules for manipulating expressions involving the operator "+". In this way, the kernel does not need to anticipate all possible meanings of the operator "+" that users might need in their calculations.
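
As an illustration (a hedged sketch; the atom zero and the chosen rule precedence are invented for this example, and depending on the precedences already used by the library a different value may be needed), a user can attach an additional rule to "+" just as to any other function:

In> 10 # zero + _y <-- y;
Out> True;
In> zero + Sin(x)
Out> Sin(x);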

This policy-free scheme means that Yacas is highly configurable through its scripting language. It is possible to create an entirely different symbolic manipulation engine based on the same C++ kernel, with different syntax and different naming conventions, by simply using another script library instead of the current library scripts. An example of the flexibility of the Yacas system is a sample script wordproblems.ys that comes with the distribution. It contains a set of rule definitions that make Yacas recognize simple English sentences, such as "Tom has 3 apples" or "Jane gave an apple to Tom", as valid Yacas expressions. Yacas can then "evaluate" these sentences to True or False according to the semantics of the current situation described in them.

The "policy-free" concept extends to typing: strong typing is not required by the kernel, but can be easily enforced by the scripts if needed for a particular problem. The language offers features, but does not enforce their use. Here is an example of a policy implemented in the script library:

61 # x_IsPositiveNumber ^ y_IsPositiveNumber 
      <-- MathPower(x,y);
By this rule, expressions of the form x^y are evaluated and replaced by a number if and only if x and y are positive numerical constants. (The function MathPower is defined in the kernel.) If this default simplification is not desirable, the user could erase this rule from the library and have a CAS without this feature.


The Yacas kernel functionality

Yacas script is a functional language based on various ideas that seemed useful for an implementation of CAS: list-based data structures, object properties, and functional programming (a la LISP); term rewriting [BN98] with pattern matching somewhat along the lines of Mathematica [W96]; user-defined infix operators a la Prolog; delayed evaluation of expressions; and arbitrary-precision arithmetic. Garbage collection is implemented through reference counting.

The kernel provides three basic data types: numbers, strings, and atoms, and two container types: lists and static arrays (the latter for speed). Atoms are implemented as strings that can be assigned values and evaluated. Boolean values are simply the atoms True and False. Numbers are represented by objects on which arithmetic can be performed immediately. Expression trees, association (hash) tables, stacks, and closures (pure functions) are all implemented using nested lists. In addition, more data types can be provided by plugins. Kernel primitives are available for arbitrary-precision arithmetic, string manipulation, array and list access and manipulation, basic control flow, assigning variables (atoms), and defining rules for functions (atoms with a function syntax).
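
For example, a short session (a sketch) exercising these basic types might look like this:

In> lst := {1, "two", three};
Out> {1,"two",three};
In> Head(Tail(lst))
Out> "two";
In> IsAtom(three)
Out> True;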

The interpreter engine recursively evaluates expression trees according to user-defined transformation rules from the script library. Evaluation proceeds bottom-up, that is, for each function term, the arguments are evaluated first and then the function is applied to these values.

A HoldArg() primitive is provided to prevent evaluation of certain arguments of certain functions before they are passed on as parameters to those functions. The Hold() and Eval() primitives, similarly to LISP's QUOTE and EVAL, can be used to stop the recursive application of rules at a certain point and obtain an unevaluated expression, or to initiate evaluation of an expression which was previously held unevaluated.
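
For example (a brief sketch of how these two primitives interact):

In> Hold(1+2)
Out> 1+2;
In> Eval(%)
Out> 3;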

When an expression cannot be transformed any further, that is, when no more rules apply to it, the expression is returned unevaluated. For instance, a variable that is not assigned a value will return unevaluated. This is a desired behavior in a symbolic manipulation system. Evaluation is treated as a form of "simplification", in that evaluating an expression returns a simplified form of the input expression.

Rules are matched by a pattern expression which can contain pattern variables, i.e. atoms marked by the "_" operator. During matching, each pattern variable atom becomes a local variable and is tentatively assigned the subexpression being matched. For example, the pattern _x + _y can match an expression a*x+b and then the pattern variable x will be assigned the value a*x (unevaluated) and the variable y will have the value b.

This type of semantic matching has been frequently implemented before in various term rewriting systems (see, e.g., [C86]). However, the Yacas language offers its users an ability to create a much more flexible and powerful term rewriting system than one based on a fixed set of rules. Here are some of the features:

First, transformation rules in Yacas have predicates that control whether a rule should be applied to an expression. Predicates can be any Yacas expressions that evaluate to the atoms True or False and are typically functions of pattern variables. A predicate could check the types or values of certain subexpressions of the matching context (see the _x ^ _y example in the previous subsection).

Second, rules are assigned a precedence value (a positive integer) that controls the order in which rules are attempted. Thus Yacas provides somewhat better control over the automatic recursion than the pattern-matching system of Mathematica, which does not allow for rule precedence. The interpreter will apply the rule with the lowest precedence value among all rules whose patterns match the expression and whose predicates return True.

Third, new rules can be defined dynamically as a side-effect of evaluation. This means that there is no predefined "ranking alphabet" of "ground terms" (in the terminology of [TATA99]), in other words, no fixed set of functions with predefined arities. It is also possible to define a "rule closure" that defines rules depending on its arguments, or to erase rules. Thus, a Yacas script library (although it is read-only) does not represent a fixed tree rewriting automaton. An implementation of machine learning is possible in Yacas (among other things). For example, when the module wordproblems.ys (mentioned in the previous subsection) "learns" from the user input that apple is a countable object, it defines a new postfix operator apples and a rule for its evaluation, so the expression 3 apples is later parsed as a function apples(3) and evaluated according to the rule.
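
To sketch how rules can be created programmatically (using the MacroRuleBase and MacroRule primitives that appear again later in this text; the function name double is invented here, and the exact Out> values may differ):

In> MacroRuleBase("double", {n});
Out> True;
In> MacroRule("double", 1, 1, True) 2*n;
Out> True;
In> double(7)
Out> 14;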

Fourth, Yacas expressions can be "tagged" (assigned a "property object" a la LISP) and tags can be checked by predicates in rules or used in the evaluation.

Fifth, the scope of variables can be controlled. In addition to having its own local variables, a function can be allowed to access local variables of its calling environment (the UnFence() primitive). It is also possible to encapsulate a group of variables and functions into a "module", making some of them inaccessible from the outside (the LocalSymbols() primitive). The scoping of variables is a "policy decision", to be enforced by the script which defines the function. This flexibility is by design and makes it easy to modify the behavior of the interpreter, effectively changing the language as needed.
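
For instance, a "module-private" counter might be created as follows (a sketch; the name NextCount is invented here). The atom count inside the LocalSymbols() body is replaced by a unique hidden symbol, so it cannot be accessed from outside:

LocalSymbols(count) [
  count := 0;                              // hidden state of the "module"
  NextCount() <-- [ count := count + 1; ]; // the rule body captures the hidden symbol
];

Each call to NextCount() then increments and returns the hidden counter.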


The Yacas scripting language

The Yacas interpreter is sufficiently powerful so that the functions For, ForEach, if, else etc., as well as the convenient shorthand "...<--..." for defining new rules, can be defined in the script library itself rather than in the kernel. This power is fully given to the user, since the library scripts are on the same footing as any user-defined code. Some library functions are intended mainly as tools available to a Yacas user to make algorithm implementation more comfortable. Below are some examples of the features provided by the Yacas script language.

Yacas supports "function overloading": it allows a user to declare functions f(x) and f(x,y), each having its own set of transformation rules. Of course, different rules can be defined for the same function name with the same number of arguments (arity) but with different argument patterns or different predicates.
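
For example (a sketch with an invented function f), both arities can be declared side by side:

In> f(_x) <-- {x};
Out> True;
In> f(_x,_y) <-- {x,y};
Out> True;
In> f(1)
Out> {1};
In> f(1,2)
Out> {1,2};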

Simple transformations on expressions can be performed using rules. For instance, if we need to expand the natural logarithm in an expression, we could use the following rules:

log(_x * _y) <-- log(x) + log(y);
log(_x ^ _n) <-- n * log(x);
These two rules define a new symbolic function log which will not be evaluated but only transformed if one of these two rules is applicable. The symbol _, as before, indicates that the following atom is a pattern variable that matches subexpressions.

After entering these two rules, the following interactive session is possible:

In> log(a*x^2)

log( a ) + 2 * log( x )

Integration of the new function log can be defined by adding a rule for the AntiDeriv function atom,

AntiDeriv(_x,log(_x)) <-- x*log(x)-x;
Now Yacas can do integrations involving the newly defined log function, for example:

In> Integrate(x)log(a*x^n)

log( a ) * x + n * ( x * log( x ) - x ) + C18

In> Integrate(x,B,C)log(a*x^n)

log( a ) * C + n * ( C * log( C ) - C ) -

( log( a ) * B + n * ( B * log( B ) - B ) )

Rules are applied when their associated patterns match and when their predicates return True. Rules also have precedence, an integer value to indicate which rules need to be applied first. Using these features, a recursive implementation of the integer factorial function may look like this in Yacas script,

10 # Factorial(_n) _ (n=0) <-- 1;
20 # Factorial(n_IsInteger) _ (n>0) <--
  n*Factorial(n-1);
Here the rules have precedence 10 and 20, so that the first rule will be tried first and the recursion will stop when n=0 is reached.
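
After these two rules are entered, the function can be used right away:

In> Factorial(5)
Out> 120;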

Rule-based programming can be freely combined with procedural programming when the latter is a more appropriate method. For example, here is a function that computes Mod(x^n, m) efficiently:

powermod(x_IsPositiveInteger, 
   n_IsPositiveInteger,
   m_IsPositiveInteger) <--
[
  Local(result);
  result:=1;
  x:=Mod(x,m);
  While(n != 0)
  [
    if ((n&1) = 1)     // lowest binary digit of n is 1:
    [
      result := Mod(result*x,m);
    ];
    x := Mod(x*x,m);   // repeatedly square x modulo m
    n := n>>1;         // drop the lowest binary digit of n
  ];
  result;              // the block returns Mod(x^n,m)
];

Interaction with the function powermod(x,n,m) would then look like this:

In> powermod(2,10,100)
Out> 24;
In> Mod(2^10,100)
Out> 24;
In> powermod(23234234,2342424234,232423424)
Out> 210599936;


Currently supported CAS features

Yacas consists of approximately 22000 lines of C++ code and 13000 lines of scripting code, with 170 functions defined in the C++ kernel and 600 functions defined in the scripting language. These numbers are deceptively small. The program is written in clean and simple style to keep it maintainable. Excessive optimization tends to bloat software and make it less readable.

A base of mathematical capabilities has already been implemented in the script library (the primary sources of inspiration were the books [K98], [GG99] and [B86]). The script library is currently under active development. The following section demonstrates a few facilities already offered in the current system.

Basic operations of elementary calculus have been implemented:

In> Limit(n,Infinity)(1+(1/n))^n

Exp( 1 )

In> Limit(h,0) (Sin(x+h)-Sin(x))/h

Cos( x )

In> Taylor(x,0,5)ArcSin(x)

     3        5
    x    3 * x 
x + -- + ------
    6      40  

In> InverseTaylor(x,0,5)Sin(x)

     5    3    
3 * x    x     
------ + -- + x
  40     6     

In> Integrate(x,a,b)Ln(x)+x

                   2   /                    2 \
                  b    |                   a  |
b * Ln( b ) - b + -- - | a * Ln( a ) - a + -- |
                  2    \                   2  /

In> Integrate(x)1/(x^2-1)

Ln( 2 * ( x - 1 ) )   Ln( 2 * ( x + 1 ) )      
------------------- - ------------------- + C38
         2                     2               

In> Integrate(x)Sin(a*x)^2*Cos(b*x)

Sin( b * x )   Sin( -2 * x * a + b * x )   
------------ - ------------------------- - 
   2 * b          4 * ( -2 * a + b )       

Sin( -2 * x * a - b * x )      
------------------------- + C39
   4 * ( -2 * a - b )          

In> OdeSolve(y''==4*y)

C193 * Exp( -2 * x ) + C195 * Exp( 2 * x )

Solving systems of equations has been implemented using a generalized Gaussian elimination scheme:

In> Solve( {x+y+z==6, 2*x+y+2*z==10, \ 
In> x+3*y+z==10}, \ 
In>      {x,y,z} ) [1]
Out> {4-z,2,z};
(The solution of this underdetermined system is returned as a vector, so x=4-z, y=2, and z remains arbitrary.)

A small theorem prover [B86] using the resolution principle is offered:

In> CanProve(P Or (Not P And Not Q))

Not( Q ) Or P

In> CanProve(a > 3 And a < 2)

False

Various exact and arbitrary-precision numerical algorithms have been implemented:

In> N(1/7,40);    // evaluate to 40 digits
Out> 0.1428571428571428571428571428571428571428;
In> Decimal(1/7);    // obtain decimal period
Out> {0,{1,4,2,8,5,7}};
In> N(LnGamma(1.234+2.345*I)); // gamma-function
Out> Complex(-2.13255691127918,0.70978922847121);

Various domain-specific expression simplifiers are available:

In> RadSimp(Sqrt(9+4*Sqrt(2)))

Sqrt( 8 ) + 1

In> TrigSimpCombine(Sin(x)^2+Cos(x)^2)

1

In> TrigSimpCombine(Cos(x/2)^2-Sin(x/2)^2)

Cos( x )

In> GcdReduce((x^2+2*x+1)/(x^2-1),x)

x + 1
-----
x - 1

Univariate polynomials are supported in a dense representation, and multivariate polynomials in a sparse representation:

In> Factor(x^6+9*x^5+21*x^4-5*x^3-54*x^2-12*x+40)

         3            2            
( x + 2 )  * ( x - 1 )  * ( x + 5 )

In> Apart(1/(x^2-x-2))

      1               1      
------------- - -------------
3 * ( x - 2 )   3 * ( x + 1 )

In> Together(%)

         9         
-------------------
     2             
9 * x  - 9 * x - 18

In> Simplify(%)

     1     
----------
 2        
x  - x - 2

Various "syntactic sugar" functions are defined to more easily enter expressions:

In> Ln(x*y) /: { Ln(_a*_b) <- Ln(a) + Ln(b) }

Ln( x ) + Ln( y )

In> Add(x^(1 .. 5))

     2    3    4    5
x + x  + x  + x  + x 

In> Select("IsPrime", 1 .. 15)
Out> {2,3,5,7,11,13};

Groebner bases [GG99] have been implemented:

In> Groebner({x*(y-1),y*(x-1)})

/           \
| x * y - x |
|           |
| x * y - y |
|           |
| y - x     |
|           |
|  2        |
| y  - y    |
\           /
(From this it follows that x=y, and x^2=x so x is 0 or 1.)

Symbolic inverses of matrices:

In> Inverse({{a,b},{c,d}})

/                                      \
| /       d       \ /    -( b )     \  |
| | ------------- | | ------------- |  |
| \ a * d - b * c / \ a * d - b * c /  |
|                                      |
| /    -( c )     \ /       a       \  |
| | ------------- | | ------------- |  |
| \ a * d - b * c / \ a * d - b * c /  |
\                                      /

This list of features is not exhaustive.


Interface

Currently, Yacas is primarily a text-oriented application with interactive interface through the text console. Commands are entered and evaluated line by line; files containing longer code may be loaded and evaluated. A "notebook" interface under the GNU Emacs editor is available. There is also an experimental graphical interface (proteus) for Unix and Windows environments.

Debugging facilities are implemented: one can trace the execution of a function, trace the application of a given rule pattern, examine the stack when recursion does not terminate, or use the online debugger from the command line. An experimental debug version of the Yacas executable that provides more detailed information can be compiled.


Documentation

The documentation for Yacas is extensive and is actively updated, following the development of the system. Documentation currently consists of two tutorial guides (user's introduction and programmer's introduction), a collection of essays that describe some advanced features in more detail, and a full reference manual.

Yacas currently comes with its own document formatting module that allows maintenance of documentation in a special plain text format with minimal markup. This text format is automatically converted to HTML, LaTeX, PostScript and PDF formats. The HTML version of the documentation is hyperlinked and is used as online help available from the Yacas prompt.


Future plans

The long-term goal for Yacas is to become an industrial-strength CAS and to remain a flexible research tool for easy prototyping of various methods of symbolic calculations. Yacas is meant to be a repository and a testbed for such algorithm prototypes.

The plugin facility will be extended in the future, so that a rich set of additional libraries (especially free software libraries), system-specific as well as mathematics-oriented, should be loadable from the Yacas system. The issue of speed is also continuously being addressed.


References

[ASU86] A. Aho, R. Sethi and J. Ullman, Compilers (Principles, Techniques and Tools), Addison-Wesley, 1986.

[B86] I. Bratko, Prolog (Programming for Artificial Intelligence), Addison-Wesley, 1986.

[BN98] F. Baader and T. Nipkow, Term rewriting and all that, Cambridge University Press, 1998.

[C86] G. Cooperman, A semantic matcher for computer algebra, in Proceedings of the symposium on symbolic and algebraic computation (1986), Waterloo, Ontario, Canada (ACM Press, NY).

[F90] R. Fateman, On the design and construction of algebraic manipulation systems, in: Proceedings of ISSAC-90, Tokyo, Japan (ACM Press, NY).

[GG99] J. von zur Gathen and J. Gerhard, Modern Computer Algebra, Cambridge University Press, 1999.

[K98] D. Knuth, The Art of Computer Programming (Volume 2, Seminumerical Algorithms), Addison-Wesley, 1998.

[TATA99] H. Comon, M. Dauchet, R. Gilleron, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi, Tree Automata Techniques and Applications, 1999, online book: http://www.grappa.univ-lille3.fr/tata

[W96] S. Wolfram, The Mathematica book, Wolfram Media, Champaign, 1996.

[WH89] P. Winston and B. Horn, LISP, Addison-Wesley, 1989.


M. Wester's CAS benchmark and Yacas

In his 1994 paper "Review of CAS mathematical capabilities", Michael Wester put forward 123 problems that a reasonable computer algebra system should be able to solve, and tested the then-current versions of various commercial CAS on this list.

Below is the list of Wester's problems with the corresponding Yacas code. "OK" means a satisfactory solution, "BUG" means that Yacas gives a wrong solution or breaks down, "NO" means that the relevant functionality is not yet implemented.

Yacas version: 1.0.57

Verify(Simplify(Integrate(x) Abs(x)),
  Abs(x)*x/2);


On Yacas programming


Example: implementing a formal grammar

To illustrate the use of rules, consider a theorem prover in a simple formal grammar. (The example is the "ABIN system" from the book: W. Robinson, Computers, minds and robots, Temple University Press, 1992. Warning: the book is about philosophy.)

Well-formed expressions consist of symbols A, B, I, N and are either

1. the symbol B followed by zero or more symbols I (the "B-expressions", for example B, BI, BII);
2. the symbol N followed by a well-formed expression (the "N-expressions");
3. the symbol A followed by two well-formed expressions (the "A-expressions").

This defines a certain set of well-formed expressions (statements of the ABIN language); for example, NBII is a statement of the language but AB is not. The truth/falsehood interpretation of the ABIN language is the following. All well-formed expressions starting with NB are interpreted as true statements (they are "axioms" of the system). In addition, there is one deduction rule allowing one to prove "theorems": if a statement Nx has been proved and y is any well-formed expression, then the statement NAxy is also proved.

Thus, NABIBI can be proved starting from the axiom NBI, but NANBB cannot be proved. The task at hand is to decide whether a given sequence of symbols is a provable statement of the ABIN language.

(The idea behind this interpretation is to assume that all B, BI etc. are some false statements that one could denote "B0", "B1" according to the number of "I" symbols; "N" is the logical Not and "A" is the logical And. Then the statement NABIB would mean "it is false that both B0 and B1 are true" and NANBB would mean "it is false that both B0 and negation of B0 are true". The NANBB statement is true in this interpretation but the deductive system of ABIN is too weak to obtain its proof.)


Implementation using predicates

The easiest way to model the ABIN language in Yacas is by using predicates. Our goal will be to define a predicate IsProvable(x) that will return True when x is a provable ABIN statement and False otherwise. We shall define IsProvable(x) recursively through several auxiliary predicates. Naturally, we would like to have a predicate to test well-formedness: IsExpr(x). It is necessary also to have predicates for B-expressions, N-expressions and A-expressions, as well as for axioms and theorems. We might implement expressions by lists of symbols, e.g. {"B", "I"} and begin to code by

IsExpr(x_IsList) <-- IsBExpr(x) Or
  IsNExpr(x) Or IsAExpr(x);
IsProvable(x_IsList) <-- IsAxiom(x) Or
  IsTheorem(x);
IsAxiom(x_IsList) <-- IsNExpr(x) And
  IsBExpr(Tail(x));

The definitions of IsBExpr(x) and IsNExpr(x) are simple recursions expressing rules 1 and 2 of the ABIN grammar. Note the use of Take to create a copy of a list (we'd better not modify the value of x in the body of the rule).

10 # IsBExpr({}) <-- False;
10 # IsBExpr({"B"}) <-- True;
20 # IsBExpr(x_IsList) <-- x[Length(x)]="I"
  And IsBExpr(Take(x, {1, Length(x)-1}));

10 # IsNExpr({}) <-- False;
20 # IsNExpr(x_IsList) <-- x[1] = "N" And
  IsExpr(Tail(x));

The predicate IsAExpr(x) is a little bit more complicated because our rule 3 requires finding two well-formed expressions that follow A. Also, for proving theorems we need to be able to extract the first of these expressions. With this in mind, we define another auxiliary function, FindTwoExprs(x), that returns the results of a search for two well-formed expressions in the list x. The return value of this function will be a pair such as {True, 3} to indicate that two well-formed expressions were found, the first expression being of length 3. We shall use a For loop for this function:

FindTwoExprs(x_IsList) <-- [
 Local(iter, result);
 For( [ iter:=1; result:=False; ],
  iter < Length(x) And Not result,
    iter:=iter+1 )
   [
    // try splitting the list after position iter
    result := IsExpr(Take(x, iter))
     And IsExpr(Take(x, {iter+1,
       Length(x)}));
   ];
 {result, iter-1}; // iter-1 is the length of the first expression
];

Now we can define the remaining predicates:

10 # IsAExpr(x_IsList)_(Length(x) <= 1)
  <-- False;
20 # IsAExpr(x_IsList) <-- x[1] = "A" And
  FindTwoExprs(Tail(x))[1];

IsTheorem(x_IsList) <-- IsNExpr(x) And
  IsAExpr(Tail(x)) And IsProvable(
    Concat({"N"}, Take(Tail(Tail(x)),
   FindTwoExprs(Tail(Tail(x)))[2]) ));

The ABIN language is now complete. Let us try some simple examples:

In> IsExpr({"A","B"});
Out> False;
In> IsExpr({"N","B","I"});
Out> True;
In> IsAxiom({"N","B","I"});
Out> True;
In> IsTheorem({"N","B","I"});
Out> False;
In> IsProvable({"N","B","I"});
Out> True;
In> IsProvable({"N","A","B","I","B"});
Out> True;

It is somewhat inconvenient to type long lists of characters. So we can create an interface function to convert atomic arguments to lists of characters, e.g. AtomToCharList(BII) will return {"B","I","I"} (provided that the symbol BII has not been given a definition). Then we define a function ABIN(x) to replace IsProvable.

AtomToCharList(x_IsAtom) <-- [
  Local(index, result);
  For( [ index:=Length(String(x));
     result:={}; ],
   index > 0, index:=index-1 )
   Push(result, StringMid(index, 1,
     String(x)));
  result;
];
HoldArg(AtomToCharList, 1);
ABIN(x) := IsProvable(AtomToCharList(x));

In> AtomToCharList(NBII);
Out> {"N", "B","I","I"};
In> ABIN(NANBB);
Out> False;

It is easy to modify the predicates IsTheorem() and IsAxiom() so that they print the sequence of intermediate theorems and axioms used for deriving a particular theorem. The final version of the code is in the file examples/ABIN.ys. Now we can try to check a "complicated" theorem and see an outline of its proof:

In> ABIN(NAAABIIBIBNB);
Axiom {"NBII"}
Theorem NABIIBI derived
Theorem NAABIIBIB derived
Theorem NAAABIIBIBNB derived
Out> True;


Example: Using rules with special syntax operators creatively

Any Yacas function can be declared to have special syntax: in other words, it can be made into a prefix, infix, postfix, or bodied operator. In this section we shall see how prefix, infix, and postfix operators understood by Yacas can be adapted to a problem that seems to be far removed from algebra. Nevertheless it is instructive to understand how rewriting rules are used with special syntax operators.

Suppose we want to build a system that understands a simple set of English sentences and will be able to answer questions. For example, we would like to say "Tom had an apple and Jane gave 3 apples to Tom"; the system should understand that Tom has 4 apples now. In the usual LISP-based treatments of artificial intelligence, this problem would be illustrated with a cumbersome list syntax such as (had (Tom apple 1)) but we would like to use the power of the Yacas syntax and use plain English.

We shall create a set of rules that will "simplify" sentences to atoms such as True or False. As a side-effect, these "simplifications" will maintain a "knowledgebase" of information about all existing persons and objects.


The talking machine

The full source of this example is in the file examples/wordproblems.ys. In the next subsection we shall discuss the basic issues of the implementation. For now, here is an example session that shows what functionality we have in mind:

Unix>   yacas
[editvi.ys] [gnuplot.ys] [unix.ys] 
True;
Numeric mode: "Internal"
To exit Yacas, enter  Exit(); or quit or Ctrl-c.
  Type ?? for help.
Or type ?function for help on a function.
Type 'restart' to restart Yacas.
To see example commands, keep typing Example();
In> Load("wordproblems.ys")
Out> True;
In> Jitse and Ayal are persons;
OK, Jitse is a person.
OK,  Ayal is a person.
Out> {True,True};
In> apple is an object;
OK,  apple is an object.
Out> True;
In> there are many apples and pears;
Note: we already know that  apple  is an object
OK, we assume that the plural of " apple " is
  " apples ".
OK,  pear is an object.
OK, we assume that the plural of " pear " is
  " pears ".
Out> {True,True};
In> Serge had an apple;
OK,  Serge is a person.
OK,  Serge  has  1   apples  now.
Out> True;
In> Jitse had (10!) pears;
OK,  Jitse  has  3628800   pears  now.
Out> True;
In> Ayal had (2+3) apples and Serge had \
  2 pears;
OK,  Ayal  has  5   apples  now.
OK,  Serge  has  2   pears  now.
Out> {True,True};
In> Serge ate the apple;
OK,  Serge  has  no   apples  now.
Out> True;
In> Ayal ate a pear;// this should fail
Error:  Ayal  does not have enough
  pears  at this time.
Out> False;
In> Ayal gave an apple to Serge and \
  Serge gave a pear to Ayal;
OK,  Ayal  has  4   apples  now.
OK,  Serge  has  1   apples  now.
OK,  Serge  has  1   pears  now.
OK,  Ayal  has  1   pears  now.
Out> {True,True};
In> Ayal ate a pear;
OK,  Ayal  has  no   pears  now.
Out> True;
In> soup is an object and Ayal had \
  some soup;
OK,  soup is an object.
OK,  Ayal  has some  soup  now.
Out> {True,True};
In> Ayal gave soup to Serge and Serge \
  ate the soup;
OK,  Ayal  has  no   soup  now.
OK,  Serge  has some  soup  now.
OK,  Serge  has  no   soup  now.
Out> {True,True};
In> Serge has soup
Out> no;
In> Serge has apples
Out> 1;
In> Ayal has apples
Out> 4;
In> Serge has some soup
Out> False;
In> Serge has some apples
Out> True;
In> Ayal has some pears
Out> False;
In> Knowledge();
OK, this is what we know:
Persons: Jitse, Ayal, Serge
Object names: soup, pear, apple
Countable objects: pears, apples
Jitse has: 
 3628800  pears 

Ayal has: 
 4  apples 
 no  pears 
 no  soup 

Serge has: 
 1  apples 
 1  pears 
 no  soup 

Out> True;
In> software is an object
OK,  software is an object.
Out> True;
In> Ayal made some software
OK,  Ayal  has some  software  now.
Out> True;
In> Ayal gave some software to everyone
OK,  everyone is a person.
OK,  Ayal  still has some  software 
OK,  everyone  has some  software  now.
Out> True;
In> Ayal gave some software to Serge
OK,  Ayal  still has some  software 
OK,  Serge  has some  software  now.
Out> True;
In> Serge ate the software
OK,  Serge  has  no   software  now.
Out> True;

The string "OK" is printed when there is no error, "Note" when there is a warning, and "Error" on any inconsistencies in the described events. The special function Knowledge() prints everything the system currently knows.

Now we shall see how this system can be implemented in Yacas with very little difficulty.


Parsing sentences

A sentence such as "Mary had a lamb" should be parsed as a valid Yacas expression. Since this sentence contains more than one atom, it should be parsed as a function invocation, or else Yacas will simply give a syntax error when we type it in.

It is logical to declare "had" as an infix operator and "a" as a prefix operator quantifying lamb. In other words, "Mary had a lamb" should be parsed into had(Mary, a(lamb)). This is how we can do it:

In> [ Infix("had", 20); Prefix("a", 10); ]
Out> True;
In> FullForm(Mary had a lamb)
(had Mary (a lamb ))
Out> Mary had a lamb;
Now this sentence is parsed as a valid Yacas expression (although we have not yet defined any rules for the functions "a" and "had").

Note that we declared the precedence of the prefix operator "a" to be 10. We have in mind declaring another infix operator "and" and we would like quantifiers such as "a", "an", "the" to bind more tightly than other words.

Clearly, we need to plan the structure of all admissible sentences and declare all necessary auxiliary words as prefix, infix, or postfix operators. Here are the patterns for our admissible sentences:

"X is a person" -- this declares a person. Parsed: is(X, a(person))

"X and Y are persons" -- shorthand for the above. Parsed: are(and(X, Y), persons). "person" and "persons" are unevaluated atoms.

"A is an object" -- this tells the system that "A" can be manipulated. Parsed: is(A, an(object))

"there are many As" -- this tells the system that "A" can be counted (by default, objects are not considered countable entities, e.g. "milk" or "soup"). Parsed: are(there, many(As)). Here "As" is a single atom which will have to be stripped of the ending "s" to obtain its singular form.

"X ate N1 As", for example, Tom ate 3 apples -- parsed as ate(Tom, apples(3)). Since we cannot make the number 3 into an infix operator, we have to make apples into a postfix operator that will act on 3.

"X gave N As to Y" -- Here "N" is a number and "A" is the name of an object. Parsed as: gave(X, to(As(N), Y)). So to and gave are infix operators and to binds tighter than gave.

Sentences can be joined by "and", for example: "Tom gave Jane an apple and Jane ate 3 pears". This will be parsed as the infix operator "and" acting on both sentences which are parsed as above. So we need to make "and" of higher precedence than other operators, or else it would bind (apple and Jane) together.

"X made some A" -- note that if "A" is not countable, we cannot put a number so we need to write some which is again a prefix operator. made is an infix operator.

"X ate some A" -- the interpretation is that some A is still left after this, as opposed to "X ate the A" or "X ate A".

"X gave some A to Y" -- similarly, X still has some A left after this.

After each sentence, the system should know who has what at that time. Each sentence is parsed separately and should be completely interpreted, or "simplified".

All knowledge is maintained in the variable Knowledge which is an associative list with three entries:
Knowledge := {
	{"objects", {} },
	{"countable objects", {} },
	{"persons", {} }
};
The values under the keys "objects" and "countable objects" are lists of names of declared objects. The value under the "persons" key is a doubly nested associative list that specifies which objects each person has and how many. So, for example, Knowledge["persons"]["Tom"]["apples"] should give the number of apples Tom has now, or the atom Empty if he has none.
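
Since associative lists in Yacas are ordinary nested lists indexed by their string keys, the knowledgebase can be inspected directly (a sketch; right after the initialization above, all entries are empty):

In> Knowledge["persons"]
Out> {};
In> Knowledge["countable objects"]
Out> {};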


Declaring objects

Declaring persons is easy: we just create a new entry in the "persons" list. This can be done by an auxiliary routine DeclarePerson(). Note that after we have declared the words "is", "a" to be operators, we can just write the rule using them:

Infix("is", 20);
Prefix("a", 10);
_x is a person <-- DeclarePerson(x);
Here "person" will be left as an unevaluated atom and we shall never have any rules to replace it. Some other words such as "object", "objects" or "there" will also remain unevaluated atoms.

The operator "and" will group its operands into a list:

Infix("and", 30);
10 # x_IsList and _y <-- Concat(x, {y});
15 # _x and _y <-- Concat({x}, {y});
So expressions such as "Lisa and Anna and Maria" will be automatically transformed into {Lisa, Anna, Maria}. We shall adapt our rules to operate on lists of operands as well as on simple operands and that will automatically take care of sentences such as "there are many apples and ideas".

10 # there are many xs_IsList <--
  MapSingle("DeclareCountable", xs);
20 # there are many _xs <-- DeclareCountable(xs);

However, in the present system we cannot simultaneously parse "there are many apples and ideas" and "Isaac had an apple and an idea" because we have chosen had to bind tighter than and. We could in principle choose another set of precedences for these operators; this would allow some new sentences but at the same time disallow some sentences that are admissible with the current choice. Our purpose, however, is not to build a comprehensive system for parsing English sentences, but to illustrate the usage of syntax in Yacas.

Declaring objects is a little more tricky (the function DeclareCountable). For each countable object (introduced by the phrase "there are many ...s") we need to introduce a new postfix operator with a given name. This postfix operator will have to operate on a preceding number, so that a sentence such as "Mary had 3 lambs" will parse correctly.

If x were an unevaluated atom such as "lambs" which is passed to a function, how can we declare lambs to be a postfix operator within that function? The string representation of the new operator is String(x). But we cannot call Postfix(String(x)) because Postfix() does not evaluate its arguments (as of Yacas 1.0.49). Instead, we use the function UnList to build the expression Postfix(String(x)), with String(x) evaluated, from the list {Postfix, String(x)}, and we use the function Eval() to evaluate the resulting expression (which actually calls Postfix()):
Eval(UnList({Postfix, String(x)} ));
We also need to declare a rulebase for the operator named String(x). We use MacroRuleBase for this:
MacroRuleBase(String(x), {n});

Finally, we would need to define a rule for "had" which can match expressions such as
_Person had n_IsNumber _Objects
where _Objects would be a pattern matcher for an unknown postfix operator such as lambs in our previous example. But we discover that it is impossible to write rules that match an unknown postfix operator. The syntax parser of Yacas cannot do this for us; so we should find a workaround. Let us define a rule for each object operator that will transform an expression such as {5 lambs} into a list {lambs, 5}. In this list, "lambs" will just remain an unevaluated atom.

Incidentally, the parser of Yacas does not allow unevaluated atoms that are at the same time declared as prefix operators, but it is okay to have such atoms as infix or postfix operators.

A rule that we need for an operator named String(x) can be defined using MacroRule:
MacroRule(String(x), 1, 1, True) {x, n};
Now, after we declare "lambs" as an operator, the routine will define these rules, and anything on which "lambs" acts will be transformed into a list.
In> 5 lambs;
Out> {lambs, 5};
In> grilled lambs
Out> {lambs, grilled};

But what about the expression "many lambs"? In it, many is a prefix operator and lambs is a postfix operator. It turns out that for Yacas it is the prefix operator that is parsed first (and remember, we cannot have unevaluated atoms with the same name as a prefix operator!) so "many lambs" will be transformed into many(lambs) and not into an illegal expression {lambs, many}.


Implementing semantics

Once all the syntax is implemented, the semantics of these sentences is very easy to express as rules. All sentences are either about how something exists, or about someone "having", "making", "eating", or "giving" certain objects. With the rules described so far, a complicated sentence such as
Ayal gave soup to Serge and Serge ate the soup
will already be parsed into the function calls
{gave(Ayal, to(soup, Serge)), ate(Serge,
  {soup, 1})}
So now we only need to make sure that all this information is correctly entered into the knowledgebase and any inconsistencies (e.g. eating something you do not have) are flagged.

Here is the simplest rule: "giving" is implemented as a sequence of "eating" and "making".
10 # _x gave _obj to _y <-- [
	x ate obj;
	y made obj;
];

One more subtlety connected with the notion of "countable" vs. "uncountable" objects is that there are two different actions one can perform on an "uncountable" object such as "soup": one can eat (or give away) all of it or only some of it. This is implemented using the keyword "some" which is a prefix operator that turns its argument into a list,

some _obj <-- {obj, True};
This list looks like the result of another quantifier, e.g.
the _x  <-- {x, 1};
but in fact the special value True in it is used in the definition of "ate" so that when you "eat" "some" of the object, you still have "some" of it left.

To implement this, we have made a special rule for the pattern
_x had {obj, True} <-- ...
separately from the general rule
_x had {obj, n_IsNumber} <-- ...
and its shorthand
(_x had _obj )_(Not IsList(obj)) <--
  x had {obj, 1};

Admittedly, the module wordproblems.ys does not have much practical use, but it is fun to play with and it illustrates the power of syntactic constructions from an unexpected angle.


Creating plugins for Yacas


Introduction

Yacas supports dynamic loading of libraries ("plugins") at runtime. This makes it possible to interface with other libraries or external code and support additional functionality. In a Yacas session, plugins are seen as additional Yacas functions that become available after a plugin has been loaded. Plugins may be loaded at any time during a Yacas session.

The libltdl library, part of the libtool package and distributed together with Yacas, is used to load the plugins. This means that plugins will only function on platforms supported by libltdl. This includes Linux, Mac OS X, Solaris, and HP-UX.

Plugins currently have to be written in C++ and require certain include files from the Yacas source tree. A plugin may be compiled as a shared object after Yacas itself has been compiled and installed successfully.


An example plugin

Here is what we need to do to make a new plugin:

// FILE: func1.h
double func1 (double x, double y);

// FILE: func1.cc
#include "stubs.h"	// required
#include "func1.h"	// our exported API

#include <math.h>
// we need math.h for sin()
double func1 (double x, double y) {
  return sin(x/y);
}

For our simple plugin, we shall only need a few lines of code in the Yacas stub:

/* FILE: func1_api.stub */
Use("cstubgen.rep/code.ys");

/* Start generating a C++ stub file */
StubApiCStart("Func1");

/* Write some documentation */
StubApiCRemark("This function computes
  beautiful waves.");

/* define a plugin-specific include file */
StubApiCInclude("\"func1.h\"");

/* Declare a plugin-specific function */
StubApiCFunction("double","func1","Func1",
  { {"double","x"},{"double","y"}});

/* generate a C++ stub file
  "func1_api.cc" */
StubApiCFile("libfunc1","func1_api");

Another example of a Yacas stub is found in the file plugins/example/barepluginapi.stub of the source tree.

The stub file is processed with Yacas to generate the C++ stub code, and then the plugin sources are compiled and linked with libtool:

yacas -pc func1_api.stub

libtool c++ -I/usr/local/include/yacas/
  -I/usr/local/include/yacas/plat/linux32/
  -c func1.cc

libtool c++ -I/usr/local/include/yacas/
  -I/usr/local/include/yacas/plat/linux32/
  -c func1_api.cc

libtool c++ -module -avoid-version 
  -no-undefined -rpath `pwd` -o libfunc1.la
  func1.lo func1_api.lo

If compilation succeeds, the dynamic library file libfunc1.la is created. This is our plugin; now it could be installed into the Yacas plugin path (usually /usr/local/lib/yacas/) with

libtool cp libfunc1.la /usr/local/lib/yacas/libfunc1.la

But we can also keep it in the working directory.

In> DllLoad("libfunc1");
Out> True;

If the plugin is not installed in the Yacas plugin path, we have to specify the path to the plugin, e.g. DllLoad("./libfunc1").

In> Func1(2,3);
Out> 0.61837;
When we are finished using the function, we might unload the DLL:

In> DllUnload("libfunc1");
Out> True;


A dynamically generated plugin

Since Yacas can load plugins at runtime, why not have them generated also at runtime? Here is how we could dynamically create plugin functions to speed up numerical calculations.

Suppose we had a numerical function on real numbers, e.g.

In> f(x,y):=Sin(2*x*Pi)+Sqrt(2)*Cos(y*Pi);
Out> True;

We can generate some C++ code that calculates this function for given floating-point arguments (not for multiple-precision numbers):

In> CForm(f(x,y));
Out> "sin(2 * x * Pi) + sqrt(2)
* cos(y * Pi)";
(Note that we would need to define Pi in the C++ file.)

Now it is clear that all the steps needed to compile, link, and load a plugin that implements f(x,y) can be performed automatically by a Yacas script. (You would need a C++ compiler on your system, of course.)

This is implemented in the function MakeFunctionPlugin() (see file unix.ys in the addons/ subdirectory). To use it, we might first define a function such as f(x,y) above and then call

In> MakeFunctionPlugin("fFast", f(x,y));
Function fFast(x,y) loaded from
  ./plugins.tmp/libfFast_plugin_cc.so
Out> True;
In> fFast(2,3);
Out> -1.41421;

Now we can use the function fFast(x,y) which is implemented using an external plugin library. The function MakeFunctionPlugin() assumes that all arguments and return values of functions are real floating-point numbers. The plugin libraries it creates are always in the plugins.tmp/ subdirectory of current working directory and are named like libNNN_plugin_cc.so.

If MakeFunctionPlugin() is called again to create a plugin function with the same name (but a different body), the DLL will be unloaded and reloaded as necessary.


Embedding Yacas into a C or C++ application

The current version of Yacas (1.0.54) has preliminary facilities for embedding Yacas into your own programs, through linking with a normal library.

The Yacas header files are installed in /usr/local/share/yacas/include, and the libraries in /usr/local/share/yacas/lib. The main entry library is the cyacas library, which offers an interface that can be used from a normal C or C++ program. The API it offers is the following:

void yacas_init();
Initialize Yacas. This function has to be called before calling the other functions. This function establishes a main evaluation environment for Yacas expressions to be simplified in.

void yacas_eval(char* expression);
Evaluate an expression. The result (or possible error) can be obtained through the yacas_error and yacas_result functions, if so desired.

char* yacas_error();
Return a pointer to a string explaining the error if an error occurred, or NULL otherwise.

char* yacas_result();
Return a string representation of the result of evaluating an expression. This function is only meaningful if there was no error. In the case of an error, the return value of yacas_result should be considered undefined.

char* yacas_output();
Return pointer to output printed while evaluating an expression.

void yacas_exit();
Clean up all things related to the main Yacas evaluation environment.

void yacas_interrupt();
Interrupt a calculation.

The directory embed/ contains some examples demonstrating calling Yacas from your own programs. Use the makefile.examples makefile to compile the examples (it is a separate makefile because it is not yet guaranteed to work on all platforms, and might otherwise break builds on some platforms).

The simplest "hello world"-type example is example2, which reads:

#include <stdio.h>
#include "cyacas.h"

int main(int argc, char** argv)
{
  yacas_init();
  yacas_eval("D(x)Sin(x)");
  if (!yacas_error())
    printf("%s\n",yacas_result());  
  yacas_exit();
  return 0;
}

And yields on the command line:

$ ./example2
Cos(x);

The example example1 allows you to pass expressions on the command line:

$ ./example1 "D(x)Sin(x)" "Taylor(x,0,5)Sin(x)"
Input> D(x)Sin(x)
Output> Cos(x);
Input> Taylor(x,0,5)Sin(x)
Output> x-x^3/6+x^5/120;

The file example3 shows a piece of code accepting an expression in Lisp syntax, the result being printed to the console using PrettyForm.

The interface described above will probably not change very much in the future, but the way to compile and link might. In an ideal world, there would be one library libcyacas.a and one header file cyacas.h, which get installed in a place where all programs can find them. Perhaps a dynamic link library is even better.


Why -x^(-1) and -1/x are not the same in Yacas

Wouldn't it be wonderful if we had a program that could do all the mathematical problems we could ever need? Need to solve a set of equations? Just call Solve with the appropriate arguments. Want to simplify an expression? Just call Simplify, and you will always get the form you would like to see. In short, a program that could replace any mathematician: an expert system whose domain of expertise is mathematics. Wouldn't that be great?

The answer to the above is, at least according to the author, a resounding no. It is doubtful such a program will ever exist, and it is not even certain that such a program would be desirable.

Humans have a long history of making tools to make their lives easier. One important property of a tool is that it is conceptually clear to its user what the tool does. A tool should not be clever; the user of the tool can be clever about using it, or about combining it with other tools. A 'clever' tool often turns out not to be useful, because it is hard to understand what it does, or why it does what it does. In short: its behavior will be unpredictable.

This is a passionate plea against generic commands like Simplify and Solve.

Consider this bit of interaction in Yacas:

In> a:= -x^(-1)
Out> -x^(-1);
In> b:= -1/x
Out> (-1)/x;
In> a = b
Out> False;

Now, that cannot be right, can it? Clearly these are the same? No, they are not. They have a slightly different form, and are thus represented differently internally. The = sign compares the internal representations of two expressions, so a = b returns False because the internal representations of the expressions bound to a and b differ. Note that this behaviour is simple to explain. The = operator is a 'tool': it is simple, and it does one thing and does it well. It is easy to use, an important property of a tool.

To drive home this point further, suppose we did modify the = operator to detect that a and b are indeed equal. Great! Wonderful! a=b now returns True. But consider

In> c:=1+r+r^2
Out> r+r^2+1;
In> d:=(1-r^3)/(1-r)
Out> (1-r^3)/(1-r);
In> c=d
Out> False;

c and d are equal for all values of r with r != 1; at r=1 the quotient is undefined, but a limit can be taken:

In> Limit(r,1)d
Out> 3;
In> Limit(r,1)c
Out> 3;

Now, we would have to modify the = tool to also detect that these are the same. Actually, this would have to be done for all known identities! Otherwise we would have to explain for which expressions it can determine equality. That would be a complex story, hard to explain. It would be a complex tool to use. And, more practically, it would be a slow tool.

So, how do we go about verifying that a and b are the same? Or that c and d are the same?

The solution lies in devising new tools.


Canonical and normal representations

A canonical representation for a group of expressions is a representation for each object in that group such that if two elements of the group are the same, they also have the same (internal) representation. Thus, when expressions are brought to their canonical representations, the = tool can be used to verify that they are the same.
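For example, for simple polynomials the fully expanded form can play the role of a canonical representation. A sketch (this assumes that Expand orders the terms of both results identically, so that the internal representations coincide):

In> Expand((x+1)^2) = Expand(x^2+2*x+1)
Out> True;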

A representation is called a normal representation if zero has only one representation. Thus, writing nf for the map to normal form, nf(a-b)=0 should return True if a and b are mathematically the same.

Consider a normal form defined on rational functions:

In> MM(a)
Out> MultiNomial({x},{{-1,-1}});
In> MM(b)
Out> MultiNomial({x},{{0,-1}})/
MultiNomial({x},{{1,1}});

However:

In> MM(a-b)
Out> MultiNomial({x},{{0,0}})/
MultiNomial({x},{{1,1}});
In> NormalForm(%)
Out> 0;

So here we have found a combination of tools that together allow us to decide that the a and b defined at the beginning of this section are the same: bring a-b to its normal form, and verify with the = tool that it is zero:

In> NormalForm(MM(a-b)) = 0
Out> True;

Now consider the c and d defined above. c and d are both functions of r only, c=c(r) and d=d(r). Now, let us define a function f(r)=c(r)-d(r):

In> f(r):=Eval(c-d)
Out> True;
In> f(r)
Out> r+r^2+1-(1-r^3)/(1-r);

It is not quite clear yet that this is zero. But we can decide that this is zero (and thus c(r)=d(r)) by first noting that f(r) is zero for some r, and then that the first derivative of f(r) with respect to r is zero, independent of r:

In> f(0)
Out> 0;
In> D(r)f(r)
Out> 2*r+1-(-((1-r)*3*r^2+r^3-1))/(1-r)^2;
In> NormalForm(MM(%))
Out> 0;

So here we have avoided bringing c and d to canonical forms (by, for example, first discovering that c is a geometric series), and have instead gone straight to detecting that c-d is in fact zero, and thus c(r)=d(r).

Here again we have combined tools that are simple, that do one thing and do it well, and whose workings are easy for human beings to understand.


But how can we then build a powerful CAS?

A new problem arises when we write down algorithms that require comparison tools more powerful and sophisticated than the = tool for detecting that two expressions are the same. The solution is to write the algorithm, but leave the actual comparison tool it uses configurable. This makes algorithms more flexible: the comparison operator can be passed in as an argument, or the algorithm can perhaps detect to which group its arguments belong and use the appropriate tool to detect equality between two expressions, as the sketch below illustrates.
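As an illustration, here is a minimal sketch of such a configurable tool. The function ContainsBy and its argument names are hypothetical, but Apply is the standard way in Yacas to invoke a comparison function that was passed in as an argument:

ContainsBy(eq, x, list) :=
[
  Local(found);
  found := False;
  // apply the user-supplied equality test to each element
  ForEach(item, list)
    If(Apply(eq, {x, item}), found := True);
  found;
];

With the a and b defined earlier, the caller now chooses the notion of equality: ContainsBy("=", a, {b}) returns False, while ContainsBy({{p,q}, NormalForm(MM(p-q)) = 0}, a, {b}) returns True.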


Conclusion

A CAS (or any other system built to be used by humans, for that matter) should be built up from small, well-understood building blocks. Yacas contains hundreds of functions that can be combined into more powerful algorithms; these tools are described in the documentation that comes with Yacas. That is how Yacas approaches the problem: let the user be smart and choose the tools he needs, based on an understanding of what the tools do. Large, complicated, cumbersome calculations can be done that way, just by using well-understood tools and combining them appropriately.


For Yacas developers


A crash course in Yacas maintenance for developers

This document intends to give a concise description of the way Yacas is maintained. There are a few parts to maintenance to take into account. For one, backups of the distribution are kept at:

http://www.xs4all.nl/~apinkus/backups/


The autoconf/automake system

The short story is as follows. You probably do not need to bother with this unless you introduce a new file. If you do add a new file, it should probably be added to the Makefile.am file in the same directory. In many cases it will be clear from the Makefile.am file where your new file should be added. For instance, new Yacas script files go into the huge list in scripts/Makefile.am that is assigned to the SCRIPTFILES variable. Similarly, test scripts go in the list in tests/Makefile.am that is assigned to the TESTFILES variable. Note that you should probably also run the cvs add command, as explained in the section on CVS below. If you remove a file, you should go through the inverse procedure.
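For example, adding a hypothetical new script file mynewalgo.ys would amount to extending that list. An illustrative fragment (the real list in scripts/Makefile.am is much longer, and the other file names here are made up):

# scripts/Makefile.am (fragment; file names are illustrative)
SCRIPTFILES = \
    firstfile.ys \
    secondfile.ys \
    mynewalgo.ys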

Adding new files to Makefile.am ensures that they will be included in the tarball yacas-*.tar.gz which is uploaded to the backup repository. This has the nice side effect that you can keep local files out of the distribution (by not adding them to the Makefile.am file). Note, however, that files which are not listed in Makefile.am may not be built and/or installed automatically. To make sure that the tar.gz distribution is complete, you can run the command
make distcheck
This may take a little while, as it needs to rebuild and test the whole system from the tar.gz tarball.

If you want to do more complicated things (like adding files which are not Yacas script or test files, or files which should be compiled or installed only conditionally), or if you are simply curious, you can read more in the chapter entitled "The Yacas build system".


Maintaining Yacas through a cvs repository

CVS provides an efficient way for developers to work together, automatically merging the changes various developers make; at the same time it is a backup system (your changes are uploaded to another computer, from which you can easily retrieve them at a later time). After a little effort setting it up, it becomes very easy to use. This section describes the few commands needed for keeping your version and the version in the Yacas repository up to date.


How does cvs work?

CVS keeps a copy of the files in a repository, which lives in a directory on some system, possibly your own computer. A cvs server lets you talk to that repository, to synchronize your version of the source code with the version on the server.

CVS uses a diff-like scheme for merging differences: it looks at two text files, determines the lines that differ, and merges accordingly. It discovers the changes you made by comparing the version you last checked out with the version you have now (it maintains an automatic version number for each file).

If the version of a file on your system and the version in the cvs repository have a line that has been changed both by you and by someone else, the cvs repository obviously cannot know what to do with that, and it will signal a 'collision', which you will have to resolve by hand (don't worry, this rarely happens). More on that later.

In short, the commands described in this section are: cvs login and cvs co for checking out Yacas, cvs update for getting the latest changes, cvs add for adding new files, and cvs commit for committing your own changes.


Checking out an initial version of Yacas

There are two ways to check out a version of Yacas: as an anonymous user and as a maintainer. Anonymous users do not need to log in, but they also have no right to commit changes. Maintainers first need to get an account (at SourceForge), and their account needs to be enabled by the project maintainer so that they are allowed to make changes. A maintainer needs to log in with every command. To be able to log in, you need ssh1 installed (ssh2 will not work). You can find it at http://www.ssh.org/download.html.

To check out Yacas as an anonymous user, type:

cvs -d:pserver:anonymous@cvs.yacas.
  sourceforge.net:/cvsroot/yacas login
cvs -z3 -d:pserver:anonymous@cvs.yacas.
  sourceforge.net:/cvsroot/yacas co yacas

To check out as a maintainer, type:

export CVS_RSH=ssh1

This will tell CVS to use ssh1 for communication. Then, in order to download the yacas source tree, type

cvs -d:ext:loginname@cvs.yacas.sourceforge.
  net:/cvsroot/yacas co yacas
where loginname is your login name on the SourceForge system. This creates a directory yacas/ with the full, most recent distribution. You need to enter your password, but other than that, that's it!

The lines typed above are long and obscure, but this is also the last time you need to type them. From now on, if you want to do anything with cvs, just go into the yacas/ directory you checked out and type the cvs command without the -d:... flag. This flag only tells cvs where to find the repository; future cvs commands will know where to find it.


Use case scenario 1 : getting the latest version of Yacas

You haven't looked at Yacas for a while (shame on you!) and want to check out the latest version. Just type

cvs update -d

on the command line in the yacas directory, and that will essentially download the latest version into that directory (transferring just the changes). The -d option states that you are also interested in new directories that were added to the repository; oddly enough, by default cvs only gets you changed and added files, not added directories.

A command

cvs -q update -d

will print messages only about changed files.


Use case scenario 2 : you made changes to Yacas

You got the latest version, but you saw this huge, glaring omission in Yacas and started hacking away to add it yourself. After a while, having played with the code you wrote, you consider it finished and decide you would like to add it to the cvs repository.

First, you should test the new Yacas system:
make test
If there are any failed tests, you need to fix them.

Now you can start submitting your changes to the CVS. If you created new files, you need to tell CVS to add them to the source tree:

cvs add [list of file names of ascii text files]

This adds ASCII text files. If you added binary files (GIF images in the documentation directory, or something like that), you can add them to the CVS with

cvs add -kb [list of file names of binary files]

Note that, when adding files to the CVS, you should normally also add them to the Yacas tar.gz distribution. This is done by adding the file name to the EXTRA_DIST variable in the Makefile.am file in the directory to which you are adding the file.

In case files need to be removed, there are two options.

There seems to be no easy way to rename or move files; you would have to remove them at their old location and add them at a new location.

Now, when finished with that, you might want to 'commit' all changes with

cvs commit

If the commit succeeds, an email is sent out to the maintainers, who can then scan the diff files for changes, to see if they agree with the changes, and perhaps fix mistakes made (if any).

If there is a collision, the commit fails (cvs will tell you so). This might happen because someone else also edited the same place in a file and their changes cannot be automatically merged with yours. In case of a collision, you need to invoke cvs update twice. The cvs update command outputs a list of file names, each with a character in front of it. The important ones are the files marked with a 'C': they have a collision. You can go into such a file and see the collision, which the cvs system conveniently marks as:

<<<<<<<
old version
=======
new version
>>>>>>>

You can edit the file by merging the two versions by hand. This happens very rarely, but it can happen. Use cvs commit afterwards to commit.

The commit and update commands can be performed in specific directories, and on specific files if necessary, by naming them on the command line. Or you can go into a subdirectory and do a cvs commit or cvs update there, if you are confident that that is the only place that changed or whose changes you are interested in.
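For example, to commit a single documentation source file with a log message given on the command line:

cvs commit -m "fixed typos" manmake/YacasDocs.chapt.txt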

That is basically it: a quick crash course in cvs. It is actually very convenient, in that usually all that is needed to fix small bugs is a cvs commit. You type that in, your version gets merged with the changes others made, they get your changes, and you have backed up your changes at the same time (all with that one little command!).

You can find more information about cvs at http://cvsbook.red-bean.com/ .


Preparing and maintaining Yacas documentation


Introduction

Yacas documentation in HTML and PS/PDF formats is generated by Yacas scripts from Yacas source files. Prior to version 1.0.48, all documentation had to be written directly in the Yacas language, which was very cumbersome. The scripts txt2yacasdoc.pl, book2TeX.sh, book2ys.sh and ytxt2tex were created to help maintain the documentation in an easy-to-read form.

The "source" form of all documentation is maintained in a special plain text format. The format is such that it is clearly readable without any processing and is easy to edit. To compile the documents, the system processes the plain text docs with a script to prepare Yacas-language files and then runs other scripts to produce the final documentation in HTML and other formats.

The source text must be formatted in a certain fashion to delimit sections, code examples, and so on, but the format is easy enough to enter in a plain text editor. Text is marked up mostly by TAB characters, spaces, and asterisks "*" at the beginning of a line. The format is easy enough so that, for example, the text of the GNU GPL (the file COPYING) can be used as a documentation source file without changes. The script txt2yacasdoc.pl converts this markup into Yacas code for further processing.


Organization of the Yacas documentation

All documentation source files are kept in the subdirectory manmake. During compilation, Yacas language files as well as HTML, LaTeX and PS/PDF files are automatically generated in the manmake subdirectory. Contributors should only need to edit *.txt files in manmake.

Currently, the Yacas documentation consists of four "core" books (the introductory tutorial, the programming tutorial, the user's reference manual, and the programmer's reference manual), and three "extra" books (algorithms, Lisp programming, essays).

There is also a documentation book describing the Emacs interface to Yacas (the "Yacas Notebook" mode). This book is not available as online help but is installed separately in the TeXinfo or PostScript formats.

All Yacas documentation books are distributed under the GNU Free Documentation License (FDL). If you add a new documentation book to Yacas, please include the file FDL.chapt.

The documentation books are meant to be stand-alone texts, except the two "reference manual" books which are meant to be used together because they share a common hyperlinked table of contents. The Yacas Help() command will show a reference article from either of the two reference books.

Stand-alone books are free-form, but reference books must be written with a certain template that allows online hyperlinking. The file manmake/dummies is an example template for a reference manual section.

Books are divided into "chapters", "sections" and "subsections". The reference manuals contain descriptions of Yacas commands and functions, and each function is given in a separate (unnumbered) section which is marked by a special *CMD label (see below).

At the beginning of each book there must be a book title and a short book description (labeled *BLURB). The short description appears in the printed version as the subtitle on the title page. It also goes into the HTML top-level book index as the book description.

At the beginning of each chapter there may be a "chapter introduction" labeled *INTRO which is also a short description of the contents of that chapter. It may be one paragraph only. The "chapter intro" feature is only used in the HTML reference manual because it is the text that appears at the very top of a reference manual section. As you can see in the HTML version of reference docs, each chapter contains a list of all functions described in it. This list goes right after the first paragraph of "chapter intro". In the printed (PS/PDF) documentation, the *INTRO label is ignored and the "chapter intro" paragraph is shown exactly like any other text paragraph.

Each printed book contains a table of contents at the beginning and an index at the end. The index should help a reader to quickly find a particular concept. For example, all documented Yacas commands are automatically entered into the index in the reference manual. Additional index entries should be inserted by hand using the *A label (see below).


Translating the documentation

The Yacas system is under active development and its documentation is constantly updated. An effort to translate the Yacas documentation from English into other languages has been started.

Translators should try to prepare translated documentation in the same plain text format as the original documentation. Any problems with conversion to HTML and PS/PDF formats should be easy to solve (at least for European languages).

The most important documentation books to translate are the reference manual and the tutorials. There is substantial documentation aimed at developers, for instance the algorithms book or the essays book. It is probably not as important to translate such books, since Yacas developers have to speak English to communicate.


Formatting of source text files

Formatting of source text files uses TAB symbols; if your editor does not support them and converts them to spaces, you should convert the result back to real TAB symbols, using the standard Unix unexpand utility or a custom perl script.
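For example, a standard invocation might be (file name illustrative; the -a option converts all runs of spaces, not only the leading ones):

unexpand -a file.chapt.txt > file.tmp
mv file.tmp file.chapt.txt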

You may want to examine the source of this file (manmake/YacasDocs.chapt.txt) to see how various features of the markup are used. Currently the following markup is implemented:

*	Item
*	Another item
This will produce an itemized list:

* Item
* Another item

Note that the text of an item continues until the next itemized line or until the end of the paragraph (it does not have to be all on one line).

Nesting of enumerated or itemized environments is not supported, except for fringe cases of nesting just one itemized list at the very end of an enumerated list or vice versa.

The enumerated environment is currently only implemented in LaTeX docs; HTML docs render them as itemized.

Note: Currently, only the HTML documentation is hyperlinked, while the printed PS/PDF documentation contains only text.

There is a special feature for displayed equations: Any punctuation immediately following the second pair of dollar signs will be displayed on the same line. (This is used to get around the limitation of mathematical expressions that cannot end with a comma or a period or with another punctuation mark.) For example, the formula "$$x^2/2$$," will produce

x^2/2,

with the comma on the same line. A formula such as "$x+1;$" will generate an error; the semicolon should be moved out of the dollar signs.

As special exceptions, one can enter the symbols " TeX" and " LaTeX" as if they are Yacas expressions, i.e. "$TeX$" produces " TeX". One can also create a capitalized form of the name Yacas as "{Yacas}".

Please note: Mathematical expressions must be valid Yacas expressions, with no unbalanced parentheses, no undefined infix operators, no hanging periods and so on, or else the Yacas script that formats the docs will fail! (This limits the scope of mathematical formulae but is hopefully not critical.)

Also, please avoid putting equations into documentation as plain text. Expressions such as a>0 are not correctly typeset by TeX if included into the plain text without the dollar signs: "a>0". (The HTML documentation is currently not affected.)

Currently, when creating online HTML documentation, all mathematics is kept in Yacas notation and set in boldface font. (This may change in the future.) Of course, LaTeX typesets the documentation with correct mathematical symbols.

Another feature of the LaTeX exporter is that it will first try to represent all functions and infix operators according to their mathematical meaning, and if no such meaning is defined in Yacas, then it will show them exactly as they are written in Yacas. For infix operators to work, they have to be declared in the standard library, or else an error will occur when processing the manual.

For example, the Yacas operators = and == are represented in LaTeX by an equals sign (=), the operator := becomes "identically equal" ( a:=b), and the cosmetic operators <> and <=> become a<>b and a<=>b. But you cannot use an infix operator such as ":=*" because it is not defined in the standard library. A Yacas function which is not defined in the standard library, for example "func(x)", will appear just like that: func(x).

*INCLUDE ../essays/howto.chapt

Note that the included document is the file howto.chapt, not howto.chapt.txt, because it must be a Yacas-language file, not a .txt file. (The IncludeFile() call, an alias to Load(), will be used to execute the specified file.)

For example, the text
*FOOT This is an example footnote
generates the footnote

This is an example footnote.

For example,
*EVAL "Yacas version: " : Version()
will insert the string ` Yacas version: 1.0.57 ' into the manual. Note the spaces around the version string---these additional spaces are currently unavoidable.

If the absence of spaces is critically important, you can create the required text in a Yacas expression.
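For example (illustrative), the adjacent text and punctuation can be made part of the evaluated expression itself, so that no extra space intrudes between the version string and the final period:

*EVAL "This manual was generated by Yacas version " : Version() : "."

Here : is the same string concatenation operator used in the *EVAL example above.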


Formatting conventions for the reference manual

The formatting explained in the previous section is enough to create most of the user guide and tutorial documentation. The script txt2yacasdoc.pl also implements some additional markup features to help create the reference manual.

A typical reference manual subsection documenting a certain function may look like this in plain text:

*CMD PrintList --- print list with padding
*STD
*CALL
   {PrintList}(list)
   {PrintList}(list, padding);

*PARMS

{list} -- a list to be printed

{padding} -- (optional) a string

*DESC
Prints {list} and inserts the {padding} ...

*E.G.

	In> PrintList({a,b,{c, d}}, " .. ")
	Out> " a ..  b .. { c ..  d}";

*SEE Write, WriteString

Compare this with the reference manual section on the function PrintList to see how this plain text markup is rendered in the finished documentation.

Notes:

*CMD Sin, Cos, Tan --- Trigonometric ...

In addition to the above labels, there are the following tags:

Usage of the *A label currently does not directly affect the appearance of the docs. In the HTML docs, it inserts invisible anchor tags <a></a>. In the printed LaTeX docs, the *A label adds an index entry. The *CMD tag already generates all necessary HTML anchors and index entries for commands in the reference manual, so only non-command index entries need to be entered manually using *A.

The *INTRO and *BLURB tags only work for one paragraph. There must be no empty line between *INTRO/*BLURB and that paragraph. Also, there must be no empty line between the "blurb" and the book title (for technical reasons). There must be one and only one "blurb" paragraph in a "book" and no more than one "chapter intro" paragraph per chapter.

This markup should be sufficient for creating reference documentation in plain text.


Indexing the documentation books

It is not difficult to automatically generate an alphabetically sorted index for the books. An "index entry" is a piece of text that does not appear in the book where it is entered, but instead is printed in the alphabetical list at the end of the text with the relevant page number.

Currently the following facilities are provided for indexing:

After LaTeX generates a "raw" index file *.idx, the makeindex utility is used to post-process and sort the index into the .ind file. If you do not have makeindex on your system, the book indices will not be generated.

Note that makeindex is not always friendly to special (non-alphanumeric) characters. For example, it uses the symbol ! to separate index topics, which may conflict with Yacas commands. In other words, the document index must be tested and sometimes debugged.

In the HTML docs, the index is currently not generated on a separate page, although HTML anchors are inserted in the text. The reference manual uses the HTML anchors to provide online help through the ? command.

An index entry may be a "topic" with "subtopics", which usually appears in book indices like this:
gnus, 51
   tame, 51
   wild, 52-341
This effect can be achieved with the ! topic separator:
*A gnus
*A gnus!tame
*A gnus!wild
This is a special feature of makeindex.

Currently, it is possible to include a command or an equation into an index entry, for example,
*A {InterestingCommand}
*A calculation of $Sqrt(x)$
But this may sometimes conflict with the topic separator.


Summary of mark-up labels

Mark-up labels must appear as first characters on a line. The following mark-up labels are currently defined:

Labels with an argument on the same line (affect only the current line):

Labels that affect the rest of the line and the subsequent paragraph:

Special labels for the reference manual that accept several arguments on the same line:

Special labels without arguments that generate headings for the reference manual:

Other special markup:


Summary of special markup syntax

Special syntax entities include:


Debugging the manual

Sometimes the manual compilation (make or make texdocs) will break after you edit the plaintext manual sources. This can happen for one of these reasons:

In case of a math syntax error, the documentation exporter cannot print the paragraph where the error occurred, but it usually prints the preceding paragraph. Currently, the easiest way to locate the error is to generate the .tex output and look at it, e.g.:
make ref.book.tex; less ref.book.tex
The last line in the .tex file must be \end{document}. If it is not, then the last portion of the text you see in the .tex file is the text directly before the paragraph where the error occurred. Most probably, there is a malformatted math formula in the next paragraph of your plaintext source.

If the last line is \end{document} but latex does not finish, you will have to run latex by hand, e.g.
latex ref.book.tex
and look at the error message(s) it prints.


Using the script txt2yacasdoc.pl

The script txt2yacasdoc.pl is used to transform plain text markup into the Yacas language. The script acts as a stream filter:

perl txt2yacasdoc.pl < file.txt > file.chapt

In this example, file.txt contains some formatted plain text (source text) and the resulting file file.chapt will be produced in Yacas-language documentation format.

There is a single option for txt2yacasdoc:

perl txt2yacasdoc.pl -debug < file.txt \
  > file.chapt
This option is to be used for debugging, i.e. when the resulting file does not compile in Yacas. The effect of this option is to introduce more breaks between text strings in the generated file, so that the Text() function is called more often. It is then easier to locate the source of the problem in the Yacas-language file (Yacas will tell you the last line in the Yacas-language file at which a syntax error occurred). This option is largely obsolete because the Text() function is called frequently enough by default. See below for hints about finding syntax errors in documentation when the manual does not compile.


book2TeX: preparing typeset documentation

The script book2TeX.sh prepares a TeX file out of a Yacas-language documentation book. Usage is similar to book2txt.sh (described below), except that only one file is processed at a time and the file must be a "book", not just a "chapter". For example:
book2TeX.sh intro.book intro.book.tex
will create a LaTeX-formatted version of the introductory tutorial. The LaTeX file can be processed with standard tools, for example
latex intro.book.tex
dvips -o intro.book.ps intro.book.dvi
will prepare a Postscript version, while
pdflatex intro.book.tex
will prepare a PDF version.

To generate printed docs, it is necessary to run latex (at least) three times in a row. This is because at first latex does not know how much space will be taken up by the table of contents and the index, so the page numbers are all off by a few pages. Only on the second run does latex generate correct page numbers for the TOC (the .aux file) and for the index (the .idx file). After this, the index file has to be processed by the makeindex utility to sort it, and a third latex run is needed to actually insert the correct TOC and the processed index into the final document.

The shell commands in book2TeX.sh execute the following Yacas commands:
Use("book2TeX.ys");
ToFile("file.chapt.tex")
  Load("file.chapt");
This requires that the Yacas script book2TeX.ys be available in the current directory. The shell script book2TeX.sh assumes that book2TeX.ys is stored in the same directory as book2TeX.sh and that the Yacas executable is available in the directory ../src/. Alternatively, the command line of the Yacas executable can be specified by the -run option. For example, the Makefile runs book2TeX.sh like this:
book2TeX.sh -run "yacas-dir/src/yacas --rootdir
  yacas-dir/scripts" file.book file.book.tex
Note that the entire Yacas command line is given in quotes.

Some concerns with the printed documentation

Not all features of the Yacas documentation-generating scripts are compatible with TeX typesetting. To prevent errors, documentation source should avoid certain constructs. In general, it is a good idea to check the typeset appearance of documentation, since it helps detect errors.

For example, the symbols %, { }, < >, #, \, _ and & are special to TeX. They should not normally be used in plain text; it is okay to use them in "typewriter" text (within braces {}) or in code samples, but not in section or chapter heads, because that makes it difficult to export to TeX correctly. TeX commands may be entered, but they will not be rendered correctly in the HTML online documentation.

Sometimes fixed-font text will hang over the right edge of the printed page. A workaround is to break the fixed-font text into shorter fragments or to rephrase the text.

Another concern is that code examples (TAB-indented blocks) are typeset in a fixed-width font and may not fit into the width of the page. To avoid this, the lines in the code examples should not be longer than about 50 characters.

The current implementation uses a "report" style which allows chapters, sections, subsections and includes an automatically generated table of contents and an index. The standard 10 point font and two-column format are used to save space (and trees).

The script txt2yacasdoc.pl attempts to convert straight double quotes (") into properly directed English quotation marks (`` and '' in TeX). However, this automatic conversion sometimes fails and produces wrongly-directed quotes. One such case is when the quotes are on the same line as a TAB character (e.g. in an itemized environment). This problem can be circumvented by putting the quoted words on a different line.

Some complicated mathematical expressions may not render correctly in TeX. This is because Yacas uses its library function TeXForm() to transform Yacas expressions into TeX. Mathematical expressions are entered in the plain text documentation source using Yacas syntax, then transformed into a special non-evaluating call TeXMath() in the Yacas-language documentation, which is formatted into HTML using a Write() call or into TeX using a TeXForm() call, as necessary. Documentation should be tested before it is released. The most stringent limitation is that the expression between dollar signs should evaluate in Yacas (preferably to itself) and not cause syntax errors. In case of doubt, check that the expression evaluates without errors, and then check that TeXForm applied to that expression also evaluates without errors. For example, an expression such as x=+1 will cause a syntax error and break the compilation of the manual (both HTML and TeX).
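Such a check might look like this (a sketch; the exact string returned by TeXForm may vary between Yacas versions):

In> x^2/2
Out> x^2/2;
In> TeXForm(x^2/2)
Out> "$\frac{x ^{2}}{2} $";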


book2ys: extracting Yacas code from books ("literate programming")

book2ys.sh is a shell script that extracts Yacas code examples from a documentation chapter into a separate file. All other text is omitted.

Usage is similar to book2TeX.sh. For example, the benchmarking test code wester.yts can be automatically extracted from the corresponding essay chapter by the command
sh ../manmake/book2ys.sh wester-1994.chapt \
  wester.yts
After this command, the file wester.yts is created. Note that wester-1994.chapt is in Yacas language and is itself a generated file.

To prepare a documentation chapter in such a way that code extraction is possible, one needs to make sure that all code examples in the chapter, taken together, form a correct sequence of Yacas expressions when cut out and written sequentially into a file. So, for instance, semicolons at the end of each statement are required. The script book2ys will not export example Yacas session code with "In>" and "Out>" prompts, but it will export all other example code.

The example code with "In>" and "Out>" prompts will become comments in the exported Yacas file. It is possible to suppress this comment generation by the -strip option to the book2ys.sh script.

If the output file name is not given, the name will be the same as the Yacas book source name but with the .ys extension.

See the source file wester-1994.chapt.txt to get a feeling for how the source documentation is formatted to allow completely automatic code extraction. Note that the printed documentation may be in two-column format, and it is therefore necessary to split lines that are too long.

Using the script book2ys.sh, one can write documentation and code together, a la "literate programming". The main idea of literate programming is that the program code and the documentation should be written as one large book explaining to humans how the program is organized and how it works. From this book, a printable copy is generated and all code is automatically extracted for compilation.

Literate programming may require that code be split between many source files, and yet it may be convenient to keep all descriptions in one book. The special document formatting label *YSFILE can be used to redirect output to different Yacas source files.

By default, the output goes to the file specified on the book2ys.sh command line. This default can be restored by the directive "*YSFILE -" at any time. Otherwise, all code output will be printed to the file specified after the *YSFILE label.

Note that the multiple file support is somewhat restrictive:

Here is an example of using multiple files. Note how documentation text is interspersed with TAB-indented code.

Text that does not appear in the code.
    // code into default file
    // more code
*YSFILE script1.ys
This will not appear in the file.
    x:=1;
    // some code for the file script1.ys
*YSFILE script2.ys
    // some code for the file script2.ys
End of the example, need to reset *YSFILE.
*YSFILE -
After processing this file with book2ys.sh, one should get three files with Yacas code.


ys2book: extracting documentation from Yacas code ("literate programming")

The standard view of literate programming is that one prepares a book readable by humans, and all code is automatically extracted from the book. The focus in this approach is on writing explanations on how the code works.

The converse approach is to write primarily code and embed some documentation as comments in the code files. This approach is implemented by the script ys2book.pl.

This script takes a Yacas script file and extracts Yacas comments from it into another file. Usage may look like this:

perl ys2book.pl < file.ys > file.chapt.txt
perl ys2book.pl -strip < file.ys > file.chapt.txt
Here file.ys is the source file and file.chapt.txt is the output file.

Not all comments may be desirable as documentation. Formatting of comments is implemented as follows:

All exported text is printed to the output file as is, without any additional reformatting. The only change to the text is stripping of initial spaces.

Any leading spaces after the beginning of the comment sign are removed. For example,
/*      text */
will be exported as just "text" without the leading spaces. In a multiline comment, such as
/*    start
      of comment */
the leading spaces in the first line will be stripped. However, the leading spaces (and TABs) in other lines of the multiline comment block will be preserved.

Empty lines can be introduced into the documentation either as part of a multiline comment block, or as a standalone empty comments such as
///
////////

With these features it is easy to prepare the embedded documentation in the Yacas plaintext documentation format. This format requires space- and TAB-based formatting, which is mostly preserved by the script ys2book.pl.


ytxt2tex: Conversion of plain text documentation to LaTeX

An auxiliary script, ytxt2tex, converts plain text documentation to LaTeX. The script can be used outside of the Yacas source tree to convert individual documents to LaTeX. This is useful if you would like to produce TeX documents and find the plain text format of the Yacas documentation more comfortable to write in. In effect, ytxt2tex is a special-purpose TeX preprocessor designed for producing Yacas documentation.

This is a standalone script; it is installed by default into /usr/local/bin and requires the Yacas executable to be on the path, as well as the script files book2TeX.* and txt2yacasdoc.pl in the manmake subdirectory of the Yacas installation tree /usr/local/share/yacas/. The script also requires perl and the Unix shell sh.

Limitations of this script are:

These limitations may be easily overcome by editing the resulting TeX file (but you need to know at least some TeX to do that).

The general usage pattern is
ytxt2tex [-o outputfile] file1.txt [file2.txt] ...
All source files must have extension ".txt". The command-line option -o specifies the name of the output TeX file. If the -o option is not given, the output file will be file1.tex (i.e. the name of the first .txt file with the .tex extension). If several .txt files are given, the first one must *INCLUDE all others.

To illustrate the usage of the script ytxt2tex, consider two examples.

The first example is just one plaintext file example1.txt. This file will have to be a "book" in itself, i.e. it will have to include a book title indented by four TAB symbols. For example:

*REM file: example1.txt
                Example Document Title
*REM this is a section title:
		Numbers and letters
*REM here are some index entries:
*A numbers
*REM simple index entries like this are OK
*A letters

Numbers and letters are very important, etc.

This file example1.txt can be converted to a LaTeX file example1.tex by the following simple command:
ytxt2tex example1.txt
If the resulting file should be named something other than example1.tex, say output1.tex, then the command is
ytxt2tex -o output1.tex example1.txt

The second example is a longer "book" consisting of several plaintext files. One of these files is a "master file" and it should include all other files using the *INCLUDE label. The *INCLUDE label should contain file names without the .txt extension.

Suppose we have prepared the files book1.txt, chapter1.txt, and chapter2.txt containing the preamble text and two chapters. For example:

*REM this is the file "book1.txt"
*BLURB or, The Multitudinous Attempts
to Avoid Counterproductivity
                 Relationships of Entities
*INCLUDE chapter1
*INCLUDE chapter2

The chapter files might be:

*REM this is the file "chapter1.txt"
        Entities and Epiphenomena
The history of the ambiguous question of
epiphenomenological discourse can be
traced to the pre-postmodern period...

*REM this is the file "chapter2.txt"
        Substrates and Superficiality
In the preceding chapter, we have thoroughly
investigated the metaphilosophical aspects of
the trans-homocentric considerations...

The command to create the final LaTeX file book1.tex is
ytxt2tex book1.txt chapter1.txt chapter2.txt
The "master file" book1.txt that includes all other text files must be given first. The -o option can be used if the final LaTeX file should be named something else than book1.tex. For example,
ytxt2tex -o MyBook.tex book1.txt chapter*.txt

By default, both the table of contents and the index are generated. The commands to create a PostScript file out of the LaTeX file might be:
latex MyBook.tex
latex MyBook.tex
makeindex MyBook.idx -o MyBook.ind
latex MyBook.tex

Note that the resulting LaTeX file needs to be processed three times if the table of contents or index are to be used. Without a table of contents and index, it is enough to process the file with LaTeX twice.


book2txt: Conversion of existing documentation to plain text

(Note: as of version 1.0.49, all Yacas documentation is converted to plaintext format. This section is left for reference only.)

Currently most, but not all, of the Yacas documentation markup functionality is implemented in the simple plaintext filter; also, the documentation includes some extra HTML files. However, almost all of the markup needed to write reasonable documentation is present. It is therefore possible to maintain most of the documentation in the plain text format described above. To convert existing Yacas documentation back to the plain text format, the script book2txt.ys/book2txt.sh can be used.

By using a command such as
book2txt.sh file.chapt
one can create a source text file file.chapt.txt corresponding to the Yacas documentation file file.chapt. For example:

12:51pm scriabin> book2txt.sh intro.book
[editvi.ys] [gnuplot.ys] 
True;
Out> True;
Quitting...
File 'intro.book.txt' was created.
12:51pm scriabin>

In the above example, the shell commands in book2txt.sh executed the following Yacas commands:
Use("book2txt.ys");
ToFile("file.chapt.txt")
  Load("file.chapt");
This requires that the Yacas script book2txt.ys be available in the current directory. The shell script book2txt.sh assumes that book2txt.ys is stored in the same directory as book2txt.sh.

Of course, it is possible that some features of Yacas documentation were not implemented in the script and in that case the resulting file must be edited by hand. But the purpose of the book2txt script is exactly this: to make a plain text source file to be edited and maintained.

Several files can be converted at once, for example:
book2txt.sh f1.chapt f2.chapt file3.book
Each file is processed by an independent Yacas session. Any errors of processing are printed on the screen.


The Yacas build system


Introduction

This chapter describes the build system of Yacas: what happens when you run the configure or make commands, and how to change this behavior. It concentrates on Unix systems; other architectures are briefly covered in the final section.

As the Yacas build system is built on the GNU autotools suite, which contains both automake and autoconf, we will start with a short description of this package. Then we will turn to the various components of Yacas: the program itself, the script files, the documentation, the test suite, and so on.

As explained in the INSTALL file, building Yacas requires the usual steps: running configure and then make (followed, optionally, by make install).

Both configure and make accept many options. Some of them are explained below.


The GNU autotools suite

The GNU autotools suite is a collection of applications to streamline the build system of other programs, like Yacas. Its two main goals are to present a consistent build procedure to users, and to assist developers in tackling portability problems.

The autotools suite consists of a number of utilities. These are developed separately, but are designed to be used together. They are

The users do not need to run automake and autoconf (here, "users" refers to the people who do not want to make any changes to Yacas and includes those who just want to compile Yacas from a tar.gz source archive). They do need the libtool script. But the libtool package is included in the Yacas distribution, so the users do not need to install libtool themselves.

Developers do need to install autoconf and automake on their systems. But they usually do not need to run these tools directly, as the Makefiles contain the necessary commands. When the Makefiles are not present, which occurs for instance when installing afresh from the CVS repository, the makemake script in the root of the Yacas source tree can (and probably should) be used to invoke automake and autoconf in the right order and with the correct flags.

In the following three sections, these utilities are briefly explained. In all cases, the reader is referred to the documentation included in the autotools suite for more information. Another useful source of information is GNU Autoconf, Automake, and Libtool by Gary V. Vaughan, Ben Elliston, Tom Tromey and Ian Lance Taylor, published by New Riders. An online version is available from http://sources.redhat.com/autobook .


The automake tool

Automake is a tool to generate standard-compliant Makefiles. More precisely, automake uses the information in Makefile.am to produce a Makefile.in file, which will be turned into a Makefile by the configure script generated by the autoconf utility.

The Makefile.am file contains the definition of certain macros that are used by automake. The rest of the Makefile.am file is copied verbatim to the generated Makefile.in file.

The most important macros which are used by automake are the so-called primaries. These list the files that make up the Yacas package. For instance, in the src directory, the file Makefile.am contains the following line
 bin_PROGRAMS = yacas
This is an example of the PROGRAMS primary, and says that the directory contains a program called yacas. Hence it will be built if the make all command is executed, it will be installed at make install, etc. Other useful primaries are SCRIPTS for executable scripts, HEADERS for header files, LIBRARIES for static libraries, LTLIBRARIES for libtool libraries, and DATA for all files which are just copied verbatim at installation time (this includes Yacas scripts).

The bin prefix in the example above says that yacas should be installed in the binary directory, as determined by the configure script. By default, this is the directory /usr/local/bin. There are also prefixes for the other directories, as well as some prefixes with different meanings: the noinst prefix says that the specified file need not be installed, and the check prefix says that the file is only needed when testing.
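For instance, combining prefixes with primaries might look like this (an illustrative fragment with hypothetical names):

# built, but never installed:
noinst_LIBRARIES = libhelper.a
# built only when running the tests:
check_PROGRAMS = unittests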

There are also so-called associated variables. The same Makefile.am contains the following variables associated to the Yacas executable:
yacas_SOURCES = yacasmain.cpp commandline.cpp \
      unixcommandline.cpp stdcommandline.cpp
yacas_LDADD = libyacas.a libyacasplatform.a \
      @NUMBERS_LIB@ @NUMBERS_LDFLAGS@ 
These lines say that the executable is built from four source files (yacasmain.cpp, commandline.cpp, unixcommandline.cpp and stdcommandline.cpp) and linked against two static libraries (libyacas.a and libyacasplatform.a). The @NUMBERS_LIB@ and @NUMBERS_LDFLAGS@ symbols are defined when the configure script is run, as explained in the next section; they contain the names of additional libraries to link in.

From the information contained in these lines, automake can construct the necessary commands to go in the final Makefile. This Makefile does not only support building, testing, and installing the package, but also rolling the tar-ball for release (use make dist for this, as explained in the section "Targets for make" below). In the above example, automake can figure out that yacasmain.cpp should be included in the distribution.

Unfortunately, not everything is supported that well. For instance, Yacas comes with its own documentation system, which is of course not supported by automake, so we need to tell automake how to handle these files. To specify which files should be included in the distribution, the EXTRA_DIST variable can be used; the developer should list there all files to be included in the distribution that automake does not know about. If we want to run some commands at build or installation time, we can specify them by Makefile rules in the traditional way: just write the rules in Makefile.am and they will be copied verbatim into the generated Makefile. Please keep in mind that the rules should work on a wide variety of platforms in order to retain portability. In particular,

We currently assume automake version 1.4 or later. Note that version 1.5 breaks backward compatibility and should be avoided. Version 1.6 contains some useful additions, like the nobase prefix and the possibility to define new prefixes, so at a certain point we may require version 1.6.

For more information about automake, the reader is referred to the documentation that comes with the package.


The autoconf tool

Autoconf is a tool to generate portable shell scripts that users can run to configure the package (in our case Yacas) for their system. It reads the file configure.in and produces the configure script. The latter script can be run by the user to prepare for building Yacas.

The configure.in file consists of standard shell code, interspersed with special macros defined by the autoconf package. These can be recognized by the AC_ prefix.

As the configure.in file only rarely needs to be changed, we will only describe the autoconf tool by one example. As explained in the previous section, "The automake tool" , the symbols @NUMBERS_LIB@ and @NUMBERS_LDFLAGS@ are used in the Makefile.in to link a library with basic numerical routines into the Yacas executable. This gives the user the option to choose between two libraries: the GNU multi-precision arithmetic (GNU MP) library and a library provided by the Yacas team.

This effect is achieved by the following fragment of configure.in (for clarity, a simplified version is presented):

AC_ARG_WITH(numlib, [  --with-numlib=LIB ... ],  \
            with_numlib=$withval,                \
            with_numlib="native")
case $with_numlib in
  native)
    NUMBERS_LIB="libyacasnumbers.la"
    NUMBERS_LDFLAGS="-lm"
    ;;
  gmp)
    AC_CHECK_LIB(gmp, __gmpz_init, \
                 have_gmp=yes, have_gmp=no)
    if test "$have_gmp" = "no" ; then
      AC_MSG_ERROR([GNU MP library not found])
    fi
    NUMBERS_LIB="libgmpnumbers.la"
    NUMBERS_LDFLAGS="-lgmp -lm"
    ;;
esac
AC_SUBST(NUMBERS_LIB)
AC_SUBST(NUMBERS_LDFLAGS)

The first line tells the configure script to accept an extra option, --with-numlib=LIB. The shell variable with_numlib is set to the value LIB given by the user, or to native if the user does not specify this option on the command line.

If the shell variable with_numlib has the value native (which means that the user has either given the --with-numlib=native option to configure, or not used the option at all), then the NUMBERS_LIB and NUMBERS_LDFLAGS shell variables are set to libyacasnumbers.la and -lm respectively.

If on the other hand the --with-numlib=gmp option is passed to the configure script, then first it is checked that the GNU MP library is available -- if this is not the case, the configuration terminates with an error. Then the NUMBERS_LIB and NUMBERS_LDFLAGS shell variables are set to suitable values.

The last two lines say that the values of the NUMBERS_LIB and NUMBERS_LDFLAGS shell variables should be substituted for the @NUMBERS_LIB@ and @NUMBERS_LDFLAGS@ symbols, respectively, in the Makefile.in.

This ends the brief description of autoconf. For more information, the reader is referred to the documentation that comes with the package.

We currently assume autoconf version 2.13 or later.


The libtool utility

The libtool utility takes care of all the peculiarities of creating, linking and loading shared and static libraries across a great number of platforms, providing a uniform command line interface to the Yacas developer. The inner workings of libtool are fairly complicated, but fortunately Yacas developers very rarely need to concern themselves with them, as libtool is tightly integrated with automake and autoconf. In fact, it should suffice to replace any LIBRARIES primary in Makefile.am with an LTLIBRARIES primary to indicate that libtool libraries should be used, and to keep in mind that the correct suffixes are used: .lo for object files and .la for any libraries.

For the people who want to know everything, we will now explain the idea of libtool in a nutshell with a simple example: suppose that we want to build and install a library from the source file hello.c. The following commands do the trick, on any system:

libtool gcc -c hello.c
libtool gcc -rpath /usr/local/lib -o \
  libhello.la hello.lo
libtool cp libhello.la /usr/local/lib

The first command builds a libtool object with the name hello.lo. This file encapsulates the hello.o object file. Subsequently, the libtool library with the name libhello.la is built, which encapsulates both the static and the shared library. The last command installs both the static and the shared library in the /usr/local/lib directory. Note that the GNU compiler gcc need not be installed on the system; libtool will call the native compiler with the correct switches if necessary.

The libtool distribution includes libltdl, the LibTool Dynamic Loading library. This library is used to support Yacas plugins.


The configure script

The configure script is run by the user to prepare the Yacas package for the build and installation process. It examines the user's system and the options passed to the script by the user, and generates suitable Makefiles. Furthermore, it generates yacas.spec which is used to build a package in Red Hat's .rpm format, and the C header file config.h.

A nice feature is the possibility to build in a different directory than the source directory by simply running configure from that directory. This not only prevents the source directory from being cluttered up by object files and so on, but it also enables the user to build for different architectures in different directories or to have the source in a read-only directory. For example, suppose that the Yacas source is installed under /mnt/cdrom/src/yacas and that you want to build Yacas under /tmp/build-yacas. This is achieved by the following commands
mkdir /tmp/build-yacas
cd /tmp/build-yacas
/mnt/cdrom/src/yacas/configure
make

A list of options accepted by the configure script can be retrieved by invoking it with the --help option:
./configure --help
The most important is the prefix option, which determines where everything will be installed. The default is /usr/local, meaning that for instance the yacas executable is installed as /usr/local/bin/yacas. To change this to /usr/bin/yacas, invoke the script as follows
./configure --prefix=/usr
Other options to configure enable the user to fine-tune the location where the various files should be installed.

We will not describe the common configure options which are shared by all packages, but will restrict ourselves to the options exclusively used by Yacas.

The following three options pertain to the extensive documentation that comes with Yacas. By default, only HTML documentation is generated.

Then there are three options describing where to install various parts of the Yacas package.

Furthermore, the opposites of the above options (e.g. disable-server) are also recognized.


Targets for make

One of the advantages of using the autotools suite to generate Makefiles is that the resulting Makefiles support a variety of targets. Here is a partial list of them.


The Yacas executable

The main executable is called yacas and resides in the src directory.

Conceptually, the build process consists of the following parts:

At installation time, not only the yacas executable is installed, but also the static libraries and the header files. This enables developers to embed Yacas in their own programs.


The Yacas script files

The Yacas script files with extension ys can be found in the scripts/ directory and its "repository" subdirectories.

All script files and .def files are listed in Makefile.am. This list is used to generate the packages.ys file, which contains a list of all .def files. The corefunctions.ys file is also generated automatically, by running the gencorefunctions program, which resides in the src/ directory.

At installation time, all the script files and .def files that are listed in Makefile.am are copied. During this installation, the directory structure should be preserved.

The check target checks that all the script files and .def files listed in the Makefile.am are indeed present, and vice versa, that all files present in the scripts/ hierarchy are indeed listed.


The documentation

The documentation system for Yacas is explained in Essays on Yacas, Chapter 5, Section 2.

The documentation is generated from plain text source files with the txt extension in the manmake directory. The source files are converted to Yacas code by the txt2yacasdoc.pl script. The result is a set of Yacas source files containing the text of the documentation. These Yacas source files are not well suited for human reading and must be further processed to obtain the books in HTML and PS/PDF formats.

To generate the books in HTML format, the Yacas source files are processed by the Yacas interpreter using the scripts manualmaker and indexmaker. The script indexmaker creates the file books.html with the top-level index to all HTML documentation books. The script manualmaker creates the HTML files with the book text (both the framed and the non-framed versions are written out). It is also possible to generate more than one HTML book at once. The two reference manuals are generated together to produce the full hyperlinked index to all commands.

To generate the books in PostScript and PDF format, the book2TeX.sh script is used. This script converts the Yacas source files to TeX files. They can then be processed by the standard TeX tools.

By default, only the HTML-formatted documents are generated. To change this, use the options disable-html-doc, enable-ps-doc and enable-pdf-doc in the configure script. The with-html-dir and with-ps-dir options can be used to tell where the documentation should be installed.

At the moment, the Makefile.am for the documentation is rather messy. For this reason, we do not describe it in detail. Instead, we just point out some special features:


The test suite

The Yacas distribution contains a (hopefully comprehensive) test suite, which can be used to check whether Yacas is (still) functioning correctly. The command make check tests the latest compiled version of Yacas. To test the executable that is installed on your system, use the command make installcheck.

The yts files in the tests directory contain the tests for different subsystems of Yacas. For instance, the file arithmetic.yts contains tests for the basic arithmetic functions; the first test is to check that 3+2 evaluates to 5. All the test files are listed in the TESTFILES variable. This variable can be overridden. For example, the command
make TESTFILES='comments.yts complex.yts' check
only runs the test scripts comments.yts and complex.yts.
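
For instance, the first test in arithmetic.yts could look like the following sketch (assuming the Verify function that the Yacas test scripts use to compare an expression against its expected result):

Verify(3+2, 5);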

The shell script test-yacas does the actual testing. It runs the scripts listed in the TESTFILES variable through the Yacas executable. The tests that take a long time are at the end of the list, so that they are performed last. The output of the tests is collected in the file testresult.txt. A test is considered failed if the exit code is nonzero (which happens, for instance, if Yacas crashes) or if either of the strings ****** (six stars) or Error appears in the output.

A special case is the last test, which goes through the problems put forward by Michael Wester (see the section M. Wester's CAS benchmark and Yacas). The commands for this test are not in a yts file in the tests directory, but are extracted from the documentation (see the immediately preceding section).


The library archive

The library archive is a file, commonly called scripts.dat, which contains all the scripts in compressed form. If Yacas is started with the --archive flag, it uses the contents of this file. This is useful for binary releases, as one needs only two files: the Yacas executable and the library archive.

Support for the library archive resides in the src/ directory. This directory contains the code for the compressor program, which generates the library archive.

The archive is only built if the enable-archive option is passed to the configure script. In that case, the compressor program is built and run. At installation time, the generated archive is installed in the library directory (/usr/local/lib by default). There is also a test to check that the generated archive in fact works.


Plugins

The directory plugins contains the files which are necessary to build plugins, which can be dynamically loaded into Yacas. See Essays on Yacas, Chapter 3, Section 3 for details.

Plugins are almost the same as shared libraries, except that they are not linked when the Yacas executable is started, but at the user's request while Yacas is running. The steps for building a plugin are as follows. First we run Yacas on the .stub file. Then the generated C++ file is compiled, together with the files which contain the actual code for the functionality to be provided by the plugin. Finally, the resulting object files are linked. We need to add the linker flags -module -avoid-version -no-undefined in order to get a plugin. If the plugin uses any functions from other libraries, these libraries should be specified with the -l option (and their directories, if nonstandard, with -L). The plugin is installed in pkglib (which defaults to /usr/local/lib/yacas) when the user gives the make install command.
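
Schematically, the build could look like this (all file names in this sketch are hypothetical):

yacas myplugin.stub
libtool g++ -c myplugin_glue.cpp
libtool g++ -c myplugin_code.cpp
libtool g++ -module -avoid-version -no-undefined \
  -rpath /usr/local/lib/yacas -o libmyplugin.la \
  myplugin_glue.lo myplugin_code.lo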

The following plugins are included with Yacas:


Proteus

The directory proteus contains an experimental implementation of a graphical interface to Yacas.

The executable is called proteusworksheet. It is only built if the build-proteus option was given to the configure script, and if the fltk development libraries are present. If this is the case, the executable is built by linking three libraries from the src directory with the object files in the proteus directory.

The install target installs the executable and some auxiliary data files that Proteus needs to function.


Other components

This section lists the components of the Yacas distribution that have not yet been described.

The build system does not perform any actions for the following components of Yacas.


Non-Unix architectures

Until now, we have only talked about support for building Yacas on Unix-like systems. Here are pointers to descriptions of the support for other architectures. Support for some architectures may lag behind.


Designing modules in the Yacas scripting language


Introduction

For any software project whose source code grows to a substantial number of modules, there needs to be a way to define interfaces between the modules, and a way to make sure that the modules do not interact with their environment in unintended ways.

One hallmark of a mature programming language is that it supports modules, together with a way to define a module's interface while hiding its internals. This section describes the mechanisms for doing so in the Yacas scripting language.


Demonstration of the problem

Unintentional interactions between two modules typically happen when the two modules accidentally share a common "global" resource; there should be a mechanism that guarantees this cannot happen.

The following piece of code is a little example that demonstrates the problem:

SetExpand(fn_IsString) <-- [expand:=fn;];
ram(x_IsList)_(expand != "") <-- ramlocal(x);
expand:="";
ramlocal(x) := Map(expand,{x});

This little bit of code defines a function ram that maps a function over a list: the function to be mapped must have been set (as a string) with SetExpand, and ram applies it through ramlocal if the argument passed is a list. It contains the following flaws:

The above code can be entered into a file and loaded from the command line. Now consider the following command-line interaction after loading that file:

In> ramlocal(a)         
In function "Length" : 
bad argument number 1 (counting from 1)
Argument matrix[1] evaluated to a
In function call  Length(a)
CommandLine(1) : Argument is not a list

We called ramlocal here, which should not have been allowed.

In> ram(a)
Out> ram(a);

The function ram checks that the correct arguments are passed in and that SetExpand was called, so it will not evaluate if these requirements are not met.

Here are some lines showing the functionality of this code as it was intended to be used:

In> SetExpand("Sin")
Out> "Sin";
In> ram({1,2,3})
Out> {Sin(1),Sin(2),Sin(3)};

The following piece of code forces the functionality to break by passing in an expression containing the variable x, which is also used as a parameter name to ramlocal.

In> ram({a,b,c})
Out> {Sin(a),Sin(b),Sin(c)};
In> ram({x,y,z})
Out> {{Sin(x),Sin(y),Sin(z)},Sin(y),Sin(z)};

This result is obviously wrong, compared to the call above. The following shows that the global variable expand is exposed to its environment:

In> expand
Out> "Sin";


Declaring resources to be local to the module

The solution to the problem is LocalSymbols, which changes every symbol with a specified name into a unique name that could never be entered by the user on the command line, and thus guarantees that it can never interact with the rest of the system. The following code snippet is the same as the above, but with the correct use of LocalSymbols:

LocalSymbols(x,expand,ramlocal) [
  SetExpand(fn_IsString) <-- [expand:=fn;];
  ram(x_IsList)_(expand != "") <-- ramlocal(x);
  expand:="";
  ramlocal(x) := Map(expand,{x});
];

This version of the same code declares the symbols x, expand and ramlocal to be local to this module.

With this the interaction becomes a little bit more predictable:

In> ramlocal(a)
Out> ramlocal(a);
In> ram(a)
Out> ram(a);
In> SetExpand("Sin")
Out> "Sin";
In> ram({1,2,3})
Out> {Sin(1),Sin(2),Sin(3)};
In> ram({a,b,c})
Out> {Sin(a),Sin(b),Sin(c)};
In> ram({x,y,z})
Out> {Sin(x),Sin(y),Sin(z)};
In> expand
Out> expand;


When to use and when not to use LocalSymbols

LocalSymbols should ideally be used for every global variable, for functions that are only useful within the module and thus should not be called by other parts of the system, and for local variables that run the risk of being passed into functions like Eval, Apply, Map, etc. (functions that re-evaluate expressions).

A rigorous solution would be to make all function parameters and global variables local symbols by default, but this might cause problems when such behaviour is not required, or even not wanted.

The system can never guess which functions should be exposed to the outside world and which should stay local to the module. Hiding symbols automatically would also go against a design rule of Yacas: everything is possible, but not obligatory. This is important at moments when the functionality is not wanted, as it can be hard to disable behaviour that the system applies automatically.

There are more caveats: if a local variable is made unique with LocalSymbols, other routines cannot reach it through the UnFence construct. This means that LocalSymbols is not always appropriate.

Also, the entire expression on which the LocalSymbols command works is copied and modified before being evaluated, making loading a little slower. This is not a big problem, because the time is usually spent in calculations rather than in loading scripts, but it is best to keep this in mind and keep the code passed to LocalSymbols concise.


The Yacas arithmetic library


Introduction

Yacas comes with its own arbitrary-precision arithmetic library, to reduce dependencies on other software. Several excellent and free arbitrary precision arithmetic libraries are available, for example gmp and CLN. The gmp library can optionally be linked with the Yacas engine, instead of the internal arithmetic library. The choice of a specific number library is a compile-time option; one cannot switch to another arithmetic library at runtime. The default choice is the internal library ("yacasnumbers.a").

This part describes how the arithmetic library is embedded into Yacas and how to embed other arithmetic libraries.


The link between the interpreter and the arithmetic library

The Yacas interpreter has the concept of an atom, an object which has a string representation. Numbers are also atoms and are initially entered into Yacas as decimal strings.
There are functions to work with numbers in non-decimal bases, but direct input/output of numbers is supported only in decimal notation.
As soon as a calculation needs to be performed, the string representation is used to construct an object representing the number, in an internal representation that the arithmetic library can work with.

The basic layout is as follows: there is one class BigNumber that offers basic numerical functions, such as the arithmetic operations of addition and multiplication, through a set of class methods. Integers and floating-point numbers are handled by the same class.

The BigNumber class delegates the actual arithmetic operations to the auxiliary classes BigInt and BigFloat. These two classes are direct wrappers of an underlying arithmetic library. The library implements a particular internal representation of numbers.

The responsibility of the class BigNumber is to perform precision tracking, floating-point formatting, error reporting, type checking and so on, while BigInt and BigFloat only concern themselves with low-level arithmetic operations on integer and floating-point numbers respectively. In this way Yacas isolates higher-level features like precision tracking from the lower-level arithmetic operations. The number objects in a library need only be able to convert themselves to/from a string and perform basic arithmetic. It should be easy to wrap a generic arithmetic library into a BigNumber implementation.

It is impossible to have several alternative number libraries operating at the same time. [In principle, one might write the classes BigInt and BigFloat as wrappers of two different arithmetic libraries, one for integers and the other for floats, but at any rate one cannot have two different libraries for integers at the same time.] Having several libraries in the same Yacas session does not seem to be very useful; it would also incur a lot of overhead because one would have to convert the numbers from one internal library representation to another. For performance benchmarking or for testing purposes one can compile separate versions of Yacas configured with different arithmetic libraries.

To embed an arbitrary-precision arithmetic library into Yacas, one needs to write two wrapper classes, BigInt and BigFloat. (Alternatively, one could write a full BigNumber wrapper class but that would result in code duplication unless the library happens to implement a large portion of the BigNumber API. There is already a reference implementation of BigNumber through BigInt and BigFloat in the file numbers.cpp.) The required API for the BigNumber class is described below.


Interface of the BigNumber class

The following C++ code demonstrates how to use the objects of the BigNumber class.

// Calculate z=x+y where x=10 and y=15
BigNumber x("10",100,10);
BigNumber y("15",100,10);
BigNumber z;
z.Add(x,y,10);
// cast the result to a string
LispString str;
z.ToString(str,10);
The behaviour is such that in the above example z will contain the result of adding x and y, without modifying x or y. This is equivalent to z:=x+y in Yacas.

A calculation might modify one of its arguments. This might happen when one argument passed in is actually the object performing the calculation itself. For example, if a calculation

x.Add(x,y);
were issued, the result would be assigned to x, and the old value of x would be discarded. This is equivalent to the Yacas code x:=x+y. In this case a specific implementation might opt to perform the operation destructively ("in-place"). Some operations can be performed much more efficiently in-place, without copying the arguments; among them are, for example, Negate, Add, ShiftLeft, and ShiftRight.

Therefore, all class methods of BigNumber that take a BigNumber argument should behave correctly when called destructively on the same BigNumber object. The result must be exactly the same as if all arguments had been copied to temporary locations before performing the operation, with no other side effects. For instance, if the object representing the number inside the numeric class is shared with other objects, it should not be operated on destructively, as the other objects might then start behaving differently.
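
To illustrate this requirement, here is a minimal sketch of an aliasing-safe Add, using a toy class around a machine integer (this is not the actual BigNumber code, which delegates to BigInt and BigFloat and also tracks precision):

// Toy illustration of an aliasing-safe in-place Add.
class ToyNumber
{
public:
    explicit ToyNumber(long value = 0) : iValue(value) {}

    // z.Add(x,y) must be correct even when &x == this or &y == this:
    // read both operands completely before writing to *this.
    void Add(const ToyNumber& x, const ToyNumber& y)
    {
        const long result = x.iValue + y.iValue;
        iValue = result;   // only now is the old value of *this discarded
    }

    long Value() const { return iValue; }

private:
    long iValue;
};

With this discipline, x.Add(x,y) computes the sum from the old value of x before overwriting it, which is exactly the required "as if copied" behaviour.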

The basic arithmetic class BigNumber defines some simple arithmetic operations, through which other more elaborate functions can be built. Particular implementations of the multiple-precision library are wrapped by the BigNumber class, and the rest of the Yacas core should only use the BigNumber API.

This API will not be completely exposed to Yacas scripts, because some of these functions are too low-level. Among the low-level functions, only those that are very useful for optimization will be available to the Yacas scripts. (For the functions that seem to be useful for Yacas, suggested Yacas bindings are given below.) But the full API will be available to C++ plugins, so that multiple-precision algorithms could be efficiently implemented when performance is critical. Intermediate-level arithmetic functions such as MathAdd, MathDiv, MathMod and so on could be implemented either in the Yacas core or in plugins, through this low-level API. The library scripts will be able to transform numerical expressions such as x:=y+z into calls of these intermediate-level functions.

Here we list the basic arithmetic operations that need to be implemented by a multiple-precision class BigNumber. The operations are divided into several categories for convenience. Equivalent Yacas script code is given, as well as examples of C++ usage.

1. Input/output operations.

2. Basic object manipulation.
These operations, as a rule, do not need to change the numerical value of the object.

3. Basic arithmetic operations.
Note that here "precision" always means the number of significant bits, i.e. digits in the base 2, not decimal digits.

4. Auxiliary operations.
Some additional operations useful for optimization purposes. These operations can be efficiently implemented with a binary-based internal representation of big numbers.

For floating-point numbers, BitCount should return the binary exponent of the number (with sign), like the integer output of the standard C function frexp. More formally: if n=BitCount(x), and x!=0, then 1/2<=Abs(x)*2^(-n)<1. The bit count of an integer zero or a floating zero is arbitrarily defined to be 1. (The bit count is, in effect, an optimized binary logarithm.) C++:
x.BitCount();
Yacas:
MathBitCount(x)
Note: the return type of the bit count is a platform signed integer (C++ type long).

The API includes only the most basic operations. All other mathematical functions such as GCD, power, logarithm, cosine and so on, can be efficiently implemented using this basic interface.

Note that generally the arithmetic functions will set the type of the resulting object to the type of the result of the operation. For example, operations that only apply to integers (Mod, BitAnd etc.) will set the type of the resulting object to integer even if an argument is a float. The results of these operations on non-integer arguments are undefined.


Precision of arithmetic operations

All operations on integers are exact. Integers must grow or shrink when necessary, limited only by system memory. But floating-point numbers need some precision management.

In some arithmetic operations (add, multiply, divide) the working precision is given explicitly. For example,
x.Add(y,z,100)
will add y to z and put the result into x, truncating it to at most 100 bits of mantissa, if necessary. (The precision is given in bits, not in decimal digits, because when dealing with low-level operations it is much more natural to think in terms of bits.) If the numbers y, z have fewer than 100 bits of mantissa each, then their sum will not be precise to all 100 bits. That is fine; but it is important that the sum not contain more than 100 bits. Floating-point values, unlike integers, only grow up to the given number of significant bits, and then a round-off must occur. Otherwise we would be wasting a lot of time on computations with many meaningless digits. A good low-level arithmetic library such as gmp will avoid floating-point computations with more digits than explicitly requested.


Automatic precision tracking

The precision of arithmetic operations on floating-point numbers can be maintained automatically. A rigorous way to do it would be to represent each imprecise real number x by an interval with rational bounds within which x is guaranteed to be. This is usually called "interval arithmetic." A result of an interval-arithmetic calculation is "exact" in the sense that the actual (unknown) number x is always within the resulting interval. However, interval arithmetic is computationally expensive and at any rate the width of the resulting interval is not guaranteed to be small enough for a particular application.

For the Yacas arithmetic library, a "poor man's interval arithmetic" is proposed where the precision is represented by the "number of correct bits". The precision is not tracked exactly but almost always adequately. The purpose of this kind of rough precision tracking is to catch a critical roundoff error or to indicate an unexpected loss of precision in numerical calculations.

Suppose we have two floating-point numbers x and y and we know that they have certain numbers of correct mantissa bits, say m and n. In other words, x is an approximation to an unknown real number x'=x*(1+delta) and we know that Abs(delta)<2^(-m); and similarly y'=y*(1+epsilon) with Abs(epsilon)<2^(-n). Here delta and epsilon are the relative errors for x and y. Typically delta and epsilon are much smaller than 1.

Suppose that every floating-point number knows the number of its correct digits. We can symbolically represent such numbers as pairs {x,m} or {y,n}. When we perform an arithmetic operation on numbers, we need to update the precision component as well.

Now we shall consider the basic arithmetic operations to see how the precision is updated.

Multiplication
If we need to multiply x and y, the correct answer is x'*y' but we only know an approximation to it, x*y. We can estimate the precision from x'*y'=x*y*(1+delta)*(1+epsilon), and it follows that the relative error is at most delta+epsilon (neglecting the second-order term delta*epsilon). But we only represent the relative errors by the number of bits. The whole idea of the simplified precision tracking is to avoid costly operations connected with precision. So instead of tracking the number delta+epsilon exactly, we represent it roughly: either set the error of x*y to the larger of the errors of x and y, or double the error.

More formally, we have the estimates Abs(delta)<2^(-m), Abs(epsilon)<2^(-n) and we need a similar estimate Abs(r)<2^(-p) for r=delta+epsilon.

If the two numbers x and y have the same number of correct bits, we should double the error (i.e. decrease the number of significant bits by 1). But if they don't have the same number of bits, we cannot really estimate the error very well. To be on the safe side, we might double the error if the numbers x and y have almost the same number of significant bits, and leave the error constant if the numbers of significant bits of x and y are very different.

The answer expressed as a formula is p=Min(m,n) if Abs(m-n)>=D and p=Min(m,n)-1 otherwise. Here D is a constant that expresses our tolerance for error. In the current implementation, D=1.
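
For instance, with D=1: multiplying {x,20} by {y,20} gives p=19 (equal bit precisions, so the error is doubled), while multiplying {x,20} by {y,35} gives p=20 (the precision of the less precise operand).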

If one of the operands is a floating zero {0.,m} (see below) and the other is {y,n}, then p=m-BitCount(y)+1. This is the same formula as above, if we pretend that the bit count of {0.,m} is equal to 1-m.

Division
Division is multiplication by the inverse number. When we take the inverse of x*(1+delta), we obtain approximately 1/x*(1-delta). The relative precision does not change when we take the inverse. So the handling of precision is exactly the same as for the multiplication.

Addition
Addition is more complicated because the absolute rather than the relative precision plays the main role, and because there may be roundoff errors associated with subtracting almost equal numbers.

Formally, we have the relative precision r of x+y as

r=(delta*x+epsilon*y)/(x+y).

We have the bounds on delta and epsilon:

Abs(delta)<2^(-m), Abs(epsilon)<2^(-n),

and we need to find a bit bound on r, i.e. an integer p such that Abs(r)<2^(-p). But we cannot estimate p without first computing x+y and analyzing the relative magnitudes of x and y. To perform this estimate, we need the bit counts of x, y, and x+y. Let these bit counts be a, b and c, so that Abs(x)<2^a, Abs(y)<2^b, and 2^(c-1)<=Abs(x+y)<2^c. (At first we assume that x!=0, y!=0, and x+y!=0.) Now we can estimate r as

r<=Abs((x*2^(-m))/(x+y))+Abs((y*2^(-n))/(x+y))<=2^(a+1-m-c)+2^(b+1-n-c).

This is formally similar to multiplying two numbers with m+c-a-1 and n+c-b-1 correct bits. As in the case of multiplication, we may take the minimum of the two bit counts, or subtract 1 from it if they are almost equal.
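
For instance, suppose x and y are of order 1 (a=b=1), each known to 30 bits (m=n=30), and the sum x+y turns out to be of order 2^(-10) (c=-9). Then each term contributes a relative error of about 2^(a+1-m-c)=2^(-19), so the sum is correct to only about 18 bits: roughly 12 bits of precision were lost to the cancellation.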

Note that there is one important case when we can estimate the precision better than this. Suppose x and y have the same sign; then there is no cancellation when we compute x+y. The above formula for r gives an estimate

r<Max(Abs(delta),Abs(epsilon))

and therefore the precision of the result is at least p=Min(m,n).

If one of the operands is a floating zero represented by x={0.,m} (see below), then the calculation of the error is formally the same as in the case x={1.,m}. This is as if the bit count of {0.,m} were equal to 1 (unlike the case of multiplication).

Finally, if the sum x+y is a floating zero but x!=0 and y!=0, then it must be that a=b. In that case we represent x+y as {0.,p}, where p=Min(m,n)-a.

Computations with a given target precision
Using these rules, we can maintain a bound on the numerical errors of all calculations. But sometimes we know in advance that we shall not be needing any more than a certain number of digits of the answer, and we would like to avoid an unnecessarily high precision and reduce the computation time. How can we combine an explicitly specified precision, for example, in the function
x.Add(y,z,100)
with the automatic precision tracking?

We should truncate one or both of the arguments to a smaller precision before starting the operation. For the multiplication as well as for the addition, the precision tracking involves a comparison of two binary exponents 2^(-g) and 2^(-h) to obtain an estimate on 2^(-g)+2^(-h). Here g and h are some integers that are easy to obtain during the computation. For instance, the multiplication involves g=m and h=n. This comparison will immediately show which of the arguments dominates the error.

The ideal situation would be when one of these exponentials is much smaller than the other, but not very much smaller (that would be a waste of precision). In other words, we should aim at Abs(g-h)<8 or so, where 8 is the number of guard bits we would like to maintain. (Generally it is a good idea to have at least 8 guard bits; somewhat more guard bits do not slow down the calculation very much, but 200 guard bits would surely be overkill.) Then the number that is much more precise than necessary can be truncated.

For example, if we find that g=250 and h=150, then we can safely truncate x to 160 bits or so; if, in addition, we need only 130 bits of final precision, then we could truncate both x and y to about 140 bits.

Note that when we need to subtract two almost equal numbers, there will be a necessary loss of precision, and it may be impossible to decide on the target precision before performing the subtraction. Therefore the subtraction will have to be performed using all available digits.

The floating zero
There is a difference between an integer zero and a floating-point zero. An integer zero is exact, so the result of 0*1.1 is exactly zero (also an integer). However, x:=1.1-1.1 is a floating-point zero (a "floating zero" for short) of which we can only be sure about the first digit after the decimal point, i.e. x=0.0. The number x might represent 0.01 or -0.02 for all we know.

It is impossible to track the relative precision of a floating zero, but it is possible to track the absolute precision. Suppose we store the bit count of the absolute precision, just as we store the bit count of the relative precision with nonzero floats. Thus we represent a floating zero as a pair {0.,n} where n is an integer, and the meaning of this is a number between -2^(-n) and 2^(-n).

We can now perform some arithmetic operations on the floating zero. Addition and multiplication are handled similarly to the nonzero case, except that we interpret n as the absolute error rather than the relative error. This does not present any problems. For example, the error estimate for addition is the same as if we had a number 1 with relative error 2^(-n) instead of {0.,n}. When multiplying {x,m} by {0.,n}, the result is again a floating zero {0.,p}, and the new estimate of the absolute precision is p=n-BitCount(x)+1.
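
For instance, multiplying {3.,10} by the floating zero {0.,20} gives, by this formula, the floating zero {0.,19}, since BitCount(3)=2: the absolute uncertainty of the zero grows roughly by the magnitude of the factor 3.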

The division by the floating zero, negative powers, and the logarithm of the floating zero are not representable in our arithmetic because, interpreted as intervals, they would correspond to infinite ranges. The bit count of the floating zero is therefore undefined. However, we can define a positive power of the floating zero (the result is again a floating zero).

The sign of the floating zero is defined as (integer) 0. (Then we can quickly check whether a given number is a zero.)


Comparison of floats

Suppose we need to compare floating-point numbers x and y. In the strict mathematical sense this is an unsolvable problem because we may need in principle arbitrarily many digits of x and y before we can say that they are equal. In other words, "zero-testing is uncomputable". So we need to relax the mathematical rigor somewhat.

Suppose that x=12.0 and y=12.00. Then in fact x might represent a number such as 12.01, while y might represent 11.999. There may be two approaches: first, "12.0" is not equal to "12.00" because x and y might represent different numbers. Second, "12.0" is equal to "12.00" because x and y might also represent equal numbers. A logical continuation of the first approach is that "12.0" is not even equal to another copy of "12.0" because they might represent different numbers, e.g. if we compute x=6.0+6.0 and y=24.0/2.0, the roundoff errors might be different.

Here is an illustration in support of the idea that the comparison 12.0=12 should return True. Suppose we are writing an algorithm for computing the power x^y. This is much faster if y is an integer, because we can use the binary squaring algorithm. So we need to detect whether y is an integer. Now suppose we are given x=13.3 and y=12.0. Clearly we should use the integer powering algorithm, even though technically y is a float. (To be sure, we should check that the integer powering algorithm generates enough significant digits.)

However, the opposite approach is also completely possible: no two floating-point numbers should be considered equal, except perhaps when one is a bit-for-bit exact copy of the other and when we haven't yet performed any arithmetic on them. (The GMP library uses essentially this definition.) It seems that no algorithm really needs a test for equality of floats. The two useful comparisons on floats x, y seem to be the following:

Given these predicates, it seems that any floating-point algorithm can be implemented just as efficiently as with any "reasonable" definition of the floating-point equality.


How to increase the working precision

Suppose that in a Yacas session we declare Precision(5), write x:=0.1, and then increase precision to 10 digits. What is x now? There are several approaches:

1) The number x stays the same but further calculations are done with 10 digits. In terms of the internal binary representation, the number is padded with binary zeros. This means that now e.g. 1+x will not be equal to 1.1 but to something like 1.100000381 (to 10 digits). And actually x itself should evaluate to 0.1000003815 now. This was 0.1 to 5 digits but it looks a little different if we print it to 10 digits. (A "binary-padded decimal".)

This problem may look horrible at first sight -- "how come I can't write 0.1 any more??" -- but this seems so because we are used to calculations in decimals with a fixed precision, and an operation such as "increase precision by 10 digits" is largely unfamiliar to us except in decimals. This seems to be mostly a cosmetic problem. In a real calculation, we shouldn't be writing "0.1" when we need the exact number 1/10. When we request to increase precision in the middle of a calculation, this mistake surfaces and gives unexpected results.

2) When precision is increased, the number x takes its decimal representation, pads it with zeros, and converts back to the internal representation, just so that the appearance of "1.100000381" does not jar our eyes. (Note that the number x does not become "more precise" if we pad it with decimal zeros instead of binary zeros, unless we made a mistake and wrote "0.1" instead of the exact fraction 1/10.)

With this approach, each number x that doesn't currently have enough digits must change in a complicated way. This will mean a performance hit in all calculations that require dynamically changing precision (Newton's method and some other fast algorithms require this). In these calculations, the roundoff error introduced by "1.100000381" is automatically compensated and the algorithm will work equally well no matter how we extend x to more digits; but it's a lot slower to go through the decimal representation every time.

3) The approach currently being implemented in Yacas is a compromise between the above two. We distinguish number objects that were given by the user as decimal strings (and not as results of calculations), for instance x:=1.2, from number objects that are results of calculations, for instance y:=1.2*1.4. Objects of the first kind are interpreted as exact rational numbers given by a decimal fraction, while objects of the second kind are interpreted as inexact floating-point numbers known to a limited precision. Suppose x and y are first assigned as indicated, with the precision of 5 digits each, then the precision is increased to 10 digits and x and y are used in some calculation. At this point x will be converted from the string representation "1.2" to 10 decimal digits, effectively making 1.2 a shorthand for 1.200000000. But the value of y will be binary-padded in some efficient way and may be different from 1.680000000.

In this way, efficiency is not lost (there are no repeated conversions from binary to decimal and back), and yet the cosmetic problem of binary-padded decimals does not appear. An explicitly given decimal string such as "1.2" is interpreted as a shorthand for 1.2000... with as many zeroes as needed for any currently selected precision. But numbers that are results of arithmetic operations are not converted back to a decimal representation for zero-padding. Here are some example calculations:

In> Precision(5)
Out> True
In> x:=1.2
Out> 1.2
In> y:=N(1/3)
Out> 0.33333
The number y is a result of a calculation and has a limited precision. Now we shall increase the precision:
In> Precision(20)
Out> True
In> y
Out> 0.33333
The number y is printed with 5 digits, because it knows that it has only 5 correct digits.
In> y+0
Out> 0.33333333325572311878
In a calculation, y was binary-padded, so the last digits are incorrect.
In> x+0
Out> 1.2
However, x has not been padded and remains an "exact" 1.2.
In> z:=y+0
Out> 0.33333333325572311878
In> Precision(40)
Out> True
In> z
Out> 0.33333333325572311878
Now we can see how the number z is padded again:
In> z+0
Out> 0.33333333325572311878204345703125


The meaning of the Precision() call

The user calls Precision() to specify the "wanted number of digits." We could use different interpretations of the user's wish: The first interpretation is that Precision(10) means "I want all answers of all calculations to contain 10 correct digits". The second interpretation is "I want all calculations with floating-point numbers done using at least 10 digits".

Suppose we have floating-point numbers x and y, known only to 2 and 3 significant digits respectively. For example, x=1.6 and y=2.00. These x and y are results of previous calculations and we do not have any more digits than this. If we now say
Precision(10);
x*y;
then clearly the system cannot satisfy the first interpretation because there aren't enough digits of x and y to find 10 digits of x*y. But we can satisfy the second interpretation, even if we print "3.2028214767" instead of the expected 3.2. The garbage after the third digit is unavoidable and harmless unless our calculation really depends on having 10 correct digits of x*y. The garbage digits can be suppressed when the number is printed, so that the user will never see them. But if our calculation depends on the way we choose the extra digits, then we are using a bad algorithm.

The first interpretation of Precision() can only be satisfied if we are given a self-contained calculation with integer numbers. For example,
N(Sin(Sqrt(3/2)-10^(20)), 50)
This can be computed to 50 digits with some effort, but only if we are smart enough to use 70 digits in the calculation of the argument of Sin(). (This level of smartness is currently not implemented in the N function.) The result of this calculation will have 50 digits and not a digit more; we cannot put the result inside another expression and expect full precision in all cases. This seems to be a separate task, "compute something with n digits no matter what", and not a general routine to be followed at all times.

So it seems that the second interpretation of Precision(n), namely: "please use n digits in all calculations now", is more sensible as a general-purpose prescription.

But this interpretation does not mean that all numbers will be printed with n digits. Let's look at a particular case (for simplicity we are talking about decimal digits but in the implementation they will be binary digits). Suppose we have x precise to 10 digits and y precise to 20 digits, and the user says Precision(50) and z:=1.4+x*y. What happens now in this calculation? (Assume that x and y are small numbers of order 1; the other cases are similar.)

First, the number "1.4" is now interpreted as being precise to 50 digits, i.e. "1.4000000...0" but not more than 50 digits.

Then we compute x*y using their internal representations. The result is good only to 10 digits, and it knows this. We do not compute 50 digits of the product x*y, it would be pointless and a waste of time.

Then we add x*y to 1.4000...0. The sum, however, will be precise only to 10 digits. We can now do one of two things: (a) we could pad x*y with 40 more zero digits and obtain a 50-digit result; however, this result will only be correct to 10 digits. (b) we could truncate 1.4 to 10 digits (1.400000000) and obtain the sum to 10 digits. In both cases the result will "know" that it only has 10 correct digits.

It seems that the option (b) is better because we do not waste time with extra digits.

The result is a number that is precise to 10 digits. However, the user wants to see this result with 50 digits. Even if we chose option (a), we would have some bogus digits: in effect, 40 digits of somewhat random round-off error. Should we print 10 correct digits and 40 bogus digits? It seems better to print only 10 correct digits in this case. The GMP library already has this functionality in its string printing functions: it does not print more digits than the number actually holds.

If we choose this route, then the only effect of Precision(50) will be to interpret a literal constant 1.4 as a 50-digit number. All other numbers already know their real precision and will not invent any bogus digits.

In some calculations, however, we do want to explicitly extend the precision of a number to some more digits. For example, in Newton's method we are given a first approximation x[0] to a root of f(x)=0 and we want to have more digits of that root. Then we need to pad x[0] with some more digits and re-evaluate f(x[0]) to more digits (this is needed to get a better approximation to the root). This padding operation seems rather special and directed at a particular number, not at all numbers at once. For example, if f(x) itself contains some floating-point numbers, then we should be unable to evaluate it with higher precision than allowed by the precision of these numbers. So it seems that we need access to these two low-level operations: the padding and the query of current precision. The proposed interface is GetExactBits(x) and SetExactBits(x,n). These operations are directed at a particular number object x.
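
For instance, if x[0] is known to 10 bits, a Newton step could call SetExactBits(x[0],20) before re-evaluating f(x[0]), since each iteration of Newton's method roughly doubles the number of correct digits of the root.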


Summary of arbitrary-precision semantics


Formal definitions of precision tracking

Here we shall consider arithmetic operations on floats x and y, represented as pairs {x,m} and {y,n}. The result of the operation is z, represented as a pair {z,p}. Here x, y, z are floats and m, n, p are integers.

We give formulae for p in terms of x, y, m, and n. Sometimes the bit count of a number x is needed; it is denoted B(x) for brevity.

Formal definitions
A pair {x,m} where x is a floating-point value and m is an integer value (the "number of correct bits") denotes a real number between x*(1-2^(-m)) and x*(1+2^(-m)) when x!=0, and a real number between -2^(-m) and 2^(-m) when x=0 (a "floating zero").

The bit count B(x) is an integer function of x defined for real x!=0 by

B(x):=1+Floor(Ln(Abs(x))/Ln(2)).

This function also satisfies

2^(B(x)-1)<=Abs(x)<2^B(x).

For example, B(1/4)= -1, B(1)=B(3/2)=1, B(4)=3. The bit count of zero is arbitrarily set to 1. For integer x, the value B(x) is the number of bits needed to write the binary representation of x.

The bit count function can usually be computed in constant time, because the usual representation of long numbers is by arrays of platform integers and a binary exponent. The length of the array of digits is usually available at no computational cost.
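
As an illustration, here is a sketch of the bit count computed for a machine double via the standard frexp function (for big numbers, the same value is simply read off the exponent stored in the internal representation):

#include <cmath>

// Sketch: B(x) for a machine double. B(0) is arbitrarily defined as 1.
long BitCountDouble(double x)
{
    if (x == 0.0) return 1;
    int e;
    std::frexp(std::fabs(x), &e);   // |x| = f * 2^e with 0.5 <= f < 1
    return e;                       // hence 2^(e-1) <= |x| < 2^e
}

This reproduces the examples above: BitCountDouble(0.25) returns -1, BitCountDouble(1.0) and BitCountDouble(1.5) return 1, and BitCountDouble(4.0) returns 3.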

The absolute error Delta[x] of {x,n} is of order Abs(x)*2^(-n). Given the bit count of x, this can be estimated as

2^(B(x)-n-1)<=Delta[x]<2^(B(x)-n).

So the bit count of Delta[x] is B(x)-n.

Floor()
The function Floor({x,m}) gives an integer result if there are enough digits to determine it exactly, and otherwise returns the unchanged floating-point number. The condition for Floor({x,m}) to give an exact result is

m>=B(x).

BecomeFloat()
The function BecomeFloat(n) will convert an integer to a float with at least n bits of precision. If x is the original integer value, then the result is {x,p} where p=Max(n,B(x)).

Underflow check
It is possible to have a number {x,n} with x!=0 such that {0.,m}={x,n} for some m. This would mean that the floating zero {0.,m} is not precise enough to be distinguished from {x,n}, i.e.

Abs(x)<2^(-m).

This situation is normal. But it would be meaningless to have a number {x,n} with x!=0 and a precision interval that contains 0. Such {x,n} will in effect be equal to any zero {0.,m}, because we do not know enough digits of x to distinguish {x,n} from zero.

From the definition of {x,n} with x!=0 it follows that 0 can be within the precision interval only if n<= -1. Therefore, we should transform any number {x,n} such that x!=0 and n<= -1 into a floating zero {0.,p} where

p=n-B(x).

(Now it is not necessarily true that p>=0.) This check should be performed at any point where a new precision estimate n is obtained for a number x and where a cancellation may occur (e.g. after a subtraction). Then we may assume that any given float is already reduced to zero if possible.

Equals()
We need to compare {x,m} and {y,n}.

First, we can quickly check whether the values x and y have the same nonzero signs and the same bit counts. If x>0 and y<0 or vice versa, or if B(x)!=B(y), then the two numbers are definitely unequal. We can also check whether both x=y=0; if this is the case, then we know that {x,m}={y,n} because any two zeros are equal.

However, a floating zero can be sometimes equal to a nonzero number. So we should now exclude this possibility: {0.,m}={y,n} if and only if Abs(y)<2^(-m). This condition is equivalent to

B(y)< -m.

If these checks do not provide the answer, the only possibility left is when x!=0 and y!=0 and B(x)=B(y).

Now we can consider two cases: (1) both x and y are floats, (2) one is a float and the other is an integer.

In the first case, {x,m}={y,n} if and only if the following condition holds:

Abs(x-y)<Max(2^(-m)*Abs(x),2^(-n)*Abs(y)).

This is a somewhat complicated condition but its evaluation does not require any long multiplications, only long additions, bit shifts and comparisons.

It is now necessary to compute x-y (one long addition); this computation needs to be done with Min(m,n) bits of precision.

After computing x-y, we can avoid the full evaluation of the complicated condition by first checking some easier conditions on x-y. If x-y=0 as floating-point numbers ("exact cancellation"), then certainly {x,m}={y,n}. Otherwise we can assume that x-y!=0 and check:

If neither of these conditions gives us the answer, we have to evaluate the full condition by computing Abs(x)*2^(-m) and Abs(y)*2^(-n) and comparing them with Abs(x-y).

In the second case, one of the numbers is an integer x and the other is a float {y,n}. Then x={y,n} if and only if

Abs(x-y)<2^(-n)*Abs(y).

For the computation of x-y, we need to convert x into a float with precision of n digits, i.e. replace the integer x by a float {x,n}. Then we may use the procedure for the first case (two floats) instead of implementing a separate comparison procedure for integers.
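
As an illustration, here is a sketch of the first-case condition, using machine doubles in place of big floats (the names and the use of double are for illustration only; the real code operates on big floats with shifts and comparisons):

#include <cmath>

// Sketch: do {x,m} and {y,n} denote overlapping real numbers?
bool FloatsEqual(double x, long m, double y, long n)
{
    double errX = std::ldexp(std::fabs(x), -(int)m);   // 2^(-m)*|x|
    double errY = std::ldexp(std::fabs(y), -(int)n);   // 2^(-n)*|y|
    return std::fabs(x - y) < std::fmax(errX, errY);
}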

LessThan()
If {x,m}={y,n} according to the comparison function Equals(), then the predicate LessThan is false. Otherwise it is true if and only if x<y as floats.

IsIntValue()
To check whether {x,n} has an integer value within its precision, we first need to check that {x,n} has enough digits to compute Floor(x) accurately. If not (i.e. if n<B(x)), then we conclude that x has an integer value. Otherwise we compute y:=x-Floor(x) as a float value (without precision control) to n bits. If y is exactly zero as a float value, then x has an integer value. Otherwise {x,n} has an integer value if and only if B(y)< -n.

This procedure is basically the same as comparing {x,n} with Floor(x).

Sign()
The sign of {x,n} is defined as the sign of the float value x. (The number {x,n} should have been reduced to a floating zero if necessary.)

Addition and subtraction (Add, Negate)
We need to add {x,m} and {y,n} to get the result {z,p}. Subtraction is the same as addition, except we negate the second number. When we negate a number, its precision never changes.

First consider the case when x+y!=0.

If x is zero, i.e. {0.,m} (but x+y!=0), then the situation with precision is the same as if x were {1.,m}, because then the relative precision is equal to the absolute precision. In that case we take the bit count of x as B(0)=1 and proceed by the same route.

First, we should decide whether it is necessary to add the given numbers at all. It may be unnecessary if, for example, x+y is indistinguishable from x within the precision of x (we may say that a "total underflow" occurred during the addition). To check for this, we need to estimate the absolute errors of x and y:

2^(B(x)-m-1)<=Delta[x]<2^(B(x)-m),

2^(B(y)-n-1)<=Delta[y]<2^(B(y)-n).

Addition is not necessary if Abs(x)<=Delta[y] or if Abs(y)<=Delta[x]. Since we should rather perform an addition than wrongly dismiss it as unnecessary, we should use a sufficient condition here: if

B(x)<=B(y)-n-1

then we can neglect x and set z=y, p=n-Dist(B(x),B(y)-n-1). (We subtract one bit from the precision of y in case the magnitude of x is close to the absolute error of y.) Also, if

B(y)<=B(x)-m-1

then we can neglect y and set z=x, p=m-Dist(B(y),B(x)-m-1).

Suppose none of these checks were successful. Now, the float value z=x+y needs to be calculated. To find it, we need the target precision of only

1+Max(B(x),B(y))-Max(B(x)-m,B(y)-n)

bits. (An easier upper bound on this is 1+Max(m,n) but this is wasteful when x and y have very different precisions.)

Then we compute B(z) and determine the precision p as

p=Min(m-B(x),n-B(y))+B(z)-1-Dist(m-B(x),n-B(y)),

where the auxiliary function Dist(a,b) is defined as 0 when Abs(a-b)>2 and 1 otherwise.
The definition of Dist(a,b) is necessarily approximate; if we replace 2 by a larger number, we shall be overestimating the error in more cases.
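
A sketch of these two formulas in C++ (a direct transcription, not the actual kernel code; the refinements discussed below would adjust the returned value):

#include <algorithm>
#include <cstdlib>

// Sketch of the auxiliary function: 0 when |a-b|>2, 1 otherwise.
long Dist(long a, long b)
{
    return (std::labs(a - b) > 2) ? 0 : 1;
}

// Sketch of the precision update for z=x+y in the general case.
// m, n are the precisions of x and y; Bx, By, Bz are bit counts.
long AddPrecision(long m, long n, long Bx, long By, long Bz)
{
    return std::min(m - Bx, n - By) + Bz - 1 - Dist(m - Bx, n - By);
}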

In the case when x and y have the same sign, we have a potentially better estimate p=Min(m,n). We should take this value if it is larger than the value obtained from the above formula.

Also, the above formula is underestimating the precision of the result by 1 bit if the result and the absolute error are dominated by one of the summands. In this case the absolute error should be unchanged save for the Dist term, i.e. the above formula needs to be incremented by 1. The condition for this is B(x)>B(y) and B(x)-m>B(y)-n, or the same for y instead of x.

The result is now {z,p}.

Note that the obtained value of p may be negative (total underflow) even though we have first checked for underflow. In that case, we need to transform {z,p} into a floating zero, as usual.

Now consider the case when z:=x+y=0.

This is only possible when B(x)=B(y). Then the result is {0.,p} where p is found as

p=1+Min(m,n)-B(x)-Dist(m,n).

Note that this is the same formula as in the general case, if we define B(z)=B(0):=1. Therefore with this definition of the bit count one can use one formula for the precision of addition in all cases.

If the addition needs to be performed with a given maximum precision P, and it turns out that p>P, then we may truncate the final result to P digits and set its precision to P instead. (It is advisable to leave a few bits untruncated as guard bits.) However, the first operation z:=x+y must be performed with the precision specified above, or else we run the danger of losing significant digits of z.

Adding integers to floats
If an integer x needs to be added to a float {y,n}, then we should formally use the same procedure as if x had infinitely many precise bits. In practice we can take some shortcuts.

It is enough to convert the integer to a float {x,m} with a certain finite precision m and then follow the general procedure for adding floats. The precision m must be large enough so that the absolute error of {x,m} is smaller than the absolute error of {y,n}: B(x)-m<=B(y)-n-1, hence

m>=1+n+B(x)-B(y).

In practice we may allow for a few guard bits over the minimum m given by this formula.

Sometimes the formula gives a negative value for the minimum m; this means underflow while adding the integer (e.g. adding 1 to 1.11e150). In this case we do not need to perform any addition at all.

Multiplication
We need to multiply {x,m} and {y,n} to get the result {z,p}.

First consider the case when x!=0 and y!=0. The resulting value is z=x*y and the precision is

p=Min(m,n)-Dist(m,n).

If one of the numbers is an integer x, and the other is a float {y,n}, it is enough to convert x to a float with somewhat more than n bits, e.g. {x,n+3}, so that the Dist function does not decrement the precision of the result.

Now consider the case when {x,m}={0.,m} but y!=0. The result is z=0 and the resulting precision is

p=m-B(y)+1.

Finally, consider the case when {x,m}={0.,m} and {y,n}={0.,n}. The result is z=0 and the resulting precision is

p=m+n.

The last two formulae are the same if we defined the bit count of {0.,m} as 1-m. This differs from the "standard" definition of B(0)=1. (The "standard" definition is convenient for the handling of addition.) With this non-standard definition, we may use the unified formula

p=2-B(x)-B(y)

for the case when one of x, y is a floating zero.

If the multiplication needs to be performed to a given target precision P which is larger than the estimate p, then we can save time by truncating both operands to P digits before performing the multiplication. (It is advisable to leave a few bits untruncated as guard bits.)

Division
Division is handled essentially in the same way as multiplication. The relative precision of x/y is the same as the relative precision of x*y as long as both x!=0 and y!=0.

When x=0 and y!=0, the result of division {0.,m}/{y,n} is a floating zero {0.,p} where p=m+B(y)-1. When x is an integer zero, the result is also an integer zero.

Division by an integer zero or by a floating zero is not permitted. The code should signal a zero division error.

ShiftLeft(), ShiftRight()
These operations efficiently multiply a number by a positive or negative power of 2. Since 2 is an exact integer, the precision handling is similar to that of multiplication of floats by integers.

If the number {x,n} is nonzero, then only x changes by shifting but n does not change; if {x,n} is a floating zero, then x does not change and n is decremented (ShiftLeft) or incremented (ShiftRight) by the shift amount:
{x, n} << s = {x<<s, n};
{0.,n} << s = {0., n-s};
{x, n} >> s = {x>>s, n};
{0.,n} >> s = {0., n+s};


Implementation notes


Large exponents

The BigNumber API does not support large exponents for floating-point numbers. A floating-point number x is equivalent to two integers M, N such that x=M*2^N. Here M is the (denormalized) mantissa and N is the (binary) exponent. The integer M must be a "big integer" that may represent thousands of significant bits. But the exponent N is a platform signed integer (C++ type long), which can hold magnitudes up to about 2^31, allowing a vastly larger range than platform floating-point types. One would expect that this range of exponents is enough for most real-world applications. In the future this limitation may be relaxed if one uses a 64-bit platform. (A 64-bit platform seems to be a better choice for heavy-duty multiple-precision computations than a 32-bit platform.) However, code should not depend on having a 64-bit exponent range.

We could have implemented the exponent N as a big integer but this would be inefficient most of the time, slowing down the calculations. Arithmetic with floating-point numbers requires only very simple operations on their exponents (basically, addition and comparisons). These operations would be dominated by the overhead of dealing with big integers, compared with platform integers.

A known issue with limited exponents is the floating-point overflow and exponent underflow. (This is not the same underflow as with adding 1 to a very small number.) When the exponent becomes too large to be represented by a platform signed integer type, the code must signal an overflow error (e.g. if the exponent is above 2^31) or an underflow error (e.g. if the exponent is negative and below -2^31).


Library versions of mathematical functions

It is usually the case that a multiple-precision library implements some basic mathematical functions such as the square root. A library implementation may already be available and more efficient than an implementation using the API of the wrapper class BigNumber. In this case it is desirable to wrap the library implementation of the mathematical function, rather than use a suboptimal implementation. This could be done in several ways.

First, we recognize that we shall only have one particular numerical library linked with Yacas, and we do not have to compile our implementation of the square root if this library already contains a good implementation. We can use conditional compilation directives (#ifdef) to exclude our square root code and to insert a library wrapper instead. This scheme could be automated, so that appropriate #defines are automatically created for all functions that are already available in the given multiple-precision library, and the corresponding Yacas kernel code that uses the BigNumber API is automatically replaced by library wrappers.
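
Schematically, such a wrapper could look like the following sketch. All names here are hypothetical: HAVE_LIB_SQRT stands for a macro that would be defined automatically when the linked library provides a square root, library_sqrt for its wrapper, and generic_sqrt for the generic code written with the BigNumber API:

struct BigNumber;   // opaque in this sketch

void generic_sqrt(BigNumber& r, const BigNumber& a, long bits);   // generic BigNumber-API code
void library_sqrt(BigNumber& r, const BigNumber& a, long bits);   // thin library wrapper

void MathSqrtKernel(BigNumber& result, const BigNumber& a, long precisionBits)
{
#ifdef HAVE_LIB_SQRT
    library_sqrt(result, a, precisionBits);   // efficient library implementation
#else
    generic_sqrt(result, a, precisionBits);   // portable fallback
#endif
}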

Second, we might compile the library wrapper as a plugin, replacing the script-level square root function with a plugin-supplied function. This solution is easier in some ways because it requires no changes to the Yacas core, only to the script library. However, the library wrapper will only be available to the Yacas scripts and not to the Yacas core functions. The basic assumption of the plugin architecture is that plugins can provide new external objects and functions to the scripts, but cannot modify anything in the kernel. So plugins can replace a function defined in the scripts, but cannot replace a kernel function. Suppose that some other function, such as a computation of the elliptic integral which heavily uses the square root, were implemented in the core using the BigNumber API. It would not be able to use the square root function supplied by the plugin, because it has already been compiled into the Yacas kernel.

Third, we might put all functions that use the basic API (MathSqrt, MathSin etc.) into the script library and not into the Yacas kernel. When Yacas is compiled with a particular numerical library, the functions available from the library will also be compiled as the kernel versions of MathSqrt, MathPower and so on (using conditional compilation, or configured at build time). Since Yacas tries to call kernel functions before script library functions, the available kernel versions of MathSqrt etc. will supersede the script versions, while other functions such as BesselJ will be taken from the script library. The only drawback of this scheme is that a plugin will not be able to use the faster versions of the functions, unless the plugin is compiled specifically against the particular numerical library.

So it appears that either the first or the third solution is viable.


Converting from bits to digits and back

One task frequently needed by the arithmetic library is to convert a precision in (decimal) digits to binary bits and back. (We consider the decimal base to be specific; the same considerations apply to conversions between any other bases.) The kernel implements auxiliary routines bits_to_digits and digits_to_bits for this purpose.

Suppose that the mantissa of a floating-point number is known to d decimal digits. It means that the relative error is no more than 0.5*10^(-d). The mantissa is represented internally as a binary number. The number b of precise bits of mantissa should be determined from the equation 10^(-d)=2^(-b), which gives b=d*Ln(10)/Ln(2).

One potential problem with the conversions is that of incorrect rounding. It is impossible to represent d decimal digits by some exact number b of binary bits. Therefore the actual value of b must be a little different from the theoretical one. Then suppose we perform the inverse operation on b to obtain the corresponding number of precise decimal digits; there is a danger that we shall obtain a number d' that is different from d.

To avoid this danger, the following trick is used. The binary base 2 is the least of all possible bases, so successive powers of 2 are more frequent than successive powers of 10 or of any other base. Therefore for any power of 10 there will be a unique power of 2 that is the first one above it.

The recipe to obtain this power of 2 is simple: one should round d*Ln(10)/Ln(2) upwards using the Ceil function, but b*Ln(2)/Ln(10) should be rounded downwards using the Floor function.

This procedure will make sure that the number of bits b is high enough to represent all information in the d decimal digits; at the same time, the number d will be correctly restored from b. So when a user requests d decimal digits of precision, Yacas may simply compute the corresponding value of b and store it. A precision of b bits is enough to hold the required information, and the precision d can be easily computed given b.
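
A sketch of how the two conversion routines could be implemented follows; rounding up in one direction and down in the other is exactly what guarantees that bits_to_digits(digits_to_bits(d)) returns d again:

#include <cmath>

long digits_to_bits(long d)   // smallest b such that 2^b >= 10^d
{
    return (long)std::ceil(d * std::log(10.0) / std::log(2.0));
}

long bits_to_digits(long b)   // largest d such that 10^d <= 2^b
{
    return (long)std::floor(b * std::log(2.0) / std::log(10.0));
}

For example, d=10 gives b=34, and converting 34 bits back yields exactly 10 digits.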


The internal storage of BigNumber objects

An object of type BigNumber represents a number (and contains all information relevant to the number), and offers an interface to operations on it, dispatching the operations to an underlying arbitrary precision arithmetic library.

Higher up, Yacas only knows about objects derived from LispObject. Specifically, there are objects of class LispAtom which represent an atom.

Symbolic and string atoms are uniquely represented by the result returned by the String() method. For number atoms, there is a separate class, LispNumber. Objects of class LispNumber also have a String() method in case a string representation of a number is needed, but the main uniquely identifying piece of information is the object of class BigNumber stored inside a LispNumber object. This object is accessed using the Number() method of class LispNumber.

The life cycle of a LispNumber object involves a few subtleties, described below.

In order to fully support the LispNumber object, the kernel function that determines whether two objects are the same needs to know about LispNumber; this is required for valid behaviour. Pattern matching, for instance, uses comparisons of this type, so comparisons are performed often and need to be efficient.

The other functions working on numbers can, in principle, call the String() method, but that induces conversions from BigNumber to string, which are relatively expensive operations. For efficiency reasons, the functions dealing with numeric input should call the Number() method, operate on the BigNumber returned, and return a LispNumber constructed with a BigNumber. A function can call String() and return a LispNumber constructed with a string representation, but it will be less efficient.

Precision tracking inside LispNumber
There are various subtle details when dealing with precision. A number gets constructed with a certain precision, but a higher precision might be needed later on. This is why the Number() method takes an aPrecision argument.

When a BigNumber is constructed from a decimal string, one has to specify a desired precision (in decimal digits). Internally, BigNumber objects store numbers in binary and will allocate enough bits to cover the desired precision. However, if the given string has more digits than the given precision, the BigNumber object will not truncate it but will allocate more bits so that the information given in the decimal string is not lost. If later the string representation of the BigNumber object is requested, the produced string will match the string from which the BigNumber object was created.

Internally, the BigNumber object knows how many precise bits it has. The number of precise digits might be greater than the currently requested precision. But a truncation of precision will only occur when performing arithmetic operations. This behavior is desired, for example:

In> Precision(6)
Out> True;
In> x:=1.23456789
Out> 1.23456789;
In> x+1.111
Out> 2.345567;
In> x
Out> 1.23456789;
In this example, we would like to keep all information we have about x and not truncate it to 6 digits. But when we add a number to x, the result is only precise to 6 digits.

This behavior is implemented by storing the string representation "1.23456789" in the LispNumber object x. When an arithmetic calculation such as x+1.111 is requested, the Number method is called on x. This method, when called for the first time, converts the string representation into a BigNumber object. That BigNumber object will have 28 bits to cover the 9 significant digits of the number, not the 19 bits normally required for 6 decimal digits of precision. But the result of an arithmetic calculation is not computed with more than 6 decimal digits. Later when x needs to be printed, the full string representation is available so it is printed.
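
A minimal sketch of this lazy-conversion scheme might look as follows. It is illustrative only: the class below merely approximates the behaviour described, and BigNumberStub stands in for the real BigNumber class:

#include <algorithm>
#include <memory>
#include <string>

struct BigNumberStub {              // stand-in for the real BigNumber class
    BigNumberStub(const char* s, long digits) { /* parse s to 'digits' decimal digits */ }
};

class NumberAtomSketch {
public:
    explicit NumberAtomSketch(const std::string& s) : iString(s) {}

    // Convert the cached string to a number on first use; allocate at least
    // aPrecision digits, but never fewer than the string itself carries
    // (its length is a safe over-estimate), so no information is lost.
    BigNumberStub& Number(long aPrecision)
    {
        if (!iNumber) {
            long digits = std::max<long>(aPrecision, (long)iString.size());
            iNumber.reset(new BigNumberStub(iString.c_str(), digits));
        }
        return *iNumber;
    }

    // The exact text the atom was created from, available for printing.
    const std::string& String() const { return iString; }

private:
    std::string iString;                      // e.g. "1.23456789"
    std::unique_ptr<BigNumberStub> iNumber;   // created lazily
};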

If now we increase precision to 20 digits, the object x will be interpreted as 1.23456789 with 12 zeros at the end.

In> Precision(20)
Out> True;
In> x+0.000000000001
Out> 1.234567890001;
This behavior is more intuitive to people who are used to decimal fractions.


The Yacas script compilation system

Yacas currently comes with a compiler that can convert scripts to C++ code. This facility will be used in the future to move as many functions as possible from the kernel into scripts.

The advantages of this approach are:

The disadvantages are:

To invoke the compiler from within Yacas, one can call

CompileCpp(scriptfile,pluginname);

Here scriptfile needs to be the full path to a script file, and pluginname is the base name for the plugin. This function creates a file $(pluginname).cpp, a C++ source file that can be compiled into a plugin.


Development of scripts that get compiled to plugins


Disabling plugin loading

The scripts that come as part of the distribution are static as far as the end user is concerned. They are compiled into plugins once, when Yacas is built, and after that the user just uses the binary version of Yacas.

For the developer, however, it is a nuisance to have to recompile the scripts each time something changes.

When working on scripts that need to be compiled, it is better if the scripts are loaded instead of the plugins. To achieve this, a programmer can create a custom initialization script containing:

Set(LoadPlugIns,False);
Use("yacasinit.ys");

and pass this file as argument to the --init command line flag.


Language constructs supported by the compiler

The language constructs supported depend on two things:

The calling sequence for Defun is:

Defun(FunctionName,Arguments)Body;

For example, in a script using the infix grammar, Defun can be used as:

Defun("add",{x,y}) MathAdd(x,y);

Defun is defined entirely in terms of functions supported by the kernel, so it can be the very first function defined in the initialization script. As such it can be defined even before plugins get loaded.

There is one additional limitation: compiled code can currently only call functions with a fixed number of arguments whose prototype is already known (the compiler needs to know that the function takes a fixed number of arguments, and how many). This includes the functions in the kernel, and those in scripts and plugins loaded before this script or plugin was loaded. If a function needs to be called that is defined later in the script, or defined in another script that is compiled, a forward declaration of the function can be made through:

ForwardDeclareFixedFunction(name,nrArguments);

Functions defined in scripts (whether interpreted or compiled), functions with a variable number of arguments, and macros can be called using the function ApplyPure. For instance, to invoke addition as defined in the scripts one could call

ApplyPure("+",{a,b});

Unfencing a routine does not currently work in compiled mode. The preferred method here would be to use macros (which also do not work yet); macros can be inlined and thus optimized further.

Constructs using LocalSymbols do not work yet either, although this should be fixed soon. LocalSymbols is the mechanism used in Yacas to limit access to resources.

Global variables also do not work yet: a global variable used in a compiled script returns itself unevaluated.

Finally, groups of transformation rules cannot currently be converted to compiled code. This is only a problem in cases where the transformation rules dominate performance. In all other cases one can write:

f(x_IsNumber,y_IsNumber) <-- fcompiled(x,y);

with fcompiled defined in some script that gets compiled.

Other than that, all kernel functions with a fixed number of arguments are supported, as well as some macros and some functions and macros with a variable number of arguments.

This will be extended over time.

There is specific code in the compiler that handles the compilation of specific transformation rules (in order to make life easier for the people writing the code that gets compiled). One example is an expression -n where n is a number. Officially, this would have to go through the transformation rules, where one rule would determine that it can be replaced by evaluating MathNegate(n). The compiler makes this transformation directly.


Bootstrapping scripts as plugins into Yacas

The file yacasinit.ys contains lines like:

TrapError(DllLoad("libmath"),Use("base.rep/math.ys"));

This tries to load the plugin named libmath, and if that fails, it loads the script base.rep/math.ys instead. So until the plugin exists, Yacas uses the script file from which the plugin is built.

The first time the Yacas executable is built, the plugins are not yet available, but compiling the plugins requires Yacas itself. Being able to use the scripts instead of the plugins solves this chicken-and-egg problem: at first there are only the scripts, and Yacas will be a little slower, but it will work and can be used to compile the scripts to plugins, after which it will be faster.

The scripts that get compiled are grouped in the base.rep/ subdirectory of the scripts/ directory.


Steps to make a compiled script

The steps involved in making a script that can get compiled are:


Internal workings of the compiler

The compiler targets the instruction set described below, writing out the routines one by one in terms of these instructions. The compiled functions have the same interface as the core kernel routines and use the same stack in the same way. The code in a routine pushes arguments on the stack and then calls other routines, or it can jump (conditionally or unconditionally). In addition there is the option of reserving as many "registers" for the routine to use as necessary. This section describes the instruction set and the calling convention used.


The Yacas calling convention

The environment passed in to a routine has a stack. The routine also gets passed a stack position, sp. sp points to the slot on the stack that will receive the result to be returned by the routine. For interpreted routines this return slot stack[sp] is initialized with the expression being evaluated, so that if an error occurs a useful error message can be generated (most importantly, this expression carries the name of the function that is executing). For compiled functions this slot is initialized to NULL, so that the error handling code knows it is failing in compiled code.

For a function that accepts n arguments the following n slots on the stack contain the values of the arguments (which have already been evaluated). Thus for a function that accepts 2 arguments, stack[sp] is the slot for the return value, and stack[sp+1] and stack[sp+2] contain the two arguments to the function.

The routine that called this function is responsible for popping the arguments off the stack. This model fits nicely with the evaluation order: before a function gets applied to its arguments, the arguments are evaluated first, each evaluation leaving a new result on the stack; then the function is applied to this set of arguments.

The called routine can infer the number of arguments on the stack because the stack also knows its size, stack.size.

The result slot is referred to with RESULT, and the arguments are referred to through ARGUMENT(i) where i is a value between 1 and the number of arguments n.


Registers

In essence, the stack used by a routine starts at stack[sp]. The expression ARGUMENT(i) is used to access stack[sp+i]. ARGUMENT(0) is special: it is the slot for the return value of the function, and it is also defined as RESULT. ARGUMENT(i) for i from 1 to n accesses the n arguments. In addition the function can start off by adding m extra reserve slots on the stack, which can be used for storage of additional information. We will call these "registers".

If m registers are needed, they can then be referred to through ARGUMENT(n+i) where n is the number of arguments, and i has a value between 1 and m.

Currently a separate register is created for each local variable declared through Local(...). This is done at the beginning of the execution of the routine: the necessary number of slots is pushed onto the stack for use as registers. At the end the function must pop them again, because the caller does not know about the registers the function uses.

The stack starting at stack[sp+n+m+1] is used to pass arguments to other functions called from within this function.
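
In C++ terms, the layout amounts to the following sketch (it mirrors the description above but is not code copied from the kernel sources):

// For a routine entered at stack position sp, with n arguments and m registers:
//   stack[sp]                        the return value slot (RESULT, ARGUMENT(0))
//   stack[sp+1] .. stack[sp+n]       the n evaluated arguments (ARGUMENT(1..n))
//   stack[sp+n+1] .. stack[sp+n+m]   the m local "registers"
//   stack[sp+n+m+1] and up           arguments for routines called from here
#define ARGUMENT(i)  (stack[sp + (i)])
#define RESULT       ARGUMENT(0)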


The compiler instruction set

A function declaration starts with

VmFunction(name)

where name represents the calling name.

At the end, a function is closed through

VmFunctionEnd()

Objects can be pushed onto a stack through

VmPushNulls(nr)
VmPush(register)
VmPushConstant(constant)

When pushing an object onto the stack, it is stored in the slot stack[stack.size], and stack.size is incremented by one afterwards. The values that can be pushed are NULL place-holders (VmPushNulls pushes nr of them at once), the contents of a register (VmPush), and constants (VmPushConstant).

Objects can be popped from the stack through

VmPop(i)

When this instruction is invoked, i arguments are removed from the stack.

Registers can be initialized through

VmInitRegister(register,constant)

where register can have the form ARGUMENT(i) or RESULT. This is usually called at the place where the local variable was declared (through the Local(...) macro). Variables that have no value evaluate to themselves; this can be simulated by setting the variable to a constant.

After some evaluation has taken place, stack[stack.size-1] usually contains the result of the calculation. This result can be stored in a register through

VmSetRegister(register)

where again register usually has the form ARGUMENT(i). This operation does not pop the value off the stack, it just copies it to the register. register could in principle also be RESULT.

At the end of a function for instance the result of the last calculation is usually stored in the result slot on the stack through a call VmSetRegister(RESULT).

In principle, all sorts of optimizations are possible by smart re-use of the slots RESULT and ARGUMENT(i) on the stack at places where they are not needed. The values stored in the argument slots could be replaced with other values, combined with a jump statement that jumps back to the beginning of the routine. A factorial could, for instance, be implemented without extra registers: the RESULT slot keeps the current result, while ARGUMENT(1) is replaced by ARGUMENT(1)-1, and the routine terminates when ARGUMENT(1) equals one.

Control flow can be controlled through labels and (conditional) jumps to these labels. A label is defined through

VmLabel(label)

where label is some unique name. One can jump to that label from any other part of the routine through

VmJump(label)

In addition, one can do a conditional jump with

VmJumpIfTrue(label)
VmJumpIfFalse(label)

These instructions perform a jump if the value stored in stack[stack.size-1] is True or False, respectively. They do not pop the value from the stack.

Calling a function (after its arguments have been put on the stack) can be done with

VmCall(fname,nrArgs)

where fname is the calling name of the routine and nrArgs the number of arguments that were pushed.

The instruction set has one instruction to build a list from elements currently on the stack. The procedure to use this would start by pushing a constant, List, onto the stack, and then evaluating n expressions. The instruction to combine this into a list is

VmConsList(n)

It combines the n values on the stack, together with the initial constant List, into a list, and stores the result in the slot where List was initially stored. This operation is expensive in stack use, but it is intended to be used for short lists anyway.
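
For example, building the two-element list {a,b}, with a and b taken from the first two argument slots, might be emitted as something like the following hypothetical sequence (the constant is written symbolically here for readability):

  VmPushConstant(List)   // the head of the list-to-be
  VmPush(1)              // first element, from ARGUMENT(1)
  VmPush(2)              // second element, from ARGUMENT(2)
  VmConsList(2)          // collapse into {a,b}, left in the slot where List was pushed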


An example

The following code, when compiled

Defun("fac",{n})
[
  If(Equals(n,1),1,MathMultiply(n,fac(MathAdd(n,-1))));
];

results in

VmFunction(fac)
  VmPushNulls(1)
  VmPush(1)
  VmPushConstant(0)
  VmCall(LispEquals, 2 )
  VmPop(2 )
  VmJumpIfFalse(C20 )
  VmPop(1)
  VmPushConstant(0)
  VmJump(C21 )
VmLabel(C20 )
  VmPop(1)
  VmPushNulls(1)
  VmPush(1)
  VmPushNulls(2)
  VmPush(1)
  VmPushConstant(1)
  VmCall(LispAdd, 2 )
  VmPop(2 )
  VmCall(Compiled_fac, 1 )
  VmPop(1 )
  VmCall(LispMultiply, 2 )
  VmPop(2 )
VmLabel(C21 )
  VmSetRegister(0)
  VmPop(1 )
VmFunctionEnd()

Only the indented lines generate instructions, so this would be 23 instructions.


Execution

The compiler currently supports emitting a C++ file with macros defined for each of the instructions described above.

On the other end of the spectrum is byte code: a conversion to byte code, for execution by a virtual machine, could also be performed. Code could then be compiled dynamically while Yacas is running, platform-independently, at a small performance cost. The byte code loader would have to prepare the constants at load time and link the function calls. One extra advantage is the compactness of the generated byte code.

If size is important, a mix of C++ with a byte code interpreter is possible. This leaves linking to the actual C++ linker. One has a C++ file which can be compiled into a plugin, but which defines the function through a byte array with the byte code. This would come at a performance cost, since interpreting the byte code would slow the system down.


Syntax conversion for programs written in other languages


Introduction

At the time of writing, Yacas is still in its infancy. There are systems available that have many more years of development behind them. Maxima, for instance, has a license that is compatible with the Yacas license (both are GPL). These systems already implement features and algorithms that Yacas has yet to implement.

The big difference with these other systems, in this respect, is the programming language their code is written in. All these algorithms would have to be re-implemented, simply because Yacas runs on top of the Yacas scripting language, which is different from the languages in which the larger systems are implemented.

Another difference is of course that the functions (algorithms) defined in these systems are built on top of other functions defined in the same systems. It would thus at first seem that one cannot simply take a small part of a system without taking the entire rest of the system with it.

This chapter addresses the first problem, where the algorithms are already defined in other languages, but with a different syntax. A simple scheme is presented here that should allow conversion of algorithms written in one syntax to the Yacas syntax. The aim is to preserve the structure of the original files as much as possible; comments, in particular, should be left untouched. Because of this, standard parsing and compilation techniques are not fully usable, as those schemes usually discard comments as early as possible, and their intermediate data structures offer no place for storing comments with sub-expressions.

The potential is huge; hundreds of man-years of research and development went into the other systems, and it would thus be a big boost to Yacas to be able to use these facilities. For this the licensing issues with these systems need to be resolved too.

This chapter ends with a discussion of the second problem: being able to take over parts of other systems, as opposed to having to take over an entire system.

There are two steps to converting code so that it can be used in another system: converting the syntax, and converting the semantics.

Converting the syntax is usually a large and cumbersome task which, as this chapter will try to show, can easily be explained to a computer. The second step, changing the semantics, might require hand-work.


Rationale

This is an interesting project in the domain of Yacas for several reasons:


General approach

There are three steps to processing source code as data, and they come close to the read-eval-print concept from LISP: reading the source text in as a stream of tokens (READ), transforming those tokens (EVAL), and writing the resulting tokens back out as text (PRINT).

The general steps taken by the interpreter are as follows:


Applications

The applications are essentially encoded in the middle process, where input tokens are converted to output tokens. For the support of one programming language, the READ and PRINT steps will stay the same, but the EVAL step will depend on what the aim is. The possible operations that the EVAL step could perform include (but are not limited to):

Converting to another programming language is a difficult task, and the author is not aware of other projects that have succeeded in this arena. There are alternative approaches; for instance, one could simply write a parser that converts to an internal format, and write a layer of macros to emulate the other programming language. However, the chances are that some handwork will be required to get the code into the desired form. In this case, an initial conversion into a form that is accepted by the normal parser of the programming language already helps a lot.


Implementation


Organization of the source files

The source files are organized in layers: in general it should only be necessary to load the [language][operation].ys file. This file should include the file [language]reader.ys, which in turn should include the file codetransform.ys.


Optimizers


Examples


Compatibility modes

The first step is converting LISP code to code that can be parsed by the Yacas interpreter. The next step is to get the code to run at all. An initial method for this is to define macros that 'simulate' the Common Lisp environment.

The second step is to convert the code to the native Yacas coding standards. There are two things to consider: the general coding standards of the entire system (including the use of the facilities provided by Common Lisp), and the coding standards used by individual programmers; this is an issue since, in general, many programmers have worked on these large systems. For this it might be necessary to write specific optimizers for specific routines, as opposed to optimizers for the entire system. The alternative is of course to optimize by hand.


Problems with Common Lisp code interpretation

There are a few problems with Common Lisp that make it difficult to emulate it inside Yacas:


Parsing a language

After the tokenizing phase described in the previous section, a parser needs to be defined in order to read in expressions and manipulate them. The fact that all tokens are already in memory makes the parsing process a little easier.

The module genericparser.ys defines some functions that allow definition of a grammar. The algorithm implemented is a recursive-descent parser. This parser can be used with the tokenizer implemented in the rest of the code (that is, the code maintaining indices into the arrays of tokens).

The code is not concerned with whether the grammar is ambiguous per se. Interaction between the tokenizer and the parser might be necessary for some languages (like C++) if the parser is to read preprocessor directives and template definitions correctly (the tokenizer can temporarily switch to another set of token definitions).

The top-level routine to call is:

MatchGrammar(grammar);

where grammar is defined as follows: a grammar is either a string (a literal token to match), an atom naming another grammar defined through DefineGrammar, a call AcceptToken(tokenclass) that matches any token of the given class, or a sentence.

A sentence has the form:

{Tag,grammar_1,grammar_2,...,grammar_n}

The result returned will be a list with Tag as the first element (so that pattern matchers can recognize it later), and one element for each grammar matched.

Grammars can be defined with the DefineGrammar macro. The syntax is:

DefineGrammar(name,grammar);

where name is an atom by which other grammars can refer to this grammar, and grammar is the grammar to recognize.

As a very simple example, consider the following definition of a grammar, defined in examplegrammar1.ys:

Use("codecheck.rep/codetransform.ys");
Use("codecheck.rep/genericparser.ys");

Expr'Tokens :=
{
  {"^[\\s]+",Spaces},
  {"^//.*?\\n",Comments},
  {"^[\\(\\)]",Brackets},
  {"^[0-9]+",Atoms},
  {"^[\\+\\-\\*]",Operators},
  {"^[\\;]",Delimiters},
  {"^[a-zA-Z_][a-zA-Z0-9_]*",Atoms},
  {"^.+",UnParsed}
};
DefineGrammar(statement,
  {
    {Statement,expression,";"}
  });
DefineGrammar(expression,
  {
    {Expr2,term,"+",expression},
    {Expr3,term,"-",expression},
    {Expr4,term}
  });
DefineGrammar(term,
  {
    {Term5,factor,"*",term},
    {Term6,factor}
  });
DefineGrammar(factor,
  {
    {SubExpression,"(",expression,")"},
    {SimpleAtom,AcceptToken(Atoms)}
  });

100 # Expr'Test'TokenMap(_rest) <-- rest;
Expr'Test'PreProcess():= [ ];
Expr'Test'PostProcess():=
[
  Local(nrInputTokens);
  nrInputTokens:=Length(CodeTrans'input);
  tokenIndex:=0;
  NextToken();
  While(tokenIndex < nrInputTokens)
  [
    Local(expression);
    expression := MatchGrammar(statement);
    MakePostFix(GetInputTokens(expression));
  ];
];
10 # MakePostFix({Statement,_expression,";"} ) 
   <--
   [
     ConvertExpression(expression);
     Echo(";");
   ];
10 # ConvertExpression({Expr2,_term,"+",_expression})
   <-- 
[
  ConvertExpression(term);
  Space(1);
  ConvertExpression(expression);
  Space(1);
  WriteString("+");
];
10 # ConvertExpression({Expr3,_term,"-",_expression}) 
   <-- 
[
  ConvertExpression(term);
  Space(1);
  ConvertExpression(expression);
  Space(1);
  WriteString("-");
];
10 # ConvertExpression({Expr4,_term}) <-- ConvertExpression(term);
10 # ConvertExpression({Term5,_factor,"*",_term})
   <-- 
[
  ConvertExpression(factor);
  Space(1);
  ConvertExpression(term);
  Space(1);
  WriteString("*");
];
10 # ConvertExpression({Term6,_factor}) <-- ConvertExpression(factor);
10 # ConvertExpression({SimpleAtom,_atom}) <-- WriteString(atom);
10 # ConvertExpression({SubExpression,"(",_expression,")"}) <-- ConvertExpression(expression);
100 # ConvertExpression(_rest) <-- Echo("UNHANDLED! ",rest);

This code defines a simple grammar with addition, subtraction and multiplication. Statements are expressions terminated by a ";". Brackets can be used to define a sub-expression. The code then proceeds to convert the input infix expressions to postfix, as the following example interaction shows:

In> Load("examplegrammar1.ys")
Out> True;
In> FromString("A+B*2;")CodeTransform("Expr","Test")
A B 2 * +;
A+B*2;Out> True;
In> FromString("(A+B)*2;")CodeTransform("Expr","Test")
A B + 2 *;
(A+B)*2;Out> True;

This example also clearly shows how transformation rules would map naturally to the lists returned by the parser. The variable parts of the sentences can just be matched to variables.

The function also outputs the original input, which has not been transformed. This facility would allow converting the code in place.


GNU Free Documentation License

Version 1.1, March 2000
Copyright (C) 2000 Free Software Foundation, Inc.

59 Temple Place, Suite 330
Boston, MA, 02111-1307
USA

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.


Preamble

The purpose of this License is to make a manual, textbook, or other written document ``free'' in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.

This License is a kind of ``copyleft'', which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.

We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.


Applicability and Definitions

This License applies to any manual or other work that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. The ``Document'', below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as ``you''.

A ``Modified Version'' of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.

A ``Secondary Section'' is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (For example, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.

The ``Invariant Sections'' are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License.

The ``Cover Texts'' are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License.

A ``Transparent'' copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, whose contents can be viewed and edited directly and straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup has been designed to thwart or discourage subsequent modification by readers is not Transparent. A copy that is not ``Transparent'' is called ``Opaque''.

Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML designed for human modification. Opaque formats include PostScript, PDF, proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML produced by some word processors for output purposes only.

The ``Title Page'' means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, ``Title Page'' means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text.


Verbatim Copying

You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.

You may also lend copies, under the same conditions stated above, and you may publicly display copies.


Copying in Quantity

If you publish printed copies of the Document numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a publicly-accessible computer-network location containing a complete Transparent copy of the Document, free of added material, which the general network-using public has access to download anonymously at no charge using public-standard network protocols. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.


Modifications

You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:

If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles.

You may add a section entitled ``Endorsements'', provided it contains nothing but endorsements of your Modified Version by various parties -- for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.

You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.


Combining Documents

You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice.

The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections entitled ``History'' in the various original documents, forming one section entitled ``History''; likewise combine any sections entitled ``Acknowledgements'', and any sections entitled ``Dedications''. You must delete all sections entitled ``Endorsements.''


Collections of Documents

You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.


Aggregation With Independent Works

A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, does not as a whole count as a Modified Version of the Document, provided no compilation copyright is claimed for the compilation. Such a compilation is called an ``aggregate'', and this License does not apply to the other self-contained works thus compiled with the Document, on account of their being thus compiled, if they are not themselves derivative works of the Document.

If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one quarter of the entire aggregate, the Document's Cover Texts may be placed on covers that surround only the Document within the aggregate. Otherwise they must appear on covers around the whole aggregate.


Translation

Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License provided that you also include the original English version of this License. In case of a disagreement between the translation and the original English version of this License, the original English version will prevail.


Termination

You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.


Future Revisions of This License

The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/ .

Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.


ADDENDUM: How to use this License for your documents

To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page:

Copyright (C) YEAR   YOUR NAME. Permission is
granted to copy, distribute and/or modify this
document under the terms of the GNU Free
Documentation License, Version 1.1 or any later
version published by the Free Software Foundation;
with the Invariant Sections being LIST THEIR
TITLES, with the Front-Cover Texts being LIST, and
with the Back-Cover Texts being LIST. A copy of
the license is included in the section entitled
``GNU Free Documentation License''.

If you have no Invariant Sections, write ``with no Invariant Sections'' instead of saying which ones are invariant. If you have no Front-Cover Texts, write ``no Front-Cover Texts'' instead of ``Front-Cover Texts being LIST''; likewise for Back-Cover Texts.

If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.