Viewed from the top level, Yacas is just one function: Yacas is called with one, or possibly two, arguments; or, from a different point of view, a request is made to the Yacas system. How the function performs the action is implementation-dependent, but at the top level it can be viewed as one call. From c++ or Java, it might look like
yacas.Evaluate("expression");
Or from Lisp,
(yacas expression environment)
The environment argument is optional; it can be a global resource that doesn't have to be passed in.
For instance, taking the derivative of a function could look like
yacas.evaluate("D(x)Sin(x);");
when in c++, or
(yacas '(D x (sin x)) )
when in Lisp. The quote in Lisp makes it clear that the argument, the parameter to the request, is actually data, and not program. This is an important distinction which has to be made clear in Lisp, but is clearer in other environments, due to the form of data being distinct from the form of program.
To drive home this point, (* a 1) would raise an error when given to the Lisp eval function if the variable a is not bound to a value, but would be perfectly valid data when passed to a CAS. In fact, one would like it to return a in this case. The * operator is actually a function call when invoked from Lisp ( (* a 1) is a program), but is data when passed to a CAS.
One can conclude that Yacas, an engine for simplifying symbolic expressions and performing calculations, is just one huge function that performs that task, and has one entry point: a function call with one argument, the data of the request.
Of course this is a bit of an oversimplification. We are not just interested in evaluating and simplifying expressions. When Yacas is embedded inside an application that offers a user interface, one would want to have access to all the operations allowed, and their arguments. Furthermore, for a debugger, information about the internal state of the interpreter is needed (values of global variables, etc.).
At the end of the day we want to arrive at a language where we can implement an algorithm in a few lines of code, and have the compiler generate efficient code for it.
Given the strongly typed Lisp we are using here, how is it different from using c++/Java directly?
The difference is in the syntax. Algorithms for manipulating symbolic expressions are written down much more naturally in a language that supports these operations natively. The code to perform symbolic manipulation operations can look surprisingly complex when written in c++ or Java. Later on, a compiler for the Lisp dialect developed here will be described, one that tries to compile to human-readable source code. Although the results are somewhat readable, they are a lot less readable than the original Lisp or Yacas script code they were generated from.
Convenient notation aids the programmer greatly in thinking about a problem. It is not unusual in mathematics to devise a new notation scheme to make life easier. For example, the bra-ket notation, where <a|b> stands for the inner product defined as

Integrate() a*b

has the added advantage of being able to write down projection operators of the form |b><a|, which would be a lot more cumbersome to deal with without this notation.
Flexibility is not lost in a strongly typed system; one can define a type EXPRESSION, which (naturally) represents an expression that can be manipulated. This could even be the default type for variables for which types are not specified, to make life easier for the programmer. This is not advisable, though, as the section describing the compiler will make clear.
This section will first introduce the reader to the basic concepts of Lisp, and then proceed to show how the yacas function can be implemented in Lisp in general. Clarity of the concepts explained will be favored over efficiency and performance. A small subset of Lisp will be used, so it should be easy to port the code to various Lisp dialects, or to write a little Lisp interpreter on which to implement the yacas function.
The first chapter briefly describes a small Lisp dialect. A small subset of what is usually found in the average Lisp interpreter is described. The Lisp dialect is different in small subtle ways from normal Lisp implementations. Simplicity is put first.
The Lisp dialect presented here was deliberately chosen to be different from the Lisp dialect that represents Yacas expressions. This separation is made explicitly, on purpose, to facilitate executing algorithms written out in Yacas syntax from any other language. All the Yacas-specific details have to be handled by the routine performing these algorithms, independent of language constructs that might support them.
Even though in Lisp program and data look the same, for Yacas they are distinctly different in that data will likely not be evaluated by the default evaluator. It is important to keep the distinction between data and program.
Also, the typed Lisp dialect developed in this book is easily translated to other languages. This removes the argument that Lisp is less efficient; the code can be converted to efficient c++/Java code. What is more, the code is easy to parse and transform into these languages, giving a language-independent specification of algorithms.
There are various sources on the internet describing the early history of Lisp.
The definition of an expression can be given as follows: an expression is either an atom or a list of expressions. So, words like foo and bar are atoms. A list is represented by writing the expressions in the list out with spaces in between them, and brackets around them. So, (foo bar) would be a list with two elements, the atoms foo and bar.
In Lisp, each expression evaluates to a value (if no error occurs). When evaluating an atom, the atom can either be a 'special' atom that evaluates to itself, or otherwise a lexical word that refers to a variable. When it is a variable, the Lisp system looks it up in what is called the 'environment' the Lisp interpreter is evaluating in, and returns the value the variable is bound to. When a variable is not bound, an error is raised. The notable difference is that a normal Lisp system raises an error when an expression tries to evaluate a variable that is not bound to a value, whereas in a CAS the variable is a free parameter in some expression.
When a list of expressions is encountered, the first element in the list is a function that should be called, and the rest of the expressions in the list are the arguments. The interpreter handles the list by first evaluating the arguments (the rest of the list), and then calling the function with the evaluated arguments as parameters. There are exceptions to this; the function quote returns its argument unevaluated. Instead of writing (quote <expression>), a common shorthand is '<expression>, i.e. the expression with a quote in front of it. Quotes are used a lot, to tell the system that data is being passed in.
The following is an interaction with our little Lisp dialect, using the commands setf, quote and eval. In the usual Lisp dialects, when variables are unbound, an error is raised:
In> a
Error, variable has no value : a
Lisp dialects are usually case-insensitive (the atom a is the same as A).
Here evaluating the atom a resulted in an error, because it had not been assigned a value.
In> (setf a '(this is a list))
Out> (this is a list )
In> a
Out> (this is a list )
Here we assigned the value (this is a list) to the variable a, and a now evaluates to that list. The system looked up the value bound to the variable a without re-evaluating it.
In> 'a
Out> a
In> (eval a)
ERROR: function not defined : this
Out> nil
With quote we can withhold evaluating a, and with eval we can re-evaluate it. Since the head of the list, this, is not a defined function, an error is raised.
Note that evaluation as defined in Lisp dialects is less suitable for a CAS. A CAS should return such expressions unevaluated; the unevaluated expression then represents some function which has not been specified yet. In a Lisp interpreter it is nice to have the interpreter raise an error, but in a CAS, where the expression is data, it should not be simplified further.
For atoms that refer to variables that are not yet bound, the variables should be returned unevaluated by default, and functions that are called but not defined should be returned as the same function call, but with their arguments evaluated. Evaluation as defined in most Lisp dialects is thus not suited to simplifying expressions. Later in this book an evaluator will be developed in Lisp itself that is closer to the sense of 'simplification' required for a CAS.
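As an illustration of the intended behavior, the following sketch shows such a CAS-style evaluation step, written with the functions introduced later in this chapter. The predicates boundp (is this atom bound to a value?) and fboundp (is this function defined?) are assumptions made for this example, not part of the dialect:

(defun cas-eval (expr)
  (cond ( (atom expr)
          (if (boundp expr) (eval expr) expr) )  ; unbound atoms are returned as-is
        ( (fboundp (first expr))
          (eval expr) )                          ; defined functions are evaluated
        ( t                                      ; undefined: keep head, evaluate arguments
          (cons (first expr) (eval-args (rest expr))) ) ) )

(defun eval-args (args)
  (if (equal args nil)
      nil
      (cons (cas-eval (first args)) (eval-args (rest args))) ) )

Given this, (cas-eval '(* a 1)) with a unbound and * not defined as a function would return (* a 1) itself, rather than raising an error.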
Sometimes the user wants to refer to a function without having to define it, because transformation rules for the function can be given without explaining explicitly how to go about evaluating the function. Certain functions have properties, and sometimes it is easier to work with the properties than to evaluate them explicitly.
For the rest of this book, it is assumed that the semi-colon, ;, means that the rest of the input line up to the end of line or end of file is a human readable comment, not to be executed.
Note that building a Turing machine is just a theoretical exercise. Turing machines don't concern themselves with efficiency. Algorithms can be run in finite time, but it will most certainly not be the most efficient way to execute an algorithm on a real computer.
Methods to make execution more efficient break down into two steps:
This section describes a minimum set of Lisp primitives needed, theoretically, for building a Turing machine. After that, numeric operations are introduced first, before implementing a Turing machine. Fundamentally, the language does not need to support arithmetic in order to be able to implement any algorithm, as other constructs using lists can be used to represent and work with numbers, but leaving arithmetic out does make the effort rather cumbersome.
The following sections will describe the following minimal set of commands:
In the next sections, whenever a function call is explained, the arguments to the function will be delimited by < and >. Arguments are evaluated unless stated otherwise.
The default set of functions implemented in the core of a Lisp interpreter can be divided between function-like and macro-like functions, in that when a function-like (defun) function is called, its arguments are evaluated first. Examples of this are cons, first and rest.
When calling macro-like (defmacro) functions, however, the arguments are not evaluated first, and the variables defined in the calling environment are visible while the expressions passed in as arguments are evaluated. A macro can be viewed as a bit of code that is pasted in at the place where it is called, by the interpreter (in fact, this is what a compiler can do).
The defmacro function allows the programmer to expand the language by adding functions that act as if they were defined in the core part of the interpreter.
Macros are not necessary, strictly speaking. Macros clean up the syntax of a program, in that they allow a programmer to write cleaner and briefer code. For example, the expression
(if (equal a b) () (print a))
could be written more concisely as
(unless (equal a b) (print a))
if unless were defined. But it would have to be defined as a macro, since when unless is called, its arguments should not be evaluated first. unless can be defined as:
(defmacro unless (predicate body)
  (if predicate nil body) )
This makes unless work, as the following interaction shows:
In> (unless t (print 'SOMETHING))
Out> nil
In> (unless nil (print 'SOMETHING))
SOMETHING
Out> SOMETHING
Especially when code is generated, and does not have to be human-readable, macros are not necessary. But they are a nice feature nonetheless, and they have to be implemented in the interpreter; they cannot be faked with normal defun functionality.
Examples involving defun:
In> (defun foo (atom) (cons atom (list rest)))
Out> foo
In> (setf rest 'end)
Out> end
In> (foo 'a)
Out> (a end )
In> (foo 'b)
Out> (b end )
In addition, nameless (pure) functions can be made with lambda and macro, which work exactly like their defun and defmacro counterparts:
Examples:
In> (setf add '(lambda (x y) (+ x y)))
Out> add
In> (add 2 3)
Out> 5
Note that pure functions, the lambda and macro expressions described here, are not strictly necessary if we have equivalents for defun and defmacro. But pure unnamed functions do allow for convenient notation, without cluttering the global name space. They also allow for locally defined functions. If they are required, they have to be part of the interpreter, since they cannot be faked with defun and defmacro.
An interesting option is to not define the operations defun and defmacro, but just setf, lambda and macro. This set is just as powerful, and in fact allows you to create defun and defmacro from them, as the sketch below illustrates. We choose defun and defmacro over the route of defining them in terms of setf, lambda and macro, because lambda and macro are not strictly required to build a CAS. When implementing a more mature Lisp interpreter, it is wise to define lambda and macro, for the convenience of the programmer.
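For example (a sketch; my-add is an illustrative name), under that route a named function is nothing more than a variable bound to a pure function, so the following two definitions behave the same:

(defun my-add (x y) (+ x y))          ; named function definition
(setf my-add '(lambda (x y) (+ x y))) ; the same, as a variable bound to a pure function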
To take a list apart:
Examples, taken from an interaction with clisp:
In> (first '(a b c))
Out> a
In> (rest '(a b c))
Out> (b c )
To reconstruct a list:
Example taken from an interaction with clisp:
In> (cons 'a '(b c))
Out> (a b c )
In> (list (+ 2 3) 4 3)
Out> (5 4 3 )
One important function is assoc:
Examples:
In> (setf list '( ("name" "Peter") ("age" 28) ) )
Out> list
In> (assoc '"name" list)
Out> ("name" "Peter" )
In> (assoc '"age" list)
Out> ("age" 28 )
In> (assoc '"something" list)
Out> nil
In> ()
Out> nil
In> nil
Out> nil
In Common Lisp, nil is taken to mean false when a result is used as a boolean, and anything different from nil is taken to mean true. This is used as an efficient way to find items in lists: the result returned can either be a found item, or nil, which means nothing was found.
In> (rest '(a b))
Out> (b )
In> (rest '(a))
Out> nil
The author feels uneasy with having the nil atom overloaded as both an empty list and false when used in predicates, because it makes code harder to carry over to another language. In some cases nil means an empty list, and in some cases it means false, which complicates converting code, as the code has to be examined more closely for this. Yacas has a distinct set of atoms, True and False, precisely for this reason. A test to see if something is an empty list is still possible, by comparing the result of evaluating an expression with an empty list. Indeed, Yacas is not alone in this; Scheme (another dialect of Lisp) has distinct symbols for true and false, #t and #f respectively.
Two predicates are then needed for operating on lists: atom, which returns t if its argument is an atom, and listp, which returns t if its argument is a list. Note the difference in naming between atom and listp, which was taken from Common Lisp: in Common Lisp predicates are supposed to be suffixed with a 'p', but apparently not so with atom.
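Their behavior, in a hypothetical interaction following the conventions used so far:

In> (atom 'a)
Out> t
In> (atom '(a b))
Out> nil
In> (listp '(a b))
Out> t
In> (listp 'a)
Out> nil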
When setf is called, first the local variables are searched for a candidate variable to set, and after that the global variables. When entering a function body, a frame of local variables for that function is set up, and the parameter variables are bound to the arguments passed in to the function. let allows the programmer to declare additional local variables.
The following example sets up variables n and m and assigns them values locally. n is bound to a, and m bound to (b c).
In> (let ( (n 'a) (m '(b c)) ) (cons n m))
Out> (a b c )
In> n
ERROR: variable not defined : n
Out> nil
The expression part in the let form gets evaluated after the local variables have been assigned. Note that evaluation of the expressions in the let form is performed in the scope where the let form is called; thus expressions in the let form can use local variables from their surroundings.
Local variables that are declared in a certain scope where the variable already had a value mask the older declaration of the variable.
Note: the values that will be assigned to the local variables are evaluated before being assigned to the local variables. So the expressions in the assign-list cannot refer to other variables declared in the same assign-list.
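For example (a hypothetical interaction), the following fails because the expression for m refers to n, which is declared in the same assign-list:

In> (let ( (n 'a) (m (list n)) ) m)
ERROR: variable not defined : n
Out> nil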
This concludes the set of functions required for building up a computer algebra system. It is not a minimal set by any means, as a following section will show; if can be defined by using defmacro and cond, for instance.
The next section will first introduce arbitrary-precision arithmetic support. It is not strictly necessary for the interpreter to work, but it is the first concession we will make towards efficiency, as it is rather hard to find an algorithm that does not need arithmetic.
The next sections after that will continue by introducing a few building blocks that are needed to implement a CAS like Yacas.
We start with the representation of numbers. This section will only refer to arbitrary precision integers, floating point numbers will come later. The implementation of the Lisp dialect described here is designed to be easily extended that way.
Numbers are represented as atoms, sequences of characters. More specifically, they can be a sequence of digits from 0 to 9 (we will work in the decimal number system here, as that is what most humans are used to). Negative integers have a minus sign in front of them. So, 1234 and -1234 are atoms representing the integers 1234 and -1234 respectively.
The most basic arithmetic operations are then addition and negating a number:
In> (+ 1 2 3)
Out> 6
In> (- 5)
Out> -5
In> (+ 2 (- 3))
Out> -1
Together with a predicate to determine whether something is an integer (integerp), and an ordering operator, less than, denoted <, other arithmetic operations like multiplication, division, exponentiation etc. can be built.
Some examples using < and integerp.
In> (integerp 4)
Out> t
In> (integerp -4)
Out> t
In> (integerp 'a)
Out> nil
In> (integerp '(a b c))
Out> nil
In> (< 2 3)
Out> t
In> (< 3 2)
Out> nil
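With these primitives the remaining arithmetic operations can be bootstrapped. For example, multiplication of non-negative integers by repeated addition (a sketch with an illustrative name; a real implementation would provide this in the kernel, for speed):

(defun my-times (n m)
  (if (equal m 0)
      0
      (+ n (my-times n (+ m -1))) ) )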
Note that a computer algebra system requires arbitrary precision arithmetic, as for certain algorithms the intermediate results of a calculation can become big, even if the final result is small. So, a CAS should be able to do the following:
In> (+ 12345678901234567890999999999999999999 1)
Out> 12345678901234567891000000000000000000
The routines developed here will be built from the primitive building blocks described before. However, an implementation is free to implement these hard-coded in the core kernel for speed. This is advisable because the constructs developed here will be used often, and so there is a need for efficiency.
Note that the functions introduced so far are not minimal. Some can be defined in terms of others. They are defined here because they are easy to define in an interpreter without making the interpreter much larger, but are used often, so efficiency is required. For instance, given the cond and defmacro operations, it is possible to define an if operation:
In> (defmacro Newif (pred then else) (cond ( pred then) ( t else) ) )
Out> Newif
In> (Newif (equal 'a 'b) 'c 'd)
Out> d
In> (Newif (equal 'a 'a) 'c 'd)
Out> c
In> (if (equal 'a 'b) 'c 'd)
Out> d
In> (if (equal 'a 'a) 'c 'd)
Out> c
Now let us start building a set of new primitives using the ones introduced so far. We start by defining null, which returns t if its argument is an empty list:
In> (defun null (expr) (equal expr nil) )
Out> null
In> (null 'a)
Out> nil
In> (null ())
Out> t
In> (null nil)
Out> t
With this, we can define a length function which returns the length of a list:
In> (defun length (list) (if (null list) 0 (+ 1 (length (rest list))) ) )
Out> length
In> (length '(a b c))
Out> 3
In> (length ())
Out> 0
In addition to the already defined function first, we can define the functions second, third, etc., to make code easier to read:
In> (defun second (list) (first (rest list) ) )
Out> second
In> (defun third (list) (first (rest (rest list) ) ) )
Out> third
In> (setf lst '(a b c))
Out> lst
In> (first lst)
Out> a
In> (second lst)
Out> b
In> (third lst)
Out> c
and with this we can start to define functions for boolean expressions. First, let us define a not operator, which returns t if its argument is nil:
In> (defmacro not (a) (null a) )
Out> not
In> (not 'a)
Out> nil
In> (not nil)
Out> t
Then lazy and and lazy or operations can be defined:
In> (defmacro and (a b) (if a b nil))
Out> and
In> (defmacro or (a b) (if a t b) )
Out> or
These evaluate their first argument first, and decide to evaluate the second only if the first one does not give enough information to decide whether to stop. For instance, if the first argument to and evaluates to false (i.e. nil), then the result of the entire expression is nil; otherwise the result of evaluating the second argument is returned. or works the other way round: it evaluates the second argument only if the result of evaluating the first argument was nil.
Lazy and and or operators can be used to combine predicates into a larger predicate, but also to guard evaluation of an expression, letting it be evaluated only if a pre-condition is met. The unless operation could equally well have been defined by using or:
In> (defmacro unless (predicate body) (or predicate body) )
Out> unless
In> (unless t (print 'SOMETHING))
Out> nil
In> (unless nil (print 'SOMETHING))
SOMETHING
Out> SOMETHING
Most of these functions are convenience functions. They make code easier to read. Macros in effect allow you to extend the language.
The reasons for adding these functions fall into the following categories:
The commands described here can be implemented in the interpreter even though they are not strictly necessary, or they can be implemented in Lisp code using more basic primitives.
From the other side, exceptions in Java, for instance, cannot be ignored. A routine has to handle an exception, or promise to pass it off to the calling environment; but then all the functions calling that function must handle the exception too, resulting in much more typing work.
This concept can be put to great use in, for instance, a typing system, where it can be made very hard to pass an arbitrary-type object around, like the void * of c or c++, or a GenericLispObjectHugeTypeDeclarator in a Lisp system where type declarations can be enforced. In addition, it can be made harder to pass around wrong types by forcing hand-written type conversions. An object of a generic type can be passed in, but if it results in considerable extra effort on the part of the programmer to do type conversions in the code, the programmer might consider declaring the object with the correct type in the first place.
C++ does this the other way round; the language makes it particularly easy to do automatic type conversions between types, sometimes resulting in dangerous code.
A lot of algorithms working on data can get away with this sequential processing, performing an operation on an object, and then moving to the next object to perform the same operation on it.
The classic way in Lisp has been to have a list of objects, and have the first function return the first of the list, rest return the rest (the objects that still need processing), and when the list is empty, all objects have been processed.
The reason to make this abstraction is that for specific cases optimizations can be made in the way the container is implemented internally. For instance, instead of a list, an array could be used, which gives constant-time access to an element in a container, but slow O(n) insertion time. Many more options are available: objects could be obtained from a compressed block of data, or read in sequentially from disk (for huge sets of objects), or a set of objects could be defined symbolically, like (range 1 Infinity). A routine could then be written that finds the first occurrence of an integer that has certain properties, for instance. One would need functions first, rest and null for arrays and for objects of the form (range <from> <to>).
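As a sketch of the idea (the names are illustrative, and the dispatch from the generic first, rest and null to these versions is left out), the three operations for such a symbolic range could look like:

; a (range <from> <to>) container is empty when from exceeds to
(defun range-null (r) (< (third r) (second r)))
; the first element is <from>
(defun range-first (r) (second r))
; the rest of the container is the range starting one further
(defun range-rest (r) (list 'range (+ (second r) 1) (third r)))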
This is an important issue, because computers are good at doing a dull task millions of times over. One option is to have a huge data set and process it. An efficient way of walking over a sequence of data objects is then necessary.
Packaged environments are similar to the well-known singleton classes from c++, which are in turn again similar to global functions and variables. The disadvantage of having things global is mainly that you can only have one instance of the global state of that module. This is not a problem for modules for which there is naturally only one instance.
Packaged modules are usually of this form. They can be seen as a name space, where certain symbols are defined to have a meaning locally. Specific symbols from a module can be exported to another module, so that a symbol from one module becomes visible in the name space of another.
As mentioned above, modules in the Lisp dialect developed here have a very simple structure; they are essentially what is usually referred to as a class, but there will be only one instantiation of each module, and modules are not 'derived' in any form from other modules. Modules can export functions (or more precisely, modules can import functions from other modules) when appropriate, but this feature should be used with care as this easily results in re-introducing name collisions in the name space of a module.
Modules have their own global variables. When a function from a module is invoked, it can only see its own globals, even if exported to another environment. This has the following advantages:
When well thought out, packages can greatly aid in making a system modular, built from small sub-modules which can be added or removed.
An environment can be built up with makeenv, defining the functions and global variables contained therein:
(makeenv arithmetic
  (progn
    (defun add (n m) ...)
    (setf pi 3.1415926535) ) )
An expression can then be evaluated in this environment through:
(namespace <module-name> <function-name> <arguments...>)
The 'environment' can be thought of as a static class, a singleton, in object-oriented terms. There is only one of this type of object, and one instance of each property in the class, which is why the compiler can generate a static link to it. A function in an environment is no more than a method in such a singleton class.
The arithmetic operations like + are actually defined in the module INTEGER for the reference implementation of the Lisp interpreter described in this book. This means that from start up, the + operator is not directly accessible. One way to access a function in a module is to pass the name of the module as an argument to the operator namespace:
In> (+ 2 3)
ERROR: function not defined : +
Out> nil
In> (namespace 'INTEGER '+ 2 3)
Out> 5
This is obviously not the most convenient syntax. One way to encapsulate a call to a function in a module is to define a function for it:
In> (defun + (a b) (namespace 'INTEGER '+ a b))
Out> +
In> (+ 2 3)
Out> 5
This way we have back our easy access to the functionality of adding two integers. However, we are left with cumbersome syntax for importing this function. For this the function import was defined, which makes importing a function from another module into a module a little bit easier:
In> (import INTEGER + AddInt)
Out> t
In> (AddInt 2 3)
Out> 5
Note that for this language we chose the namespace construct because we do not want access to global variables in the module. Another option would have been to allow passing an environment, or module name, to eval, as in (eval <expression> <module-name>). That would have allowed the programmer to pass any expression to the environment, effectively making it possible to evaluate variables in that module. We want to actively disallow that, by only allowing function calls into that environment. The arguments to the function call are evaluated before the function defined in the module is called, thus they are evaluated in the calling environment.
To define a module, makeenv can be used. makeenv takes two arguments, a module name and the body to execute to define globals and functions inside it. For example, we can define a function bar in the module foo as follows:
In> (makeenv foo (defun bar (n m) (list 'head n m)))
Out> bar
In> (bar 2 3)
ERROR: function not defined : bar
Out> nil
In> (import foo bar bar)
Out> t
In> (bar 2 3)
Out> (head 2 3 )
In> (namespace 'foo 'bar 2 3)
Out> (head 2 3 )
The function functionp allows one to check that a function is defined in a package. This is useful for checking that a package name that is passed as an argument is a valid package (has the correct functions implemented). The syntax is (functionp <namespace> <functionname>):
In> (functionp 'std '+)
Out> nil
In> (functionp 'INTEGER '+)
Out> t
In> (import INTEGER + +)
Out> t
In> (functionp 'std '+)
Out> t
The default environment (module) the application starts in, the one the user can access, is the name space std. Other name spaces include INTEGER.
Note that INTEGER is both the name of a name space and of a type. This has powerful consequences: if we define a vector to be a vector of integers, as some structure with VECTOR INTEGER in it, the code to add two vectors knows that the components of the vectors can be added by using the addition operation from the INTEGER module directly, resulting in speedier code than code that would accept the components as being of type OBJECT and have to figure out for each component that it is an integer.
The system can even go as far as checking that a parametric type can be made by checking that certain operations are defined for it, through the functionp call.
As such, the programming language also offers an interface to operating system functionality through modules supporting these operations.
The minimal set of commands in the interpreter core is then {quote, if, cond, setf, while, progn, defun, defmacro, assoc, eval, first, rest, cons, atom, listp, list, equal, namespace, import, makeenv, functionp}.
All other commands (including arithmetic commands) can be accessed through other modules.
For modules representing some type, there should usually at least be a type function, which returns t if an object is of that type and nil otherwise, and a function make, which constructs such an object given some arguments (if applicable).
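For example (a hypothetical module following this convention), rational numbers could be packaged as:

(makeenv RATIONAL
  (progn
    (defun make (n d) (list 'rational n d))
    (defun type (x) (and (listp x) (equal 'rational (first x)))) ) )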
It is defensible to use exceptions here, as the problem is likely to be invalid input to a function, and as a result invalid input passed in by the user: an environment issue that the code cannot be expected to handle correctly.
Exceptions halt execution immediately. The preEval catches them, and the user interface can continue where it left off. This is defensible behavior: the system is only expected to execute commands typed in by the user, or fail. The story would be different if Yacas were used to control some system that needs to keep running after a failure, but it was not designed for that. Yacas is a computer algebra system, not a control system for a Space Shuttle.
The Lisp interpreter described so far is untyped, meaning that everything is of one type, OBJECT. What happens in fact is that the interpreter executes functions, and these functions internally do conversions. When calling an arithmetic operator like +, arguments of type OBJECT are converted to NUMBER, the addition is performed, and the result is converted back from type NUMBER to OBJECT. Untyped systems are inefficient this way.
The compiled Lisp dialect that will be developed will allow for typed arguments and functions: functions can only accept arguments of a certain type, and return a result of a specific type. The typed Lisp dialect will still allow for untyped (e.g. OBJECT) arguments, for the lazy programmer (but as we shall see later, the lazy programmer is in for a few nasty surprises: slower code and more typing required, as a lot of manually coded type conversions will need to be made). With the simple typing system we will use, no automatic type conversions are made; they will have to be done manually.
The system will consist of various packages involving operations on objects. For instance, there will be a package that performs addition on integers and real numbers, and there will be a package that deals with addition of vectors and matrices. The same holds for complex numbers, for instance. The addition operator in the global environment will then be responsible for dispatching to the appropriate addition operator in the appropriate environment. For instance, there can be an AddIntegers operator and an AddComplexNumbers operator, and the invoking addition operator can first check argument types before invoking the correct operator. This is called data-directed programming.
Complications arise when two objects of different types are added. When adding an integer to a real number, one cannot use the operators for adding integers or real numbers themselves. One solution for this is to first convert the integer to a real number type, and then invoke the addition of real numbers operator. For numeric types, this works well. Numbers can be put in what is called a tower of types: integers are a sub-type of rational numbers, which are a sub-type of real numbers, which are a sub-type of complex numbers. A number can always be moved up in the tower of types. An integer is just a special case of a complex number, for instance. So when adding an integer to a complex number, one only needs to convert from the integer to complex. This can be done in steps, so for n types only n-1 convertors are needed; IntegerToRational, RationalToReal, etc.
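A sketch of one such step up the tower, using the convertor names just mentioned (the predicates rationalp and realp are assumptions, analogous to integerp):

(defun raise (x)
  (cond ( (integerp x)  (IntegerToRational x) )
        ( (rationalp x) (RationalToReal x) )
        ( (realp x)     (RealToComplex x) ) ) )

Repeatedly raising the lower-typed argument until both arguments have the same type reduces mixed-type addition to same-type addition.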
Typing systems are not always this simple. For example, a diagonal matrix is a sub-type of the generic set of all matrices, but some diagonal matrices are also sub-types of the unitary matrices. So a matrix can have more than one super-type. Super-types can have multiple sub-types too. In this case, choosing the right conversion is not trivial. The choice can be based on some heuristic; when doing a specific type of operation, first try to convert the matrix to a type for which there is the most efficient algorithm for performing that action, otherwise try another type.
It gets worse; for polynomials, x^2-y can be a multivariate polynomial in (x,y) with integer coefficients, or a univariate polynomial in x with coefficients which are univariate polynomials in y, or a univariate polynomial in y with coefficients which are univariate polynomials in x. There might be cases where the system can not decide which type to take, and the user will have to help the system by telling what type to take.
Types can be down-cast to sub-types too, if possible. This is sometimes necessary. Suppose one has two objects of type OBJECT, as entered by the user, and the user wants to perform a division operation. The system can try to drop the objects to the lowest possible type, by first trying to convert to integer, and if this doesn't work, seeing if the objects can be converted to objects of type univariate polynomial, and otherwise multivariate polynomial, etc. The trick is to have a routine that accepts univariate polynomials in internal representation, performs the operation, and returns the result in internal representation. When the operation is called with generic objects as arguments, they can be converted to the internal representation for univariate polynomials first, and the result of the operation converted back to the generic object type (a form the user could have typed in). The advantage of this is that an object is converted only once, when passed to a function. Consecutive function calls can then use the object in the type it is already in, without having to convert types each time, and only the final result is converted back. This way types are only converted twice: at the beginning and at the end.
Interestingly, often when dealing with mathematical objects, there are properties that always need to hold for such objects. In the case of make-complex, real-part and imag-part, for a complex number z, the following should always be true:
(equal z (make-complex (real-part z) (imag-part z)))
Data structures are a powerful way to abstract types away. A data structure can be defined for, for instance, rational numbers, matrices, etc. cons, first and rest are also such a combination, where cons is the constructor, and first and rest are the selectors. The result of evaluating (cons a b) is an object for which first returns a, and rest returns b.
Suppose we decide to represent complex numbers as (complex real imag). Then the following routines are constructors and selectors:
(defun make-complex (r i) (list 'complex r i))
(defun real-part (z) (second z))
(defun imag-part (z) (third z))
Then this allows for the following interaction:
In> (setf z (make-complex 2 3))
Out> z
In> (real-part z)
Out> 2
In> (imag-part z)
Out> 3
We can define a predicate to check if an object is complex as:
(defun complexp (z)
  (and (listp z) (equal 'complex (first z))) )
And with that a simple addition operation:
(defun add (x y)
  (cond ( (integerp x) (+ x y) )
        ( (complexp x)
          (make-complex (add (real-part x) (real-part y))
                        (add (imag-part x) (imag-part y)) ) ) ) )
which allows for the interaction:
In> (add (make-complex 2 3) (make-complex 4 5))
Out> (complex 6 8 )
This is an example of data-directed programming: invoking different routines for different data types. Also note that we use the add operation for the components of the complex data object. This has powerful consequences: it means that a complex object can have components of arbitrary type, as long as there is an addition operation defined for that type in the add function. The complex data object is in effect a parametric type, (complex <type> <type>), where <type> can be anything: integers, matrices, etc. <type> is a type parameter for this complex data type.
Types in general aid in two ways: to check for correctness of code, and to give information to the compiler to generate more efficient code. This also holds for parametric types. Suppose we have a vector object. Pseudo-code for adding the two vectors could look like this:
vector add-vectors(v1, v2)
  vector result = zero-vector
  for i from 1 to degree(v1, v2)
    result[i] := add(v1[i], v2[i])
  return result
Note we use the add operation here for adding the components of a vector. Suppose the add operation was then defined as:
object add(x, y)
  if (object-type(x) = integer) return add-integers(x, y)
  if (object-type(x) = real)    return add-reals(x, y)
  if (object-type(x) = complex) return add-complex(x, y)
  if (object-type(x) = vector)  return add-vectors(x, y)
Now the important thing to notice is that the add function is called from within the loop of add-vectors, and if the vectors in question contain a billion items, add gets called a billion times. However, add checks the type of the object to be added each time it is called, and it is called from the inner loop! The add-vectors function can be written with the add operation inlined into it:
vector add-vectors(v1, v2)
  vector result = zero-vector
  for i from 1 to degree(v1, v2)
    if (object-type(v1[i]) = integer) result[i] = add-integers(v1[i], v2[i])
    if (object-type(v1[i]) = real)    result[i] = add-reals(v1[i], v2[i])
    if (object-type(v1[i]) = complex) result[i] = add-complex(v1[i], v2[i])
    if (object-type(v1[i]) = vector)  result[i] = add-vectors(v1[i], v2[i])
  return result
However, because the types of all the vector components can be expected to be the same, this can be improved by pulling the if statements outside of the loop, so the if statements are only performed once instead of a billion times. This is a common way to optimize code, and it is often done by hand. There is no reason, however, why a computer could not do this optimization. When the add operation is inlined, if the compiler knows that the if statements will yield the same results every time, it can pull them outside of the loop, as follows:
vector add-vectors(v1, v2)
  vector result = zero-vector
  if (object-type(v1[1]) = integer)
    for i from 1 to degree(v1, v2)
      result[i] = add-integers(v1[i], v2[i])
  if (object-type(v1[1]) = real)
    for i from 1 to degree(v1, v2)
      result[i] = add-reals(v1[i], v2[i])
  if (object-type(v1[1]) = complex)
    for i from 1 to degree(v1, v2)
      result[i] = add-complex(v1[i], v2[i])
  if (object-type(v1[1]) = vector)
    for i from 1 to degree(v1, v2)
      result[i] = add-vectors(v1[i], v2[i])
  return result
If a compiler knows beforehand that two vectors are of type vector<integer>, it can devise a special function that will only work on objects of type vector<integer>, but this gains a lot less: the if statements have already been pulled out of the loop, so they will not be performed a billion times for an addition of two vectors with a billion components each.
Parametric types can be supported naturally. A vector could be represented as (vector <type> <components>), so that (vector SMALLFLOAT (1 0 0)) would be a base unit vector in a three-dimensional space, with real-valued vector components. Flexibility is not lost; the type can be EXPRESSION or OBJECT or some other generic type. But it can also be (UNIPOLY INTEGER x), i.e. making the vector a vector with univariate polynomials in x over the integers as components.
The above considerably speeds up the operation. The problem is that it is more natural to place the if statements in the add function, so it can be used in many places: when adding complex numbers, matrices, etc. But inlining it allows the compiler to optimize the if statements away by pulling them outside of the loop. In most cases, even outside a loop, the compiler can optimize inlined code away by using typing information.
Inlining functions as described above is automatically done on macros. However, macros are slow to execute in the interpreter, so a possible feature of a compiler could be to tell it a function can be inlined. The variable scoping rules can still hold; the body should not be allowed to access variables in its calling environment.
The overheads of executing Lisp are:
The first concern will be to convert the Lisp code to source code for the target platform; when the Lisp interpreter is written in Java we want the Lisp code compiled to Java, etc.
The disadvantage is the compiler then needs to be around for compilation to be able to take place. Programs thus get compiled once, and then run. The compiler is not available at run time. As compiled code can interact with interpreted code, the user is still free to write his own functions, but they will have to run interpreted.
The advantage of directly compiling to source code as input for another compiler is that it is less work. However, a little compiler could be written that takes the resulting output source code and compiles it further to some other target, dynamically at run time. Theoretically, even a byte code interpreter could be written, and a compiler for it, and this could be plugged into the system.
The fact that compilation is reminiscent of interpretation suggests that the strategy pattern could be used, when the compiler is written in the target environment. For this, the default evaluation routine or object needs to be replaced by another, one that compiles instead of interpreting. Compiling would then reduce to loading a file, but using a different 'interpreter' for it. This abstraction is further warranted by the fact that a debugging environment is needed, one where a programmer can step through code.
Defining a function will reduce to defining a class (in an object-oriented system) with an eval or apply method. The method gets called with a list of arguments to perform its task. In addition, routines can be created that perform the task with the arguments given directly in the target programming language. This is made easier with the Lisp dialect described earlier because of the constraint of a fixed number of arguments.
A form (foo a b) could be compiled to
LispObject object =
  cons(recSymbol("foo"),
    cons(recSymbol("a"),
      cons(recSymbol("b"), nil)));
Equivalently, a symbol bar would be defined as:
LispObject object = recSymbol("bar");
If nothing is known about the function foo, no further simplifications can be made, as it could be a macro or a function, and thus nothing is known about whether the arguments need to be evaluated or not.
When foo is a macro, it can be dealt with easily by directly expanding the code in place. When foo is defined as
In> (defmacro foo (a) a)
Out> foo
Evaluation of (foo (+ 2 3)),
In> (foo (+ 2 3))
Out> (+ 2 3 )
is the same as inlining the code with a let statement:
In> (let ((a (quote (+ 2 3)))) a)
Out> (+ 2 3 )
Macros have to be expanded before an expression is evaluated, but other than that, the compiler can read in all the code before inlining macros, to be able to inline macros that are defined later on. Alternatively, the actual evaluation could be cast in code that looks at runtime to discover whether a function call is actually a macro expansion, but this is less efficient at run time.
In the following examples, it is assumed that there are objects of type LispObject, which can hold a general Lisp expression, LispNumber which can hold a number, and functions eval, cons, and recSymbol. The last, recSymbol, takes some native argument like a string, and converts it to a LispObject. nil is taken to be an empty list.
Optimizing an expression can then continue in various steps. Suppose we take the definition of the call (foo a b) above, and that we know that foo is a function that will be compiled. Then we can rewrite
LispObject object =
  eval(cons(recSymbol("foo"),
         cons(recSymbol("a"),
           cons(recSymbol("b"), nil))));
return object;
as
LispObject object =
  Foo.eval(cons(eval(recSymbol("a")),
             cons(eval(recSymbol("b")), nil)));
return object;
Here the arguments are evaluated explicitly before being passed to foo, and foo is hard-linked, called directly with the appropriate arguments.
A next step is to optimize the eval calls inside these expressions, e.g. the evaluation of a and b. If a and b were local variables from within a let statement, or arguments to a function call in its environment, this could be optimized further. Each declaration of a local variable can be made a real local variable in the target source code. For example,
(let ((a 2) (b 3)) (foo a b))
could be compiled to:

LispObject a = recSymbol("2");
LispObject b = recSymbol("3");
LispObject object = Foo.eval(cons(a, cons(b, nil)));
return object;
This skips lookup of local variables in an environment, as the compiler will be able to directly take the appropriate value.
In the dialect described here, we only accept a fixed number of arguments, and thus we could add a method to the Foo object which accepts two arguments, and the code would change to:
LispObject a = recSymbol("2");
LispObject b = recSymbol("3");
LispObject object = Foo.eval(a, b);
return object;
If foo is guaranteed to only work with numeric values for input, we could add another method to the foo evaluator object Foo, one which takes two numeric arguments:
LispNumber a = 2;
LispNumber b = 3;
LispNumber number = Foo.evalTyped(a, b);
return recSymbol(number);
The interesting aspect of the last optimization is that all the intermediate values are now numbers, not Lisp atoms. The advantage is that if an expression is more complex than the above example, the input arguments can be converted to numbers at the beginning, the arithmetic operations performed, and only at the end the result converted back to an atom (through the recSymbol function explained above). So in this case,
(let ((a 2) (b 3)) (+ (foo a b) 5))
Could be converted to:
LispNumber a = 2;
LispNumber b = 3;
return recSymbol(Foo.evalTyped(a, b) + 5);
A lot of conversions to and from formats can be avoided if the compiler can recognize where this is appropriate. Code of this form would not be much slower than code written natively in the target language (actually not quite true: with the full language at your disposal you might be able to do optimizations by hand).
If functions are defined as macros, of course the compiler can inline them and perform additional optimizations, removing the need for the overhead of a call.
A further optimization could be a conversion to native numeric types. If the compiler knows that the result of some computation is an integer, and it fits in a native integer (for instance a 32-bit integer), the above code could become:
return recSymbol(Foo.evalTypedNative(2, 3) + 5);
The compiler can optimize code based on assumptions about the input. This is why compiled languages usually have typed variables. This can also be solved, for specific cases, by using macros. Suppose one wants to multiply two matrices, (mat-mult a b). The elements of the matrices a and b can be anything: univariate polynomials, multivariate polynomials, analytic functions of one variable, complex or real numbers. But when the compiler knows the elements of the matrices are real-valued, some optimizations can be made in the body of a macro. Alternatively, if the operation for multiplying matrix elements is defined as a function, that function can be treated as a macro, and a separate function compiled from it and called from within mat-mult, to take into account the fact that it is going to be passed real-valued matrix elements. Having more information about the input gives options for optimizing the code further. This allows algorithms to be stated at a high level, with the compiler taking care of the optimizations for a specific case.
With optimization by hand, often a situation is encountered where a loop performs a task a few million times, and thus every clock cycle that can be saved in the inner loop saves a few million clock cycles. Inlining macros allows the compiler to discover that an if or cond statement always executes the same clause, and thus the if or cond statement (which has to be independent of the loop index) can be taken outside of the loop.
One slight problem arises due to the fact that every expression in Lisp returns a result, including macros. Consider the following piece of code:
(while (< i n)
  (progn
    (print i)
    (setf i (+ i 1)) ) )
This is easy to convert to for instance c++ syntax as:
while (i < n)
{
  printf("%d\n", i);
  i = i + 1;
}
Now a slightly more advanced example shows the problem:
(while (progn (print i) (< i n))
  (progn
    (setf i (+ i 1)) ) )
This is valid Lisp code. progn returns the result of evaluating the last expression. A naive translation to c++ would yield:
while ({printf("%d\n", i); i < n;})
{
  i = i + 1;
}
which is obviously invalid c++ code. One way around this in this case could be:
{
  printf("%d\n", i);
  pred = i < n;
}
while (pred)
{
  i = i + 1;
  {
    printf("%d\n", i);
    pred = i < n;
  }
}
However, in this case, if the progn statement is large, it could result in a doubling of a large bit of code. It would however be efficient.
The other alternative would be to encapsulate the progn in a separate function that can be inlined by the compiler, as:
inline int pred(int i, int n)
{
  printf("%d\n", i);
  return (i < n);
}
...
while (pred(i, n))
{
  i = i + 1;
}
One more sophisticated example:
(f a (progn b c))
would at first erroneously yield:
f(a, {b; c;});
Here too, the choice can be to either evaluate the progn statement beforehand:
aarg = a;
{ b; barg = c; }
f(aarg, barg);
This can become expensive when there are nested macro-like calls. Or, encapsulation can be performed, leading (again) to a slightly less efficient version due to function call overhead:
inline int ftemp()
{
  b;
  return c;
}
f(a, ftemp());
In the compiler, the choice has to be made to either inline the code directly, or to allow the c++/Java compiler to inline the code. The above suggests that when generating code for a function body, the algorithm should first try to expand macro bodies, until it encounters a function call, and then expand the function calls as arguments to function calls. When it cannot expand a function argument because it is actually a macro expansion, it can either inline it itself, or generate an inlined method for it to allow the c++/Java compiler to inline the code at hand.
Bootstrapping can solve this problem. One starts with a small interpreter which can run the code, albeit slowly, and a (possibly poor) compiler is written and used to compile some base components of the system. At the second iteration, instead of loading the files that were loaded the first time, the compiled versions can be loaded, which run a lot faster. This can be repeated many times, compiling a small sub-system, and loading the compiled version the next time the interpreter starts up, thus making the system faster at each iteration.
All that is then needed for a Yacas system is to have a little Lisp interpreter in place, plus a compiler to compile to the native platform, and 'bootstrap' the rest of the code into the system using the compiler. The system effectively 'ports' itself to the new platform, by generating efficient code for that platform.
For example, one could first start with compiling some utility functions, then compile support for parsing Yacas expressions and pattern matching, and then compile more powerful script code.
As a first step, a simple but fast compiler can be invoked, invoking more powerful compilers each time the system has faster components, resulting in faster and faster code.
Take a function fact:
(defun fact (n)
  (if (equal n 0)
      1
      (* n (fact (+ n -1))) ) )
Now, if we write this as
(defun (fact INTEGER) ( (n INTEGER) )
  (if (equal n 0)
      1
      (* n (fact (+ n -1))) ) )
Note that this just adds extra information about the types that should be passed in or returned. But the addition of type information greatly aids in translating (compiling) to other targets. For example, in c++ this bit of code could look like
BigInteger fact(BigInteger n)
{
  if (n == 0)
    return 1;
  else
    return n * fact(n - 1);
}
Strong typing has the advantages of adding extra information the compiler can use to optimize code further, and strong typing also allows the compiler to give appropriate error messages when a function is called with incompatible arguments.
The step back to untyped Lisp can of course always be made, with a simple translation step removing typing information from expressions. We can keep a simple low-level Lisp untyped interpreter, write code in a typed version of the Lisp dialect, and in a translation phase remove the typing information so the untyped interpreter can understand it.
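A sketch of such a stripping step for function definitions (illustrative names; it only handles the typed defun form shown above, (defun (<name> <type>) ((<arg> <type>) ...) <body>)):

(defun fourth (lst) (first (rest (rest (rest lst)))))

(defun strip-arg-types (args)
  (if (null args)
      nil
      (cons (first (first args)) (strip-arg-types (rest args))) ) )

(defun strip-types (def)
  (list 'defun
        (first (second def))           ; (name TYPE) becomes name
        (strip-arg-types (third def))  ; ((arg TYPE) ...) becomes (arg ...)
        (fourth def) ) )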
For the typed Lisp, the following functions are changed:
In addition, global variables will have to be declared explicitly, with global:
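The exact form of such a declaration is left open here; hypothetically, it could look like:

(global (total-count INTEGER)) ; hypothetical syntax: declare a typed global variable
(setf total-count 0)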
declare is not needed for the untyped interpreter; it is just for the typed Lisp code that will be used for conversion to c++/Java or other targets.
The typing system is very simple; no automatic type conversion is performed. Instead, the user is expected to call conversion routines. For instance, when a function sin expects a float, but an integer n is passed in, the calling sequence should be something like (sin (int-to-float n)), and not (sin n). Some functions, like setf, will accept different types of arguments (the second argument to setf should be of the type of the variable the value is assigned to).
This is a very simple typing system, but this Lisp dialect is not meant to be really used for programming, just a basic minimum set. Overloading of functions and variable number of arguments are also not supported.
The following native types are supported in the basic Lisp dialect:
This list will be extended.
The compiler can also be made to accept non-typed variables, arguments and functions. The type of these objects should then default to OBJECT. However, this does not save the user from converting to the appropriate type. For instance, if the type of some expression is not explicitly set to INTEGER, and it is used in an addition, then a call to object-to-int is required every time the object is treated as an integer. Next to being a nuisance for the programmer, it also makes the code less efficient.
The compiler is of course free to ignore types when it comes to translation. All objects should be able to pass as type OBJECT. The typing system is there to ensure efficiency of the generated code. The untyped interpreter described in the beginning of this book disregards all typing information whatsoever. The compiler does however have to check that arguments passed in have a valid type.
By the same token, the interpreter could be modified to handle the typed expressions, and just ignore the typing information. The interpreter could easily be extended to ignore the typing information given:
In> (defun foo-untyped (n) (+ n 2))
Out> foo-untyped
In> (defun (foo-typed INTEGER) ( (n INTEGER) ) (+ n 2) )
Out> (foo-typed INTEGER )
In> (foo-untyped 3)
Out> 5
In> (foo-typed 3)
Out> 5
The change from <name> to (<name> <type>) is a trivial one, but this extra information can easily be mapped onto strongly typed compiled languages, which is why it is a useful step.
Indeed, 100 lines of Yacas code (Yacas already had an interpreter at the time of writing of this document; version 1.0.53 had an interpreter written in c++) can render the following:
(defun foo (n) (IntegerToObject (+ (ObjectToInteger n) 2))) |
to
LispObject foo(LispObject n)
{
  LispObject result;
  {
    result = IntegerToObject(ObjectToInteger(n)+2);
  }
  return result;
} |
and
(defun (foo INTEGER) ((n INTEGER)) (+ n 2)) |
to
BigInteger foo(BigInteger n)
{
  BigInteger result;
  {
    result = n+2;
  }
  return result;
} |
The typing information in effect gives the compiler enough information to convert expressions directly into another syntax; the compiler thus becomes a simple expression transformer.
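A minimal sketch of such a transformer (in Common Lisp; expr-to-cxx is a hypothetical name, and only numbers, variables and binary + and * are covered, nothing like the full Yacas compiler) could read:
(defun expr-to-cxx (e)
  (cond ((numberp e) (format nil "~a" e))               ; literals pass through
        ((symbolp e) (string-downcase (symbol-name e))) ; variables become identifiers
        ((member (first e) '(+ *))                      ; binary operators become infix
         (format nil "(~a~a~a)"
                 (expr-to-cxx (second e))
                 (first e)
                 (expr-to-cxx (third e)))))) |
Applied to the body (+ n 2) of foo above, it yields the string "(n+2)", ready to be placed in a function skeleton like the ones shown.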
Small Lisp interpreters can easily be written in any language. Being able to compile the interpreter directly into a program in, for instance, Common Lisp or Java gives the freedom to first emit the code for the reference Yacas interpreter implementation, and then use it in other projects later on.
The main aim of Yacas is to allow mathematically oriented algorithms to be written easily. Being able to compile that code into other languages gives flexibility. This effort thus also defines the absolute bare minimum required for such a system.
Common Lisp was chosen because it is sufficiently close to Yacas, which internally resembles a Lisp interpreter. Yacas already comes with a parser for Yacas script code, and can easily manipulate expressions to bring them into a form suitable for execution in another environment.
Not all of Yacas will be made available, just the functional subset needed for doing CAS. Other systems have their own functionality for interacting with the system (file input/output and such).
Thus this project takes a minimum subset of Common Lisp to implement a minimum environment for running code originally written in Yacas script. In practice this system could then also be used in other Lisp dialects like Scheme or elisp, or a minimum Lisp interpreter implementation can be written for other environments.
The following sections describe the implementation of the parts described above.
The functions defined that offer specific functionality for Yacas interpretation are prefixed with yacas-, so that their origin is easy to recognize.
Functions compiled from Yacas script have yacas-script- prepended and -arity-n appended (where n is the arity of the function defined); a one-argument function Cos, for instance, would become something like yacas-script-Cos-arity-1.
The Lisp code is divided into the following modules:
The following is used from Common Lisp:
Note that in theory both read and print can be replaced by something else, or not implemented at all (in which case the standard Lisp print and read can be used).
The functions to perform these tasks are yacas-read, yacas-eval and yacas-print.
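Chained together, these three functions form the familiar read-eval-print loop; a minimal sketch (error handling and a way to exit are omitted) is simply
(loop (yacas-print (yacas-eval (yacas-read)))) |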
There are slight differences between the way Common Lisp evaluates and the way Yacas expressions should be evaluated, necessitating a bit of code between the two.
In Yacas, when a function is not defined or a variable is not bound to a value, the expression in question is returned unevaluated (though in the case of a function call, its arguments are evaluated).
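An illustrative session (assuming undefined-fn is not defined anywhere) would thus look like
In> (undefined-fn (+ 1 2))
Out> (undefined-fn 3) |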
Since interpreting a Lisp program is just an algorithm, and any algorithm can be implemented in Lisp, one could write a Lisp interpreter in Lisp itself. In fact, it is very simple to do so. This section describes an example Lisp interpreter written in Lisp itself. It will be used in the end to implement an interpreter for Yacas code.
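To give the flavor of the approach, here is a minimal sketch (not the interpreter developed in this section) that handles numbers, variables, quote, if and function application, with the environment represented as an association list:
(defun tiny-eval (e env)
  (cond ((numberp e) e)                        ; numbers are self-evaluating
        ((symbolp e) (cdr (assoc e env)))      ; variables are looked up in the environment
        ((eq (first e) 'quote) (second e))     ; quoted data is returned as-is
        ((eq (first e) 'if)                    ; conditional: evaluate one branch
         (if (tiny-eval (second e) env)
             (tiny-eval (third e) env)
             (tiny-eval (fourth e) env)))
        (t (apply (cdr (assoc (first e) env))  ; function call: evaluate the arguments
                  (mapcar (lambda (a) (tiny-eval a env))
                          (rest e)))))) |
For instance, (tiny-eval '(+ 1 2) (list (cons '+ #'+))) returns 3.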
The advantage of directly compiling to native code (c++/Java) is clear: one stands a chance of writing a Lisp interpreter in Lisp itself and having it be as efficient as the internal interpreter (if the compiler is good enough). The interpreting code just needs to be compiled by a good compiler.