This part describes how the arithmetic library is embedded into Yacas and how to embed other arithmetic libraries.
The basic layout is as follows: there is one class, BigNumber, that offers basic numerical functions and arithmetic operations, such as addition and multiplication, through a set of class methods. Integers and floating-point numbers are handled by the same class.
The BigNumber class delegates the actual arithmetic operations to the auxiliary classes BigInt and BigFloat. These two classes are direct wrappers of an underlying arithmetic library. The library implements a particular internal representation of numbers.
The responsibility of the class BigNumber is to perform precision tracking, floating-point formatting, error reporting, type checking and so on, while BigInt and BigFloat only concern themselves with low-level arithmetic operations on integer and floating-point numbers respectively. In this way Yacas isolates higher-level features like precision tracking from the lower-level arithmetic operations. The number objects in a library should only be able to convert themselves to/from a string and perform basic arithmetic. It should be easy to wrap a generic arithmetic library into a BigNumber implementation.
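For illustration, here is a minimal sketch of what a BigFloat wrapper might look like if the underlying library were GMP's mpf layer; this is only an example of the wrapping pattern, not the actual Yacas code.

#include <gmp.h>

// Sketch: BigFloat as a thin wrapper around GMP's mpf floats.
// It performs only low-level arithmetic; precision tracking stays in BigNumber.
class BigFloat
{
public:
    explicit BigFloat(unsigned long precision_bits) { mpf_init2(value, precision_bits); }
    ~BigFloat() { mpf_clear(value); }
    void Add(const BigFloat& a, const BigFloat& b)      { mpf_add(value, a.value, b.value); }
    void Multiply(const BigFloat& a, const BigFloat& b) { mpf_mul(value, a.value, b.value); }
    // conversion from a string in a given base; returns 0 on success
    int SetTo(const char* str, int base) { return mpf_set_str(value, str, base); }
private:
    mpf_t value;   // the library's internal representation of the number
};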
It is impossible to have several alternative number libraries operating at the same time. [In principle, one might write the classes BigInt and BigFloat as wrappers of two different arithmetic libraries, one for integers and the other for floats, but at any rate one cannot have two different libraries for integers at the same time.] Having several libraries in the same Yacas session does not seem to be very useful; it would also incur a lot of overhead because one would have to convert the numbers from one internal library representation to another. For performance benchmarking or for testing purposes one can compile separate versions of Yacas configured with different arithmetic libraries.
To embed an arbitrary-precision arithmetic library into Yacas, one needs to write two wrapper classes, BigInt and BigFloat. (Alternatively, one could write a full BigNumber wrapper class but that would result in code duplication unless the library happens to implement a large portion of the BigNumber API. There is already a reference implementation of BigNumber through BigInt and BigFloat in the file numbers.cpp.) The required API for the BigNumber class is described below.
// Calculate z=x+y where x=10 and y=15
BigNumber x("10",100,10);
BigNumber y("15",100,10);
BigNumber z;
z.Add(x,y,10);
// convert the result to a string
LispString str;
z.ToString(str,10);
A calculation might modify one of its arguments. This might happen when one argument passed in is actually the object performing the calculation itself. For example, in a calculation such as
x.Add(x,y);

the object x is both the object performing the addition and its first argument, so the result of the addition overwrites one of the operands.
Therefore, all class methods of BigNumber that allow a BigNumber object as an argument should behave correctly when called destructively on the same BigNumber object. The result must be exactly the same as if all arguments were copied to temporary locations before performing tasks on them, with no other side-effects. For instance, if the specific object representing the number inside the numeric class is shared with other objects, it should not allow the destructive operation, as then other objects might start behaving differently.
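The following toy example, with a machine double standing in for an actual multiple-precision representation, illustrates the standard way to satisfy this requirement: if an argument aliases the object itself, compute into a temporary and assign the result afterwards.

#include <iostream>

// Toy stand-in for a number class, showing how a destructive call such as
// x.Add(x, y) can be made safe by computing into a temporary first.
struct Num
{
    double value = 0;
    void Add(const Num& a, const Num& b)
    {
        if (this == &a || this == &b) {
            Num tmp;
            tmp.Add(a, b);   // tmp is distinct from a and b, so no aliasing here
            *this = tmp;
            return;
        }
        value = a.value + b.value;
    }
};

int main()
{
    Num x, y;
    x.value = 2;
    y.value = 3;
    x.Add(x, y);                    // destructive call: x is also the first argument
    std::cout << x.value << "\n";   // prints 5, as if the arguments had been copied
}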
The basic arithmetic class BigNumber defines some simple arithmetic operations, through which other more elaborate functions can be built. Particular implementations of the multiple-precision library are wrapped by the BigNumber class, and the rest of the Yacas core should only use the BigNumber API.
This API will not be completely exposed to Yacas scripts, because some of these functions are too low-level. Among the low-level functions, only those that are very useful for optimization will be available to the Yacas scripts. (For the functions that seem to be useful for Yacas, suggested Yacas bindings are given below.) But the full API will be available to C++ plugins, so that multiple-precision algorithms could be efficiently implemented when performance is critical. Intermediate-level arithmetic functions such as MathAdd, MathDiv, MathMod and so on could be implemented either in the Yacas core or in plugins, through this low-level API. The library scripts will be able to transform numerical expressions such as x:=y+z into calls of these intermediate-level functions.
Here we list the basic arithmetic operations that need to be implemented by a multiple-precision class BigNumber. The operations are divided into several categories for convenience. Equivalent Yacas script code is given, as well as examples of C++ usage.
x.SetTo("2.e-19", 100, 10); |
x.SetTo("2a8c.e2", 100, 16); |
x.SetTo(12345); y.SetTo(-0.001); |
x.ToString(buffer, 200, 16); // hexadecimal x.ToString(buffer, 40, 10); // decimal |
double a=x.Double(); |
x.SetTo(y); |
x.Equals(y)==true; |
MathEquals(x,y) |
y.Equals(x) |
x.IsInt()==true; |
x.IsIntValue()==true; |
FloatIsInt(x) |
x.BecomeInt(); x.BecomeFloat(); x.BecomeFloat(100); |
x.LessThan(y)==true; |
LessThan(x,y) |
x.Floor(y); |
MathFloor(x) |
prec=x.GetExactBits(); |
GetExactBits(x) |
x.SetExactBits(300); |
SetExactBits(x, 300) |
x.Add(y,z, 300); |
MathAdd(x,y) |
x.Negate(y); |
MathNegate(x) |
x.Multiply(y,z, 300); |
MathMultiply(x,y) |
x.Divide(y,z, 300); |
MathDivide(x,y) |
x.IsSmall()==true; |
MathIsSmall(x) |
x.MultiplyAdd(y,z, 300); |
MathMultiplyAdd(x,y,z) |
x.Mod(y,n); |
MathMod(x,n) |
int sign_of_x = x.Sign(); |
MathSign(x) |
For floating-point numbers, BitCount should return the binary exponent of the number (with sign), like the integer output of the standard C function frexp. More formally: if n=BitCount(x) and x!=0, then 1/2<=Abs(x)*2^(-n)<1. The bit count of an integer zero or a floating zero is arbitrarily defined to be 1. (This function is in effect a cheap approximation to the binary logarithm.) C++:
x.BitCount();

MathBitCount(x)
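As a quick illustration of this convention (not part of the BigNumber API), the standard C function frexp produces exactly this exponent for machine doubles:

#include <cmath>
#include <cstdio>

int main()
{
    int n;
    std::frexp(10.0, &n);            // 10 = 0.625 * 2^4
    std::printf("%d\n", n);          // prints 4, and indeed 1/2 <= 10*2^(-4) < 1
}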
x.ShiftLeft(y, n);
x.ShiftRight(y, n);

ShiftLeft(x,n);
ShiftRight(x,n);
x.BitAnd(y,z);
x.BitOr(y,z);
x.BitXor(y,z);
x.BitNot(y);

BitAnd(x,y);
BitOr(y,z);
BitXor(y,z);
BitNot(y);
The API includes only the most basic operations. All other mathematical functions such as GCD, power, logarithm, cosine and so on, can be efficiently implemented using this basic interface.
Note that generally the arithmetic functions will set the type of the resulting object to the type of the result of the operation. For example, operations that only apply to integers (Mod, BitAnd etc.) will set the type of the resulting object to integer if it is a float. The results of these operations on non-integer arguments are undefined.
In some arithmetic operations (add, multiply, divide) the working precision is given explicitly. For example,
x.Add(y,z,100)
For the Yacas arithmetic library, a "poor man's interval arithmetic" is proposed where the precision is represented by the "number of correct bits". The precision is not tracked exactly but almost always adequately. The purpose of this kind of rough precision tracking is to catch a critical roundoff error or to indicate an unexpected loss of precision in numerical calculations.
Suppose we have two floating-point numbers x and y and we know that they have certain numbers of correct mantissa bits, say m and n. In other words, x is an approximation to an unknown real number x'=x*(1+delta) and we know that Abs(delta)<2^(-m); and similarly y'=y*(1+epsilon) with Abs(epsilon)<2^(-n). Here delta and epsilon are the relative errors for x and y. Typically delta and epsilon are much smaller than 1.
Suppose that every floating-point number knows the number of its correct digits. We can symbolically represent such numbers as pairs {x,m} or {y,n}. When we perform an arithmetic operation on numbers, we need to update the precision component as well.
Now we shall consider the basic arithmetic operations to see how the precision is updated.
Consider first the multiplication of {x,m} by {y,n}: the relative error of the product x*y is (to first order) the sum r=delta+epsilon of the relative errors of the operands. More formally, we have the estimates Abs(delta)<2^(-m) and Abs(epsilon)<2^(-n), and we need a similar estimate Abs(r)<2^(-p) for r=delta+epsilon.
If the two numbers x and y have the same number of correct bits, we should double the error (i.e. decrease the number of significant bits by 1). But if they don't have the same number of bits, we cannot really estimate the error very well. To be on the safe side, we might double the error if the numbers x and y have almost the same number of significant bits, and leave the error constant if the numbers of significant bits of x and y are very different.
The answer expressed as a formula is p=Min(m,n) if Abs(m-n)>=D and p=Min(m,n)-1 otherwise. Here D is a constant that expresses our tolerance for error. In the current implementation, D=1.
If one of the operands is a floating zero {0.,m} (see below) and the other is a nonzero number {x,n}, then p=m-BitCount(x)+1. This is the same formula as above, if we pretend that the bit count of {0.,m} is equal to 1-m.
Now consider the addition (or subtraction) z=x+y. Formally, we have the relative error r of x+y as r=(x*delta+y*epsilon)/(x+y), which gives the estimate Abs(r)<=(Abs(x)*2^(-m)+Abs(y)*2^(-n))/Abs(x+y).
Note that there is one important case when we can estimate the precision better than this. Suppose x and y have the same sign; then there is no cancellation when we compute x+y, and Abs(x+y)=Abs(x)+Abs(y). The above formula for r then gives the estimate Abs(r)<=Max(2^(-m),2^(-n))=2^(-Min(m,n)), so that p=Min(m,n).
If one of the operands is a floating zero represented by x={0.,m} (see below), then the calculation of the error is formally the same as in the case x={1.,m}. This is as if the bit count of {0.,m} were equal to 1 (unlike the case of multiplication).
Finally, if the sum x+y is a floating zero but x!=0 and y!=0, then it must be that B(x)=B(y). In that case we represent x+y as {0.,p}, where p=Min(m,n)-B(x).
x.Add(y,z,100)
We should truncate one or both of the arguments to a smaller precision before starting the operation. For the multiplication as well as for the addition, the precision tracking involves a comparison of two binary exponents 2^(-g) and 2^(-h) to obtain an estimate on 2^(-g)+2^(-h). Here g and h are some integers that are easy to obtain during the computation. For instance, the multiplication involves g=m and h=n. This comparison will immediately show which of the arguments dominates the error.
The ideal situation would be when one of these exponentials is much smaller than the other, but not very much smaller (that would be a waste of precision). In other words, we should aim for Abs(g-h)<8 or so, where 8 is the number of guard bits we would like to maintain. (Generally it is a good idea to have at least 8 guard bits; somewhat more guard bits do not slow down the calculation very much, but 200 guard bits would be surely an overkill.) Then the number that is much more precise than necessary can be truncated.
For example, if we find that g=250 and h=150, then we can safely truncate x to 160 bits or so; if, in addition, we need only 130 bits of final precision, then we could truncate both x and y to about 140 bits.
Note that when we need to subtract two almost equal numbers, there will be a necessary loss of precision, and it may be impossible to decide on the target precision before performing the subtraction. Therefore the subtraction will have to be performed using all available digits.
It is impossible to track the relative precision of a floating zero, but it is possible to track the absolute precision. Suppose we store the bit count of the absolute precision, just as we store the bit count of the relative precision with nonzero floats. Thus we represent a floating zero as a pair {0.,n} where n is an integer, and the meaning of this is a number between -2^(-n) and 2^(-n).
We can now perform some arithmetic operations on the floating zero. Addition and multiplication are handled similarly to the non-zero case, except that we interpret n as the absolute error rather than the relative error. This does not present any problems. For example, the error estimate for addition is the same as if we had a number 1 with relative error 2^(-n) instead of {0.,n}. With multiplication of {x,m} by {0.,n}, the result is again a floating zero {0.,p}, and the new estimate of absolute precision is p=n-BitCount(x)+1.
The division by the floating zero, negative powers, and the logarithm of the floating zero are not representable in our arithmetic because, interpreted as intervals, they would correspond to infinite ranges. The bit count of the floating zero is therefore undefined. However, we can define a positive power of the floating zero (the result is again a floating zero).
The sign of the floating zero is defined as (integer) 0. (Then we can quickly check whether a given number is a zero.)
Suppose that x=12.0 and y=12.00. Then in fact x might represent a number such as 12.01, while y might represent 11.999. There may be two approaches: first, "12.0" is not equal to "12.00" because x and y might represent different numbers. Second, "12.0" is equal to "12.00" because x and y might also represent equal numbers. A logical continuation of the first approach is that "12.0" is not even equal to another copy of "12.0" because they might represent different numbers, e.g. if we compute x=6.0+6.0 and y=24.0/2.0, the roundoff errors might be different.
Here is an illustration in support for the idea that the comparison 12.0=12 should return True. Suppose we are writing an algorithm for computing the power, x^y. This is much faster if y is an integer because we can use the binary squaring algorithm. So we need to detect whether y is an integer. Now suppose we are given x=13.3 and y=12.0. Clearly we should use the integer powering algorithm, even though technically y is a float. (To be sure, we should check that the integer powering algorithm generates enough significant digits.)
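A toy version of this dispatch, with machine doubles standing in for BigNumber objects, might look as follows; the equivalent BigNumber code would use IsIntValue and BecomeInt from the API above.

#include <cmath>
#include <iostream>

// Binary squaring for a non-negative integer exponent.
double PowerBySquaring(double x, long n)
{
    double result = 1.0;
    while (n > 0) {
        if (n & 1) result *= x;
        x *= x;
        n >>= 1;
    }
    return result;
}

double Power(double x, double y)
{
    if (y >= 0 && std::floor(y) == y)            // the "is y an integer value" check
        return PowerBySquaring(x, (long)y);      // fast integer algorithm
    return std::exp(y * std::log(x));            // general floating-point algorithm
}

int main()
{
    std::cout << Power(13.3, 12.0) << "\n";      // y=12.0 uses the integer algorithm
}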
However, the opposite approach is also completely possible: no two floating-point numbers should be considered equal, except perhaps when one is a bit-for-bit exact copy of the other and when we haven't yet performed any arithmetic on them. (The GMP library uses essentially this definition.) It seems that no algorithm really needs a test for exact equality of floats. The two comparisons on floats x and y that do seem useful are a test of whether x and y could be equal within their known precision, and a test of whether x is certainly less than y.
Given these predicates, it seems that any floating-point algorithm can be implemented just as efficiently as with any "reasonable" definition of the floating-point equality.
Suppose the user has entered a number x as "0.1" while the working precision was 5 digits, and then increases the precision to 10 digits. What should happen to x? There are several possible approaches. 1) The number x stays the same but further calculations are done with 10 digits. In terms of the internal binary representation, the number is padded with binary zeros. This means that now e.g. 1+x will not be equal to 1.1 but to something like 1.100000381 (to 10 digits). And actually x itself should evaluate to 0.1000003815 now: it was 0.1 to 5 digits, but it looks a little different when printed to 10 digits. (A "binary-padded decimal".)
This problem may look horrible at first sight -- "how come I can't write 0.1 any more??" -- but this seems so because we are used to calculations in decimals with a fixed precision, and the operation such as "increase precision by 10 digits" is largely unfamiliar to us except in decimals. This seems to be mostly a cosmetic problem. In a real calculation, we shouldn't be writing "0.1" when we need an exact number 1/10. When we request to increase precision in the middle of a calculation, this mistake surfaces and gives unexpected results.
2) When precision is increased, the number x takes its decimal representation, pads it with zeros, and converts back to the internal representation, just so that the appearance of "1.100000381" does not jar our eyes. (Note that the number x does not become "more precise" if we pad it with decimal zeros instead of binary zeros, unless we made a mistake and wrote "0.1" instead of an exact fraction 1/10.)
With this approach, each number x that doesn't currently have enough digits must change in a complicated way. This will mean a performance hit in all calculations that require dynamically changing precision (Newton's method and some other fast algorithms require this). In these calculations, the roundoff error introduced by "1.100000381" is automatically compensated and the algorithm will work equally well no matter how we extend x to more digits; but it's a lot slower to go through the decimal representation every time.
3) The approach currently being implemented in Yacas is a compromise between the above two. We distinguish number objects that were given by the user as decimal strings (and not as results of calculations), for instance x:=1.2, from number objects that are results of calculations, for instance y:=1.2*1.4. Objects of the first kind are interpreted as exact rational numbers given by a decimal fraction, while objects of the second kind are interpreted as inexact floating-point numbers known to a limited precision. Suppose x and y are first assigned as indicated, with the precision of 5 digits each, then the precision is increased to 10 digits and x and y are used in some calculation. At this point x will be converted from the string representation "1.2" to 10 decimal digits, effectively making 1.2 a shorthand for 1.200000000. But the value of y will be binary-padded in some efficient way and may be different from 1.680000000.
In this way, efficiency is not lost (there are no repeated conversions from binary to decimal and back), and yet the cosmetic problem of binary-padded decimals does not appear. An explicitly given decimal string such as "1.2" is interpreted as a shorthand for 1.2000... with as many zeroes as needed for any currently selected precision. But numbers that are results of arithmetic operations are not converted back to a decimal representation for zero-padding. Here are some example calculations:
In> Precision(5)
Out> True
In> x:=1.2
Out> 1.2
In> y:=N(1/3)
Out> 0.33333

In> Precision(20)
Out> True
In> y
Out> 0.33333

In> y+0
Out> 0.33333333325572311878

In> x+0
Out> 1.2

In> z:=y+0
Out> 0.33333333325572311878
In> Precision(40)
Out> True
In> z
Out> 0.33333333325572311878

In> z+0
Out> 0.33333333325572311878204345703125
Suppose we have floating-point numbers x and y, known only to 2 and 3 significant digits respectively. For example, x=1.6 and y=2.00. These x and y are results of previous calculations and we do not have any more digits than this. If we now say
Precision(10);
x*y;
The first interpretation of Precision() is only possible to satisfy if we are given a self-contained calculation with integer numbers. For example,
N(Sin(Sqrt(3/2)-10^(20)), 50)
So it seems that the second interpretation of Precision(n), namely: "please use n digits in all calculations now", is more sensible as a general-purpose prescription.
But this interpretation does not mean that all numbers will be printed with n digits. Let's look at a particular case (for simplicity we are talking about decimal digits but in the implementation they will be binary digits). Suppose we have x precise to 10 digits and y precise to 20 digits, and the user says Precision(50) and z:=1.4+x*y. What happens now in this calculation? (Assume that x and y are small numbers of order 1; the other cases are similar.)
First, the number "1.4" is now interpreted as being precise to 50 digits, i.e. "1.4000000...0" but not more than 50 digits.
Then we compute x*y using their internal representations. The result is good only to 10 digits, and it knows this. We do not compute 50 digits of the product x*y, it would be pointless and a waste of time.
Then we add x*y to 1.4000...0. The sum, however, will be precise only to 10 digits. We can do one of two things now: (a) we could pad x*y with 40 more zero digits and obtain a 50-digit result. However, this result will only be correct to 10 digits. (b) we could truncate 1.4 to 10 digits (1.400000000) and obtain the sum to 10 digits. In both cases the result will "know" that it only has 10 correct digits.
It seems that the option (b) is better because we do not waste time with extra digits.
The result is a number that is precise to 10 digits. However, the user wants to see this result with 50 digits. Even if we chose the option (a), we would have had some bogus digits, in effect, 40 digits of somewhat random round-off error. Should we print 10 correct digits and 40 bogus digits? It seems better to print only 10 correct digits in this case. The GMP library already has this functionality in its string printing functions: it does not print more digits than the number actually holds.
If we choose this route, then the only effect of Precision(50) will be to interpret a literal constant 1.4 as a 50-digit number. All other numbers already know their real precision and will not invent any bogus digits.
In some calculations, however, we do want to explicitly extend the precision of a number to some more digits. For example, in Newton's method we are given a first approximation x[0] to a root of f(x)=0 and we want to have more digits of that root. Then we need to pad x[0] with some more digits and re-evaluate f(x[0]) to more digits (this is needed to get a better approximation to the root). This padding operation seems rather special and directed at a particular number, not at all numbers at once. For example, if f(x) itself contains some floating-point numbers, then we should be unable to evaluate it with higher precision than allowed by the precision of these numbers. So it seems that we need access to these two low-level operations: the padding and the query of current precision. The proposed interface is GetExactBits(x) and SetExactBits(x,n). These operations are directed at a particular number object x.
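For example, the padding step of a Newton iteration could look roughly like this; the constructor follows the earlier example, while the choice of 64 extra bits and the variable type used for the result of GetExactBits are only illustrative.

BigNumber x0("1.4142135", 50, 10);   // first approximation to a root
long bits = x0.GetExactBits();       // how many bits of x0 are currently correct
x0.SetExactBits(bits + 64);          // pad x0 before computing more digits of the root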
IsInteger(0)   = True
IsIntValue(1.) = True
IsInteger(1.)  = False
x:=1.33; y:=x/3;
We give formulae for p in terms of x, y, m, and n. Sometimes the bit count of a number x is needed; it is denoted B(x) for brevity.
The bit count B(x) is an integer function of x defined for real x!=0 by the condition 2^(B(x)-1)<=Abs(x)<2^(B(x)); equivalently, B(x) is the integer such that 1/2<=Abs(x)*2^(-B(x))<1, as in the definition of BitCount above.
The bit count function can be usually computed in constant time because the usual representation of long numbers is by arrays of platform integers and a binary exponent. The length of the array of digits is usually available at no computational cost.
The absolute error Delta[x] of {x,n} is of order Abs(x)*2^(-n). Given the bit count of x, this can be estimated from above as Delta[x]<2^(B(x)-n).
From the definition of {x,n} with x!=0 it follows that 0 can be within the precision interval only if n<= -1. Therefore, we should transform any number {x,n} such that x!=0 and n<= -1 into a floating zero {0.,p}, where p is chosen so that the interval (-2^(-p),2^(-p)) covers the precision interval of {x,n}; since Abs(x)*(1+2^(-n))<2^(B(x)-n+1) when n<= -1, the value p=n-B(x)-1 is sufficient.
First, we can quickly check that the values x and y have the same nonzero signs and the same bit counts, B(x)=B(y). If x>0 and y<0 or vice versa, or if B(x)!=B(y), then the two numbers are definitely unequal. We can also check whether both x=y=0; if this is the case, then we know that {x,m}={y,n} because any two zeros are equal.
However, a floating zero can sometimes be equal to a nonzero number. So we should now exclude this possibility: {0.,m}={y,n} if and only if Abs(y)<2^(-m). This condition is equivalent to B(y)<= -m.
If these checks do not provide the answer, the only possibility left is when x!=0 and y!=0 and B(x)=B(y).
Now we can consider two cases: (1) both x and y are floats, (2) one is a float and the other is an integer.
In the first case, {x,m}={y,n} if and only if the following condition holds: Abs(x-y)<Abs(x)*2^(-m)+Abs(y)*2^(-n), i.e. the difference x-y is smaller than the combined width of the two precision intervals.
It is now necessary to compute x-y (one long addition); this computation needs to be done with Min(m,n) bits of precision.
After computing x-y, we can avoid the full evaluation of the complicated condition by first checking some easier conditions on x-y. If x-y=0 as floating-point numbers ("exact cancellation"), then certainly {x,m}={y,n}. Otherwise we can assume that x-y!=0 and check the bit counts: if B(x-y)<=B(x)-m-1, then Abs(x-y) is certainly smaller than Abs(x)*2^(-m) and the numbers are equal within precision; if B(x-y)>=Max(B(x)-m,B(y)-n)+2, then Abs(x-y) certainly exceeds Abs(x)*2^(-m)+Abs(y)*2^(-n) and the numbers are unequal.
If neither of these conditions gives us the answer, we have to evaluate the full condition by computing Abs(x)*2^(-m) and Abs(y)*2^(-n) and comparing their sum with Abs(x-y).
In the second case, one of the numbers is an integer x and the other is a float {y,n}. Then x={y,n} if and only if Abs(x-y)<Abs(y)*2^(-n).
This procedure is basically the same as comparing {x,n} with Floor(x).
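The procedure described above can be summarized in the following model, where machine doubles stand in for long floats and the bit-count shortcuts are omitted for brevity:

#include <cmath>

// x is correct to mx bits, y to my bits; both are nonzero floats.
// Returns true when the two numbers are equal within their precision.
bool EqualsWithinPrecision(double x, int mx, double y, int my)
{
    if ((x > 0) != (y > 0)) return false;       // different signs: definitely unequal
    double diff = std::fabs(x - y);
    if (diff == 0.0) return true;               // exact cancellation
    // full condition: |x-y| < |x|*2^(-mx) + |y|*2^(-my)
    return diff < std::ldexp(std::fabs(x), -mx) + std::ldexp(std::fabs(y), -my);
}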
First consider the case when x+y!=0.
If x is zero, i.e. {0.,m} (but x+y!=0), then the situation with precision is the same as if x were {1.,m}, because then the relative precision is equal to the absolute precision. In that case we take the bit count of x as B(0)=1 and proceed by the same route.
First, we should decide whether it is necessary to add the given numbers. It may be unnecessary if e.g. x+y<=>x within precision of x (we may say that a "total underflow" occurred during addition). To check for this, we need to estimate the absolute errors of x and y: Delta[x]<2^(B(x)-m) and Delta[y]<2^(B(y)-n). If, for example, B(y)<=B(x)-m-1, then Abs(y)<Abs(x)*2^(-m), i.e. y lies entirely below the precision of x, and the sum is simply {x,m}; the analogous check is made with x and y exchanged.
Suppose none of these checks were successful. Now, the float value z=x+y needs to be calculated. To find it, we need a target precision of only about Max(B(x),B(y))-Max(B(x)-m,B(y)-n)+1 bits (plus a few guard bits), because the absolute error of the result cannot be smaller than the larger of the absolute errors of the operands.
Then we compute B(z) and determine the precision p as p=B(z)-Max(B(x)-m,B(y)-n)-1-Dist(B(x)-m,B(y)-n), where Dist(i,j)=1 if Abs(i-j)<D and Dist(i,j)=0 otherwise (D is the same tolerance constant as before).
In the case when x and y have the same sign, we have a potentially better estimate p=Min(m,n). We should take this value if it is larger than the value obtained from the above formula.
Also, the above formula is underestimating the precision of the result by 1 bit if the result and the absolute error are dominated by one of the summands. In this case the absolute error should be unchanged save for the Dist term, i.e. the above formula needs to be incremented by 1. The condition for this is B(x)>B(y) and B(x)-m>B(y)-n, or the same for y instead of x.
The result is now {z,p}.
Note that the obtained value of p may be negative (total underflow) even though we have first checked for underflow. In that case, we need to transform {z,p} into a floating zero, as usual.
Now consider the case when z:=x+y=0.
This is only possible when B(x)=B(y). Then the result is {0.,p} where p is found, as discussed above, as p=Min(m,n)-B(x).
If the addition needs to be performed with a given maximum precision P, and it turns out that p>P, then we may truncate the final result to P digits and set its precision to P instead. (It is advisable to leave a few bits untruncated as guard bits.) However, the first operation z:=x+y must be performed with the precision specified above, or else we run the danger of losing significant digits of z.
It is enough to convert the integer to a float {x,m} with a certain finite precision m and then follow the general procedure for adding floats. The precision m must be large enough so that the absolute error of {x,m} is smaller than the absolute error of {y,n}: B(x)-m<=B(y)-n-1, hence m>=B(x)-B(y)+n+1.
Sometimes the formula gives a negative value for the minimum m; this means underflow while adding the integer (e.g. adding 1 to 1.11e150). In this case we do not need to perform any addition at all.
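The precision-update rules for addition described in this subsection can be collected into one small routine; here bx, by, bz denote the bit counts B(x), B(y), B(z), and the formulae are the ones given above (so this sketch inherits their approximations):

#include <algorithm>
#include <cstdlib>

// Precision (in correct bits) of z = x + y, where x has m correct bits, y has n,
// and the sum is nonzero. D is the tolerance constant used throughout.
int AddPrecision(int bx, int m, int by, int n, int bz, bool same_sign)
{
    const int D = 1;
    int ex = bx - m;                               // exponent of the absolute error of x
    int ey = by - n;                               // exponent of the absolute error of y
    int dist = (std::abs(ex - ey) < D) ? 1 : 0;    // both errors contribute equally
    int p = bz - std::max(ex, ey) - 1 - dist;
    // one summand dominates both the value and the error: gain one bit back
    if ((bx > by && ex > ey) || (by > bx && ey > ex))
        p += 1;
    if (same_sign)                                 // no cancellation is possible
        p = std::max(p, std::min(m, n));
    return p;                                      // a negative p indicates total underflow
}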
First consider the case when x!=0 and y!=0. The resulting value is z=x*y and the precision is p=Min(m,n)-Dist(m,n), i.e. p=Min(m,n) when the two precisions differ by D or more and p=Min(m,n)-1 otherwise, as derived above.
If one of the numbers is an integer x, and the other is a float {y,n}, it is enough to convert x to a float with somewhat more than n bits, e.g. {x,n+3}, so that the Dist function does not decrement the precision of the result.
Now consider the case when {x,m}={0.,m} but y!=0. The result z is a floating zero and the resulting (absolute) precision is p=m-B(y)+1.
Finally, consider the case when {x,m}={0.,m} and {y,n}={0.,n}. The result z is a floating zero and the resulting precision is p=m+n.
The last two formulae are the same if we defined the bit count of {0.,m} as 1-m. This differs from the "standard" definition of B(0)=1. (The "standard" definition is convenient for the handling of addition.) With this non-standard definition, we may use the unified formula p=m-B(y)+1 for any multiplication involving a floating zero {0.,m}.
If the multiplication needs to be performed to a given target precision P which is larger than the estimate p, then we can save time by truncating both operands to P digits before performing the multiplication. (It is advisable to leave a few bits untruncated as guard bits.)
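For comparison, the multiplication rules of this subsection fit into an equally small routine (a zero flag together with the precision field models a floating zero {0.,m}; bx and by are the bit counts of nonzero operands):

#include <algorithm>
#include <cstdlib>

// Precision of the product of {x,m} and {y,n}. For a floating zero the returned
// value is the absolute precision of the resulting zero, otherwise it is relative.
int MultiplyPrecision(bool x_is_zero, int bx, int m,
                      bool y_is_zero, int by, int n)
{
    const int D = 1;                                  // tolerance constant, as above
    if (x_is_zero && y_is_zero) return m + n;         // {0.,m}*{0.,n} -> {0., m+n}
    if (x_is_zero) return m - by + 1;                 // {0.,m}*{y,n}  -> {0., m-B(y)+1}
    if (y_is_zero) return n - bx + 1;                 // {x,m}*{0.,n}  -> {0., n-B(x)+1}
    int dist = (std::abs(m - n) < D) ? 1 : 0;         // errors of comparable size
    return std::min(m, n) - dist;                     // relative precision of x*y
}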
When x=0 and y!=0, the result of division {0.,m}/{y,n} is a floating zero {0.,p} where p=m+B(y)-1. When x is an integer zero, the result is also an integer zero.
Division by an integer zero or by a floating zero is not permitted. The code should signal a zero division error.
If the number {x,n} is nonzero, then only x changes by shifting but n does not change; if {x,n} is a floating zero, then x does not change and n is decremented (ShiftLeft) or incremented (ShiftRight) by the shift amount:
{x, n} << s = {x<<s, n};
{0.,n} << s = {0., n-s};
{x, n} >> s = {x>>s, n};
{0.,n} >> s = {0., n+s};
We could have implemented the exponent N as a big integer but this would be inefficient most of the time, slowing down the calculations. Arithmetic with floating-point numbers requires only very simple operations on their exponents (basically, addition and comparisons). These operations would be dominated by the overhead of dealing with big integers, compared with platform integers.
A known issue with limited exponents is the floating-point overflow and exponent underflow. (This is not the same underflow as with adding 1 to a very small number.) When the exponent becomes too large to be represented by a platform signed integer type, the code must signal an overflow error (e.g. if the exponent is above 2^31) or an underflow error (e.g. if the exponent is negative and below -2^31).
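A sketch of such a check (the concrete limits depend on the platform integer type chosen for the exponents):

#include <limits>
#include <stdexcept>

// Signal an error when a computed exponent no longer fits into the platform type.
void CheckExponentRange(long long exponent)
{
    if (exponent > std::numeric_limits<int>::max())
        throw std::overflow_error("floating-point exponent overflow");
    if (exponent < std::numeric_limits<int>::min())
        throw std::underflow_error("floating-point exponent underflow");
}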
First, we recognize that we shall only have one particular numerical library linked with Yacas, and we do not have to compile our implementation of the square root if this library already contains a good implementation. We can use conditional compilation directives (#ifdef) to exclude our square root code and to insert a library wrapper instead. This scheme could be automated, so that appropriate #defines are automatically created for all functions that are already available in the given multiple-precision library, and the corresponding Yacas kernel code that uses the BigNumber API is automatically replaced by library wrappers.
Second, we might compile the library wrapper as a plugin, replacing the script-level square root function with a plugin-supplied function. This solution is easier in some ways because it doesn't require any changes to the Yacas core, only to the script library. However, the library wrapper will only be available to the Yacas scripts and not to the Yacas core functions. The basic assumption of the plugin architecture is that plugins can provide new external objects and functions to the scripts, but plugins cannot modify anything in the kernel. So plugins can replace a function defined in the scripts, but cannot replace a kernel function. Suppose that some other function, such as a computation of the elliptic integral which heavily uses the square root, were implemented in the core using the BigNumber API. Then it will not be able to use the square root function supplied by the plugin because it has been already compiled into the Yacas kernel.
Third, we might put all functions that use the basic API (MathSqrt, MathSin etc.) into the script library and not into the Yacas kernel. When Yacas is compiled with a particular numerical library, the functions available from the library will also be compiled as the kernel versions of MathSqrt, MathPower and so on (using conditional compilation or configured at build time). Since Yacas tries to call the kernel functions before the script library functions, the available kernel versions of MathSqrt etc. will supersede the script versions, but other functions such as BesselJ will be used from the script library. The only drawback of this scheme is that a plugin will not be able to use the faster versions of the functions, unless the plugin was compiled specifically with the requirement of the particular numerical library.
So it appears that either the first or the third solution is viable.
Suppose that the mantissa of a floating-point number is known to d decimal digits. It means that the relative error is no more than 0.5*10^(-d). The mantissa is represented internally as a binary number. The number b of precise bits of mantissa should be determined from the equation 10^(-d)=2^(-b), which gives b=d*Ln(10)/Ln(2).
One potential problem with the conversions is that of incorrect rounding. It is impossible to represent d decimal digits by some exact number b of binary bits. Therefore the actual value of b must be a little different from the theoretical one. Then suppose we perform the inverse operation on b to obtain the corresponding number of precise decimal digits; there is a danger that we shall obtain a number d' that is different from d.
To avoid this danger, the following trick is used. The binary base 2 is the least of all possible bases, so successive powers of 2 are more frequent than successive powers of 10 or of any other base. Therefore for any power of 10 there will be a unique power of 2 that is the first one above it.
The recipe to obtain this power of 2 is simple: one should round d*Ln(10)/Ln(2) upwards using the Ceil function, but b*Ln(2)/Ln(10) should be rounded downwards using the Floor function.
This procedure will make sure that the number of bits b is high enough to represent all information in the d decimal digits; at the same time, the number d will be correctly restored from b. So when a user requests d decimal digits of precision, Yacas may simply compute the corresponding value of b and store it. The precision of b digits is enough to hold the required information, and the precision d can be easily computed given b.
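The recipe can be expressed directly in code (a sketch, not the actual Yacas routine):

#include <cmath>

// Number of bits needed to hold d decimal digits: round upwards.
int BitsFromDigits(int d)
{
    return (int)std::ceil(d * std::log(10.0) / std::log(2.0));
}

// Number of decimal digits guaranteed by b bits: round downwards.
int DigitsFromBits(int b)
{
    return (int)std::floor(b * std::log(2.0) / std::log(10.0));
}

For example, BitsFromDigits(10) gives 34, and DigitsFromBits(34) gives back 10, so the round trip restores the requested number of decimal digits.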
Higher up, Yacas only knows about objects derived from LispObject. Specifically, there are objects of class LispAtom which represent an atom.
Symbolic and string atoms are uniquely represented by the result returned by the String() method. For number atoms, there is a separate class, LispNumber. Objects of class LispNumber also have a String() method in case a string representation of a number is needed, but the main uniquely identifying piece of information is the object of class BigNumber stored inside a LispNumber object. This object is accessed using the Number() method of class LispNumber.
The life cycle of a LispNumber is as follows:
In order to fully support the LispNumber object, the kernel function that determines whether two objects are the same needs to know about LispNumber, so that number atoms are compared through their numeric values; this is required for correct behaviour. Pattern matching, for instance, uses comparisons of this type, so comparisons are performed often and need to be efficient.
The other functions working on numbers can, in principle, call the String() method, but that induces conversions from BigNumber to string, which are relatively expensive operations. For efficiency reasons, the functions dealing with numeric input should call the Number() method, operate on the BigNumber returned, and return a LispNumber constructed with a BigNumber. A function can call String() and return a LispNumber constructed with a string representation, but it will be less efficient.
When a BigNumber is constructed from a decimal string, one has to specify a desired precision (in decimal digits). Internally, BigNumber objects store numbers in binary and will allocate enough bits to cover the desired precision. However, if the given string has more digits than the given precision, the BigNumber object will not truncate it but will allocate more bits so that the information given in the decimal string is not lost. If later the string representation of the BigNumber object is requested, the produced string will match the string from which the BigNumber object was created.
Internally, the BigNumber object knows how many precise bits it has. The number of precise digits might be greater than the currently requested precision. But a truncation of precision will only occur when performing arithmetic operations. This behavior is desired, for example:
In> Precision(6)
Out> True;
In> x:=1.23456789
Out> 1.23456789;
In> x+1.111
Out> 2.345567;
In> x
Out> 1.23456789;
This behavior is implemented by storing the string representation "1.23456789" in the LispNumber object x. When an arithmetic calculation such as x+1.111 is requested, the Number method is called on x. This method, when called for the first time, converts the string representation into a BigNumber object. That BigNumber object will have 28 bits to cover the 9 significant digits of the number, not the 19 bits normally required for 6 decimal digits of precision. But the result of an arithmetic calculation is not computed with more than 6 decimal digits. Later when x needs to be printed, the full string representation is available so it is printed.
If now we increase precision to 20 digits, the object x will be interpreted as 1.23456789 with 12 zeros at the end.
In> Precision(20)
Out> True;
In> x+0.000000000001
Out> 1.234567890001;