October 1st, 2000: modified according to the Ruby-1.6 changes.

BigFloat is an extension library for the Ruby interpreter. Using the BigFloat class, you can carry out computations with any number of significant digits.

**Maintenance of BigFloat has stopped; use BigDecimal, which has been bundled since Ruby-1.8, instead.
The most recent source code can be downloaded from Ruby CVS.** 2003 - 8

For details about Ruby, see:

- http://www.ruby-lang.org/en/ : Official Ruby page (English).
- http://ruby.freak.ne.jp/ : Ruby information (Japanese).
- http://kahori.com/ruby/ring/ : Mutually linked pages relating to Ruby (Japanese).

This software is provided "AS IS" and without any express or implied warranties, including, without limitation, the implied warranties of merchantability and fitness for a particular purpose. For details, see COPYING and README included in this distribution.

- Introduction
- Usage and methods
- Infinity, NaN, Zero
- Internal structure
- Binary or decimal number representation
- Resulting number of significant digits

ruby extconf.rb
make
make install

ruby extconf.rb
nmake
nmake install

For users of Microsoft Visual C/C++ 6.0, project files are available. Download:

winide143.lzh and decompress it in the directory "../ruby-1.4.3/win32", or

winide143.lzh (or winide16.tar.gz) and decompress it in the directory "../ruby-1.6.x/win32".

require 'BigFloat' # From v1.1.9 change to 'bigfloat' (UNIX user)
a=BigFloat::new("0.123456789123456789")
b=BigFloat::new("123456.78912345678",40)
c=a+b

- new
- +
- -
- *
- /
- assign!
- assign
- add!
- add
- sub!
- sub
- mult!
- mult
- div!
- div
- %
- fix
- frac
- floor
- ceil
- round
- truncate
- divmod
- remainder
- abs
- to_i
- to_s
- to_s2
- exponent
- to_f
- E
- PI
- BASE
- mode
- limit([n])
- sign
- nan?
- infinite?
- finite?
- to_parts
- inspect
- dup
- sqrt
- sincos
- exp
- power
- zero?
- nonzero?
- <=>

The "new" method creates a new BigFloat object.

a=BigFloat::new(s[,n])

where:

s: Initial value string.

n: Maximum number of significant digits of a. n must be a Fixnum object. If n is omitted or is equal to 0, then the maximum number of significant digits of a is determined from the length of s.

addition (c = a + b)

For the resulting number of significant digits of c, see Resulting number of significant digits.

subtraction (c = a - b) or negation (c = -a)

For the resulting number of significant digits of c, see Resulting number of significant digits.

multiplication (c = a * b)

For the resulting number of significant digits of c, see Resulting number of significant digits.

division (c = a / b)

For the resulting number of significant digits of c, see Resulting number of significant digits.

assign! is a class method, and is used like:

n = BigFloat::assign!(c,a,f)

When f > 0, a is assigned to c. When f < 0, -a is assigned to c. The absolute value of f (|f|) must be 1 or 2, and it matters only when the maximum number of significant digits of c is less than the current number of significant digits of a: if |f|=2, c is properly rounded; if |f|=1, the extra digits are discarded. n is the resulting number of significant digits of c. n = 0 means that c is NaN or Infinity.

c = a.assign(n,f)

assigns the value of a to c.

If f > 0, then a is assigned to c.

If f < 0, then -a is assigned to c.

The meaning of f is the same as in the assign! method. n is the resulting number of significant digits of c.

add! is a class method, and is used like:

n = BigFloat::add!(c,a,b)

BigFloat::add!(c,a,b) performs c = a + b. If the maximum number of significant digits c can hold is less than the actual number of significant digits of a + b, then c is rounded properly. n is the resulting number of significant digits of c. n = 0 means that c is NaN or Infinity.

c = a.add(b,n)

c = a.add(b,n) performs c = a + b. If n is less than the actual number of significant digits of a + b, then c is rounded properly.

sub! is a class method, and is used like:

n = BigFloat::sub!(c,a,b)

BigFloat::sub!(c,a,b) performs c = a - b. If the maximum number of significant digits c can hold is less than the actual number of significant digits of a - b, then c is rounded properly. n is the resulting number of significant digits of c. n = 0 means that c is NaN or Infinity.

c = a.sub(b,n)

c = a.sub(b,n) performs c = a - b. If n is less than the actual number of significant digits of a - b, then c is rounded properly.

mult! is a class method, and is used like:

n = BigFloat::mult!(c,a,b)

BigFloat::mult!(c,a,b) performs c = a * b. If the maximum number of significant digits c can hold is less than the actual number of significant digits of a * b, then c is rounded properly. n is the resulting number of significant digits of c. n = 0 means that c is NaN or Infinity.

c = a.mult(b,n)

c = a.mult(b,n) performs c = a * b. If n is less than the actual number of significant digits of a * b, then c is rounded properly.

div! is a class method, and is used like:

n = BigFloat::div!(c,r,a,b)

BigFloat::div!(c,r,a,b) performs c = a / b; r is the residue of a / b. If necessary, the division continues up to the maximum number of significant digits c can hold. Unlike the divmod method, c is not always an integer. c is never rounded, and the equation a = c*b + r always holds unless c is NaN or Infinity. Because r = a-c*b always holds, the maximum number of significant digits r can hold must be greater than or equal to the current number of significant digits of a and of c*b. If r cannot hold enough significant digits, the computation stops with an error message. n is the resulting number of significant digits of c. n = 0 means that c is NaN or Infinity.

c,r = a.div(b,n)

c,r = a.div(b,n) performs c = a / b; r is the residue of a / b. If necessary, the division continues until c holds n digits. Unlike the divmod method, c is not always an integer. c is never rounded, and the equation a = c*b + r always holds unless c is NaN or Infinity.
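
The relation a = c*b + r can be checked without BigFloat itself. The following is only a sketch that uses Ruby's built-in Rational as a stand-in. The helper name div_with_residue is hypothetical, and it cuts the quotient off after a number of decimal places, which is an assumption made for simplicity (BigFloat counts significant digits).

```ruby
# Hypothetical stand-in for a.div(b, n), built on Rational instead of BigFloat.
# NOTE: `places` counts decimal places here, not significant digits.
def div_with_residue(a, b, places)
  a = a.to_r
  b = b.to_r
  c = (a / b).truncate(places)  # the quotient is cut off, never rounded
  r = a - c * b                 # the residue absorbs everything that was cut off
  [c, r]
end

c, r = div_with_residue(1, 3, 5)
puts "c = #{c.to_f}, r = #{r.to_f}"
puts "a == c*b + r holds: #{c * 3 + r == 1}"
```

Because the quotient is truncated rather than rounded, the residue has the same sign as a/b and the identity holds exactly.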

r = a%b

is the same as:

r = a-((a/b).floor)*b

c = a.fix

returns the integer part of a.

c = a.frac

returns the fraction part of a.

c = a.floor

returns the maximum integer value (in BigFloat) which is less than or equal to a.

As shown in the following example, an optional integer argument (n) specifying the position of the 'floor'ed digit can be given. If n>0, then the (n+1)th digit counted from the decimal point in the fraction part is 'floor'ed. If n<0, then the |n|-th digit counted from the decimal point in the integer part is 'floor'ed.

c = BigFloat::new("1.23456")

d = c.floor(4) # d = 1.2345

c = BigFloat::new("15.23456")

d = c.floor(-1) # d = 10.0

c = a.ceil

returns the minimum integer value (in BigFloat) which is greater than or equal to a.

As shown in the following example, an optional integer argument (n) specifying the position of the 'ceil'ed digit can be given. If n>0, then the (n+1)th digit counted from the decimal point in the fraction part is 'ceil'ed. If n<0, then the |n|-th digit counted from the decimal point in the integer part is 'ceil'ed.

c = BigFloat::new("1.23456")

d = c.ceil(4) # d = 1.2346

c = BigFloat::new("15.23456")

d = c.ceil(-1) # d = 20.0

c = a.round

rounds a off to the nearest 1 (that is, to the nearest integer).

As shown in the following example, an optional integer argument (n) specifying the position of the rounded digit can be given. If n>0, then the (n+1)th digit counted from the decimal point in the fraction part is rounded. If n<0, then the |n|-th digit counted from the decimal point in the integer part is rounded.

c = BigFloat::new("1.23456")

d = c.round(4) # d = 1.2346

c = BigFloat::new("15.23456")

d = c.round(-1) # d = 20.0

c = a.truncate

truncates a to an integer (the digits after the decimal point are dropped).

As shown in the following example, an optional integer argument (n) specifying the position of the truncated digit can be given. If n>0, then the (n+1)th digit counted from the decimal point in the fraction part is truncated. If n<0, then the |n|-th digit counted from the decimal point in the integer part is truncated.

c = BigFloat::new("1.23456")

d = c.truncate(4) # d = 1.2345

c = BigFloat::new("15.23456")

d = c.truncate(-1) # d = 10.0
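
The digit-position argument can be tried out with Ruby's built-in Rational, whose floor, ceil, round and truncate accept a similar ndigits argument. This is only a sketch: Rational is a stand-in for BigFloat, digit_ops is a hypothetical helper, and it is an assumption that BigFloat's n and Ruby's ndigits agree for every input.

```ruby
# Rational is used here only as an exact stand-in for BigFloat.
# Returns the four digit-position operations applied to the same value.
def digit_ops(value, n)
  x = Rational(value)
  {
    floor:    x.floor(n),
    ceil:     x.ceil(n),
    round:    x.round(n),
    truncate: x.truncate(n)
  }
end

digit_ops("1.23456", 4).each   { |name, v| puts "#{name}(4)  = #{v.to_f}" }
digit_ops("15.23456", -1).each { |name, v| puts "#{name}(-1) = #{v.to_f}" }
```

For positive values floor and truncate agree; they differ only for negative values, where floor moves away from zero.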

c,r = a.divmod(b) # a = c*b + r

returns the quotient and remainder of a/b.

a = c * b + r is always satisfied,

where c is the integer satisfying c = (a/b).floor

and, therefore, r = a - c*b

r=a.remainder(b)

returns the remainder of a/b,

where c is the integer satisfying c = (a/b).fix

and, therefore, r = a - c*b
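
The two definitions differ only when a/b is negative. A minimal sketch, again with Rational standing in for BigFloat (floor_style and fix_style are hypothetical names):

```ruby
# Rational stands in for BigFloat; only the floor-vs-fix distinction matters.
def floor_style(a, b)
  c = (a / b).floor      # quotient used by divmod and %
  [c, a - c * b]
end

def fix_style(a, b)
  c = (a / b).truncate   # quotient used by remainder (fix drops the fraction)
  [c, a - c * b]
end

a = Rational(-7, 2)      # -3.5
b = Rational(2)
p floor_style(a, b)      # quotient -2, remainder +1/2
p fix_style(a, b)        # quotient -1, remainder -3/2
```

With the floor-style quotient the remainder takes the sign of b; with the fix-style quotient it takes the sign of a.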

c = a.abs

returns the absolute value of a.

changes a to an integer.

i = a.to_i

i is a Fixnum or a Bignum. If a is Infinity or NaN, then i is nil.

converts a to a string (the result looks like "0.xxxxxEn").

s = a.to_s

Same as to_s. to_s2(n) inserts a space after every n digits for readability.

s = a.to_s2(n)

returns an integer holding exponent value of a.

n = a.exponent

means a = 0.xxxxxxx*10**n.

Same as the dup method: creates a new BigFloat object having the same value.

e = BigFloat::E(n)

where e (=2.718281828....) is the base of the natural logarithm.

n specifies the number of significant digits of e.

e = BigFloat::PI(n)

returns at least n digits of the ratio of the circumference of a circle to its diameter (pi=3.14159265358979....) using J. Machin's formula.
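
Machin's formula is pi = 16*arctan(1/5) - 4*arctan(1/239). The sketch below evaluates it with plain Ruby integers in fixed-point form. It is not BigFloat's implementation: the names arctan_inv and machin_pi are hypothetical, and the 10 guard digits are an arbitrary choice.

```ruby
# Fixed-point arctan(1/x), scaled by `one`, from the alternating series
# 1/x - 1/(3*x**3) + 1/(5*x**5) - ...
def arctan_inv(x, one)
  power = one / x        # holds one / x**k for the current odd k
  sum = power
  x2 = x * x
  k = 1
  sign = -1
  loop do
    power /= x2
    break if power.zero? # further terms are below the working precision
    k += 2
    sum += sign * (power / k)
    sign = -sign
  end
  sum
end

# Returns pi * 10**digits as an Integer, truncated.
def machin_pi(digits)
  guard = 10             # extra digits that absorb truncation errors
  one = 10**(digits + guard)
  scaled = 16 * arctan_inv(5, one) - 4 * arctan_inv(239, one)
  scaled / 10**guard
end

puts machin_pi(30)
```

The series for 1/5 gains about 1.4 digits per term and the one for 1/239 about 4.8 digits per term, which is why the formula converges quickly.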

Base value used in BigFloat calculations. On a 32 bit integer system, the value of BASE is 10000.

b = BigFloat::BASE

The mode method controls BigFloat computation. The following usages are defined.

f = BigFloat::mode(BigFloat::EXCEPTION_NaN,flag)

f = BigFloat::mode(BigFloat::EXCEPTION_INFINITY,flag)

f = BigFloat::mode(BigFloat::EXCEPTION_UNDERFLOW,flag)

f = BigFloat::mode(BigFloat::EXCEPTION_OVERFLOW,flag)

f = BigFloat::mode(BigFloat::EXCEPTION_ZERODIVIDE,flag)

f = BigFloat::mode(BigFloat::EXCEPTION_ALL,flag)

EXCEPTION_NaN controls execution when a computation results in NaN. EXCEPTION_INFINITY controls execution when a computation results in Infinity (±Infinity). EXCEPTION_UNDERFLOW controls execution when a computation underflows. EXCEPTION_OVERFLOW controls execution when a computation overflows. EXCEPTION_ZERODIVIDE controls execution when a division by zero occurs. EXCEPTION_ALL controls execution for every exception defined above. If the flag is true, then the corresponding exception is thrown. If the flag is false (the default), no exception is thrown and the computation continues with the result:

EXCEPTION_NaN results in NaN

EXCEPTION_INFINITY results in +Infinity or -Infinity

EXCEPTION_UNDERFLOW results in 0.

EXCEPTION_OVERFLOW results in +Infinity or -Infinity

EXCEPTION_ZERODIVIDE results in +Infinity or -Infinity

EXCEPTION_INFINITY, EXCEPTION_OVERFLOW, and EXCEPTION_ZERODIVIDE are currently the same.

The return value of the mode method is the value set. Suppose the return value of the mode method is f; then f & BigFloat::EXCEPTION_NaN != 0 means that EXCEPTION_NaN is set to on. If the value of the flag argument is other than nil, true, or false, then the current mode status is returned.

Limits the maximum number of digits that newly created BigFloat objects can hold to n. Returns the limit that was in effect before the call. Zero, the default value, means no upper limit.

mf = BigFloat::limit(n)

returns the 'attribute' of a.

n = a.sign

where the value of n means that a is:

n = BigFloat::SIGN_NaN(0) : a is NaN

n = BigFloat::SIGN_POSITIVE_ZERO(1) : a is +0

n = BigFloat::SIGN_NEGATIVE_ZERO(-1) : a is -0

n = BigFloat::SIGN_POSITIVE_FINITE(2) : a is positive

n = BigFloat::SIGN_NEGATIVE_FINITE(-2) : a is negative

n = BigFloat::SIGN_POSITIVE_INFINITE(3) : a is +Infinity

n = BigFloat::SIGN_NEGATIVE_INFINITE(-3) : a is -Infinity

The value in () is the actual value; see Internal structure.

a.nan? returns true when a is NaN.

a.infinite? returns true when a is +Infinity or -Infinity.

a.finite? returns true when a is neither Infinity nor NaN.

decomposes a BigFloat value into 4 parts. All 4 parts are returned as an array.

The parts are, in order: a sign (0 when the value is NaN, +1 for a positive value and -1 for a negative value), a string representing the fraction part, the base value (always 10 currently), and an integer (Fixnum) for the exponent.

a=BigFloat::new("3.14159265",10)

f,x,y,z = a.to_parts

where f=+1, x="314159265", y=10 and z=1

Therefore, you can translate a BigFloat value to a Float as:

s = "0."+x

b = f*(s.to_f)*(y**z)
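
The same translation can be wrapped in a small helper and tried with the literal parts from the example, so no BigFloat object is needed. The name parts_to_f is hypothetical, and the result is only as precise as a Float allows.

```ruby
# Hypothetical helper: rebuilds a Float from the four parts described above.
def parts_to_f(sign, digits, base, exponent)
  sign * ("0." + digits).to_f * (base**exponent)
end

# Parts copied from the to_parts example above.
b = parts_to_f(+1, "314159265", 10, 1)
puts b
```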

inspect is used for debugging output.

p a=BigFloat::new("3.14",10)

should produce output like "[0x112344:'0.314E1',4(12)]", where "0x112344" is the address, '0.314E1' is the value, 4 is the current number of significant digits, and 12 is the maximum number of significant digits the object can hold.

creates a new BigFloat object having the same value.

c = a.sqrt(n)

computes the square root of a to at least n significant digits.

computes and returns the sine and cosine of a to at least n significant digits.

sin,cos = a.sincos(n)

c = a.exp(n)

computes e (the base of the natural logarithm, 2.718281828....) raised to the power a, to at least n significant digits.

c = a.power(n)

returns a raised to the power n (c=a**n). n must be an integer.

c = a.zero?

returns true if a is equal to 0, otherwise returns false.

c = a.nonzero?

returns false if a is 0, otherwise returns a itself.

c = a <=> b

returns 0 if a==b, 1 if a > b, and -1 if a < b.

- ==
- === same as ==, used in case statements.
- !=
- <
- <=
- >
- >=

- 1. Both A and B are BigFloat objects
  - A op B is performed normally.
- 2. A is a BigFloat object but B is not
  - The operation is performed after B is translated to the corresponding BigFloat object (because BigFloat supports the coerce method).
- 3. A is not a BigFloat object but B is
  - If A has a coerce method, then B will translate A to the corresponding BigFloat object and the operation is performed; otherwise an error occurs.
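
The coerce mechanism itself can be sketched with a small stand-in class. Boxed is hypothetical and is not how BigFloat is implemented; it only shows how Ruby's built-in numeric operators hand an unknown right-hand operand to that operand's coerce method.

```ruby
# Hypothetical class used only to illustrate the coerce protocol.
class Boxed
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Cases 1 and 2: the receiver is a Boxed object; the other side is
  # converted if necessary.
  def *(other)
    other = Boxed.new(other) unless other.is_a?(Boxed)
    Boxed.new(@value * other.value)
  end

  # Case 3: called by Integer#* when the right-hand operand is a Boxed
  # object. Returns [converted_left, self]; Ruby then retries the operator.
  def coerce(left)
    [Boxed.new(left), self]
  end
end

right = Boxed.new(3) * 2   # receiver is Boxed
left  = 2 * Boxed.new(3)   # Integer#* falls back to Boxed#coerce
puts right.value
puts left.value
```

String#* does not take part in this protocol, which is why a String on the left-hand side fails in the example further below.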

Strings representing Infinity or NaN, such as "Infinity", "+Infinity", "-Infinity", and "NaN", can also be translated to BigFloat unless false is specified by the mode method.

The BigFloat class supports the coerce method (for details about the coerce method, see the Ruby documentation). This means that most binary operations can be performed if the BigFloat object is on the left-hand side of the operation.

For example:

a = BigFloat.E(20)
c = a * "0.123456789123456789123456789" # A String is changed to BigFloat object.

is performed normally. But, because String does not have a coerce method, the following example cannot be performed.

a = BigFloat.E(20)
c = "0.123456789123456789123456789" * a # ERROR

If the error above is actually an inconvenience, you can define a new class derived from the String class and define a coerce method within the new class.

NaN (Not a Number) can be obtained by an undefined computation like 0.0/0.0 or Infinity-Infinity. Any computation involving NaN results in NaN. Comparisons with NaN are never true, including comparison with NaN itself.

Zero has two variations, +0.0 and -0.0. Still, +0.0==-0.0 is true.
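
Ruby's built-in Float follows the same conventions for NaN, Infinity and signed zero, so the rules above can be sketched without BigFloat. That Float and BigFloat agree in every case is an assumption; the helper name special_value_facts is hypothetical.

```ruby
# Float is used as an analogy for BigFloat's special values.
def special_value_facts
  nan = 0.0 / 0.0
  inf = 1.0 / 0.0
  {
    nan_from_inf_minus_inf: (inf - inf).nan?,           # undefined computation
    nan_propagates:         (nan + 1.0).nan?,           # NaN in, NaN out
    nan_equals_itself:      nan == nan,                 # never true
    zeros_equal:            0.0 == -0.0,                # +0.0 == -0.0
    zeros_distinguishable:  (1.0 / 0.0) != (1.0 / -0.0) # the sign still shows
  }
end

special_value_facts.each { |name, result| puts "#{name}: #{result}" }
```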

Computation results involving Infinity, NaN, +0.0 or -0.0 become complicated. Run the following program and confirm the results. Please send me any incorrect results you find.

```
require "BigFloat"
aa = %w(1 -1 +0.0 -0.0 +Infinity -Infinity NaN)
ba = %w(1 -1 +0.0 -0.0 +Infinity -Infinity NaN)
opa = %w(+ - * / <=> > >= < == != <=)
for a in aa
  for b in ba
    for op in opa
      x = BigFloat::new(a)
      y = BigFloat::new(b)
      eval("ans= x #{op} y;print a,' ',op,' ',b,' ==> ',ans.to_s,\"\n\"")
    end
  end
end
```

Internally, any BigFloat number is represented in the form 0.xxxxxxxxx*BASE**n, where 'x' is any digit of the mantissa (kept in the array frac[]), BASE is the base value (=10000 on a 32 bit integer system), and n is the exponent value.

A larger BASE value allows a smaller frac[] array and increases computation speed. The value of BASE is defined in VpInit(). On a 32 bit integer system, this value is 10000. On a 64 bit integer system, the value becomes larger. BigFloat has not yet been compiled and tested on a 64 bit integer system. It would be very nice if anyone would try to run BigFloat on a 64 bit system and inform me of the results. When BASE is 10000, an element of the array frac[] can have a value from 0 to 9999 (up to 4 digits).

The structure Real is defined in bigfloat.h as:

```
typedef struct {
    unsigned long MaxPrec;  // The size of the array frac[]
    unsigned long Prec;     // Current size of frac[] actually used.
    short sign;             // Attribute of the value.
                            //  ==0 : NaN
                            //    1 : +0
                            //   -1 : -0
                            //    2 : Positive number
                            //   -2 : Negative number
                            //    3 : +Infinity
                            //   -3 : -Infinity
    unsigned short flag;    // Control flag
    int exponent;           // Exponent value(0.xxxx*BASE**exponent)
    unsigned long frac[1];  // An array holding mantissa(Variable)
} Real;
```

The decimal value 1234.56784321 is represented (with BASE=10000) as 0.1234 5678 4321*(10000)**1, where frac[0]=1234, frac[1]=5678, frac[2]=4321, Prec=3, sign=2, exponent=1. MaxPrec can be any value greater than or equal to Prec.
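
The decomposition can be reproduced in a few lines of Ruby. This is a sketch for positive decimal strings only; to_frac is a hypothetical helper and does not mirror the C code in the library.

```ruby
# Hypothetical helper: splits a positive decimal string into frac[] elements
# and an exponent so that value = 0.f0 f1 f2 ... * BASE**exponent,
# where BASE = 10**digits_per_element (10000 by default).
def to_frac(decimal, digits_per_element = 4)
  n = digits_per_element
  int_part, frac_part = decimal.split(".")
  frac_part ||= ""
  # Pad both parts so that they fall on element boundaries.
  int_part  = "0" * (-int_part.length % n) + int_part
  frac_part = frac_part + "0" * (-frac_part.length % n)
  int_elems  = int_part.chars.each_slice(n).map { |d| d.join.to_i }
  frac_elems = frac_part.chars.each_slice(n).map { |d| d.join.to_i }
  elems = int_elems + frac_elems
  exponent = int_elems.length
  # Normalize: drop leading zero elements (adjusting the exponent)
  # and trailing zero elements.
  while elems.length > 1 && elems.first.zero?
    elems.shift
    exponent -= 1
  end
  elems.pop while elems.length > 1 && elems.last.zero?
  { frac: elems, exponent: exponent }
end

p to_frac("1234.56784321")
```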

- Easy for debugging
  - The floating number 1234.56784321 can be easily represented as:
    frac[0]=1234, frac[1]=5678, frac[2]=4321, exponent=1, and sign=2.
- Exact representation
  - The following program can add all numbers (in decimal) in a file without any error (no round operation):
    `file = File::open(....,"r"); s = BigFloat::new("0"); while line = file.gets; s = s + line; end`
  - If the internal representation is binary, translation from decimal to binary is required and the translation error is inevitable. For example, 0.1 cannot be represented exactly in binary:
    0.1 => b1*2**(-1)+b2*2**(-2)+b3*2**(-3)+b4*2**(-4)....
    where b1=0, b2=0, b3=0, b4=1...
    bn (n=1,2,3,...) is an infinite series of digits with value 0 or 1, so a rounding operation is necessary, but where should we round the series? Of course, exact "0.1" is printed if the rounding operation is properly done.
- Significant digits we can have are automatically determined
  - In binary representation, 0.1 cannot be represented by a finite series of digits. But we only need one element (frac[0]=1000) in decimal representation. This means that we can always determine the size of the array frac[] in the Real structure.
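
The translation error can be observed with Ruby's built-in binary Float. In this sketch Rational plays the role of an exact decimal representation; float_error and sums are hypothetical helper names.

```ruby
# Difference between the binary Float nearest to a decimal string and the
# exact decimal value (zero means the Float is exact).
def float_error(decimal)
  decimal.to_f.to_r - Rational(decimal)
end

# Adds decimal strings twice: once as binary Floats, once exactly.
def sums(lines)
  float_sum = lines.map(&:to_f).inject(0.0) { |acc, v| acc + v }
  exact_sum = lines.map { |s| Rational(s) }.inject(0) { |acc, v| acc + v }
  [float_sum, exact_sum]
end

puts float_error("0.1") == 0   # false: 0.1 has no finite binary expansion
puts float_error("0.5") == 0   # true: 0.5 is exactly 2**(-1)

float_sum, exact_sum = sums(["0.1"] * 10)
puts float_sum == 1.0          # false: ten rounded values were added
puts exact_sum == 1            # true: no rounding took place
```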

The resulting number of significant digits is defined as follows:

1.1 For * and /, the resulting number of significant digits is the sum of the significant digits of both sides of the operator.

1.2 For + and -, the resulting number of significant digits is determined so that no round operation is needed.

For example, c has more than 100 significant digits if c is computed as:

c = 0.1+0.1*10**(-100)

As +, -, and * are always exact (no round operation is performed), attention must be paid to programs like:

e = BigFloat.new("1")
while e + 1.0 != 1.0
  e = e / 10
end

The above example continues until all available memory is exhausted
(because no round operation is performed on e+1.0).

As for division, c = a/b, the number of significant digits of c is the same as for a*b. A division such as c=1.0/3.0 will be rounded.
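
The difference between rounded and exact addition can be shown with a bounded version of the loop. Float stands for rounded arithmetic and Rational for exact arithmetic; steps_until_absorbed is a hypothetical helper, and the iteration limit exists only so that the exact case stops.

```ruby
# Counts divisions by ten until e + 1 == 1, giving up after `limit` steps.
# Returns nil when the limit is reached without e being absorbed.
def steps_until_absorbed(one, ten, limit)
  e = one
  limit.times do |i|
    return i if e + one == one
    e /= ten
  end
  nil
end

# Float: rounding eventually makes e + 1.0 equal to 1.0, so the loop ends.
p steps_until_absorbed(1.0, 10.0, 1000)
# Exact arithmetic: e + 1 never equals 1, so only the limit stops the loop.
p steps_until_absorbed(Rational(1), Rational(10), 1000)
```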

2.1 Using class method

#
# PI (Calculates 3.1415.... using J. Machin's formula.)
#
sig = 2000 # <== Number of significant figures
exp = -sig
sig = sig + sig/100 # no theoretical reason
pi = BigFloat::new("0",sig)
two = BigFloat::new("2")
m25 = BigFloat::new("-0.04")
m57121 = BigFloat::new("-57121")
k = BigFloat::new("1")
w = BigFloat::new("1")
t = BigFloat::new("-80",sig)
v = BigFloat::new("0",sig)
u = BigFloat::new("0",sig)
r = BigFloat::new("0",sig+sig+1)
n1 = 0
n2 = 0
ts = Time::now
while (u.exponent >= exp)
  n1 += 1
  BigFloat::mult!(v,t,m25)
  BigFloat::assign!(t,v,1)
  BigFloat::div!(u,r,t,k)
  BigFloat::add!(v,pi,u)
  BigFloat::assign!(pi,v,1)
  BigFloat::add!(w,k,two)
  BigFloat::assign!(k,w,1)
end
k = BigFloat::new("1")
w = BigFloat::new("1")
BigFloat::assign!(t,"956",1)
BigFloat::assign!(u,0,1)
while (u.exponent >= exp )
  n2 += 1
  BigFloat::div!(v,r,t,m57121)
  BigFloat::assign!(t,v,1)
  BigFloat::div!(u,r,t,k)
  BigFloat::add!(v,pi,u)
  BigFloat::assign!(pi,v,1)
  BigFloat::add!(w,k,two)
  BigFloat::assign!(k,w,1)
end
p pi
print "# of iterations = ",n1,"+",n2,"\n"
exit

2.2 Using instance method
#
# PI (Calculates 3.1415.... using J. Machin's formula.)
#
sig = 2000 # <== Number of significant figures
exp = -sig
sig = sig + sig/100 # no theoretical reason
pi = BigFloat::new("0")
two = BigFloat::new("2")
m25 = BigFloat::new("-0.04")
m57121 = BigFloat::new("-57121")
n1 = 0
n2 = 0
u = BigFloat::new("1")
k = BigFloat::new("1")
w = BigFloat::new("1")
t = BigFloat::new("-80")
while (u.exponent >= exp)
  n1 += 1
  t = t*m25
  u,r = t.div(k,sig)
  pi = pi + u
  k = k+two
end
u = BigFloat::new("1")
k = BigFloat::new("1")
w = BigFloat::new("1")
t = BigFloat::new("956")
while (u.exponent >= exp )
  n2 += 1
  t,r = t.div(m57121,sig)
  u,r = t.div(k,sig)
  pi = pi + u
  k = k+two
end
p pi
print "# of iterations = ",n1,"+",n2,"\n"
exit