Fixed code to match comments and the algorithm:

- in preparing for the third approximation, actually make t larger in
  magnitude than cbrt(x).  After chopping, t must be incremented by 2
  ulps to make it larger, not 1 ulp since chopping can reduce it by
  almost 1 ulp and it might already be up to half a different-sized-ulp
  smaller than cbrt(x).  I have not found any cases where this is
  essential, but the think-time error bound depends on it.  The relative
  smallness of the different-sized-ulp limited the bug.  If there are
  cases where this is essential, then the final error bound would be
  5/6+epsilon instead of of 4/6+epsilon ulps (still < 1).
- in preparing for the third approximation, round more carefully (but
  still sloppily to avoid branches) so that the claimed error bound of
  0.667 ulps is satisfied in all cases tested for cbrt() and remains
  satisfied in all cases for cbrtf().  There isn't enough spare precision
  for very sloppy rounding to work:
  - in cbrt(), even with the inadequate increment, the actual error was
    0.6685 in some cases, and correcting the increment increased this
    a little.  The fix uses sloppy rounding to 25 bits instead of very
    sloppy rounding to 21 bits, and starts using uint64_t instead of 2
    words for bit manipulation so that rounding more bits is not much
    costly.
  - in cbrtf(), the 0.667 bound was already satisfied even with the
    inadequate increment, but change the code to almost match cbrt()
    anyway.  There is not enough spare precision in the Newton
    approximation to double the inadequate increment without exceeding
    the 0.667 bound, and no spare precision to avoid this problem as
    in cbrt().  The fix is to round using an increment of 2 smaller-ulps
    before chopping so that an increment of 1 ulp is enough.  In cbrt(),
    we essentially do the same, but move the chop point so that the
    increment of 1 is not needed.

Fixed comments to match code:
- in cbrt(), the second approximation is good to 25 bits, not quite 26 bits.
- in cbrt(), don't claim that the second approximation may be implemented
  in single precision.  Single precision cannot handle the full exponent
  range without minor but pessimal changes to renormalize, and although
  single precision is enough, 25 bit precision is now claimed and used.

Added comments about some of the magic for the error bound 4/6+epsilon.
I still don't understand why it is 4/6+ and not 6/6+ ulps.

Indent comments at the right of code more consistently.
This commit is contained in:
Bruce Evans 2005-12-18 21:46:47 +00:00
parent 14bbd30f2e
commit ce804bff58
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/head/; revision=153517
2 changed files with 42 additions and 20 deletions

View file

@ -37,7 +37,12 @@ double
cbrt(double x)
{
int32_t hx;
union {
double value;
uint64_t bits;
} u;
double r,s,t=0.0,w;
uint64_t bits;
u_int32_t sign;
u_int32_t high,low;
@ -47,7 +52,7 @@ cbrt(double x)
if(hx>=0x7ff00000) return(x+x); /* cbrt(NaN,INF) is itself */
GET_LOW_WORD(low,x);
if((hx|low)==0)
return(x); /* cbrt(0) is itself */
return(x); /* cbrt(0) is itself */
/*
* Rough cbrt to 5 bits:
@ -73,7 +78,7 @@ cbrt(double x)
SET_HIGH_WORD(t,sign|(hx/3+B1));
/*
* New cbrt to 26 bits; may be implemented in single precision:
* New cbrt to 25 bits:
* cbrt(x) = t*cbrt(x/t**3) ~= t*R(x/t**3)
* where R(r) = (14*r**2 + 35*r + 5)/(5*r**2 + 35*r + 14) is the
* (2,2) Pade approximation to cbrt(r) at r = 1. We replace
@ -91,16 +96,26 @@ cbrt(double x)
s=C+r*t;
t*=G+F/(s+E+D/s);
/* chop t to 20 bits and make it larger in magnitude than cbrt(x) */
GET_HIGH_WORD(high,t);
INSERT_WORDS(t,high+0x00000001,0);
/*
* Round t away from zero to 25 bits (sloppily except for ensuring that
* the result is larger in magnitude than cbrt(x) but not much more than
* 2 25-bit ulps larger). With rounding towards zero, the error bound
* would be ~5/6 instead of ~4/6. With a maximum error of 1 25-bit ulps
* in the rounded t, the infinite-precision error in the Newton
* approximation barely affects third digit in the the final error
* 0.667; the error in the rounded t can be up to about 12 25-bit ulps
* before the final error is larger than 0.667 ulps.
*/
u.value=t;
u.bits=(u.bits+0x20000000)&0xfffffffff0000000ULL;
t=u.value;
/* one step Newton iteration to 53 bits with error less than 0.667 ulps */
s=t*t; /* t*t is exact */
r=x/s;
w=t+t;
r=(r-t)/(w+r); /* r-t is exact */
t=t+t*r;
/* one step Newton iteration to 53 bits with error < 0.667 ulps */
s=t*t; /* t*t is exact */
r=x/s; /* error <= 0.5 ulps; |r| < |t| */
w=t+t; /* t+t is exact */
r=(r-t)/(w+r); /* r-t is exact; w+r ~= 3*t */
t=t+t*r; /* error <= 0.5 + 0.5/3 + epsilon */
return(t);
}

View file

@ -48,7 +48,7 @@ cbrtf(float x)
hx ^=sign;
if(hx>=0x7f800000) return(x+x); /* cbrt(NaN,INF) is itself */
if(hx==0)
return(x); /* cbrt(0) is itself */
return(x); /* cbrt(0) is itself */
/* rough cbrt to 5 bits */
if(hx<0x00800000) { /* subnormal number */
@ -64,16 +64,23 @@ cbrtf(float x)
s=C+r*t;
t*=G+F/(s+E+D/s);
/* chop t to 12 bits and make it larger in magnitude than cbrt(x) */
/*
* Round t away from zero to 12 bits (sloppily except for ensuring that
* the result is larger in magnitude than cbrt(x) but not much more than
* 1 12-bit ulp larger). With rounding towards zero, the error bound
* would be ~5/6 instead of ~4/6, and with t 2 12-bit ulps larger the
* infinite-precision error in the Newton approximation would affect
* the second digit instead of the third digit of 4/6 = 0.666..., etc.
*/
GET_FLOAT_WORD(high,t);
SET_FLOAT_WORD(t,(high&0xfffff000)+0x00001000);
SET_FLOAT_WORD(t,(high+0x1002)&0xfffff000);
/* one step Newton iteration to 24 bits with error less than 0.667 ulps */
s=t*t; /* t*t is exact */
r=x/s;
w=t+t;
r=(r-t)/(w+r); /* r-t is exact */
t=t+t*r;
/* one step Newton iteration to 24 bits with error < 0.667 ulps */
s=t*t; /* t*t is exact */
r=x/s; /* error <= 0.5 ulps; |r| < |t| */
w=t+t; /* t+t is exact */
r=(r-t)/(w+r); /* r-t is exact; w+r ~= 3*t */
t=t+t*r; /* error <= 0.5 + 0.5/3 + epsilon */
return(t);
}