You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I chatted with Andrew at POPL and he asked me to file this issue, about a floating-point arithmetic trick that isn't verified/available in VCFloat today.
It is sometimes useful to know that the product of two floating-point numbers, or of a floating-point number and an integer, is exact if the numbers have many zeros at the start/end. For example, if you have a multiplication 3 * x, you know that 3 only has the last two bits set, so if x ends in a zero bit, the multiplication is exact. This trick is prominently used in math libraries, in a few ways.
One common use case is in Cody-Waite range reduction. This algorithm computes x % PI (or another constant) in several steps:
First, divide x by PI and cast to integer k
Subtract PI * k
To handle rounding error in PI, have another constant PI_2 (so that PI_2 is real PI minus float PI), and subtract PI_2 * k
This algorithms works only if the product PI * k is exact; otherwise, that step introduces a larger error than the PI_2 * k correction. To guarantee this, the PI constant doesn't use all 53 bits; instead, it uses, say, only 33 bits, and then k can be as large as 20 bits meaning that it works for |x| < 2^20 * PI. There's a tradeoff between the number of bits used for the PI constants and the number of bits used for k, and the implementor can choose values based on their desired performance tradeoffs for different inputs.
That case uses exact integer-float multiplication, but another use case of this exact multiplication is float-float. Consider acos in fdlibm, where (in the else branch) we compute:
s = sqrt(z);
df = s;
__LO(df) = 0;
c = (z-df*df)/(s+df);
Here the s and c variables are a double-double representation of the square root of z. To compute this, we first compute s directly and then correct for the error by squaring it, subtracting from z, and rescaling. But first, we define df which drops the lower half of s, so that the squaring step is done exactly. Then z - df*df is also exact through Sterbenz subtraction, which is important to preserving the error. If the multiply isn't exact, the error in df * df is as large as z - df*df, meaning that c is basically totally useless.
I'm not sure what it would take to add exact multiplication to VCFloat, but a POPL'18 paper on verifying math libraries tracks bit widths of various values and uses that as a verification condition for exact multiplies, perhaps something similar is possible.
The text was updated successfully, but these errors were encountered:
I chatted with Andrew at POPL and he asked me to file this issue, about a floating-point arithmetic trick that isn't verified/available in VCFloat today.
It is sometimes useful to know that the product of two floating-point numbers, or of a floating-point number and an integer, is exact if the numbers have many zeros at the start/end. For example, if you have a multiplication
3 * x
, you know that3
only has the last two bits set, so ifx
ends in a zero bit, the multiplication is exact. This trick is prominently used in math libraries, in a few ways.One common use case is in Cody-Waite range reduction. This algorithm computes
x % PI
(or another constant) in several steps:x
byPI
and cast to integerk
PI * k
PI
, have another constantPI_2
(so thatPI_2
is real PI minus float PI), and subtractPI_2 * k
This algorithms works only if the product
PI * k
is exact; otherwise, that step introduces a larger error than thePI_2 * k
correction. To guarantee this, thePI
constant doesn't use all 53 bits; instead, it uses, say, only 33 bits, and thenk
can be as large as 20 bits meaning that it works for|x| < 2^20 * PI
. There's a tradeoff between the number of bits used for thePI
constants and the number of bits used fork
, and the implementor can choose values based on their desired performance tradeoffs for different inputs.That case uses exact integer-float multiplication, but another use case of this exact multiplication is float-float. Consider
acos
infdlibm
, where (in theelse
branch) we compute:Here the
s
andc
variables are a double-double representation of the square root ofz
. To compute this, we first computes
directly and then correct for the error by squaring it, subtracting fromz
, and rescaling. But first, we definedf
which drops the lower half ofs
, so that the squaring step is done exactly. Thenz - df*df
is also exact through Sterbenz subtraction, which is important to preserving the error. If the multiply isn't exact, the error indf * df
is as large asz - df*df
, meaning thatc
is basically totally useless.I'm not sure what it would take to add exact multiplication to VCFloat, but a POPL'18 paper on verifying math libraries tracks bit widths of various values and uses that as a verification condition for exact multiplies, perhaps something similar is possible.
The text was updated successfully, but these errors were encountered: