Exact multiplication by short bitvectors #27

pavpanchekha · 2024-01-17T16:21:06Z

I chatted with Andrew at POPL and he asked me to file this issue, about a floating-point arithmetic trick that isn't verified/available in VCFloat today.

It is sometimes useful to know that the product of two floating-point numbers, or of a floating-point number and an integer, is exact if the numbers have many zeros at the start/end. For example, if you have a multiplication 3 * x, you know that 3 only has the last two bits set, so if x ends in a zero bit, the multiplication is exact. This trick is prominently used in math libraries, in a few ways.

One common use case is in Cody-Waite range reduction. This algorithm computes x % PI (or another constant) in several steps:

First, divide x by PI and cast to integer k
Subtract PI * k
To handle rounding error in PI, have another constant PI_2 (so that PI_2 is real PI minus float PI), and subtract PI_2 * k

This algorithms works only if the product PI * k is exact; otherwise, that step introduces a larger error than the PI_2 * k correction. To guarantee this, the PI constant doesn't use all 53 bits; instead, it uses, say, only 33 bits, and then k can be as large as 20 bits meaning that it works for |x| < 2^20 * PI. There's a tradeoff between the number of bits used for the PI constants and the number of bits used for k, and the implementor can choose values based on their desired performance tradeoffs for different inputs.

That case uses exact integer-float multiplication, but another use case of this exact multiplication is float-float. Consider acos in fdlibm, where (in the else branch) we compute:

	    s = sqrt(z);
	    df = s;
	    __LO(df) = 0;
	    c  = (z-df*df)/(s+df);

Here the s and c variables are a double-double representation of the square root of z. To compute this, we first compute s directly and then correct for the error by squaring it, subtracting from z, and rescaling. But first, we define df which drops the lower half of s, so that the squaring step is done exactly. Then z - df*df is also exact through Sterbenz subtraction, which is important to preserving the error. If the multiply isn't exact, the error in df * df is as large as z - df*df, meaning that c is basically totally useless.

I'm not sure what it would take to add exact multiplication to VCFloat, but a POPL'18 paper on verifying math libraries tracks bit widths of various values and uses that as a verification condition for exact multiplies, perhaps something similar is possible.

The text was updated successfully, but these errors were encountered:

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Exact multiplication by short bitvectors #27

Exact multiplication by short bitvectors #27

pavpanchekha commented Jan 17, 2024

Exact multiplication by short bitvectors #27

Exact multiplication by short bitvectors #27

Comments

pavpanchekha commented Jan 17, 2024