Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scrutinizer: add custom big Int implementation #249

Merged
merged 1 commit into from
Aug 6, 2021
Merged

Conversation

p4u
Copy link
Member

@p4u p4u commented Aug 5, 2021

The JSON Marshaler for the standard type big.Int optputs an integer
number instead of an string (such as 123456789 instead of "123456789").
This behavior creates a problem when importing the JSON
data to a different implementation (i.e Javascript) since it forces to
read the field as an integer.

To fix this we introduce a new custom big.Int implementation which
Marshals always to a bigInt string.

Note that this fix is widely used by projects such as go-ethereum
(common/hexutil)

Signed-off-by: p4u [email protected]

@p4u p4u force-pushed the f/custom_bigint branch 3 times, most recently from 7a48e02 to 555183a Compare August 5, 2021 14:57
@p4u p4u requested review from mvdan and ed255 August 5, 2021 14:58
@p4u p4u force-pushed the f/custom_bigint branch from 555183a to cb6ffd1 Compare August 5, 2021 15:00
The JSON Marshaler for the standard type big.Int optputs an integer
number instead of an string (such as "123456789") or an hexString (such
as "0x123456). This behavior creates a problem when importing the JSON
data to a different implementation (i.e Javascript) since it forces to
read the field as an integer (instead of bigInt).

To fix this we introduce a new custom big.Int implementation which
Marshals always to a bigInt string.

Note that this fix is widely used by projects such as go-ethereum
(common/hexutil)

Signed-off-by: p4u <[email protected]>
@p4u p4u force-pushed the f/custom_bigint branch from cb6ffd1 to f29773d Compare August 5, 2021 15:17
return (*big.Int)(i)
}

// Add sum x+y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding this arithmetic function is a bit arbitrary, as math/big.Int supports many more arithmetic operations. I would remove this method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added those which we are currently using. Its mainly Add and Mul. So to me its fine to have a simplified wrapper of the standard big.Int implementation

@ed255
Copy link
Contributor

ed255 commented Aug 5, 2021

I have a question regarding the scope of this BigInt. Is the BigInt a wrapper to be used everywhere in vocdoni-node so that we never use *big.Int internally?

Or is it just a wrapper used at the edge to control the decoding and encoding of JSON?

@p4u
Copy link
Member Author

p4u commented Aug 5, 2021

I have a question regarding the scope of this BigInt. Is the BigInt a wrapper to be used everywhere in vocdoni-node so that we never use *big.Int internally?

Or is it just a wrapper used at the edge to control the decoding and encoding of JSON?

In general we follow the approcah "use the standard libraries always if possible". So the types.BigInt type would only be used on those packages that require its modified functionallity (those that handle structs that are marshaled to JSON). Currently just the scrutinizer.

@p4u p4u changed the title WIP: scrutinizer: add custom big Int implementation scrutinizer: add custom big Int implementation Aug 5, 2021
@ed255
Copy link
Contributor

ed255 commented Aug 5, 2021

In general we follow the approcah "use the standard libraries always if possible". So the types.BigInt type would only be used on those packages that require its modified functionallity (those that handle structs that are marshaled to JSON). Currently just the scrutinizer.

In that case, have you considered in vochain/scrutinizer/indexertypes/results.go doing:

	r.Weight.ToInt().Add(r.Weight.ToInt(), weight)

	// [...]

		for q, value := range voteValues {
			r.Votes[q][0].ToInt().Add(
				r.Votes[q][0].ToInt(),
				new(big.Int).Mul(
					new(big.Int).SetUint64(uint64(value)),
					weight),
			)
		}

instead of

	r.Weight.Add(r.Weight, (*types.BigInt)(weight))

	// [...]

		for q, value := range voteValues {
			r.Votes[q][0].Add(
				r.Votes[q][0],
				new(types.BigInt).Mul(
					new(types.BigInt).SetUint64(uint64(value)),
					(*types.BigInt)(weight)),
			)
		}

This way there's no need to add new arithmetic methods to the wrapper.

Also, maybe it's worth it renaming ToInt to AsInt because the returned value uses the same reference as the original object, it's just a type cast internally.

@p4u
Copy link
Member Author

p4u commented Aug 5, 2021

In general we follow the approcah "use the standard libraries always if possible". So the types.BigInt type would only be used on those packages that require its modified functionallity (those that handle structs that are marshaled to JSON). Currently just the scrutinizer.

In that case, have you considered in vochain/scrutinizer/indexertypes/results.go doing:

	r.Weight.ToInt().Add(r.Weight.ToInt(), weight)

	// [...]

		for q, value := range voteValues {
			r.Votes[q][0].ToInt().Add(
				r.Votes[q][0].ToInt(),
				new(big.Int).Mul(
					new(big.Int).SetUint64(uint64(value)),
					weight),
			)
		}

instead of

	r.Weight.Add(r.Weight, (*types.BigInt)(weight))

	// [...]

		for q, value := range voteValues {
			r.Votes[q][0].Add(
				r.Votes[q][0],
				new(types.BigInt).Mul(
					new(types.BigInt).SetUint64(uint64(value)),
					(*types.BigInt)(weight)),
			)
		}

This way there's no need to add new arithmetic methods to the wrapper.

Also, maybe it's worth it renaming ToInt to AsInt because the returned value uses the same reference as the original object, it's just a type cast internally.

Well in this case results.go already uses the BigInt wrapper, so for consistency I'd make a package use either our implementation or the math/big one, but try not to mix it.

@p4u
Copy link
Member Author

p4u commented Aug 5, 2021

Lets also hear @mvdan voice regarding these matters, a third opinion is welcome :)

@mvdan
Copy link
Contributor

mvdan commented Aug 5, 2021

I agree that we should marshal big ints as strings in JSON. I get that upstream says using numbers is OK, because JSON doesn't specify precision, so in theory it's fine. But in practice, most JSON libraries will use float64 or int64 for integers, meaning loss of precision.

Doing this change in Go without writing our own wrapper should be easy, but unfortunately the current encoding/json library is just not powerful enough for it. I was hoping that the ,string option would do it, just like it does for regular integer types, but it does not work: https://play.golang.org/p/iu6NjKWFNE-

If golang/go#5901 were implemented today we could use that instead, but it doesn't exist yet.

So, yup, a wrapper type is our only bet I think. That said, I'm not a fan of this particular wrapper, because it hides all methods from the underlying big.Int and defines its own methods instead. I'd rather we did something like:

type StringyBigInt struct {
    *big.Int
}

func (s *StringyBigInt) MarshalJSON(...) { ... }
func (s *StringyBigInt) UnmarshalJSON(...) { ... }

Any places that need to do operations on these can do stuff like:

var n1, n2 StringyBigInt

n1.Int = n1.Add(n2.Int)

Similarly, you can convert a big.Int to our type like:

var n1 *big.Int

resp.Field = StringyBigInt{Int: n1}

I get that that's slightly more verbose, but we also avoid mixing our type with our methods with big.Int and its methods. If our type is only about JSON marshaling and unmarshaling, then it's crystal clear that consuming or modifying the actual integer values uses big.Int, not our type. Plus we don't end up with two sets of methods that look the same and do the same, but actually are incompatible with each other as they take different types :)

@mvdan
Copy link
Contributor

mvdan commented Aug 5, 2021

Here's a working example of what I mean: https://play.golang.org/p/62r5FjaQfpN

@ed255
Copy link
Contributor

ed255 commented Aug 6, 2021

[...]
So, yup, a wrapper type is our only bet I think. That said, I'm not a fan of this particular wrapper, because it hides all methods from the underlying big.Int and defines its own methods instead. I'd rather we did something like:

type StringyBigInt struct {
    *big.Int
}

func (s *StringyBigInt) MarshalJSON(...) { ... }
func (s *StringyBigInt) UnmarshalJSON(...) { ... }

[...]

I'm in favor of this approach, mainly because it focuses only on the encoding part but leaves the rest of the functionality of the big.Int to the original methods, without wrappers.

@p4u
Copy link
Member Author

p4u commented Aug 6, 2021

I see your points, however IMHO its a too complex approach (from the developer point of view) in relation with the benefits of it.

The wrapper approach works very similar to the standard big.Int implementation, thus it makes very easy for someone used to play with math/big. Its true that it hiddes methods but the scope of its usage is very limited and in the worst scenario you can do bigIntVar.Int().AnyOriginalMethod().

If exporting methods partially is what causes here the disruption, I'd instead still use the wrapper but remove Add() and Mul() and do instead bigIntvar.Int().Add(bigIntvar.Int(),bigIntVar2.Int()). However in order to avoid accessing to Int() constantly, in my opinion in makes sense to export Add() and Mul() (as shortcuts). But as said, if that's what causes the disagreement I'm OK removing them and using always Int().

At the end both solutions work and its more of a matter of personal taste than a pragmatic set of reasons for one or the other.

@p4u p4u closed this Aug 6, 2021
@p4u p4u reopened this Aug 6, 2021
@p4u
Copy link
Member Author

p4u commented Aug 6, 2021

Ummm, ok maybe I need to put the hands on the @mvdan approach to better understand the benefits of it. The snippet code is not compiling btw.

@p4u
Copy link
Member Author

p4u commented Aug 6, 2021

So, maybe I'm not understanding something from the proposal, but with the struct approach any method used returns a big.Int type which makes it not convinient when writting code

type BigInt struct {
    *big.Int
}

a := new(BigInt).SetUint64(100)
b := new(BigInt).SetUint64(200)

var c BigInt
c.Add(c, b) // type casting error

@ed255
Copy link
Contributor

ed255 commented Aug 6, 2021

So, maybe I'm not understanding something from the proposal, but with the struct approach any method used returns a big.Int type which makes it not convinient when writting code

type BigInt struct {
    *big.Int
}

a := new(BigInt).SetUint64(100)
b := new(BigInt).SetUint64(200)

var c BigInt
c.Add(c, b) // type casting error

I think that in mvdan's proposal, the wrapper is not intended to be used as a pointer (it already wraps a pointer to big.Int). For example, you can use it like this:

	x := BigInt{big.NewInt(123)}
	x.Add(x.Int, big.NewInt(1))

Edit1: Ah no, I don't think that's related to the type casting error you encountered. I think the error comes from the fact that SetUint64 returns a *big.Int (https://pkg.go.dev/math/big#Int.SetUint64), and not a *BigInt. So you're assigning a *big.Int to a and b

@mvdan
Copy link
Contributor

mvdan commented Aug 6, 2021

Yep. Or, as a more direct version of what Pau was trying to do:


	a := StringyBigInt{Int: big.NewInt(100)}
	b := StringyBigInt{Int: big.NewInt(200)}

	c := StringyBigInt{Int: big.NewInt(0)}
	c.Add(a.Int, b.Int)

It does require more verbosity with the Int field, but you always do math with *big.Int directly, so IMO it's much clearer. Here's a working example again: https://play.golang.org/p/qR8eXOVsx9D

And if you want to do lots of arithmetic, you can always obtain the *big.Int variable like:

stringy := StringyBigInt{...}

normal := stringy.Int
// use normal like a *big.Int

@p4u
Copy link
Member Author

p4u commented Aug 6, 2021

Ok, still I don't see a clear benefit. I'd go for the current approach (which just works) since we need to have this upstream ASAP (in order to finish the processArchive stuff) and I don't have time now for re-implementing it.

I'll not oppose to a new PR changing to the @mvdan approach. If some of you want and have the time for doing it today, I'd wait. Else I'll merge and we can add a new issue for doing this in the future.

@p4u
Copy link
Member Author

p4u commented Aug 6, 2021

Issue opened #250

@p4u p4u merged commit 210e3d3 into master Aug 6, 2021
@p4u p4u deleted the f/custom_bigint branch August 6, 2021 13:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants