proposal: encoding/json/jsontext: add constants for each Kind #71756

dsnet · 2025-02-14T20:09:09Z

Proposal Details

This is a sub-issue of the "encoding/json/v2" proposal (#71497).

Here, we discuss whether to add constants for each Kind.
This builds on top of the v2 API and does not block the acceptance of v2.

Specifically, this issue would result in the following changes to the API:

  package jsontext

  type Kind byte

  const (
+ 	KindNull        Kind = 'n'
+ 	KindFalse       Kind = 'f'
+ 	KindTrue        Kind = 't'
+ 	KindString      Kind = '"'
+ 	KindNumber      Kind = '0'
+ 	KindBeginObject Kind = '{'
+ 	KindEndObject   Kind = '}'
+ 	KindBeginArray  Kind = '['
+ 	KindEndArray    Kind = ']'
  )

The choice of using a Kind prefix matches the http.MethodGet-like constants. We need at least a prefix or a suffix since Null, False, True, etc. conflict with the existing Token declarations.

Analysis of this change:

✔️ Always using the constants protects against accidental typos with literals (e.g., 0 vs '0'). This problem could be alleviated by govet or staticcheck.
❌ The literals are already sufficiently readable that many authors will avoid referencing the constants. For example, 75% of code avoids the http.MethodGet constant and instead just uses the "GET" literal. It is almost certain that the world will be inconsistent about usage. In the case of "jsontext", it is notably more effort to reference jsontext.KindNull than to use the 'n' literal.
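To make the typo hazard concrete, here is a minimal sketch. It redeclares Kind and two of the proposed constants locally so that it compiles without the v2 packages; the helper names isNumberTypo and isNumber are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// Kind mirrors the proposed jsontext.Kind declaration. It is redeclared
// locally (an assumption of this sketch) so the file is self-contained.
type Kind byte

const (
	KindNull   Kind = 'n'
	KindNumber Kind = '0'
)

// isNumberTypo contains the typo mentioned above: the untyped constant 0
// converts to Kind(0), not Kind('0'), and the comparison still compiles.
func isNumberTypo(k Kind) bool { return k == 0 }

// isNumber uses the named constant, so there is no literal to mistype.
func isNumber(k Kind) bool { return k == KindNumber }

func main() {
	k := Kind('0') // the kind reported for a JSON number
	fmt.Println(isNumberTypo(k), isNumber(k))
}
```

Both helpers compile without complaint, which is why the mistake would have to be caught by a vet-style check rather than by the compiler.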

Alternatives

What about making `Kind` an opaque type?

We could, but there would be a performance hit since you are no longer performing a comparison against a constant or constant literal. For example, let's suppose we had:

type Kind struct { k kind }

var (
	KindNull  Kind = Kind{'n'}
	KindFalse Kind = Kind{'f'}
	KindTrue  Kind = Kind{'t'}
	...
)

Opaque types cannot be declared as constants. Referencing a global variable requires referencing the memory location for that variable, which is slower.
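A small sketch of the opaque variant, assuming a plain byte for the unexported field (the snippet above leaves the field type unspecified) and a hypothetical helper isNull:

```go
package main

import "fmt"

// Kind is the hypothetical opaque variant from the alternative above.
// The byte field type is an assumption made for this sketch.
type Kind struct{ k byte }

// Struct values cannot be constants in Go, so these must be variables.
// Writing `const KindNull = Kind{'n'}` would be a compile error.
var (
	KindNull  = Kind{'n'}
	KindFalse = Kind{'f'}
	KindTrue  = Kind{'t'}
)

// isNull compares against a package-level variable. Because the variable
// could be reassigned, its value generally has to be loaded at run time
// instead of being folded into the comparison as an immediate.
func isNull(k Kind) bool { return k == KindNull }

func main() {
	fmt.Println(isNull(Kind{'n'}), isNull(KindTrue))
}
```

The comparison itself still works, since structs with comparable fields are comparable; the cost being discussed is only in how the right-hand side is obtained.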

What about a separate package?

It should be noted that this could also just be a third-party package:

package jsonkind // could be even shorter as jsonk or even jk

const (
	Null        jsontext.Kind = 'n'
	False       jsontext.Kind = 'f'
	True        jsontext.Kind = 't'
	...
)

which is shorter to reference as jsonkind.Null instead of jsontext.KindNull.


mateusz834 · 2025-02-14T20:15:33Z

I am in favour of this change (with the Kind prefix), because these kinds of constants are autocompleted by gopls, thus it is easier to work with them.

mateusz834 · 2025-02-14T20:19:02Z

Personally, i would change the names to:

- KindBeginObject Kind = '{'
- KindEndObject   Kind = '}'
- KindBeginArray  Kind = '['
- KindEndArray    Kind = ']'
+ KindObjectBegin Kind = '{'
+ KindObjectEnd   Kind = '}'
+ KindArrayBegin  Kind = '['
+ KindArrayEnd    Kind = ']'

dsnet · 2025-02-14T20:21:55Z

RFC 8259, section 2 uses the terms begin-array, begin-object, end-array, and end-object as the formal grammatical names for these constructs. We aim to use proper JSON terminology if relevant.

As an aside, the term "kind" is unique to the Go "jsontext" package, but we need a way to describe the kind of a token. The term "type" would be incorrect since that's used by the RFC to describe null, strings, numbers, booleans, objects, and arrays (i.e., all complete values).

nemith · 2025-02-14T20:34:36Z

For example, 75% of code avoid http.MethodGet literal and instead just use "GET". It is almost certain that the world will be inconsistent about usage. In the case of "jsontext", it is notably more effort to reference jsontext.KindNull rather than use the 'n' literal.

I don't think this is a fair comparison and trying to abstract lessons from that doesn't feel right for this situation.

The HTTP RFC doesn't actually limit the methods being used, so the API must be able to support additional methods and string MUST be the underlying type. There is no great way of making it opaque. Compare that with jsontext.Token, where the value is from a limited set and shouldn't ever support additional values. You could make the types anything besides characters and things would work just fine (iota with int8?)
HTTP methods in their non-const form are very readable and, I dare say, more readable than the constant versions. "GET", "POST", "DELETE" are whole words, capitalized, and map exactly to what is on the wire. Compare that to jsontext.Token, where only 4 ([, ], {, }) out of the 7 map directly. The others are contrived replacements which require some careful document reading to figure out. "GET" meaning HTTP GET is different from '0' meaning number.

Also, on the "effort" front, are we talking just about the number of characters to type? Code is read more than it is written. 'n' may be lower effort to write than jsontext.KindNull, but it is much higher effort to read and comprehend, which, in my opinion, is much more important.

It should be noted that this could also just be a third-party package.

That would increase the representations rather than shrink them which seems to be the exact argument against making constants?

As for a completely separate stdlib package jsonkind, I am not hugely in favor of packages for pure namespacing, and it seems to go against the rest of the packages in the stdlib.

gabyhelp · 2025-02-14T20:39:55Z

Related Issues

proposal: encoding/json/v2: new API for encoding/json #71497

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

neild · 2025-02-14T20:40:05Z

If we want opaque(-ish) constants, we could define:

type Kind byte

const (
  kindNull = Kind(' ')
  // ...
)

const (
  KindNull = kindNull
  // ...
)

...although if we do that, I think the values should be integers (iota+1, etc.) rather than characters.

Either way, I think we should define these constants:

I find the argument for better code completion compelling.
The character values are cute, but not obvious. KindNull is obvious, 'n' is not.
While Go does not have support for enums, we have a convention for how to define one. I don't know if any linters will detect an enum defined in the conventional form as a set of values in a single const ( ) block and warn if a switch statement doesn't exhaustively cover the options, but it's the sort of thing you could do. Using character values would subvert this.
...and even in the absence of such a linter, I can hit "go to definition" on KindNull and have my editor take me to an exhaustive list of cases that need covering.
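As a rough illustration of the enum convention described above, the following sketch declares all nine proposed constants in a single const block (redeclared locally so it compiles on its own) and switches over every one of them. The describe function and its return strings are hypothetical.

```go
package main

import "fmt"

// Kind and its constants follow the proposed declaration, redeclared
// locally so that this sketch does not depend on the v2 packages.
type Kind byte

const (
	KindNull        Kind = 'n'
	KindFalse       Kind = 'f'
	KindTrue        Kind = 't'
	KindString      Kind = '"'
	KindNumber      Kind = '0'
	KindBeginObject Kind = '{'
	KindEndObject   Kind = '}'
	KindBeginArray  Kind = '['
	KindEndArray    Kind = ']'
)

// describe names every constant from the const block exactly once, which
// is the shape an exhaustiveness check could verify mechanically.
func describe(k Kind) string {
	switch k {
	case KindNull:
		return "null"
	case KindFalse, KindTrue:
		return "boolean"
	case KindString:
		return "string"
	case KindNumber:
		return "number"
	case KindBeginObject, KindEndObject:
		return "object delimiter"
	case KindBeginArray, KindEndArray:
		return "array delimiter"
	default:
		return "invalid"
	}
}

func main() {
	fmt.Println(describe(KindNumber), describe(KindEndArray))
}
```

If the cases were written as 'n', 'f', and so on, a tool would have a harder time relating the switch to the declared set of names.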

I also agree with @nemith's comments regarding http.MethodGet. (I started writing something, but they said it better than I was going to.)

mateusz834 · 2025-02-14T20:43:42Z

Yeah, I think that if we go with constants it should be using iota.

nemith · 2025-02-14T20:43:58Z

While Go does not have support for enums, we have a convention for how to define one. I don't know if any linters will detect an enum defined in the conventional form as a set of values in a single const ( ) block and warn if a switch statement doesn't exhaustively cover the options, but it's the sort of thing you could do. Using character values would subvert this.

https://golangci-lint.run/usage/linters/#exhaustive

That being said, the use of exhaustive switch statements outside of the stdlib is probably limited. In most cases you are going to look for a specific token and, if it isn't there, move on.

dsnet · 2025-02-14T20:45:16Z

...although if we do that, I think the values should be integers (iota+1, etc.) rather than characters.

We tried that in the past, and renumbering the constants was also a performance hit since it involved many unpredictable branches. We could use a 256B lookup table, but I'm hesitant about using LUTs to solve re-mapping as it's adding more pressure on the data cache. The current approach is fast since it's just grabbing the first byte of the token (and only needs to branch for numbers).
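For readers unfamiliar with the lookup-table alternative mentioned here, this is a minimal sketch of what a 256-byte remapping table could look like. The IotaKind type, its constant names, kindLUT, and remap are all hypothetical; this is not the code that was tried in the past.

```go
package main

import "fmt"

// IotaKind is a hypothetical densely numbered kind; zero means invalid.
type IotaKind uint8

const (
	IotaInvalid IotaKind = iota
	IotaNull
	IotaFalse
	IotaTrue
	IotaString
	IotaNumber
	IotaBeginObject
	IotaEndObject
	IotaBeginArray
	IotaEndArray
)

// kindLUT is a 256-byte table indexed by the first byte of a token.
// Every byte that cannot start a JSON token maps to IotaInvalid.
var kindLUT = func() (t [256]IotaKind) {
	t['n'], t['f'], t['t'] = IotaNull, IotaFalse, IotaTrue
	t['"'] = IotaString
	t['{'], t['}'] = IotaBeginObject, IotaEndObject
	t['['], t[']'] = IotaBeginArray, IotaEndArray
	t['-'] = IotaNumber
	for c := '0'; c <= '9'; c++ {
		t[c] = IotaNumber
	}
	return t
}()

// remap converts a token to its IotaKind with one table load and no
// per-kind branching; the price is the extra 256 bytes of data to keep hot.
func remap(tok []byte) IotaKind {
	if len(tok) == 0 {
		return IotaInvalid
	}
	return kindLUT[tok[0]]
}

func main() {
	fmt.Println(remap([]byte("-1.5")), remap([]byte("null")), remap([]byte("?")))
}
```

The table removes the unpredictable branches but occupies four 64-byte cache lines, which is the data-cache pressure being weighed above.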

jimmyfrasche · 2025-02-14T20:47:49Z

If it were just "{}[] I'd be fine without constants since there's a 1:1 mapping with the actual token. The other kinds may have sensible one character names assigned but they are not obvious. That is not readable. Constants are needed here for clarity.

nemith · 2025-02-14T20:55:10Z

Also for the record I didn't much care that there was no Kind constants and agreed with the initial assessment that they were not needed. That was until I went to read this example: https://pkg.go.dev/github.com/go-json-experiment/json#example-WithUnmarshalers-RawNumber

				if dec.PeekKind() == '0' {
					*val = jsontext.Value(nil)
				}

It took a significant amount of time to realize that '0' meant a number, and that was with the context of the example saying so. Without proper context we had better hope that every instance of matching a token to number, string, null, or boolean is well commented.

With constants it's obvious/self-commenting.

ChrisHines · 2025-02-16T15:53:59Z

IMHO the readability argument wins the day and we should add constants for these.

willfaught · 2025-02-16T23:51:36Z

We tried that in the past, and renumbering the constants was also a performance hit since it involved many unpredictable branches.

@dsnet I don't follow. What constants are you renumbering, and why? What perf is involved? I'm imagining the code doing something like this:

func (e *Encoder) PeekKind() Kind {
  switch e.token {
  case "{": return KindObjectBegin
...

I don't see how there's a perf difference there when KindObjectBegin is a custom byte value (like '{') vs. an automatic number from iota (like 3).

puellanivis · 2025-02-17T08:38:31Z

Do you see a performance difference between:

func (e *Encoder) PeekKind() Kind {
  return e.token[0]
}

and:

func (e *Encoder) PeekKind() Kind {
  switch e.token {
  case "{": return 1

godcong · 2025-02-18T10:15:50Z

Don't you need to add an exception Kind?
What if invalid JSON is passed in, yielding a Kind that is outside the currently defined set?

willfaught · 2025-02-18T16:50:47Z

func (e *Encoder) PeekKind() Kind {
 return e.token[0]
}

token[0] doesn't work for numbers, null, true, false, etc.

puellanivis · 2025-02-19T17:05:08Z

func (e *Encoder) PeekKind() Kind {
 return e.token[0]
}

token[0] doesn't work for numbers, null, true, false, etc.

What do you mean? It absolutely does work for null, true, and false. That was the whole point of using 'n', 't', 'f'.

willfaught · 2025-02-19T22:42:51Z

What do you mean? It absolutely does work for null, true, and false. That was the whole point of using 'n', 't', 'f'.

I see what you mean about those. Still, it doesn't work for numbers. In any case, the code just becomes

func (e *Encoder) PeekKind() Kind {
  switch e.token[0] {
  case '{': return KindObjectBegin
  case '}': ...
  ...
  default: return KindNumber
  }
}

I don't see how using iota for Kind* values makes that perform worse.

puellanivis · 2025-02-20T14:20:09Z

func (e *Encoder) PeekCharKind() Kind {
	tok := Kind(e.token[0])
	if tok >= '1' && tok <= '9' {
		return '0'
	}
	return tok // bug, I didn’t consider '-'
}

func (e *Encoder) PeekIotaKind() IotaKind {
	switch e.token[0] {
	case 'n': return IotaKindNull
	case 'f': return IotaKindFalse
...
	default: return IotaKindNumber
	}
}

Running off a loop of tokens, one of each possible starting character:

$ go test -bench=. -count=10
BenchmarkCharKind-32            1000000000               0.2803 ns/op
BenchmarkCharKind-32            1000000000               0.2816 ns/op
BenchmarkCharKind-32            1000000000               0.2820 ns/op
BenchmarkCharKind-32            1000000000               0.2816 ns/op
BenchmarkCharKind-32            1000000000               0.2802 ns/op
BenchmarkCharKind-32            1000000000               0.2803 ns/op
BenchmarkCharKind-32            1000000000               0.2806 ns/op
BenchmarkCharKind-32            1000000000               0.2821 ns/op
BenchmarkCharKind-32            1000000000               0.2812 ns/op
BenchmarkCharKind-32            1000000000               0.2808 ns/op
BenchmarkIotaKind-32            1000000000               0.5613 ns/op
BenchmarkIotaKind-32            1000000000               0.5620 ns/op
BenchmarkIotaKind-32            1000000000               0.5599 ns/op
BenchmarkIotaKind-32            1000000000               0.5598 ns/op
BenchmarkIotaKind-32            1000000000               0.5603 ns/op
BenchmarkIotaKind-32            1000000000               0.5608 ns/op
BenchmarkIotaKind-32            1000000000               0.5627 ns/op
BenchmarkIotaKind-32            1000000000               0.5588 ns/op
BenchmarkIotaKind-32            1000000000               0.5616 ns/op
BenchmarkIotaKind-32            1000000000               0.5575 ns/op
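The comment in PeekCharKind above flags that a leading '-' is not handled. A possible correction is sketched below as a free function (charKind is a hypothetical name) that takes the token bytes directly and assumes they are non-empty. It was not part of the benchmark run above, so the timings say nothing about this variant.

```go
package main

import "fmt"

// Kind is redeclared locally so that this sketch is self-contained.
type Kind byte

// charKind returns the first byte of tok as its Kind, folding every byte
// that can start a JSON number ('-' and '0' through '9') into '0'.
// It assumes tok is non-empty, as the original snippet does.
func charKind(tok []byte) Kind {
	k := Kind(tok[0])
	if k == '-' || (k >= '1' && k <= '9') {
		return '0'
	}
	return k // '0' itself and all non-number kinds pass through unchanged
}

func main() {
	for _, s := range []string{"-12", "7", "0", "true", "{"} {
		fmt.Printf("%q -> %q\n", s, byte(charKind([]byte(s))))
	}
}
```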

extemporalgenome · 2025-02-21T16:57:56Z

Since json/v2 would be a net-new stdlib addition (thus any decision we make now can't break anyone yet), and since vet is automatically run via go test, it seems that we can avoid the http.MethodGet adoption issue by launching with vet enforcement.

extemporalgenome · 2025-02-21T17:08:04Z

Would '-' be a better byte value to represent numbers? It's punctuation, like many of the other constants, and it has a unique role among numeric leading bytes, i.e. there's no + or any other modifier that's permitted in that position, whereas 0 has somewhat less distinctiveness compared to 1-9. Aside from numbers, all other kinds have a unique leading byte.

jimmyfrasche · 2025-02-21T17:44:03Z

If the constants are named it doesn't especially matter what the values assigned to each name are.

puellanivis · 2025-02-22T08:07:42Z

Since json/v2 would be a net-new stdlib addition (thus any decision we make now can't break anyone yet), and since vet is automatically run via go test, it seems that we can avoid the http.MethodGet adoption issue by launching with vet enforcement.

Honestly, I think we’ll avoid the http.MethodGet problem in general, because json.KindNull != "null". I presume a chief cause of the problem with the http.Method* names is that the methods are shorter and more readable as bare strings, and that value is repeated essentially verbatim in the constants’ names.

extemporalgenome · 2025-02-24T17:55:02Z

Nonetheless, we can add vet coverage for these, and not rely on adoption via path-of-least-resistance.

dsnet added the Proposal label Feb 14, 2025

gopherbot added this to the Proposal milestone Feb 14, 2025

dsnet added the LibraryProposal label Feb 14, 2025

dsnet mentioned this issue Feb 14, 2025

proposal: encoding/json/v2: new API for encoding/json #71497

Open

ianlancetaylor added this to Proposals Feb 14, 2025

ianlancetaylor moved this to Incoming in Proposals Feb 14, 2025
