Skip to content

proposal: database/sql: add configuration option to ignore null values in Scan #57099

Open
@MisterSquishy

Description

@MisterSquishy

like #28414, but for all null values

copied from #57066:

Currently, when scanning a NULL database value into a go type that is not nil-able, this package throws an error like "sql: Scan error on column index 0, name "foo": converting NULL to int is unsupported". This makes logical sense; nil and 0 are different values, and only the latter can be assigned to an int. The Null* types (NullString, NullInt, etc) account for this difference, giving programmers the ability to safely bridge the gap between nullable SQL values and primitive go types.

I propose we give programmers the option to change that behavior and instead ignore NULL values. There are a few reasons I think this option makes sense:

Alignment with programmer intent

In go, if I want a struct field with a string field, I would write type foo struct{ bar string }
In mysql, if I want a table with a string column, I would write CREATE TABLE foo (bar varchar(255));

But this produces incompatible types -- in mysql, any value can be NULL unless you explicitly say otherwise. If we wanted to enforce that our foo.bar column could never be NULL, we would have to write CREATE TABLE foo (bar varchar(255) NOT NULL); This nuance is seldom clear to programmers, and is a lesson that generally only gets learned the hard way. For instance, if you google "go sql null", you will get back numerous pages of blog posts, stack overflow posts, etc -- whether or not it is "right", the undeniable fact is that this behavior subverts programmer expectations on a regular basis.

Furthermore, when a programmer writes some go code to fetch sql data and work with it in a go program, choosing to type their variable as a non-nilable type is a signal that they do not need to write special handling for the null case. It isn't up to us to determine whether or not they should; by writing their code this way, the programmer has indicated that they have no use for this distinction. When we force them to care (at runtime, often long after the code has been written, when we unexpectedly encounter a NULL value in the database), this produces broken software that isn't always straightforward to triage/remediate.

Fundamentally, the responsibility of this package is to bridge the gap between sql and go. Accounting for the disparate treatment of NULL in these two languages is currently the responsibility of the programmer, but by reconciling this difference in the translation layer, we can prevent bugs and make it easier to write great go code.

Consistency with json unmarshaling behaviors

As shown in this playground, unmarshaling json with NULL values into non-nilable types already works according to the programmer expectations described above -- we leave these values untouched. Whether or not the json unmarshaler is "right" is up for debate (see #33835), but the fact of the matter is that this will likely have to remain the default behavior forever. The fact that these two standard libraries have opposite opinions is unfortunate, and can probably never be changed. However, we compound this issue by giving programmers no mechanism to make them behave the same way. So, every go programmer has to be aware of this distinction and correctly design their types for the environment(s) that they expect to instantiate them from. This is clunky at best, and in practice it creates vast swaths of bug habitat.

It would be great to introduce an analogous configuration option in the standard json unmarshaler (or perhaps even reuse the same configuration flag), but that is out of scope for this issue.

For the above reasons, it's important to give programmers the ability to opt out of this behavior. Using Null* types or pointers to primitives remain viable for situations where we need to distinguish between NULL and other values, but in situations where the programmer does not care about this distinction, we should allow them this freedom.

I'm not totally sure of the most standard way to specify this bit of configuration; I think it would likely suffice to be a global setting (not per connection, transaction, statement, etc)

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    Status

    Incoming

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions