A rough proposal for sum types in Go (2018) (manishearth.github.io)
101 points by isaacimagine on Nov 15, 2021 | 71 comments



The type sets proposal for Go has already been accepted as a clarification to the generics proposal [0]:

    type SignedInteger interface {
        ~int | ~int8 | ~int16 | ~int32 | ~int64
    }

Interfaces that contain type sets are only allowed to be used in generic constraints. However, a future extension might permit the use of type sets in regular interface types:

> We have proposed that constraints can embed some additional elements. With this proposal, any interface type that embeds anything other than an interface type can only be used as a constraint or as an embedded element in another constraint. A natural next step would be to permit using interface types that embed any type, or that embed these new elements, as an ordinary type, not just as a constraint.

> We are not proposing that today. But the rules for type sets and methods set above describe how they would behave. Any type that is an element of the type set could be assigned to such an interface type. A value of such an interface type would permit calling any member of the corresponding method set.

> This would permit a version of what other languages call sum types or union types. It would be a Go interface type to which only specific types could be assigned. Such an interface type could still take the value nil, of course, so it would not be quite the same as a typical sum type.

> In any case, this is something to consider in a future proposal, not this one.

This along with exhaustive type switches would bring Go something close to the sum types of Rust and Swift.

[0]: https://github.com/golang/go/issues/45346
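A minimal sketch of what the accepted part allows, assuming Go 1.18 or later. `Celsius` and `Sum` are hypothetical names made up for illustration; only the `SignedInteger` constraint comes from the comment above.

```go
package main

import "fmt"

// SignedInteger mirrors the constraint quoted above: any type whose
// underlying type is one of the signed integer types.
type SignedInteger interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64
}

// Celsius is a hypothetical defined type. The ~int32 term admits it
// because its underlying type is int32.
type Celsius int32

// Sum is a hypothetical helper that works for every type in the type set.
func Sum[T SignedInteger](xs []T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	fmt.Println(Sum([]int{1, 2, 3}))
	fmt.Println(Sum([]Celsius{20, 22}))
	// var v SignedInteger // rejected as of Go 1.18: constraint-only interface
}
```

The commented-out line is the "ordinary type" use that the quoted passage defers to a future proposal.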


One difference between this proposal and Rust enums (i.e. tagged unions) is that enums let you use the same type more than once with a different tag. Obviously Go doesn't have generics yet, but something like `Result<String, String>` doesn't seem like it would be straightforward as a non-generic type with a type set either, since you don't have any way of differentiating between which "type" of string you might have. I _think_ this might be possible with a type set by defining a newtype for one or both of the string types, but I haven't used Go in so long that I don't remember whether newtypes implicitly convert to the type they wrap.


With this proposal the variants are simply types themselves, so you'd have structs Ok[T] and Err[E], and thus Result<T, E> would be something like

    interface Result[T, E] {
        for Ok[T], Err[E]
    }

And there’s no issue with having T=string and E=string.


They don't implicitly convert (though you can still use literals to create them). So yes, specifying Ok and Error string types should hypothetically work for this.
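A small sketch of that behavior in today's Go. `OkString`, `ErrString`, and `describe` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// OkString and ErrString are hypothetical defined types wrapping string.
type OkString string
type ErrString string

// describe tells the two apart by dynamic type, even though both wrap string.
func describe(v interface{}) string {
	switch s := v.(type) {
	case OkString:
		return "ok: " + string(s)
	case ErrString:
		return "err: " + string(s)
	default:
		return "unknown"
	}
}

func main() {
	var ok OkString = "done" // an untyped literal is accepted directly
	plain := "raw"
	// ok = plain // compile error: a string variable is not an OkString
	converted := OkString(plain) // an explicit conversion is required
	fmt.Println(describe(ok), describe(converted))
	fmt.Println(describe(ErrString("boom")), describe(plain))
}
```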


Yeah, the standard solution is to create a newtype for each such variant. I believe this is the preferred idiom in languages like TypeScript that have union types but not discriminated union types.


TypeScript does have discriminated union types; you just have to use objects with a discriminating property, which is a bit less ergonomic than the way those work in other languages. It's done this way to preserve TypeScript's goal of having little to no impact on runtime behavior.


I believe this would work with type sets:

    type Result[T any, E any] interface {
        T | E
    }


If T=int32 and E=int32, does this behave like int32 | int32? If so, how can you tell which it is? (Presumably you have generic code that needs to know whether this case is T or E). If not, that's pretty surprising.


No. T | E should only accept T or E. If their underlying type is int32 then int32 will not be accepted.

If we want to accept underlying types then we would need ~T | ~E.

Either way this doesn't help since typesets cannot be used as normal interfaces, only generic constraints.


That looks like a great feature to add to Go! It seems to make the original article (which is from 2018) obsolete.


What's the best way to read about all the new accepted grammar developments in Go in the last year or two?


The Go grammar/language changes very slowly -- most of the improvements are in the tooling and libraries, so there isn't much. But the best way is the release notes. For the last two years (four versions):

https://golang.org/doc/go1.17: three very minor changes for array pointers and unsafe

https://golang.org/doc/go1.16: no language changes

https://golang.org/doc/go1.15: no language changes

https://golang.org/doc/go1.14: minor change to allow "overlapping interfaces"

Of course, in 1.18 -- coming out in about three months -- there's the huge change to add generics (type parameters). The best way to read about that is in the type parameters proposal here: https://go.googlesource.com/proposal/+/refs/heads/master/des...

Go 1.18 will also add built-in fuzzing support, but again, that's a tooling/library change, not a grammar/language one.


Yes please! I really dislike every golang function returning "foo*, error" when it really means either foo or an error.


Sum types would be nice to describe protocols such as gRPC as well (given suitable syntactic sugar to switch on them).

I think sum types, possibly in combination with tuples would have been nicer than multiple return values. But I guess everyone has their particular wish list of what a programming language should be :-)


I disliked it initially. After 5-6 years of using Go as my primary language I’ve come to appreciate it. It is clearer than having to figure out what is returned and whether an error can be returned. It’s easier when it is right there in the function signature.


I don't think it's clearer personally, but the problem is really that it's not secure. If you forget to check the `err` (or the `nil`) then you might shoot yourself in the foot. As a pentester, most of the bugs I found in Golang codebases would have been avoided with sum types.


But a Result<foo*, error> is clearer than foo*, error. I think that's the point the parent is making.


Yes, and the point I was making was that I don't think it actually makes things much clearer. One has more syntax and alters the effective return value, the other just states what you will get back regardless of result. From a practical point of view when you are consuming an API, which requires more concentration?


> From a practical point of view when you are consuming an API, which requires more concentration?

I think the current Go one requires more concentration, or at least more work. Lots of functions return data, err. You have to check for each of these functions if the presence of err means that data is going to be invalid/nil or add it to your code.

The same argument could be made for nil by the way. Since everything can be nil, you have to check for nil often. In languages that make a difference between a type, and a type or nil, this is easier.


The one that doesn't attempt to tell me which will be valid in relation to each other, so I have to go divine that rather than just reading the function signature?


Exactly, at least it's clearer in the vast majority of cases where that's what I actually mean (but golang won't let me encode easily).

The way golang ~makes me encode it makes using these functions annoying, because then you ~need to test for the nonsense cases where you have neither an error nor a result, and maybe where you have both.


But how is it clearer? I can see the point of where you get neither error nor result, but you can get that in both scenarios, so I don't see how anything is gained. And you still have the same burden of checking whether an error condition did arise. That is the bit that usually matters when you call a function: knowing if it has error conditions (many functions don't) and checking if/what they are.

I don't see any practical advantage here. Explain it to me like I'm five :)


> I can see the point of where you get neither error nor result, but you can get that in both scenarios

No you can't? The whole point of a sum type is that it's definitely exactly one of a result or an error; the language will not let it be both or neither.


Look at it from the receiving end: you are calling a function and you need to decide what to do with the response. The only thing that's different is that you get a single value you have to check for errors rather than a separate value. It is the same amount of code.

I can appreciate that people interested in language design think this is an important distinction, but in practice it really doesn't make a huge difference when you consume an API.

And if we move to the more general case, where a function in essence has multiple return types (albeit narrowed to a set of possible types), you still end up with something akin to a type switch. Yes, you can skip a default clause and you can have the compiler complain if you don't handle all types explicitly, but it is unclear to me if that doesn't just create new kinds of annoyances.

(I deal a lot with sum types in Protobuffers and it isn't always as nice as I had hoped. In fact, I use them for entirely different reasons than correctness and clarity)


> The only thing that's different is that you get a single value you have to check for errors rather than a separate value.

The type system confines you to a set of reasonable cases that allow a caller to reason about the state of the program. This has two benefits for the caller:

1. It is required that the caller check whether a return value is success or failure in order to access the value they want. There is no possibility to mistake one case for another.

2. In the space of valid return values for idiomatic Go function signatures, 50% of them are unidiomatic and end up being ignored. It is far clearer for a caller to understand what is expected of them when valid values exactly overlap with the space. It is far clearer for an author to convey expectations to a caller for the same reason.

Now I must admit, good conventions and tooling in Go account for the vast majority of cases and I don't personally mind that much, but that's a different conversation than API design.


I'm coming at it solely on the basis of practical experience. Removing a single boilerplate check might seem insignificant, but it compounds: when these types are lightweight to work with you can extract more tiny helper functions, and then you have a whole vocabulary of common operations / ways of doing composition, and you can write these really readable functions that just get all the plumbing out the way (but without making it completely invisible) and let you focus on the business logic.


The thing sum types would allow is to make a clear distinction between functions that return either a result or an error and functions that can return both a result and an error. Many functions in Go will always return either res, nil or nil, err. Those functions would benefit from a way to show that in the signature, to differentiate them from the functions that can return res, err at the same time. This is not a replacement for multiple return values, as there are many cases where multiple return values are useful.


I can't count the number of times this has forced me to make a better interface/abstraction than the one I was making. It's too bad this idiom is going to be the 'wrong way' to do things soon, because, for some reason, every code Guru out there thinks number of characters/lines is the single defining cost function for software maintenance, and that's who the kids coming out of college are going to listen to.


It's just fundamentally the wrong type for what it's supposed to represent. What I'd prefer isn't any more concise in lines or characters, it simply actually represents what it's supposed to.

"a*, error" has 4 different members, if you consider just "exists" vs "doesn't exist" for each. Of those, "nil, nil" and "&a{}, errors.New("whatever")" are broken. Fully half of the things you can represent are just wrong.

(Yes there are times when those are valid returns, but easily 99% of the time they're not. The type system should be able to represent both ways.)
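To make the four members concrete, here is a small sketch. `classify` and `errWhatever` are hypothetical names, and `a` stands in for the placeholder type in the comment above.

```go
package main

import (
	"errors"
	"fmt"
)

// a stands in for the placeholder type from the comment.
type a struct{}

// errWhatever is a hypothetical sentinel error for the demonstration.
var errWhatever = errors.New("whatever")

// classify names which of the four (value, error) combinations it received.
// The signature (*a, error) admits all four; nothing in the type rules out
// "neither" or "both".
func classify(v *a, err error) string {
	switch {
	case v != nil && err == nil:
		return "value only"
	case v == nil && err != nil:
		return "error only"
	case v == nil && err == nil:
		return "neither"
	default:
		return "both"
	}
}

func main() {
	fmt.Println(classify(&a{}, nil))
	fmt.Println(classify(nil, errWhatever))
	fmt.Println(classify(nil, nil))
	fmt.Println(classify(&a{}, errWhatever))
}
```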


> The type system should be able to represent both ways.

Go's type system can already do that, the (x, error) idiom is just standard practice in the absence of Generics and Throwables. Adding generics simply flips the default and exception.

Currently, the exception is to define a custom type which captures and constrains your possible return states, and the default is to return a value plus an error. When that flips, and the default is to collapse error states into a single return value, that will cease to give me the same signals (or at least make them less obvious) as to whether or not my abstraction sucks, which I will sorely miss :P


How does go's type system allow the "exactly one of these two" case? What's the syntax?


I'm not sure what you're asking. I was pointing out that if you need to enforce a return result that will have either a value or an error, you can simply write a struct/interface that does that.

Like, you can create an interface that's read-only for the return result:

    type ReadableResult interface {
        HasResult() bool
        Result() whatever
        Error() error
    }

Then to create the result you have a writable interface for setting values, and enforce the 'only one of the two' semantics from the setters:

    type WritableResult interface {
        SetResult(whatever)
        SetError(error)
    }

    type MyResult struct {
        err    error
        result whatever
    }

    // ... implementation here.

It's extra code, but it should be a rare necessity, at least in my experience. Res, err := getResult() always worked fine for me.
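One way the elided implementation could look, as a rough sketch rather than the commenter's actual code. The `string` payload stands in for `whatever`, and the `hasResult` flag and `errBoom` are assumptions added for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// MyResult holds either a result or an error. The string payload is an
// assumption standing in for "whatever" in the comment above.
type MyResult struct {
	hasResult bool
	result    string
	err       error
}

// SetResult stores a value and clears any earlier error.
func (r *MyResult) SetResult(v string) { r.hasResult, r.result, r.err = true, v, nil }

// SetError stores an error and clears any earlier value.
func (r *MyResult) SetError(err error) { r.hasResult, r.result, r.err = false, "", err }

func (r *MyResult) HasResult() bool { return r.hasResult }
func (r *MyResult) Result() string  { return r.result }
func (r *MyResult) Error() error    { return r.err }

// errBoom is a hypothetical error used only in this demonstration.
var errBoom = errors.New("boom")

func main() {
	var r MyResult
	r.SetResult("payload")
	fmt.Println(r.HasResult(), r.Result(), r.Error())
	r.SetError(errBoom)
	fmt.Println(r.HasResult(), r.Result(), r.Error())
}
```

The setters keep the two fields mutually exclusive, but the zero `MyResult` still holds neither, and nothing makes a caller consult `HasResult` before calling `Result`.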


How can you know that half the things I can represent are wrong? There are many situations where both returning a value and not returning a value are both valid. And where error conditions may not dictate what the returned value should be.

Language design choices have to make sense. And this doesn't make practical sense to me since I've written code, just in the past 24 hours, that contradicts your assertion. And no, I don't think the code would be better if we robbed Go of expressive power - adding complexity elsewhere is just moving the problem of poor discipline around.


> How can you know that half the things I can represent are wrong? There are many situations where both returning a value and not returning a value are both valid. And where error conditions may not dictate what the returned value should be.

In cases where it's valid, you use a type that makes it valid. Much of the point of having a type system is that the return type tells you exactly what the valid things for the function to return are.

Having an explicit declaration of which functions might return both a result and an error and which functions will definitely return one or the other but not both makes your code easier to understand and work with. As it stands, I would bet that most Go programmers ignore the possibility of returning both a result and an error even in functions that do it deliberately, because the idiomatic thing a function with that signature does is return one or the other but not both.


I agree. I would be extremely surprised and look for refactoring opportunities if I came across a function that returned a success case that was expected to be used in combination with an error value.

If anyone has an example of this being a pattern used somewhere that they think is good I'd love for them to share it.


The only case I can think of is partial success.

Probably bad example off the top of my head: You're returning say a "User" which has a user_id, name, and a list of recent purchases.

If the function errors out on the recent purchases, the caller may prefer that it still gets back a "User" object with the other information filled in, and then an "error" also set so the caller knows the information isn't complete.


A reduced version of this is maybe where Go and Rust have a pretty similar contract for things that “Read” and lean hard into permitting partial success. Both allow for some relatively sharp edge cases with zero-sized reads.

The contract is always right … how well it is expressed through language mechanics is maybe another way to contrast things here. (FWIW I think Go could tighten up a few things - dropping or shadowing errors is a bit worrying - but I’m fairly convinced sum vs product is a little too harsh of a reduction; a useful distinction but not a complete one.)
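For the Go side of that contract, a minimal sketch: the `io.Reader` documentation allows a `Read` call to return `n > 0` together with a non-nil error, and asks callers to process the bytes before looking at the error. `lastChunkReader` and `drain` are hypothetical names.

```go
package main

import (
	"fmt"
	"io"
)

// lastChunkReader is a hypothetical reader that returns its final bytes
// together with io.EOF in the same call, which io.Reader permits.
type lastChunkReader struct {
	data []byte
}

func (r *lastChunkReader) Read(p []byte) (int, error) {
	n := copy(p, r.data)
	r.data = r.data[n:]
	if len(r.data) == 0 {
		return n, io.EOF // value and error at the same time
	}
	return n, nil
}

// drain consumes the n bytes first and only then inspects err.
func drain(r io.Reader) (string, error) {
	var out []byte
	buf := make([]byte, 4)
	for {
		n, err := r.Read(buf)
		out = append(out, buf[:n]...)
		if err == io.EOF {
			return string(out), nil
		}
		if err != nil {
			return string(out), err
		}
	}
}

func main() {
	s, err := drain(&lastChunkReader{data: []byte("hello!")})
	fmt.Println(s, err)
}
```

A caller that checked `err` before using `buf[:n]` would silently drop the last chunk here.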


> How can you know that half the things I can represent are wrong? There are many situations where both returning a value and not returning a value are both valid. And where error conditions may not dictate what the returned value should be.

That's why I wrote: Yes there are times when those are valid returns, but easily 99% of the time they're not. The type system should be able to represent both ways.

The golang type system can _already_ encode the case you're talking about, and that should stay. My complaint is that is the _only_ case it can represent, it cannot represent the more common case.


Are you implying that sum types wouldn't be expressed in a function signature?


No, I'm pointing out that some functions don't return errors.


That's right. This is about the language too: if Go doesn't have pattern matching, using sum types (or tagged unions) is a bit gnarly.

This is the reason why, for instance, in the C# Hot Chocolate GraphQL library the newly proposed return type for fallible mutations is, e.g. for a hypothetical register function:

    {
        user: Option<User>,
        errors: [Errors]
    }

Instead of the more type-correct:

    union result = user | Errors

Precisely because using unions is gnarly in JS, whose users are the main consumers of the API.

Similarly, if Go doesn't have a way to pattern match, then actually using sum types is very annoying and this "foo*, error" can be seen as more convenient.


The problem here is not the absence of pattern matching, it is the absence of the "tag" part of tagged unions. Union types like the one you presented (union result = user | Errors) are different from sum types, as they don't have a tag. Without the tag you need a way to know the type of the result at runtime. For example in TypeScript, you can use a type assertion and control-flow analysis to emulate sum types and pattern matching: https://www.typescriptlang.org/docs/handbook/2/everyday-type....

While sum types and union types are close, they are not exactly the same. Let's take an example:

    union bools = bool | bool

    enum Bools = A(bool) | B(bool)

The example is a bit weird but shows an important point: bools has 2 values, true and false. Bools has 4 values, A(true), A(false), B(true), B(false). In bools (the union), true and true is always the same. In Bools (the sum/enum), A(true) is not the same as B(true).

I don't know enough about type theory to know if union types have a relationship with sum types. They can be used the same for most cases but they have differences around the edges.
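The tagged version can be approximated in today's Go with one defined type per tag. This is a sketch only: the marker method `isBools` and the helper `all` are assumptions, and a nil `Bools` is still possible, unlike in a true sum type.

```go
package main

import "fmt"

// Bools emulates the tagged sum from the comment. The unexported marker
// method limits implementers to this package.
type Bools interface{ isBools() }

// A and B are distinct defined types over bool and act as the tags.
type A bool
type B bool

func (A) isBools() {}
func (B) isBools() {}

// all lists the four inhabitants: A(true), A(false), B(true), B(false).
func all() []Bools {
	return []Bools{A(true), A(false), B(true), B(false)}
}

func main() {
	var x, y Bools = A(true), B(true)
	fmt.Println(x == y)              // same payload, different tag
	fmt.Println(x == Bools(A(true))) // same payload, same tag
	fmt.Println(len(all()))
}
```

Interface values compare equal only when both the dynamic type and the value match, which is what keeps `A(true)` and `B(true)` apart.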


Yeah I was a bit sloppy when I wrote my answer. With GraphQL you can mimic tagged unions by giving each branch an object type.

I actually wrote just a few days ago about how cool it would be if a TypeScript-like language had tags too:

https://github.com/Ciantic/thoughts/blob/master/2021/dynamic...


Won't the generics feature type sets fulfill this use case?


I feel like an object lesson here would be context.Context in Go, which sort of rhymes with Result types in a way … I figure this sort of programming in Go post-generics won’t emerge as uniform baseline but thematic variations will arise in some particular domains. But I really don’t know.


Many functions return foo AND the error. This is occasionally very useful.


Yeah, and without sum types those functions declare the exact same type as the ones that return either one or the other.


That's not an error then, or, that's not what I'd consider a result, or both.

Compare Rust's https://doc.rust-lang.org/std/str/fn.from_utf8.html

If you've got a buffer full of bytes, which you hope are UTF8 text but maybe aren't, call std::str::from_utf8 and you either get back a string slice with your text in it, or you get a std::str::Utf8Error structure which explains what the problem was in an actionable format. You can write code to process this Utf8Error and your bytes - maybe you're reading blocks of data and it's all valid UTF8 but the last character is continued in the next block for example, you can discern that from Utf8Error and just turn the valid part into UTF8 and keep the rest until the next block is available.


Occasionally, but pretty rarely in my experience. Both types of functions should exist, it should not be so hard to return an either/or.


When it's occasionally useful, return (type, err), but generally the two results are distinct, so there's no need to return two results in the general case when one suffices.

Indeed, I recently updated some code to return a relevant value in the error case: https://github.com/wal-g/wal-g/pull/1143/files#diff-d896e5d5... where true means the error is rooted in a corrupted page, & false means that the error isn't due to a corrupted page. It does feel a little dirty & a comment is probably necessary to make this case clear

Still, Result can work here. For example, Rust's binary_search https://doc.rust-lang.org/std/vec/struct.Vec.html#method.bin... which returns Ok(idx) when that's the location of the element, or Err(idx) when that's the location the element would be if it were there (this can be a bit annoying when you just want to insert, since you end up writing `let idx = match result { Ok(idx) => idx, Err(idx) => idx }`)
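For comparison, a rough Go sketch of the same lookup using the standard library's `sort.SearchInts`, which returns the insertion index whether or not the element is present. The `find` wrapper is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// find is a hypothetical wrapper: it reports the index where target sits,
// or where it would be inserted, plus whether it was actually found.
func find(sorted []int, target int) (idx int, found bool) {
	idx = sort.SearchInts(sorted, target)
	return idx, idx < len(sorted) && sorted[idx] == target
}

func main() {
	xs := []int{1, 3, 5}
	fmt.Println(find(xs, 3))
	fmt.Println(find(xs, 4))
}
```

Here the index is valid in both cases, so returning it alongside a boolean is a natural fit for multiple return values.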


Not an error then, it doesn't make semantic sense.


So why not use Rust instead since it already implements sum types?


They said they liked sum types, that doesn't automatically mean they want to learn Rust. I liked the STL containers when I read about them in the 1990s, but I don't like C++ and the existence of the containers did not change that.


Because I have a job and am not yet a dictator at my company for technical decisions.


Because Rust throws out the GC which means that you have to deal with complexities of lifetime management.

(I don't agree that sum types should be added to Go, as much as I dislike the language as it is)


As someone coming from C++ I don’t comprehend this complaint. How do you even do anything in a language without deterministic destructors? All my experiences with garbage collection have been painful. RAII is so much easier to use.


For resource management, Java/C# have try-with-resources/using statements. Python has `with`, golang has "defer".

GC addresses other issues like memory sharing, cyclic data structures, etc.
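A short sketch of the `defer` idiom mentioned above. The `resource` type, `open`, and `work` are hypothetical; a slice of strings records the order of events.

```go
package main

import "fmt"

// resource is a hypothetical handle that logs when it is closed.
type resource struct {
	name string
	log  *[]string
}

func (r *resource) Close() { *r.log = append(*r.log, "close "+r.name) }

// open records the acquisition and returns the handle.
func open(name string, log *[]string) *resource {
	*log = append(*log, "open "+name)
	return &resource{name: name, log: log}
}

// work acquires two resources. The deferred Close calls run when work
// returns, in reverse order of the defer statements.
func work(log *[]string) {
	a := open("a", log)
	defer a.Close()
	b := open("b", log)
	defer b.Close()
	*log = append(*log, "work")
}

func main() {
	var log []string
	work(&log)
	fmt.Println(log)
}
```

Cleanup is tied to function exit rather than to scope exit or object destruction, which is the main difference from RAII.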


I didn't see a discussion of equality checking here, so I'll mention it. It's really useful if sum types (aka variants or algebraic data types) do not have identity, i.e. that creating two identical values with the same tag (or case) with the same values compare equal. This allows the compiler to represent sums efficiently, e.g. by packing two 32-bit values into a single 64-bit word. With reference identity (two constructed values are only equivalent if they refer to the same on-heap representation), then this optimization becomes harder (basically only works with escape analysis).

Virgil has variants and also enums. They are slightly different things, though treated much the same under the hood. Neither has identity. (How they are different is that enums can have arguments in Virgil, but there is still a fixed list of named values, whereas variants have named cases and are potentially recursive.)
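The same distinction can be seen in Go, shown here as an illustration rather than anything specific to Virgil. The `pair` type is a hypothetical two-field value.

```go
package main

import "fmt"

// pair is a hypothetical value made of two 32-bit fields.
type pair struct {
	lo, hi int32
}

func main() {
	v1, v2 := pair{1, 2}, pair{1, 2}
	fmt.Println(v1 == v2) // struct values compare field by field

	p1, p2 := &pair{1, 2}, &pair{1, 2}
	fmt.Println(p1 == p2)   // pointers compare by identity
	fmt.Println(*p1 == *p2) // the pointed-to values still compare equal
}
```

Values without identity can be copied or packed freely, while the pointer form ties equality to a particular allocation.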


I’m not a daily Go programmer. I learned the language two years ago and know some of the background behind it. Based on that, I don’t see a place for sum types in Go, because they do not solve a problem that is hard to solve with modern Go code.

But considering the proposal itself. I would approach it much differently.

Sum types could be represented as a closed enum.

    type Result enum int{…}
    const ( Ok Result = iota{interface{}}, Err Result = iota{string} )

The initialization would be done like:

    Ok{10}
    Err{"Something went wrong"}

The variable itself would act just like any other enum, with unpacking being done somewhat close to this:

    Ok{variableName} := result

Where the variableName would be set to the zero value in case of a bad match.

The enum prefix would enforce a single const block and no extension outside the module package.


You are seeing it through the lens of an application developer. As a library author the world looks different though - here you cannot know/define the types in advance.


What does this get you that you don't get by putting an unexported method in your Go interface today? Honest question, since I'm not 100% sure I'm sharing the terminology of the author, and I don't know if this is a matter of the author being unaware of this possibility, or aware and not satisfied for some reason I don't understand.

You don't get a literal enumeration of all implemented types in the interface specification itself, but it's still trivial/mechanical to extract all implementations of such an interface with some code tool, and no external package can add to the list.

(Since I've learned from experience this always comes up: If you put a method named 'unexported' on your interface, it doesn't matter if a value in some other package creates a method called 'unexported', the compiler will not consider it a match. An interface in some package with an unexported method can not be implemented by any other package.)
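A sketch of the unexported-method pattern described above, with hypothetical names (`Shape`, `Circle`, `Square`, `describe`). In a single file it can only show the shape of the pattern; the sealing itself takes effect across package boundaries.

```go
package main

import "fmt"

// Shape is a hypothetical closed interface: the unexported method keeps
// other packages from adding implementations.
type Shape interface{ sealed() }

type Circle struct{ R float64 }
type Square struct{ Side float64 }

func (Circle) sealed() {}
func (Square) sealed() {}

// describe switches over the known implementations. The compiler does not
// check that the cases are exhaustive, so the default branch is a runtime
// guard, not a static one.
func describe(s Shape) string {
	switch v := s.(type) {
	case Circle:
		return fmt.Sprintf("circle r=%v", v.R)
	case Square:
		return fmt.Sprintf("square side=%v", v.Side)
	default:
		panic(fmt.Sprintf("unhandled Shape %T", s))
	}
}

func main() {
	fmt.Println(describe(Circle{R: 2}))
	fmt.Println(describe(Square{Side: 3}))
}
```

Adding a third implementation without updating `describe` still compiles and only fails when the default branch is reached.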


In this proposal, the compiler will yell at you if you switch over the contained type of a value of the interface type and don't have a case for each possible type.


I think a difference is that current behavior is to check the interface at runtime vs static check ensuring that the interface's underlying value is one of those types. Have personally seen a number of bugs (ie panics) from the runtime checks


Go still seems to be a bit behind. Sum types should be standard in every new language - and better type systems also offer union types now.


Note that this is from three years ago, and should have (2018) appended to the title.


Added. Thanks!


[flagged]


Templates? Generics are already on the way, this is a very unrelated suggestion, and seems much less complicated.


> Stop trying to make Go complicate.

You mean enterprise-y.


but golang already has a sum type: multiple return values


Multiple returns are a product, not a sum; in the case of (*Foo, error), you could return nil for both, or neither, which gives you four possible combinations (nil and nil, nil and an error, Foo and nil, and Foo and an error). A sum type would not let you return both a Foo and an error, just one or the other.


This is a huge misunderstanding of what Sum types are. Someone already answered why multiple return types are completely different from sum types, I will also mention that a type is usually something that you define for variable types like int and string and structs, not only for what a function returns.



