GraphQL can be less strict and complicated #932
Comments
Regarding recursive refs in fragments, this is under discussion. By the way, in the sample, did you mean "...Node" as a fragment spread?
fragment Node on TreeNode {
  label
  children {
    ...Node @RecursionLimit(100)
  }
} |
Using interfaces instead of 'Object types' - interesting approach, I think it works already as-is, and for backward compatibility we have to keep 'type' forever. |
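Here's some reasons why input (InputObject) and output object (Object) types probably shouldn't be the same thing:
* Nullability: Changing an Object's field to be non-nullable is a non-breaking change, changing it to be nullable is a breaking change. Conversely, making an InputObject's field nullable is a non-breaking change, but making it non-nullable is a breaking change. Combine these two together (because the type is now both input and output) and you can no longer change the nullability of the fields, which limits schema evolution options.
* Field addition: Adding any field to an Object type is a non-breaking change (because GraphQL makes you explicitly state which fields you need, so a new field will not affect any existing queries). Adding a _nullable_ field to an InputObject type is a non-breaking change (a non-nullable field would be a breaking change because it would make all previous operations using this InputObject invalid since this field will not have been supplied). Therefore you'd only be able to add nullable fields to the GraphQL schema, limiting schema evolution options.
* Symmetry: If you use the same type for input as output then a user might expect to be able to supply an Object that they queried back as input to a future operation; however this may or may not be valid depending on what their selection set was, and as the schema evolves it all but guarantees that people doing this will lose data.
* Asymmetry: Whereas output objects link to many other types, input objects much more rarely want inputs for these links, so when a link like this wants to be added to an existing type that's used in both input and output much thought will have to be given to whether or not to add the relating field (since it will also affect inputs). This again limits schema evolution options.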
I'm sure we can come up with solutions to all these issues (and more) by adding complexity to GraphQL - e.g. making the nullability of a field dependent on whether it's input/output, making some fields input-only or output-only, etc - but the current solution is also the simple solution, i.e. that input types and output types are inherently separate because their concerns do not align. |
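To make the nullability argument in this thread concrete, here is a minimal TypeScript sketch. The names `Position`, `NullabilityChange` and `isBreaking` are made up for illustration and are not part of any GraphQL library; the rules encoded are only the ones stated in the discussion (loosening an output field breaks clients, tightening an input field breaks clients).

```typescript
// Hypothetical helper, for illustration only.
type Position = "output" | "input" | "both";
type NullabilityChange = "toNonNull" | "toNullable";

function isBreaking(position: Position, change: NullabilityChange): boolean {
  // Output position: clients may rely on a non-null guarantee,
  // so making the field nullable can break them.
  const breaksOutput = change === "toNullable";
  // Input position: clients may have omitted the value,
  // so making the field non-nullable can break them.
  const breaksInput = change === "toNonNull";

  if (position === "output") return breaksOutput;
  if (position === "input") return breaksInput;
  // A type shared between input and output inherits both constraints.
  return breaksOutput || breaksInput;
}
```

With a shared type (`"both"`), every nullability change comes out as breaking, which is the schema-evolution limitation being described.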
Consider these examples:
# This is fine.
type Query { a: A! }
type A { b: B! }
type B { a: A! name: String! }

query {a{b{a{b{a{b{
  name # cycle is broken
}}}}}}}

However, this schema can't be queried legally:
# this cycle can't be broken and should be illegal
# there is no valid way to query it.
type Query { a: A! }
type A { b: B! }
type B { a: A! }

query {a{b{a{b{a{b{
  a # illegal
}}}}}}}

The spec allows this construct but it really shouldn't. |
It's illegal not because of the loop of non-null references, but because there are no LEAF-type fields! If you make fields a and b nullable, you still won't be able to construct the query, because there's no field that can end the query. |
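The "no leaf field to end the query" observation can be sketched as a small fixpoint check. This is an illustrative sketch only: the `Schema` shape, the `LEAVES` set and `terminableTypes` are assumptions made up here, only built-in scalars count as leaves, and introspection meta-fields such as `__typename` are deliberately ignored.

```typescript
// Hypothetical schema model: type name -> field name -> named field type.
type Schema = Record<string, Record<string, string>>;

// Assumption: only the built-in scalars are leaves in this sketch.
const LEAVES = new Set(["String", "Int", "Float", "Boolean", "ID"]);

// A type is "terminable" if some field is a leaf, or leads to a terminable type.
function terminableTypes(schema: Schema): Set<string> {
  const ok = new Set<string>();
  let changed = true;
  while (changed) {
    changed = false;
    for (const [typeName, fields] of Object.entries(schema)) {
      if (ok.has(typeName)) continue;
      const hasExit = Object.values(fields).some(
        (fieldType) => LEAVES.has(fieldType) || ok.has(fieldType),
      );
      if (hasExit) {
        ok.add(typeName);
        changed = true;
      }
    }
  }
  return ok;
}
```

For the two schemas discussed here, the version where `B` has `name: String!` marks `Query`, `A` and `B` as terminable, while the version without it marks none of them.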
That's correct. What about the |
Your query can be legal, you just need to provide a selection set since
query {a{b{a{b{a{b{
  a { __typename }
}}}}}}} |
It's in the "Input Objects" section under the heading "Circular References": https://spec.graphql.org/draft/#sec-Input-Objects.Circular-References |
So... There are three glaringly obvious reasons why input and output types should not be distinct:
If there is a credible motivating explanation for this division, I haven't seen it. I've looked pretty hard, and if somebody can point me the right way I'd be grateful.

The issues that @benjie identifies aren't things that a protocol can fix - they are fundamental to the evolution of the underlying information architecture. I really do understand why being able to rapidly evolve a protocol without going through protocol versioning hell is very appealing. The approach taken by Kenton Varda with protocol buffers and Cap'n Proto accomplishes a lot of the same goals without breaking the concept of types. In effect, they distinguish usefully between the static and the dynamic type of the object without giving up either.

The separation also obscures a more fundamental issue that arises from the interaction between the query language and the client cache merging strategy: inconsistency. Suppose I do a query that requests fields A and B of some object. Later, I do a second query that requests fields C and D of that same object. There is no reason to believe - and no way to check - that I now have a consistent version of the object in my client cache. It's entirely possible that there have been 85 mutations of the underlying object in the persistent store between these two queries, and that one or more of them modified the A field or the B field. If so, my cached copy on the client now blends two different versions of the object. The very last thing you want to do in this case is send that object back as an update. This is one of the issues that ultimately made me put GraphQL down.

A potential "fix" here, if you want one, is to ensure that all object [fragments] have a version number (or a lastStored field) whether you asked for it or not, and that the version number gets passed in both directions. This wouldn't significantly impede the ability to expose a third party API through a GraphQL interface.

A more extreme approach would make queries persistent, in effect creating a subscription on the server. The challenge with that is that it would require a significant amount of per-client server state, which makes service replication for scaling much harder. Ultimately I suspect this would run fairly hard into some of the scaling issues that Meteor pub/sub ran into.

The other issue that drove me away is that field-level selectivity and statically typed programming languages really play badly together. If all of your client targets are some variant of a browser (potentially including things like electron), great. But a GraphQL API is quite difficult to invoke from, say, C#. In my opinion, GraphQL fits the browser-based client niche extremely well, but there's more to the world than browsers. Which, ironically, is one of the reasons Geoff Schmidt left Meteor to found Apollo GraphQL. As an example, Shopify's adoption of a GraphQL-based API made life very dramatically harder for a large number of clients written in C#.

We've been building APIs since the earliest days of networking - the first API specification language I learned came from Apollo Computer more than forty years ago. I'm not a fan of discarding new ideas without a clear reason, but skepticism is healthy. GraphQL's type partitioning idea has many strikes against it. The biggest one, from my perspective, is that it is mathematically unsound. |
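The version-number idea suggested in this comment could look roughly like the following on the client. This is a hedged sketch, not how any particular GraphQL client works: `Fragment`, `CachedObject` and `mergeFragment` are hypothetical names, and it assumes every returned fragment carries an object id and a monotonically increasing `version`.

```typescript
// Hypothetical shapes: every fragment carries the object's id and version.
interface Fragment {
  id: string;
  version: number;
  fields: Record<string, unknown>;
}

interface CachedObject {
  version: number;
  fields: Record<string, unknown>;
}

type MergeResult = "stored" | "merged" | "replaced" | "stale";

function mergeFragment(cache: Map<string, CachedObject>, frag: Fragment): MergeResult {
  const existing = cache.get(frag.id);
  if (existing === undefined) {
    cache.set(frag.id, { version: frag.version, fields: { ...frag.fields } });
    return "stored";
  }
  if (frag.version > existing.version) {
    // Newer version: drop old fields instead of blending two versions.
    cache.set(frag.id, { version: frag.version, fields: { ...frag.fields } });
    return "replaced";
  }
  if (frag.version < existing.version) {
    return "stale"; // older than what is cached: ignore it
  }
  // Same version: fields from different queries can be combined safely.
  Object.assign(existing.fields, frag.fields);
  return "merged";
}
```

The design choice here is to discard previously cached fields when a newer version arrives, trading extra refetches for the guarantee that a cached object never mixes field values from different versions.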
@jsshapiro , |
Ah, and about input types and mutations in my app: there are around 40 types to be loaded/updated, so we recognized a problem from the start. We didn't want to define 40+ extra input types and mutations, so we came up with a concept of 'one mutation': a single update endpoint to which you send a list of update packs, each consisting of an ObjectType and a list of field-newValue pairs. We integrated it with the server-side ORM, using the mapping to entities which is already there for querying. |
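As a rough illustration of the 'one mutation' idea, here is a TypeScript sketch of what an update pack and its application might look like. The comment only says a pack is an ObjectType plus field-newValue pairs, so everything else (the `id` field, the in-memory `Store`, the function name `applyUpdatePacks`) is an assumption made for this example rather than the author's actual implementation.

```typescript
// Hypothetical shapes, for illustration only.
interface FieldChange {
  field: string;
  newValue: unknown;
}

interface UpdatePack {
  objectType: string;
  id: string; // assumed: some way to address the target object
  changes: FieldChange[];
}

// Stand-in for the ORM: object type -> id -> entity fields.
type Store = Map<string, Map<string, Record<string, unknown>>>;

// Applies every pack and returns the number of field changes written.
function applyUpdatePacks(store: Store, packs: UpdatePack[]): number {
  let applied = 0;
  for (const pack of packs) {
    const entity = store.get(pack.objectType)?.get(pack.id);
    if (entity === undefined) continue; // unknown type or id: skipped in this sketch
    for (const { field, newValue } of pack.changes) {
      entity[field] = newValue;
      applied++;
    }
  }
  return applied;
}
```

A real implementation would also need validation of field names and value types against the schema, which this sketch leaves out.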
This is actually a solved problem in the type-system of many programming languages. The relevant term here is variance. In this example, the type of the input would be contravariant (= client needs to use exact input type or something less specific), whereas the type of the output would be covariant (= server needs to return that exact input or something more specific). Instead of reinventing the wheel, we should learn from other languages that already solved this. Here is an example.
When correctly applying variance rules, then it is indeed possible for the client to do this! So this is not a problem. Or put in other words: when using the same type as an input and output type, then making a field nullable is a breaking change because it violates the covariance rules (i.e. the server response could break the client). However, certain changes will be possible. Such as adding a new field that is nullable. This is not breaking any variance rule.
I believe you are mainly thinking about the primary input and return type. However, imagine using dedicated input and output types, while having these types contain (= reuse) other types. This is very often a great compromise.
This would indeed add complexity and be a bad idea. I think it's much better to remove currently already existing complexity by using better concepts as a foundation that automatically allow the use cases described by the OP. |
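The variance rules referred to in this comment can be illustrated with function types in TypeScript, which are contravariant in their parameters (under `strictFunctionTypes`) and covariant in their return type. The types `UserInput`, `User` and `CreateUser` below are invented for the example and are not taken from any schema in this thread.

```typescript
// Hypothetical types, for illustration only.
interface UserInput {
  name: string;
  email: string;
}

interface User {
  id: string;
  name: string;
}

// The contract the client was written against.
type CreateUser = (input: UserInput) => User;

// An implementation that requires LESS of its input (only `name`)
// and returns MORE in its output (an extra `createdAt` field).
const impl = (input: { name: string }): User & { createdAt: number } => ({
  id: "u1",
  name: input.name,
  createdAt: 0,
});

// Accepted by the type checker: less specific input, more specific output.
const createUser: CreateUser = impl;
```

The reverse direction (an implementation that demands an extra required input field, or that omits `id` from the result) would be rejected when assigned to `CreateUser`, which mirrors the breaking-change rules discussed for input and output types.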
Valentin:
I apologize that this reply comes so late - September is when a three month
period of madness started at our business.
Some of the issues you raise are valid, but they are mostly orthogonal to
the artificial distinction between input and output types. Having two
separate type systems (input and output) - and they *are* two separate type
systems - is not justified by the fact that some use cases are unusual.
The heart of the issue is what you call "symmetry." I would argue that
symmetry is an essential attribute of a comm specification language,
because without it you cannot describe agents that forward messages. For my
purposes, the inability to do that is a pretty foundational shortcoming in
GraphQL.
Some of the counter-arguments you make turn out to be examples of what
programming language people would describe as contravariance vs. covariance
problems. If subtyping exists in the language (which, in GraphQL, it does),
then these become major design concerns. There are well-visited solutions
in the PL community, but it's something that needs real care in design.
I'm not starting with the [sole] goal of building a simple API. The ability
to forward messages is, for my purposes, a requirement. Which means, in
practice, that GraphQL isn't the right solution for what I want. Also, I
understand the issues in web application scaling much better now than I did
a year ago, and in consequence GraphQL is *definitely* not what I want.
If GraphQL is solving your problem, that's all good and more power to you!
Jonathan
…On Wed, Sep 18, 2024 at 3:17 AM Valentin Willscher ***@***.***> wrote:
Here's some reasons why input (InputObject) and output object (Object)
types probably shouldn't be the same thing:
* Nullability: Changing an Object's field to be non-nullable is a non-breaking change, changing it to be nullable is a breaking change. Conversely, making an InputObject's field nullable is a non-breaking change, but making it non-nullable is a breaking change. Combine these two together (because the type is now both input and output) and you can no-longer change the nullability of the fields, which limits schema evolution options.
* Field addition: Adding any field to an Object type is a non-breaking change (because GraphQL makes you explicitly state which fields you need, so a new field will not affect any existing queries). Adding a _nullable_ field to an InputObject type is a non-breaking change (a non-nullable field would be a breaking change because it would make all previous operations using this InputObject invalid since this field will not have been supplied). Therefore you'd only be able to add nullable fields to the GraphQL schema, limiting schema evolution options.
This is actually a solved problem in the type-system of many programming
languages. The relevant term here is *variance*. In this example, the
type of the input would be *contravariant* (= client needs to use exact
input type or something less specific), whereas the type of the output
would be *covariant* (= server needs to return that exact input or
something more specific).
Instead of reinventing the wheel, we should learn from other languages
that already solved this. Here is an example
<https://docs.scala-lang.org/tour/variances.html>.
* Symmetry: If you use the same type for input as output then a user might expect to be able to supply an Object that they queried back as input to a future operation; however this may or may not be valid depending on what their selection set was, and as the schema evolves it all but guarantees that people doing this will lose data.
When correctly applying variance rules, then it is indeed possible for the
client to do this! So this is not a problem. Or put in other words: when
using the same type as an input *and* output type, then making a field
nullable is a breaking change because it violates the covariance rules
(i.e. the server response could break the client).
However, certain changes will be possible. Such as adding a new field that
is nullable. This is not breaking any variance rule.
* Asymmetry: Whereas output objects link to many other types, input objects much more rarely want inputs for these links, so when a link like this wants to be added to an existing type that's used in both input and output much thought will have to be given to whether or not to add the relating field (since it will also affect inputs). This again limits schema evolution options.
I believe you are mainly thinking about the primary input and return type.
However, imagine using dedicated input and output types, while having these
types *contain* (= reuse) other types. This is very often a great
compromise.
I'm sure we can come up with solutions to all these issues (and more) by
adding complexity to GraphQL - e.g. making the nullability of a field
dependent on whether it's input/output
This would indeed add complexity and be a bad idea. I think it's much
better to *remove* currently already existing complexity by using better
concepts as a foundation that automatically allow the use cases described
by the OP.
|
On Fri, Feb 23, 2024 at 7:08 AM Roman Ivantsov ***@***.***> wrote:
@jsshapiro <https://github.com/jsshapiro> ,
One question, why are you saying consuming graphql from c# is a
trouble/problem? GraphQL client (the one that I have), returns c#
types/objects which are partially filled, depending on the query....
Long answer short, it comes down to the fact that *any* client-specified
query language makes static typing difficult. You can give a type for the
query return value, but what you have in the returned set isn't actually
the objects whose types you have defined - they are *projections* of those
objects. In PL terms, they are row subtypes.
At the C# level - or really, in *most* statically typed languages - you
can't up-convert those projections to the underlying object types. Which
means that the code you build in C#, or go, or whatever either has to be
very careful about where those projection objects end up or it has to treat
the input as essentially dynamic in nature.
Jonathan
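The point about projections can be sketched in TypeScript, whose structural type system can at least name the projected type. `Product` and `project` are hypothetical names for this example; the sketch only shows that the result of a field selection is a different, narrower type than the full object.

```typescript
// Hypothetical object type, for illustration only.
interface Product {
  id: string;
  title: string;
  price: number;
}

// Returns a projection containing only the requested fields.
function project<T, K extends keyof T>(obj: T, keys: K[]): Pick<T, K> {
  const out = {} as Pick<T, K>;
  for (const k of keys) {
    out[k] = obj[k];
  }
  return out;
}

const full: Product = { id: "p1", title: "Tee shirt", price: 20 };

// The static type is Pick<Product, "id" | "title">, not Product.
const partial = project(full, ["id", "title"]);

// const wrong: Product = partial; // rejected by the compiler: `price` is missing
```

In a nominally typed language without an equivalent of `Pick`, the client either generates one class per selection set or fills a `Product` object partially, which is the difficulty described above.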
|
@jsshapiro can you elaborate on why you're saying the GraphQL type system is "unsound"? My current understanding of "soundness" in a type system relates to the fact that the type at build time matches the runtime one (put another way, "if it builds, it runs"). I think this applies to GraphQL as well? The fact that input and output types are separated doesn't necessarily make GraphQL "unsound"? Or does it? For some context, here <https://arxiv.org/abs/2408.10804> is a proof that Kotlin is "unsound" that I recently bumped into. But I fail to find a matching example for GraphQL. Do you have an example/proof that the GraphQL type system is unsound? |
@martinbonnin: I didn't say that the GraphQL type system is unsound - that's
a whole different kettle of fish, and since GraphQL does not describe
*programs*, it would take some thought to figure out what soundness and
completeness *mean* where GraphQL is concerned.
Offhand, I think the question would depend on how the GraphQL types are
rewritten to types in the implementation language, whether that translation
is faithful to the GraphQL types (which would probably need formalization
in order to answer that question), and whether the target implementation
language's type system is sound and/or complete.
GraphQL (at least for now) doesn't have subtyping, so the
covariance/contravariance issue does not arise in the usual way. Adding
inheritance would change this. But there's a similar problem hiding in the
notion of non-breaking changes. Here's an example that benjie gave in issue
932:
Changing an Object's field to be non-nullable is a non-breaking change,
changing it to be nullable is a breaking change. Conversely, making an
InputObject's field nullable is a non-breaking change, but making it
non-nullable is a breaking change. Combine these two together (because the
type is now both input and output) and you can no-longer change the
nullability of the fields, which limits schema evolution options.
This is the same sort of thing you run into in contravariance/covariance
collisions. Another issue is that output types can have a selection set at
the query while input types cannot.
So broadly speaking, the GraphQL design prioritizes non-breaking changes
(to avoid version updates) and selection sets over symmetry of input and
output. Over the last few years I've come to feel that selection sets
aren't actually a good thing from an end-to-end protocol design point of
view:
- A series of returned object fragments can end up caching field values
from different versions of an object on the client as if they were in the
same object. They aren't.
- Because they aren't visible to the various network caches that exist
between the client and the server, selection sets mean that GraphQL query
results cannot be cached by (e.g.) CDNs.
- It is often true that different components or pages will fetch the
same objects with different but overlapping field selections in rapid
succession. Since the return values cannot be cached, this leads to
*more* data motion rather than less, and the cache cannot usefully be
consulted to suppress fetches of objects that are already present on the
client - there's a more general issue hiding in here about blind aggregate
fetches (which often exists in REST APIs as well).
In many (perhaps most) applications, the performance issue isn't bytes
transferred. There are a lot of "parsley" fields (semantically dull) like
"status" that have a limited number of values and are nicely removed by
compression. The app performance issue is more often the combination of high
latency combined with sequential dependencies between requests. The *huge*
win in GraphQL selection sets is that they eliminate many forms of
sequential dependencies. But when you actually dig down and look at how
nested selection sets get used, they are very often cases where a static
protocol definition should have anticipated the need. The most common use
cases are:
- Value types (e.g. I want both a currency value and a currency unit in
a wrapper that makes them inseparable).
- "Owned child" relationships - an order isn't very useful without its
order items.
- References to common reference objects, which are often immutable.
E.g. you can have a lot of tee shirts that reference "size small" and
"color navy", both of which turn out to be objects in production systems
because you need various information about them.
One shortcoming in GraphQL is that it doesn't make great use of HTTP/2
streaming. A lot of the sequential dependency problem can be mitigated
using streaming, and the fact that you can stick a filter in the middle
while building the second request means that you don't need to request
things you already have cached on the client. REST protocols don't do a
great job at this either.
I'm getting way past what was ever contemplated for GraphQL at this point.
I think the main thing I'm trying to say is that there's more than one way
to solve the underlying latency problems, and that those - at least in my
experience - are the primary issue.
Jonathan
…On Mon, Mar 24, 2025 at 3:23 AM Martin Bonnin ***@***.***> wrote:
@jsshapiro <https://github.com/jsshapiro> can you elaborate on why
you're saying the GraphQL type system is "unsound"?
My current understanding of "soundness" in a type system relates to the
fact that the type at build time matches the runtime one (said it
otherwise, "if it builds, it runs"). I think this applies to GraphQL as
well? The fact that input and output types are separated doesn't
necessarily make GraphQL "unsound"? Or does it?
For some context, Here <https://arxiv.org/abs/2408.10804> is a proof that
Kotlin is "unsound" I recently bumped into. But I fail to find a matching
example for GraphQL.
Do you have an example/proof that the GraphQL type system is unsound?
|
@jsshapiro thanks for the quick reply!
You did there:
Since I learnt about "unsoundness" recently, I was curious to learn more about it. But I agree with you that it doesn't really apply to GraphQL as GraphQL isn't describing programs.

As for your other points, I disagree that GraphQL is not cacheable. You can use persisted queries and basically turn your GraphQL API into a very cacheable REST API. Client-side normalized caches also help.

I'm intrigued by the HTTP/2 streaming possibilities. Maybe there's something that GraphQL could do better there? In which case a separate, more focused issue would help a ton! Same for input/output asymmetry. It feels like a bit of an oddity but I'm sure there are tradeoffs involved and a focused discussion would help (if there is not one already!). |
Okay, that's funny. I actually did look to check what I said and I totally
missed it. When I wrote that, I was thinking about inheritance and
contravariance/covariance. When I prepared the later email I realized that
inheritance isn't an issue.
GraphQL queries are not cacheable *in the network fabric*. They can
definitely be cached in the client (as Apollo does), but the objects stored
are inconsistent. When your eCommerce site gets hit with 800,000
simultaneous users and 20 purchases per minute you really want to cache
some of the query results directly in the CDN backed by SSR/ISR and a
timeout.
I haven't thought enough about HTTP2 streaming for GraphQL. I don't think
it's straightforward for REST. The key to it is to be able to partially
process a first stream of objects in order to build the dependent requests
as the original list is still arriving. GraphQL gets *some* of this from
nested selection sets. I suspect that doing anything beyond that requires a
very different query language.
Benjie did a pretty good job on the asymmetry issue in Issue 932.
…On Mon, Mar 24, 2025 at 10:36 AM Martin Bonnin ***@***.***> wrote:
@jsshapiro <https://github.com/jsshapiro> thanks for the quick reply!
I didn't say that the GraphQL type system is unsound
You did there
<#932 (comment)>
:
Formally speaking, the partition of input and output types makes the
GraphQL type system unsound (I'm using "unsound" here in a strictly
technical mathematical sense, not as a value judgement).
Since I learnt about "unsoundness" recently, I was curious to learn more
about it. But I agree with you that it doesn't really apply to GraphQL as
GraphQL isn't describing programs.
As for your other points, I disagree that GraphQL is not cacheable. You
can use persisted queries
<https://www.apollographql.com/docs/kotlin/advanced/persisted-queries>
and basically turn your GraphQL API into a very cacheable REST API. Client
side normalized caches also help.
I'm intrigued by the HTTP2 streaming possibilities. Maybe there's
something that GraphQL could do better there? In which case a separate,
more focused issue would help a ton! Same for input/output disymmetry. It
feels a bit of an oddity but I'm sure there are tradeoff involved and a
focused discussion would help (if there is not one already!).
|
For caching, if using persisted queries/trusted documents + HTTP GET, you can store a lot of things in your CDN. I've been doing that in Confetti. And while the traffic is nowhere near that of a real-world website, the query responses are served from the edge in a few milliseconds.

Obviously, caching is one of the most difficult things in computer science so it's all tradeoffs, etc. And GraphQL being very dynamic can make things look more complicated. But there are solutions out there. And if PQs + client-side cache is not enough, you can also do entity caching in your API gateway, etc.

At the end of the day, GraphQL shares a lot of caching characteristics with REST. GraphQL + persisted queries is just REST where the client gets to build its own endpoints on demand without having to bother a backend engineer. |
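A minimal sketch of the persisted-queries-over-GET idea described here, written in TypeScript. The registry contents, the `id` and `variables` query parameter names, and the function names are all assumptions for illustration; real servers and clients define their own wire format.

```typescript
// Hypothetical registry of trusted documents: id -> query text, agreed at build time.
const trustedDocuments = new Map<string, string>([
  ["productById", "query ($id: ID!) { product(id: $id) { id title } }"],
]);

// Client side: the URL depends only on the document id and the variables,
// so identical requests produce identical GET URLs that a CDN can cache.
function buildGetUrl(
  endpoint: string,
  id: string,
  variables: Record<string, unknown>,
): string {
  const vars = encodeURIComponent(JSON.stringify(variables));
  return `${endpoint}?id=${encodeURIComponent(id)}&variables=${vars}`;
}

// Server side: resolve the id back to the stored document, rejecting unknown ids.
function resolveDocument(id: string): string {
  const doc = trustedDocuments.get(id);
  if (doc === undefined) throw new Error(`unknown document id: ${id}`);
  return doc;
}
```

Because the set of documents is fixed ahead of time, each id behaves much like a REST endpoint from the cache's point of view. One caveat of this sketch is that `JSON.stringify` preserves key insertion order, so clients would need to serialize variables consistently to get cache hits.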
It does have subtyping though. The main aspect of subtyping is substitutability. Therefore, the fact that you can substitute a nullable type (X) with a non-nullable type (X!) in the response(!) without breaking the schema is proof of that. You can find similar examples in the official spec. The same is true for interfaces, unions and so on. |
I've been a software engineer for almost 20 years and I recently started using GraphQL extensively.
While I understand the original intent of making GraphQL as simple as possible, we end up with a language that's too complicated to handle and that takes too much control away from the user.
For instance:
- 'type' seems redundant, we can just use interfaces instead.
- 'input' types are just making everything more complicated, and make it harder to share the same interfaces between inputs and outputs. It's better to define a workaround for field input with arguments than to duplicate everything with input types (i.e. ignore fields with arguments on inputs, or allow a way to provide field arguments on input queries as well).

I understand that this may not be the best place to communicate my thoughts, but all of those points are shared across the internet in many StackOverflow questions from frustrated users. If at least a few more people see this and agree with me then we can start pushing for a change in the proper channels.
We love GraphQL, we just want to make it better ❤️