Description
Relevant discussion:
- Keyword for extending a schema json-schema-spec#907
- How to do inheritance? json-schema-spec#348 (also has lots of reference links)
Examples of current use:
- https://www.jsonschema2pojo.org/ (search "extends")
Typically, inheritance is modelled something like this:
There are no inherent relationships between the data types these schemas describe, and a codegen tool could simply create three disparate classes, each with all of the required properties. However, that's not the intent behind the schemas. In order for that intent to be expressed more explicitly while still allowing normal validation to occur, I propose we add a new annotation keyword indicating that a schema should represent a "derived" type.
derived is a boolean that directs codegen tools to consider in-place references that appear either on their own (sibling $ref) or in an allOf as base types.
The above would change to
// base
{
  "$id": "base",
  "type": "object",
  "properties": {
    "foo": { "type": "string" }
  },
  "required": ["foo"]
}
// derived
{
  "$id": "derived-1",
  "$ref": "base",
  "derived": true,
  "properties": {
    "bar": { "type": "integer" }
  },
  "required": ["bar"]
}
{
  "$id": "derived-2",
  "$ref": "base",
  "derived": true,
  "properties": {
    "baz": { "type": "boolean" }
  },
  "required": ["baz"]
}
Without this keyword, or with a value of false, codegen will be directed to create disparate types.
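As a rough illustration of the rule above, here is a minimal Python sketch of how a codegen tool might pick out base types. The helper name base_refs is hypothetical, and the sketch assumes that only a sibling $ref and direct $ref entries inside allOf count as in-place references; it is not a proposal for how any particular tool must behave.

```python
from typing import Any


def base_refs(schema: dict[str, Any]) -> list[str]:
    """Return the $ref targets a codegen tool could treat as base types.

    Hypothetical helper: it only looks at a sibling $ref and at allOf
    entries that contain a $ref, and only when "derived" is true.
    """
    if schema.get("derived") is not True:
        return []  # keyword absent or false: generate a disparate type

    refs: list[str] = []
    if "$ref" in schema:
        refs.append(schema["$ref"])
    for sub in schema.get("allOf", []):
        # Subschemas without a $ref are ignored in this sketch.
        if isinstance(sub, dict) and "$ref" in sub:
            refs.append(sub["$ref"])
    return refs


derived_1 = {"$id": "derived-1", "$ref": "base", "derived": True}
derived_3 = {
    "$id": "derived-3",
    "allOf": [{"$ref": "base"}, {"$ref": "extension-data"}],
    "derived": True,
}
not_derived = {"$id": "plain", "$ref": "base"}

print(base_refs(derived_1))
print(base_refs(derived_3))
print(base_refs(not_derived))
```

Validation is unaffected in this picture: the keyword is only read by the generator, and a schema without it still validates exactly as before.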
Multiple inheritance
If we had a schema that had multiple references in an allOf, each reference would be a base type.
{
  "$id": "derived-3",
  "allOf": [
    { "$ref": "base" },
    { "$ref": "extension-data" }
  ],
  "derived": true,
  "properties": {
    "baz": { "type": "boolean" }
  },
  "required": ["baz"]
}
for some extension-data schema.
This, however, requires multiple inheritance, which is a problem for many languages. For example, JavaScript and .NET don't support multiple inheritance, whereas C++ and Python do. The workaround for those that don't may be to render the base types as interfaces (i.e. type definitions with no implementation). A "derived" class can then implement any number of these interfaces. You still get the polymorphism, but you don't get a class hierarchy.
For example, in C#, derived-3 may be generated as
interface IBase
{
    string Foo { get; set; }
}

interface IExtensionData
{
    // ... extension data
}

class Derived3 : IBase, IExtensionData
{
    public string Foo { get; set; }
    // ... extension data
    public bool Baz { get; set; } // declared by derived-3 itself
}
Derived3 can be used anywhere an IBase or IExtensionData could be used, and any JSON instance that validates against derived-3 also validates against base and extension-data, so we have polymorphism in that respect.
However, if we need to instantiate an IBase, we'd also need to create a class Base that implements IBase, and that Base class would not be polymorphic with Derived3.
It will be up to the tool to decide when to create base classes versus base interfaces, as required by the target language.
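For comparison, in a language that does support multiple inheritance, a generator could emit concrete classes for both bases. The following Python sketch shows one possible shape for derived-3; the extra field on ExtensionData is a hypothetical stand-in, since the extension-data schema is not shown here, and the use of dataclasses is just one choice a tool might make.

```python
from dataclasses import dataclass


@dataclass
class Base:
    foo: str


@dataclass
class ExtensionData:
    # Hypothetical field: the real extension-data schema is not shown.
    extra: str


@dataclass
class Derived3(Base, ExtensionData):
    baz: bool


# Keyword arguments avoid depending on the order in which dataclass
# collects fields inherited from several bases.
d = Derived3(foo="a", extra="x", baz=True)
b = Base(foo="b")

print(isinstance(d, Base), isinstance(d, ExtensionData))
```

Here Base can be instantiated on its own and Derived3 is still polymorphic with it, which is the property the interface-based workaround gives up.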
(This also illustrates why the JSON Schema team has historically recommended that generative logic should only be used as a developer tool, not in production. Generative logic cannot cater to every scenario, and any generated code should be verified before it's used.)
Other subschemas within an allOf
This issue only covers $ref schemas inside allOf. Other subschemas could be handled either as new types to be "inherited" from or merely as additional definitions on the current type.
I've opened a separate issue for this since how to handle subschemas ties in with simple objects (#46).
Allowing undeclared properties
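Many comments on this topic seem to want to apply additionalProperties (presumably set to false) to the base. But this is wrong. A derived-* instance carries properties that base doesn't declare, so it would fail validation against base, which would mean that a derived-* is not a base. But in terms of inheritance, a derived-* is a base. Leaving additionalProperties off the base solves the problems that arise when it's there.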
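To make the point concrete, here is a small Python sketch using a toy checker. It is not a real JSON Schema implementation: the validates function is hypothetical and only understands required, per-property type, and additionalProperties set to false, which is enough to show the effect on the base schema from earlier.

```python
def _has_type(value, type_name: str) -> bool:
    # bool is a subclass of int in Python, so exclude it for "integer".
    if type_name == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    return isinstance(value, {"string": str, "boolean": bool}[type_name])


def validates(instance: dict, schema: dict) -> bool:
    """Toy check: required, properties/type, additionalProperties: false."""
    props = schema.get("properties", {})
    if any(name not in instance for name in schema.get("required", [])):
        return False
    for name, sub in props.items():
        if name in instance and "type" in sub:
            if not _has_type(instance[name], sub["type"]):
                return False
    if schema.get("additionalProperties") is False:
        if any(name not in props for name in instance):
            return False
    return True


open_base = {
    "type": "object",
    "properties": {"foo": {"type": "string"}},
    "required": ["foo"],
}
closed_base = {**open_base, "additionalProperties": False}

# An instance that a derived-1 schema would describe.
derived_instance = {"foo": "a", "bar": 1}

print(validates(derived_instance, open_base))
print(validates(derived_instance, closed_base))
```

With the open base, the derived instance is still a valid base; once additionalProperties is false on the base, the extra bar property makes it fail.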