-
Notifications
You must be signed in to change notification settings - Fork 213
Extension types #2727
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I like the simplicity of this proposal in particular. It has all the semantics expected of a wrapper/newtype and in its current form does not seem to have many caveats. How does this interact with generics? I'm guessing Also, wondering if this could avoid the field. Like: inline class IdNumber on int {
IdNumber(super);
operator <(IdNumber other) => super < other.super;
} This may be breaking though, as it'll start treating |
Exactly.
That would certainly be possible. The use of an However, the main reason why the proposal does not do these things today is that we gave a lot of weight to the value of having an inline class declaration which is as similar to a regular class declaration as possible. For instance, the use of constructors makes it possible to establish the representation object in many different ways, using a mechanism which is well-known (constructors, including redirecting and/or factory constructors). We may very well get rid of the field in a different sense: We are considering adding a new kind of constructor known as a 'primary' constructor. This is simply a syntactic abbreviation that allows us to declare a constructor with a parameter, and implicitly get a final field with the parameter's type. This is similar to the primary constructors of Scala classes: It is part of the header of the declaration, it goes right after the name, and it looks like a declaration of a list of formal parameters. inline class IdNumber(int value) {
bool operator <(IdNumber other) => value < other.value;
} The likely outcome is that primary constructors will be added to Dart, and they will be available on all kinds of declarations with constructors (that's classes and inline classes, for now). However, the design hasn't been finalized (and, of course, at this point we can't promise that Dart will have primary constructors at all), so they aren't available at this point. |
Does this feature replace the previous proposal of views? |
Probably you have already considered it, but I wonder if it could be useful to extend the specification to support also inline classes with multiple fields. |
An inline class with multiple fields cannot be assignable to With records, you can store several values in one object, so an inline class on a record type could behave like an inline class on multiple fields, with an extra indirection. We could consider allowing multiple fields, and then implicitly making the representation type into a record, but then we'd have to define the record structure (all positional in source order, or all named?). |
@purplenoodlesoop wrote:
Yes. The main differences are that the construct has been renamed, and the support for primary constructors has been taken out (such that we can take another iteration on a more general kind of primary constructors and then, if that proposal is accepted, have them on all kinds of classes). |
@eernstg It seems that this new proposal doesn't have the possibility to |
True. I'm pushing for an extension to the inline class feature to do those things. It won't be part of the initial version of this feature, though. |
I'm curious whether a future version of this feature might include some standardized support for enforcing that the type is a subset of all available values? For example, the Even without a formal ranged value feature, it would be nice to have a convention for asserting correctness of a value, for example, having something like |
The current version definitely do not provide any such guarantee. An alternative where we retain the inline-class type at runtime, so the type I don't think it's impossible. I think making That'd be a feature with much different performance tradeoffs, and would need deep hooks into the runtime system in order to have types (as sets of values) that are detached from our existing types (runtime types of objects). |
@timsneath wrote:
We could choose to use a different run-time representation of an inline type (so the representation of We could then make I proposed a long time ago that we should create a connection between casts and execution of user-written code (so It would probably be possible to maintain a discipline where every value of an inline type is known to have been vetted by calling one of the constructors of the inline type, and similarly it might be possible to prevent that the representation object leaks out (with its "original" type). So there are several things we could do, but we haven't explored this direction in the design very intensively. With the proposal that has been accepted, the run-time representation of an inline type is exactly the same as the run-time representation of the underlying representation type (for instance I believe it would be possible to introduce a variant of the inline class feature which uses a distinct representation of the type at run time and gives us some extra guarantees. We could return to that idea at some point in the future (or perhaps not, if it doesn't get enough support from all the stakeholders ;-). |
Thanks to both of you. Yeah, the feature you describe sounds pretty exciting. I guess I'm wondering whether the current proposal leads us to having two slightly different solutions to the same use case: inline classes and type aliases. In both cases, the type is erased at runtime; the former is available at compile-time. I'm very much speaking as a layperson here, so my comments don't have a ton of weight. But if we can only have two of the three (erased at compile/runtime, hybrid, distinct at compile/runtime), would we want to pick the two that have the greatest distinction in usage scenarios, to maximize their value? Inline classes seem to make type aliases redundant, at least for the things I use them for today. |
Both type aliases and inline classes can be directly mapped to Haskell's Type synonyms and Newtypes, I think I even saw direct references in the feature specifications. I think both have their uses – typedefs carry a semantic value when inline classes explicitly offer new functionality. I see it as two separate use cases. |
@timsneath, I agree with @purplenoodlesoop that the two constructs are different enough to make them relevant to different use cases. Here's a bit more about what those use cases could be: An inline class allows us to perform a replacement of the set of members available on an existing object (a partial or complete replacement as needed, except members of So we can make a decision to treat a given object in a customized manner—because that's convenient; or because we want to protect ourselves from doing things to that object which are outside the boundary of the inline class interface; or because we want to raise a wall between different objects with the same representation type, so that we don't confuse objects that are intended to mean different things (e.g., A typedef makes absolutely no difference for the interface, and it doesn't prevent assignment from the typedef name to the denoted type, or vice versa. On the other hand, we can use a typedef as a simple type-level function, and this makes it possible to simplify code that needs to pass multiple type arguments in some situations. Here is an example where we also use the typedef'd name in an instance creation: class Pair<X, Y> {
final X x;
final Y y;
Pair(this.x, this.y);
}
typedef FunPair<X, Y, Z> = Pair<Y Function(X), Z Function(Y)>;
void main() {
var p = FunPair((String s) => s.length, (i) => [i]);
print(p.runtimeType);
// 'Pair<(String) => int, (int) => List<int>>'.
}
In the first line of I think this illustrates that typedefs and inline classes can at least be used in ways that are quite different, and my gut feeling is that there won't be that many situations where we could use one or we could use the other, and there's real doubt about which one is better. |
This is a really helpful explanation, thanks @eernstg! (We should make sure that we add something like this to the documentation when we get to that point...) |
I could be missing a bunch of aspects to this, but I think given Dart's direction towards a sound language, it's worth exploring how much the soundness can be preserved with this feature. Specifically, the most useful soundness guarantee would be during the One issue we'd want to avoid is reintroduce the wrapper object problem for containers, since to get a type-safe conversion from This is where having inline class InlineList<F, inline T on F> {
InlineList(this._it);
final List<F> _it;
T operator[] (int index) => _it[index] as T;
List<T> cast() => List.generate(_it.length, (i) => _it[index] as T);
} Here Now we can safely view a typedef SocialSecurityNumberList = InlineList<int, SocialSecurityNumber>;
void main() {
final rawList = <int>[1, 2, 3, 4, 5];
final ssnList = rawList as SocialSecurityNumberList;
print(ssnList[3]); // SocialSecurityNumber constructor invoked here
} The following key properties are preserved:
I'm not sure what the issue is with hot reload. Object representations are still the same. If the issue is with soundness, then for hot reload it can be relaxed. Hot reload already has type related issues. After all, all we do is validate the permitted value spaces. This is already an issue with hot reload. Consider expression:
It extracts odd numbers out of a list, and potentially stores it somewhere in the app's state. Now what happens when you change it to:
Then hot reload. Any new invocations will produce even numbers. However, all existing outputs of this function stored in the app state will continue holding onto odd numbers. All bets are off at this point, and you have to restart the app (luckily there's hot-restart). So I think it's totally reasonable for hot reload not preserve that level of soundness. |
The It cannot work with the current design. When we completely erase the inline class at runtime, the Even if we know that the In short: You cannot abstract over inline classes, because abstracting over types is done using type variables, which only get their binding at runtime, and inline classes do not exist at runtime. We could make all inline classes automatically subtype a platform type That will allow one to enforce that a static type is an inline class type, and be used to bound type parameters like suggested above. (No need for special syntax like We can make So, basically, the current inline class design does not provide a way to guard a type against some values of the representation type, no way to create subsets of types. Anything which requires calling a validation operation at runtime is impossible to enforce, because we can skirt it using the completely unsafe The only security you get against using an incompatible If your program contains no |
Thanks for the detailed explanation! Yeah, this would need extra RTTI to work.
Cool. If we require explicit casts, then maybe this is safe enough as is. Anyway, looking forward to trying this out! This is my favorite upcoming language feature so far 👍 |
Is this feature will be shipped in Dart 3.0? It seems perfect to create value objects by the inline class. |
It is not part of 3.0. |
@yjbanov wrote:
Note that we had some discussions about a feature which was motivated by exactly this line of thinking: In an early proposal for the mechanism which is now known as inline classes, there is a section describing 'Protected extension types'. The idea is that a protected extension type is given a separate representation at run time when used as a part of a composite type (that is, as a type argument or as a parameter/return type of a function type). This means that we can distinguish a The main element of this feature is that an inline class can be protected inline class SocialSecurityNumber {
final int value;
SocialSecurityNumber(this._value); // Implicit body: { if (!isValid) throw SomeError(...); }
bool get isValid => ...; // Some validation.
}
void main() {
var ssn = SocialSecurityNumber(42); // Let's just assume that the validation succeeds.
// Single object.
int i = ssn.value; // The inline class can always choose to leak the value.
i = ssn as int; // We can also force the conversion: Succeeds at run time.
ssn = i as SocialSecurityNumber; // Will run `isValid`.
// Higher-order cases.
List<int> xs = [42];
List<SocialSecurityNumber> ys = xs; // Compile-time error.
ys = xs as List<SocialSecurityNumber>; // Throws at run time.
xs.forEach((i) => ys.add(i as SocialSecurityNumber)); // Runs `isValid` on each element.
void f<Y>(List xs, List<Y> ys) {
xs.forEach((x) => ys.add(x as Y)); // Runs `Y.isValid` on `x` when `Y` is a protected inline type.
}
f(xs, <SocialSecurityNumber>[]); // OK.
f([41], <SocialSecurityNumber>[]); // Throws if 41 isn't valid.
} This mechanism would rely on changing the semantics of performing a dynamic type check. This is the action taken by evaluation of The fact that the invocation of Of course, if This would work also in the case where the source code has a type variable We could express the inline class InlineList<X, I> {
final List<X> _it;
InlineList(this._it);
I operator [](int index) => _it[index] as I; // Runs `isValid` if `I` is protected.
List<I> cast() => List.generate(_it.length, (i) => _it[index] as I);
} The whole idea about protected inline classes was discussed rather intensively for a while, e.g., in #1466, but it did not get sufficient support from the rest of the language team. Surely there would still be a need to sort out some details, and in particular it would be crucial that it is a pay-as-you-go mechanism, not an added cost for every dynamic type check. |
I totally agree that it should be pay-as-you-go. The zero-cost abstraction is the main feature of this. Otherwise, the current classes would be just fine. The kind of safety I'm looking for is mostly development-time safety, where performance is not as important as ability to catch bugs. There's not much recovery to do if But yeah, if this makes the language too complex, then this is fine as specified. |
Silly idea: We could have a special variant of inline classes: When assertions are not enabled it would be a plain inline class. This means that List<I> castList<X, I>(List<X> xs) {
if (X == I) return xs as List<I>;
var ys = <I>[];
for (var x in xs) ys.add(x as I);
} During a production execution we would have It is a silly idea, of course, because we would make production execution substantially different from development execution (some types are different in one mode and identical in another), and that is always a serious source of difficulties, e.g., because some bugs occur in one mode and not in the other. But it's still kind of fun. ;-) |
is currently possible to test this? |
The inline class feature can be enabled (using an experimental and not entirely complete implementation) by giving the option |
If I clone |
@eernstg: void testC(int mask) {
var numbers = <NumberC>[];
print('Measure `NumberC` $mask:');
for (int i = 0; i < count; ++i) {
if ((i&mask==0)) numbers.clear();
numbers.add(NumberC(i));
}
speak('${numbers[0]}');
} The size of the batch affects the performance greatly (we already saw this effect earlier). My theory is that in dart, the GC is configured so that the "young generation" is limited to some value between 8K and 16K objects (maybe 10K?). Here's the thing: in a typical use case for union types (passing parameters to functions) the lifetime of the object is short. The tests show that the overhead of allocating/deallocating such objects is very small - probably negligible in the context of the application. In javascript, the GC is configured differently (I ran similar tests to find out). The performance on batches drops sharply much later - after 64K or even later). I speculate that in dart, GC configuration was fine-tuned based on some tests of flutter apps or something. I don't think we have to put too much emphasis on runtime costs for unions. We pay a reasonable price for a valuable feature. EDIT: forgot to mention: the overhead of NumberC vs int is 1.6:1 for batch size < N, where N is approx. 10K objects. |
I would say that such a thing already exists - its called sealed types, and we wouldnt need a package for it.
what we need is proper unions, but thats the burden of the language team. the most we can do is this, and maybe some linter assistance, if you intend on going that far. |
I created @TekExplorer wrote:
Right, if you can commit to always taking the exact same union of types for each of its operands, and you have the ability to edit each of the operand types, and you can manage to declare them in the same library—then you're better off just writing a small sealed hierarchy with the wannabe union type at the top and each of the operands as immediate subtypes. sealed class UnionOfABC {}
class A implements UnionOfABC ... {...}
class B implements UnionOfABC ... {...}
class C implements UnionOfABC ... {...} However, if you want to create a union where one or more operands is an existing type that you can't edit then I don't think it would be very helpful. You could of course use the sealed hierarchy as a set of wrapper objects holding the desired operand types, but that's pretty much a reimplementation of |
@tatumizer wrote:
You could try to play around with In the end, it's the application level performance that matters. If developers have a choice like "use |
Perhaps, but doing that myself means I can give each class a semantically relevant name, and define interface methods as desired. This is what we call "tagged unions" and thanks to sealed classes, it's viable. As a parameter, it's annoying because you can't provide a value directly - which you also can't do with extension types anyway. The only fix for that is either proper unions, or implicit constructors. As a return type, an extension type is actually pretty good since it supports the existing interface, so that something such as jsonDecode could cleanly support it, while still allowing is and as checks. Of course, untagged unions are still better here, since it would be exhaustive. |
@eernstg: I think you are overlooking the argument that union types are mostly used for passing parameters, in which case the objects are always short-lived. So by introducing a "cheap" version of union type, you might be targeting a use case that will rarely (or never) materialize. Why everything in dart has to come in 2 colors? :-) I also have 2 specific questions:
sum type Encodable (int | String | List<Encodable>); // strawman |
@TekExplorer wrote:
Right, and it should also be noted that a sealed hierarchy allows for built-in exhaustiveness analysis. So if it fits into your use case to use a sealed hierarchy then by all means do so! @tatumizer wrote:
As everything that's got to do with performance, it gets more tricky if we try to understand more details. For example, the virtual machine which is running if you do On web platforms compiling to JavaScript, the With About the recursive union types: There is no problem if we want to have algebraic data types similar to the basic ones in SML: datatype arith =
Plus of arith * arith
| Times of arith * arith
| Negate of arith
| Var of string
| Num of int This is very similar to the following (using primary constructors in order to remain somewhat concise): sealed class const Arith;
class const Plus(Arith a, Arith b) implements Arith;
class const Times(Arith a, Arith b) implements Arith;
class const Negate(Arith a) implements Arith;
class const Var(String s) implements Arith;
class const Num(int i) implements Arith; However, the point is that this kind of types will reify the structure of the type declaration, in the sense that if you have a With an untagged union type we want to specify a type which contains the operand type values directly, not as a value which can only be accessed indirectly through reifications of the entire union type structure. I'd assume that you want to be able to test at run time whether a given object is sum type Encodable (int | String | List<Encodable>);
// Another notation that we've used.
typedef Encodable = int | String | List<Encodable>; Consider the situation where we do not reify If we insist that this type includes In particular, if you build an Note that it is possible that someone changes an If All these difficulties are a direct consequence of the fact that we do not want to reify the In JavaScript, types are not reified at run time (roughly, it is still possible to determine that 1 is a number and such), and there is no support for run-time enforcement of statically known types (that is, soundness). As far as I can see, there's no other way to support recursive type aliases in Dart than to reify them. That's a whole other can of worms. ;-) |
@eernstg: WRT smi, things don't add up. :-) Yes, I'm aware of the smi thing. That's why I'm surprised to see the results of the benchmark. And that's exactly why I'm asking.
void testInt(int mask) {
var numbers = <int>[];
print('Measure `int` $mask:');
for (int i = 0; i < count; ++i) {
if ((i&mask==0)) numbers.clear();
numbers.add(i);
}
speak("${numbers[0]}");
} The value of int doesn't exceed 100000000, so it certainly fits in smi. If nothing is boxed, where does the degradation come from? Later in the day, I conducted a crucial experiment: modified the program so that neither int not NumberC variant fill the list at all: in the loop, they just sum up the values: BTW, the absolute performance was about 5x better for int, and 8 times better for NumberC, compared with the previous test where we used to fill the list, even with N=511. Maybe I did something wrong though - the results are too good to be true 😄 |
😁 Running something that would be similar to your example (I used The I then added
As far as I can see, it does make a difference.
Sounds like the optimizer somehow managed to remove all the code that makes those two cases different. That is probably not going to be true when you scale up the complexity of the program. Here are the results I get with the following code: Codeconst count = 1_000_000_000;
late Timer timer;
class Timer {
int start;
Timer(): start = _now();
int get time => _now() - start;
void clear() => start = _now();
static int _now() => DateTime.now().millisecondsSinceEpoch;
}
void speak(String message) {
print(' ${timer.time}: $message');
timer.clear();
}
class NumberC {
final int value;
const NumberC(this.value);
NumberC operator +(NumberC other) => NumberC(value + other.value);
}
// Example from tatumizer -- I guessed the code, wasn't included
void testInt(int mask) {
int sum = 0;
print('Measure `int` $mask:');
for (int i = 0; i < count; ++i) {
if ((i&mask==0)) sum = 0;
sum += func(dummy = i);
}
speak("$sum");
}
// Added by me
dynamic dummy = 0;
int func(int arg) {
if (1 > 2) arg = dummy;
return arg;
}
NumberC anotherFunc(NumberC arg) {
if (1 > 2) arg = NumberC(dummy);
return arg;
}
void testNumberC(int mask) {
int sum = 0;
print('Measure `NumberC` $mask:');
for (int i = 0; i < count; ++i) {
if ((i&mask==0)) sum = 0;
sum += anotherFunc(dummy = NumberC(i)).value;
}
speak("$sum");
}
void main() {
timer = Timer();
testInt(0xFFFF);
testInt(0x7FFF);
testInt(0x3FFF);
testInt(0x1FFF);
testInt(0x0FFF);
testNumberC(0xFFFF);
testNumberC(0x7FFF);
testNumberC(0x3FFF);
testNumberC(0x1FFF);
testNumberC(0x0FFF);
} Results:
If we eliminate the list allocation work (by using the same backing store all the time and storing directly at the desired index), we get the following results:
Here's the code: Codeconst count = 100000000;
late Timer timer;
class Timer {
int start;
Timer(): start = _now();
int get time => _now() - start;
void clear() => start = _now();
static int _now() => DateTime.now().millisecondsSinceEpoch;
}
void speak(String message) {
print(' ${timer.time}: $message');
timer.clear();
}
class NumberC {
final int value;
const NumberC(this.value);
NumberC operator +(NumberC other) => NumberC(value + other.value);
}
// Example from tatumizer.
void testInt(int mask) {
var numbers = List<int>.filled(mask + 1, 0);
var index = 0;
print('Measure `int` $mask:');
for (int i = 0; i < count; ++i) {
if (i & mask == 0) index = 0;
numbers[index] = i;
++index;
}
speak("${numbers[0]}");
}
// Added by me
void testNumberC(int mask) {
var numbers = List<NumberC>.filled(mask + 1, const NumberC(0));
print('Measure `NumberC` $mask:');
var index = 0;
for (int i = 0; i < count; ++i) {
if (i & mask == 0) index = 0;
numbers[index] = NumberC(i);
++index;
}
speak("${numbers[0].value}");
}
void main() {
timer = Timer();
testInt(0xFFFF);
testInt(0x7FFF);
testInt(0x3FFF);
testInt(0x1FFF);
testInt(0x0FFF);
testNumberC(0xFFFF);
testNumberC(0x7FFF);
testNumberC(0x3FFF);
testNumberC(0x1FFF);
testNumberC(0x0FFF);
} This again illustrates that [NB: This is still somewhat relevant to the topic 'extension types' because they are treated like the representation object at run time, and that could be an |
Yeah, I missed the trick with dynamic. I'm thinking of a different test though. When we pass a union as a parameter, the function has to implement a switch over all member types. So any test should incorporate the whole thing, with the switch. I'm going to figure out how to simulate int vs NumberC in this scenario. Will report my findings here. At least one effect got demistified. I changed a growable list to a fixed list of size (mask+1), and the data now agrees with common sense. The culprit behind degrading performance on int was not GC - most likely, it was cache. The cache lines of all L1, L2, L3 caches are all of the same size (64 bytes). As we grow a list, we have more L1 cache misses, then L2 cache misses, etc. (they become progressively more expensive). That's my theory. As usual, I can be wrong. On every realistic scenario I've tried, the compiler optimizes away NumberC while calling a function |
Consider the experiment that uses a pre-allocated list, such that we're focusing more precisely on the cost of object allocation and creation ( The relative cost is relevant if we're considering all kinds of representation objects: In short, if you recommend that developers stop using
I'd suspect that those scenarios are exactly the opposite: They are unrealistic, in the sense that this optimization will not be possible in an actual application, with the increase in size and complexity.
Built-in, nominal, non-recursive union types with reification is indeed a possible way ahead for union types in Dart. It's a huge effort (like everything that introduces a new kind of types), and controversial (e.g., because it's likely to make type inference a lot slower), but I think it is possible. That's for #83. |
The test of int vs NumberC is a bit contrived: we are comparing the weight of infusoria with the weight of infusoria-in-a-box - it's not completely unexpected that the box weighs more than its contents. The difference is just ~15%. There's no optimization here - these are realistic figures. Code```dart void testString(int mask) { var strings = List.filled(mask + 1, ''); var index = 0; print('Measure `String` $mask:'); for (int i = 0; i < count; ++i) { if (i & mask == 0) index = 0; strings[index] = '$i'; ++index; } speak(strings[0]); } class StringInABox { final String value; const StringInABox(this.value); } void testStringInABox(int mask) { var boxedStrings = List.filled(mask + 1, const StringInABox('')); print('Measure `StringInABox` $mask:'); var index = 0; for (int i = 0; i < count; ++i) { if (i & mask == 0) index = 0; boxedStrings[index] = StringInABox('$i'); ++index; } speak(boxedStrings[0].value); } ```My point was not to question the fact that the boxes affect performance. I was just trying to determine how much they affect performance in the case of reified union types. If we agree that 15% is not too much (for bigger behemoths the overhead in % terms will be much lower), then the next question is: I remember that union types were requested as a way of interfacing with Javascript APIs, which often come with parameters of essentially union types. This goal can still be achieved with our stripped-down union types. It's not necessarily "all or nothing at all". For example, dart currently supports a very limited subset of constant expressions. But even this limited subset is useful! My theory from the previous message was that the performance degradation we see on It tells us that it the expected slowdown factor while creating a boxed object can be estimated roughly as
For the Even if GC comes into play at bigger sizes, this ratio will remain more or less the same - because the size of new gen is configured in terms of MB, not the total number of objects). (*) these estimates are valid only in the scenario when the list is big enough to hit the limit of L1 cache, which is 64KB. |
I'm thinking along the following lines: what will happen if we support union types like typedef MyUnion=int | String;
// or, better yet
union type MyUnion = int | String; such that MyUnion is not an Object? MyUnion u1 = 1; // OK
MyUnion u2 = 'abc'; // OK
Object o = whatever;
o is MyUnion; // warning : expression is always false
o as MyUnion; // ERROR impossible cast: o cannot have type MyUnion
List<MyUnion> list = [1, 'abc']; // OK
list as List<Object>; // ERROR impossible cast:
u1 is int; // ERROR: impossible cast
u1.value is int; // OK
print(u1.value); Do you know any counterexamples? (I can see one problem with this: if somebody asks "What is the static type of |
That seems unnecessary. Why would we need another non- What you're suggesting is just a pointer, which provides no real benefit. Especially when the whole point of the union is to be able to treat it Your suggestion would also prevent adoption by interfaces that need it, like Even internal stuff like in Let alone allowing for accepting multiple types, making stuff like Not to mention that making it not an object for some reason would prevent it from being used in most places. |
This is necessary if we want to implement recursive types like Encodable. Object map=<int, String>{};
map is Encodable; // how do you implement that? To be able to answer the question whether a given object is Encodable, the runtime has to have enough information passed in Type objects. With the current structure of Type objects, there's no way to do that. So, to implement this feature alone, dart would have to come up with a different, much more complicated, design of Type objects1, which will hurt the performance of everything. f(Encodable e) {};
var map = <int, String>{}; // static type <int, String>
f(map); // OK - because compiler can statically verify that Map<int, String> is encodable
f(<Foo, int>{}); // STATIC error Footnotes
|
@tatumizer wrote:
First, this is the 'extension type' issue and it does make sense to talk about specific usages (including That said—it could still be considered relevant to the overall topic of extension types to discuss reified vs. non-reified types, and their inherent capabilities and limitations.
(or was that "not an The most crucial question here is whether or not
If the union type becomes a ghost type for the compiler, my interpretation would be that it is not reified (at least, that's what I meant when I said 'ghost type'). That is, I think this implies that the type declarations named In any case, This kind of type is not useless, but An important example is that the system provided collection types ( extension type Name(String _it) {}
extension on String { Name get toName => Name(this); }
List<X> uniquify<X>(List<X> list) {
final set = list.toSet();
return set.toList();
}
void main() {
final names = [
'Phileas Misst'.toName,
'Sherlock Isletes'.toName,
'Phileas Misst'.toName
];
final uniqueNames = uniquify(names);
List<Name> _ = uniqueNames; // OK.
} The point is that even without reification, the invocation of I proposed in #3620 that we should support parametric type parameters (that is, the compiler should check that the run-time value of the actual type argument is actually never used), but even without such a mechanism it may be useful to be aware of the fact that non-reified types can safely and usefully be used as actual type arguments. Agreeing with @TekExplorer on ..
.. I don't immediately understand how it would help. We already have that kind of type, by the way: Every extension type that doesn't declare that it If we're talking about "another non- The language team has discussed that kind of type a number of times. They could be used to disable a lot of OO features for specific types, which could then be very low-level. In particular, we could support bit fields as parts of bit vectors (perhaps If that's the point ("they are never the value of a type variable, or a subterm in a composite type which is the value of a type variable") then we could just make it an error for such types to occur in those ways. Another way to say this is that union types are not first class types. I suspect that we won't get anything else out of introducing a non- However, I do think somebody is going to say "but I want to have a @tatumizer wrote (about non-Object or non-Object? types):
I already argued here that there's no way we can have recursive union types without reification. If we have a declaration like this: typedef Encodable = int | String | List<Encodable>; where If it's an
Right, this is precisely the difference between the non-reified and the reified version of I don't think there is anything particularly complex about the run-time representation of the reified
class Foo {} // Just a class, unrelated to `Encodable`.
f(Encodable e) {}
void main() {
var map = <int, String>{}; // Static type `Map<int, String>`.
f(map); // OK - because compiler can statically verify that Map<int, String> is encodable.
f(<Foo, int>{}); // Compile-time error.
} I adjusted the code in a couple of locations. I couldn't find a declaration of typedef Encodable = int | String | List<Encodable> | Map<Encodable, Encodable>; We might have subtype relationships like |
I admit I made a mistake when I mentioned the type object for Encodable - indeed, if union types don't extend Object, then this assumption was unnecessary. Here's why I was thinking about union types that don't implement Object. I assumed that while evaluating the predicate If the union type is not an Object, then the predicate like If we allow a union to be an Object (that is, a union becomes a fully reified type), the compiler will often be able to eliminate redundant checks based on static information. This would cover most of the real-life scenarios. I overlooked this argument, too. Footnotes
|
I think it would be useful to make a couple of things explicit: If a union type like In any case, I would not expect a union type to be reified in any way at run time in an instance of any of the non-recursive operand types: typedef Encodable = int | String | List<Encodable> | Map<Encodable, Encodable>;
Encodable e = "Hello!"; The value of On the other hand, an In other words, when a union type (recursive or not) is reified as a type argument, it is possible to determine in time O(1) whether any given object does or does not have that type. The need to traverse an arbitrarily large data structure in order to be able to say "yes" or "no" only arises when the union type is not reified as a type argument. I don't see any connection to the subtype relationship: An entirely separate question is whether a value of type Y upcast<X extends Y, Y>(X x) => x;
void main() {
Encodable e = upcast<String, Encodable>("Hello!");
} The upcast occurs in the body of Wrapper objects are relevant to some emulations of union types, but they wouldn't be part of a real language mechanism.
That wouldn't suffice: void main() {
dynamic d = someCondition ? "Hello!" : #aSymbol;
if (d is Encodable) { // Is true if and only if `d` is the string.
...
}
} We can't know at compile time whether However, we can compile the expression to a bigger expression which will do the right thing: void main() {
dynamic d = someCondition ? "Hello!" : #aSymbol;
if (d is int || d is String || d is List<Encodable> || d is Map<Encodable, Encodable>) {
...
}
} If
This is where I get a little bit confused—does 'allow a union to be an Object' mean that I don't think any approach that includes a wrapper object will be an acceptable foundation for union types as a language feature, but the phrase 'a union becomes a fully reified type' does sound a bit like there is a wrapper object? The wrapper object can be part of an emulation of union types (in particular, that's true for class_unions), but this will definitely prevent subtype relationships (at compile time and at run time) from the operand types to the emulated union type. This again implies that every value of that emulated union type must be "injected" into the union type by means of some sort of computation (e.g., |
The "connection" between Encodable and Object I had in my head was an illusion - I failed to find the counterexamples like the one with For the reified variant, one option I had in mind is the following. The compiler can see that Encodable is a union type (doesn't matter if it's recursive or not in this context) Examples: 5 as Encodable; // compiler says: unnecessary cast
Object five = 5;
five as Encodable; // runtime cast successful
var list=<Encodable>[1, 'A'];
list.add(1); // OK
list.add('a'); // OK
list.add(RandomFoo()); // static error: type mismatch
List<Object> list1 = <Encodable>[1, 'A'];
list1.add(5); // OK in comptime, OK in runtime
list1.add(RandomFoo()); // accepted by compiler, but fails in runtime Can you find any contradictions in this model? If not, there's another observation: unless the program loosens the type of Encodable to Object, all the checks are implemented in comp. time and have zero runtime cost 1. And, without the wrapper, the whole thing has zero cost also in terms of memory. Footnotes
|
I'm considering a union type mechanism which would be reified as a type argument, using no wrapper objects, having only the subtype relationships with its operands (so This seems to be compatible with your assumptions. I think the 'extra pass' in the compiler would not be needed: If we can determine whether a subtype relationship exists then there's nothing that forces this query to rely on data in the subtype (so // We can say `implements Object` because every operand type is a subtype.
union type Encodable implements Object =
int | String | List<Encodable> | Map<Encodable, Encodable>;
void main() {
// `unnecessary_cast` reported, no error, yields `true`.
5 as Encodable;
Object five = 5;
// Does not throw.
five as Encodable;
// OK: `int <: Encodable` and `String <: Encodable`.
var list = <Encodable>[1, 'A'];
list.add(1); // OK.
list.add('a'); // OK.
list.add(RandomFoo()); // Compile-time error, not assignable to `Encodable`.
// OK, `Encodable <: Object` based on explicit `implements` clause.
List<Object> list1 = <Encodable>[1, 'A'];
list1.add(5); // No error at compile time, does not throw at run time.
list1.add(RandomFoo()); // No error at compile time, throws at run time.
} Yes, going through the cases I do get the exact same outcome. So the models we're considering are at least compatible.
That wouldn't be true with the model I'm considering: For instance, five is int
? five
: five is String
? five
: five is List<Encodable>
? five
: five is Map<Encodable, Encodable>
? five
: throw new TypeError(...) In the situation where So we can't expect to settle all questions about type queries involving union types at run time, or even to generate code that simply unfolds every given union type to its operand types. But we can perform the tasks that are needed for those queries at run time. |
The same is true with my model. I see you have an issue with the "extra pass", right? It's in fact a "micro-pass", it will be very fast. Suppose the problem of detecting whether type is Encodable is solvable - maybe we have an oracle or something. Footnotes
|
The point is that this is not necessary. Consider two function types likes The same thing would be true for union types: Each union type has a representation that allows us to find the list of operand types at run time. We may then use the operand types to determine whether or not there is a subtype relationship with any other type. So it's simply not necessary (and not even a good idea) to try to represent the subtype relationships directly in the run-time representation of structural types (such as union types or function types).
The crucial distinction here is "being reified as a type argument". If we have that property then the run-time subtype and instance-of queries will have complexity O(1) in the size of the structure of the type, if we don't have it then there's no limit on the amount of work we may need to perform during an instance-of query (and the subtype query will probably be equivalent to graph isomorphism on the structure of the types, which is also expensive). But that doesn't even matter much, because it won't work in the first place (as I mentioned here: immediately after we've traversed an object structure consisting of a million objects and confirmed that it satisfies the constraints associated with a specific type, the requirements could already be violated. In other words, being reified as a type argument is an absolute requirement. If we have that then the subtype and instance-of queries will trivially be O(1). If we don't have it then nothing works anyway, and it may take an unbounded amount of time to do that. ;-) |
Your example of Functions is not convincing. I understand that for functions, it's an expensive operation. This doesn't mean it's OK to make the cast an expensive operation for any other type.
That's what I don't quite understand. List<String> list=['a', 'b'];
list is Encodable; // unnecessary cast! Here, the predicate List<Object> list=['a', 'b'];
list is Encodable; // runtime cast How are you going to have a runtime cost of O(1) here, without precomputing the set of Encodable types in comptime and passing it to the runtime? If that information is not encoded in a kind of hashset, and not encoded in the RTO of |
I don't think it's particularly expensive (we need to perform 2 subtype checks in order to determine whether or not
True, that's an unnecessary cast, but it will still be performed at run time whenever your program reaches that point in the code. The lint just says that this is wasteful, that doesn't mean we won't do it. So it's still O(1) work (where "1" is considered to represent the number of operand types in the union type
This will give rise to a small amount of work at run time (namely: one subtype check per operand type, and we have 4 of those). The result is I think it's fair to say that 4 subtype tests is O(1) even though it is strictly O(n) where n is the number of operands in the union type. But for any given program we can scan it and find all union types and choose a constant
I think it's highly unrealistic that the subtype relationship from |
Oh, I made a mistake in my example, I meant List<Object> list=<String>['a', 'b']; // born with the type List<String>
list is Encodable; // runtime cast, returns true Now it should return true, right? You remember, we started the discussion with the question: can we express union types such that they have zero extra cost? Are you satisfied with your model? Do you expect any "worms" that are still unaccounted for? BTW, I'm surprised there's no mechanism for the compiler to pass a flag to the runtime: I've checked this value statically, I guarantee it conforms to the runtime type, so don't insert runtime checks. |
Indeed!
Oh, but the run-time representation certainly needs to somehow encode each of the operand types such that it is possible for the runtime to iterate over them when it is testing whether there is a subtype relation to some other type. The work is still proportional to the number of operand types, except that in the case where an operand type is recursive (like
I'd like to have a more firm opinion about how to specify supertypes of a union type. I think it's really, really useful for performance reasons (during inference, in particular) to say that there are no other direct supertypes than the ones that the |
[Edit, mit-mit, Feb 2024 This feature launched in Dart 3.3; we have docs here.]
[Edit, eernstg: As of July 2023, the feature has been renamed to 'extension types'. Dec 2023: Adjusted the declaration of
IdNumber
—that example was a compile-time error with the current specification.][Edit, mit: Updated the text here too; see the history for the previous Inline Class content]
An extension type wraps an existing type into a new static type. It does so without the overhead of a traditional class.
Extension types must specify a single variable using a new primary constructor syntax. This variable is the representation type being wrapped. In the following example the extension type
Foo
has the representation typeint
.Extension types are entirely static. This means there is no additional memory consumption compared to using the representation type directly.
Invocation of a member (e.g. a function) on an extension type is resolved at compile-time, based on the static type of the receiver, and thus allows a compiler to inline the invocation making the extension type abstraction zero-cost. This makes extension types great for cases where no overhead is required (aka zero-cost wrappers).
Note that unlike wrapper creates with classes, extension types do not exist at runtime, and thus have the underlying representation type as their runtime type.
Extension types can declare a subtype relation to other types using
implements <type>
. For soundness,implements T
whereT
is a non-extension type is only allowed if the representation type is a subtype ofT
. For example, we could haveimplements num
in the declaration ofIdNumber
below, but notimplements String
, becauseint
is a subtype ofnum
, but it is not a subtype ofString
.Example
Specification
Please see the feature specification for more details.
This feature realizes a number of requests including #1474, #40
Experiment
A preview of this feature is available under the experiment flag
inline-class
.Implementation tracking
See dart-lang/sdk#52684
The text was updated successfully, but these errors were encountered: