I understand the desire here. Runtime type checking is often necessary for data validation, and we can see lots of libraries developed to help fill the gap here. But I think the fact that there are so many libraries with different design decisions is pretty indicative that this is not a solved problem with an obvious solution. We knew this going into the early design of TypeScript, and it's a principle that's held up very well.
What we have been happy to find is that we've grown TypeScript to be powerful enough to communicate precisely what runtime type-checking libraries are actually doing, so that we can derive the types directly. The dual of this is that people have the tools they need to build up runtime type validation logic out of types by using our APIs. That feels like a reasonable level of flexibility.
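To make the "derive the types directly" point concrete, here is a rough hand-rolled sketch of the idea, not the API of any particular validation library. All of the names (`Validator`, `Infer`, `object`, `isUser`) are made up for illustration; the only assumption is that a validator is written as a type guard.

```typescript
// Hypothetical mini validation library; every name here is illustrative.
// A validator is a type guard that narrows `unknown` to T at run time.
type Validator<T> = (input: unknown) => input is T;

// Recover the static type that a validator checks for.
type Infer<V> = V extends Validator<infer T> ? T : never;

type Shape = Record<string, Validator<any>>;
type InferShape<S extends Shape> = { [K in keyof S]: Infer<S[K]> };

function isString(input: unknown): input is string {
  return typeof input === "string";
}

function isNumber(input: unknown): input is number {
  return typeof input === "number";
}

// Build an object validator out of per-field validators.
function object<S extends Shape>(shape: S): Validator<InferShape<S>> {
  return (input: unknown): input is InferShape<S> => {
    if (typeof input !== "object" || input === null) return false;
    const record = input as Record<string, unknown>;
    // Every field named in the shape must pass its own validator.
    return Object.keys(shape).every((key) => shape[key](record[key]));
  };
}

const isUser = object({ name: isString, age: isNumber });

// The static type is derived from the runtime check, not written twice.
type User = Infer<typeof isUser>; // { name: string; age: number }

function greet(input: unknown): string {
  if (!isUser(input)) return "invalid user";
  const user: User = input; // narrowed by the type guard
  return `${user.name} is ${user.age}`;
}
```

The runtime check is the single source of truth here, and the compiler only describes what that check already guarantees.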
I'm relatively new to programming and had a question about TypeScript's functionality. Is there any specific reason why TypeScript doesn't allow for the creation of custom and intricate data types? For example, I'm unable to define a number type within a specific range, or a string that adheres to a certain pattern (like a postal code).
I'm imagining a language where I could define a custom data type with a regular function. For instance, I could have a method that the compiler would use to verify the validity of what I input, as shown below:
  function PercentType(value: number) {
    if (value > 100 || value < 0) throw new Error();
    return true;
  }
Is the lack of such a feature in TypeScript (or any language) a deliberate design decision to avoid unnecessary complexity, or due to technical constraints such as performance considerations?
You could trivially define a `parsePostalCode` function that accepts a string and yields a PostalCode (or throws an error if it's the wrong format).
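A minimal sketch of what that could look like, using the common "branded type" trick. The `PostalCode` brand and the five-digit pattern are assumptions for illustration; real postal codes vary by country.

```typescript
// Hypothetical branded type: a string that can only be obtained via the parser.
type PostalCode = string & { readonly __brand: "PostalCode" };

// Assumption: a five-digit US-style code. Swap the pattern for other formats.
const POSTAL_CODE_PATTERN = /^\d{5}$/;

function parsePostalCode(input: string): PostalCode {
  if (!POSTAL_CODE_PATTERN.test(input)) {
    throw new Error(`Invalid postal code: ${input}`);
  }
  // The cast is the one place where we vouch for the invariant.
  return input as PostalCode;
}

// Functions that require a validated value ask for PostalCode, not string.
function formatLabel(code: PostalCode): string {
  return `ZIP ${code}`;
}

const label = formatLabel(parsePostalCode("90210"));
// formatLabel("90210") would be a compile error: a plain string is not a PostalCode.
```

The check still happens at run time, but the type system tracks which strings have already been through it.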
Ranges like percent are much trickier—TypeScript would need to compute the return type of `Percent + Percent` (0 <= T <= 200), `Percent / Percent` (indeterminate because of division by zero or near-zero values), and so on for all the main operators. In the best case scenario this computation is very expensive and complicates the compiler, but in the worst case there's no clear correct answer for all use cases (should we just return `number` for percent division or should we return `[0, Infinity]`?).
In most mainstream programming languages the solution to this problem is to define a class that enforces the invariants that you care about—Percent would be a class that defines only the operators you need (do you really need to divide Percent by Percent?) and that throws an exception if it's constructed with an invalid value.
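Roughly, such a class might look like the sketch below. Only the `Percent` name comes from the comment above; the `plus` method and the choice to throw a `RangeError` are assumptions made for the example.

```typescript
// Illustrative sketch: a value class that enforces 0 <= value <= 100.
class Percent {
  readonly value: number;

  constructor(value: number) {
    // Reject NaN, infinities, and anything outside the closed range.
    if (!Number.isFinite(value) || value < 0 || value > 100) {
      throw new RangeError(`Percent must be between 0 and 100, got ${value}`);
    }
    this.value = value;
  }

  // Only the operators you actually need are defined. The result goes
  // back through the constructor, so the invariant is re-checked.
  plus(other: Percent): Percent {
    return new Percent(this.value + other.value);
  }
}

const total = new Percent(40).plus(new Percent(30));
```

Note that the question of what `Percent + Percent` should be does not go away; the class just forces the author to answer it explicitly (here: throw if the sum leaves the range).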
This is a feature some (experimental) programming languages have - look into dependent types. The long-and-short of it is that it adds a lot of power, but comes at an ergonomic cost - the more your types say about your code, the more the type checker needs to be able to understand and reason about your code, and you start to run up against some fundamental limits of computation unless you start making trade-offs: giving up Turing-completeness, writing proofs for the type checker, stuff like that.
Another interesting point of reference is "refinement types", which allow you to specify things like ranges to "refine" a type; the various constraints are then run through a kind of automated reasoning system called an SMT solver to ensure they're all compatible with each other.
> Is the lack of such a feature in TypeScript (or any language) a deliberate design decision to avoid unnecessary complexity, or due to technical constraints such as performance considerations?
It makes a lot of things impossible. For example, if you defined two different types of ranges, OneToFifty and OneToHundred similarly to your PercentType above, the following code would be problematic:
let x: OneToFifty = <...>;
let y: OneToHundred = <...>;
y = x;
Any human programmer would say the third line makes sense because every OneToFifty number is also OneToHundred. But for a compiler, that's impossible to determine because JavaScript code is Turing-complete, and so it can't generally say that one is certainly a subset of the other.
In other words, any two custom-defined types like that would be unassignable from and to each other, making the language much less usable. Now add generics, co-/contravariance, type deduction, etc., and suddenly it becomes clear how much work adding a new type to the type system is; much more than just a boolean function.
That said, TypeScript has a lot of primitives, for example, template string types for five-digit zip codes:
type Digit = '0' | '1' | '2' | '3' | <...> | '9';
type FiveDigitZipCode = `${Digit}${Digit}${Digit}${Digit}${Digit}`;
(Actually, some of these are Turing-complete too, which means type-checking will sometimes fail, but those cases are rare enough for the TS team to deem the tradeoff worth it.)
It's the fundamental programming language design conundrum: every programming language feature looks easy in isolation, but once you start composing it with everything else, it gets hard. And hardly anything composes as complexly as programming languages.
There's sort of a meme where you should never ask why someone doesn't "just" do something, and of all the people you shouldn't ask that of, programming language designers are way, way up there. Every feature interacts not just with itself, not just with every other feature in the language, but also in every other possible combination of those features at arbitrary levels of complexity, and you can be assured that someone, somewhere out there is using that exact combination, either deliberately for some purpose, or without even realizing it.
type Enumerate<N extends number, Acc extends number[] = []> = Acc['length'] extends N
  ? Acc[number]
  : Enumerate<N, [...Acc, Acc['length']]>
type NumberRange<F extends number, T extends number> = Exclude<Enumerate<T>, Enumerate<F>>
type ZeroToOneHundred = NumberRange<0, 100>
One limitation is that this has to be bounded on both ends so constructing a type for something like GreaterThanZero is not possible.
Similarly for zip codes you could create a union of all possible zip codes like this:
type USZipCodes = '90210' | ...
Often, the solution to the idea you have in mind is to implement an object whose constructor checks the requirements at run time: if the checks pass, it creates the instance, and otherwise it throws a runtime error.
In functional programming this is often handled with the Option type, which can be thought of as an array that always has exactly 0 or 1 elements: 0 elements when a constraint is not met and 1 element when all constraints are met.
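As a rough illustration (not the API of any particular library), an Option can be sketched as a tagged union instead of a 0-or-1-element array. The helper names `some`, `none`, `map`, `getOrElse`, and `toPercent` are all invented for this example.

```typescript
// Illustrative Option type: either a value is present or it is not.
type Option<T> = { kind: "some"; value: T } | { kind: "none" };

function some<T>(value: T): Option<T> {
  return { kind: "some", value };
}

const none: Option<never> = { kind: "none" };

// Validation returns an Option instead of throwing.
function toPercent(value: number): Option<number> {
  return value >= 0 && value <= 100 ? some(value) : none;
}

// Transform the value only if one is present.
function map<T, U>(opt: Option<T>, f: (value: T) => U): Option<U> {
  return opt.kind === "some" ? some(f(opt.value)) : none;
}

// Leave the Option world by supplying a fallback for the empty case.
function getOrElse<T>(opt: Option<T>, fallback: T): T {
  return opt.kind === "some" ? opt.value : fallback;
}

const ratio = getOrElse(map(toPercent(50), (p) => p / 100), -1);
const invalid = getOrElse(map(toPercent(150), (p) => p / 100), -1);
```

The caller has to handle the empty case before it can get at the number, which is the main difference from the throwing-constructor approach.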
This [0] is a library I wrote for JS/TS that provides an implementation of Options. Many others exist and other languages like Rust and Scala support the Option data structure natively.
Maybe official preprocessor plugins for the TypeScript compiler could help?
I understand that everybody who needs it can already put their own preprocessor that generates runtime objects from type information before the code is passed to tsc for compilation.
But the effort is inconsistent and distributed.
If TypeScript officially supported a pluggable preprocessor and a plugin ecosystem for it, some good solutions might get discovered.
This is the curse of guest languages: after the initial adoption pain, everyone wants idiomatic libraries and pretends the underlying platform doesn't exist.
Until they hit a roadblock caused by a leaky abstraction, which proves them wrong.
TypeScript does a very good job of not hiding the underlying platform. In its essence it is just a development-time linter and does not interfere with the JavaScript runtime at all (except for enums).
And I think that's actually the reason why it won the competition against Google's Dart. They even used Microsoft's TypeScript for Angular instead of their own language, Dart.