Downcasting constrained generic type

Topics: General, Language Specification
Jul 3, 2014 at 3:49 PM
I ran into a problem with trying to downcast a variable of a constrained generic type. Here's an example of what doesn't work. It seems to me that both casts should succeed, but when I try to downcast the argument of generic type, I get a compile error, even though T is constrained to the base type of the type being cast to. Is this an intentional limitation? Thanks.
class Base { }
class Derived1 extends Base {
    public name1 = "Derived1";
}
class Derived2 extends Base {
    public name2 = "Derived2";
}

class Container<T extends Base> {
    public aMethod(arg: T): string {
        // Next line gives compiler error: error TS2012: Cannot convert 'T' to 'Derived1'
        return (<Derived1>arg).name1;  
    }
    
    public anotherMethod(arg: Base): string {
        return (<Derived1>arg).name1;  // this compiles fine
    }
}
    
Coordinator
Jul 3, 2014 at 4:47 PM
This is because we know that T is a subtype of Base, but we don't know which one. Because of this, it can't be shown that you can safely move from this subtype of Base to Derived1 or any other subtype of Base.

In the second part of the example, we know that arg is of type Base locally, so we know you can safely downcast from that to Derived1 because we're cleanly going down in the subtype hierarchy.
Jul 3, 2014 at 5:16 PM
Thanks for the response. That makes sense. I realized after posting that C# has the same behavior. Of course my real code did a check that the cast was valid, like:
if (arg instanceof Derived1) {
    return (<Derived1>arg).name1;
}
I got the code to compile by changing the cast to
if (arg instanceof Derived1) {
    return (<Derived1><any>arg).name1;
}
I realize the most desirable solution would be to refactor the code so that Container doesn't need to know about specific subtypes of Base.
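A possible alternative to going through any, sketched under the assumption that the instanceof check already guards the access: assign arg to a variable of the constraint type Base first (a T is always assignable to its constraint), then assert down from Base to Derived1. The variable asBase, the fallback string, and the two result names are made up for this illustration.

```typescript
class Base { }
class Derived1 extends Base {
    public name1 = "Derived1";
}
class Derived2 extends Base {
    public name2 = "Derived2";
}

class Container<T extends Base> {
    public aMethod(arg: T): string {
        // T is assignable to its constraint, so no assertion is needed here.
        const asBase: Base = arg; // asBase is a hypothetical helper name
        if (asBase instanceof Derived1) {
            // Base -> Derived1 goes down the hierarchy, which an assertion allows.
            return (<Derived1>asBase).name1;
        }
        return "not a Derived1"; // illustrative fallback, not from the original post
    }
}

const fromDerived1 = new Container<Derived1>().aMethod(new Derived1());
const fromDerived2 = new Container<Derived2>().aMethod(new Derived2());
```

This keeps the compiler checking everything except the one deliberate step down from Base, whereas any switches checking off for the whole expression.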
Jul 3, 2014 at 5:16 PM
Edited Jul 3, 2014 at 5:17 PM
"but we don't know which one."
 
@jonturner
How is this different from the second case? It is still unknown "which one".
Coordinator
Jul 3, 2014 at 6:12 PM
@jamesnw

The case where you have
anotherMethod(arg: Base): string {
    return (<Derived1>arg).name1;  // this compiles fine
}
Locally, we know the type is Base, not "a subtype of Base". One thing that might be confusing at first is that functions are typed by what we know locally, not how they're called. You may have passed in a type that is a subtype of Base, but locally all we know is Base.

This is different from "T extends Base" because here we have an unknown type that we're marshalling around. We know it's a subtype of Base, and we know we'll be using and returning this subtype, but we don't know which one it is, only that it has at least the same shape as Base (but could have more).
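To make the "typed by what we know locally" point concrete, here is a small sketch that reuses the classes from the first post. Writing anotherMethod as a standalone function and the names okResult and badResult are choices made for this illustration. The call site may pass any subtype of Base, the function body only sees Base, and the assertion down to Derived1 compiles but is not checked at runtime.

```typescript
class Base { }
class Derived1 extends Base {
    public name1 = "Derived1";
}
class Derived2 extends Base {
    public name2 = "Derived2";
}

function anotherMethod(arg: Base): string {
    // Locally arg is just Base, so asserting down to Derived1 is accepted.
    return (<Derived1>arg).name1;
}

const okResult = anotherMethod(new Derived1());

// This also compiles: Derived2 is assignable to Base at the call site.
// At runtime the object has no name1 property, so the returned value is
// undefined even though the declared return type is string.
const badResult = anotherMethod(new Derived2());
```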
Jul 3, 2014 at 6:24 PM
Edited Jul 3, 2014 at 6:31 PM
Unfortunately that doesn't make it more clear. "but we don't know which one it is" applies to the code you posted as well. "it has at least the same shape as Base (but could have more)" could also apply to the posted code.
Coordinator
Jul 3, 2014 at 8:12 PM
Edited Jul 3, 2014 at 8:16 PM
When you say "anotherMethod(arg: Base)" we know that arg has type Base in the context of the anotherMethod function. Locally, there isn't anything we don't know about it. We know it's Base. "T extends Base" means "a type I don't know, having a shape I don't know except for the Base component I do know". They're two different things.

The operator <> actually isn't a cast, it's a "type assertion". A type assertion can go up to a less-specific type or down to a more-specific type, but not sideways to a sibling of the same parent type. Asserting a "T extends Base" to some other subtype of Base is neither up nor down, but sideways to a sibling. You can read more about the type assertion operator in section 4.13 of the spec.

As an example:
interface Animal {
  legs: number;
}

interface Dog extends Animal {
  pants: boolean;
}

interface Bat extends Animal {
  wingLength: number;
}

var x : Dog;
var y = <Bat> x; // error: Cannot convert 'Dog' to 'Bat': Type 'Dog' is missing property 'wingLength' from type 'Bat'. Type 'Bat' is missing property 'pants' from type 'Dog'.
Meaning the types are mutually incompatible.
function f(x: Animal) { return <Bat>x; }
function g<T extends Animal>(x: T) { return <Bat>x; }  // error to be consistent with the non-generic error for type assertions given above
Edit: for consistency
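For contrast with the rejected sideways assertion, this sketch shows the directions that are accepted, using the same three interfaces. The variable names dog, up, down and sideways are made up for the illustration. It also shows that a sideways move can be forced in two steps, and that nothing is converted or checked at runtime when you do.

```typescript
interface Animal {
    legs: number;
}
interface Dog extends Animal {
    pants: boolean;
}
interface Bat extends Animal {
    wingLength: number;
}

const dog: Dog = { legs: 4, pants: true };

const up = <Animal>dog;   // Dog -> Animal: up to a less-specific type
const down = <Dog>up;     // Animal -> Dog: down to a more-specific type

// Dog -> Bat directly is rejected as sideways. Going up first and then
// down is accepted, but the object is still the same dog underneath.
const sideways = <Bat><Animal>dog;
```

Reading sideways.wingLength compiles, yet the property was never set on the object, which is the kind of mistake the sideways rule is there to catch.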
Jul 3, 2014 at 9:35 PM
Edited Jul 3, 2014 at 9:35 PM
Those examples are not much different from before ... but I think you are saying that "T extends Base" doesn't mean that Base is the guaranteed superclass, but instead that the SIGNATURE of T must be compatible with the signature of Base, while T itself may be some other type...? In contrast, a function parameter declared with an explicit base type must be of THAT type, or an (explicitly) derived type...?

Anyhow, in either case, I still think the logic is weak. TypeScript doesn't even enforce specific types.
class A {
    x = 0;
}
class B {
    x = 0;
}
function func(b: A) {
}
func(new B()); // ok
It's the signature of the object that really defines the type, not the type itself. I'm still not convinced that there should be a difference:
class A {
    x = 0;
}
class B extends A {
    y = 0;
}
class C<T extends A> {
    a: T;
    func1(t: T) {
        return <B>t; // <- Error, BUT, we know T has the SIGNATURE of A, like "a: A" below, so this should be allowed.
    }
    func2(a: A) {
        return <B>a;        
    }
}
Jul 3, 2014 at 9:54 PM
Hi James,
I think the difference between the two cases is: in your C class, T doesn't represent type A, but rather represents whatever type you actually instantiated C with, which could be a subtype of A. T might be B, or it might be another class derived from A. This is different from func2, where the parameter type is explicitly A.

Here's an example that might illustrate the difference. Imagine your funcs had these signatures:

func1(t: T): T
func2(a: A): A

You might think these are the same, since T was declared as extending A. But really they're not the same. func1 can only accept arguments which are the same subtype of A that C was instantiated with. So for C<B> the argument would have to be a B. Additionally the return type of func1 is guaranteed to be the same type as the input. For func2, the input can be any subtype of A, and the return value can be any other subtype of A. So the point is that the compiler isn't treating T as a synonym for A, but rather you can think of T as a synonym for whatever subtype of A C was instantiated with.
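Here is a rough sketch of what those two signatures mean for a caller, using the A and B classes from the earlier posts. The method bodies simply return their argument, and the names r1, r2, y1 and y2 are made up for the illustration.

```typescript
class A {
    x = 0;
}
class B extends A {
    y = 0;
}
class C<T extends A> {
    func1(t: T): T { return t; }   // returns exactly the type C was instantiated with
    func2(a: A): A { return a; }   // only promises an A
}

const c = new C<B>();

const r1 = c.func1(new B());  // static type B
const y1 = r1.y;              // allowed: the compiler knows r1 is a B

const r2 = c.func2(new B());  // static type A, although it is a B at runtime
// r2.y would not compile here; the caller has to assert down first.
const y2 = (<B>r2).y;
```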
Jul 3, 2014 at 11:14 PM
Edited Jul 3, 2014 at 11:15 PM
Ok, tell you what, give me a scenario where func1 will fail, and func2 will not:
class A {
    x = 0;
}
class B extends A {
    y = 0;
}
class C<T extends A> {
    a: T;
    func1(t: T) {
        return <B><any>t; // <- forced using "any"
    }
    func2(a: A) {
        return <B>a;        
    }
}
Jul 4, 2014 at 12:13 AM
Edited Jul 4, 2014 at 12:25 AM
Hi James,
Hopefully Jon can correct me if I get anything wrong in what I'm about to say.

I believe you are correct in thinking that in your example func1 and func2 compile down to exactly the same code, so there are no cases where func1 will fail at runtime and func2 won't. That's why I expected func1 to work and started this thread.

After reading Jon's responses and also considering that C# has the same rule, I think I understand the reasoning for this. The reason they are treated differently by the compiler is all about type safety. A big part of the point of using generics is to enlist the compiler's help in catching type errors in your code. Using your class C as an example, let's say T is the generic type parameter, and T' is the type which is bound to T when you instantiate C with a concrete type. So for the declaration C<B>, T' = B, and for C<A>, T' = A.

One guarantee that the compiler makes with generics is:

Guarantee 1: Any operation which wouldn't be allowed on a value of type T' won't be allowed on a value of type T.

However, the compiler has to ensure that that guarantee is met without knowing what T' actually will be. The only thing it knows about T' when it's compiling C is whatever you declared in the constraint on the generic parameter, in this example that it extends A. So if the compiler allowed you to cast a value of type T to B, it would possibly be violating guarantee 1 because, for all the compiler knows, T' might be some other subtype of A.

It's also important to note there is a converse guarantee which the compiler does not make:

Guarantee 2: Any operation which would be allowed on a value of type T' will be allowed on a value of type T.

If we were talking about C++ templates instead of generics, then guarantee 2 would exist. In C++, template classes are just that, templates. When the compiler sees a declaration like C<string>, it actually generates a new version of the C class with T replaced with string. So in C++, any operation allowed on T' is also allowed on T. If your template code contains any statements that aren't valid on T', the compiler will catch them when it creates the new version of C specialized for T'. In your example, for C<B>, the compiler would translate func1 into:
func1(t: B) { return <B>t; }
which would of course be valid. If on the other hand you declared C<F>, where F was some other subtype of A, then func1 would be generated as
func1(t: F) { return <B>t; }
which would cause a compile error because B and F are sibling classes.

In TypeScript or C# though, the C class is only compiled once, so that one implementation has to work for all T'. So the compiler can't implement guarantee 2 without violating guarantee 1.
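The "compiled once" point can be observed from TypeScript itself. In this sketch, which reuses the A, B1 and B2 class shapes that appear in this thread (func1 here just returns its argument, and the variable names are made up), two differently instantiated containers turn out to share one and the same method implementation.

```typescript
class A {
    x = 0;
}
class B1 extends A {
    y1 = 0;
}
class B2 extends A {
    y2 = 0;
}
class C<T extends A> {
    func1(t: T): T { return t; }
}

const c1 = new C<B1>();
const c2 = new C<B2>();

// Both instances use the single func1 that lives on C.prototype; the type
// argument leaves no trace in the emitted JavaScript.
const sharedByC1 = c1.func1 === C.prototype.func1;
const sharedByC2 = c2.func1 === C.prototype.func1;
const samePrototype = Object.getPrototypeOf(c1) === Object.getPrototypeOf(c2);
```

All three comparisons come out true, so there is no separate C<B1> or C<B2> version of the method for the compiler to check individually.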

Was that at all clear?

Jul 4, 2014 at 1:16 AM
Ok, I think I see it now. So simply, this would break it then:
class A {
    x = 0;
}
class B1 extends A {
    y1 = 0;
}
class B2 extends A {
    y2 = 0;
}
class C<T extends A> {
    a: T;
    func1(t: T) {
        return <B1><any>t; // <- forced using "any"
    }
    func2(a: A) {
        return <B1>a;        
    }
}
var c = new C<B2>();
c.func1(new B2()); // <- would return as 'B1', which is wrong.
Thanks all for the clarification. I just wanted to get it all straight in my head in case I find myself in the same situation. ;)
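A runnable version of that scenario, trimmed to func1, shows what "wrong" looks like at runtime. The names claimedB1, isReallyB1 and y1Kind are made up for the illustration.

```typescript
class A {
    x = 0;
}
class B1 extends A {
    y1 = 0;
}
class B2 extends A {
    y2 = 0;
}
class C<T extends A> {
    func1(t: T) {
        return <B1><any>t; // forced using "any"; the inferred return type is B1
    }
}

const c = new C<B2>();
const claimedB1 = c.func1(new B2());  // static type B1, runtime object is a B2

const isReallyB1 = claimedB1 instanceof B1;  // false at runtime
const y1Kind = typeof claimedB1.y1;          // "undefined", although typed as number
```

The code compiles cleanly because the assertion through any removed the only check that would have flagged the mismatch.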
Jul 4, 2014 at 1:44 AM
Edited Jul 4, 2014 at 1:46 AM
That's right, assuming you declared the return type of func1 and func2 to be T. Without the <any> cast, the compiler wouldn't have allowed that to compile in the first place to prevent you from accidentally making that error. By adding the <any> cast you're opting out of the compiler's type checking guarantee, guarantee 1 in my post.

Also I wanted to clarify my rule that the compiler guarantees, to make it clear that I'm talking about compile-time guarantees. So the guarantee the compiler makes about generics, where T is a generic type parameter, and T' is a type which is bound to T by a declaration, is:

Guarantee 1: Any operation on an expression of type T' which would fail to compile, will also fail to compile on an expression of type T.

So in your class, if func1 looked like this:
func1(t: T): number {
    return (<B1>t).y1;
}
the compiler won't allow that because it can't uphold the above guarantee, since there could be values of T' for which that cast wouldn't be allowed.
Jul 4, 2014 at 1:49 AM
"assuming you declared the return type of func1 and func2 to be T."
 
Actually, TypeScript will infer the return type, so yes, the code is broken in that "func1()" is no longer reliable. ;)
Jul 4, 2014 at 1:52 AM
Aha, didn't realize it will infer the return type. I only started learning TypeScript a couple of days ago.
Jul 4, 2014 at 2:50 AM
Edited Jul 4, 2014 at 2:52 AM
First piece of advice then - don't take the C++/C# concepts with you and you'll do just fine. Many people keep getting in trouble with that. ;) Think "JavaScript" only.