Date: 2005-11-09

Goals

Here are some design notes for JS2, starting with my goals, shared in large part by ECMA TG1 for ECMA-262 Edition 4:

1. Support programming in the large with stronger types and naming.
2. Enable bootstrapping, self-hosting, and reflection.
3. Backward compatibility apart from a few simplifying changes.

(Goal 2 implies many things beyond what is discussed in these notes.)

Non-goals, again shared (mostly!) by ECMA TG1 going back to Waldemar’s Edition 4 drafts:

1. To become more like any other language (including Java!).
2. To be more easily optimized than the current language.

Types

In JS today, every expression has a type, as specified by ECMA-262 Edition 3 Chapter 8. The visible types, by their spec names, are Undefined, Null, Boolean, Number, String, and Object. These types are disjoint sets of values:

Undefined = {undefined}, Null = {null}, etc.

Int32 and UInt32 are subsets of Number (IEEE-754 double precision) subject to different operators from Number, and they appear only in the bitwise operators, Array length, and a few other special cases.
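To make the Int32/UInt32 point concrete, here is a small sketch in today's JavaScript (the variable names are only illustrative). It relies on nothing beyond the Edition 3 behavior of the bitwise operators and Array length:

```javascript
// Bitwise operators first convert their operands to 32-bit integers.
const big = 4294967296 + 5;      // 2^32 + 5, exactly representable as a double
const asInt32 = big | 0;         // wraps modulo 2^32, giving 5
const asUint32 = -1 >>> 0;       // unsigned view of -1, giving 4294967295
const truncated = 3.7 | 0;       // fractional part is dropped, giving 3

// Array length is a Number whose value is always in the UInt32 range.
const sparse = [];
sparse[4] = "x";
const len = sparse.length;       // 5
```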

Edition 3 Chapter 9 defines somewhat ad-hoc, mostly useful conversion rules between types. Chapter 15 contains constructor specifications that may also convert according to the Chapter 9 rules, or ad-hoc variations on those rules.
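For readers who do not have Chapter 9 in their heads, a few of those conversions as they behave in JavaScript today (a sketch; the variable names are only for illustration). Note the "null" and "undefined" strings, which come up again below:

```javascript
// Conversions to Number
const n1 = Number("42");        // 42
const n2 = Number("");          // 0
const n3 = Number(null);        // 0
const n4 = Number(undefined);   // NaN

// Conversions to String
const s1 = String(null);        // "null"
const s2 = String(undefined);   // "undefined"

// Conversions to Boolean
const b1 = Boolean("");         // false
const b2 = Boolean("0");        // true: any non-empty string is true

// Implicit conversion inside an operator
const mixed = 1 + "2";          // "12": the number is converted to a string
```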

One oddity of JS1: the so-called primitive types Boolean, Number, and String each have a corresponding Object subtype: Boolean, Number, and String respectively. When a primitive value is used as an object, it is automatically “boxed” or wrapped by a new instance of the corresponding object subtype. When used in an appropriate primitive type context, the box/wrapper converts back to the primitive value.
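The boxing behavior is easy to observe in current JavaScript. The following sketch (names are illustrative only) shows a primitive, its explicit wrapper, and one of the pitfalls that the split causes:

```javascript
const prim = "hello";                 // primitive String value
const boxed = new String("hello");    // explicit wrapper object

const primKind = typeof prim;         // "string"
const boxedKind = typeof boxed;       // "object"

// Using the primitive as an object boxes it temporarily.
const primLength = prim.length;       // 5

// In a primitive context the wrapper converts back to its value.
const unboxed = boxed + "!";          // "hello!"

// The two are equal under ==, but not identical under ===.
const looseEqual = prim == boxed;     // true
const strictEqual = prim === boxed;   // false

// Pitfall: a wrapper is an object, so it is always truthy.
const wrappedFalseIsTruthy = Boolean(new Boolean(false)); // true
```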

For JS2 and ECMA-262 Edition 4, we would like to use modern type theory to avoid the pitfalls and contradictions of less formal, ad-hoc approaches. We define a lattice of all type value sets induced by the set-containment (subset) relation, in order to:

- Define new operators to enable programmers to test and enforce invariants using type annotations (goal 1).
- Let users define their own types by writing class extensions that can do anything the native classes can do (goal 2).
- Eliminate primitive types and boxing, coalescing each Edition 3 primitive type with its object wrapper (simplifying exception from goal 3).
- Provide nullability and rationalize Undefined (goals 1 and 3).

The lattice is as follows, with arcs directed downward by default (arcs directed otherwise have an arrowhead showing direction):

               ___⊤___
              /       \
             /         \
         Void           Object?__________
                       /                 \
                      /                   \
      Null<----String?                     Object__________
                       \                 /   |   \         \
                          \          /       |    \         \
                             String        Number  Boolean   User...
                                          (double)
                                           /   |
                                          /    |
                                      int     uint ...

⊤ is the top type, not named T in the language. Edition 3’s Undefined is renamed to Void, as in Waldemar’s Edition 4 drafts.

For all object types t, there exists a nullable type t? = t ∪ Null. Only Object? and String? are shown above, but every object subtype is nullable. Note that this is just a specification notation; we have not committed to adding the ? typename suffix for nullability to the language.

The User… type stands for a hedge of user-defined type trees. I’ve left out Array, RegExp, Date, etc., because they can be thought of as user-defined Object subtypes. Also, not all proposed numeric types are shown (not all are subtypes of IEEE double).

Type operators

Given a value and a type, you can ask whether the value is a member of the type’s set using the is relational operator:

v is t ≡ v ∈ t's value set ⇒ Boolean

A class defines an object type, and class C extends B {...} defines a subclass C of base class B. All values of a subclass type are members of its superclass type, so (new C is B).

Given a value of unknown type, the as relational operator coerces (or downcasts) the value to the type, resulting in null if the value is not a member of the type:

v as t ≡ (v is t ? v : null) ⇒ t?

So, e.g., undefined as Object === null; this shows why the type of an as t expression is t?, not t.
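Since is and as are proposed operators, they cannot be run today, but their definitions can be modelled in current JavaScript. The sketch below is an approximation under stated assumptions: a "type" is represented by a hypothetical object with a has(v) membership test, and classType, ObjectT, nullable, is, and as are illustrative helper names, not part of any proposal.

```javascript
// Assumption: a "type" is modelled as an object with a has(v) membership test.
const classType = (ctor) => ({ has: (v) => v instanceof ctor });

// Assumption: JS2 Object contains every value except null and undefined,
// because primitives are coalesced with their wrappers.
const ObjectT = { has: (v) => v !== null && v !== undefined };

// t? = t ∪ Null
const nullable = (t) => ({ has: (v) => v === null || t.has(v) });

// v is t  ≡  v ∈ t's value set
const is = (v, t) => t.has(v);

// v as t  ≡  (v is t ? v : null)
const as = (v, t) => (is(v, t) ? v : null);

class B {}
class C extends B {}
const BT = classType(B);
const CT = classType(C);

const subclassIsBase = is(new C(), BT);             // true, i.e. (new C is B)
const baseIsSubclass = is(new B(), CT);             // false
const failedDowncast = as(new B(), CT);             // null
const undefinedAsObject = as(undefined, ObjectT);   // null

// Whatever as returns is always a member of t?.
const resultIsNullable = is(failedDowncast, nullable(CT)); // true
```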

Given a value of arbitrary type, the to relational operator converts the value to be a member of the nullable extension type, or throws a TypeError exception.

v to t ≡ (v is t ? v : v converted to t) ⇒ t? or throw TypeError

The to operator may result in t rather than t?, at the discretion of the class implementing t (e.g., null to Boolean === false). A class may define its own to operator using the following syntax:

class C extends B { ... function to C(v) {...} }

We will redefine the type conversions specified variously in Edition 3 Chapters 9 and 15 in terms of the to operator applied to the native classes. Our current thinking is that to conversions follow Chapter 9, except for any of (Null ∪ Void) to (String ∪ Object), which all result in null, not "null", "undefined", or a TypeError throw.
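The intended behavior of to can be approximated in current JavaScript. This is only a sketch of the rules as described above, not an implementation of the proposal: toStringT and toBooleanT are hypothetical stand-ins for v to String and v to Boolean, and Point with its static to method is an invented class standing in for a user-defined function to C(v).

```javascript
// v to String: follows Chapter 9, except that null and undefined give null.
const toStringT = (v) => (v === null || v === undefined ? null : String(v));

// v to Boolean: the class chooses to result in Boolean, never null.
const toBooleanT = (v) => Boolean(v);

// Hypothetical user class; the static method models `function to Point(v) {...}`.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  static to(v) {
    if (v instanceof Point) return v;                 // v is t ? v
    if (v === null || v === undefined) return null;   // result is in t?
    if (Array.isArray(v) && v.length === 2) {
      return new Point(v[0], v[1]);                   // v converted to t
    }
    throw new TypeError("cannot convert value to Point");
  }
}

const fromNull = toStringT(null);        // null, where Edition 3 gives "null"
const fromNumber = toStringT(42);        // "42"
const nullToBoolean = toBooleanT(null);  // false
const fromPair = Point.to([1, 2]);       // a Point with x = 1, y = 2
```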

Type annotations

Testing and enforcing invariants using these type operators in expressions governing control flow is sometimes useful, but often tedious, error-prone, and bloaty. We wish for typed declarations that enable the language implementation to do the testing and enforcing for us. Therefore for each of the three type operators is, as, and to, there is a corresponding type annotation that may be used with var, const, and function declarations to specify type:

var v is t = x ≡ if (!(x is t)) throw TypeError; var v = x
var v as t = x ≡ var v = x as t
var v to t = x ≡ var v = x to t

The initializer is optional as usual; if missing, a sane default value for the annotated type is used. For all assignments v = x following such a type-annotated variable declaration, the production on the right of ≡ above, stripped of var, is evaluated. Function formal parameters and the function’s return value may be annotated similarly:

function f(a is int, b as Object, c to String) is Number {...}
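To see what an implementation would be doing on our behalf, the annotated-variable equivalences can be modelled by hand in current JavaScript. In this sketch each annotation is a function applied on initialization and on every later assignment. The helpers annotated, isNumber, asNumber, and toStringOrNull are hypothetical names, and typeof x === "number" is only an assumed stand-in for membership in the JS2 Number type.

```javascript
// var v is Number = x  ≡  if (!(x is Number)) throw TypeError; var v = x
const isNumber = (x) => {
  if (typeof x !== "number") throw new TypeError("value is not a Number");
  return x;
};

// var v as Number = x  ≡  var v = x as Number
const asNumber = (x) => (typeof x === "number" ? x : null);

// var v to String = x  ≡  var v = x to String
const toStringOrNull = (x) => (x === null || x === undefined ? null : String(x));

// A variable whose annotation is re-applied on every assignment.
function annotated(annotation, initial) {
  let value = annotation(initial);
  return {
    get: () => value,
    set: (x) => {
      value = annotation(x);   // if this throws, the old value is kept
    },
  };
}

const strict = annotated(isNumber, 1);
strict.set(2);                           // fine
let rejected = false;
try {
  strict.set("3");                       // violates the is annotation
} catch (e) {
  rejected = e instanceof TypeError;
}

const lenient = annotated(asNumber, "oops");    // holds null
const converted = annotated(toStringOrNull, 42); // holds "42"
```

The hand-written version shows why the author calls this style tedious: every assignment site has to go through set, which is exactly the bookkeeping the annotations are meant to remove.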

Type annotations are optional. To support strict options that require every declaration to be annotated, * may be used for ⊤ (the top type), e.g. var v is *, which is equivalent to var v. Note that * is used differently for E4X, but its meaning as ⊤ is unambiguous in type operator and annotation right operand contexts.

In a nutshell, is t annotations insist on type t and defend against null and undefined (no more “foo has no properties” errors; with static analysis, an error that can’t be avoided at runtime can even be reported at compile time). as t annotations enforce (is t)-or-null invariance. And to t annotations convert according to cleaner, class-extensible rules.

Coming soon

In the next update, I’ll list the small number of incompatible changes to Edition 3 that we are considering. In a subsequent item, I will discuss stronger naming mechanisms to support programming in the large.