JS2 Design Notes

Goals

Here are some design notes for JS2, starting with my goals, shared in large part by ECMA TG1 for ECMA-262 Edition 4:

  1. Support programming in the large with stronger types and naming.
  2. Enable bootstrapping, self-hosting, and reflection.
  3. Backward compatibility apart from a few simplifying changes.

(Goal 2 implies many things beyond what is discussed in these notes.) Non-goals, again shared (mostly!) by ECMA TG1 going back to Waldemar’s Edition 4 drafts:

  1. To become more like any other language (including Java!).
  2. To be more easily optimized than the current language.

Types

In JS today, every expression has a type, as specified by ECMA-262 Edition 3 Chapter 8. The visible types, by their spec names, are Undefined, Null, Boolean, Number, String, and Object. These types are disjoint sets of values:

Undefined = {undefined}, Null = {null}, etc.
Int32 and UInt32 are subsets of Number (IEEE-754 double precision) subject to different operators from Number, and they appear only in the bitwise operators, Array length, and a few other special cases.
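To make the Int32 and UInt32 cases concrete, here is a small sketch in today's JS showing the bitwise operators forcing a Number into those subsets. The variable names are only illustrative.

```javascript
// Bitwise operators convert their operands to Int32 (UInt32 for >>>)
// before operating, so an out-of-range Number wraps modulo 2^32.
const big = Math.pow(2, 32) + 5;  // 4294967301, outside the Int32 range
const asInt32 = big | 0;          // wraps to 5
const asUint32 = -1 >>> 0;        // -1 reinterpreted as UInt32: 2^32 - 1

console.log(asInt32);   // 5
console.log(asUint32);  // 4294967295
```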

Edition 3 Chapter 9 defines somewhat ad-hoc, mostly useful conversion rules between types. Chapter 15 contains constructor specifications that may also convert according to the Chapter 9 rules, or ad-hoc variations on those rules.

One oddity of JS1: the so-called primitive types Boolean, Number, and String each have a corresponding Object subtype: Boolean, Number, and String respectively. When a primitive value is used as an object, it is automatically “boxed” or wrapped by a new instance of the corresponding object subtype. When used in an appropriate primitive type context, the box/wrapper converts back to the primitive value.
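A short illustration of the boxing behavior described above, as it can be observed in current JS. The names prim, boxed, and falseBox are just for the example.

```javascript
const prim = "abc";               // primitive String value
const boxed = new String("abc");  // explicit wrapper object

console.log(typeof prim);     // "string"
console.log(typeof boxed);    // "object"
console.log(prim.length);     // 3 (prim is wrapped temporarily to find length)
console.log(boxed + "d");     // "abcd" (the wrapper converts back to a primitive)
console.log(prim === boxed);  // false: different types
console.log(prim == boxed);   // true: == unwraps the object first

// A wrapper is an object, so it is truthy even when it wraps false.
const falseBox = new Boolean(false);
const verdict = falseBox ? "truthy" : "falsy";
console.log(verdict);         // "truthy"
```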

For JS2 and ECMA-262 Edition 4, we would like to use modern type theory to avoid the pitfalls and contradictions of less formal, ad-hoc approaches. We define a lattice of all type value sets induced by the subset relation, in order to:

  1. Define new operators to enable programmers to test and enforce invariants using type annotations (goal 1).
  2. Let users define their own types by writing class extensions that can do anything the native classes can do (goal 2).
  3. Eliminate primitive types and boxing, coalescing each Edition 3 primitive type with its object wrapper (simplifying exception from goal 3).
  4. Provide nullability and rationalize Undefined (goals 1 and 3).

The lattice is as follows, with arcs directed downward by default (arcs directed otherwise have an arrowhead showing direction):

                    ___⊤___
                   /       \
                  /         \
              Void           Object?__
                            /         \
                           /           \
           Null<----String?             Object____________
                           \           /   |     \        \
                            \         /    |      \        \
                              String    Number  Boolean  User...
                                       (double)
                                         / | \
                                        /  |  \
                                     int  uint  ...

⊤ is the top type, not named T in the language. Edition 3’s Undefined is renamed to Void, as in Waldemar’s Edition 4 drafts.

For all object types t, there exists a nullable type t? = t ∪ Null. Only Object? and String? are shown above, but every object subtype is nullable. Note that this is just a specification notation; we have not committed to adding the ? typename suffix for nullability to the language.

The User… type stands for a hedge of user-defined type trees. I’ve left out Array, RegExp, Date, etc., because they can be thought of as user-defined Object subtypes. Also, not all proposed numeric types are shown (not all are subtypes of IEEE double).

Type operators

Given a value and a type, you can ask whether the value is a member of the type’s set using the is relational operator:

v is t ≡ v ∈ t's value set ⇒ Boolean

A class defines an object type, and class C extends B {...} defines a subclass C of base class B. All values of a subclass type are members of its superclass type, so (new C is B).

Given a value of unknown type, the as relational operator coerces (or downcasts) the value to the type, resulting in null if the value is not a member of the type:

v as t ≡ (v is t ? v : null) ⇒ t?

So, e.g., undefined as Object === null — this shows how the type of an as t expression is t?.
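The two definitions above can be approximated in today's JS. This is only a sketch under stated assumptions: isType and asType are hypothetical helper names, and instanceof stands in for set membership, so it covers class types only and does not model the coalesced primitive types.

```javascript
class B {}
class C extends B {}

// Hypothetical stand-in for `v is t`, restricted to class types.
function isType(v, t) {
  return v instanceof t;
}

// Hypothetical stand-in for `v as t`: v if it is a member of t, else null.
function asType(v, t) {
  return isType(v, t) ? v : null;
}

const c = new C();
console.log(isType(c, B));              // true: subclass values are members of B
console.log(asType(c, B) === c);        // true: the value passes through unchanged
console.log(asType(new B(), C));        // null: a B is not a C
console.log(asType(undefined, Object)); // null, as in the example above
```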

Given a value of arbitrary type, the to relational operator converts the value to be a member of the nullable extension type, or throws a TypeError exception.

v to t ≡ (v is t ? v : v converted to t) ⇒ t? or throw TypeError

The to operator may result in t rather than t?, at the discretion of the class implementing t (e.g., null to Boolean === false). A class may define its own to operator using the following syntax:

class C extends B {
    ...
    function to C(v) {...}
}

We will redefine the type conversions specified variously in Edition 3 Chapters 9 and 15 in terms of the to operator applied to the native classes. Our current thinking is that to conversions follow Chapter 9, except for any of (Null ∪ Void) to (String ∪ Object), which all result in null, not "null", "undefined", or a TypeError throw.
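As a rough model of a class-defined to operator, the sketch below uses a static method in place of the proposed function to C(v) syntax. Everything named here (toType, convertFrom, Celsius) is hypothetical and chosen only for illustration. The rule that null and undefined convert to null is an assumption borrowed from the String and Object case above, not something the proposal states for user classes.

```javascript
// Hypothetical model of `v to t`: members pass through, otherwise the
// class's own converter decides, and a missing converter is a TypeError.
function toType(v, t) {
  if (v instanceof t) return v;
  if (typeof t.convertFrom !== "function") {
    throw new TypeError("no conversion to " + t.name);
  }
  return t.convertFrom(v);
}

class Celsius {
  constructor(degrees) {
    this.degrees = degrees;
  }
  // Stands in for `function to Celsius(v) {...}` in the proposed syntax.
  static convertFrom(v) {
    if (v === null || v === undefined) return null;  // assumed nullable result
    if (typeof v === "number") return new Celsius(v);
    throw new TypeError("cannot convert " + typeof v + " to Celsius");
  }
}

const temp = toType(21, Celsius);
console.log(temp.degrees);                  // 21
console.log(toType(temp, Celsius) === temp); // true: already a member
console.log(toType(null, Celsius));         // null
```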

Type annotations

Testing and enforcing invariants using these type operators in expressions governing control flow is sometimes useful, but often tedious, error-prone, and bloaty. We wish for typed declarations that enable the language implementation to do the testing and enforcing for us. Therefore for each of the three type operators is, as, and to, there is a corresponding type annotation that may be used with var, const, and function declarations to specify type:

var v is t = x ≡ if (!(x is t)) throw TypeError; var v = x
var v as t = x ≡ var v = x as t
var v to t = x ≡ var v = x to t

The initializer is optional as usual; if missing, a sane default value for the annotated type is used. For all assignments v = x following such a type-annotated variable declaration, the production on the right of ≡ above, stripped of var, is evaluated. Function formal parameters and the function’s return value may be annotated similarly:

function f(a is int, b as Object, c to String) is Number {...}
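To show what "the implementation does the testing and enforcing for us" could mean, here is a sketch in today's JS that re-applies the check on every assignment. The helpers declareIs and declareAs are hypothetical, they handle class types only via instanceof, and a cell object with a value property stands in for the annotated variable.

```javascript
// Hypothetical model of `var v is t = x`: every assignment must satisfy t.
function declareIs(t, initial) {
  let current;
  const cell = {
    get value() { return current; },
    set value(x) {
      if (!(x instanceof t)) throw new TypeError("expected " + t.name);
      current = x;
    },
  };
  cell.value = initial;  // the initializer goes through the same check
  return cell;
}

// Hypothetical model of `var v as t = x`: non-members are stored as null.
function declareAs(t, initial) {
  let current = null;
  const cell = {
    get value() { return current; },
    set value(x) { current = x instanceof t ? x : null; },
  };
  cell.value = initial;
  return cell;
}

const when = declareIs(Date, new Date(0));
when.value = new Date(1000);       // fine: still a Date
const maybe = declareAs(Date, "not a date");
console.log(maybe.value);          // null
```

Assigning null or undefined to when.value throws a TypeError in this model, which is the "defend against null and undefined" behavior that is t annotations are meant to give.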

Type annotations are optional. To support strict options that require every declaration to be annotated, * may be used for ⊤ (the top type), e.g. var v is *, which is equivalent to var v. Note that * is used differently for E4X, but its meaning as ⊤ is unambiguous in type operator and annotation right operand contexts.

In a nutshell, is t annotations insist on type t and defend against null and undefined (no more “foo has no properties” errors; with static analysis, an error that can’t be avoided at runtime can even be reported at compile time). as t annotations enforce (is t)-or-null invariance. And to t annotations convert according to cleaner, class-extensible rules.

Coming soon

In the next update, I’ll list the small number of incompatible changes to Edition 3 that we are considering. In a subsequent item, I will discuss stronger naming mechanisms to support programming in the large.

JavaScript 1, 2, and in between

With DHTML and AJAX hot (or hot again; we’ve been here before, and I don’t like either acronym), I am asked frequently these days about JavaScript, past and future. In spite of the fact that JS was misnamed (I will call it JS in the rest of this entry), standardized prematurely, then ignored and stagnated during most of its life, its primitives are strong enough that whole ecologies of toolkit and web-app code have emerged on top of it. (I don’t agree with everything Doug Crockford writes at the last two links, but most of his arrows hit their targets.)

Too many of the JS/DHTML toolkits have the “you must use our APIs for everything, including how you manipulate strings” disease. Some are cool, for example TIBET, which looks a lot like Smalltalk. Some have real value, e.g. Oddpost, which Yahoo! acquired perhaps as much for its DHTML toolkit as for the mail client built on that toolkit.

Yet no JS toolkit has taken off in a big way on the web, probably more on account of the costs of learning and bundling any given API, than because of the “you must use our APIs and only our APIs” problem. So people keep inventing their own toolkits.

Inventing toolkits and extension systems on top of JS is cool. I hoped that would happen, because during Netscape 2 and 3 days I was under great pressure to minimize JS-the-language, implement JS-the-DOM, and defer to Java for “real programming” (this was a mistake, but until Netscape hired more than temporary intern or loaner help, around the time Netscape 4 work began, I was the entire “JS team” — so delegating to Java seemed like a good idea at the time). Therefore in minimizing JS-the-language, I added explicit prototype-based delegation, allowing users to supplement built-in methods with their own in the same given single-prototype namespace.

In listening to user feedback, participating in ECMA TG1 (back during Edition 1 days, and again recently for E4X and the revived Edition 4 work), and all the while watching how the several major “JS” implementors have maintained and evolved their implementations, I’ve come to some conclusions about what JS does and does not need.

  • JS is not going away, so it ought to evolve. As with sharks (and relationships, see Annie Hall), a programming language is either moving forward, or it’s dead. Now dead languages (natural and programming) have their uses; fixed denotation and grammar, and in general a lack of “versionitis”, are virtues. You could argue that JS’s stagnation, along with HTML’s, was beneficial for the “Web 1.0” build-out of the last decade. But given all the ferment on the web today, in XUL and its stepchildren, and with user scripting, there should be a JS2, and even a JS1.6 on the way toward JS2.
  • JS does not need to become Java, or C#, or any other language.
  • JS does need some of its sharp corners rounded safely. See the table below for details.
  • Beyond fixing what was broken in JS1, JS should evolve to solve problems that users face today in the domains where JS lives: web page and application content (including Flash), server-side scripting (whether Rhino or .NET), VXML and similar embeddings, and games.
  • For example, it should be trivial in a future version of JS to produce or consume a “package” of useful script that presents a consistent interface to consumers, even as its implementation details and new interfaces evolve to better meet existing requirements, and to meet entirely new requirements. In no case should internal methods or properties be exposed by default.
  • It’s clear to me that some users want obfuscated source code, but I am not in favor of standardizing an obfuscator. Mozilla products could support the IE obfuscator, if someone wants to fix bug 125525. A standard obfuscator is that much less obscure, besides being unlikely to be adopted by those who have already invented their own (who appear to be the only users truly motivated by a need for obfuscation at this point).
  • A more intuitive numeric type or type tower would help many users, although to be effective it would have to be enabled via a new compile-time option of some sort. Numeric type improvements, together with Edition 4’s extensible operator and unit proposals, would address many user requests for enhancement I’ve heard over the years.
  • Too much JS, in almost every embedding I’ve seen, suffers from an execution model that appears single-threaded (which is good for most users) yet lacks coroutining or more specific forms of it such as generators (Boo has particularly nice forms, building on Python with a cleanup or two). So users end up writing lots of creepy callbacks, setTimeout chains, and explicit control block state machines, instead of simply writing loops and similar constructs that can deliver results one by one, suspending after each delivery until called again.
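For the last point in the list above, here is what "a loop that delivers results one by one, suspending after each delivery" looks like with generator functions, which JS engines have supported since ECMAScript 2015. The countdown function is only an illustrative name.

```javascript
// A plain loop that suspends at each yield until the caller asks again.
function* countdown(from) {
  for (let n = from; n > 0; n--) {
    yield n;
  }
}

const it = countdown(3);
const firstStep = it.next();   // runs the loop up to the first yield
console.log(firstStep.value);  // 3
console.log(firstStep.done);   // false

// Consumers can also drain the generator with ordinary iteration.
const collected = [...countdown(3)];
console.log(collected);        // [ 3, 2, 1 ]
```

No callback chain or hand-written state machine is needed: the loop counter n is the state, and the engine keeps it alive between calls.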

That’s my “do and don’t” list for any future JS, and I will say more, with more specifics, about what to add to the language. What to fix is easier to identify, provided we can fix compatibly without making a mess of old and new.

Here are the three most-duplicated bug reports against core language design elements tracked by Mozilla’s bugzilla installation:


Bug #   Dupe Count   Component           Severity   Op Sys       Target Milestone   Summary
98409   6            JavaScript Engine   normal     All          --                 literal global regular expression (regexp) instance remembers lastIndex
22964   55           JavaScript Engine   normal     All          --                 JavaScript: getYear returns “100” for 2000
5856    15           JavaScript Engine   normal     Windows 98   --                 javascript rounding bug

I argue that we ought to fix these, in backward-compatible fashion if possible, in a new Edition of ECMA-262. If we solve other real problems that have not racked up duplicate bug counts, but fail to fix these usability flaws, we have failed to listen to JS users. Let’s consider these one by one:

  1. Unlike object and array initialisers, and E4X’s XML literals, regular expression literals correspond one-for-one with objects created during parsing. While this is often optimal and even useful, when combined with the g (global) flag and the lastIndex property, these singleton literals make for a pigeon-hole problem, and a gratuitous inconsistency with other kinds of “literals”. To fix this compatibly, we could add a new flag, although it would be good to pick a letter not used by Perl (or Perl 6, which fearlessly revamps Perl’s regular expression sub-language in ways that ECMA-262 will likely not follow).
  2. The Date.prototype.getYear method is a botch and a blight, the only Y2K bug in Mozilla-based browsers that still ships for compatibility with too many web sites. This bug came directly from java.util.Date, which was deprecated long ago. I’d like to get rid of it, but in the meantime, perhaps we should throw in the towel and emulate IE’s non-ECMA behavior (ECMA-262 did standardize getYear in a non-normative annex).
  3. For the rounding bug, the solution is a new default number type, with arbitrary precision and something equivalent to decimal radix. Mike Cowlishaw has advocated and implemented his own flavor of decimal arithmetic, but it is not popular in ECMA TG1. Still, I bet we could make life better for many JS users with some innovation here.
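All three complaints are easy to reproduce. The sketch below uses one explicitly shared regexp object, so the result does not depend on how a given engine treats regexp literals. The variable names are illustrative.

```javascript
// 1. A global regexp remembers lastIndex between calls.
const re = /a/g;
const first = re.test("a");   // true, and lastIndex is now 1
const second = re.test("a");  // false: the search resumed at index 1
console.log(first, second);   // true false

// 2. getYear returns the year minus 1900, so 2000 comes back as 100.
const y2k = new Date(2000, 0, 1).getYear();
console.log(y2k);             // 100

// 3. Binary floating point cannot represent 0.1 or 0.2 exactly.
const sum = 0.1 + 0.2;
console.log(sum);             // 0.30000000000000004
console.log(sum === 0.3);     // false
```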

There are other bugs in JS1 to fix, particularly to do with Unicode in regular expressions, and even in source text (see the infamous “ZWNJ and ZWJ should not be ignored” bug). More on these too, shortly, but in a wiki, linked with informal discussion here.

/be

OpenLaszlo and Eclipse

Back in my February 2004 Developer Day slides, I promoted the idea of using Eclipse to create a XUL application builder, with direct-manipulation graphical layout construction and editing, project management wizards, etc.

Although a few people expressed interest and even did some hacking (the MozCreator project being the most conspicuous example, although not Eclipse-based), no one actually created an Eclipse project and built on its Graphical Editor Framework to realize a XUL app-builder.

The good news this week is that OpenLaszlo and IBM have released the Eclipse IDE for Laszlo. LZX is cool, and similar in spirit, and in many ways in flesh, to XUL.

So the thought occurs: why not patch the Eclipse IDE for Laszlo to support XUL as an alternative target language, and Firefox (or any new-style XUL app, soon enough unified via XULRunner) as the target runtime? Any takers?

/be