A Brief History of JavaScript

It’s good to be back. I let the old blog field lie fallow in order to focus on work in Ecma TC39 (JS standards), Firefox 3.5, 3.6 and 4; and recently on a new project that I’ll blog about soon.

In the mean time [UPDATE and in case the embedded video fails], here’s the video link from my JSConf 2010 surprise keynote in April. Highlights include:

  • What would happen in a battle between Chuck Norris and Bruce Campbell
  • Clips from Netsca^H^H^H^H^H^HEvil Dead 2 and Army of Darkness
  • Discussion of where JS has been and what lies in its immediate future

True facts:

  • I did meet John McCarthy of LISP fame in 1977
  • My haircut was influenced by Morrissey’s (hey, it was the late ’80s)
  • JS’s function keyword did come from AWK

/be

TraceMonkey: JavaScript Lightspeed

I’m extremely pleased to announce the launch of TraceMonkey, an evolution of Firefox’s SpiderMonkey JavaScript engine for Firefox 3.1 that uses a new kind of Just-In-Time (JIT) compiler to boost JS performance by an order of magnitude or more.

Results

Let’s cut straight to the charts. Here are the popular SunSpider macro- and micro-benchmarks average scores, plus results for an image manipulation benchmark and a test using the Sylvester 3D JS library’s matrix multiplication methods:

asbenchlg

Here are some select SunSpider micro-benchmarks, to show some near-term upper bounds on performance:

ff31wtracingsm

This chart shows speedup ratios over the SpiderMonkey interpreter, which is why “empty loop with globals” (a loop using global loop control and accumulator variables) shows a greater speedup — global variables in JavaScript, especially if undeclared by var, can be harder to optimize in an interpreter than local variables in a function.

Here are the fastest test-by-test SunSpider results, sorted from greatest speedup to least:

ff31vsff3

The lesser speedups need their own chart, or they would be dwarfed by the above results:

ff31wtracingsm

(Any slowdown is a bug we will fix; we’re in hot pursuit of the one biting binary-trees, which is heavily recursive — it will be fixed.)

With SunSpider, some of the longest-running tests are string and regular-expression monsters, and since like most JS engines, we use native (compiled C++) code for most of the work, there’s not as much speedup. Amdahl’s Law predicts that this will bound the weighted-average total Sunspider score, probably to around 2. No matter how fast we JIT the rest of the code, the total score will be . . . 2.

But this is only a start. With tracing, performance will keep going up. We have easy small linear speedup tasks remaining (better register allocation, spill reduction around built-in calls). We will trace string and regular expression code and break through the “2” barrier. We will even trace into DOM methods. The tracing JIT approach scales as you move more code into JS, or otherwise into view of the tracing machinery.

Finally, schrep created a screencast that visually demonstrates the speedup gained by TraceMonkey. These speedups are not just for micro-benchmarks. You can see and feel them.

How We Did It

We’ve been working with Andreas Gal of UC Irvine on TraceMonkey, and it has been a blast. We started a little over sixty days (and nights 😉 ago, and just yesterday, shaver pushed the results of our work into the mozilla-central Hg repository for inclusion in Firefox 3.1.

The JIT is currently pref’ed off, but you can enable it via about:config — just search for “jit” and, if you are willing to report any bugs you find, toggle the javascript.options.jit.content preference (there’s a jit.chrome pref too, for the truly adventurous).

Before TraceMonkey, for Firefox 3, we made serious performance improvements toSpiderMonkey, both to its Array code and to its interpreter. The interpreter speedups entailed two major pieces of work:

  • Making bytecode cases in the threaded interpreter even fatter, so the fast cases can stay in the interpreter function.
  • Adding a polymorphic property cache, for addressing properties found in prototype and scope objects quickly, without having to look in each object along the chain.

I will talk about the property cache and the “shape inference” it is based on in another post.

By the way, we are not letting moss grow under our interpreter’s feet. Dave Mandelin is working on a combination of inline-threading and call-threading that will take interpreter performance up another notch.

While doing this Firefox 3 work, I was reminded again of the adage:

Neurosis is doing the same thing over and over again, expecting to get a different result each time.

But this is exactly what dynamically typed language interpreters must do. Consider the + operator:

a = b + c;

Is this string concatenation, or number addition? Without static analysis (generally too costly), we can’t know ahead of time. For SpiderMonkey, we have to ask further: if number, can we keep the operands and result in machine integers of some kind?

Any interpreter will have to cope with unlikely (but allowed) overflow from int to double precision binary floating point, or even change of variable type from number to string. But this is neurotic, because for the vast majority of JS code, in spite of the freedom to mutate type of variable, types are stable. (This stability holds for other dynamic languages including Python.)

Another insight, which is key to the tracing JIT approach: if you are spending much time in JS, you are probably looping. There’s simply not enough straight line code in Firefox’s JS, or in a web app, to take that much runtime. Native code may go out to lunch, of course, but if you are spending time in JS, you’re either looping or doing recursion.

The Trace Trees approach to tracing JIT compilation that Andreas pioneered can handle loops and recursion. Everything starts in the interpreter, when TraceMonkey notices a hot loop by keeping cheap count of how often a particular backward jump (or any backward jump) has happened.

for (var i = 0; i < BIG; i++) {
    // Loop header starts here:
    if (usuallyTrue())
        commonPath();
    else
        uncommonPath();
}

Once a hot loop has been detected, TraceMonkey starts recording a trace. We use the Tamarin Tracing Nanojit to generate low-level intermediate representation instructions specialized from the SpiderMonkey bytecodes, their immediate and incoming stack operands, and the property cache “hit” case fast-lookup information.

The trace recorder completes when the loop header (see the comment in the code above) is reached by a backward jump. If the trace does not complete this way, the recorder aborts and the interpreter resumes without recording traces.

Let’s suppose the usuallyTrue() function returns true (it could return any truthy, e.g. 1 or "non-empty" — we can cope). The trace recorder emits a special guard instruction to check that the truthy condition matches, allowing native machine-code trace execution to continue if so. If the condition does not match, the guard exits (so-called “side-exits”) the trace, returning to the interpreter at the exact point in the bytecode where the guard was recorded, with all the necessary interpreter state restored.

If the interpreter sees usuallyTrue() return true, then the commonPath(); case will be traced. After that function has been traced comes the loop update part i++ (which might or might not stay in SpiderMonkey’s integer representation depending on the value of BIG — again we guard). Finally, the condition i < BIG will be recorded as a guard.

// Loop header starts here:
inlined usuallyTrue() call, with guards
guard on truthy return value
guard that the function being invoked at this point is commonPath
inlined commonPath() call, with any calls it makes inlined, guarded
i++ code, with overflow to double guard
i < BIG condition and loop-edge guard
jump back to loop header

Thus tracing is all about speculating that what the interpreter sees is what will happen next time — that the virtual machine can stop being neurotic. And as you can see, tracing JITs can inline method calls easily — just record the interpreter as it follows a JSOP_CALL instruction into an interpreted function.

One point about Trace Trees (as opposed to less structured kinds of tracing): you get function inlining without having to build interpreter frames at all, because the trace recording must reach the loop header in the outer function in order to complete. Therefore, so long as the JITted code stays “on trace”, no interpreter frames need to be built.

If the commonPath function itself contains a guard that side-exits at runtime, then (and only then) will one or more interpreter frames need to be reconstructed.

Let’s say after some number of iterations, the loop shown above side-exits at the guard for usuallyTrue() because that function returns a falsy value. We abort correctly back to the interpreter, but keep recording in case we can complete another trace back to the same loop header, and extend the first into a trace tree. This allows us to handle different paths through the control flow graph (including inlined functions) under a hot loop.

What It All Means

Pulling back from the details, a few points deserve to be called out:

  • We have, right now, x86, x86-64, and ARM support in TraceMonkey. This means we are ready for mobile and desktop target platforms out of the box.
  • As the performance keeps going up, people will write and transport code that was “too slow” to run in the browser as JS. This means the web can accommodate workloads that right now require a proprietary plugin.
  • As we trace more of the DOM and our other native code, we increase the memory-safe codebase that must be trusted not to have an exploitable bug.
  • Tracing follows only the hot paths, and builds a trace-tree cache. Cold code never gets traced or JITted, avoiding the memory bloat that whole-method JITs incur. Tracing is mobile-friendly.
  • JS-driven <canvas> rendering, with toolkits, scene graphs, game logic, etc. all in JS, are one wave of the future that is about to crest.

TraceMonkey advances us toward the Mozilla
2
future where even more Firefox code is written in JS. Firefox gets faster and safer as this process unfolds.

I believe that other browsers will follow our lead and take JS performance through current interpreter speed barriers, using just-in-time native code compilation. Beyond what TraceMonkey means for Firefox and other Mozilla projects, it heralds the JavaScript Lightspeed future we’ve all been anticipating. We are moving the goal posts and changing the game, for the benefit of all web developers.

Acknowledgments

I would like to thank Michael Franz and the rest of his group at UC Irvine, especially Michael Bebenita, Mason Chang, and Gregor Wagner; also the National Science Foundation for supporting Andreas Gal’s thesis. I’m also grateful to Ed Smith and the Tamarin Tracing team at Adobe for the TT Nanojit, which was a huge boost to developing TraceMonkey.

And of course, mad props and late night thanks to Team TraceMonkey: Andreas, Shaver, David Anderson, with valuable assists from Bob Clary, Rob Sayre, Blake Kaplan, Boris Zbarsky, and Vladimir Vukićević.

Popularity

It seems (according to one guru, but coming from this source, it’s a left-handed compliment) that JavaScript is finally popular.

To me, a nerd from a tender age, this is something between a curse and a joke. (See if you are in my camp: isn’t the green chick hotter?)

Brendan Eich convinced his pointy-haired boss at Netscape that the Navigator browser should have its own scripting language, and that only a new language would do, a new language designed and implemented in big hurry, and that no existing language should be considered for that role.

I don’t know why Doug is making up stories. He wasn’t at Netscape. He has heard my recollections about JavaScript’s birth directly, told in my keynotes at Ajax conferences. Revisionist shenanigans to advance a Microhoo C# agenda among Web developers?

Who knows, and it’s hard to care, but in this week of the tenth anniversary of mozilla.org, a project I co-founded, I mean to tell some history.

As I’ve often said, and as others at Netscape can confirm, I was recruited to Netscape with the promise of “doing Scheme” in the browser. At least client engineering management including Tom Paquin, Michael Toy, and Rick Schell, along with some guy named Marc Andreessen, were convinced that Netscape should embed a programming language, in source form, in HTML. So it was hardly a case of me selling a “pointy-haired boss” — more the reverse.

Whether that language should be Scheme was an open question, but Scheme was the bait I went for in joining Netscape. Previously, at SGI, Nick Thompson had turned me on to SICP.

What was needed was a convincing proof of concept, AKA a demo. That, I delivered, and in too-short order it was a fait accompli.

Of course, by the time I joined Netscape, and then transferred out of the server group where I had been hired based on short-term requisition scarcity games (and where I had the pleasure of working briefly with the McCool twins and Ari Luotonen; later in 1995, Ari and I would create PAC), the Oak language had been renamed Java, and Netscape was negotiating with Sun to include it in Navigator.

The big debate inside Netscape therefore became “why two languages? why not just Java?” The answer was that two languages were required to serve the two mostly-disjoint audiences in the programming ziggurat who most deserved dedicated programming languages: the component authors, who wrote in C++ or (we hoped) Java; and the “scripters”, amateur or pro, who would write code directly embedded in HTML.

Whether any existing language could be used, instead of inventing a new one, was also not something I decided. The diktat from upper engineering management was that the language must “look like Java”. That ruled out Perl, Python, and Tcl, along with Scheme. Later, in 1996, John Ousterhout came by to pitch Tk and lament the missed opportunity for Tcl.

I’m not proud, but I’m happy that I chose Scheme-ish first-class functions and Self-ish (albeit singular) prototypes as the main ingredients. The Java influences, especially y2k Date bugs but also the primitive vs. object distinction (e.g., string vs. String), were unfortunate.

Back to spring of 1995: I remember meeting Bill Joy during this period, and discussing fine points of garbage collection (card marking for efficient write barriers) with him. From the beginning, Bill grokked the idea of an easy-to-use “scripting language” as a companion to Java, analogous to VB‘s relationship to C++ in Microsoft’s platform of the mid-nineties. He was, as far as I can tell, our champion at Sun.

Kipp Hickman and I had been studying Java in April and May 1995, and Kipp had started writing his own JVM. Kipp and I wrote the first version of NSPR as a portability layer underlying his JVM, and I used it for the same purpose when prototyping “Mocha” in early-to-mid-May.

Bill convinced us to drop Kipp’s JVM because it would lack bug-for-bug compatibility with Sun’s JVM (a wise observation in those early days). By this point “Mocha” had proven itself via rapid prototyping and embedding in Netscape Navigator 2.0 , which was in its pre-alpha development phase.

The rest is perverse, merciless history. JS beat Java on the client, rivaled only by Flash, which supports an offspring of JS, ActionScript.

So back to popularity. I can take it or leave it. Nevertheless, popular Ajax libraries, often crunched and minified and link-culled into different plaintext source forms, are schlepped around the Internet constantly. Can we not share?

One idea, mooted by many folks, most recently here by Doug, entails embedding crypto-hashes in potentially very long-lived script tag attributes. Is this a good idea?

Probably not, based both on theoretical soundness concerns about crypto-hash algorithms, and on well-known poisoning attacks.

A better idea, which I heard first from Rob Sayre: support an optional “canonical URL” for the script source, via a share attribute on HTML5 <script>:

<mce:script mce_src=”https://my.edge.cached.startup.com/dojo-1.0.0.js” shared=”https://o.aolcdn.com/dojo/1.0.0/dojo/dojo.xd.js”>
</mce:script><br />

If the browser has already downloaded the shared URL, and it still is valid according to HTTP caching rules, then it can use the cached (and pre-compiled!) script instead of downloading the src URL.

This avoids hash poisoning concerns. It requires only that the content author ensure that the src attribute name a file identical to the canonical (“popular”) version of the library named by the shared attribute. And of course, it requires that we trust the DNS. (Ulp.)

This scheme also avoids embedding inscrutable hashcodes in script tag attribute values.

Your comments are welcome.

Ok, back to JavaScript popularity. We know certain Ajax libraries are popular. Is JavaScript popular? It’s hard to say. Some Ajax developers profess (and demonstrate) love for it. Yet many curse it, including me. I still think of it as a quickie love-child of C and Self. Dr. Johnson‘s words come to mind: “the part that is good is not original, and the part that is original is not good.”

Yet here we are. The web must evolve, or die. So too with JS, wherefore ES4. About which, more anon.

Firefox 3 looks like it will be popular too, based on space and time performance metrics. More on that soon, too.

JS2 Design Notes

Goals

Here are some design notes for JS2, starting with my goals, shared in large part by ECMA TG1 for ECMA-262 Edition 4:

  1. Support programming in the large with stronger types and naming.
  2. Enable bootstrapping, self-hosting, and reflection.
  3. Backward compatibility apart from a few simplifying changes.

(Goal 2 implies many things beyond what is discussed in these notes.) Non-goals, again shared (mostly!) by ECMA TG1 going back to Waldemar’s Edition 4 drafts:

  1. To become more like any other language (including Java!).
  2. To be more easily optimized than the current language.

Types

In JS today, every expression has a type, as specified by ECMA-262 Edition 3
Chapter 8. The visible types, by their spec names, are Undefined, Null, Boolean, Number, String, and Object. These types are disjoint sets of values:

Undefined = {undefined}, Null = {null}, etc.
Int32 and UInt32 are subsets of Number (IEEE-754 double precision) subject to
different operators from Number, and they appear only in the bitwise operators,
Array length, and a few other special cases.

Edition 3 Chapter 9 defines somewhat ad-hoc, mostly useful conversion rules between types. Chapter 15 contains constructor specifications that may also convert according to the Chapter 9 rules, or ad-hoc variations on those rules.

One oddness to JS1: the so-called primitive types Boolean, Number, and String each have a corresponding Object subtype: Boolean, Number, and String respectively. When a primitive value is used as an object, it is automatically “boxed” or wrapped by a new instance of the corresponding object subtype. When used in an appropriate primitive type context, the box/wrapper converts back to the primitive value.

For JS2 and ECMA-262 Edition 4, we would like to use modern type theory to avoid the pitfalls and contradictions of less formal, ad-hoc approaches. We define a lattice of all type value sets induced by the set contains
subset
relation, in order to:

  1. Define new operators to enable programmers to test and enforce invariants using type annotations (goal 1).
  2. Let users define their own types by writing class extensions that can do anything the native classes can do (goal 2).
  3. Eliminate primitive types and boxing, coalescing each Edition 3 primitive type with its object wrapper (simplifying exception from goal 3).
  4. Provide nullability and rationalize Undefined (goals 1 and 3).

The lattice is as follows, with arcs directed downward by default (arcs directed otherwise have an arrowhead showing direction):

___⊤___
/       
/         
Void        Object?__________
/               
/                 
Null<----String?    Object__________
     /   |           
   /    |            
String  Number  Boolean  User...
(double)
/  |  
/   |   
int  uint  ...

⊤ is the top type, not named T in the language. Edition 3’s Undefined is renamed to Void, as in Waldemar’s Edition 4 drafts.

For all object types t, there exists a nullable type t? = t ∪ Null. Only Object? and String? are shown above, but every object subtype is nullable. Note that this is just a specification notation; we have not committed to adding the ? typename suffix for nullability to the language.

The User… type stands for a hedge of user-defined type trees. I’ve left out Array, RegExp, Date, etc., because they can be thought of as
user-defined Object subtypes. Also, not all proposed numeric types are shown (not all are subtypes of IEEE double).

Type operators

Given a value and a type, you can ask whether the value is a member of the type’s set using the is relational operator:

v is t ≡ v ∈ t's value set ⇒ Boolean

A class defines an object type, and class C extends B {...} defines a subclass C of base class B. All values of a subclass type are members of its superclass type, so (new C is B).

Given a value of unknown type, the as relational operator coerces (or downcasts) the value to the type, resulting in null if the value is not a member of the type:

v as t ≡ (v is t ? v : null) ⇒ t?

So, e.g., undefined as Object === null — this shows how the type of an as t expression is t?.

Given a value of arbitrary type, the to relational operator converts the value to be a member of the nullable extension type, or throws a TypeError exception.

v to t ≡ (v is t ? v : v converted to t) ⇒ t? or throw TypeError

The to operator may result in t rather than t?, at the discretion of the class implementing t (e.g., null to Boolean === false). A class may define its own to operator using the following syntax:

class C extends B {
...
function to C(v) {...}
}

We will redefine the type conversions specified variously in Edition 3 Chapters 9 and 15 in terms of the to operator applied to the native classes.Our current thinking is that to conversions follow Chapter 9, except for any of (Null ∪ Void) to (String ∪ Object), which all result in null, not "null", "undefined", or a TypeError throw.

Type annotations

Testing and enforcing invariants using these type operators in expressions governing control flow is sometimes useful, but often tedious, error-prone, and bloaty. We wish for typed declarations that enable the language implementation to do the testing and enforcing for us. Therefore for each of the three type operators is, as,and to, there is a corresponding type annotation that may be used with var, const, and function declarations to specify type:

var v is t = x ≡ if (!(x is t)) throw TypeError; var v = x
var v as t = x ≡ var v = x as t
var v to t = x ≡ var v = x to t

The initializer is optional as usual; if missing, a sane default value for the annotated type is used. For all assignments v = x following such a type-annotated variable declaration, the production on the right of ≡ above, stripped of var, is evaluated. Function formal parameters and the function’s return value may be annotated similarly:

function f(a is int, b as Object, c to String) is Number {...}

Type annotations are optional. To support strict options that require every declaration to be annotated, * may be used for ⊤ (the top type), e.g. var v is *, which is equivalent to var v. Note that * is used differently for E4X, but its meaning as ⊤ is unambiguous in type operator and annotation right operand contexts.

In a nutshell, is t annotations insist on type t and defend against null and undefined (no more “foo has no properties” errors; with static analysis, an error that can’t be avoided at runtime can even be reported at compile time). as t annotations enforce (is t)-or-null invariance. And to t annotations convert according to cleaner, class-extensible rules.

Coming soon

In the next update, I’ll list the small number of incompatible changes to Edition 3 that we are considering. In a subsequent item, I will discuss stronger naming mechanisms to support programming in the large.

JavaScript 1, 2, and in between

With DHTML and AJAX hot (or hot again; we’ve been here before, and I don’t like either acronym), I am asked frequently these days about JavaScript, past and future. In spite of the fact that JS was misnamed (I will call it JS in the rest of this entry), standardized prematurely, then ignored and stagnated during most of its life, its primitives are strong enough that whole ecologies of toolkit and web-app code have emerged on top of it. (I don’t agree with everything Doug Crockford writes at the last two links, but most of his arrows hit their targets.)

Too many of the JS/DHTML toolkits have the “you must use our APIs for everything, including how you manipulate strings” disease. Some are cool, for example TIBET, which looks a lot like Smalltalk. Some have real value, e.g. Oddpost, which Yahoo! acquired perhaps as much for its DHTML toolkit as for the mail client built on that toolkit.

Yet no JS toolkit has taken off in a big way on the web, probably more on account of the costs of learning and bundling any given API, than because of the “you must use our APIs and only our APIs” problem. So people keep inventing their own toolkits.

Inventing toolkits and extension systems on top of JS is cool. I hoped that would happen, because during Netscape 2 and 3 days I was under great pressure to minimize JS-the-language, implement JS-the-DOM, and defer to Java for “real programming” (this was a mistake, but until Netscape hired more than temporary intern or loaner help, around the time Netscape 4 work began, I was the entire “JS team” — so delegating to Java seemed like a good idea at the time). Therefore in minimizing JS-the-language, I added explicit prototype-based delegation, allowing users to supplement built-in methods with their own in the same given single-prototype namespace.

In listening to user feedback, participating in ECMA TG1 (back during Edition 1 days, and again recently for E4X and the revived Edition 4 work), and all the while watching how the several major “JS” implementors have maintained and evolved their implementations, I’ve come to some conclusions about what JS does and does not need.

  • JS is not going away, so it ought to evolve. As with sharks (and relationships, see Annie Hall), a programming language is either moving forward, or it’s dead. Now dead languages (natural and programming) have their uses; fixed denotation and grammar, and in general a lack of “versionitis”, are virtues. You could argue that JS’s stagnation, along with HTML’s, was beneficial for the “Web 1.0” build-out of the last decade. But given all the ferment on the web today, in XUL and its stepchildren, and with user scripting, there should be a JS2, and even a JS1.6 on the way toward JS2.
  • JS does not need to become Java, or C#, or any other language.
  • JS does need some of its sharp corners rounded safely. See the table below for details.
  • Beyond fixing what was broken in JS1, JS should evolve to solve problems that users face today in the domains where JS lives: web page and application content (including Flash), server-side scripting (whether Rhino or .NET), VXML and similar embeddings, and games.
  • For example, it should be trivial in a future version of JS to produce or consume a “package” of useful script that presents a consistent interface to consumers, even as its implementation details and new interfaces evolve to better meet existing requirements, and to meet entirely new requirements. In no case should internal methods or properties be exposed by default.
  • It’s clear to me that some users want obfuscated source code, but I am not in favor of standardizing an obfuscator. Mozilla products could support the IE obfuscator, if someone wants to fix bug 125525. A standard obfuscator is that much less obscure, besides being unlikely to be adopted by those who have already invented their own (who appear to be the only users truly motivated by a need for obfuscation at this point).
  • A more intuitive numeric type or type tower would help many users, although to be effective it would have to be enabled via a new compile-time option of some sort. Numeric type improvements, together with Edition 4’s extensible operator and unit proposals, would address many user requests for enhancement I’ve heard over the years.
  • Too much JS, in almost every embedding I’ve seen, suffers from an execution model that appears single-threaded (which is good for most users) yet lacks coroutining or more specific forms of it such as generators (Boo has particularly nice forms, building on Python with a cleanup or two). So users end up writing lots of creepy callbacks, setTimeout chains, and explicit control block state machines, instead of simply writing loops and similar constructs that can deliver results one by one, suspending after each delivery until called again.

That’s my “do and don’t” list for any future JS, and I will say more, with more specifics, about what to add to the language. What to fix is easier to identify, provided we can fix compatibly without making a mess of old and new.

Here are the three most-duplicated bug reports against core language design elements tracked by Mozilla’s bugzilla installation:


Bug #

Dupe
Count


Component

Severity

Op Sys

Target
Milestone


Summary
98409 6 JavaScript Engine normal All literal global regular expression (regexp) instance remembers lastIndex
22964 55 JavaScript Engine normal All JavaScript: getYear returns “100” for 2000
5856 15 JavaScript Engine normal Windows 98 javascript rounding bug

I argue that we ought to fix these, in backward-compatible fashion if possible, in a new Edition of ECMA-262. If we solve other real problems that have not racked up duplicate bug counts, but fail to fix these usability flaws, we have failed to listen to JS users. Let’s consider these one by one:

  1. Unlike object and array initialisers, and E4X’s XML literals, regular expression literals correspond one-for-one with objects created during parsing. While this is often optimal and even useful, when combined with the g (global) flag and the lastIndex property, these singleton literals make for a pigeon-hole problem, and a gratuitous inconsistency with other kinds of “literals”. To fix this compatibly, we could add a new flag, although it would be good to pick a letter not used by Perl (or Perl 6, which fearlessly revamps Perl’s regular expression sub-language in ways that ECMA-262 will likely not follow).
  2. The Date.prototype.getYear method is a botch and a blight, the only Y2K bug in Mozilla-based browsers that still ships for compatibility with too many web sites. This bug came directly from java.util.Date, which was deprecated long ago. I’d like to get rid of it, but in the mean time, perhaps we should throw in the towel and emulate IE’s non-ECMA behavior (ECMA-262 did standardize getYear in a non-normative annex).
  3. The solution here is a new default number type, with arbitrary precision and something equivalent to decimal radix. Mike Cowlishaw has advocated and implemented his own flavor of decimal arithmetic, but it is not popular in ECMA TG1. Still, I bet we could make life better for many JS users with some innovation here.

There are other bugs in JS1 to fix, particularly to do with Unicode in regular expressions, and even in source text (see the infamous ZWNJ and ZWJ should not be ignored bug). More on these too, shortly, but in a wiki, linked with informal discussion here.

/be