24 November 2010

Paren-Free

The tl;dr version

Krusty the ventriloquist

<Krusty>So, you kids want CoffeeScript, do you?</Krusty>

<script type="harmony">   // placeholder MIME type

if year > 2010 {
    syntax++
}

for i in iter {           // i is a fresh let binding!
    frob(i)
}

while lo <= hi {
    let mid = (lo + hi) / 2
    // binary search blah blah blah
}

... return [i * i for i in range(n)]   // array comprehension

</script>

No parentheses around control structure “heads”. If Go can do it, so can JS. And yes, I’m using automatic semi-colon insertion (JSLint can suck it).

There are open issues (are braces required around bodies?) but this is the twitter-friendly section. More below, after some twitter-unfriendly motivation.

Background

We had a TC39 meeting last week, graciously hosted at Apple with Ollie representing. Amid the many productive activities, Dave presented iterators as an extension to proxies.

The good news is that the committee agreed that some kind of meta-programmable iteration should be in the language.

Enumeration

Proxies had already moved to Harmony Proposal status earlier this year, but with an open issue: how to trap for (i in o) where o is a proxy with a huge (or even an infinite — rather, a lazily created and indefinite) number of properties.

js> var handler = {
    enumerate: function () { return ["a", "b", "c"]; }
};
js> var proxy = Proxy.create(handler);
js> for (var i in proxy)
    print(i);
a
b
c

The proxy handler’s fundamental enumerate trap eagerly returns an array of all property names “in” the proxy, coerced to string type if need be. Each string is required to be unique in the returned array. But for a large or lazy object, where the trapping loop may break early, eagerness hurts. Scale up and eagerness (never mind the uniqueness requirement) is fatal. TC39 agreed that a lazy-iteration derived (optional) trap was wanted.

js> var handler = {
    iterate: function () { for (var i = 0; i < 1e9; i++) yield i; }
};
js> var proxy = Proxy.create(handler);
js> for (var i in proxy) {
    if (i == 3) break;
    print(i);
}
0
1
2

The iterators strawman addressed this use-case by proposing that for-in would trap to iterate if present on the handler for the proxy referenced by o, in preference to trapping to enumerate.

js> var handler = {
    enumerate: function () { return ["a", "b", "c"]; },
    iterate: function () { for (var i = 0; i < 1e9; i++) yield i; }
};
js> var proxy = Proxy.create(handler);
js> for (var i in proxy) {
    if (i == 3) break;
    print(i);
}
0
1
2

To avoid switching from enumeration to iteration under a single for-in loop, once the loop has started enumerating a non-proxy, if a proxy is encountered on that object’s prototype chain, the prototype proxy’s enumerate trap will be used, not its iterate trap.

js> var handler = {
    has: function (name) { return /^[abc]$/.test(name); },
    enumerate: function () { return ["a", "b", "c"]; },
    iterate: function () { for (var i = 0; i < 1e9; i++) yield i; }
};
js> var proxy = Proxy.create(handler);
js> var obj = Object.create(proxy);
js> for (var i in obj) {
    print(i);
}
a
b
c

Enumeration walks the prototype chain, and this is why a proxy might want both enumerate and iterate.
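
The dispatch rule described in this section can be modeled in plain JavaScript without real proxies. What follows is only a sketch under stated assumptions: `fakeProxy`, `fakeObject`, and `forInPlan` are hypothetical names invented for illustration, a "proxy" is faked as a tagged object carrying its handler, and the traps are never actually called. It shows only which trap a for-in loop would consult in each position.

```javascript
// Hypothetical model, not the Proxy API: a "proxy" is a tagged object that
// carries its handler, so the rule can run in any engine.
function fakeProxy(handler) {
    return { isFakeProxy: true, handler: handler, proto: null };
}

function fakeObject(proto) {
    return { isFakeProxy: false, proto: proto || null };
}

// List, in visiting order, what a for-in loop over target would consult.
function forInPlan(target) {
    // A proxy handed directly to the loop prefers iterate when it exists.
    if (target.isFakeProxy && typeof target.handler.iterate === "function")
        return ["iterate"];
    // Otherwise the loop enumerates and keeps enumerating: a proxy found
    // further up the chain is asked for enumerate, never iterate.
    var plan = [];
    for (var o = target; o !== null; o = o.proto)
        plan.push(o.isFakeProxy ? "enumerate" : "own properties");
    return plan;
}

var handler = {
    enumerate: function () { return ["a", "b", "c"]; },
    iterate: function () { return { next: function () { return 0; } }; }
};
var proxy = fakeProxy(handler);
var obj = fakeObject(proxy);

console.log(forInPlan(proxy).join(", ")); // iterate
console.log(forInPlan(obj).join(", "));   // own properties, enumerate
```

A handler that defines only enumerate would yield `enumerate` in both positions, which is the pre-iterators behavior.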

Iteration

What all this means: you can implement Pythonic iterators with proxies, and return a sequence of arbitrary values to a for-in loop that’s given the proxy directly (not on a prototype chain of a non-proxy object, as noted above). A large/lazy proxy would trap iterate instead of enumerate and return string keys, but other iterator-proxies could return Fibonacci numbers, integer ranges, or whatever the proxy implementor and consumer want. This was an intended part of the package deal.

js> function fib(n) {
    var i = 0;
    var a = 0, b = 1;
    return {
        next: function () {
            if (++i > n)
                throw StopIteration;
            [a, b] = [b, a + b];
            return a;
        }
    };
}
js> var handler = {iterate: function () { return fib(10); } };
js> var proxy = Proxy.create(handler);
js> for (var i in proxy)
    print(i);
1
1
2
3
5
8
13
21
34
55

(JS1.7 and above, implemented in both SpiderMonkey and Rhino, prefigured this proposal by supporting an unstratified iteration protocol based on Python 2.5. This JS1.7 Iterator extension is fairly popular in spite of some design flaws, and from the exercise of implementing and shipping it we’ve recognized those flaws and fixed them via proxies combined with the iterators strawman.)
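
For readers without a JS1.7 shell, the protocol that such a loop follows can be approximated with an ordinary helper function. This is a minimal sketch, not the strawman's specification: `forEachValue` is a hypothetical name, `StopIteration` is a stand-in object defined here because most engines do not provide one, and `fib` is rewritten without destructuring assignment so that it runs anywhere.

```javascript
// Hypothetical stand-in for the StopIteration singleton of JS1.7.
var StopIteration = { toString: function () { return "StopIteration"; } };

function fib(n) {
    var i = 0, a = 0, b = 1;
    return {
        next: function () {
            if (++i > n)
                throw StopIteration;
            var t = a + b;
            a = b;
            b = t;
            return a;
        }
    };
}

// Roughly what a meta-programmable loop has to do with an iterator:
// call next() until StopIteration is thrown, passing each value to the body.
function forEachValue(iter, body) {
    for (;;) {
        var value;
        try {
            value = iter.next();
        } catch (e) {
            if (e === StopIteration)
                return;
            throw e; // any other exception propagates to the caller
        }
        body(value);
    }
}

var seen = [];
forEachValue(fib(10), function (v) { seen.push(v); });
console.log(seen.join(" ")); // 1 1 2 3 5 8 13 21 34 55
```

The body is called outside the try block on purpose, so an exception thrown by the loop body is never mistaken for the end of iteration.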

The bad news is that the committee did something committees often do: try to compromise between divergent beliefs or subjective value theories.

In this case the compromise was based on the belief that for-in should not become the wanted meta-programmable iteration syntax. The argument is that for-in must always visit string-typed keys of the object, or at least whatever strings the accepted proxy enumerate trap returns in an array. If a Harmony proxy could somehow be enumerated by pre-Harmony for-in-based code, non-string values in the iteration might break the old code.

(The counter-argument is that once you let the proxy handler trap enumerate, a lot can change behind the back of old for-in-based code; also, enumeration is an underspecified mess. But these points do not completely overcome the objection about potential breakage in old code.)

Fear of Change

To fend off such breakage, we could make for-in meta-programmable only in Harmony code — any loop loaded under a pre-Harmony script tag type would not iterate a proxy.

This opt-in protection probably does not resolve the real issue, which is whether syntax can have its semantics changed much (or at all) in a mature language such as JS, which is being evolved via mostly-compatible standard versions in multi-year cycles.

I acknowledged during the meeting that we would not make progress without trying to agree on new syntax. This was too optimistic but I wanted to discover more about the divergent beliefs that made extending for-in via proxies a showstopper.

A quick whip-round the room with an empty cup managed to net us loose change from latter-day Java and C++:

for (var i : x)   // or let i, or just i for any lvalue i
     ...

as our meta-programmable “new syntax”. Bletch!

Not to worry. For-colon is probably not going to fly for some reasons I raised on es-discuss, but it also should die a deserved death as a classic bad compromise forged in the heat of a committee meeting.

The difficulty before us is precisely this how-much-to-change question.

ES5 strict mode already changes runtime semantics for existing syntax (eval of var no longer pollutes the caller’s scope; arguments does not alias formal parameters; a few others), for the better. Unfortunately, developers porting to "use strict" must test carefully, since these are meaning shifts, not new early errors.
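
Two of those meaning shifts can be observed side by side. The sketch below assumes an ES5-capable engine that permits `new Function` and `eval`; the variable names are made up for illustration. It uses `new Function` because a function built that way is non-strict unless its own body opts in, so both variants can sit in one file.

```javascript
// Same source text, with and without the strict directive.
var sloppyArgs = new Function("a", "a = 2; return arguments[0];");
var strictArgs = new Function("a", "'use strict'; a = 2; return arguments[0];");

var sloppyEval = new Function("eval('var leaked = 1'); return typeof leaked;");
var strictEval = new Function(
    "'use strict'; eval('var leaked = 1'); return typeof leaked;");

// arguments aliases the formal parameter only in non-strict code.
console.log(sloppyArgs(1), strictArgs(1)); // 2 1

// eval of var adds a binding to the caller only in non-strict code.
console.log(sloppyEval(), strictEval());   // number undefined
```

Neither strict variant raises an error; the same code simply returns a different answer, which is why testing is needed.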

My point is that syntactic and semantic change has happened over the last 15 years of JS, it is happening now with ES5 strict, and it will happen again.

Change is Coming

We believe that future JS, the Harmony language, must include at least one incompatible change to runtime semantics: no more global object at the top of the scope chain. Instead, programmers would have lexical scope all the way up, with the module system for populating the top scope. By default, the standard library we all know would be imported; also by default in browsers, the DOM would be there.

Can the world handle another incompatible change to the semantics of existing syntax, namely the for-in loop?

There are many trade-offs.

On the one hand, adding new syntax ensures no existing code will ever be confused, even if migrated into Harmony-type script. On the other, adding syntax hurts users and implementors in ways that combine to increase the complexity of the language non-linearly. The chances for failure to standardize and mistakes during standardization go up too.

What’s more, it will be a long time before anyone can use the new syntax on the web, whereas for-in and proxies implementing well-behaved iterators could be used much sooner, with fallback if (!window.Proxy).

Ultimately, it’s a crap shoot:

  • Play it safe, enlarge the language, freeze (and finally standardize, ahem) the semantics of the old syntax, and try to move users to the new syntax? or
  • Conserve syntax, enable developers to reform the for-in loop from its enumeration-not-iteration state?

All this is prolog. Perhaps the “play it safe” position is right. And more important, what if new syntax could be strictly more usable and have better semantics?

New Clothes and Body

Here’s my pitch: committees do not design well, period. Given a solid design, a committee with fundamental disagreements can stall or eviscerate that design out of conservatism or just nay-saying, until the proposal is hardly worth the trouble. At best, the language grows larger more quickly, with conservative add-ons instead of holistic rethinkings.

I’m to blame for some of this, since I’ve been playing the standards game with JS. Why not? It seems to be working, and the alternatives (ScreamingMonkey, another language getting into all browsers) are nowhere. But I observe that even for Harmony, and notably for ES5, much of the innovation came before the committee got together (getters, setters, let, destructuring). Other good parts of ES5 and emerging standards came from strong individual or paired designers (@awbjs, @markm, @tomvc).

And don’t get me wrong: sometimes saying “no” is the right thing. But in a committee tending a mature but still living programming language, it’s too easy to say “no” without any “but here’s a better way” follow-through. To be perfectly clear, TC39 members generally do provide such follow-through. But we are still a committee.

I want to break out of this inherently clog-prone condition.

So, given the concern about changing the meaning of for-in, and the rise of wrist-friendly “unsyntax” (Ruby, Python, CoffeeScript) over the shifted-keystroke-burdened C-family syntax represented by JS, why not make opting into Harmony enable new syntax with the desired meta-programmable semantics?

Paren-Free Heads

It would be a mistake to change syntax (and semantics) utterly. VM implementors and web developers having to straddle both syntaxes would rightly balk. There will be commerce between Harmony and pre-Harmony scripts, via the DOM and the shared JS object heap. But can we relax syntactic rules a bit, and lose two painfully-shifted, off-QWERTY-home-row characters, namely () in control structure heads?

for i in iter {
    // i is a value of any type;
}

Here’s your new syntax with new semantics!

We can simplify the iterator strawman too. If you want to iterate and not enumerate, use the new syntax. If you want to iterate keys (both “own” and any enumerable unshadowed property names on prototypes), use a helper function:

for i in keys(o) {
    // i is a string-typed key
}

The old-style for (var i in o)... loop only traps to enumerate. Large/lazy proxies? Use the new for k in keys(o) {...} form.
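
One possible shape for such a helper, for ordinary objects, is sketched below. It is an assumption-laden illustration and not a proposed library function: `StopIteration` is a stand-in defined locally, and this `keys` takes an eager snapshot with an old-style for-in. A large or lazy proxy would instead need `keys` to defer to the proxy's iterate trap, which this sketch does not model.

```javascript
// Hypothetical stand-in for the StopIteration singleton of JS1.7.
var StopIteration = { toString: function () { return "StopIteration"; } };

// Snapshot the enumerable, unshadowed property names (own and inherited)
// with an old-style for-in, then hand them out one at a time.
function keys(o) {
    var names = [];
    for (var k in o)
        names.push(k);
    var i = 0;
    return {
        next: function () {
            if (i >= names.length)
                throw StopIteration;
            return names[i++];
        }
    };
}

var proto = { inherited: 1 };
var o = Object.create(proto);
o.own = 2;

var it = keys(o);
var found = [it.next(), it.next()];
console.log(found.join(" "));
```

A third call to `it.next()` would throw `StopIteration`, which is the signal a paren-free loop would use to stop.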

Are the braces required? C has parenthesized head expressions and unbraced single-statement bodies. Without parens, a C statement such as

if x
    (*pf)(y);

would be ambiguous (don’t try significant newlines on me — I’ve learned my lesson :-/). You need to mandate either parens around the head, or braces around the body (or both, but that seems like overkill).

So C requires parens around head expressions. But many style guides recommend always bracing, to ward off dangling else. Go codifies this fully, requiring braces but relieving programmers from having to parenthesize the head expression.

I swore I’d never blog at length about syntax, but here I am. Syntax matters, it’s programming language UI. Therefore it needs to be improved over time. JS is overdue for an upgrade. So my modest proposal here is: lose the head parens, require braces always.

You could argue for optional braces if there’s no particular ambiguity, e.g.

if foo
    bar();

But that will be a hard spec to write, a confusing spec to read, and educators and gurus will teach “always brace” anyway. Better to require braces.

Pythonic significant whitespace is too great a change, and bad for minified/crunched/mangled web scripts anyway. JS is a curly-brace language and it always will be.

Implicit Fresh Bindings

Another win: the position between for and in is implicitly a let binding context. You can destructure there too, but whatever names you bind, they’ll be fresh for each iteration of the loop.

This allows us to solve an old and annoying closure misfeature of JS:

js> function make() {
    var a = [];
    for (var i = 0; i < 3; i++)
        a.push(function () { return i; });
    return a;
}
js> var a = make();
js> print(a[0]());
3
js> print(a[1]());
3
js> print(a[2]());
3

Changing var to let in the C-style three-part for loop does not help.

But for-in is different, and in Harmony we (TC39) believe it should make a fresh let binding per iteration. I’m proposing that the let be implicit and obligatory. And of course the head is paren-free, so the full fix looks like this:

js> function make() {
    var a = [];
    for i in range(3) {
        a.push(function () { return i; });
    }
    return a;
}
js> var a = make();
js> print(a[0]());
0
js> print(a[1]());
1
js> print(a[2]());
2
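
For comparison, the same result is available in today's engines only by creating the fresh binding by hand. The sketch below shows the usual workaround, an immediately invoked function per iteration; the parameter name `fresh` is arbitrary. It is meant as an illustration of what "a fresh binding per iteration" buys, not as the proposal's desugaring.

```javascript
function make() {
    var a = [];
    for (var i = 0; i < 3; i++) {
        // Calling a function on each iteration creates a new scope, so each
        // closure captures its own copy of the loop value.
        (function (fresh) {
            a.push(function () { return fresh; });
        })(i);
    }
    return a;
}

var a = make();
console.log(a[0](), a[1](), a[2]()); // 0 1 2
```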

Part of the Zen of Python: “Explicit is better than implicit.” Of course, Python has implicit block-scoped variable declarations, so this is more of a guideline, or a Zen thing, not some Western-philosophical absolute ;-). Having to declare an outer or global name in Python is therefore an exception, and painful. Like the sound of one hand slapping your face.

Of course JS shouldn’t try to bind block-scoped variables implicitly all over the place, as Python does; once again, that would be too great a change. But implicit for-in loop let-style variable declaration is winning both as sensible default, and to promulgate the closure-capture fix.

Comprehensions

When we implemented iterators and generators in JS1.7, I also threw in array comprehensions:

js> squares = [i * i for (i in range(10))];
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
js> odds = [i for (i in range(20)) if (i % 2)]
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

At first I actually implemented paren-free heads for the for-in parts in the square brackets, but when I got to the optional trailing if I balked. Too far from JS, and in practical terms, a big-enough refactoring speed-bump for anyone sugaring a for-in loop as a comprehension. But paren-free Harmony rules:

js> squares = [i * i for i in range(10)];
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
js> odds = [i for i in range(20) if i % 2]
[1, 3, 5, 7, 9, 11, 13, 15, 17, 19]

The same win applies to generator expressions.
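
As a rough guide to what the two comprehensions compute, here is an equivalent written with ES5 array methods. It is a sketch under assumptions: `range` is a hypothetical helper that returns a plain array, whereas the `range` in the examples above could just as well be a lazy iterator.

```javascript
// Hypothetical helper: the integers 0..n-1 as an array.
function range(n) {
    var out = [];
    for (var i = 0; i < n; i++)
        out.push(i);
    return out;
}

// [i * i for i in range(10)]
var squares = [];
range(10).forEach(function (i) {
    squares.push(i * i);
});

// [i for i in range(20) if i % 2]
var odds = [];
range(20).forEach(function (i) {
    if (i % 2)
        odds.push(i);
});

console.log(squares.join(", ")); // 0, 1, 4, 9, 16, 25, 36, 49, 64, 81
console.log(odds.join(", "));    // 1, 3, 5, 7, 9, 11, 13, 15, 17, 19
```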

Thanks

Thanks to TC39 colleagues for their general excellence — we’re a committee but I’ll try not to hold that against any of us.

Thanks especially to @AlexRussell and @arv, who at last week’s meeting brought some attitude about improving syntax and semantics in Harmony that I fought at first (for fear of the committee opening up all design and compatibility constraints and failing to reach Harmony). Their attitude stimulated me to think outside the box, and outside the parens.

Some of you may be thinking “this is crazy!” Others of you will no doubt say “more! more!” I have some other thoughts, inspired by TC39 conversations, that could help make Harmony a better language without it being over-compatible warm beer, but I’ll save them for another post.

My point here is not to rush another syntax strawman through TC39, but to stimulate thinking. I’m serious about paren-free FTW, but I’m more serious about making Harmony better through judicious and holistic re-imaginings, not only via stolid committee goal-tending.

/be

24 Responses to “Paren-Free”

  1. If you’re serious about the wrist-pain argument for removing parens, shouldn’t you provide an alternative to curly braces as well. I’d argue that typing those involves more wrist torque and pain–though I don’t speak from experience on that… begin/end bracketing of statement bodies?

    Also, if I’m understanding correctly, if and while would work the same with and without parens around the head. But for/in would be subtly different with and without the parens. So parens are usually optional but sometimes significant…that worries me.

  2. Sid says:

    Thank you, thank you, thank you for fresh let-bindings per iteration. This is one thing that a lot of languages have gotten so badly wrong.

    Eric Lippert attempts to defend the equivalent definition in C#: http://blogs.msdn.com/b/ericlippert/archive/2009/11/12/closing-over-the-loop-variable-considered-harmful.aspx — but IMHO the only good point there is that you’ll find _some_ developer somewhere relying on it.

  3. Sid says:

    “the equivalent definition” => “the equivalent decision”

  4. Nick says:

    What about people who like their squiggly brackets on separate lines.

    if year > 2010
    {
    syntax++
    }

    Will that be allowed?

  5. Dean Landolt says:

    Dropping parens in all heads would be a very welcome syntax addition, but I share David Flanagan’s worry about parens being subtly significant in just the for..in case. I’d take it as a given that js programmers are liberal with parens where we’re not sure of order-of-ops implications — I know I am. Adding significance to this one case is probably a dangerous precedent. Would the implicit let binding also depend on the parens?

    Still, optional parenless-heads would be a thing of beauty for Harmony, and not just because of our collective carpal tunnel — moving from parens-required to braces-required doesn’t buy us much there, but making our blocks explicit helps plug up a few hazards. Perhaps there’s another nice opt-in syntax for iterators waiting to be discovered but I think that issue should be orthogonal to headless parens.

  6. Stu says:

    In python keeping at the start of control blocks makes sense, as it differentiates them from the rest of the code, might be worth it here (along with removing parens).

    Of course, maybe it would be worth proposing python for inclusion in browsers somehow ?

  7. Thanks for the work you’re doing, Brendan. And I applaud your willingness to steal the good parts of CoffeeScript and integrate them into Harmony. Good on ya mate!

  8. Brendan Eich says:

    @David, @Dean: I don’t see how to eliminate braces and have a sound grammar. Anything like begin/end is overlong and those keystrokes have to hurt too. Braces are less of a reach from QWERTY home than parens, best I can do.

    On the issue of for x in y {} vs. for (x in y) {}, one fix is to ban the latter in Harmony. The “x in y” is not an expression you can over-parenthesize, unlike “if x == y {}” vs. “if (x == y) {}”. Some people keep wanting different syntax for iteration. I aim to please!

    @Nick: sure, free format as usual, modulo automatic semicolon insertion’s “restricted productions” (no newline between return and its expression, break/continue and their label, throw and its expression [why the last is needed is a mystery of ES3]).

    /be

  9. Brendan Eich says:

    @Stu: give http://www.aminutewithbrendan.com/pages/20101122 a listen and follow the hacker news link. C-Python in all the browsers is not going to happen, for enough reasons that I’ll bet on Skulpt.

    @Anthony: Programming languages all “steal”, similar to how natural languages (especially English) evolve. JS needs a make-over!

    /be

  10. Letain says:

    If you’re saving keystrokes, how about renaming (or aliasing) “function” to something shorter? This would save me more strokes than removing parens. :(

    Larry Wall, for all his warts, had the right idea by applying a kind of Huffman encoding to keywords — common keywords are shorter. Functions are really common in a language with first-class functions…

  11. Harleqin says:

    Is this a new sport? How much ambiguity can your parser take before it breaks?

  12. Nice! And now drop the “return”s and always return the result of the last expression.

  13. Scott Graham says:

    @Brendan re: Betting on Skulpt — ‘goto’ (or even computed goto) would really help make Skulpt more practical. Re-implementing Py control flow w/o it is pretty painfully slow.

    I like these proposals so far.

  14. Brendan Eich says:

    @Letain: see http://wiki.ecmascript.org/doku.php?id=strawman:shorter_function_syntax — more to say about this but in a separate post. This one was long enough.

    @Harlequin: what ambiguity?

    @Christian: that is two-edged, due to completion value leak bugs. It’s not clear we can do better than a shorter “return” or “here is the completion value” syntax.

    /be

  15. wm says:

    Iterators – two dots instead of colon (if using ‘in’ is too problematic):
    for (i .. lst) { print(i); }

    Guards – if instead of two colons:
    let x = 4 if fnGuard

  16. Brendan Eich says:

    @WM: I am trying not to “[p]lay it safe, enlarge the language, freeze [the] old syntax, [and] move users to the new syntax”, as I wrote. But if we must add new syntax, your “..” idea is not bad — better than “:” for sure! Although (dherman reminded me, so here’s an UPDATE:) ECMA-357 (E4X) specifies “..” as the “descendant” operator.

    People could bikeshed for months on new syntax to add, but my hope is that by switching to paren-free (and only paren-free) we can change for the better, without adding extra syntax.

    “if” for guards is even more righteous — see also catch guards — but alas it requires a restricted “no [LineTerminator] here” production. Otherwise automatic semicolon insertion kicks in and error-corrects as follows:

    js> function f() { let x = 4
    if (x) return true; return false; }
    js> f
    function f() {var x = 4;if (x) {return true;}return false;}

    A restricted production is not the end of the world, but we are trying to avoid adding more. I encourage you to post to es-discuss with this “if” for guards idea, adverting to the restricted production requirement. Thanks,

    /be

  17. Chris Waterson says:

    Okay, I couldn’t help it. I was reminded of this:

  18. Brendan Eich says:

    Waterson: thanks for the memories. Where is Warren these days? Mail me!

    Srsly, we have too many parens — they are still on sale. Unlike if, while, and switch, the for loops (both of them) do not have heads that are simply expressions with mandatory parentheses around them. Both for (;;) and for-in have heads that, while currently required to be paren’ed, are not expressions but rather special forms.

    Someone on reddit objected to my proposing to relieve JS hackers of having to parenthesize heads. My response is “over-parenthesize if you like, where the head is just an expression”. The for loops are not in that class of statements.

    Another Reddit whiner thought this was pointless since JS can’t ever change in browsers. Hah!

    I’ve been doing this too long. I think five years is a fine period to wait for uptake among browsers in a competitive market. It worked for ES3 and in spite of competition going away during the five years. Competition since Firefox launched has only gone up. I don’t think it will go away soon.

    /be

  19. George says:

    I love Harmony!

    How about using the strict clause for syntax?

    Strict ON requires parens and curlies (whiners, minifiers)
    Strict OFF no parens, no curlies and significant white space.

    * parens and curlies will always be required for lambdas and closures.

  20. Daniel says:

    Re:

    How about using the strict clause for syntax?
    Strict ON requires parens and curlies (whiners, minifiers)

    A:
    Good minifiers (Google’s Closure) already handle curly-free code just fine. As far as curlies are concerned, there is only one audience left – whiners. Accommodating whiners?

  21. Brendan Eich says:

    @George: one of head-parens or mandatory curlies is required. There is no point making yet another mode for this given Harmony’s opt-in script type. What’s more, we have no plans yet for “use strict” in Harmony — that is an ES5 mode that is default-enabled by Harmony.

    @Daniel: no accommodating whiners, right (of course we all whine now and then, get cranky, throw food. ;-)

    The general trend over the last 15 years toward lighter syntax than C’s, and the consequent interest in CoffeeScript and languages that influenced it, involves no whining at all — rather, better ergonomics vs. readability trade-offs. Going too far hurts readability, but heavy mandatory bracketing can hurt both writability and readability.

    /be
