TraceMonkey Update

Since TraceMonkey landed on mozilla-central, from which the Firefox 3.1 alpha-stage nightly builds are built, we have been busy, mostly fixing bugs for stability but also winning a bit more performance. Tonight’s builds include a fix for the bug that sent a SunSpider test into an infinite loop (my apologies to those of you who suffered that bug’s bite).

But what I’m sure everyone wants to know is: how do we compare to V8?

Here are the results from head-to-head SunSpider on Windows XP on a Mac Mini and Windows Vista on a MacBook Pro, testing against last night’s Firefox automated build and yesterday’s Chrome beta:

[Figure: SunSpider results, TraceMonkey vs. V8]

We win by 1.28x and 1.19x, respectively. Maybe we should rename TraceMonkey “V10” ;-).

Ok, it’s only SunSpider, one popular yet arguably non-representative benchmark suite. We are not about to be braggy. (“Don’t be braggy” is our motto here at Mozilla ;-).)

But it’s worth digging deeper into the results. Let’s look at the ratios by test:

We win on the bit-banging, string, and regular expression benchmarks. We are around 4x faster at the SunSpider micro-benchmarks than V8.

This graph does show V8 cleaning our clock on a couple of recursion-heavy tests. We have a plan, to trace recursion (not just tail recursion). We simply haven’t had enough hours in the day to get to it, but it’s “next”.

This reminds me: TraceMonkey is only a few months old, excluding the Tamarin Tracing Nanojit contributed by Adobe (thanks again, Ed and co.!), which we’ve built on and enhanced with x86-64 support and other fixes. We’ve developed TraceMonkey in the open the whole way. And we’re as fast as V8 on SunSpider!

This is not a trivial feat. As we continue to trace unrecorded bytecode and operand combinations, we will only get faster. As we add recursion, trace-wise register allocation, and other optimizations, we will eliminate the losses shown above and improve our ratios linearly across the board, probably by a factor of 2 or more.

I’ll keep updating the blog every week, as we do this work. Your comments are welcome as always.

V8 is great work, very well-engineered, with room to speed up too. (And Chrome looks good to great — the multi-process architecture is righteous, but you expected no less praise from an old Unix hacker like me.)

What spectators have to realize is that this contest is not a playoff where each contending VM is eliminated at any given hype-event point. We believe that Franz&Gal-style tracing has more “headroom” than less aggressively speculative approaches, due to its ability to specialize code, making variables constant and eliminating dead code and conditions at runtime, based on the latent types inherent in almost all JavaScript programs. If we are right, we’ll find out over the next weeks and months, and so will you all.
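
To make the specialization argument a little more concrete, here is a minimal sketch in plain JavaScript of what a type-specialized trace with a guard might look like. It is an illustration only, not TraceMonkey's actual machinery: the names `genericAdd`, `sumGeneric`, and `sumSpecialized` are hypothetical, `genericAdd` is a simplified stand-in for the real `+` semantics, and a real tracer emits native code rather than more JavaScript.

```javascript
// Generic path: roughly the kind of type dispatch needed for `a + b`
// when nothing is known about the operand types (simplified).
function genericAdd(a, b) {
  if (typeof a === "number" && typeof b === "number") return a + b;
  if (typeof a === "string" || typeof b === "string") return String(a) + String(b);
  return Number(a) + Number(b);
}

// Loop that pays for the generic dispatch on every iteration.
function sumGeneric(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total = genericAdd(total, values[i]);
  }
  return total;
}

// Hypothetical "trace" specialized for the case seen while recording:
// every element is a number. A guard checks that assumption on each
// iteration; if it fails, we leave the fast path and finish generically.
function sumSpecialized(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    if (typeof v !== "number") {
      // Guard failed: side exit, resuming at index i with the state so far.
      let rest = total;
      for (let j = i; j < values.length; j++) {
        rest = genericAdd(rest, values[j]);
      }
      return rest;
    }
    total += v; // hot path: no type dispatch
  }
  return total;
}
```

Both functions compute the same result for any input; the specialized one simply bets that the types it saw first will keep showing up, and pays for the generic path only when that bet fails.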

Anyway, we’re very much in the game and moving fast — “reports of our death are greatly exaggerated.” Stay tuned!

30 Replies to “TraceMonkey Update”

Michael Haufe says:

September 3, 2008 at 4:46 am

No doubt about it, Chrome is pretty damn fast, but they aren’t quite up to par on things like Acid 3, SVG and a couple other DOM issues and rendering artifacts. I hope speed isn’t their primary focus… (Interesting to note that the memory consumption of the browser also has a correlation to the size of application on the monitor(s))

Christian Betz says:

September 3, 2008 at 7:06 am

Thank you very much for running these tests, Brendan. When I read about V8 yesterday, one of my very first questions was how it compares to the latest TraceMonkey builds.
I am currently working on a project using the SpiderMonkey JS API, and it is good news to hear that V8 does not blow TraceMonkey out of the water and that the race is in fact quite close. You guys are doing great work.
One more thing: Is it possible with the current nightly builds to embed TraceMonkey (using the classic SpiderMonkey JS API)?

Brendan Eich says:

September 3, 2008 at 7:52 am

@Christian: yes, you can embed TraceMonkey. We are not breaking the JS API intentionally in any way you should notice.
One change we did make: the standard global properties for Object, Math, etc. used to be monitored by the global object class’s getProperty and setProperty hooks, but now use JS_PropertyStub. Again I do not expect embedders to notice this, and it’s better for performance and integrity of the standard bindings. The default get/set monitoring is an ancient JS API “feature”, going back to 1995, but not really a wanted feature AFAICT.
Any other API changes should be documented on MDC via dev-doc-needed keywords on bugs.
/be

Siddharth Agarwal says:

September 3, 2008 at 8:16 am

Maybe it’s because I’m running a trunk build checked out a couple of hours ago and not a special branch, but on my Vista machine, TraceMonkey (JITting on) takes 2156.4ms +/- 4.4% and Chrome 1788.8ms +/- 2.9%.

Brendan Eich says:

September 3, 2008 at 9:29 am

Siddharth: PGO does a lot for our RegExp code, and it is on in the automated builds. If you built your own Minefield, you didn’t get it by default.
/be

Siddharth Agarwal says:

September 3, 2008 at 10:18 am

Yes, you’re right 🙂 An automated build gives 1757.4ms +/- 4.7%.
This is great stuff! It’ll be interesting to see where Mozilla and Google take their respective implementations.

Peter Michaux says:

September 3, 2008 at 8:14 pm

Brendan,
Could you please explain a bit more what the following means?
“We have a plan, to trace recursion (not just tail recursion).”
Does this mean tail recursion which is also just-in-time compiled?
Thanks,
Peter

Brendan Eich says:

September 3, 2008 at 9:51 pm

Peter: recursion in general. Tail recursion is just a loop, which Trace Trees eat for breakfast. Non-tail is more challenging but we’re doing it.
/be
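
The distinction between tail and non-tail recursion can be sketched with a small example. The function names here are hypothetical and chosen only for illustration; the point is that the tail-recursive form needs nothing from the caller's frame after the call, so it is mechanically equivalent to a loop, while the non-tail form still has work pending when each call returns.

```javascript
// Tail-recursive: the recursive call is the last thing the function does.
function sumToTail(n, acc = 0) {
  if (n === 0) return acc;
  return sumToTail(n - 1, acc + n);
}

// The same computation written as the loop it is equivalent to.
function sumToLoop(n) {
  let acc = 0;
  while (n !== 0) {
    acc += n;
    n -= 1;
  }
  return acc;
}

// Non-tail recursion: the addition happens after the recursive call
// returns, so each call's frame must be kept alive until then.
function sumToNonTail(n) {
  if (n === 0) return 0;
  return n + sumToNonTail(n - 1);
}
```

All three return the same value for a non-negative integer `n`; only the last one requires remembering pending work across calls, which is what makes it harder to capture in a loop-shaped trace.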

Breton says:

September 4, 2008 at 4:14 pm

To increase regular expression performance, has Mozilla considered implementing the “Thompson’s NFA” algorithm described in the following page?
https://swtch.com/~rsc/regexp/regexp1.html
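
For readers unfamiliar with the idea, the core of the approach is to track the set of all NFA states the matcher could be in, advancing every state in lockstep per input character instead of backtracking. The following is a minimal sketch under stated assumptions: the automaton is hand-built for one pattern shape (n optional `a`s followed by n required `a`s, the kind of pathological case the linked article discusses), and the names `buildNfa`, `closure`, and `matches` are hypothetical. It is not a regex compiler and not any engine's actual implementation.

```javascript
// Hand-built NFA for the pattern a?{n}a{n}.
// States 0..n-1 are the optional a's (each has an epsilon edge that skips it);
// states n..2n-1 are the required a's; state 2n is the accepting state.
function buildNfa(n) {
  const states = [];
  for (let i = 0; i < 2 * n; i++) {
    states.push({ onA: i + 1, epsilon: i < n ? i + 1 : null });
  }
  states.push({ onA: null, epsilon: null });
  return { states, accept: 2 * n };
}

// Add every state reachable through epsilon edges to `set` (in place).
function closure(nfa, set) {
  const stack = [...set];
  while (stack.length > 0) {
    const s = stack.pop();
    const e = nfa.states[s].epsilon;
    if (e !== null && !set.has(e)) {
      set.add(e);
      stack.push(e);
    }
  }
  return set;
}

// Simulate the NFA on `input`, keeping the set of live states.
// Work per character is bounded by the number of states, so there is
// no exponential blowup from backtracking.
function matches(nfa, input) {
  let current = closure(nfa, new Set([0]));
  for (const ch of input) {
    const next = new Set();
    for (const s of current) {
      const target = nfa.states[s].onA;
      if (ch === "a" && target !== null) next.add(target);
    }
    current = closure(nfa, next);
    if (current.size === 0) return false; // no live states left
  }
  return current.has(nfa.accept);
}
```

With `buildNfa(3)`, for example, the strings `aaa` through `aaaaaa` match and anything shorter, longer, or containing another character does not.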

Brendan Eich says:

September 4, 2008 at 8:44 pm

Breton: yes, I mentioned Ken Thompson’s original grep to Andreas the other day. We had an intern study regular expression tracing in the Tamarin Tracing context, too. We need to get recursion and a few other things done, but I think regexp tracing (with ES3-compatibility to complicate things, to be sure) is inevitable.
/be

Γριφεγ says:

September 5, 2008 at 1:39 pm

If V8 was created with big web apps in mind, why are they doing whole-file JIT? With really large apps this will inevitably be a perf hit. As you are saying V8 is well-engineered, I must have misunderstood their «Intro» on the Google Code site.

Brendan Eich says:

September 5, 2008 at 2:44 pm

@Γριφεγ: V8 can do lazy function-by-function compilation (without any interpreter) based on what is called. In Chrome all source is parsed into ASTs but AIUI not necessarily compiled in a whole-file or whole-script-tag-content sense.
“Well-engineered” is evident from a read through the code. Check it out yourself.
/be
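
As a rough illustration of lazy, function-by-function compilation, here is a toy sketch in JavaScript. It is not how V8 is implemented: `makeLazyModule` and its `call` and `compiledCount` members are hypothetical names, and the built-in `Function` constructor stands in for a real compiler. The only point is that source is held until a function is first called, and only then compiled.

```javascript
// Toy "engine": keeps function sources and compiles each one on first call.
// `sources` maps a name to { params: [...], body: "..." }.
function makeLazyModule(sources) {
  const compiled = new Map();
  let compileCount = 0;

  function call(name, ...args) {
    if (!compiled.has(name)) {
      if (!Object.prototype.hasOwnProperty.call(sources, name)) {
        throw new Error(`unknown function: ${name}`);
      }
      const { params, body } = sources[name];
      // Stand-in for compilation: build a callable from source text.
      compiled.set(name, new Function(...params, body));
      compileCount += 1;
    }
    return compiled.get(name)(...args);
  }

  return { call, compiledCount: () => compileCount };
}
```

Given a module with sources for `square` and `cube`, calling only `square` compiles only `square`; `cube` costs nothing beyond holding its source until something actually calls it.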

hmdz says:

September 5, 2008 at 9:58 pm

The browser war is just becoming more interesting. Well, with thousands of add-ons for Firefox, I don’t think that Chrome can reach its popularity very soon. JavaScript speed alone is not everything for a browser, and a good example is Opera, which had the fastest engine for a long time but gathered a much smaller audience than Firefox/IE.
A JIT engine in Firefox 3.1 will be great, putting Firefox at the top of my list again. However, I guess Mozilla’s JIT compilation still has a lot of room for optimization when we compare it with things like JIT compilation in the .NET Framework. The performance of .NET code is excellent for jitted code, and in some cases it performs better than its native counterparts.
There will be a bright future for the web and desktop if web applications such as Zoho can reach the speed of their desktop counterparts, using highly advanced just-in-time JavaScript compilers.

Lanny Heidbreder says:

September 5, 2008 at 11:08 pm

*blankstare*
Why in the world is it faster on a Mac Mini than on a MacBook Pro?! Is it because of XP/Vista?

RoBoMe says:

September 6, 2008 at 3:30 am

I’m glad something is being done and also that you’re confident to hold up with V8.
Though I’m curious how results from the V8 Benchmark Suite fit into the picture. Surely they’ve chosen to include benchmarks where V8 is superior. But still, I get an overall score of 181 for today’s 3.1 nightly of Firefox when jit is off, and only 158 when jit is on, but a blazing 1646 for Chrome.
What type of optimization will get TraceMonkey on par?
Second point, more JS speed is great and everything else is not your business. But I’m still bothered by Firefox’s DOM speed, which comes into play in nearly every use of a browser. Here Opera, Safari, and Chrome are ten times faster.
So to really get a performance boost for the whole app, tuning JS just isn’t enough. And sadly I can’t see anything being done in Gecko performance-wise.

_ck_ says:

September 6, 2008 at 8:01 am

Try making a 100 row, 100% width table, with random data inside each row. Then via dhtml/javascript, collapse it row by row (on a 20ms timer so the display has time to render each change).
In the above test, Firefox (all versions including 3.1a2) is far slower than any other modern browser, including IE.
Why is this? Does it have to do with the poor DOM performance and nothing to do with JavaScript?
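
For anyone who wants to reproduce this, a test along the lines described could be sketched as follows. This is an assumption-laden sketch rather than the exact test that was run: the function names `buildTable` and `collapseRowsOneByOne` are hypothetical, one cell per row is an arbitrary choice, and the `schedule` parameter exists only so the stepping logic can be exercised outside a browser.

```javascript
// Hide `rows` one at a time, waiting `delayMs` between steps so the
// browser has a chance to lay out and paint after each change.
function collapseRowsOneByOne(rows, delayMs = 20, schedule = setTimeout) {
  let index = 0;
  function step() {
    if (index >= rows.length) return;
    rows[index].style.display = "none";
    index += 1;
    schedule(step, delayMs);
  }
  schedule(step, delayMs);
}

// Build a full-width table with `rowCount` rows of random data.
// Needs a DOM document, which is passed in explicitly.
function buildTable(doc, rowCount = 100) {
  const table = doc.createElement("table");
  table.style.width = "100%";
  for (let i = 0; i < rowCount; i++) {
    const row = table.insertRow();
    row.insertCell().textContent = String(Math.random());
  }
  return table;
}

// Browser usage (not run here):
//   const table = buildTable(document);
//   document.body.appendChild(table);
//   collapseRowsOneByOne(Array.from(table.rows));
```

Because each step only sets `display` on one row, any slowness observed comes from layout and rendering work triggered by the change rather than from the script itself.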

Brendan Eich says:

September 6, 2008 at 4:46 pm

@RoBoMe: the V8 benchmarks are heavier on recursion and function calling than others, and we fall off trace on them, mainly due to recursion but also due to untraced native methods. We’re looking at them now in detail and working through the bugs.
@_ck_: yes, you are describing layout (rendering) and DOM, not core JS, performance bottle-necking.
You might be describing an O(n^2) layout bug we should fix, which would be a spot-fix, not a general performance shift on the order of TraceMonkey.
The Gecko DOM is a huge, ripe target for improvement via tracing, and otherwise (Jason Orendorff’s quickstubs just landed, I’m not sure you tested with a build that includes them). If you made a test and can share it, please file a bug and attach it (cc: me — brendan@moz will autocomplete when you submit the bug). We’ll study it and see what should be optimized.
/be

CVertex says:

September 7, 2008 at 12:08 am

Chrome does seem to cold-start faster.
I really like the multi-process choice. Out of all the Chrome ideas, this is the best. I think Firefox should follow suit.
TBH, it’s Flash sites that really kill the browser and slow it down. I hope Adobe works harder to make it more compatible.

waleofsuous says:

September 7, 2008 at 2:46 am

I think you should compare the latest nightly build of Firefox to the latest nightly build of Chromium, that’d be a more valid comparison.

Brendan Eich says:

September 7, 2008 at 1:52 pm

CVertex: agreed on MP, we’re looking into it.
waleofsuous: check the date on this blog post: Chrome was just out, and it hadn’t changed that day. When I give updates, I’ll blog latest tip to tip perf, but until then you’ll have to benchmark for yourself.
Again, and for the last time: this is not some world series or playoff, so don’t get too worked up. The main point is that we’re competitive and working on even bigger wins in the near term. And I see that webkit.org is too (SquirrelFish Extreme!), which is good. Microsoft must be feeling the heat and seeing the light.
The competitive trends should become clearer over time, and if V8 ends up fastest when the different competing engines approach their designs’ asymptotes, great — we’ll probably be right behind, so developers can count on excellent performance across most browsers. But we may well be ahead, because we can inline and speculate better.
/be

Dannii says:

September 7, 2008 at 11:37 pm

Hi Brendan, I’m considering reworking Thomas Thurman’s Gnusto engine, which has some form of JIT. I looked at Andreas Gal’s site, but I fear it all looks far above my level (and probably not too applicable for a JIT compiler running in JS?)
Do you know of any good resources explaining the basic theory and methods behind JIT compilation?

waleofsuous says:

September 8, 2008 at 2:13 am

hmm… what’s this “SquirrelFish Extreme!” you are talking about here???

yusufg says:

September 10, 2008 at 6:59 am

Just a reminder that TraceMonkey fans are looking forward to your next weekly update

waleofsuous says:

September 12, 2008 at 4:37 am

Latest SunSpider comparisons:
Safari, WebKit nightly r36309: 1763.8ms +/- 2.3%
Chromium build 2105, JavaScript V8: 1902.0ms +/- 1.8%
Minefield/3.1b1pre Gecko/20080912031847, TraceMonkey with jit.chrome and jit.content set to true: 2055.6ms +/- 3.4%

Brendan Eich says:

September 14, 2008 at 3:09 pm

@waleofsuous, @muonis: Those last two comments look exactly the same — I will delete the later one if you can confirm, or in any event (since it is redundant). The results also look like Mac numbers — please confirm.
These numbers show that SquirrelFish Extreme has landed in the WebKit nightlies — good job SFX team! JITting regexp evaluation really pays off on SunSpider.
TraceMonkey recursion-handling and regexp compilation are under way. More after I return from traveling.
/be

Michael Adams says:

October 12, 2008 at 3:23 pm

https://forums.mozillazine.org/viewtopic.php?f=23&t=900365
Performance in the nightlies is slipping. I ran a test between 3.1 10-12-2008 trunk & Chrome 0.2.149.30. Results are not so promising now…
** TOTAL **: *1.82x as slow* 2380.8ms +/- 1.3% 4330.4ms +/- 0.7% significant

W & L says:

December 9, 2008 at 3:09 am

Any update here on TraceMonkey and the trace-into-DOM issues?

RoBoMe says:

February 2, 2009 at 2:51 am

There seems to be no newsworthy progress on TraceMonkey. That’s sad.

Brendan Eich says:

June 10, 2009 at 11:31 am

Plenty newsworthy, just me being too busy and lame to blog :-P. I will have new posts this week.
/be

achudnov says:

June 19, 2009 at 12:31 pm

Brendan,
What’s the current status of the project introducing an info-flow monitor in the next JS engine? Do you have a project page for that now? The latest info I have is https://www.dagstuhl.de/Materials/Files/09/09141/09141.EichBrendan.Slides.pdf
/Andrey
