V8 under the hood

I think I can add one more option to my last post about VMs. With Chrome, Google released a new virtual machine for Javascript called V8. I’ve spent a good while over the past two days playing with it and browsing through the code, and thought some of you would be interested in my findings.

First, it’s not really a “classic” VM. There’s no usable intermediate representation or higher-level opcode set that you can program against at the VM level. The only thing V8 understands is Javascript and the only target representation is native machine code (x86 and ARM for now). So in some respects it’s closer to a compiler than a traditional VM, even though the line is blurry for most modern VMs. There’s no interpretation whatsoever.

But it’s also more than just a compiler. It includes a generational, accurate garbage collector, and the generated machine code is reworked at runtime depending on execution paths. What they’re calling hidden classes lets V8 treat objects that share the same layout alike, so property accesses and calls can be optimized as much as possible. Once you know for sure the address that’s being called, there are a lot of clever tricks to make that very fast. As Javascript is fairly dynamic as class structures go (objects are basically hashes; you can add, remove, or update functions whenever you feel like it), those hidden classes can also be recalculated, which triggers new code generation. A fairly cute piece of machinery.
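To make the hidden class idea a bit more concrete, here is a toy model in C++. It is only a sketch of the general technique, not V8's actual code: the names `HiddenClass`, `Object` and `InlineCache` are made up for illustration, values are plain ints, and property removal is left out. Objects that get the same properties in the same order end up sharing one hidden class, so a call site can remember "this class, this slot" and skip the name lookup next time.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical toy types, not V8 internals.
// A hidden class maps property names to slot indexes and remembers which
// class an object moves to when a given property is added.
struct HiddenClass {
    std::map<std::string, std::size_t> slots;
    std::map<std::string, std::unique_ptr<HiddenClass>> transitions;

    // Returns the class reached by adding `name`, creating it on first use.
    HiddenClass* Transition(const std::string& name) {
        auto it = transitions.find(name);
        if (it != transitions.end()) return it->second.get();
        std::unique_ptr<HiddenClass> next(new HiddenClass());
        next->slots = slots;
        const std::size_t index = slots.size();
        next->slots[name] = index;
        HiddenClass* raw = next.get();
        transitions[name] = std::move(next);
        return raw;
    }
};

// An object is a pointer to its hidden class plus a flat array of values.
struct Object {
    HiddenClass* hidden_class;
    std::vector<int> values;

    explicit Object(HiddenClass* root) : hidden_class(root) {}

    void Set(const std::string& name, int value) {
        // Invariant: one value per slot described by the hidden class.
        assert(values.size() == hidden_class->slots.size());
        auto it = hidden_class->slots.find(name);
        if (it == hidden_class->slots.end()) {
            // New property: move to the next hidden class and grow storage.
            hidden_class = hidden_class->Transition(name);
            values.push_back(value);
        } else {
            values[it->second] = value;
        }
    }
};

// One cache per property-access site: remembers the last hidden class seen
// and the slot the property had in it.
struct InlineCache {
    const HiddenClass* cached_class = nullptr;
    std::size_t cached_slot = 0;
    int misses = 0;

    int Load(const Object& object, const std::string& name) {
        if (object.hidden_class != cached_class) {
            // Slow path: look the name up (throws if absent) and re-cache.
            ++misses;
            cached_slot = object.hidden_class->slots.at(name);
            cached_class = object.hidden_class;
        }
        // Fast path: a pointer compare and an indexed load.
        return object.values[cached_slot];
    }
};
```

In a real engine the fast path would be patched directly into the generated machine code rather than kept in a separate struct. The `misses` counter is only there to make the behaviour observable: adding a property to one object changes its hidden class and forces the next access through the slow path.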

So those who would want to reuse the VM only to implement their own languages on top of it are out of luck. V8 isn’t like the JVM or Mono, where you can generate intermediate bytecode; it’s straight Javascript to machine code. On the other hand, it makes cross-compiling to Javascript an interesting option. Theoretically, and with enough optimizations, this thing could be as fast as C, at least on some benchmarks. It’s only going to take time to get to the level of maturity of gcc.

Other than that, as C++ code goes, it’s as clean as it gets. The API is pretty nice, with specialized higher-level classes that wrap Javascript datatypes like String or Number, all held through handles so the garbage collector can track them. If you wanted to provide a readfile(f) function, for example, here’s what the code would look like:

[source:java]
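// Native implementation of a Javascript-visible readfile(f) function.
v8::Handle<v8::Value> ReadFile(const v8::Arguments& args) {
  if (args.Length() != 1)
    return v8::ThrowException(v8::String::New("Read expecting a single string argument"));
  // Handles created below belong to this scope.
  v8::HandleScope handle_scope;
  // Convert the first Javascript argument to a C string.
  v8::String::AsciiValue file(args[0]);
  // Calls a separate ReadFile(const char*) helper overload, not shown here.
  v8::Handle<v8::String> source = ReadFile(*file);
  if (source.IsEmpty()) {
    return v8::ThrowException(v8::String::New("Error loading file"));
  }
  return source;
}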
[/source]

Ah, a couple more things. Some of the standard libraries are written in Javascript. To package everything in one executable, the build translates them into big C arrays holding each character of the source and compiles those. Sounds to me like something others could reuse to build their own executables from Javascript, all bundled with V8. Another goodie is snapshotting. When you build V8 from source, you can have it generate a snapshot of the memory state once the libraries are loaded and this gets packaged in the executable. It makes it slightly bigger but the load time is blazingly fast. Pretty sweet.
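For anyone curious about what the "big C arrays" trick amounts to, here is a rough sketch of the idea in C++. This is not V8's own build tooling, which I haven't reproduced here; `ToCArray` and the array name are hypothetical. It takes a chunk of Javascript source and renders it as a C array definition, one byte value per character plus a terminating zero, which could then be compiled into the executable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: renders `source` as the text of a C array definition
// named `array_name`, one decimal byte value per character plus a trailing 0.
std::string ToCArray(const std::string& array_name, const std::string& source) {
    assert(!array_name.empty());
    std::string out = "static const unsigned char " + array_name + "[] = { ";
    for (std::size_t i = 0; i < source.size(); ++i) {
        // Go through unsigned char so bytes above 127 stay positive.
        out += std::to_string(static_cast<unsigned char>(source[i]));
        out += ", ";
    }
    out += "0 };";
    return out;
}
```

At startup, the embedding program would hand the bytes of such an array to the engine as source text, the same way it would with the contents of a file read from disk. Emitting numbers rather than a quoted string literal sidesteps any escaping of quotes, backslashes, and newlines in the Javascript source.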

I think I covered most of what was interesting. So what do you think? For me V8 is a keeper. We’ll see how it evolves and where the benchmark war leads it, but the foundations are definitely sound. After all, it’s just a first release.

Matthieu Riou on September 3rd 2008 in Uncategorized

16 Responses to “V8 under the hood”

  1. Emson… » Blog Archive » Chrome is a Desktop Web Application platform responded on 04 Sep 2008 at 6:59 am #

    [...] of the relevant JavaScript engines and it seems that V8 is very fast. Matthieu Riou says that V8 is closer to a compiler than a traditional VM. It takes JavaScript code and converts it into low level byte code. It is still early days but we [...]

  2. QuarkBlog » Blog Archive » Chrome responded on 04 Sep 2008 at 9:27 am #

    [...] V8, or Javascript becomes the 4th Google language. [...]

  3. anjan bacchu responded on 04 Sep 2008 at 9:51 am #

    hi there,

    cool.

    “Sounds to me like something others could reuse to build their own executables from Javascript, all bundled with V8. Another goodie is snapshotting. When you build V8 from source, you can have it generate a snapshot of the memory state once the libraries are loaded and this gets packaged in the executable. It makes it slightly bigger but the load time is blazingly fast. Pretty sweet.”

    Can you elaborate on this ?

    Thank you,

    BR,
    ~A

  4. Matt Cruikshank responded on 04 Sep 2008 at 11:49 am #

    Two questions -

    Why does the code say “return v8::ThrowException”? What’s the point of the return?

    What’s with the handle_scope? It’s not referenced in the rest of the code. Is it changing a hidden, global data structure (or thread-local structure) someplace that for instance ReadFile also accesses? Seems odd.

  5. Marcus responded on 04 Sep 2008 at 1:21 pm #

    I’ve always been mildly interested in how VMs actually generate assembler code. They just bundle a code generator in there, which spits assembly to some malloced area, and then they just jump there? How do they preempt? Is it an optimizing compiler, does it deal with register allocation?

  6. Alpha Kilo Hotel responded on 04 Sep 2008 at 1:27 pm #

  7. Demopoly responded on 04 Sep 2008 at 2:24 pm #

    Thanks! I’ve been scanning for an impartial review that talks about salient points rather than fanboy cheerleading. It’s getting harder to find topical discussion on the internet…

    Thank the bloggers for blogs, without which, I would be lost.

    Demo

  8. Drazick responded on 04 Sep 2008 at 2:26 pm #

    Very interesting.
    Thank you.

  9. Nanik responded on 04 Sep 2008 at 3:23 pm #

    Nice writeup !

  10. Matthieu Riou responded on 04 Sep 2008 at 7:02 pm #

    anjan: Elaborate on which part of the mechanism? How one could reuse this or how it is implemented?

    Matt: You can’t reuse C++ exceptions as they need to capture Javascript level information. So you just return an object that actually represents the exception. The usage of handle_scope is documented here: http://code.google.com/apis/v8/embed.html#handles

    Marcus: You’re mostly correct about the jump. Details are slightly more complicated but that’s the idea. For preemption I don’t know exactly for V8 but my guess is that they generate some loopback to their own code. As for optimizing register allocation, I guess the whole point of compiling straight to assembler is to be able to generate your own optimizations, so I would think so, yes.

  11. Roger responded on 04 Sep 2008 at 11:11 pm #

    The whole “dump the process out after initialization and reuse that for faster startup” approach is something that Emacs did for years. What it did was remarkably close to a core dump. Search for “emacs unexec” to get various pointers as to the complexity and bugs.

  12. My daily readings 09/05/2008 « Strange Kite responded on 05 Sep 2008 at 3:42 am #

    [...] Off The Lip » V8 under the hood [...]

  13. Matthieu Riou responded on 05 Sep 2008 at 6:39 am #

    Roger: Sure, the idea isn’t really new. Smalltalk images were sort of similar. It’s just nice that they went the extra mile to implement it. I’m still waiting for the JVM to have something similar after all these years…

  14. Engine JavaScript V8: Tan rapido como C++? | Seraphinux responded on 05 Sep 2008 at 7:38 am #

    [...] C++; well then, now let’s move on to the article I was telling you about. You can find this article at V8 under the hood; as you will notice it is in English, but “no problem”, the following are my [...]

  15. Marcus responded on 07 Sep 2008 at 10:46 pm #

    Matthieu: Thank you for the response!
    I’m gonna go pull the sources now and have a look.

  16. A Comparison of JavaScript Engines | tips & tricks responded on 19 Jul 2010 at 12:18 pm #

    [...] V8 Under the Hood by Matthieu Riou [...]
