WebAssembly, a new binary execution format for the Web, is starting to arrive in stable versions of browsers. A major goal of WebAssembly is to be fast. This post gives some technical details about how it achieves that.
WebAssembly is also intended to be as fast as native code. asm.js has already come quite close to that, and WebAssembly narrows the gap further.
This post therefore focuses on why WebAssembly is faster than asm.js. Before we start, the usual caveats: performance is tricky to measure, and has many aspects.
Also, in a new technology there are always going to be not-yet-optimized cases, so not every single benchmark will be fast on WebAssembly today.
This post describes why WebAssembly should be fast; where it isn’t yet, those are bugs we need to fix.
With that out of the way, here is why WebAssembly is fast:
WebAssembly is designed to be small to download and fast to parse, so that even large applications start up quickly.
WebAssembly’s binary format is carefully designed with size in mind (indexes are LEB128s, etc.). It is often around 10–20% smaller than the equivalent JavaScript (comparing gzipped sizes).
Parsing is faster as well, which mostly comes down to binary formats being faster to parse, especially ones designed for that.
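To make the LEB128 point concrete, here is a minimal sketch of unsigned LEB128 encoding and decoding for 32-bit values. The function names are made up for illustration and this is not code from any browser or toolchain; it only shows why small indexes end up as a single byte.

```typescript
// Encode an unsigned 32-bit integer as unsigned LEB128:
// 7 payload bits per byte, with the high bit set on every byte except the last.
function encodeULEB128(value: number): number[] {
  if ((value >>> 0) !== value) {
    throw new RangeError("expected an unsigned 32-bit integer");
  }
  const bytes: number[] = [];
  do {
    let byte = value & 0x7f;   // take the low 7 bits
    value = value >>> 7;       // drop them from the remaining value
    if (value !== 0) {
      byte |= 0x80;            // continuation bit: more bytes follow
    }
    bytes.push(byte);
  } while (value !== 0);
  return bytes;
}

// Decode a single unsigned LEB128 value from the start of a byte array.
function decodeULEB128(bytes: number[]): number {
  let result = 0;
  let scale = 1;
  for (const byte of bytes) {
    result += (byte & 0x7f) * scale;
    if ((byte & 0x80) === 0) {
      return result;           // last byte reached
    }
    scale *= 128;
  }
  throw new RangeError("truncated LEB128 sequence");
}
```

With this scheme any index below 128 costs one byte, and only rare large values pay for extra bytes, which is part of what keeps the binary compact.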
WebAssembly also makes it easy to parse (and optimize) functions in parallel, which helps a lot on multicore machines.
Total startup time can include factors other than downloading and parsing, such as the VM fully optimizing the code, or downloading additional data files that are necessary before execution, etc.
But downloading and parsing are unavoidable and therefore important to improve upon as much as possible.
All the rest can be optimized or mitigated, either in the browser or in the app (for example, fully optimizing the code can be avoided by using a baseline compiler or interpreter for WebAssembly, for the first few frames).
asm.js made it easy for VMs to use a lot of the full power of CPUs, but as a subset of JavaScript it could only use features that JavaScript can express.
WebAssembly isn’t limited in that way, and lets us use even more CPU features, such as:
- 64-bit integers. Operations on them can be up to 4x faster. This can speed up hashing and encryption algorithms, for example.
- Load and store offsets. This helps very broadly, basically anything that uses memory objects with fields at fixed offsets (C structs, etc.).
- Unaligned loads and stores, avoiding asm.js’s need to mask (which asm.js did for Typed Array compatibility purposes). This helps with practically every load and store.
- Various CPU instructions like popcount, copysign, etc. Each of these can help in specific circumstances (e.g. popcount can help in cryptanalysis).
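To illustrate the 64-bit integer point: asm.js has no 64-bit integer type, so a 64-bit operation has to be built out of 32-bit pieces. The sketch below shows what a 64-bit add looks like when emulated that way; the `U64` type and `add64` function are hypothetical names for illustration, not Emscripten output. In WebAssembly the same operation is a single `i64.add`.

```typescript
// A 64-bit unsigned value stored as two unsigned 32-bit halves.
interface U64 {
  lo: number; // low 32 bits, in the range [0, 2^32)
  hi: number; // high 32 bits, in the range [0, 2^32)
}

// Emulated 64-bit addition (wrapping on overflow), using only 32-bit pieces.
function add64(a: U64, b: U64): U64 {
  const lo = (a.lo + b.lo) >>> 0;          // low half, wrapped to 32 bits
  const carry = lo < a.lo ? 1 : 0;         // a wrapped sum smaller than an operand means overflow
  const hi = (a.hi + b.hi + carry) >>> 0;  // high half plus carry, wrapped to 32 bits
  return { lo, hi };
}
```

Even this simplest case needs several operations and a comparison; multiplication and division are considerably more involved, which is where native 64-bit support pays off.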
How much a specific benchmark benefits will depend on whether it uses the features mentioned above.
We often see a 5% speedup on average compared to asm.js. Further speedups are expected in the future from CPU features like SIMD.
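The load/store offset and unaligned access points can be sketched in plain TypeScript. asm.js reads memory through typed arrays, so a 32-bit load indexes an `Int32Array` with the address shifted right by two, which silently discards the low address bits. Below, a `DataView` stands in for a WebAssembly-style load (conceptually `i32.load offset=8`), which takes a base address plus a constant offset and tolerates unaligned addresses. The function names are hypothetical and this only models the semantics; it is not how a VM implements either one.

```typescript
const heap = new ArrayBuffer(64);
const HEAP32 = new Int32Array(heap);
const view = new DataView(heap);

// asm.js style: add the field offset by hand, then shift to get a typed-array
// index. The shift drops the low two bits, so the address is forced to be aligned.
function loadFieldAsmStyle(ptr: number, fieldOffset: number): number {
  return HEAP32[(ptr + fieldOffset) >> 2];
}

// WebAssembly-like model: base pointer plus constant offset, read at the exact
// byte address (little-endian), even if that address is not a multiple of 4.
function loadFieldWasmStyle(ptr: number, fieldOffset: number): number {
  return view.getInt32(ptr + fieldOffset, true);
}
```

For an aligned pointer the two agree on little-endian machines; for an unaligned pointer the asm.js-style load reads from the rounded-down address instead.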
WebAssembly is primarily a compiler target, and therefore has two parts: Compilers that generate it (the toolchain side), and VMs that run it (the browser side).
Good performance depends on both.
This was already the case with asm.js, and Emscripten did a bunch of toolchain optimizations, running LLVM’s optimizer and also Emscripten’s asm.js optimizer.
For WebAssembly, we built on top of that, but have also added some significant improvements while doing so. Neither asm.js nor WebAssembly is a typical compiler target, and they are atypical in similar ways, so lessons learned during the asm.js days helped do things better for WebAssembly.
- We replaced the Emscripten asm.js optimizer with the Binaryen WebAssembly optimizer, which is designed for speed. That speed lets us run more costly optimization passes. For example, we remove duplicate functions by default when optimizing, which often shrinks large compiled C++ codebases by around 5%.
- Better optimizations for irreducible and convoluted control flow, improving the Relooper algorithm. Helps a lot on compiled interpreter-type loops.
- The Binaryen optimizer was designed with experimentation in mind, and experiments with superoptimization have led to miscellaneous minor improvements, things which could have been done in asm.js too, had we thought of them.

Overall, these toolchain improvements help about as much as moving from asm.js to WebAssembly helps us (7% and 5% on Box2D, respectively).
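The duplicate-function removal mentioned above can be sketched roughly as follows. This is a deliberately simplified illustration, not Binaryen’s implementation: function bodies are assumed to be plain strings, identical bodies are found by exact comparison, and the `Func` type and `removeDuplicateFunctions` name are invented for the example.

```typescript
interface Func {
  name: string;
  body: string; // stand-in for the function's code
}

interface DedupResult {
  kept: Func[];                           // functions that survive
  renamed: { [oldName: string]: string }; // removed name -> name of the kept twin
}

// Keep the first function seen for each distinct body, and record which
// surviving function every removed duplicate should be redirected to.
function removeDuplicateFunctions(funcs: Func[]): DedupResult {
  const canonicalByBody: { [body: string]: string } = Object.create(null);
  const renamed: { [oldName: string]: string } = Object.create(null);
  const kept: Func[] = [];
  for (const f of funcs) {
    const canonical = canonicalByBody[f.body];
    if (canonical === undefined) {
      canonicalByBody[f.body] = f.name;   // first occurrence becomes canonical
      kept.push(f);
    } else {
      renamed[f.name] = canonical;        // duplicate: calls must be retargeted
    }
  }
  return { kept, renamed };
}
```

A real optimizer would also rewrite call sites using the rename map, and would likely repeat the pass, since merging two functions can make their callers identical too.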
asm.js could run at basically native speed, but it never actually did so in all browsers consistently.
WebAssembly, on the other hand, has been designed jointly by all major browsers.
There is still plenty of room for differentiation in VMs (different ways to tier compilation, AOT vs. JIT, etc.), but a good baseline of predictable performance can be expected across the entire Web.
Alon founded the Emscripten project in 2010.
Disclaimer: All opinions, thoughts, and work here are my own and do not represent any of my employer’s thoughts or work.
Edit (12/21/18): Added Firefox performance bug / issue.
When developing for the web, there have been plenty of times where I couldn’t bring my ideas to fruition due to browser performance.
Browsers do not run instructions directly like a compiled executable written in C.
I’ve built more than a handful of Cordova / Ionic, Electron, and Progressive Web Apps (PWA) to allow myself to have the portability and flexibility of the web, but I knew it was always at the sacrifice of performance. So the moment I heard the whispers of WebAssembly (Wasm), I knew I had to jump in on it.
About a year ago, I started a new personal project called WasmBoy.
WasmBoy is a Gameboy / Gameboy Color Emulator, written for WebAssembly, to help me learn WebAssembly.
Gameboy emulation has been playable in browsers on mediocre desktop devices for a while now, but hardly playable in browsers on mobile devices.
Therefore, one of the goals I had with WasmBoy was to bring playable Gameboy Emulation to budget mobile phones and Chromebooks.
More importantly, with WasmBoy, I wanted to answer my question: “Will WebAssembly allow web developers to write code that is almost as fast as native code for web browsers, and that works alongside the ES6 code that we write today?”
The core of WasmBoy is written in AssemblyScript, which compiles a strictly typed variant of TypeScript to WebAssembly using Binaryen.
AssemblyScript is amazing. AssemblyScript allows Web developers to write more performant code in a new technology, using tools they are already comfortable with.
WasmBoy is compiled to WebAssembly using the AssemblyScript compiler. However, if we take a step back, we can see that it is possible to mock out the AssemblyScript global functions that we call within our TypeScript code base.
Therefore, we can use the TypeScript compiler on the same code base that we use the AssemblyScript compiler with, which gives us two different outputs in two different languages (WebAssembly and JavaScript), using mostly the exact same source code!
These cores are compared in a WasmBoy Benchmarking tool that we will get into in greater detail later.
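As a rough idea of what such mocking can look like, here is a minimal sketch. In AssemblyScript, names like `u8`, `u16`, and `i32` are built-in types with matching conversion functions; under the plain TypeScript compiler they do not exist, so we can supply stand-ins. Which globals WasmBoy actually mocks, and how, is not shown here: the shims and the two small “core” functions below are assumptions made for illustration.

```typescript
// Hypothetical shims for the plain TypeScript build. Types and functions live
// in separate namespaces in TypeScript, so each name can be declared as both.
type u8 = number;
type u16 = number;
type i32 = number;

function u8(value: number): u8 {
  return value & 0xff;      // wrap to an unsigned 8-bit value
}
function u16(value: number): u16 {
  return value & 0xffff;    // wrap to an unsigned 16-bit value
}
function i32(value: number): i32 {
  return value | 0;         // wrap to a signed 32-bit value
}

// Example "core" code written only against those names.
function concatenateBytes(highByte: u8, lowByte: u8): u16 {
  return u16((i32(highByte) << 8) | i32(lowByte));
}

function addWrapped(a: u8, b: u8): u8 {
  return u8(i32(a) + i32(b)); // 8-bit addition that wraps around, as on the CPU being emulated
}
```

The point of the pattern is that the core code never touches anything compiler-specific directly, so the only thing that changes between the two builds is where those names come from.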
There are a handful of other benchmarks out there that test WebAssembly vs. JavaScript.
Another common benchmark is a comparison of the two different compiler outputs of Emscripten.
Emscripten takes LLVM bitcode generated from C/C++ and compiles it down to asm.js or WebAssembly.
Colin Eberhardt, who runs WebAssemblyWeekly on Twitter, has a great response / TL;DR to one of the micro-benchmark Stack Overflow questions on the problems with micro-benchmarking, and how Wasm should give about a 30% increase over asm.js in a real-world case.
Here is a link to the paper they are referring to for the Wasm performance increase claimed in the Stack Overflow response.
Also, Colin has an A M A Z I N G talk on WebAssembly.
The talk has a section that does a ton of comparisons of Wasm vs. JS performance, and the talk illustrates this in much more detail than that response linked above.
In terms of other “Real world” WebAssembly Benchmarks, PSPDFKit has a great benchmarking tool and article on WebAssembly performance in a production application.
I highly suggest giving that article a read as well if you are interested in this topic as it provides another point of view, and they did a great job comparing the two.
However, the PSPDFKit benchmark does the comparison between WebAssembly and asm.js, and not WebAssembly and ES5/ES6.
Therefore, the PSPDFKit benchmark is great if you are a developer with a large C/C++ application and want to know if moving from asm.js to WebAssembly is a great idea (which it is).
Game emulation in general stresses almost every part of a language / platform, since it requires graphics, sound, and controller input, and presents several interesting challenges such as performance and flexibility.
Emulation tends to be very computationally intensive, which makes it a great fit for WebAssembly.
Also, WasmBoy is in the unique position to compare transpiled ES5 code from a popular compiler (TypeScript) to WebAssembly.
Therefore, I thought WasmBoy would be a great fit for this type of benchmark.