Notes: Building a Better Web Browser

These are my cursory notes from a talk given by James Mickens of Microsoft Research, in March 2015, titled "Building a Better Web Browser".

Notes legibility estimate: MEDIUM

---

The State of Progress

Chrome and Opera isolate the renderer in separate processes -- this allows tabs to crash on their own. ...but the issue is that the browser is still a monolithic kernel.

Servo -- extra threading! ...but still monolithic.

The problem: Browser developers take the monolithic design as a given, and tinker around the edges.

The Problem

What is a browser trying to do? Provide services for origins -- rendering, computation, I/O + messaging.

  • An origin is the tuple <protocol, host, port>

Render: HTML, CSS, MathML, ARIA, WebGL, video, canvas, images

I/O: XHR, DOM, IndexedDB, cookies, FileReader, browser cache, AppCache
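
As a concrete aside (my own sketch, not from the talk): the origin tuple above is just <protocol, host, port>, so a same-origin check is a comparison of those three fields. The URLs below are made-up examples.

```js
// Minimal sketch of the origin tuple <protocol, host, port>.
// URL is the standard WHATWG parser (available in browsers and Node).
function sameOrigin(a, b) {
  const u1 = new URL(a);
  const u2 = new URL(b);
  return u1.protocol === u2.protocol &&
         u1.hostname === u2.hostname &&
         u1.port === u2.port;
}

console.log(sameOrigin('https://bank.com/login', 'https://bank.com/home')); // true
console.log(sameOrigin('https://bank.com/', 'https://attacker.com/'));      // false
console.log(sameOrigin('https://bank.com/', 'http://bank.com/'));           // false: protocol differs
```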

Currently: browsers provide services for origins, but those services are high-level and complex. You wouldn't ask your operating system to implement Emacs in the kernel. ...well, you might. That was a test; I've already called the police on you.

The proposed split --

Kernel: network, UI, storage (concurrency)

Usercode: rendering, JavaScript, parser


Atlantis

Taglines: security, performance, robustness

But first, some depressing things...

Browsers offer many opportunities for parallelism! e.g., HTML parsing, rendering, network and disk I/O. But current architectures limit concurrency and suffer from races.

But the DOM model says: JS is single-threaded, and cannot see browser locks. So: JS execution should effectively hold a big lock over the DOM. What actually happens: NOT THAT.

But in reality, there's just no concurrency around the DOM. None. (And yet web apps still race -- more at: Race Detection for Web Applications, Petrov et al., PLDI '12.)
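
To make the race point concrete, here's a tiny made-up example (mine, not from the talk or the paper) of the kind of logical race Petrov et al. detect; '/profile', '#show', and render() are placeholder names.

```js
// Hypothetical page code: two logically concurrent handlers touch the same
// state, and the outcome depends on which one the browser dispatches first.
let profile = null;

fetch('/profile')
  .then(res => res.json())
  .then(json => { profile = json; });      // handler A: network completion

document.getElementById('show').onclick = () => {
  render(profile);                         // handler B: sees null or the data,
};                                         // depending on dispatch order

// The single-threaded DOM model gives the developer no way to express the
// ordering constraint between A and B; it hides the race rather than
// preventing it.
```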

idiosyncratic abstraction (n.) an abstraction that is inconsistently applied throughout an application.

HTML5 screen-sharing attacks: the SOP should prevent attacker.com from opening and screen-scraping bank.com (even after screen sharing is enabled). What actually happens: NOT THAT.

Principal instance: web application code + data, plus the scripting runtime (e.g., HTML/CSS/JS parsers, DOM tree, WebKit-style parsing) ...but if you wanted to, you could roll a Haskell runtime and a LaTeX stack, or, like, whatever you wanted.

Kernel: Cross-principal messaging, principal creation, network, UI, storage.

Interfaces --

  • Computation: createPI(), sendMsg(), listenMsg()
  • Storage: put(), get()
  • External: HTTPStream
  • IO: renderImg(), listenGUI()
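
Pulling those names into one place, here's a rough sketch of how this kernel interface might look from inside a principal instance. Only the method names come from the notes above; the argument shapes and the in-memory stubs are my guesses, not the actual Atlantis API.

```js
// Illustrative stubs only -- not the real Atlantis kernel signatures.
const storage = new Map();

const kernel = {
  // Computation
  createPI(originUrl)        { console.log('spawn principal instance for', originUrl); },
  sendMsg(targetPI, payload) { console.log('cross-principal message to', targetPI, payload); },
  listenMsg(handler)         { /* register handler for incoming cross-principal messages */ },

  // Storage (per-principal key/value store)
  put(key, value) { storage.set(key, value); },
  get(key)        { return storage.get(key); },

  // External I/O
  HTTPStream(url) { console.log('open raw HTTP stream to', url); },

  // IO / UI
  renderImg(bitmap, x, y) { console.log('blit bitmap at', x, y); },
  listenGUI(handler)      { /* register handler for raw mouse/keyboard events */ },
};

// Everything higher-level (DOM tree, layout, script runtime) is user-level
// code inside the principal instance, built on top of calls like these.
kernel.put('session', { user: 'alice' });   // made-up usage
console.log(kernel.get('session'));
```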

I think that the browser community has accepted the notion that if we want fast performance and rapid innovation, we need to put up with a really grotesque software structure. Well, maybe for real bare-metal performance... but we don't really need that. So all you undergrads out there who only code in C...get hip!

Good features we'd like to keep:

  • Easy to inspect pages
  • Easy to compress/validate code

Atlantis executes abstract syntax trees! -- see JSZap: Compressing JavaScript Code (Burtscher et al.), and Slim Binaries (Franz and Kistler; ACM paywall).
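
As a toy illustration (mine -- the node shapes are made up, not JSZap's actual encoding) of what "shipping and executing an AST instead of source text" means:

```js
// The page ships a tree like this instead of the string "x + 1".
const ast = {
  type: 'BinaryOp', op: '+',
  left:  { type: 'Identifier', name: 'x' },
  right: { type: 'Literal', value: 1 },
};

function evaluate(node, env) {
  switch (node.type) {
    case 'Literal':    return node.value;
    case 'Identifier': return env[node.name];
    case 'BinaryOp': {
      const l = evaluate(node.left, env);
      const r = evaluate(node.right, env);
      return node.op === '+' ? l + r : l - r;
    }
  }
}

console.log(evaluate(ast, { x: 41 }));  // 42
```

The connection to the citations: a tree like this is both easier to compress and easier to validate than raw source text, which is exactly the "compress/validate" property listed above.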


Common case: you don't have to write the extensible web stack yourself -- you'll use GoogleStack or FacebookStack; similar to choosing jQuery vs. Angular vs. ...

So... how much bigger is an Atlantis page? Well, bigger. But remember: caching saves 81% of bytes and 75% of requests. Anyway, the backwards-compatible default stack is 762 KB; the average web page is 1.6 MB, so that's roughly 47% bigger (762 / 1600 ≈ 0.47).

Typically, what you (contra. Gizmodo) care about in your browser performance is the ability to run a series of event handlers, not straight-line execution.

Deny-by-default SOP protection.

Presently: the renderer and the JS engine have different heaps -- they can only communicate via RPC. Ex: the .innerHTML interface. (Obviously a dangerous interface...) As a web dev, you'd like to be able to interpose on the innerHTML interface. We should be able to shim it: grab a reference to the old innerHTML setter, sanitize the input string, and then call through. Nope!
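
For reference, this is roughly the shim a developer would like to write (my sketch; sanitize() is a made-up placeholder, and whether any of this is even visible to page code depends on the browser's split between the JS heap and the renderer, which is exactly the complaint here):

```js
// The interposition a web developer wants: wrap the innerHTML setter.
// In a stock browser the real setter lives on the renderer side of an RPC
// boundary, so the page can't reliably see or wrap what it actually does.
const desc = Object.getOwnPropertyDescriptor(Element.prototype, 'innerHTML');
const oldSetInnerHTML = desc.set;

Object.defineProperty(Element.prototype, 'innerHTML', {
  ...desc,
  set(markup) {
    oldSetInnerHTML.call(this, sanitize(markup));  // sanitize, then call through
  },
});
```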

In a regular browser, web developers can't control very many invariants, so adding new features opens portals to demon worlds.

Atlantis... Sanitize innerHTML strings? I don't need anyone's permission!


  1. Modern web browsers make it difficult to create fast, secure, robust web pages.
  2. This happens because browsers use bad abstractions, and developers lack the ability to fix them.
  3. Refactoring the browser and giving developers more control solves these problems.

Future Work: New abstractions for web privacy -- the private logical machine:

  • Strictly follows your commands;
  • Disappears when you turn it off, and takes data with it.

Presently: your dataset is sharded among many DSPs, who do what they want; MVC-wise, your devices are just views for models and controllers that live elsewhere. Note: encryption is necessary but insufficient for this abstraction.

Server-side: encrypted computation. Client-side: private browsing (does it work very well right now? probably not).


Q: What if the WebDev doesn't want to send a whole stack, and just wants to deliver native-code instead? A: Yeah, sure! Why not?

Kohler: Q: Since all of the kernels we're using are microkernels, we're all going to be using microkernel browsers by 2020, right? A: Well, we're not on micros yet, but even so, the microkernel movement has had an influence on the sorts of monolithic kernels we see.

Browsers are currently undergoing what Marx predicted would happen to Capitalism, being torn apart by internal contradictions.

Q: So what's the first thing they'll take from your work? A: The AtlantisKernel API.

Q: So what's the story on hardware APIs, for GPUs, webcams... A: Think of it like a USB-type thing; there's a format it needs to go through for the webpage. Re: GPU, we're not quite certain what the right interface is... It's tricky.

Q: What happens to the market in add-ons and extensions, which depend on everything being shipped out as markup? A: Pages themselves might have to opt-in to that, providing those sorts of interfaces to their data.


Postscript: It occurs to me, a few days later, that what James is describing is the idea of mutating individual sites from Unix-philosophy, textbuffer-to-textbuffer piping applications to a server-and-ports, things-happen-on-the-inside model more reminiscent of full-fledged modern applications. In this framing, it is not at all surprising to me that the former is the way things have grown, and that people are now suggesting a jump away from it. (I could speculate at the coincidence of James's suggestion and his position at that non-Unix company, Microsoft, but that might perhaps be impolitic.)