Comment Not a timing attack as claimed (Score 2) 181

Access to very accurate timers improves the efficiency of these attacks, but timers are not the core of the attack, and the attack still works with far less accurate timers. This has already been explained with proof-of-concept examples, see https://weblll.org/index.php/s... Attempts were made to have WebAssembly support mitigation techniques efficiently, but those in control appear to have had different plans, and appear to be working towards running 32-bit indexing wasm in a large block of 64-bit address space to constrain attacks. Obviously that fails for 64-bit indexing wasm and is of no help on a 32-bit OS.
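
A toy simulation of the amplification idea (not the linked proof of concept, just an illustration with made-up numbers): even a timer quantized to coarse ticks can distinguish a fast path from a slow path once the measured operation is repeated enough times.

```javascript
// Simulated clock in abstract "cycles"; the timer only reports whole ticks.
const TICK = 1000;

function measure(opCost, repetitions) {
  const elapsed = opCost * repetitions;       // true cost in cycles
  return Math.floor(elapsed / TICK) * TICK;   // what the coarse timer reports
}

// One run: both costs round down to the same tick, so they are indistinguishable.
const single = [measure(3, 1), measure(5, 1)];            // [0, 0]

// Amplified: repeat 10000 times and the difference dwarfs the tick size.
const amplified = [measure(3, 10000), measure(5, 10000)]; // [30000, 50000]
```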

Comment Re:Mod parent up? (Score 1) 235

PNaCl appears to still use NaCl to sandbox the code, so it would appear that PNaCl could have been a user-definable language that translated to NaCl on the client side to run, and is perhaps not so interesting from the perspective of a sandbox VM. Whereas wasm can be validated in a single-pass decoder, with runtime bounds checks emitted as necessary. Some distinctions that help here are that wasm has structured control flow (blocks and loops) and a 'safe' stack, and that indirect functions are referenced by an index into a table.

I had been trying very hard to get better type derivation into the JS compilers so they could better optimize away bounds checks known to be unnecessary (I don't think PNaCl could do that), but this work was effectively blocked in V8, Firefox, and JSC (some patches remain open, some were just closed). As it is, some wasm VMs emit inline bounds checks and some use memory protection etc., so wasm is more flexible than NaCl. My impression is that NaCl (the PNaCl sandbox) is too close to the machine code, requiring machine code patterns to be verified, and that a slightly higher-level language that allows some rewriting (bounds-check styles etc.) as the code is emitted has more potential. Wasm is designed to be decoded to SSA in a single pass, but fwiw I don't see loops decoding to simple SSA so easily; I think it would have helped a lot to have encoded the loop stride explicitly, and other design work appeared necessary in the area of loops.
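
The up-front validation point can be seen directly from the wasm 1.0 JS API: the 8-byte preamble below ("\0asm" plus version 1) is the smallest valid module, and flipping a byte makes validation fail before any code runs.

```javascript
// "\0asm" magic followed by version 1: the minimal valid wasm module.
const empty = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
// Same bytes with an unknown version number: rejected by the validator.
const bad   = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x02, 0x00, 0x00, 0x00]);

const okEmpty = WebAssembly.validate(empty); // true
const okBad   = WebAssembly.validate(bad);   // false
```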

Getting back to the idea of a translation layer and build file: if these had been defined then it might have been possible for web developers to target NaCl where available and asm.js otherwise, from the same web-developer-defined encoding, on the client side. The web browser could do the caching and even share the cached results across origins, as it could trust the build process. Thus in the end the focus on a common wasm language seems to have missed the key design requirement. Wouldn't it have been great if instead we now had the infrastructure to allow the web browsers to compete on better encodings and VMs, and web developers to work on languages and encodings upstream of the translation layer?

Comment Disappointing outcome (Score 2) 235

I worked on this development for over two years, since before the WebAssembly CG was created, and have been demonstrating some of the best performance from asm.js and wasm-style code, and I believe the process and the outcome have been somewhat of a failure. There are some positive outcomes, such as a common set of operators, but this seems just a small step from asm.js (adding some 64-bit operators etc.).

At the end of this process it became clear that seeking a single virtual machine or encoding was not a good outcome, because it means the web community is stuck with the lowest common supported feature set. The key enhancement that was required was a translation layer from the deployed binary to a VM that might be somewhat specific to a particular web browser, plus a build process. This would have allowed the major web browsers to offer different competing solutions, and web developers could still target all of them by having the translation layer rewrite the code for each. While this can be done in part with the wasm 1.0 version released, I do not believe it is well supported, because it needs to work very well with streaming and caching.
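
A minimal sketch of the translation-layer idea (all names here are hypothetical, and the translators are stand-in string functions rather than real encoders): the deployed artifact stays in one upstream encoding, and a per-browser translator lowers it to whatever the local engine actually supports.

```javascript
// Stand-in translators from an upstream source encoding to a browser target.
const translators = {
  wasm:  (src) => `wasm:${src}`,
  asmjs: (src) => `asmjs:${src}`,
};

function translate(src, supports) {
  // Pick the best target the local engine supports; fall back otherwise.
  const target = supports.includes('wasm') ? 'wasm' : 'asmjs';
  return translators[target](src);
}

const modernResult = translate('f(x)=x+1', ['wasm', 'asmjs']); // lowered to wasm
const legacyResult = translate('f(x)=x+1', ['asmjs']);         // lowered to asm.js
```

The point of the sketch is that the developer ships one artifact, while the choice of target encoding (and any browser-specific rewriting) happens on the client side, where it can also be cached.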

Caching can work well with named sources and their versions, plus the dataflow that produces the translated output and the compiled output. For these products to be safely shared across origins the web browser needs to control the inputs into this build process: a defined and sandboxed build pipeline was needed. Scheduling of the builds could also be far better managed by the web browser, using global knowledge that is not practically available to each context.

Optimizing memory accesses is also critical. The key sandbox requirement is to ensure that accesses to the linear memory are contained and therefore safe, and achieving near-native performance requires some design attention to this challenge. My impression is that the design of wasm assumed that only memory protection would be used to efficiently catch accesses that are out of bounds. Early on I had been demonstrating very good performance using a pointer-masking technique, masking off the high bits so the VM can derive that the index is within bounds, and even for translated C code this was giving very good performance. Code that uses tagged pointers and naturally masks off those tags could mask off the high bits at the same time, almost for free. For this to work well requires that the memory be a power of two plus a spill area, and that the masking be baked into the code; this could be done in a translation layer if the linear memory size were negotiated and made an input into the pipeline. The WebAssembly project just would not accommodate this use case; my patches across the JS compilers were stalled, and some still sit there today with no progress in the past two years.
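
A minimal sketch of the containment property behind pointer masking, modeled here with a typed array rather than a real wasm linear memory: when the memory size is a power of two, masking the index guarantees the access lands inside the buffer, so no branch-based bounds check is needed.

```javascript
const SIZE = 1 << 16;            // 64 KiB, a power of two
const MASK = SIZE - 1;
const memory = new Uint8Array(SIZE);

function load(index) {
  // The mask discards the high bits, so any index stays in [0, SIZE).
  return memory[index & MASK];
}

memory[5] = 42;
const inBounds = load(5);        // 42
const wrapped  = load(SIZE + 5); // also 42: the out-of-range bits are masked away
```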

With pointer masking it helps to move the masking before small offsets are added to indexes, so that the machine's index+offset addressing modes can be used; this helps with hoisting the masking and reduces register pressure etc. One complication is that this only works if the index is non-negative, so that the offset does not wrap the index, and some C code would break without extra care. Emscripten refused to merge patches to improve support for this, claiming that it was not a good approach. Guess what: wasm adopted a similar memory access offset restriction, that indexes must be positive! There is a lot more to optimizing this, and I see little progress in the past two years. I can still demonstrate some of the best performance.
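
A sketch of why the ordering matters (toy address arithmetic, not a real VM): masking before the constant offset keeps every final address in [0, SIZE + offset), so a small spill/guard area after the memory suffices and the machine's index+offset addressing mode can add the offset for free. Masking after the add instead lets a negative index cancel the offset.

```javascript
const SIZE = 1 << 16;
const MASK = SIZE - 1;
const OFFSET = 8;                       // small constant offset baked into the access

function maskBeforeAdd(index) {
  return (index & MASK) + OFFSET;       // always below SIZE + OFFSET
}

function maskAfterAdd(index) {
  return (index + OFFSET) & MASK;       // a negative index can wrap
}

// (-8 & MASK) + 8 = 65528 + 8 = 65536: lands in the spill area, still contained.
const contained = maskBeforeAdd(-8);
// (-8 + 8) & MASK = 0: the offset is cancelled, aliasing a different address.
const aliased = maskAfterAdd(-8);
```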

Another memory-related optimization is to place the base of the asm.js/wasm linear memory at absolute zero in the address space, so that a register is freed from having to point to this base and the machine instruction addressing modes can be better exploited. I had demonstrated how much this helped well before the WebAssembly CG was created. Practically, many systems already protect access to the low pages, so code using this strategy would need to avoid using those low pages too. There was no accommodation for this optimization; the CG would not even add a flag to allow code to opt in to not using the first 64k page of the linear memory. All the deployed wasm applications that use the first page will block a VM from using this strategy. Just another disappointing outcome, and contrary to the goal of near-native performance.

Not all code could use a linear memory at absolute zero in the address space, so clearly the properties of the linear memory need to be inputs into the build process, and inputs into the cache keys. It would also help to be able to defer allocation of the linear memory until after compilation: compilation on multiple threads can use a fair bit of memory, so it helps to not have already allocated a big block for the linear memory while compiling. These design constraints all point towards the need for a build process. The wasm 1.0 API does not even fit this pattern, as it is not possible to compile the code and store that compiled code in IndexedDB for re-use, because it might need to be re-compiled; it's a dead-end design. What was needed was a build file, so that the web browser could cache the compiled results, reuse them on a cache match, and re-build just those paths of the dataflow necessary on a miss.
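
The cache-keyed build idea can be sketched as follows (all names hypothetical, with a string product standing in for real compilation): each step's key is derived from its named inputs and their versions, so cached products can be reused across origins and only the paths whose inputs changed are re-run.

```javascript
const cache = new Map();
let compileRuns = 0;

function build(step) {
  // Cache key derived from the named input and its version.
  const key = `${step.name}@${step.version}`;
  if (cache.has(key)) return cache.get(key); // hit: reuse, no rebuild
  compileRuns++;                             // miss: re-run this path only
  const product = `compiled(${key})`;        // stand-in for real compilation
  cache.set(key, product);
  return product;
}

build({ name: 'app.wasm', version: '1.0' });
build({ name: 'app.wasm', version: '1.0' }); // hit: not recompiled
build({ name: 'app.wasm', version: '1.1' }); // miss: the version changed
```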

A defined build process might also have worked well in non-web contexts, allowing code to be updated and re-built ready for use, or allowing updates to be prepared and switched quickly, etc. I have also been exploring deploying wasm-style code to IoT devices; here the actual building might occur off the device, with the compiled code being deployed but still using the wasm-style sandbox.

There were other issues with the design too, perhaps more bike-shedding, but perhaps important for making tools simpler and compilers faster and simpler. The 'stack machine' idea has some merit, but the CG refused to explore adding a 'pick' operator that would have made it much cleaner to work with. Wasm block fall-through paths do not unwind values left on the stack, but rather fail with a validation error if the block stack does not match the required values; it would have been a little easier to work with without that restriction, much more so when using 'pick'. With 'pick' and a few other minor changes wasm could be a functional-style SSA encoding, which avoids the need for the decoder to do the SSA conversion, would reduce the attack surface, and would probably speed up the decoder. The SSA encoding also gave a much nicer language to read and work with for producers (in my subjective opinion). A range of other issues with the wasm 1.0 encoding that frustrate an SSA encoding were simply closed and not considered, for no technical reason, and they will still cause problems in future. For example, the arguments are in local variables rather than on the stack, and the br_table operator passes the same number of values to all targets, which does not work well for a functional SSA style; just using a br_case operator seemed to work much better and gave much nicer and flatter code.
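
A toy stack machine (a hypothetical encoding, not wasm's) showing what 'pick' buys: pick(n) copies the value n slots below the top, so an earlier value can be reused without shuffling it through locals.

```javascript
function run(program) {
  const stack = [];
  for (const [op, arg] of program) {
    if (op === 'const') stack.push(arg);                       // push an immediate
    else if (op === 'add') stack.push(stack.pop() + stack.pop()); // pop two, push sum
    else if (op === 'pick') stack.push(stack[stack.length - 1 - arg]); // copy from depth arg
    else throw new Error(`unknown op: ${op}`);
  }
  return stack;
}

// Compute x + x while keeping x on the stack: const 7; pick 0; add
const result = run([['const', 7], ['pick', 0], ['add']]); // [14]
```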

Given an SSA encoding, the design could have explored better encoding of definition live ranges, and anything else that could be done to support a fast compiler. There are just no really fast wasm compilers that integrate closely with the encoding, so this design area just does not appear to have been explored.

The WebAssembly CG was created in back-room discussions. I was an active member in this area and was not consulted, and was not invited to a single meeting of the group. Sometimes design documents were released after decisions were made. The chairs were not elected by the group as a whole, and there was no call for consensus from the group as a whole to publish wasm 1.0; rather, four of the web browsers just released wasm 1.0, which is what you see them advertising. I appealed the decision for Mozilla to enable wasm in release builds, and not a single member of the whole Mozilla community defended the technical merits of that decision, and the appeal was not answered. I believe I can still demonstrate the best performance and articulate what needs to be done.

It seems a very disappointing outcome to me, and I am just taking stock of where to go next. If they had done the job well then I might have been able to offer an alternative web browser that met other use cases better, but without the translation and caching working well even this is not practical. Is there any chance of getting web developers to use a 'makefile' to describe the build process for their applications, and to have this be implemented in JS using the wasm 1.0 API for a start, while leaving the door open to a build process implemented in the browsers? With everyone crowing about how great wasm is, what is the chance, and how much work would it take to get this support?
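
The JS side of such a 'makefile' would bottom out in the wasm 1.0 API. As a sketch, the bytes below are the standard binary encoding of a module exporting add(a,b)=a+b; a real build script would fetch or translate sources and cache the resulting bytes by key, then compile and instantiate as here.

```javascript
// Hand-encoded wasm 1.0 module: (func (export "add") (param i32 i32) (result i32) ...)
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm" magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32,i32)->i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one func, type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0; local.get 1; i32.add; end
]);

const module = new WebAssembly.Module(bytes);           // synchronous compile
const { add } = new WebAssembly.Instance(module).exports;

const sum = add(2, 3); // 5
```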
