💾 Archived View for gemini.susa.net › gen › A_Technical_Due_Diligence_of_WASM.gmi captured on 2023-07-10 at 13:50:44. Gemini links have been rewritten to link to archived content
WASM gets touted by quite a few thought leaders and early-stage VCs as the hot new thing that will be used for backend app development. I've been skeptical of these claims for years now. Here's the thing - most VCs are not technical, and the ones that are probably don't code much at all anymore. To be fair - it's not their business to be technical - it's their business to evaluate good_business_opportunities. Of course, when you are evaluating technology businesses you don't want to wind up funding a Theranos, nor do you necessarily want to be funding a GameStop, the latter of which Silly/Con Valley is filled to the brim with.
So what do they do when presented with shiny new technology that they can't vouch for? After all - they get pitched thousands of businesses each year. They ask other engineers they respect for their opinion - this is called 'technical due diligence'.
Let me put on my engineer hat and tell you why WASM won't ever be what some people hope it will be. Keep in mind that this article is written with a heavy bias. Our company is dominated by kernel engineers, so we clearly see the software world through a different lens. Also, if you are a developer and want to use WASM - knock yourself out - this isn't really written to beat WASM itself down. It's more to challenge the assumption that it will be a paradigm-changing application deployment mechanism - which it won't. Finally, most parts of this article address WASI rather than WASM itself, because without WASI the arguments for the ecosystem really don't make sense.
First off, WASM doesn't do raw sockets. In the browser, WASM gets websockets, which run over TCP only, and even then you can't craft your own TCP packets - you must go through the websocket API.
What uses raw sockets? The example you might be most familiar with is ICMP or ping.
Raw sockets will never be implemented because they are such a huge security issue. Even on normal non-browser Linux systems they create vast security issues, so browsers have just nixed them, which is reasonable. I'm sure lots of people have tried and will try, but without some sort of heavy capability lockdown it's not going to happen.
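To make the raw-socket point concrete, here is a minimal Python sketch of what a ping-style tool has to do on a native system: build an ICMP echo request by hand, checksum it, and ask the kernel for a raw socket. The helper names (internet_checksum, build_echo_request, try_open_raw_socket) are made up for illustration and are not part of any WASM or WASI API. The last step usually fails without elevated privileges, which is the security concern described above.

```python
import socket
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for an echo request (RFC 792)


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMP echo request: type, code, checksum, id, sequence, payload."""
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, ident, seq) + payload


def try_open_raw_socket():
    """Ask the OS for a raw ICMP socket; return None if the OS refuses."""
    try:
        return socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_ICMP)
    except OSError as exc:  # typically a permission error for unprivileged users
        print(f"raw socket refused: {exc}")
        return None
```

Building the packet is plain byte manipulation that any sandbox could do; it is the call in try_open_raw_socket that needs a capability a browser-style sandbox is unlikely to hand out.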
Most games that people play use UDP for network latency reasons. Web browsers utilize HTTP, which is in turn built on TCP. Ever wondered why Flash games "sucked" so much? There are many reasons but that is one core reason.
One of the recurring themes you'll see throughout this article is the reference to WASI[1] - the WebAssembly System Interface - which is basically an effort to give you all the missing APIs you'd typically find on a system versus what WASM itself gives you. This is what a lot of people interested in server-side application delivery are looking at.
There are efforts to remedy these networking issues, such as this GitHub[2] ticket that wishes to add support for Berkeley style sockets. However, as we glean from the ticket, the code and comments present their own issues to overcome:
"connect() has never been fast. There have always existed network conditions which can cause it to block for a long time (defined in minutes)."
The WASI_networking_design_issue[3] is equally damning:
"Browsers already have an HTTP API: fetch(). All WASI can provide is a wrapper for that function."
This particular PR is interesting because it also talks about the issues affecting TLS (which we'll address down the list).
Raw sockets and the Berkeley socket API are super important for anything server-side. This issue alone is a major blocker.
Next up - WASM doesn't have a stable dynamic linking story yet - it's a WIP[4] - which seems to be the case for most things in this environment.
99% of the software you use on Linux is dynamically linked. Unless you are using some alt-libc (you aren't) you are probably linked dynamically. You could re-compile and re-link statically but very few people do that - definitely not the ones consuming end software through apt-get. If you are using a Mac this issue is further complicated immediately, and the new M1s will complicate it even more. I think most of the apologists here contend that there would be a maintainer to have these re-packaged to get around this limitation - that's an OK response.
WASM is 32-bit only - that means you can only address 4 GB of memory. That is way too small for any type of database more than a toy and, not to be blunt about it, this probably excludes a ton of Java applications as well (read: enterprise adoption).
WASM was originally constrained to only 2 GB but_the_limit_was_pushed_to_4Gb_last_year[5].
Databases are not the only thing affected by this: so is any "in-memory" software, a category that keeps getting more popular. So think of all the software that data engineers use in their day to day work - this nixes all of that.
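The arithmetic behind those limits can be sketched in a few lines of Python. The 64 KiB page size is the WebAssembly linear-memory page size; the function names are illustrative only. Reading the old 2 GB ceiling as "only 31 bits of the index were usable" is an assumption made for this sketch, not something stated above.

```python
WASM_PAGE_SIZE = 64 * 1024  # bytes per WebAssembly linear-memory page
GIB = 2 ** 30


def max_linear_memory_bytes(index_bits: int) -> int:
    """Bytes addressable with an index of the given width."""
    return 2 ** index_bits


def max_pages(index_bits: int) -> int:
    """How many 64 KiB pages fit in that address space."""
    return max_linear_memory_bytes(index_bits) // WASM_PAGE_SIZE


def fits_in_wasm32(dataset_bytes: int) -> bool:
    """True if a working set could fit in a 32-bit linear memory at all."""
    return dataset_bytes <= max_linear_memory_bytes(32)


print(max_linear_memory_bytes(32) // GIB)  # 32-bit index: 4 GiB
print(max_linear_memory_bytes(31) // GIB)  # 31 usable bits (assumed): 2 GiB
print(max_pages(32))                       # number of 64 KiB pages in 4 GiB
print(fits_in_wasm32(16 * GIB))            # a 16 GiB in-memory dataset does not fit
```

Note that this is an upper bound on the whole address space: code-managed heap, stack and static data all have to share it.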
Just keep in mind that all WASM is read in as an arraybuffer[6], from code written in - wait for it - JAVASCRIPT. So at the end of the day you are constrained by JavaScript.
Yes, JavaScript is indeed the most popular programming language in the world but that's not because it is the most technically superior one. Sometime in the past 10 years or so software development was broadly democratized and while I don't think that's necessarily a bad thing (as long as you aren't working on things like medical software, banking, airplanes or the like) and could even be a good thing, one has to concede that there are LA Lakers and then there are tens of millions of school kids that are forced to do PE once a week (yes, I know, pre-covid). One group is paid individually millions of dollars a year and one group might be pissed that physical education doesn't mean exercising your thumbs on a game controller.
The lack_of_64bit_support[7] is not necessarily a "forever" blocker but it's not something that is going to be available anytime soon either. Chrome already piles multiple plates high at the memory buffet.
There are proposals[8] and work being done for this issue but it's not something I see being easily solved in the next few years and I don't know about you but I've been using 64-bit exclusively for the better part of a decade - I mean the first 64-bit Intel Xeon was released in 2004.
One of the marketing points of WASM is "near native" or "near metal" speed. However, that's not necessarily the case and there is plenty of evidence suggesting the complete opposite, as shown in the paper "Not_So_Fast:_Analyzing_the_Performance_of_WebAssembly_vs._Native_Code"[9].
The authors complain that they couldn't even directly port many applications to measure because of standard Unix APIs that were completely missing.
Languages like Go still_suffer_considerably[10] when compiled to WASM because of things like wasm_lacking_goto[11].
WASM people like to opine ad infinitum about "how secure" the bytecode VM is when it comes to things like memory safety - Rustaceans like this argument as well. Unfortunately for WASM, the lack of memory randomization à la ASLR and the lack of stack canaries make things as easily exploitable as they were in the late 90s, as_pointed_out_by_David_Schneider_in_the_IEEE[12].
Lehmann and others address this in more detail in the paper "Everything_Old_is_New_Again:_Binary_Security_of_WebAssembly"[13]:
"We find that vulnerable source programs result in binaries that enable various kinds of attacks, including attacks that have not been possible on native platforms since decades"
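In fact - WASM has no "read only" memory. Everything a program stores lives in one flat, writable linear memory, so data that would sit in a read-only segment of a native binary can be overwritten like anything else.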
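As a rough illustration of why that matters, the following Python sketch simulates a linear memory with a bytearray. It is a toy model and not real WASM: the layout, the addresses and the helper names are all assumptions chosen for the example. A 16-byte buffer sits directly in front of a "constant" string, and an unchecked copy into the buffer silently rewrites the constant.

```python
def unchecked_copy(memory: bytearray, dest: int, data: bytes) -> None:
    """Mimic a C-style copy that never checks the destination buffer size."""
    memory[dest:dest + len(data)] = data


def read_cstring(memory: bytearray, addr: int) -> bytes:
    """Read a NUL-terminated string starting at addr."""
    end = memory.index(0, addr)
    return bytes(memory[addr:end])


def demo():
    memory = bytearray(64)             # the whole (toy) linear memory
    buf_addr, buf_size = 0, 16         # a 16-byte input buffer (assumed layout)
    const_addr = buf_addr + buf_size   # a "constant" placed right after it
    constant = b"/safe/path\x00"
    memory[const_addr:const_addr + len(constant)] = constant

    before = read_cstring(memory, const_addr)
    # 16 filler bytes fill the buffer, the rest lands on top of the constant.
    attacker_input = b"A" * buf_size + b"/evil/path\x00"
    unchecked_copy(memory, buf_addr, attacker_input)
    after = read_cstring(memory, const_addr)
    return before, after


print(demo())
```

In a native binary the constant would normally live in a page mapped read-only, and the same overflow would fault instead of succeeding.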
This doesn't even touch on the fact that the vast majority of WASM payloads out there are nothing more than cryptojacking_malware[14].
There are no true threads: they are faked through the use of web workers and the SharedArrayBuffer[15], and they come with a host of limitations and workarounds.
Even to turn them on you_need_the_right_headers[16] and might need to flip the browser support on as it was disabled in certain runtimes because of Spectre (remember that?) but apparently has been re-enabled recently.
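For reference, the headers in question are the two cross-origin isolation headers that browsers require before exposing SharedArrayBuffer. Below is a minimal sketch of a local static file server that sends them, using only the Python standard library. The class name IsolatedHandler, the serve function and the port 8000 are arbitrary choices for the example, and a real deployment would set these headers in its own web server or CDN configuration instead.

```python
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer

# Response headers that opt a page into cross-origin isolation.
CROSS_ORIGIN_ISOLATION_HEADERS = {
    "Cross-Origin-Opener-Policy": "same-origin",
    "Cross-Origin-Embedder-Policy": "require-corp",
}


class IsolatedHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds the isolation headers to every response."""

    def end_headers(self) -> None:
        for name, value in CROSS_ORIGIN_ISOLATION_HEADERS.items():
            self.send_header(name, value)
        super().end_headers()


def serve(port: int = 8000) -> None:
    """Serve the current directory on localhost (port is an arbitrary choice)."""
    ThreadingHTTPServer(("127.0.0.1", port), IsolatedHandler).serve_forever()
```

Calling serve() from the directory that holds the page and its .wasm file is enough for local testing; every resource the page embeds from other origins then has to opt in as well, which is one of the workarounds mentioned above.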
Out of all the bullet points in this article this one looks like it has achieved the most traction in the past few years.
So listen, there are plenty of multi-billion dollar companies that run their software on languages like Ruby and Python, which are inherently single threaded and single process, so no one is saying you can't scale around these problems with load balancing and reverse proxying and such.
TLS can mean two things given the context. One is thread local storage. The other is transport layer security (you might know that as 'SSL').
Let's discuss SSL. The missing_socket_API[17] we talked about earlier dovetails into how SSL gets implemented.
In that discussion we find that YAP - yet_another_proposal[18] - exists, but it is going to be lacking for real world applications for the foreseeable future. Some people think it's currently impossible to ensure the non-existence of side channel attacks here - remember the discussion we just had about threads and Spectre? This also doesn't touch on hardware acceleration for crypto primitives, which is kind of a must.
You can technically compile any number of SSL libraries in but that still has these limitations and is_not_advised[19].
"Crypto can never work on top of WASM. So these need to be system APIs."
Will any of this ever be fixed or implemented? Some of it will be implemented. Some of the other issues will never be dealt with though. WASM was made for the browser - not the server. Browser security models will dictate how these features evolve.
There's plenty of other stuff we didn't even touch on that is missing from WASM/WASI to make it a suitable environment for server-side applications.
I believe there is room for probably one or two companies working on WASM, assuming they own the IP and governance, which as far as I can tell no one company does, but I don't see it blossoming into the end-all-be-all server-side app ecosystem that many others seem to see.
Mozilla fired much of their WASM team last year through multiple layoffs, although some companies that are vested in the space swooped in to scoop them up.
The people that state WASM is the new JVM are way off base. While the JVM might be a gas guzzling, crank shaft starting beast, once it's running you can do most anything you'd like to do, it has more than acceptable performance despite its GC, and it has access to all the normal Unix like APIs - that's not the case with WASM.
The people that label WASM as a new kind of Flash are absolutely 100% correct. That doesn't mean that there isn't some opportunity. After all Zynga (maker of Farmville and others) is still an 11B company and Adobe is still a 230B market cap company.
Having said that, it's worth noting that Flash has been deprecated and Farmville has been shut down.
Link: 3. WASI_networking_design_issue
Link: 5. but_the_limit_was_pushed_to_4Gb_last_year
Link: 7. lack_of_64bit_support
Link: 9. "Not_So_Fast:_Analyzing_the_Performance_of_WebAssembly_vs._Native_Code"
Link: 10. still_suffer_considerably
Link: 12. as_pointed_out_by_David_Schneider_in_the_IEEE
Link: 13. "Everything_Old_is_New_Again:_Binary_Security_of_WebAssembly"
Link: 14. cryptojacking_malware
Link: 16. you_need_the_right_headers
Link: 18. yet_another_proposal
https://www.linkedin.com/pulse/technical-due-diligence-wasm-ian-eyberg