Somehow I have the sinking feeling that the most popular "killer app" for webassembly will be circumventing user preferences like "never auto-play" - or thwarting ad blockers.
WebAssembly doesn't grant any extra permissions inside the browser sandbox. Audio and canvas HTML elements already exist and don't have any innate connection to WebAssembly. (If you ignore performance, WebAssembly could be entirely polyfilled into old browsers that don't support it, via a JavaScript interpreter.)
Yep, people seem to have a hard time understanding that WebAssembly doesn't add any new functionality; it just makes what JS can already do more performant (for cases that benefit from it). WebAssembly by itself actually runs inside an even more restrictive sandbox.
It's not about permissions. It's about decoding video using WebAssembly and rendering it on canvas. The browser can't prevent that. Now I'm not sure that it wasn't possible before WebAssembly... decoding some tiny video shouldn't be that resource-intensive. Also, you could just use a GIF.
It is about permissions. You can autoplay video just fine with HTML5, but you can't do it with sound. And how would your WebAssembly-decoded video play sound? Right, using the exact same HTML5 media elements. WebAssembly doesn't somehow allow you to do different things in the page. It only does computation; to actually produce any output it has to go through the same APIs that your normal JS uses.
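A minimal sketch of what that policy looks like in practice (clip.mp4 is just a placeholder URL; the exact behavior depends on the browser's autoplay settings):

    // Muted autoplay is generally permitted; audible playback is not.
    const v = document.createElement('video');
    v.src = 'clip.mp4';        // placeholder
    v.muted = true;
    v.autoplay = true;         // typically allowed because the element is muted
    document.body.appendChild(v);

    // Unmuting and playing without a user gesture is where the policy bites,
    // no matter whether the frames came from a wasm decoder or a plain <video>.
    v.muted = false;
    v.play().catch(err => console.log('blocked until the user interacts:', err.name));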
If you can autoplay HTML video without sound right now, then you're right; I thought autoplay was forbidden for all videos.
That's what I told Firefox, but for some reason video autoplays just fine all the time so...
Same, is there a fix for this?
Firefox media developer here.
It's a bit hard to understand what the problem is, sorry.
You can block video and sound autoplay altogether, or just audio, or neither, in `about:preferences#privacy`, and review the exceptions you've granted. But if I read the grandparent comment correctly, an appropriate setting has already been put in place.
Additionally, if the website really wants to play audio against a visitor's will, it can try to hijack clicks and touch events. There are plans to tighten this; the spec was designed to allow implementing various policies. It should also be possible to write a content blocker for this as a WebExtension (maybe one already exists).
Does this happen on all websites? Or on some specific websites that you could share with us, so we can have a look and understand what's up?
Can anybody who isn't satisfied with this open a ticket at
, and put `:padenot` in the `Request information from` box (log in with GitHub or a Bugzilla account), with some info about the websites? An alternative would be to send me an email, username at mozilla.com.
Thank you. I have only noticed this occurring after clicking a different video first. It is probably due to click hijacking or the like.
Thanks, I'll keep this in mind and report when I get it next time.
So far uBlock Origin seems to be pretty good at keeping the most annoying videos at bay, but sometimes they do slip through.
I also noticed today that video was autoplaying on a website, and was shocked by it, as Firefox had stopped that for some time.
I'll note where it happens again.
I don't suppose you dismissed one of those "totally GDPR compliant" cookie bars right before that, did you? (You may have to think very hard; it's basically muscle memory for us these days...)
I've seen this trick in the wild. "No autoplay" means "no autoplay without some user interaction first". And clicking one of those "I accept" buttons is indistinguishable from clicking on any other random element to make it play a video. So they just made the "I accept" button play the video as well as dismiss whatever thing was probably covering up the website.
Some people are terrible.
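A hypothetical sketch of how cheap the trick is (the element IDs and class are made up): the consent click satisfies the user-gesture requirement, so the same handler can start the video with sound.

    // One click both dismisses the banner and counts as the user activation
    // that unlocks audible playback.
    const banner = document.getElementById('consent-banner'); // made-up ID
    const promo = document.getElementById('promo-video');     // made-up ID
    banner.querySelector('.accept').addEventListener('click', () => {
      banner.remove();   // what the user thinks the click does
      promo.muted = false;
      promo.play();      // allowed now, because a genuine click just happened
    });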
You know what, that could have been it; that would be one of those tricks.
Thanks for the tip.
More to the point, it's about decoding _documents_ using WebAssembly and rendering _that_ on canvas, thus circumventing browser-based discrimination between legitimate and malicious content.
This already happened. It's why the Chromium team doesn't allow blocking video completely: they saw advertisers just working around it by making custom players. Those players were bloated and heavy, so blocking video for the user didn't actually help block video, and it made the problem worse.
So yes, with or without wasm, people will roll their own ways of trying to get your attention.
After playing around with immediate-mode GUIs in WebAssembly, I can envision a future in which any kind of content blocker is obsolete. The internet could be turned into cable television. I don't want to sound overly pessimistic, but it seems very plausible that this is the future we will eventually end up with.
The thing with WASM, though, is that you can always intercept it at the edges where it interacts with the DOM. The usual ad-blocking tactics of deleting particular arrangements of DOM nodes, blocking specific JS/wasm payloads, and blocking specific domain names will still be unaffected. The tactics don't even need to change.
If an entire UI (content plus ads) is drawn with wasm onto a canvas (the way Qt WebAssembly does it), it will be impossible to use CSS to hide ads, and possibly more difficult to modify the script to disable the ad-loading functions.
Doesn't sound very feasible, as this would probably make the user experience much worse. Native elements, browser integration and extensions, interaction with the page, and accessibility would all suffer.
> Native elements, browser integration and extensions, interaction with the page, and accessibility would all suffer.
Uh, but those things are already happening right now, e.g. Flutter Web UI [1]
[1] https://hugotunius.se/2020/10/31/flutter-web-a-fractal-of-ba...
> Native elements, browser integration and extensions, interaction with the page, and accessibility.
Sounds like the pages made in Flash (or even Java) which were somewhat common in the 90s.
If publishers valued user experience over ad income, they wouldn't show ads to begin with.
Google would probably not feature you very high in search results, though, since your website has no indexable text.
If this technique became popular, Google would OCR it, just like it runs JS.
This technique is essentially the same as websites made entirely in Flash back in the day. And back then, the only way to have Google index your site was to provide a separate (hidden) version that used normal, semantic HTML.
Google is an advertising company. It would make sure there is a way to get your site indexed, perhaps by providing an alternate machine-readable version just to them. As a bonus, that version would not be available to potential competitors.
They could use it only for video players that autoplay, for example.
If ad content is served from the same domain as normal content, and there is minimal DOM interaction, then it will be a lot harder.
The weird thing is, ads served from the same domain would already be a lot harder to track. For some reason very few sites have switched to this approach.
It's an ad-industry thing, not a site thing. The ad industry would really have to build the solution for sites to host, and the problem then becomes making something every site can reasonably host, with all the headaches associated with that.
And I have the suspicion that doing it wouldn't add much to the bottom line of the company who invested in it, despite how much they like to complain about ad-blockers.
That makes ad fraud detection _much_ harder, which is something the ad industry is even less willing to accept than ad blocking.
How so?
This is what I understand the strategic intent of Google's AMP project to be.
WASM modules can be blocked by domain just like regular ads. Just block the whole thing, problem solved.
Just like you can easily set your browser to refuse all cookies. Problem solved, now you can’t browse half the internet.
Wasm has a pretty simple execution model; it wouldn't be hard to write something that preprocesses wasm code during load, similar to what you can do with Java classloaders.
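A rough sketch of that classloader-style idea, assuming a content script that runs before the page's own code (the inspection step is just a stub, and WebAssembly.instantiate/compile would need the same treatment):

    // Wrap the WebAssembly loading entry point so every module can be
    // inspected (or rejected/rewritten) before it runs.
    const inspectWasm = (bytes) => {
      // stub: a real blocker would parse the module's sections here
      console.log('loading wasm module,', bytes.byteLength, 'bytes');
      return bytes;
    };

    WebAssembly.instantiateStreaming = async (source, imports) => {
      const response = await source;              // Response or Promise<Response>
      const bytes = await response.arrayBuffer();
      return WebAssembly.instantiate(inspectWasm(bytes), imports);
    };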
We can handle Wasm like Flash: declare it obsolete and then reinvent it again.
Declaring something obsolete doesn't make it so. Flash went away because of the iPhone and the rise of mobile apps, and even then it took about a decade for it to become irrelevant. Without a similar tectonic shift, it would take longer.
Easily preventable using encrypted wasm modules.
I don't think WASM can write to its code memory to self-decrypt. At some point, the WASM has to create a DOM node containing the decrypted script, at which point you can process the code.
Now that is an idea for a proof of concept.
I worry a lot about this but if this is the price of competing with the Google/Apple duopoly maybe it's worth it.
It's long been possible to decode H.264 using JavaScript (which obviously uses way more power than the native decoder). This is why browsers only restrict audio playback rather than blocking video autoplay entirely.
Playing audio from WASM means going through Web Audio; there is no magic direct connection between WASM and the system's audio device.
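A minimal sketch of that boundary, assuming a hypothetical wasm export decodeChunk() that writes PCM samples into the module's linear memory (wasmModule and wasmMemory are placeholders for the instantiated module and its memory):

    // Whatever the wasm decoder produces only becomes sound once JS hands
    // the samples to the Web Audio API, which enforces the autoplay policy.
    const ctx = new AudioContext();              // may start 'suspended' until a gesture
    const frames = 48000;                        // one second at an assumed 48 kHz
    const audioBuf = ctx.createBuffer(1, frames, 48000);

    const ptr = wasmModule.decodeChunk(frames);  // hypothetical export, returns a pointer
    const samples = new Float32Array(wasmMemory.buffer, ptr, frames);
    audioBuf.copyToChannel(samples, 0);          // copy out of wasm linear memory

    const node = ctx.createBufferSource();
    node.buffer = audioBuf;
    node.connect(ctx.destination);
    node.start();                                // silent while the context is suspended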
On one hand this is great - offload the processing to the client side. On the other hand, I don't know if I'd trust a client-encoded video when serving video content to thousands of other users.
Curious about the performance though. Would love to see benchmarks vs a standard ffmpeg install.
Unless you're a broadcast studio, aren't most videos client-encoded one way or another? Does YouTube even let users upload raw, uncompressed video? Speaking of which, IME (mostly with audio), exploits more often seem to relate to encapsulation (container format, framing, etc.) than to the low-level compressed blocks themselves; and those parts certainly are controlled by the client, at least at ingest. I don't know if that's simply because more people (albeit still few in absolute terms) are capable of understanding and breaking encapsulation implementations, as opposed to the math-heavy block (de)compression routines, or because the (de)compression algorithms naturally bound the scale of the output - e.g. a compressed 44.1 kHz AAC frame is not going to expand to more than 44.1k samples, and each sample won't be larger than the parameterized bit size. Either way, anyone distributing multimedia _should_ account for this stuff. Alas, most don't seem to bother; they just pass files off to an ffmpeg binary or wrapper and hope for the best.
I think you've misunderstood. I'm saying that I wouldn't want to directly serve a client-encoded video without doing an encoding pass on the server. Youtube re-encodes all video regardless of the format it's uploaded in.
In the early days of YouTube you could upload FLVs that were served directly to users (this was back when their only playback format was Flash Video). It let you get high-quality video before that was an official option, and some people even had "fun" with malicious FLVs that crashed your browser via Flash.
But if it's encoded in your webpage with your ffmpeg tool, then unless someone maliciously interferes with it in some way, it should be no different from encoding it on the server.
I like this for pure client-side apps, even if only for a simple ffmpeg GUI app that I can link to any of my friends who want to quickly convert something without going to amazingmp4tomp3converter.com.
Basically I'd like to see something like Handbrake on the web, maybe even simpler interface.
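A hedged sketch of that kind of in-browser converter, assuming the createFFmpeg/fetchFile API from the project's 0.x docs and a file picked from an <input type="file">:

    import { createFFmpeg, fetchFile } from '@ffmpeg/ffmpeg';

    const ffmpeg = createFFmpeg({ log: true });

    // Convert a user-selected File to H.264 MP4 entirely client-side.
    async function convert(file) {
      await ffmpeg.load();                                   // downloads the wasm core
      ffmpeg.FS('writeFile', file.name, await fetchFile(file));
      await ffmpeg.run('-i', file.name, '-c:v', 'libx264', '-preset', 'fast', 'out.mp4');
      const data = ffmpeg.FS('readFile', 'out.mp4');         // Uint8Array
      return new Blob([data.buffer], { type: 'video/mp4' }); // hand to a download link
    }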
The data formats are standardized with expected outputs, so it should be possible to develop parsers that enforce strict specs. It would be significantly faster and cheaper than encoding.
This is HUGE in terms of lowering the barrier to entry for making a viable youtube competitor. Google spends tremendous amounts of money on video conversions that could be done in the uploader's browser. Maybe this will tip the scale.
You underestimate the amount of time needed to do video conversion client-side. It is a very compute-intensive process and is not feasible for videos of any significant length unless you are willing to let your computer's fan spin up to maximum for an extended period of time. There have been a dozen WebAssembly ffmpeg ports. The issue is absolutely not with the technology.
Just to put this into perspective: my i7 4790k takes around 2 minutes to render about 1 minute of 1080p 60 fps footage at x264 medium.
If you wanted to run a YouTube competitor you would likely require better compression, which would take even longer. Now think about it running in the browser, which likely slows it down even more.
You upload a 10 minute clip and then your browser eats 100% of the CPU for the next hour.
And all of that is just for one quality setting.
Can it be done faster on a GPU?
Yes, _but_ it comes at the cost of quality/bandwidth. When it comes to video you can trade image quality for file size. High-end Nvidia GPUs are better in bitrate-constrained situations for streaming, but they're not as good at reducing file size as the slower software encoders.
When it comes to video hosting it's largely about bandwidth. Imagine you had a video that got 1 million views: 110 MB vs 100 MB file size. That's 110 TB vs 100 TB of bandwidth, a difference of 10 TB. At a cent per GB that's a $100 difference. Now imagine a video with 10 million or 100 million views.
And the real difference in file size is likely to be larger.
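The back-of-the-envelope math, spelled out (the cent-per-GB price is just the assumption above):

    // Bandwidth cost for a given encode size, view count, and assumed $/GB.
    const views = 1_000_000;
    const dollarsPerGB = 0.01;                  // assumption from the comment above
    const cost = (sizeMB) => (sizeMB / 1000) * views * dollarsPerGB;
    console.log(cost(110) - cost(100));         // 100 -> the 110 MB encode costs $100 more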
You're not going to pay anywhere near a cent per GB if you're running a major video hosting site.
Those kinds of fees would only happen if you were a tiny site _and_ using an overpriced option like AWS.
Not if you want to keep quality. GPU encoding blocks are pretty poor in comparison to SW encoders.
That depends on the encoder and the settings you use. My own playing around with CUVID and NVENC support in ffmpeg a couple of years ago worked out quite well. I found that I could take 1080p H.264 footage, decode it, overlay a text display, and re-encode it back to H.264 much faster than real time without perceptible-to-me quality loss.
I saw no problems with the video quality if I used -profile:v high -preset:v slow and set the output bitrate equal to the input's average bitrate. With those settings, I was able to reencode at about 130 fps - handy when the raw footage was 9+ hours at 25 fps. Yes, that's "slow" on the GPU. :)
Not only that, but are you going to transcode to multiple resolutions on the user's computer and upload all of those? That wastes not only a ton of time, but also bandwidth. And what if they're uploading from their phone, also wasting their battery?
And it's not just different resolutions but also different codecs. You have H.264 as the baseline, then you want HEVC, and for some people AV1 as well.
It may be trivial for YouTube or Netflix, but certainly not from a consumer's perspective.
Does it also add other concerns, such as the user uploading different content at different resolutions?
YouTube's stronghold is because of community, not technology. It's also been around for 14 years and has so much internet history baked into it that it's just hard to conceive it being dethroned.
I used to think similar about AOL Instant Messenger, the Wintel monopoly, SEGA and Internet Explorer.
Windows is still a near monopoly (Mac at 8-10%, Linux at 1-3%) on the desktop globally. On mobile, they never had a stronghold.
As for SEGA, AOL, and IE, they never had any real stronghold (like a lock-in); they just had the users, and those are easier to switch to something new. Games, for example, go stale, and you don't care that much about the games you've already played - you want the new shiny thing. A browser, you switch over and you have everything you did before, including your bookmarks. AOL had its stronghold when the internet was 1/100 of what it is now.
I can't see the internet going 100x larger now (unless it somehow goes to 80 billion users out of the 8 billion people on Earth).
> Windows is still a near monopoly (Mac at 8-10%, Linux at 1-3%) on the desktop globally. On mobile, they never had a stronghold.
Right. Windows didn't lose its position because someone out-competed it in its own area, but because a whole new category of device (the smartphone) and a new category of service (search) rose to prominence. (Also, fears of legal problems tempered its normal behavior.)
For a lot of those things, competitors have a viable path to profitability if they make a better alternative. With YouTube, even if you make a better alternative, you're still losing a ton of money due to the costs of serving video and scaling (especially if you want to let people watch and upload for free).
Those were different times, old man.
As these moments will be, some day. :)
None of those have the network effect and an existing community. Maybe MySpace would be a better counter example.
In addition to the transcoding performance issues raised by others, the real cost sink for a YouTube competitor is not transcoding but bandwidth. It is extremely expensive at any serious scale. YouTube by itself does not make any significant money for Google.
Eh, ffmpeg already takes a long time in very optimized native builds (as in, re-encoding from a 4K master to a decent-quality 1080p MP4 for YT can already take 0.5x the duration of the video)... I don't see it getting faster by being put in a web browser, especially when there is likely no access to hardware encoding APIs.
Are those costs not dwarfed by the bandwidth and compute costs for distribution?
I'm working on porting ffplay, targeting HEVC but also all the other decoders if possible. I've had nothing but trouble from emscripten and have had to start from the beginning, just using ffplay as reference and very slowly building the application up.
Emscripten's PROXY_TO_PTHREAD option doesn't transfer the document context correctly, so calls from SDL to change the canvas size, mouse scroll events, etc., all error out. I just had to comment those lines out.
But then the program would lock up indefinitely because a mutex would not return when a pthread was created. It would just spin on mutex.wait, and there were no errors in the thread creation as I tracked it line by line.
Then debugging randomly wouldn't compile anymore.
There's a whole blog post I'll write on it, but the summary is that emscripten is cool, but the quality of its features isn't ready to simply cross-compile without the underlying code supporting it.
I'm aware of this project, but the demo doesn't work well; it might just be slow access to video hosted on Chinese servers:
https://github.com/sonysuqin/WasmVideoPlayer
I would be interested in sponsoring development of a project that could reliably play large video files of different formats in the browser. I'd use it to adapt our Chrome extension ('Language Learning with Netflix') into a website/SPA that could load local video files, so you could study with movies from your hard drive. Contact email is in my profile. :-)
Why not use something like WebRTC? You'd be able to use a native decoder and seek to the nearest keyframe.
WebRTC doesn't add anything for this use case of playing local files. WebRTC is an API to let the browser stream data (including video/audio) over the network. If the browser supports the given video's codec, you can play it using an HTML5 <video> element directly from local disk. No network communication needed.
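For the supported-codec case it really is just a few lines (a sketch; fileInput is assumed to be an <input type="file"> and there is a <video> element on the page):

    // Play a local file directly in a <video> element: no network, no WebRTC.
    const video = document.querySelector('video');
    fileInput.addEventListener('change', () => {
      const file = fileInput.files[0];
      video.src = URL.createObjectURL(file);   // object URL backed by the local file
      video.play();                            // works whenever the browser supports the codec
    });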
Ah yes, I didn't think about the use case of unsupported codecs. Though at that point I would argue it would be easier to convert the file with ffmpeg, etc., than to use a custom decoder in the browser.
Sounds cool! Maybe you can check my build script and see if anything is helpful. :)
https://github.com/ffmpegwasm/ffmpeg.wasm-core/tree/n4.3.1-w...
I no longer see the warning in the readme, but this relies on SharedArrayBuffer, so it is not currently supported on mobile (except Firefox for Android) and in some other browsers:
https://caniuse.com/sharedarraybuffer
I get this when I visit the OP link:
> Your browser doesn't support SharedArrayBuffer, thus ffmpeg.wasm cannot execute. Please use latest version of Chromium or any other browser supports SharedArrayBuffer.
According to caniuse some headers are necessary for it to work on Firefox. I guess the developer has to fix the demo.
In fact, the headers must be set on the server side, which means I cannot do that as I am using GitHub Pages. You can check more details here:
https://github.com/ffmpegwasm/ffmpeg.wasm/issues/102
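For reference, these are the two response headers that make SharedArrayBuffer available again (cross-origin isolation). A minimal Node sketch, assuming you control the server, which GitHub Pages doesn't allow:

    // Serve the demo with the cross-origin isolation headers that
    // current browsers require before exposing SharedArrayBuffer.
    const http = require('http');
    const fs = require('fs');

    http.createServer((req, res) => {
      res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
      res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
      const path = req.url === '/' ? '/index.html' : req.url;
      res.end(fs.readFileSync('.' + path));    // sketch only: no error handling
    }).listen(8080);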
I noticed you were using github and completely failed to think of that! My bad.
It is on the last line of the Installation section, but yes, ffmpeg.wasm still relies on SharedArrayBuffer for multi-threading.
Hey dang, I submitted this more than 12 hours ago. How come it now says "3 hours ago"?
Mods can reset its time to help it get more views.
You can email hn@ycombinator.com to ask mod/meta questions. It's in the "Contact" link at the bottom if you forget.
I think dang once said that if they think a thing is good they'll try to get it seen, or whatever; idk if that is what happened.
How does the licensing here work? It claims MIT licensing, but FFmpeg is GNU LGPL.
The build scripts and the JS files in the project that interact with FFmpeg and give it a JS API are licensed under the MIT license, but FFmpeg itself is still LGPL.
You are right, I cannot override the FFmpeg license. Lol
Might be important to note! I can see a lot of places this would be applicable for my work but I wouldn't be able to release content using this.
Really great work though, and I'm sure it will help a lot of people!
This is awesome. I tried to do something similar a long time ago (back when the HTML5 video tracks/codecs APIs were released).
I'd love to see a FFmpeg compilation to WASI, so it can be run also standalone in server-side runtimes (such as Wasmer or WAVM).
I wonder if threads would make a WASI compilation more challenging?
Not knowing much about WASM: wouldn't just using straight-up native ffmpeg be a lot faster and more energy-efficient if you're going to be running it on the backend anyway?
The advantages of using Wasm in this case would be multiplatform/multi-chipset binaries along with a completely sandboxed environment.
It's true that the speed will be a bit slower (I expect 5-10% slower than native), but in some cases the gains could outweigh the relative slowdown.
The popular external encoder libraries such as x264/x265/libvpx/libaom-av1, as well as a lot of parts of ffmpeg itself, have fine-tuned assembly and sometimes C for a lot of things that you'd be missing out on. So I'd guess the speed is more than a bit slower, at least when run on common x86-64 and ARM targets, even considering any WASI runtime JIT.
True! I haven't considered the fine-tuned assembly inside of the encoders.
I would love to have it working in Wasmer so we can benchmark both paths (native vs Wasm).
Faster, yes. But a WASM version would have other advantages: it wouldn't need to be recompiled for each platform, and it would run in a sandbox.
Great! Now if someone could do a JS port of MoviePy, please :)
https://zulko.github.io/moviepy/
So this can finally put Apple's refusal to support Ogg container to rest and make Ogg/Opus work on Apple systems?
God, that would be useful. The least enjoyable part about using an iPhone is all the media elements that show up blank because they use Vorbis/VP9.
I can imagine. I'm not touching anything Apple makes because of that kind of attitude.
There is already ogv.js for that.
A few years ago I was using this port
https://github.com/Kagami/ffmpeg.js
It was still asm.js but worked just fine.
Audio/video/image encoding/decoding and crypto are about all I can find WebAssembly useful for. I wonder why all the hype around it.
There are many of us who would like to do frontend development in a language that isn't JavaScript - having a compilation target for that which even comes with speed benefits is huge.
When it's more mature, being able to develop for the web in whatever language we want.
What on earth is a "pure webassembly/javascript port"? It's gotta be one or both of those to run in a browser regardless. Is it just ffmpeg compiled to wasm through emscripten?
It's called a port because probably more work went into making it work than just recompiling.
There are always two parts to these ports:
- a WebAssembly blob
- a JavaScript wrapper to call the WebAssembly functions from JS in a practical way.
This is what it seems to be.
The repo in this thread depends on another package, which is where the actual bulk of the work is:
https://github.com/ffmpegwasm/ffmpeg.wasm-core
(a fork of ffmpeg with emmake in a build script).
There definitely does seem to be some other work in there in terms of getting it to actually work, though. And tests.
Here is a (soft paywalled) blog post about the "making of" this project.
https://itnext.io/build-ffmpeg-webassembly-version-ffmpeg-js...
You can also check the same post in my blog:
https://jeromewu.github.io/build-ffmpeg-webassembly-version-...
Who said this is a great idea? We're building the new Flash here, consuming resources on the client.
This is impressive and cool from a purely technical point of view. That's certainly enough for me, so I don't wanna detract from this, but: what would the real-world importance of this be?
Nice!
How power efficient would this be? Will it become a setback towards the goal of carbon neutrality if widely used?
Watching videos for entertainment instead of working on solutions to global warming and/or new metamaterials that are created from carbon neutral sources such as biodiesel (with the use of your computer) is a setback towards the goal of carbon neutrality.
We are about 10,000 steps away from needing to worry about the power usage of websites. Changes to transport and energy generation are many, many orders of magnitude more important.
We have much larger problems to solve than Wasm. E. g. Java. 2 billion devices run it!
That depends on how "dirty" your electricity is, and maybe on how much dirty energy your clean energy could have displaced? My laptop runs mostly on nuclear power with most of the rest from some kind of "renewable" energy.
I don't understand