Article with an overview of how various platforms handle link previews:
https://www.mysk.blog/2020/10/25/link-previews/
* Discord: Downloads up to 15 MB of any kind of file.
* Facebook Messenger: Downloads entire files if it's a picture or a video, even files gigabytes in size.
* Google Hangouts: Downloads up to 20 MB of any kind of file.
* Instagram: Just like Facebook Messenger, but not limited to any kind of file. The servers will download anything, no matter the size.*
* LINE: Downloads up to 20 MB of any kind of file. (This one still deserves a big asterisk, as we'll discuss later.)
* LinkedIn: Downloads up to 50 MB of any kind of file.
* Slack: Downloads up to 50 MB of any kind of file.
* Twitter: Downloads up to 25 MB of any kind of file.
* Zoom: Downloads up to 30 MB of any kind of file.
Facebook has also been shown to visit links embedded in PDFs you send over Messenger.
I wonder if this could be used as a DoS attack vector: flood users with bot messages containing dynamic links to huge files.
Who says you even need a file of defined size? Rig up a server to continually serve pseudorandom data over HTTPS.
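A minimal sketch of that idea, stdlib Python only (the port, chunk size, and lack of any rate limiting are arbitrary choices for illustration): an endpoint that streams pseudorandom bytes with no Content-Length, so a previewer that tries to fetch "the whole file" never reaches the end.

    import os
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

    class EndlessHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "application/octet-stream")
            # Deliberately no Content-Length header: the body just keeps coming.
            self.end_headers()
            try:
                while True:
                    self.wfile.write(os.urandom(64 * 1024))  # 64 KiB per chunk
            except (BrokenPipeError, ConnectionResetError):
                pass  # the previewing server finally gave up or hit its cap

    if __name__ == "__main__":
        ThreadingHTTPServer(("0.0.0.0", 8080), EndlessHandler).serve_forever()

Serving it over HTTPS as suggested would just mean putting this behind a TLS-terminating proxy; and the per-fetch caps quoted above (15–50 MB for most of the platforms) bound how much any single preview pulls anyway.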
The idea is to attack HTTP servers you _don't_ already own ;)
(Instagram is the reflector in that attack, not the target.)
[Add.: sure, you could try to attack Instagram/FB with this, but to execute it you kinda need to trump FB's bandwidth, which is gonna turn it into a DoS on your own bank account.]
Oooo that too I guess.
Also, it looks like this multiplies the download. I wonder if they cache it somehow, so that if the URL is requested more than once it doesn't pull it again.
I wonder how much bandwidth one could consume with a bot account.
Presumably if I message 100 URLs they will all be pulled.
The idea of an infinitely sized file is interesting.
The point is to not DoS yourself at the same time...
Yes, I remember reading an article posted here before on using Facebook to DDoS.
Facebook itself is sort of a DDOS attack on your pre-frontal cortex isn't it?
Well, I mean, they're presumably also keeping all of the messages you ever sent (for "quality assurance" purposes of course), so the privacy aspect certainly can't be the point of contention here.
The bandwidth usage, on the other hand, is a different question: this essentially allows a reflection-style DoS against any HTTP server hosting large-ish files. Presumably you can stick nonsense query parameters after the URL just to create distinct URLs for the single largest file you found?
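A toy illustration of that cache-busting point (placeholder URL; this only shows why a cache keyed on the exact URL string is easy to sidestep): the same large file can be made to look like a hundred distinct resources just by varying a meaningless query parameter.

    import uuid

    base = "https://example.com/huge-file.iso"  # placeholder target
    urls = [f"{base}?x={uuid.uuid4().hex}" for _ in range(100)]
    # Each URL points at the same bytes, but a naive URL-keyed cache
    # treats every one as a fresh resource to fetch.

Whether that actually defeats their caching depends on whether they normalize or strip query strings, which the article doesn't say.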
This got me wondering: is there any native way in HTTP to get a checksum or some hash of a file at the beginning of a transfer?
Isn't that part of the point of ETag headers?
Thanks for the name. I looked it up and it seems close, but I think it's up to the client to provide an ETag and the server to respond with the same ETag and a 304.
What I'm thinking of is a server-blind way for the client to say: "Give me the resource at address X. ... Oh, you know what, I think I already have this one. Let's abort the transfer."
I guess what I'm thinking of is a way for the client to decide it's a 304, not the server.
The If-Match and If-None-Match headers make the request conditional. You can also, of course, store the ETag with the file, or calculate the checksum and do a HEAD conditionally followed by a GET, but there isn't any advantage to that over a conditional request with If-None-Match.
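A rough sketch of that conditional-request flow, using the third-party `requests` library (assumed available; the URL is a placeholder): the server hands out an ETag with the first response, the client later echoes it in If-None-Match, and a 304 means "you already have it" with no body resent.

    import requests

    url = "https://example.com/big-file.bin"      # placeholder resource

    first = requests.get(url)
    etag = first.headers.get("ETag")              # opaque validator set by the server
    cached_body = first.content

    # Later: revalidate instead of blindly re-downloading.
    headers = {"If-None-Match": etag} if etag else {}
    later = requests.get(url, headers=headers)
    if later.status_code == 304:
        data = cached_body                        # still fresh, nothing transferred
    else:
        data = later.content                      # changed, or server doesn't do ETags

Which is more or less the "client decides it's a 304" flow above: the server only confirms whether the validator still matches.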
Wouldn't that create a giant mess if someone forged the ETag header to match something else?
I haven't got enough skin in this to read any RFCs, but my assumption had been that it's not intended to act as a point of truth or to verify contents, and was only ever meant to be used in an advisory capacity. So yeah, it would cause a mess.
Yeah I think it's meant to be used in a context of mutual trust.
Download the first 1 MB and hash that? Or intermediate portions, if the server supports resuming?
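A sketch of that first-chunk idea with an HTTP Range request (again via `requests`, placeholder URL; it only helps if the server actually honors Range and answers 206):

    import hashlib
    import requests

    url = "https://example.com/big-file.bin"      # placeholder resource
    resp = requests.get(url, headers={"Range": "bytes=0-1048575"})  # first 1 MiB
    if resp.status_code == 206:                   # Partial Content
        fingerprint = hashlib.sha256(resp.content).hexdigest()
    else:
        fingerprint = None                        # server ignored Range and sent a full 200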
That's a clever but fatally flawed solution given how many files could share the first MB.
I recently had to implement this feature too. Initially chose approach 2 since it was the easiest, but quickly figured out that it doesn't work on the web because of stupid CORS. Had to resort to approach 3; using that now.
I get doing a malware scan or whatever, but downloading it so many times seems like overkill.