Article with an overview of how various platforms handle link previews:
https://www.mysk.blog/2020/10/25/link-previews/
* Discord: Downloads up to 15 MB of any kind of file.
* Facebook Messenger: Downloads entire files if it's a picture or a video, even files gigabytes in size.
* Google Hangouts: Downloads up to 20 MB of any kind of file.
* Instagram: Just like Facebook Messenger, but not limited to any kind of file. The servers will download anything, no matter the size.*
* LINE: Downloads up to 20 MB of any kind of file. (This one still deserves a big asterisk, as we'll discuss later.)
* LinkedIn: Downloads up to 50 MB of any kind of file.
* Slack: Downloads up to 50 MB of any kind of file.
* Twitter: Downloads up to 25 MB of any kind of file.
* Zoom: Downloads up to 30 MB of any kind of file.
Facebook has also been shown to visit links embedded in PDFs you send over Messenger.
I wonder if this could be used as a DoS attack vector: flood users with bot messages containing dynamic links to huge files.
Who says you even need a file of defined size? Rig up a server to continually serve pseudorandom data over HTTPS.
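A minimal sketch of that idea, stdlib Python only (the port, chunk size, and lack of any rate limiting are arbitrary choices for illustration): an endpoint that streams pseudorandom bytes with no Content-Length, so a previewer that tries to fetch "the whole file" never reaches the end.

    import os
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

    class EndlessHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "application/octet-stream")
            # Deliberately no Content-Length header: the body just keeps coming.
            self.end_headers()
            try:
                while True:
                    self.wfile.write(os.urandom(64 * 1024))  # 64 KiB per chunk
            except (BrokenPipeError, ConnectionResetError):
                pass  # the previewing server finally gave up or hit its cap

    if __name__ == "__main__":
        ThreadingHTTPServer(("0.0.0.0", 8080), EndlessHandler).serve_forever()

Serving it over HTTPS as suggested would just mean putting this behind a TLS-terminating proxy; and the per-fetch caps quoted above (15–50 MB for most of the platforms) bound how much any single preview pulls anyway.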
The idea is to attack HTTP servers you _don't_ already own ;)
(Instagram is the reflector in that attack, not the target.)
[Add.: sure, you could try to attack Instagram/FB with this, but to execute it you kinda need to trump FB's bandwidth, which is gonna turn it into a DoS on your own bank account.]
Oooo that too I guess.
Also, it looks like this multiplies the download. I wonder if they cache it somehow, so that if the URL is requested more than once it doesn't pull it again.
I wonder how much bandwidth one could consume with a bot account.
Presumably if I message 100 URLs they will all be pulled.
The idea of an infinitely sized file is interesting.
The point is to not DoS yourself at the same time...
Yes, I remember reading an article posted here before on using Facebook to DDoS.
Facebook itself is sort of a DDOS attack on your pre-frontal cortex isn't it?
Well, I mean, they're presumably also keeping all of the messages you ever sent (for "quality assurance" purposes of course), so the privacy aspect certainly can't be the point of contention here.
The bandwidth usage, on the other hand, is a different question: this essentially allows a reflection-style DoS against any HTTP server hosting large-ish files. Presumably you can stick nonsense query parameters after the URL just to create distinct URLs for the single largest file you found?
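A toy illustration of that cache-busting point (placeholder URL; this only shows why a cache keyed on the exact URL string is easy to sidestep): the same large file can be made to look like a hundred distinct resources just by varying a meaningless query parameter.

    import uuid

    base = "https://example.com/huge-file.iso"  # placeholder target
    urls = [f"{base}?x={uuid.uuid4().hex}" for _ in range(100)]
    # Each URL points at the same bytes, but a naive URL-keyed cache
    # treats every one as a fresh resource to fetch.

Whether that actually defeats their caching depends on whether they normalize or strip query strings, which the article doesn't say.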
This got me wondering: is there any native way in HTTP to get a checksum or some hash of a file at the beginning of a transfer?
Isn't that part of the point of ETag headers?
Thanks for the name. I looked it up and it seems close, but I think it's up to the client to provide an ETag and the server to respond with the same ETag and a 304.
What I'm thinking of is a server-blind way for the client to say: "Give me the resource at address X. ... Oh, you know what, I think I already have this one. Let's abort the transfer."
I guess what I'm thinking of is a way for the client to decide it's a 304, not the server.
The If-Match and If-None-Match headers make the request conditional. You can also, of course, store the ETag with the file, or calculate the checksum and do a HEAD conditionally followed by a GET, but there isn't any advantage to that over a conditional request with If-None-Match.
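A rough sketch of that conditional-request flow, using the third-party `requests` library (assumed available; the URL is a placeholder): the server hands out an ETag with the first response, the client later echoes it in If-None-Match, and a 304 means "you already have it" with no body resent.

    import requests

    url = "https://example.com/big-file.bin"      # placeholder resource

    first = requests.get(url)
    etag = first.headers.get("ETag")              # opaque validator set by the server
    cached_body = first.content

    # Later: revalidate instead of blindly re-downloading.
    headers = {"If-None-Match": etag} if etag else {}
    later = requests.get(url, headers=headers)
    if later.status_code == 304:
        data = cached_body                        # still fresh, nothing transferred
    else:
        data = later.content                      # changed, or server doesn't do ETags

Which is more or less the "client decides it's a 304" flow above: the server only confirms whether the validator still matches.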
Wouldn't that create a giant mess if someone forged the ETag header to match something else?
I haven't got enough skin in this to read any RFCs, but my assumption had been that it's not intended to act as a point of truth or to verify contents, and was only ever meant to be used in an advisory capacity. So yeah, it would cause a mess.
Yeah I think it's meant to be used in a context of mutual trust.
Download the first 1 MB and hash that? Or intermediate portions, if the server supports resuming?
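A sketch of that first-chunk idea with an HTTP Range request (again via `requests`, placeholder URL; it only helps if the server actually honors Range and answers 206):

    import hashlib
    import requests

    url = "https://example.com/big-file.bin"      # placeholder resource
    resp = requests.get(url, headers={"Range": "bytes=0-1048575"})  # first 1 MiB
    if resp.status_code == 206:                   # Partial Content
        fingerprint = hashlib.sha256(resp.content).hexdigest()
    else:
        fingerprint = None                        # server ignored Range and sent a full 200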
That's a clever but fatally flawed solution given how many files could share the first MB.
I recently had to implement this feature too. Initially chose approach 2 since it was the easiest, but quickly figured out that it doesn't work on the web because of stupid CORS. Had to resort to approach 3; using that now.
I get doing a malware scan or whatever, but downloading it so many times seems like overkill.