Archived view of gemini.locrian.zone › misc › dexer-von-dexer-granny_mafia.gmi, captured on 2024-02-05 at 09:52:29.

-=-=-=-=-=-=-

Why AI content filters are awful and we’re all fucked

Tumblr post by danshive

In science fiction, AIs tend to malfunction due to some technicality of logic, such as that business with the laws of robotics and an AI reaching a dramatic, ironic conclusion.

Content regulation algorithms tell me that sci-fi authors are overly generous in these depictions.

“Why did cop bot arrest that nice elderly woman?”

“It insists she’s the mafia.”

“It thinks she’s in the mafia?”

“No. It thinks she’s an entire crime family. It filled out paperwork for multiple separate arrests after bringing her in.”

Response by dexer-von-dexer

I have to comment on this because it touches on something I see a lot of people (including Tumblr staff and everyone else who deploys these kinds of deep learning systems willy-nilly) not quite get: deep learning AIs like these engage with reality in a fundamentally different way from humans. I see some people testing the algorithm to find where the “line” is, wondering whether it looks for things like color gradients, skin tone pixels, certain shapes, curves, or what have you. All of these attempts to understand the algorithm fail because there is nothing to understand. There is no line, because there is no logic. You will never be able to pin down the “criteria” the algorithm uses to identify content, because the algorithm does not use logic at all to identify anything, only raw statistical correlations on top of statistical correlations on top of statistical correlations, all learned from piles of labeled examples. There is no thought, no analysis, no reasoning. It does all its tasks through sheer unconscious intuition. The neural network is a shambling sleepwalker. It is madness incarnate. It knows nothing of human concepts like reason. It *will* think granny is the mafia.
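To make “correlations on top of correlations” a bit more concrete, here is a minimal sketch of what a tiny network's forward pass looks like. Everything in it is made up for illustration (the weights, the three-pixel “image”, the name `flag_score`) and has nothing to do with Tumblr's actual classifier. A real model has millions of learned weights, but the arithmetic has the same shape.

```python
# Toy two-layer network. All weights below are hypothetical, invented for
# this sketch; in a real system they would be learned from labeled examples.

def relu(value):
    """Pass positive values through, clamp negatives to zero."""
    return max(0.0, value)

def dense_relu(inputs, weights, biases):
    """One layer: a weighted sum per output unit, followed by ReLU."""
    return [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

W1 = [[0.8, -0.3, 0.1],   # first hidden unit
      [-0.5, 0.9, 0.4]]   # second hidden unit
B1 = [0.0, 0.1]
W2 = [1.2, -0.7]          # output unit reads the two hidden units
B2 = -0.2

def flag_score(pixels):
    """Higher score means 'more likely to be flagged' in this toy setup."""
    hidden = dense_relu(pixels, W1, B1)
    return sum(w * h for w, h in zip(W2, hidden)) + B2

print(flag_score([1.0, 0.0, 0.0]))
print(flag_score([0.0, 1.0, 0.0]))
```

Note that there is no `if skin_tone:` branch anywhere in there to go looking for. The “criteria” are smeared across every number in `W1` and `W2` at once.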

This is why a lot of people say AIs are so dangerous. Not because they will one day wake up and be conscious and overthrow humanity, but because they (or at least this type of AI) are not and never will be conscious, and yet we’re relying on them to do things that require such human characteristics as *logic* and *any sort of thought process whatsoever*. Humans have a really bad tendency to anthropomorphize, and we’d like to think the AI is “making decisions” or “thinking,” but the truth is that what it’s doing is fundamentally different from either of those things. What we see as, say, a field of grass, a neural network may see as a bus stop. Not because there is actually a bus stop there, or because anything in the photo resembles a bus stop according to our understanding, but because the exact right pixels in the photo were shaded in the exact right way so that they just so happened to be statistically correlated with the arbitrary functions it built up while being shown pictures of bus stops over and over. It doesn’t know what grass is or what a bus stop is, but it sure as hell will say with 99.999% certainty that one is in fact the other, for reasons you can’t understand, and *will* drive your automated bus off the road and into a ditch because of this undetectable statistical overlap. Because a few pixels were off in just the right way in just the right places and it got really, really confused for a second.
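About that “99.999%”: in a typical classifier the number comes out of a softmax, which just rescales raw scores so they add up to 1. A minimal sketch, assuming a three-label model with made-up labels and made-up raw scores for one photo of grass:

```python
import math

def softmax(logits):
    """Rescale raw scores into positive numbers that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical labels and hypothetical raw scores, chosen for illustration.
LABELS = ["grass", "bus stop", "cat"]
logits = [2.0, 14.0, -1.0]

probs = softmax(logits)
best = max(range(len(LABELS)), key=lambda i: probs[i])
print(LABELS[best], probs[best])
```

One raw score being a dozen points bigger than the rest is all it takes to land above 99.999%. The output measures how lopsided the scores are, not how well the model understands the picture, and there is no built-in “none of the above” option unless someone adds one.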

There, I even caught myself using the word “confused” to describe it. That’s not right, because “confused” is a human word. What’s happening with the AI is something we don’t have the language to describe.

Anyway, what’s more, this sort of trickery can be produced on purpose. A human wouldn’t be able to work it out by eye, but an optimization routine (or another neural network) can probe the statistical filters the algorithm uses and figure out how to alter an image with carefully chosen noise in exactly the right way to make the algorithm think it’s actually something else. It’ll still look like the original image to you, maybe with some faint speckled artifacts, but the algorithm will see it as something completely different. This is what’s known as an “adversarial example”; the “one-pixel attack” is an extreme version of it where changing a single pixel is enough. I am fairly confident porn bot creators will end up cracking the content flagging algorithm and start putting up some weirdly speckled porn anyway, and all of this will be in vain. All because Tumblr staff decided to rely on content moderation via slot machine.
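For the curious, here is roughly what that looks like in miniature. This is a sketch of the fast gradient sign method (one well-known recipe for building such images) against a toy linear “flagging” model with random made-up weights. It assumes the attacker can read the model's weights, which real spammers usually can't, so treat it as an illustration of the principle and not of any actual attack on Tumblr.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def flag_probability(weights, bias, pixels):
    """Toy linear classifier: weighted sum of pixels squashed into (0, 1)."""
    return sigmoid(sum(w * p for w, p in zip(weights, pixels)) + bias)

def sign(value):
    return (value > 0) - (value < 0)

def fgsm_perturb(weights, pixels, epsilon):
    """Nudge every pixel by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to each pixel
    has the same sign as that pixel's weight, so no autodiff is needed here.
    Pixels are clipped to stay in the valid range [0, 1].
    """
    return [
        min(1.0, max(0.0, p - epsilon * sign(w)))
        for w, p in zip(weights, pixels)
    ]

random.seed(0)
N_PIXELS = 32 * 32  # hypothetical 32x32 grayscale image
weights = [random.uniform(-1.0, 1.0) for _ in range(N_PIXELS)]
pixels = [random.random() for _ in range(N_PIXELS)]

# Pick the bias so the untouched image is confidently flagged.
bias = 4.0 - sum(w * p for w, p in zip(weights, pixels))

adversarial = fgsm_perturb(weights, pixels, epsilon=0.05)
before = flag_probability(weights, bias, pixels)
after = flag_probability(weights, bias, adversarial)
biggest_change = max(abs(a - p) for a, p in zip(adversarial, pixels))
print(before, after, biggest_change)
```

No pixel moves by more than 5% of its range, but a thousand tiny nudges all pushing the same way add up, and the score collapses from “definitely flagged” to “definitely fine”.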

TL;DR bots are illogical because they’re actually unknowable eldritch horrors made of spreadsheets and we don’t know how to stop them or how they got here, send help