AI mania has swept the Internet. People generate imagery with Stable Diffusion, Midjourney, and DALL-E, talk to ChatGPT, write code with Copilot, and make "hot AI selfies" with Lensa. Deep learning technology has been moving almost shockingly quickly, leading some to predict that AGI is just around the corner.
This is not a good thing, and it has potential to get a lot worse.
These systems do not generate content in a vacuum. Their training process ingests massive quantities of existing human-created content - as a general rule, without the explicit consent of the humans who created it. The output then competes economically with the very people whose work makes up the training dataset. Art and writing - creative expression - are things humans do for joy. The perversity of megacorporations taking work a human did as a means of self-expression, running it through a large statistical model along with the work of many other humans, and selling the result as a product or service that "generates art" (or writing, or code) is hard to overstate.
I've seen a common argument that because this is legal - whether due to the original license terms or due to being sufficiently transformative - it is legitimate. There is a meaningful distinction between legality and moral legitimacy. Additionally, I find it very hard to believe that a model vomiting up a near-exact copy of existing code - as has been demonstrated[1] with GitHub Copilot - is respecting license terms. These models take copyrighted material, put it in a blender, and present the resulting sludge as original work rather than a derivative work of any of the inputs; this strikes me as little more than IP laundering for the benefit of the megacorps that train the models.
DL models aren't human. And, before we veer off into "but your brain's a computer! Checkmate!" - current deep learning models are vastly simpler than brains, and bear only very limited structural similarity to them[2]. "A human does x, therefore a computer program is entitled to do x in an equally legitimate manner" is a massive leap. Even beyond that, humans are inherently limited in the quantity of output they can produce; against a deep-learning model that can generate ten thousand high-resolution images per day on moderately capable hardware, an artist - limited by a human brain and a human body - is fundamentally disadvantaged. This is not, I would argue, socially desirable; art is a crucial means of human expression, and producing vast quantities of Extruded Art Product on an assembly line devalues the artist.
DL models are trained on vast existing datasets, and do not possess sapience; they cannot be politely asked to stop exhibiting aspects of those datasets that are biased against groups of people or which are otherwise harmful to the dignity and safety of humans. "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" by Bender and Gebru, et al, included a thoughtful critique of training practices surrounding large language models - noting that they over-rely on large amounts of Internet-sourced text and tend to amplify already-privileged voices, encoding the biases of the source dataset into the resulting model.[3] It is notable that Google's response to this paper was to fire Timnit Gebru and her co-author Margaret Mitchell - an action that speaks volumes about how seriously the institutional big-tech culture takes the issues the authors raised.
An analysis of Stable Diffusion by Bianchi, et al, demonstrates depressing but unsurprising results; prompts like "an attractive person" consistently generate images of white people, while prompts such as "a thug" or "a poor person" consistently generate images of people of color.[4] "A software developer" generates, in the ten images tested by the authors, ten images of white men. The paper helpfully notes that the demographics of images generated have little correspondence to the actual demographics within the professions specified in the prompt.
The "AI avatar" application Lensa has also provided some interesting - in a bad way - examples. The model, in its stastical wisdom, has seen fit to add breasts to photos of children; to skew toward nudity when told to generate an avatar based on female images; and to sexualize images of women in general, especially women of color (some of whom have also reported the model producing output with lighter skin or Anglicized features.)[5]
Two things follow from this. First, it illustrates that some things are really screwed up, in real life and on the Internet, and that those of us in places of privilege have a collective responsibility to try to make things better. Second, "it reflects the biases of its training data" - coming from the people who chose the training data and oversaw the training - is an abdication of moral responsibility. "It's hard, so it's not our fault" does not cut it.
It's notable that some individuals who occupy prominent places in corporate AI research and spending are proponents of longtermism, "rationalism", or related ideologies. While there is considerable variance in these beliefs, both in their content and in the intensity with which they're held, there is a broad tendency to focus on "existential risk", to disregard non-existential risks (often including global warming, because it's likely to only kill *some* of humanity), and to focus on the glorious utopian future of humanity - which is very frequently some variant on "massive numbers of humans, both real and simulated, spread throughout the universe under the benevolent guidance of wise and powerful AGI." Strange numbers get thrown around to rationalize this "Robo God Future at Any Cost" mentality; for instance, to quote noted longtermist Nick Bostrom, "If we give this allegedly lower bound [ed: 10^54 'human brain emulation subjective life-years'] on the cumulative output potential of a technologically mature civilization a mere 1% chance of being correct, we find that the expected value of reducing existential risk by a mere one billionth of one billionth of one percentage point is worth a hundred billion times as much as a billion human lives."[6]
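To unpack the arithmetic in that quote - the breakdown below is my own back-of-the-envelope reading of the quoted figures, not Bostrom's own worked derivation - the scale of the claim looks like this:

```latex
% My own back-of-the-envelope unpacking of the quoted figures (not Bostrom's derivation).
% Claimed lower bound on future potential, discounted to a 1% credence:
\[ 10^{54} \times 10^{-2} = 10^{52} \ \text{expected life-years} \]
% "One billionth of one billionth of one percentage point" of risk reduction:
\[ 10^{-9} \times 10^{-9} \times 10^{-2} = 10^{-20} \]
% Expected value of that sliver of risk reduction:
\[ 10^{52} \times 10^{-20} = 10^{32} \ \text{life-years} \]
% For comparison, "a hundred billion times as much as a billion human lives" is roughly
% 10^{11} \times 10^{9} \times 10^{2} \approx 10^{22} life-years, at ~100 years per life.
```

On those figures, an astronomically tiny shift in the probability of a hypothetical far future is held to outweigh essentially any concrete, present-day cost - which is exactly the "at any cost" mentality at work.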
Mysteries like "why are we focusing on models that can talk fluently and generate images instead of ones that actually help solve human problems?" start to look less mysterious if one considers the possibility that these are but preliminary miracles to portend the coming of Robo God.
LLMs are, as Gary Marcus put it, "spreadsheets for words." They do not contain genuine intelligence, cannot express meaning, and have no particular notion of truth - only of statistical association. They will happily generate pages upon pages of text about the benefits of eating crushed glass (as Facebook's "Galactica" LLM recently did), the merits of the January 6 insurrection (as I've seen GPT-3 do), and many other fascinating topics. To once more quote Bender and Gebru, et al, "Contrary to how it may seem when we observe its output, an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot."
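To make the "statistical association, not meaning" point concrete, here is a toy sketch of my own - real LLMs are transformer models and vastly more capable, so this is an illustration of the general idea rather than how any actual LLM is built:

```python
import random
from collections import defaultdict

def train(corpus: str):
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(lambda: defaultdict(int))
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start: str, length: int = 20) -> str:
    """Stitch a sequence together by sampling from observed word-to-word associations."""
    word, out = start, [start]
    for _ in range(length):
        followers = counts.get(word)
        if not followers:
            break
        choices, weights = zip(*followers.items())
        word = random.choices(choices, weights=weights)[0]
        out.append(word)
    return " ".join(out)

# A deliberately silly corpus: the model will "fluently" recombine it,
# with no idea that the claims it parrots are false or harmful.
corpus = "crushed glass is rich in minerals and crushed glass is a healthy snack"
model = train(corpus)
print(generate(model, "crushed"))
```

The toy above is nothing like an LLM internally, but it shares the relevant property: it optimizes for plausible continuations of what it has seen, not for truth.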
The genuinely useful applications of DL models that output fluent, plausible nonsense with no regard for truth seem limited. One of the few I've seen so far is the generation of marketing copy, currently offered by several companies. It doesn't seem hard to imagine a future where actual human communication on the Internet is drowned out by massive quantities of near-meaningless (and sometimes actively harmful) DL-generated text.
Finally, I've seen a lot of responses to any criticism of the AI-industrial complex along the lines of "we're lucky that people like you weren't around when humans were developing fire / stone tools / the wheel / industrialization / cars / the Internet." The implicit claim is that all "progress", loosely defined, is a good thing. History presents a long and instructive list of examples of "progress" making things worse for living, breathing people - especially those already systemically disadvantaged.
Questioning the "progress" being pushed by the capital class isn't heresy. It's necessity.
[1] https://twitter.com/DocSparse/status/1581461734665367554
[2] https://miro.medium.com/max/4800/1*28mEJZh50XUrFAuWwwCOcg.webp - slides from Montreal AI Debate
[3] "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?", by Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, Shmargaret Shmitchell, DOI 10.1145/3442188.3445922
[4] https://arxiv.org/pdf/2211.03759.pdf
[5] https://www.wired.com/story/lensa-artificial-intelligence-csem/
[6] https://existential-risk.org/faq.pdf