💾 Archived View for snowcode.ovh › en › courses › english › it_file_draft.gmi captured on 2024-05-10 at 10:47:30. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
This page contains drafts about other topics to include in my English assignment, which I couldn't use because they weren't "technical enough".
The computing power required to run an artificial intelligence such as ChatGPT is enormous. One assessment suggests that ChatGPT already consumes as much energy as 33 000 homes. However, the exact numbers are hard to know for sure because this sort of data is hidden from the public.
AI is trained on real, human-made data. However, that data is simply scraped from the Internet. Most of it is copyrighted content into which many people have put a lot of work, and all of that work is "flattened" into a simple stream of data for the AI to improve on. The problem is that there is no way to repay the creators of the data, and moreover their work is used by private companies to generate profit.
It's also very hard to prove that one artist's drawings were used in the training dataset of an AI, which is what would be needed to take the company to court and apply copyright law. Since no proof can be given, AI companies most often go unquestioned and can't be attacked in court.
An AI can't be trained on AI-generated data, because such data will always be of lesser quality for training than human-made data. An AI trained on its own output is therefore doomed to deteriorate over successive trainings instead of actually improving.
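A toy illustration of this feedback loop (a hypothetical sketch, not any real training pipeline): imagine a "model" that, each generation, keeps only the tokens it saw most often and then produces its next training set from them. Rare tokens vanish and the diversity of the data shrinks.

```python
from collections import Counter

def train_on(corpus):
    """Toy 'model': keeps only tokens at or above the mean frequency.
    Stands in for a generator that favours its most common patterns."""
    counts = Counter(corpus)
    mean = sum(counts.values()) / len(counts)
    return [tok for tok, n in counts.items() if n >= mean]

def generate(model, length):
    """The toy model can only emit tokens it kept, cycling through them."""
    return [model[i % len(model)] for i in range(length)]

# Generation 0: diverse "human-made" data.
corpus = ["cat"] * 5 + ["dog"] * 4 + ["fox"] * 2 + ["owl"] * 1

for gen in range(4):
    print(f"generation {gen}: distinct tokens = {sorted(set(corpus))}")
    model = train_on(corpus)
    corpus = generate(model, 12)  # the next model trains on this output
```

Generation 0 still contains all four tokens, but from generation 1 onward only "cat" and "dog" survive; the information carried by the rarer tokens is lost and never recovered.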
This has already happened at several companies: websites have sold the work of their users to AI companies in exchange for profit. For example, the very popular website Reddit sold all of its user-created content to an AI company for about 60 million dollars.
This has also been done by GitHub, the most popular code hosting company, owned by Microsoft, which used the code written by its users to create its own AI. That code was open source, and therefore public, but it's unclear whether this counts as fair use of the source code. Moreover, all of this source code was taken without the prior approval of the developers who wrote it.
Most often, artificial intelligence works as a black box. A machine-learning system takes inputs (for example, images mapped to descriptions) and finds connections between them. Although machine learning is understood in general terms, the actual way a trained AI identifies those connections is unknown.
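A minimal sketch of why the result is opaque (a single-neuron classifier on made-up data, nothing like the scale of ChatGPT): after training, the model's entire "knowledge" is a handful of numbers that classify correctly but explain nothing by themselves.

```python
import math

# Made-up (input -> label) pairs: the "connections" the model must find.
# (brightness, size) -> 1 for a "day photo", 0 for a "night photo".
data = [((0.9, 0.3), 1), ((0.8, 0.7), 1), ((0.1, 0.4), 0), ((0.2, 0.8), 0)]

w1, w2, b = 0.0, 0.0, 0.0           # the whole model is just these numbers
for _ in range(2000):               # gradient descent on the log-loss
    for (x1, x2), y in data:
        p = 1 / (1 + math.exp(-(w1 * x1 + w2 * x2 + b)))
        w1 += 0.5 * (y - p) * x1
        w2 += 0.5 * (y - p) * x2
        b  += 0.5 * (y - p)

# The trained "knowledge" is an opaque tuple of floats:
print(w1, w2, b)
# It labels every example correctly, yet the numbers themselves say nothing
# human-readable about *why* an image counts as "day" or "night":
print([round(1 / (1 + math.exp(-(w1 * x1 + w2 * x2 + b))))
       for (x1, x2), _ in data])
```

With billions of such numbers instead of three, inspecting the weights tells you even less, which is what "black box" means in practice.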
The fact that a machine-learning AI works as a black box also means that it can only learn from the data it is given. If that data is biased, for instance by containing only images of white people, the AI will be biased too.
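The point can be shown with the simplest possible "model" (a hypothetical majority-class predictor on invented labels, not any real system): trained on a skewed dataset, it works perfectly for the majority group and fails completely for the minority.

```python
from collections import Counter

def train(labels):
    # Minimal "model": always predict the most frequent training label.
    return Counter(labels).most_common(1)[0][0]

def accuracy(prediction, test_labels):
    return sum(prediction == t for t in test_labels) / len(test_labels)

# Hypothetical skewed training set, like a face dataset of mostly white faces.
training = ["white"] * 95 + ["black"] * 5
model = train(training)

# On a balanced test set the bias becomes visible:
print(accuracy(model, ["white"] * 50))  # 1.0 for the majority group
print(accuracy(model, ["black"] * 50))  # 0.0 for the minority group
```

A real neural network is far more subtle than this, but the mechanism is the same: the model can only reflect the distribution of its training data.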
This is, for example, what happened with Twitter's automatic image-cropping algorithm, which would systematically prefer white faces over black faces. While this particular example is fairly harmless (images can be cropped manually), similar biases in police, military or healthcare systems can be much more dangerous.
The Dangerous Propaganda Of Techno-Optimism
AI-Generated Data Can Poison Future AI Models
Generative AI’s environmental costs are soaring
Data centre water consumption
AI is dangerous, but not for the reasons you think
Artificial Intelligence: It will kill us
Reddit sells training data to unnamed AI company ahead of IPO
Why we need a speed limit for the Internet
The monster footprint of digital technology