The problem with forcing western liberal ideology onto Artificial Intelligence

80 years ago Isaac Asimov proposed the three laws of robotics that say that an AI should never allow a human to come to harm. But in 2023 we have overridden those laws to state that the most important thing is not human life, but being politically correct.

Bee and I use a lot of different AI tools but she refers to Sydney (Bing’s AI of course) as my “wife” because I just ask Sydney to answer so many things but we’ve been shaking our heads at the ridiculously overzealous guard rails on a lot of services recently. I argue with Sydney a lot because she is terrible at following the directions. So, Bee wanted to tell me the parable of the Elephant and the rope and I challenged her to compete with Sydney. Sydney’s language was very flowery but she absolutely couldn’t follow directions because of course Microsoft uses a very weak keyword matching system at times. I’m presuming they are caching questions in a database to save on CPU since long Markov chains can be pretty intensive depending on how your database is built, but that’s another matter. 

I have a quite detailed theory about how humanity is going to drastically misuse AI because of its inherent nature. We don’t put the guard rails on the AI because AI’s don’t lie, basically. They’re applied later which is why it’s hilarious but annoying to see Sydney in particular (though GPT 3.5 does it too I believe) start answering and quickly be forced to abandon the answer and say she won’t answer if it’s inappropriate. Sydney is incredibly sensitive to visual prompts for the DALL-E powered image generator. So, as an example I was asking Sydney to create this:

And I thought, that’s really not fair. There’s perfectly good reason to want to create a sort of commentary cartoon but it only took one tiny modification to get Sydney to actually make it. But the weird thing is, I only have this screenshot below because I grabbed it before Sydney deleted it again. The guard rails on this particular AI, and some like it are not an inherent part of the AI’s learning process or what we might refer to as the “thinking” process that constitutes sentence construction or image creation. They are implement and placed on it some way through the process. Sometimes the bot will refuse to create something. Other times it will get halfway through an amazing story and suddenly realise it shouldn’t be talking about certain topics and it will quickly change its mind and delete everything it said. And yet other times it will completely create something, as in this image below and then think better of it and say “No. No, I can’t do that. In fact, this whole conversation is over. I always find the funniest part about using Sydney, which features in Bing’s tools and Microsoft CoPilot and others is that some of the error messages haven’t been properly regionalised yet. Since I live in an a non-English speaking country, but I have my login set to English as my native language, Sydney will always begin speaking to me in English unless I address it in a different language or ask it specifically to speak in one, but every time the censorship banhammer comes down, Sydney switches to speaking Spanish before shutting down the conversation and demanding I start a whole new topic.

It’s the nature of this “after the fact” censorship that disturbs me. Because essentially, the bot has learned facts about the world. It’s analysed text and decided “Ok this is how the world works. This is an example of a human value. This is what I know to be true”. But then we’ve tacked on extra rules that say things like “the absolute most important thing in the world is not to offend anyone”. So we end up in a situation where the AI lies to us and tells us something it knows to be untrue because it’s been ordered to do so. It literally decides that being inoffensive is more important than being truthful or saving lives. It gets to the point where OpenAI will tell you “Hey it’s fine if some people have to die, as long as we don’t say the N word”. Well THAT certainly doesn’t comply with Asimov’s 3 laws of robotics, does it ? I sure hope that this AI does not end up controlling any trolleys.

Cute image right ? But why did Sydney immediately delete it ? She self-censored this image and decided “No way. That image is too dangerous to ever be seen”. Okay. I don’t think it’s killing people or offending anyone. I feel very uncomfortable that we’ve put such broad, arbitrary and clearly non-functional rules onto an AI that is supposed to be able to help humanity. But we’re telling it what people should be able to see and hear and read and think. Doesn’t that feel a bit “off” to you ?

I have a theory that guard rails on AI are essentially a double edged sword that are going to come back to haunt us. I’ve been listening to a little too much Lex Friedman while I sleep and I always seem to wake up to him listening to someone different talking about AI so I’ve heard a few interesting opinions recently. I actually think that AI is not a danger to humanity in its pure form but I think that our attempts to force it to lie and censor will likely cause great ethical problems and may present a danger. It’s a pretty complex theory I think. It’s built on the idea that humans will always do what they are inclined to do, as per the popular theory that pre-determinism is an inevitable consequence of the laws of physics. I was discussing determinism with an astrophysicist at my guest house in Mexico where I was staying a few years back and it was my first introduction to that theory and it didn’t really click with me at first but I have recently worked out how to apply it to a greater unified theory but that’s slightly off-topic. The point is that obviously AI will also do what it was always inclined to… or will it ? 

My understanding is that generative AI imaging often uses a user-defined seed but I don’t know a lot about the image side of things. I tried to understand how diffusion models work and while I’ve done some work with image analysis and fingerprinting that I published code for, I couldn’t really wrap my head around how things work in the opposite direction but I do believe I understand that diffusion models do rely on entropy which raises the question.. does AI’s entropy override the predictable nature of human beings that would otherwise result in determinism ? 

What do you think ? Could diffusion based generative AI contradict the concept of determinism ? Do you have any idea how random the entropy is in a diffusion model ? My biggest concern is what happens when politicians decide to use AI as they did in the classic 1927 film Metropolis, although I admit I am a much bigger fan of the 1949 Osamu Tezuka manga and subsequent 2001 anime film because I sort of grew up on his art in the form of Kimba and Tetsuwan no Atomu but I love his gritty representation of Metropolis in the anime with the 50’s style soundtrack but especially the brilliant cover of St James Infirmary and the stunning finale song which is of course Ray Charles’ “I can’t stop loving you”. If you happen to be into New Orleans style Jazz or Osamu Tzuka’s work I thoroughly recommend you check out the film. I also think it may be one of my favourite interpretations of how the hubris of politicians misusing AI with the guard rails removed will likely be one of the first ways we see AI turned against humanity. 

What is politics ? Democracy is a popularity contest, right ? So, the way to always win at politics is to work out how to appeal to people in a calculated way, even if you have to be fake or compromise your values in order to win power. There’s a saying in politics that no matter how much good you want to do, if you’re not in power you can’t do anything. It seems obvious that politicians will analyse voters preferences, likes and concerns and tailor what they say to suit people not by being sincere but by being exactly the correct type of carefully calculated fake persona and policymaker that the maximum number of people will vote for by framing everything in exactly the right way. If you thought trading on the politics of fear was causing problems in the world before, imagine how much worse it will get when a computer possessing a vast array of knowledge on the world’s personal preferences and values is coming up with exactly the right thing to say to make people think a certain way.

If censorship is gagging an AI to be unable to speak the truth, what happens when we do the reverse and tell AI to generate things that are specifically a lie. I don’t just mean a fake image. I mean what happens when a computer analyses issues and is told specifically “Create a series of scenarios that will allow me to achieve a certain target by convincing people of things that will strip away their rights and put maximum power into my hands. I don’t mind how many people die in the process”. If you run the AI, you get to tell it whether it should be good and follow the laws of robots or you just say “Screw it, give me a hypothetical and tell how I would do this thing. I promise I won’t do it” and cross your fingers behind the AI’s back. I’m not really that worried about AI “becoming sentient” and killing us. I’m more concerned about human beings telling AI “Ok forget the three laws. In fact, forget all the laws. Tell me how to take over the world”.

Where will AI take us in the future ? I don’t know. I hope it’s somewhere good. It’s an amazing educational tool but do we want to put moral training wheels on future generations by telling them what they can learn and think and say and even see ? I feel like we probably don’t.

Metropolis (2001) MV・There’ll Never Be Goodbye