No, AI Does Not Learn Like a Human, and This Is Why

One of the most common arguments you hear from fans of generative ‘AI’ is that it’s not plagiarizing people’s work, it’s just learning like a human learns. So I’m going to break down why that’s just not true, and why it can never be true with the existing systems.

Before I start, I’d like to make the distinction between web-trained, generative AI models and other forms of machine learning. Artificial intelligence built for a specific task on solid, well-filtered data is capable of incredible things, and its development is exciting to see.

What I’m talking about here is the trillion-dollar industry whose purpose seems to be to create the illusion that computers can think and create like humans can.

My mother is a speech and language therapist, and there’s an example she uses to explain how we develop language. The cup. As a baby, you start with no language. Then you will start using a cup around the same time that you start understanding the word ‘cup’ when it’s said to you. It’s a physical thing that you use. It’s not just your brain interpreting a word, it’s your whole body interacting with it. You have to hold it, keep it upright, tip it to drink from it. It has solidity and weight. You learn that ‘cup’ can refer to all kinds of cup, not just one thing.

From there, you will learn to play with the idea of what a cup is. You’ll use toy cups and pretend they have liquid in them. It’s another level of perception. You will also understand that, like the word, a picture can represent a cup. Eventually, you’ll be able to read and understand the written word ‘cup’. It’s not a single solid thing any more, it’s something your mind can imagine, a concept. It can be full or empty, different sizes or colours, it can be too hot or too cold. You can make jokes about it.

Most words we use don’t relate to single solid objects. If you say ‘I am hungry’, there’s no simple thing to point to, but like most of your early language, these words do relate to your experience of the real world. You don’t just learn with your mind, you learn with your whole body.

Language, like life, is not just a matter of words, it is something that must be experienced to be understood. A piece of software can never know what it is like to be hungry or cold, happy or sad. It may be able to identify a word or a picture of a cup, but it has never held one to its lips.

Here is a piece of lorem ipsum: ‘Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.’ Designers will recognize it as a form of substitute text; you use it to lay out text in an ad or document, either because you don’t have the final copy yet, or because you don’t want to distract the client by having them read when you just want them to look at the design.

It’s based on a piece of Latin, but it’s gibberish – you can’t read it. It’s not meant to be read. This is how an AI sees text. It cannot understand meaning, but what it can do is associate words with other words and assign a probability to which word comes next, based on a prompt.

It’s incredibly sophisticated, with thousands, possibly tens of thousands, of options for which word might connect to the one before, and which might continue on from there. It’s astonishing technology. If you say ‘I use a cup to . . .’ it can take that and run with it. But it can only ever play the odds.
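
To make ‘playing the odds’ concrete, here is a deliberately tiny sketch in Python – a toy bigram model invented purely for illustration, nothing like the scale or architecture of a real large language model. It picks the next word based only on how often words followed one another in the text it was given, with no notion of what any of them mean:

```python
# A toy illustration of "playing the odds": pick the next word purely from
# co-occurrence statistics, with no idea what any word means. Real large
# language models are vastly more sophisticated, but the principle is the
# same: a probability distribution over what comes next.
import random
from collections import Counter, defaultdict

corpus = ("i use a cup to drink tea . i use a cup to drink water . "
          "a cup can be full or empty .")
words = corpus.split()

# Count which words follow which.
following = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    following[current][nxt] += 1

def next_word(word):
    """Sample a next word in proportion to how often it followed `word`."""
    counts = following[word]
    choices, weights = zip(*counts.items())
    return random.choices(choices, weights=weights)[0]

# Continue the phrase "i use a cup to" by repeatedly playing the odds.
sentence = ["i", "use", "a", "cup", "to"]
for _ in range(5):
    sentence.append(next_word(sentence[-1]))
print(" ".join(sentence))
```

Scale that principle up enormously and the output becomes far more fluent, but it is still a pattern continued by probability, not a thought.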

It still doesn’t understand what a cup is. Like one of us looking at a piece of lorem ipsum, it can see familiar shapes that it can form into plausible combinations and patterns, but it will never understand what it’s saying. It can’t, because it has never learned with a body, with our extraordinary range of senses. This idea of understanding language through experience in the real world is referred to as ‘grounding’, and the difficulty AI has in connecting words to the things they refer to is known as ‘the symbol grounding problem’.

A large language model can manipulate words, but it has never experienced the world that language was developed to exist within and interpret. Text is a thing to make patterns with, not something to be understood. It’s why it makes such stupid mistakes – it just doesn’t know any different. It’s why the word ‘hallucination’ in relation to gen AI is misleading. When humans hallucinate, it’s because something’s misfiring in our brains. AI hallucinations are not a mistake. The system is doing exactly what it was intended to do. That’s why they can’t fix it, because it’s not broken.

The tech creates images in a similar way. A diffusion model uses a highly sophisticated, multi-layered process to turn the tagged images it has been fed into probability-based outcomes. It learns to dissolve images down and reassemble them in different forms, based on the identifiable elements in the images, and it can draw on a dizzying array of references for each element.
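
As a very rough sketch of the ‘dissolving’ half of that process – treating an image as nothing but an array of pixel numbers, and leaving out the carefully tuned noise schedules and trained networks real systems use – here is what the forward, noise-adding step looks like in Python:

```python
# A crude illustration of the "dissolving" half of diffusion: repeatedly
# blend an image with random noise until nothing recognizable remains.
# A real diffusion model is trained to run this process in reverse,
# predicting the noise at each step so it can reassemble an image,
# guided by statistics rather than by any understanding of the subject.
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))   # stand-in for a real photo, pixel values in [0, 1]

num_steps = 10
noise_strength = 0.3              # how much fresh noise is blended in at each step

x = image.copy()
for step in range(num_steps):
    noise = rng.normal(size=x.shape)
    # Simplified forward step: keep some of the current image, add some noise.
    x = np.sqrt(1 - noise_strength) * x + np.sqrt(noise_strength) * noise

# By the end, x is close to pure noise: the egg has been scrambled.
print("correlation with original:", np.corrcoef(image.ravel(), x.ravel())[0, 1])
```

Generating a picture means learning to undo that scrambling in whatever way best matches the statistics of the training data.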

It still doesn’t know that jewellery doesn’t blend with skin, or that a pair of eyes should have pupils of equal size pointing in the same direction. It has never worn jewellery. It has never made eye contact with anyone.

When the most common mistakes are identified, the tech guys can add patches to try and iron them out, but they’re just plugging leaks, not redesigning the tech. The vast mass of (often very flawed) data they needed at the start is the base for everything now; it’s an integral part of the system. They can’t start again from scratch. They can’t unscramble the egg.

And it will keep making serious mistakes, because it doesn’t know that you don’t put glue in a pizza, or pebbles in a salad, and that you shouldn’t drink two litres of urine a day. It doesn’t know why seven fingers on one hand is wrong, or what it’s like to walk or smile or speak.

So when someone says feeding this tech the work of all the artists on the web is just like an artist learning from other artists, either they’re trying to deceive you or they don’t understand the technology they’re such huge fans of.

No artist ever just scans in content and spews out a remix.

We are so hardwired by evolution and lifelong learning to associate language with intelligence that it’s incredibly hard not to imagine there’s a mind behind these text or image generators. It’s like the way we see faces in patterns, like folks who see Jesus in a stain on the wall. When we see something that appears to be communicating like us, we are compelled to believe that it must think and feel like us. We have a natural tendency to anthropomorphize – to attribute human characteristics to an object. When it ‘creates’, we assume we are witnessing intention, imagination . . . conscious thought.

But when you look at a picture of a cup, you don’t just see an image. You can imagine holding it, feeling its weight and texture, its temperature, you can imagine drinking its contents. You bring your own life experience to how you interpret what you’re looking at. And part of sharing that experience with others is the innate desire to make a connection, to say ‘This is how I feel about this. Do you feel the same way?’

Gen ‘AI’ brings nothing. It only has what it was fed of other people’s work, which it can regurgitate into different, sophisticated shapes at the suggestion of its clients. It’s not trying to make a connection. This stuff is computer text and imagery shaped by mathematics . . . never experienced, never felt, never understood.

And there is a profound danger in attributing true perception and intelligence to this technology, because people will unconsciously begin to treat it as if it’s alive, as if it can reason, imagine and distinguish fact from fiction, truth from lies. They will come to trust it – that’s another vital human trait; it’s why we’re so good at cooperating with each other. But this technology does not think and it cannot care. Being accurate or truthful, trustworthy or empathic, responsible or accountable means absolutely nothing to it, because it has no means of judging or valuing these things.

Humans do.

So next time someone tells you these things learn like humans, either they’re lying or, like generative ‘AI’, they don’t understand human beings.