AI - 1.10 - history - neural to LLMs
When babies are learning to talk, they reach a stage which pretty much everybody refers to as babbling. However, if you pay attention, careful attention, to what they're doing, you will realize that they probably think that they are actually speaking. They have learned the patterns, or at least a number of the patterns, that we use when we are speaking. The sounds that they make may not sound like English words to us, but you will notice that the pauses they make when they are babbling, the head tilts, and possibly even the movements of their hands, copy what we do when we are speaking.
They are learning to speak, and they learn to speak by copying the patterns that they see us using.
There are many patterns in our use of language. You probably know that the letter "e" is the most commonly used letter in the English language. The most common consonant is "t." A number of the patterns are statistical. When we can copy a sufficient number of these patterns, we can use the statistics, just the statistics, and nothing else, with no reference to any kind of meaning, to create a string of text that looks very much like the English language. In another area that I have studied, forensics, there is a field called forensic linguistics, or forensic stylistics, which looks at even more detailed statistical patterns in text, and can actually determine the specific author of a piece of written text.
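To make the "patterns are statistical" point concrete, here is a minimal sketch, in Python (my choice of illustration, not anything used by the systems discussed here), that counts letter frequencies in a sample of English text. On any reasonably sized sample of ordinary prose, "e" comes out on top and "t" leads the consonants. The file name is just a placeholder for whatever text you feed it.

```python
from collections import Counter

def letter_frequencies(text):
    """Return each letter's share of all letters, most common first."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return [(ch, 100 * n / total) for ch, n in counts.most_common()]

# "sample.txt" is a placeholder: any decent-sized chunk of English prose will do.
sample = open("sample.txt", encoding="utf-8").read()
for ch, pct in letter_frequencies(sample)[:6]:
    print(f"{ch}: {pct:.1f}%")
```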
Now, some of you may be somewhat suspicious of the proposition that a mere statistical analysis, no matter how complex, can generate lucid English text. Yes, I am oversimplifying this somewhat, and it's not just the probability of the next word that is being calculated, but the next three words, and the next seven words, and so forth. The calculation is quite complex, but it still may sound odd that it can produce what seems to be a coherent conversation.
Well, this actually isn't very new. There are types of statistical analysis known as Bayesian analysis and Markov chain analysis. Bayesian analysis has been used for many years in spam filters for email. And, around twenty years ago, somebody did this type of analysis (which is much simpler and less sophisticated than the large language model neural net analysis) on the published novels of Danielle Steel. Based on this analysis, he wrote a program that would write a Danielle Steel novel, and it did. This was presented to the Danielle Steel fan club, and, even when they knew that it was produced by a computer program, they considered it quite acceptable as an addition to the Danielle Steel canon. And, as I say, that was over two decades ago, and done as a bit of a lark. The technology has moved on quite a bit since then, particularly when you have millions of dollars to spend on building specialized computers in order to do the analysis and production.
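For readers who want to see what "just the statistics" looks like in practice, here is a toy word-level Markov chain text generator, again in Python. It is my own illustration, not the program from the novel anecdote and nothing like a modern large language model: it simply records which word has been seen to follow which, then walks those observed probabilities to spit out new text that mimics the surface patterns of the training text, with no notion of meaning at all. The training file name is a placeholder.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Record, for each word, every word that has been seen to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        chain[current].append(nxt)
    return chain

def generate(chain, start, length=30):
    """Walk the chain, repeatedly picking a random observed successor."""
    word = start
    output = [word]
    for _ in range(length):
        followers = chain.get(word)
        if not followers:          # dead end: this word was never followed by anything
            break
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)

# "training_text.txt" is a placeholder for whatever corpus you want to imitate.
corpus = open("training_text.txt", encoding="utf-8").read()
chain = build_chain(corpus)
print(generate(chain, start="The"))
```

A real system of this kind conditions on the previous two, three, or seven words rather than just one, which is what makes the output so much more fluent, but the principle is the same.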
One of the other areas of study that I pursued was psychology. Behavior modification was a pretty big deal at the time, and there were studies showing how subjects form superstitions. If you gave random reinforcement to a subject, the subject would associate the reward with whatever behavior they happened to be doing just before the reward appeared, and that behavior would be strengthened, and would occur more frequently. Because it would occur more frequently, when the next random reward happened, that behavior would likely have occurred recently, and so, once again, that behavior would be reinforced and become more frequent. In animal studies it was amazing how random reinforcement, presented over a few hours or a few days, would result in the most outrageous obsessive behavior on the part of the subjects.
This is, basically, how we form new superstitions. This is, basically, why sports celebrities have such weird superstitions. Whether they have a particularly good game, or winning streak, is, by and large, going to be random. But anything that they happen to notice that they did, just before or during that game, they are more likely to do again. Therefore they are more likely to do it on a future date when, again, they have a good game or win an important game. This is why athletes tend to have lucky socks, or lucky shirts, or lucky rituals. It's developed in the same way.
One of the other fields I worked in and researched was, of course, information technology, and the subset known as artificial intelligence. One of the many fields of artificial intelligence is that of neural networks. This is based on a theory of how the brain works that was proposed about eighty years ago and, almost immediately, was found to be, at best, incomplete. The theory of neural networks, though, did seem to present some interesting and useful approaches to trying to build artificial intelligence. As a biological or psychological model of the brain itself, it is now known to be sometimes woefully misleading. And one of the things that researchers found, when building computerized artificial intelligence models based on neural networks, was that neural networks are subject to the same type of superstitious learning to which we fall prey. Neural networks work by finding relations between facts or events, and, every time a relation is seen, that relation in the artificial intelligence model is strengthened. So it works in a way that is very similar to behavior modification, and leads, frequently, to the same superstitious behaviors.
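As a rough illustration of the "strengthen the relation every time it is seen" idea, here is a toy Python sketch of a Hebbian-style update. It is my own simplification, not any particular neural network architecture: whenever two events are observed together, the weight of the link between them is nudged up, so a purely coincidental pairing that happens to repeat gets locked in exactly the same way as a genuine one. The event names are invented for the example.

```python
import random
from collections import defaultdict
from itertools import combinations

# Association weights between pairs of events, all starting at zero.
weights = defaultdict(float)
LEARNING_RATE = 0.1

def observe(events):
    """Strengthen the link between every pair of events seen together."""
    for a, b in combinations(sorted(events), 2):
        weights[(a, b)] += LEARNING_RATE

# Simulate observations: "reward" genuinely accompanies "lever press",
# while "scratching" is just a bystander behaviour that shows up at random.
for _ in range(1000):
    events = {"lever press", "reward"}
    if random.random() < 0.5:
        events.add("scratching")
    observe(events)

# The coincidental pairings still end up with substantial weights:
# that is the "superstitious" association.
for pair, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(pair, round(w, 1))
```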
The new generative artificial intelligence systems based on large language models are, basically, built on a variation of the old neural network theory. So it is completely unsurprising that one of the big problems we find with generative artificial intelligence is that it tends, when we ask it for research, to present complete fictions to us as established fact. When such a system presents us with a very questionable piece of research, and we ask it to justify the basis of this research, it will sometimes make up entirely fictional citations in order to support the proposal presented. This has become known as a "hallucination."
Calling these events "hallucinations" is misleading. Saying "hallucination" gives the impression that we think there is an error in either perception or understanding. In actual fact, generative artificial intelligence has no understanding, at all, of what it is telling us. What is really going on is that we have built a large language model by feeding a system based on a neural network model a huge amount of text. We have asked the model to go through the text, find relationships, and build a statistical model of how to generate this kind of text. Because these systems can parrot back the material that has been fed into them (in ways, incidentally, that are very problematic in terms of copyright law), we do, fairly often, get a somewhat reasonable, if very pedestrian, correct answer to a question. But, because of the superstitious learning that has always plagued neural networks, sometimes the systems find relationships that don't really relate to anything. Buried deep in the hugely complex statistical model that the large language models are built on are unknown traps that can be sprung by a particular stream of text that we feed into the generative artificial intelligence as a prompt. So it's not that the genAI is lying to us: it is only generating a stream of text based on the statistical model it has built from other text. It doesn't know what is true, or not true.
There is a joke, in the information technology industry, that asks: what is the difference between a used car salesman and a computer salesman? The answer is that the used car salesman knows when he is lying to you. The implication, of course (and, in my five decades of working in the field, I have found it to be very true), is that computer salesmen really don't know anything about the products that they are selling. They really don't know when they are lying to you. Generative artificial intelligence is basically the same.
AI topic and series
Introduction and ToC: https://fibrecookery.blogspot.com/2026/01/ai-000-intro-table-of-contents.html