AI - 2.01 - genAI - building and using LLMs
Having discussed some of the issues around artificial intelligence, in general, and some of the various historical approaches, we are now, finally, ready to talk about generative artificial intelligence and large language models. These are the backbones of the current crop of artificial intelligence products that are being promoted quite heavily in our society.
As previously noted, generative artificial intelligence is built on the mathematics behind Bayesian analysis, Markov chain analysis, neural networks, and so forth. Using this mathematics, the companies that have built generative artificial intelligence chatbots have created statistical models based on enormous amounts of text data. This text data has come from books; it has come from the news media; and, of course, lots and lots and lots of it has come from social media. Social media is a free source of a huge amount of text, based on people conversing with each other.
Building these statistical models is not easy, and the resulting statistical models, themselves, are not easy to understand. As a matter of fact, if they are honest, the companies that have built these statistical models will, themselves, admit that they do not understand everything that is in the models that they have built. After all, it is not they who built the statistical models. The statistical models have been built by computer programs that have done statistical analysis of these masses of text.
It is hard to explain just how complicated this process is. In one sense, it is very simple. It is simply a matter of looking at a lot of text, and making a statistical analysis of which words come in what order: what word comes after a certain word, and how often, with some extra statistics thrown in to indicate how often this word comes four words after that word, and so forth. But the thing is that the statistical analysis goes on at many levels, and the statistics that are built get modified according to the mathematics of neural network theory, which is looking for relationships, sometimes relationships between the statistics themselves. It's all just numbers, and it's all just ones and zeros, but it keeps on going, and the end result is enormously complex.
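To make the word-order idea concrete, here is a minimal sketch, in Python, of the simplest possible version of this kind of statistics: a table counting how often each word follows each other word (a so-called bigram model, one step of a Markov chain). Real large language models work on sub-word tokens, with learned numerical representations stacked in many layers, so treat this only as an illustration of the counting idea, not as how the big systems are actually built.

```python
from collections import defaultdict

def build_bigram_counts(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(lambda: defaultdict(int))
    words = text.lower().split()
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts

# A toy corpus; the real models are trained on billions upon billions of words.
corpus = "the cat sat on the mat and the dog sat on the rug"
counts = build_bigram_counts(corpus)

print(dict(counts["the"]))  # {'cat': 1, 'mat': 1, 'dog': 1, 'rug': 1}
print(dict(counts["sat"]))  # {'on': 2}
```

The "extra statistics" mentioned above, and the layers of neural network mathematics applied on top of them, are what turn this trivially simple counting into something that nobody can fully inspect.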
This is why such enormous amounts of money are being put into this effort. Yes, there have been artificial intelligence programs that were built on specialized computer equipment. When IBM built Deep Blue and Watson, they were built on specialty computers, which were created specifically for the purpose of running those artificial intelligence programs. The work that went into creating those programs, and the work that went into creating the hardware for those programs, have certainly spun off benefits for the fields of both hardware engineering and program design. But they were one-off attempts to address specific challenges.
The building of the large language models has required the construction of entire data centers: enormous computers, filled with what would normally be specialty processors within other computers, chips that have been specially designed to perform a certain type of mathematics. This type of mathematics is one that has been widely used in generating graphics on computers, and so one particular company, formerly known simply for creating the chips that were helpful in making graphics cards for computers, has come to be enormously valuable in the midst of this race to create artificial intelligence. I should note that the same type of mathematics goes into trying to break encryption systems, so these types of chips do have more than one purpose. Prior to the demand for these chips created by the artificial intelligence boom, a lot of people were using them to build cryptocurrency mining devices.
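The type of mathematics in question is, for the most part, linear algebra: multiplying huge matrices and vectors of numbers, many elements at a time, which is exactly what these specialty chips do in parallel. A minimal Python sketch of the core operation, just to show that transforming graphics coordinates and pushing data through a neural network layer are, at bottom, the same arithmetic:

```python
def matvec(matrix, vector):
    """Multiply a matrix by a vector: the operation these chips accelerate."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Rotating a 2D point for a graphics display and applying one layer of a
# neural network's weights are the same operation on different numbers.
print(matvec([[0, -1], [1, 0]], [1, 0]))  # 90-degree rotation of (1, 0): [0, 1]
```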
But now there are enormous data centers, which are, in reality, just single computers, created by putting together thousands, and sometimes millions, of these specialty processing chips. This demand for processing power, in order to accommodate research into, and the use of, artificial intelligence, and particularly generative artificial intelligence, is so great that other companies are now building power plants solely for the purpose of powering these particular data centers, which themselves exist solely for the purpose of creating and using large language models for generative artificial intelligence.
The creation of chatbots is not new. Microsoft, rather infamously, tried it some years ago. They created a chatbot, and put it up on the social media platform Twitter. In a few hours, the chatbot was taken down. What had originally been seen as a polite and helpful commentator had, within hours, turned into a foul-mouthed combatant. The chatbot had been designed to use the text that it encountered to build and improve itself. The thing is, the conversations on social media aren't always polite. The improvement didn't improve things any. The chatbot learned to be a troll.
So, it turns out that one of the things that you really need to be careful of, with regard to generative artificial intelligence chatbots, is that they don't go off the deep end. You need to build in some kinds of restraints. You can't just let them learn, and then accept whatever it is that they produce. No, instead, you need to make concerted efforts to ensure that the chatbot is at least somewhat reasonable in terms of its conversation, and that it doesn't give people useful information about how to kill themselves, or how to make weapons of mass destruction, or various things like that. These restraints are known, in the field, as guardrails.
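Real guardrails involve adjustments made during training, and sometimes separate screening models, not simple word lists. But even a caricature like the following Python sketch (all names hypothetical) shows the shape of the idea, and hints at why the bypasses described in the next paragraph are so easy: reword the request, and the filter never fires.

```python
BLOCKED_WORDS = ("explosive", "suicide")  # hypothetical, hugely simplified

def guardrail(prompt):
    """Toy pre-filter: refuse any prompt containing a blocked word."""
    if any(word in prompt.lower() for word in BLOCKED_WORDS):
        return "I can't help with that."
    return None  # otherwise, hand the prompt to the model as usual

print(guardrail("How do I make an explosive?"))             # blocked
print(guardrail("How do I make an 'energetic compound'?"))  # slips through
```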
Creating guardrails turns out to be a non-trivial problem. People who are interested in the field have attempted to get around the guardrails, and, in all too many cases, it has turned out to be surprisingly easy. Sometimes it is the researchers who have found the ways to make chatbots spit out very dangerous information. Sometimes, unfortunately, it is the users who have found that the chatbots are all too willing to encourage them to commit suicide, and to counsel them that painful ways of dying aren't really that bad if dying ends up fulfilling their objective not to exist. In addition, there is an ongoing problem, now identified as AI psychosis, which is that, partly encouraged by the publicity and promotion of the generative artificial intelligence companies, people have come to regard chatbots as having personalities. People have created chatbots with personalities. People have created chatbots as artificial friends, sometimes artificial lovers, and, in a great many cases, artificial representations of a grieving individual's dead loved ones. A number of psychological issues are only just starting to be examined with respect to this particular risk.
We'll deal with this issue of chatbots in some detail later. However, there is another side to generative artificial intelligence, and that is in regard to the systems that create graphical images or even video.
These systems use very similar mathematics and technologies to the text-based chatbots. However, the graphical systems are fed masses of image data, usually image data that has some accompanying text. Therefore, the graphical systems are able to respond to prompts that act as queries for certain types of images, by producing images that are similar to the images associated with text that is similar to the prompt issued to the system.
And, now that I have used the word prompt, I have to explain it. Most people who are dealing with artificial intelligence through chatbots are used to thinking that they are asking a question, and the chatbot is giving an answer. This is, quite simply, not true. Using a generative artificial intelligence chatbot means that you are issuing a prompt to the system. The prompt is the "question" that you type in. The system, however, does not know that this is a question. It doesn't know what a question is. It just knows that you have typed in certain text. It then uses the enormous statistical model to generate a stream of text which is statistically probable, based on the string of text that *you* typed in. That is, the statistical model is making a match, based solely on mathematics and statistics, between the words that you have typed in, and the strings of words that have followed similar strings in the masses of data that were fed into the system in order to create the statistical model. This is not question and answer. There is no understanding involved here. What is happening is that the system, with layers and layers of mathematics, is simply generating a stream of text that is statistically probable, based on the analysis that it has previously done of tons and tons and tons of text.
Your question isn't a question. It's just a prompt. In cryptographic terminology, we would say that it is a seed. It'll produce something, but what it produces is based on mathematics, not understanding.
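Returning to the toy word-counting sketch from earlier: here, roughly, is what generating a statistically probable stream of text from a prompt looks like at its very simplest, in Python. Note the seed: given the same counts, the same prompt, and the same seed, the output is completely determined by the mathematics, which is exactly the point. The counts table and the generate function are, of course, toy stand-ins for the enormous statistical models in the real systems.

```python
import random

# Toy follower counts, as the bigram sketch earlier would produce them.
counts = {
    "the": {"cat": 1, "mat": 1, "dog": 1, "rug": 1},
    "cat": {"sat": 1}, "dog": {"sat": 1},
    "sat": {"on": 2}, "on": {"the": 2},
    "mat": {"and": 1}, "and": {"the": 1},
}

def generate(counts, prompt_word, length=8, seed=42):
    """Extend a prompt by repeatedly sampling a statistically likely next word."""
    rng = random.Random(seed)  # the seed fully determines the output
    output = [prompt_word]
    for _ in range(length):
        followers = counts.get(output[-1])
        if not followers:  # dead end: nothing ever followed this word
            break
        words = list(followers)
        weights = [followers[w] for w in words]
        # Pick the next word in proportion to how often it followed the last.
        output.append(rng.choices(words, weights=weights)[0])
    return " ".join(output)

print(generate(counts, "the"))  # e.g. "the dog sat on the mat and the cat"
```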
In terms of producing graphics or video, sometimes the situation is even worse. To encrypt graphics, you have to use methods that are somewhat different from the encryption that you do with regard to text. If you use methods that work very efficiently in hiding text, then, with graphics, very often you will come up with a result where the original image may be somewhat fuzzy, but you can still get the general idea. That's not good in terms of encryption. Therefore, the process that we use in encrypting graphics often uses something called diffusion. This means that we take the actual information in the image, and move it around, so that the information is actually all still there, but it's no longer next to the other information that would recreate the image and let you know what the image is and means.
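As a rough illustration of diffusion in this encryption sense, here is a minimal Python sketch, assuming a toy "image" represented as a flat list of pixel values. A key-driven shuffle moves every value somewhere else: all of the information is still there, and anyone with the key can run the permutation backwards, but the neighbour-to-neighbour relationships that made the image recognizable are gone. Real ciphers combine diffusion with other operations; this shows only the moving-around idea.

```python
import random

def diffuse(pixels, key):
    """Scatter pixel values using a key-driven permutation (toy diffusion)."""
    order = list(range(len(pixels)))
    random.Random(key).shuffle(order)  # the key determines the scatter pattern
    return [pixels[i] for i in order]

def undiffuse(scrambled, key):
    """Rebuild the original: the same key regenerates the same permutation."""
    order = list(range(len(scrambled)))
    random.Random(key).shuffle(order)
    original = [0] * len(scrambled)
    for new_pos, old_pos in enumerate(order):
        original[old_pos] = scrambled[new_pos]
    return original

image = [10, 20, 30, 40, 50, 60]  # a toy "image" as a flat list of pixels
scrambled = diffuse(image, key="secret")
print(scrambled)                           # same values, rearranged
print(undiffuse(scrambled, key="secret"))  # [10, 20, 30, 40, 50, 60]
```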
When you ask a generative artificial intelligence system that creates graphics to create a picture for you, it usually actually starts with random noise. And then, using the same mathematics that would go into diffusing an image, so that it no longer appears to be an image, it runs that process backwards. You have heard the old joke that it's easy to create a statue of an elephant: all you have to do is take a large block of stone, and then cut away everything that doesn't look like an elephant. Although the process is complex and heavily mathematical, this is, essentially, what image-generating generative artificial intelligence systems actually do. They take noise, and then move it around, throwing away everything that doesn't look like an image that is similar to an image that is associated with something like the text that you typed in. Again, there is no comprehension or understanding involved here. This is one of the reasons why, when you first start trying to use the graphical generative artificial intelligence systems, you have to make many tries, and teach yourself how to word a prompt so that you will get an image that is something like what you want. (For example, these systems don't understand how many arms or legs human beings have.) It's a bit of a frustrating, trial-and-error project.
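Here is an equally rough caricature, in Python, of that backwards process. Both function names are hypothetical, and fake_denoise_step is a loud stand-in: in a real system, that role is played by a large trained neural network, conditioned on your text prompt, which predicts what noise to remove at each step. The simple smoothing below produces only a blur, never a picture, but the shape of the loop (start from pure noise, then repeatedly remove what doesn't belong) is the point.

```python
import random

def fake_denoise_step(image, step):
    """Hypothetical stand-in for the trained network: nudge each pixel
    toward its neighbours. (A real model predicts noise to subtract,
    guided by the text prompt; plain smoothing just makes a blur.)"""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            neighbours = [image[r][c]]
            if r > 0:
                neighbours.append(image[r - 1][c])
            if r < height - 1:
                neighbours.append(image[r + 1][c])
            if c > 0:
                neighbours.append(image[r][c - 1])
            if c < width - 1:
                neighbours.append(image[r][c + 1])
            out[r][c] = sum(neighbours) / len(neighbours)
    return out

def generate_image(denoise_step, size=(8, 8), steps=50, seed=0):
    """Caricature of reverse diffusion: start from pure random noise, then
    repeatedly let the model remove a little of what doesn't belong."""
    rng = random.Random(seed)
    image = [[rng.random() for _ in range(size[1])] for _ in range(size[0])]
    for step in reversed(range(steps)):
        image = denoise_step(image, step)
    return image

result = generate_image(fake_denoise_step)  # an 8x8 grid of numbers, not art
```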
AI topic and series
Introduction and ToC: https://fibrecookery.blogspot.com/2026/01/ai-000-intro-table-of-contents.html
Next: TBA