AI’s information limits
Last month I talked about what artificial intelligence needed to be useful. I also talked about AI being smart, but I should have qualified what I meant by using such an emotive word. Like most people, I was guilty of assigning a human quality to a technology attribute: in this case, simply that the models store a lot of knowledge.
Most of us enthusiastically anthropomorphise generative AI because of the incredible way it produces and uses language. Arthur C Clarke once said, “Any sufficiently advanced technology is indistinguishable from magic”. This would be fine, except that without looking inside the magic box we are making some dangerous assumptions.
Even in the short time since ChatGPT hit the zeitgeist, users have been reporting a degradation in its performance, something that seems to be backed up by AI researchers. While experts work through why generative AI models seem to degrade, it is easy to run a small experiment ourselves to see what may be happening, one that certainly illustrates the difference between conscious intelligence and apparently smart knowledge systems such as generative AI.
The easiest way to perform this experiment is to open two chat sessions with the same or different large language model tools, such as ChatGPT, Bard, Claude or any of the other similar products that are emerging. The hardest step is establishing a conversation between the sessions. You will usually need to do a little work to get it going, such as suggesting to both that you would like to have a conversation about the game of chess, the history of London or trends in psychology. Once you have prodded both into conversing, step out of the way and simply copy and paste from one to the other and back again.
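If you would rather not do the copying and pasting by hand, here is a minimal sketch of the relay loop using the OpenAI Python client. The model name, seed topic and turn count are illustrative choices of mine, and any chat API with a message-history interface follows the same pattern.

```python
# Relay a conversation between two independent chat sessions.
# Model name and turn count are illustrative, not prescriptive.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def reply(history):
    """Send one session's message history and return the model's reply."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=history,
    )
    return response.choices[0].message.content

# Seed session A with a topic; its first reply becomes session B's opener.
seed = ("I would like to have a conversation with you about the history "
        "of London. Please open the discussion.")
session_a = [{"role": "user", "content": seed}]
session_b = []

for turn in range(10):
    # Session A speaks; its reply becomes session B's next user message.
    a_says = reply(session_a)
    session_a.append({"role": "assistant", "content": a_says})
    session_b.append({"role": "user", "content": a_says})

    # Session B answers; copy that back to session A, and repeat.
    b_says = reply(session_b)
    session_b.append({"role": "assistant", "content": b_says})
    session_a.append({"role": "user", "content": b_says})

    print(f"--- turn {turn} ---\nA: {a_says}\nB: {b_says}")
```

Run it for ten turns or so and watch how quickly the exchanges converge.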
Initially they will compete to regurgitate what they know and admire what the other knows, but almost every conversation, without human intervention, rapidly degrades into mutual admiration devoid of meaningful new content. It is common to see phrases like “I admire you” and “I really love your approach”. That’s because the systems anticipate what you want, and admiration is typically top of the list.
Similarly, you can combine any of the image creation tools with the chat tools and create, interpret and re-create images repeatedly. Very quickly you enter a repetitive cycle: create a picture of several elements, interpret the resulting picture, create a new picture from the interpretation, and you will see the same material over and over again.
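The same loop can be sketched in a few lines, again using the OpenAI client as one example; the model names are illustrative, and any text-to-image tool paired with a vision-capable chat model will show the same convergence.

```python
# Image round-trip: generate, describe, regenerate from the description.
# Model names are illustrative; swap in whichever tools you use.
from openai import OpenAI

client = OpenAI()

def generate(prompt):
    """Create an image from a text prompt and return its URL."""
    result = client.images.generate(model="dall-e-3", prompt=prompt, n=1)
    return result.data[0].url

def describe(image_url):
    """Ask a vision model to interpret the image as a new prompt."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this picture as an image prompt."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return response.choices[0].message.content

prompt = "A fox reading a newspaper on a rainy London street"
for cycle in range(5):
    url = generate(prompt)
    prompt = describe(url)  # the interpretation becomes the next prompt
    print(f"cycle {cycle}: {prompt}")
```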
A very likely cause of this behaviour can be found in Shannon entropy, also known as information entropy. Claude Shannon established the relationship between thermal entropy and the quantity of information a system holds in a seminal paper back in 1948. Since then, physicists have come to regard information as being as fundamental as energy: it is neither created nor destroyed, it simply moves between systems.
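For reference, Shannon's measure of the information content of a source X whose symbols occur with probabilities p(x) is:

H(X) = -\sum_{x} p(x) \log_2 p(x)

The more predictable a source becomes, the lower its entropy; a source that only ever repeats itself carries no new information at all.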
This approach of quantifying information is very useful when thinking about generative AI. Despite the term “generative”, these systems are not creating any new information; rather, they simply recycle and combine content they have already stored. Even synthetic data generation, where lower-value information is combined to produce a smaller quantity of higher-value content, still obeys the rules of information entropy.
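Information theory makes this precise through the data processing inequality. If synthetic content Z is produced only by processing existing data Y, which itself derives from an original source X (a Markov chain X \to Y \to Z), then:

I(X; Z) \le I(X; Y)

In other words, no amount of downstream processing can increase what a system carries of its original source; it can only preserve or lose it.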
You see this in action in our small experiment creating a “conversation”. When we humans interact, we deduce new things that effectively become information. However, when generative AI interacts with its own data, it is simply recycling information which, like a thermal system, gradually degrades.
This is likely why people feel that the performance of AI models is deteriorating. Without knowing what is human generated and what is the exhaust of other AI processes, it is easy for generative AI to “feed” on a mixture of both. This is akin to eating white bread: lots of content with little nutritional value!
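You can watch this degradation in a toy simulation: repeatedly re-estimate a word distribution from a finite sample of its own output, the statistical equivalent of a model training on its own exhaust. The vocabulary and sample sizes below are arbitrary choices of mine, but the downward drift in entropy is not.

```python
# Toy illustration of "feeding on its own exhaust": re-estimate a word
# distribution from a finite sample of itself, resample, and repeat.
# Rare words vanish and entropy drifts downward generation by generation.
import numpy as np

rng = np.random.default_rng(0)

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability symbols."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

vocab = 1000                      # distinct "words"
sample_size = 2000                # finite sample each generation
p = np.full(vocab, 1.0 / vocab)   # start with a uniform distribution

for generation in range(10):
    counts = rng.multinomial(sample_size, p)
    p = counts / sample_size      # "retrain" on our own output
    print(f"gen {generation}: {entropy(p):.2f} bits, "
          f"{np.count_nonzero(p)} words surviving")
```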
Of course, humans don’t sit in an information vacuum. Our environment contains ambient information such as news feeds, announcements and even the conversations we overhear, all of which adds to the content we hold in our heads. We then combine this information and turn it into new ideas through cognitive leaps. It may be that this real-time processing of a background flow of data is the next step for generative AI, or it may be that this is where true general (rather than generative) AI requires something more that we don’t yet understand.
All of this may lead regular readers of my blog to feel that I’m flipping between excitement about generative AI and downplaying its potential. Far from it: I am convinced that there is much more we can do and, if we play it right, the dividend for our societies could be huge. But decision-making in the absence of understanding the detail of generative AI serves no-one well.