OpenAI has made a version of GPT-4, its most recent text-generating model, that can retain roughly 50 pages' worth of content thanks to a greatly expanded context window.
That might not sound like much. But it's four times as much text as the standard GPT-4 can hold in its "memory" (32,768 tokens versus 8,192) and eight times as much as GPT-3.5 (4,096 tokens).
During a demo this afternoon, Greg Brockman, co-founder and president of OpenAI, highlighted the model's ability to work with lengthy documents, noting that the company is eager to see what kinds of applications this opens up.
In text-generating AI, the context window is the span of text, measured in tokens, that the model can take into account when producing new text. Models like GPT-4 learn to write from enormous databases of text, but at any given moment they can only "see" the small slice of text that fits inside the window.
Models with small context windows tend to lose track of even recent conversations, causing them to veer off topic. After a while they also forget their original instructions and simply extrapolate from the most recent words in their window.
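To make the mechanics concrete, here is a minimal sketch of recency-based history trimming. It is not how OpenAI's models actually manage context: real systems count tokens with a proper tokenizer, whereas this toy approximates a token as a whitespace-separated word, and the `fit_to_window` helper and its budget are illustrative stand-ins.

```python
# Toy illustration of a fixed context window: only the most recent
# turns that fit within the token budget survive; everything older
# silently falls out of view.

def fit_to_window(messages, max_tokens):
    """Keep only the most recent messages that fit in the window."""
    kept, used = [], 0
    for msg in reversed(messages):       # walk newest-first
        cost = len(msg.split())          # crude token estimate
        if used + cost > max_tokens:
            break                        # older turns drop out of the window
        kept.append(msg)
        used += cost
    return list(reversed(kept))          # restore chronological order

history = ["I live in Canada.", "I have two kids.",
           "Never book me on Wednesdays.", "Book me a dentist appointment."]
print(fit_to_window(history, max_tokens=8))
# → ['Book me a dentist appointment.']
```

With an 8-token budget, everything the user said about Canada, kids and Wednesdays is gone by the fourth turn, which is exactly the forgetting Pike describes below; a larger window simply pushes that cliff further out.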
As Allen Pike, a former Apple software engineer, puts it:
The model won’t remember anything you’ve taught it, including the fact that you live in Canada, have kids, and despise booking appointments on Wednesdays. Even if you haven’t mentioned your name in a while, it will forget that too. Yet, when conversing with a GPT-powered character, it can feel like you are forming a connection and reaching something quite remarkable. Sometimes it can be a bit disorienting, but that’s typical for humans too. Eventually it becomes evident that it has no long-term memory, which destroys the illusion.
We have yet to gain access to GPT-4-32k, the version with the expanded context window. OpenAI says requests for the high- and low-context GPT-4 models are being processed at different rates depending on capacity. But it is easy to imagine conversations with this model being far more compelling than those with the previous-generation model.
With its larger "memory," GPT-4 should be able to converse relatively coherently for hours, potentially even days, rather than minutes. It should also be less prone to going off the rails. Pike notes that one reason Bing Chat sometimes behaves badly is that its initial instructions, such as being helpful and responding politely, get displaced by subsequent questions and answers.
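The displacement Pike describes follows directly from recency-based trimming: the system instruction is the oldest thing in the history, so it is the first thing to fall out of the window. The sketch below contrasts naive trimming with pinning the instruction; the whitespace "tokenizer," the budgets, and both helper functions are simplifications for illustration, not OpenAI's or Microsoft's actual approach.

```python
# If history is trimmed purely by recency, the system instruction
# (the oldest entry) is the first casualty. Pinning it keeps the
# model's ground rules in view no matter how long the chat runs.

def trim_by_recency(messages, max_tokens):
    """Naive trimming: keep the newest turns that fit the budget."""
    kept, used = [], 0
    for msg in reversed(messages):
        cost = len(msg["text"].split())  # crude token estimate
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

def trim_pinning_system(messages, max_tokens):
    """Always keep the system instruction; trim only the chat turns."""
    system, turns = messages[0], messages[1:]
    budget = max_tokens - len(system["text"].split())
    return [system] + trim_by_recency(turns, budget)

chat = [{"role": "system", "text": "Be helpful and respond politely."},
        {"role": "user", "text": "Tell me a joke."},
        {"role": "user", "text": "Now summarize this long article for me."}]

print([m["role"] for m in trim_by_recency(chat, 12)])      # → ['user', 'user']
print([m["role"] for m in trim_pinning_system(chat, 12)])  # → ['system', 'user']
```

A larger context window does not change this logic; it only delays the point at which trimming kicks in, which is why GPT-4-32k should hold onto its instructions for much longer conversations.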
The full picture is more complicated than that, of course. But the context window plays a big role in grounding these models. Time will tell what kind of real impact it makes.