Yesterday, OpenAI released GPT-4, the long-awaited AI-driven text generator, and it’s an interesting product.
GPT-4 builds on GPT-3 with some critical enhancements, like giving more factually accurate responses and offering developers more control over its style and behavior. It's also multimodal, meaning it can interpret images: given a photo, it can generate captions and elaborate on what the picture shows.
But GPT-4 has serious flaws. Much like GPT-3, the model produces false information and makes basic reasoning errors. In one example from OpenAI's own blog, GPT-4 described Elvis Presley as the "son of an actor", which is wrong: neither of his parents was an actor.
To better understand GPT-4's development, its capabilities and its limitations, TechCrunch spoke with Greg Brockman, co-founder and president of OpenAI, via video call on Tuesday.
Asked how GPT-4 differs from GPT-3, Brockman's answer was a single word: different.
He told TechCrunch that the model still has plenty of problems and makes a great many mistakes, but the jump in skill is unmistakable in areas like calculus and law, where GPT-4 went from being really bad to being comparable to humans.
The test results bear him out. On the AP Calculus BC exam, GPT-4 scores a 4 out of 5, while GPT-3.5, the model sitting between GPT-3 and GPT-4, scores only a 1. And on a simulated bar exam, GPT-4 passes with a score around the top 10% of test takers; GPT-3.5's score hovered around the bottom 10%.
One of GPT-4's more intriguing aspects is its multimodality. Unlike GPT-3 and GPT-3.5, which could only accept text prompts (e.g. "Write an essay about giraffes"), GPT-4 can take both an image and text as a prompt to perform some task (e.g. a picture of giraffes in the Serengeti with the prompt "How many giraffes are shown here?").
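OpenAI hasn't opened image inputs to the public yet, but to make the idea concrete, here is a minimal sketch of what such a request could look like, assuming a multimodal extension of the chat-style interface in OpenAI's Python SDK. The model name, image URL and message structure below are illustrative assumptions, not a documented GPT-4 image API.

```python
# Hypothetical sketch of an image + text prompt. The message shape
# follows OpenAI's chat completions SDK; the model identifier and
# image URL are placeholders, not a documented image endpoint.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # assumed model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "How many giraffes are shown here?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/serengeti-giraffes.jpg"},
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```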
GPT-4 was trained on both image and text data, whereas earlier versions used text alone. OpenAI says the training data came from a range of licensed, created and publicly available sources, which may include publicly available personal information, but Brockman demurred when asked for further details. It's worth noting that training data has gotten OpenAI into trouble before.
GPT-4’s capacity for understanding images is remarkable. For example, when given the prompt “What’s funny about this image? Describe it panel by panel” and a three-paneled image of an old VGA cable being plugged into an iPhone, GPT-4 provided a detailed explanation of each panel and accurately identified the joke (“The humor in this image comes from the ridiculousness of putting an outdated VGA connector into a modern smartphone charging port”).
For now, the only company with access to GPT-4's image-analysis capabilities is Be My Eyes, an app for the visually impaired. According to Brockman, any wider rollout will be slow and deliberate as OpenAI weighs the potential benefits and risks.
There are policy questions to work through, Brockman said, like how images of people should be handled in light of facial recognition concerns, and where the boundaries of acceptable use lie. He suggested these should be figured out gradually.
OpenAI faced similar ethical questions with DALL-E 2, its text-to-image system. After initially disabling the capability, OpenAI allowed customers to upload people's faces and edit them with the AI-powered image generator. At the time, OpenAI claimed that upgrades to its safety system made the face-editing feature possible by "minimizing the potential of harm" from deepfakes as well as attempts to create sexual, political and violent content.
It is essential to prevent GPT-4 from being misused in ways that cause harm, whether psychological, financial or otherwise. Shortly after the model was made available, a blog post from the Israeli cybersecurity company Adversa AI showed how to bypass OpenAI's content filters and get GPT-4 to generate phishing emails, offensive descriptions of gay people and other highly objectionable text.
It’s not a surprise that Meta’s BlenderBot and OpenAI’s ChatGPT have been known to make controversial and even divulging comments. People had expectations that GPT-4 would be better at preventing this, including myself.
Asked about GPT-4's robustness, Brockman stressed that the model has been through six months of safety training and that, in internal tests, it was 82% less likely to respond to requests for content disallowed by OpenAI's usage policy and 40% more likely to produce "factual" responses than GPT-3.5.
Brockman said the team spent a lot of effort trying to understand what GPT-4 is capable of, and that putting it out in the world is how they learn what it can do. They continue to roll out updates and improvements so the model can be tuned to whatever personality or mode it's asked to adopt.
The early results in the wild aren't terribly encouraging. Beyond the Adversa AI tests, Bing Chat, Microsoft's GPT-4-powered chatbot, has proven easy to manipulate. Using carefully tailored inputs, users have gotten the bot to profess love, threaten harm, defend the Holocaust and invent conspiracy theories.
Brockman didn't deny that GPT-4 falls short here. But he highlighted the model's new steering tools, including an API-level capability called "system messages." System messages are essentially instructions that set the tone and boundaries for GPT-4's interactions. A system message might read, for example: "You should behave like a tutor and always answer in the Socratic manner. Never provide the student with the answer; instead, pose questions that help them figure out the solution themselves."
The idea is that system messages act as guardrails, keeping GPT-4 from veering off course.
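To make this concrete, here is a minimal sketch of passing a system message through OpenAI's Python SDK. The tutor instruction mirrors the example above; the user question and the behavior noted in the comments are illustrative.

```python
# Minimal sketch: steering GPT-4 with a system message via OpenAI's
# chat completions API. The tutor instruction mirrors the example in
# the article; the user question is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        # The system message sets the guardrails for the whole exchange.
        {
            "role": "system",
            "content": (
                "You should behave like a tutor and always answer in the "
                "Socratic manner. Never provide the student with the answer; "
                "instead, pose questions that help them figure it out."
            ),
        },
        {"role": "user", "content": "What's the derivative of x squared?"},
    ],
)
# With the guardrail in place, the reply should be a guiding question
# rather than simply "2x".
print(response.choices[0].message.content)
```

Every later user turn in the conversation is interpreted against that standing instruction, which is what makes system messages function as guardrails rather than one-off prompts.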
Brockman said the team has been focused on specifying GPT-4's tone, style and substance as precisely as possible. He added that they've gradually learned to approach this engineering in a more organized way, producing reliable results that are genuinely useful to people.
Brockman and I also discussed GPT-4's context window, the amount of text the model can consider before generating additional text. OpenAI is testing a version of GPT-4 that can "remember" roughly 50 pages of content, five times as much as the vanilla GPT-4 can hold in its memory and eight times as much as GPT-3.
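Some perspective on those numbers: context windows are measured in tokens rather than pages, so a quick way to check whether a document fits is to count its tokens. Below is a small sketch using OpenAI's tiktoken tokenizer; the 8,192- and 32,768-token limits are the published sizes of the standard and extended GPT-4 windows, and the sample document and reply budget are invented for illustration.

```python
# Rough sketch: checking whether a document fits in a model's context
# window by counting tokens with OpenAI's tiktoken library.
import tiktoken

# cl100k_base is the encoding used by GPT-4-family models.
enc = tiktoken.get_encoding("cl100k_base")

def fits_in_context(text: str, window: int, reply_budget: int = 1000) -> bool:
    """True if the prompt, plus room for a reply, fits in the window."""
    return len(enc.encode(text)) + reply_budget <= window

# Invented stand-in for a long internal document.
document = "Q3 planning notes: revenue targets, hiring, roadmap. " * 800

print(fits_in_context(document, 8_192))   # standard GPT-4 window
print(fits_in_context(document, 32_768))  # extended, roughly 50-page variant
```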
Brockman believes the expanded context window will unlock new, previously unexplored applications, particularly in the enterprise. He envisions an AI-powered chatbot built for a company that draws on context and knowledge from different sources, including employees across departments, to answer questions in a natural, well-informed way.
That idea is nothing new. Yet, according to Brockman, GPT-4’s responses will be much more practical than the ones from present-day chatbots and search engines.
According to Brockman, the model was not previously aware of the user’s interests and identity. The larger context window has changed this, making it more capable and enhancing the user experience.