OpenAI unveils GPT-4 with new capabilities, Microsoft's Bing is already using it
Only a few months ago ChatGPT launched and changed many people's perception of what AI can do. It was based on GPT-3.5 from OpenAI, which was also integrated into Microsoft's Bing, Skype and Edge. Now the company has confirmed that it has switched over to the new, more powerful GPT-4 model.
In fact, it did so a while ago: if you're part of the Bing Preview, you have been using GPT-4 for the last five weeks (you can sign up for the preview here). This isn't the plain GPT-4, by the way, but a version that has been customized by Microsoft for search.
So, what's new in GPT-4? For starters, it is a "multimodal" model, which is a fancy way of saying that you can attach images to your query, not just text. One example OpenAI showed has GPT-4 explaining a joke found on Reddit. Note that the output is text only (i.e. you can't generate images like with Stable Diffusion, MidJourney, etc.).
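For a sense of what a multimodal query looks like in practice, here is a minimal sketch using OpenAI's Python client; the model name, image URL and prompt are illustrative placeholders, not details from the announcement.

```python
# Minimal sketch of a multimodal (text + image) request to a GPT-4-class model.
# Assumes the `openai` Python package is installed and OPENAI_API_KEY is set;
# the model name and image URL below are placeholders for illustration.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # any GPT-4-class model with vision support
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Explain the joke in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/joke.png"}},
            ],
        }
    ],
)

# The reply comes back as plain text, regardless of the input types.
print(response.choices[0].message.content)
```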
The new model is smarter too. The OpenAI team tested it with practice exam books from 2022 and 2023. Note: the model doesn't know anything after September 2021, so these exams (and their answers) weren't part of the training data.
GPT-3.5 took the bar exam (which lawyers need to pass) and scored in the bottom 10%; GPT-4 scored in the top 10%. The justice system isn't ready for robo-lawyers yet, but they are on the horizon. GPT-4 also scored in the 88th percentile on the LSAT, while GPT-3.5 was in the 40th. For SAT Math, GPT-4 was in the 89th percentile, GPT-3.5 in the 70th. You can check out OpenAI's announcement for more exam results.
The most important new feature in version 4 is "steerability". Previously, ChatGPT was coerced into acting like a digital assistant by prepending some rules. It was possible to trick the AI into revealing those rules, e.g. here's what Microsoft told "Sydney" to do as Bing (including not revealing its Sydney code name):
Microsoft and OpenAI have worked to hide such rules (to prevent so-called "jailbreaking"), but now there is a better way to do it: companies can control the AI's style and task with a system message. Here's an example:
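As a rough sketch of how that looks via the API (not the exact prompt Microsoft or OpenAI use), the system message is simply the first entry in the conversation sent to the model; the persona and rules below are invented for illustration.

```python
# Sketch of steering a GPT-4-class model with a system message.
# Assumes the `openai` Python package and an OPENAI_API_KEY; the persona
# and rules are made up for this example, not Microsoft's or OpenAI's own.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": (
                "You are a Socratic math tutor. Never give the answer directly; "
                "instead, ask guiding questions that help the student find it."
            ),
        },
        {"role": "user", "content": "What is the solution to 3x + 7 = 22?"},
    ],
)

# The model answers in the style and within the limits set by the system message.
print(response.choices[0].message.content)
```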
It's important to note that GPT-4 still has limitations, especially when it comes to facts. Like its predecessor, the model can make things up; these are called "hallucinations". The new version is significantly better than GPT-3.5 at sticking to the facts and avoiding logical mistakes (scoring 40% higher on OpenAI's internal factuality tests), but it is still not perfect. Even so, GPT-3 was released in mid-2020 and GPT-3.5 arrived in early 2022 (a later enhancement was used for ChatGPT), so the pace of improvement is nothing short of incredible.
Now all we want to know is this: can we have a GPT-4-powered Cortana?