big group. bigger thinking.


22 FEBRUARY
AUTHOR

Paul North, Head of Content & Social


The incredible simplicity of ChatGPT’s code

The difference between good and great, between success and failure, so often comes down to very small details. Working in marketing, we see examples of this all the time: putting ketchup in a squeezy bottle, Pfizer pivoting sildenafil away from being a heart drug and renaming it Viagra, the introduction of Saul and Gus in Breaking Bad. I think the few lines of code that gave birth to ChatGPT (and therefore the new AI revolution) could go down as one of the most extreme examples of this in history.

I’d read a few weeks ago that OpenAI only developed ChatGPT because no one else did. They were surprised, because it was such a straightforward task and they already had a large community of smart, dedicated users building tools on top of their model. So, since no one else had done it, they built the chatbot interface themselves. And yet the bigger surprise came when that simple change of interface to GPT-3 (by then, 3.5) set off the AI explosion we’re currently seeing.

Last week, I saw the code myself for the first time. It’s really one of those beautifully simple ideas that frequently grace fields like mathematics, science or poetry. ChatGPT is a simple chatbot interface for OpenAI’s GPT-3 (now 4) – a model that had been available to users for two and a half years(!). In order for the chatbot functionality to work, it had to hold the whole exchange with the user in its memory as it answered each subsequent question. So, did this mean some retooling of the database or upping of computational power? No. Here's the code:

```python
import openai  # requires the openai package and an API key

# The conversation history is just a list
messages = []

def chatbot(input):
    if input:
        # Add the user's new question to the history
        messages.append({"role": "user", "content": input})
        # Send the *entire* history with every request
        chat = openai.ChatCompletion.create(
            model="gpt-3.5-turbo", messages=messages
        )
        reply = chat.choices[0].message.content
        # Store the reply so the next request includes it too
        messages.append({"role": "assistant", "content": reply})
        return reply
```

It simply says: for every question after the first one from the user, send along all the previous questions and answers as well. In other words, there’s no “memory” per se. The model just re-reads the entire exchange before answering the newest question.
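To see the pattern in isolation, without an API key, here’s a toy sketch of the same loop. The `fake_model` function is my own stand-in for illustration, not OpenAI’s code; all it does is report how much of the transcript it was handed on each turn.

```python
# Toy version of the ChatGPT pattern: the "memory" is nothing more
# than the full transcript, re-sent to the model on every turn.
messages = []

def fake_model(history):
    # Stand-in for the real model: replies with how much context it saw
    return f"(I can see {len(history)} prior messages)"

def chatbot(user_input):
    messages.append({"role": "user", "content": user_input})
    reply = fake_model(messages)
    messages.append({"role": "assistant", "content": reply})
    return reply

print(chatbot("Hello"))         # (I can see 1 prior messages)
print(chatbot("Remember me?"))  # (I can see 3 prior messages)
```

By the second call the model is handed the whole exchange so far, which is the entirety of the “memory” trick.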

That’s it*. What a thing of beauty it is. Sitting in the open for anyone to implement all that time. It blows my mind. My guess is it took less than 2 mins to write (or 10 secs if they asked it to do the writing). A brief moment of work for a ludicrously simple command that caused a volcanic eruption of AI potential, a new arms race between the biggest companies in the world and potentially untold societal disruption. You can’t start a fire without a spark.

*OK, OK, strictly speaking that’s far from “it”. There was work done to make its tone more conversational than the original GPT-3’s, some other alignment work, a bunch of testing and so on, but it’s clear that the conversational interface of a chatbot was the key ingredient that launched it into the mainstream.