How to Create a ChatGPT Clone (With Context & Context Switching)

Written by eon01 | Published 2023/03/07
Tech Story Tags: artificial-intelligence | chatgpt | gpt-3 | openai | machine-learning | gpt3 | ai | chatbots

TL;DR: In this tutorial, I explain and compare different approaches to carrying over context, and how context switching works, when creating a chatbot (a ChatGPT clone) using OpenAI GPT-3.

This tutorial is part of my book "OpenAI GPT For Python Developers".

The goal of this book is to provide a step-by-step guide to using GPT-3 in your projects through its API, along with other tools and models built by OpenAI, such as Whisper (an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data), CLIP (Contrastive Language-Image Pre-Training, a neural network trained on a variety of (image, text) pairs), and DALL·E 2 (an AI system that can create realistic images and art from a natural language description).


The Problem

GPT is a generative text model, which means it produces new text by predicting what comes next based on the input it receives from the user.

The model was trained on a large corpus of text (books, articles, and websites), from which it learned patterns and relationships between words and phrases.

By default, the model has no memory when you initiate a discussion with it. This means each input is treated independently without any context or information carried over from the previous user prompts. This is certainly not ideal for human-friendly interactions.

While this seems like a limitation, it actually allows the model to generate more diverse and less repetitive text.

In some cases, carrying over context is useful and necessary. Some techniques, like fine-tuning on a specific topic, help improve the quality of outputs; the technique we are going to implement next is much simpler.

No Context = Chaos of Randomness

Let’s start by building a simple chatbot. We will first initiate a discussion without any context, as our goal is to compare the model's outputs now and later, when we add more context to the conversation.

import openai
import os

def init_api():
    with open(".env") as env:
        for line in env:
            # Split on the first "=" only, so values may themselves contain "="
            key, value = line.strip().split("=", 1)
            os.environ[key] = value

    openai.api_key = os.environ.get("API_KEY")
    openai.organization = os.environ.get("ORG_ID")

init_api()

initial_prompt = """You: Hi there!
You: Hello!
AI: How are you?
You: {}
AI: """

while True:
    prompt = input("You: ")

    # Each request contains only the initial prompt and the latest user
    # input: nothing from the previous exchanges is carried over.
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=initial_prompt.format(prompt),
        temperature=1,
        max_tokens=100,
        stop=[" You:", " AI:"]
    )

    print("AI:", response.choices[0].text)


The following test shows that by default, the chat agent we built doesn’t have any memory:

You: Hey

AI: Hey! How can I help you today?

You: Remember this string 'blah' and store it. I'll ask you about it later.

AI: Sure, I remember the string 'blah'.

You: What was the string?

AI: I'm not sure what you mean - can you elaborate?

You: What was the string I told you to remember?

AI: I'm sorry, I don't remember any strings you told me.

You: I already told you!

AI: I'm sorry, I didn't catch that. Could you please explain the context of the last message?

History = Context

The idea is pretty simple: we create a history variable where we store the user's requests and the text generated by the model. When the user asks a new question, the history is inserted before the new question.

import openai
import os

def init_api():
    with open(".env") as env:
        for line in env:
            # Split on the first "=" only, so values may themselves contain "="
            key, value = line.strip().split("=", 1)
            os.environ[key] = value

    openai.api_key = os.environ.get("API_KEY")
    openai.organization = os.environ.get("ORG_ID")

init_api()

initial_prompt = """You: Hi there!
You: Hello!
AI: How are you?
You: {}
AI: """

history = ""

while True:
    prompt = input("You: ")

    # Prepend the accumulated history to the new user input so that the
    # model sees the whole conversation on every request.
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=initial_prompt.format(history + prompt),
        temperature=1,
        max_tokens=100,
        stop=[" You:", " AI:"],
    )

    response_text = response.choices[0].text
    history += "You: " + prompt + "\n" + "AI: " + response_text + "\n"

    print("AI: " + response_text)

This is how the same discussion went:

You: Hey
AI: Hi there! How are you?
You: Remember this string 'blah' and store it. I'll ask you about it later.
AI: Got it! What would you like to know about 'blah'?
You: What was the string?
AI: The string was 'blah'.
You: Why?
AI: You asked me to remember the string 'blah' and store it, so I did.

The Problem With Carrying Over History

With long discussions, the prompt keeps getting longer, since the whole history is always prepended to it, until it reaches the maximum number of tokens the model allows. At that point, the result is a total failure: the API responds with an error.

The second problem here is the cost. You are charged by tokens, so the more tokens you have in your input, the more expensive it will be.
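Rather than waiting for the API to fail, you can count tokens locally and trim the history before each request. Here is a minimal sketch using the tiktoken library; the trim_history helper is mine (it is not part of the chatbot code above), and the token budget is an assumption based on the 4097-token context size of text-davinci-003:

# A minimal sketch: count prompt tokens locally with tiktoken and drop
# the oldest lines of the history when the prompt approaches the limit.
# The 100-token reserve for the completion is an arbitrary choice.
import tiktoken

encoding = tiktoken.encoding_for_model("text-davinci-003")

def trim_history(history, prompt, limit=4097 - 100):
    lines = history.splitlines(keepends=True)
    # Drop the oldest lines first until everything fits in the budget.
    while lines and len(encoding.encode("".join(lines) + prompt)) > limit:
        lines.pop(0)
    return "".join(lines)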

Last in First Out (LIFO) Memory

I am not sure if this approach has a technical name, but I called it “last in first out” since the idea behind it is simple:

  • Users will always initiate discussions with a context.

  • The context evolves, and so does the discussion.

  • Users will most likely include the context in the latest 2 to 5 prompts.

Based on this, we could assume that a better approach is to only store the most recent prompts.

In a few words, this is how it works: we store the history in a text file, writing each prompt and answer followed by a separator that does not occur in the discussion itself. For example: #####

Then we retrieve the last two pairs and prepend them to the user prompt as context. Instead of a text file, you can use a PostgreSQL database, a Redis database, or whatever you want; a minimal Redis-backed sketch of the same idea follows, before we get back to the text-file version.
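This sketch assumes a Redis server running locally and the redis Python package; the chat:history key name is arbitrary:

# A minimal sketch of the history store backed by Redis instead of a
# text file.
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def save_history_to_redis(history):
    r.set("chat:history", history)  # "chat:history" is an arbitrary key

def load_history_from_redis():
    return r.get("chat:history") or ""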

Let’s take a look at the code:

import openai
import os

def init_api():
    with open(".env") as env:
        for line in env:
            # Split on the first "=" only, so values may themselves contain "="
            key, value = line.strip().split("=", 1)
            os.environ[key] = value

    openai.api_key = os.environ.get("API_KEY")
    openai.organization = os.environ.get("ORG_ID")

def save_history_to_file(history):
    with open("history.txt", "w+") as f:
        f.write(history)

def load_history_from_file():
    with open("history.txt", "r") as f:
        return f.read()

def get_relevant_history(history):
    history_list = history.split(separator)
    if len(history_list) > 2:
        return separator.join(history_list[-2:])
    else:
        return history        

init_api()

initial_prompt = """You: Hi there!
You: Hello!
AI: How are you?
You: {}
AI: """

history = ""
relevant_history = ""
separator = "#####"

while True:
    prompt = input("You: ")
    # Only the two most recent exchanges are used as context.
    relevant_history = get_relevant_history(load_history_from_file())

    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=initial_prompt.format(relevant_history + prompt),
        temperature=1,
        max_tokens=100,
        stop=[" You:", " AI:"],
    )

    response_text = response.choices[0].text
    # Append the new exchange, followed by the separator, to the full history.
    history += "\nYou: " + prompt + "\n" + "AI: " + response_text + "\n" + separator
    save_history_to_file(history)

    print("AI: " + response_text)

The Problem With Last in First Out Memory

This last-in-first-out approach may struggle when a discussion becomes very complex and the user needs to switch back and forth between different contexts.

In such cases, the approach may not be able to provide the required context, as it only stores the most recent prompts.

This can lead to confusion and frustration for the user, which is not ideal for human-friendly interactions.

Selective Context

The solution suggested in this part will work as follows:

  • An initial prompt is saved to a text file.

  • The user enters a prompt.

  • The program creates embeddings for all interactions in the file.

  • The program creates embeddings for the user's prompt.

  • The program calculates the cosine similarity between the user's prompt and all interactions in the file.

  • The program sorts the file by cosine similarity.

  • The best n interactions are read from the file and sent to the API along with the user's prompt.

We are using a text file here to make things simple, but as said previously, you can use any data store.

These are the different functions we are going to use to perform the above:

def save_history_to_file(history):
    """
    Save the history of interactions to a file
    """
    with open("history.txt", "w+") as f:
        f.write(history)

def load_history_from_file():
    """
    Load all the history of interactions from a file
    """
    with open("history.txt", "r") as f:
        return f.read()

def cos_sim(a, b):
    """    
    Calculate cosine similarity between two strings
    Used to compare the similarity between the user input and a segment in the history
    """
    a = nlp(a)
    a_without_stopwords = nlp(' '.join([t.text for t in a if not t.is_stop]))
    b = nlp(b)
    b_without_stopwords = nlp(' '.join([t.text for t in b if not t.is_stop]))
    return a_without_stopwords.similarity(b_without_stopwords)

def sort_history(history, user_input):
    """
    Sort the history of interactions based on cosine similarity between the user input and the segments in the history
    History is a string of segments separated by separator
    """
    # Ignore empty segments (e.g., the trailing one created by the last separator)
    segments = [s for s in history.split(separator) if s.strip()]
    similarities = []

    for segment in segments:
        # get cosine similarity between user input and segment
        similarity = cos_sim(user_input, segment)
        similarities.append(similarity)
    # np.argsort returns indices in ascending order of similarity, so the
    # most similar segments end up at the end of the saved file
    sorted_similarities = np.argsort(similarities)
    sorted_history = ""
    for i in range(len(segments)):
        sorted_history += segments[sorted_similarities[i]] + separator
    save_history_to_file(sorted_history)

def get_latest_n_from_history(history, n):
    """
    Get the latest n segments from the history.
    History is a string of segments separated by separator
    """
    segments = history.split(separator)
    return separator.join(segments[-n:])
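As a quick illustration of how get_latest_n_from_history behaves (the segment values here are made up; separator is the ##### string we declare below):

separator = "#####"
history = "first" + separator + "second" + separator + "third"
print(get_latest_n_from_history(history, 2))
# -> second#####third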

Here is what the sort_history function does, explained step by step:

  1. Split the history into segments: The function first splits the input history string into segments based on the specified separator (in our example, we will use ##### which we will declare later). This creates a list of segments representing each interaction in history.

  2. Compute the cosine similarity: For each segment, the function computes the cosine similarity between the user's input and the segment using the cos_sim function. The cosine similarity measures the similarity between two vectors as we have seen in the previous chapters.

    Although we could have used OpenAI embeddings, our goal is to reduce computing costs by performing certain tasks locally instead of relying on the API (a minimal sketch of this local computation follows this list).

  3. Sort the similarities: The function sorts the similarities in ascending order using np.argsort, which returns the indices of the sorted similarities in the order of their values. This creates a list of indices representing the segments sorted by their similarity to the user's input.

  4. Reconstruct the sorted history: We iterate over the sorted indices in ascending order and concatenate the corresponding segments into a new string. This creates a new, sorted history string in which the interactions most similar to the user's input appear last. This matters because get_latest_n_from_history later picks the last n segments, i.e., the n most relevant ones.

  5. Save the sorted history: Finally, we save the sorted history to a file using the save_history_to_file function.
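For reference, spaCy's similarity method returns the cosine similarity of the two documents' averaged word vectors. Here is a minimal sketch of the same computation done by hand with NumPy (the manual_cos_sim name is mine, not part of the tutorial's code):

# A hand-rolled equivalent of what spaCy's .similarity() computes:
# cosine similarity = dot(u, v) / (||u|| * ||v||)
import numpy as np
import spacy

nlp = spacy.load("en_core_web_md")

def manual_cos_sim(a, b):
    u, v = nlp(a).vector, nlp(b).vector
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom else 0.0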

This is how we use these functions, after defining the initial prompts and the separator and saving the initial prompts to a file:

initial_prompt_1 = """
You: Hi there!
AI: Hello!
#####
You: How are you?
AI: I am fine, thank you.
#####
You: Do you know cars?
AI: Yes I have some knowledge about cars.
#####
You: Do you eat Pizza?
AI: I don't eat pizza. I am an AI that is not able to eat.
#####
You: Have you ever been to the moon?
AI: I have never been to the moon. What about you?
#####
You: What is your name?
AI: My name is Pixel. What is your name?
#####
You: What is your favorite movie?
AI: My favorite movie is The Matrix. Follow the white rabbit :)
#####
"""

initial_prompt_2 = """You: {}
AI: """
initial_prompt = initial_prompt_1 + initial_prompt_2
separator = "#####"

init_api()
save_history_to_file(initial_prompt_1)

while True:
    prompt = input("You: ")
    # Reorder the stored history so that the segments most similar to the
    # new prompt end up last in the file.
    sort_history(load_history_from_file(), prompt)
    history = load_history_from_file()
    # The last 5 segments are now the 5 most relevant ones.
    best_history = get_latest_n_from_history(history, 5)
    full_user_prompt = initial_prompt_2.format(prompt)
    full_prompt = best_history + "\n" + full_user_prompt
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=full_prompt,
        temperature=1,
        max_tokens=100,
        stop=[" You:", " AI:"],
    )
    response_text = response.choices[0].text.strip()
    history += "\n" + full_user_prompt + response_text + "\n" + separator + "\n"
    save_history_to_file(history)

    print("AI: " + response_text)

If we put everything together, this is what we get:

import openai
import os
import spacy
import numpy as np

# Load the pre-trained spaCy model
nlp = spacy.load('en_core_web_md')

def init_api():
    with open(".env") as env:
        for line in env:
            # Split on the first "=" only, so values may themselves contain "="
            key, value = line.strip().split("=", 1)
            os.environ[key] = value

    openai.api_key = os.environ.get("API_KEY")
    openai.organization = os.environ.get("ORG_ID")

def save_history_to_file(history):
    """
    Save the history of interactions to a file
    """
    with open("history.txt", "w+") as f:
        f.write(history)

def load_history_from_file():
    """
    Load all the history of interactions from a file
    """
    with open("history.txt", "r") as f:
        return f.read()

def cos_sim(a, b):
    """    
    Calculate cosine similarity between two strings
    Used to compare the similarity between the user input and a segment in the history
    """
    a = nlp(a)
    a_without_stopwords = nlp(' '.join([t.text for t in a if not t.is_stop]))
    b = nlp(b)
    b_without_stopwords = nlp(' '.join([t.text for t in b if not t.is_stop]))
    return a_without_stopwords.similarity(b_without_stopwords)

def sort_history(history, user_input):
    """
    Sort the history of interactions based on cosine similarity between the user input and the segments in the history
    History is a string of segments separated by separator
    """
    # Ignore empty segments (e.g., the trailing one created by the last separator)
    segments = [s for s in history.split(separator) if s.strip()]
    similarities = []

    for segment in segments:
        # get cosine similarity between user input and segment
        similarity = cos_sim(user_input, segment)
        similarities.append(similarity)
    # np.argsort returns indices in ascending order of similarity, so the
    # most similar segments end up at the end of the saved file
    sorted_similarities = np.argsort(similarities)
    sorted_history = ""
    for i in range(len(segments)):
        sorted_history += segments[sorted_similarities[i]] + separator
    save_history_to_file(sorted_history)

def get_latest_n_from_history(history, n):
    """
    Get the latest n segments from the history.
    History is a string of segments separated by separator
    """
    segments = history.split(separator)
    return separator.join(segments[-n:])

initial_prompt_1 = """
You: Hi there!
AI: Hello!
#####
You: How are you?
AI: I am fine, thank you.
#####
You: Do you know cars?
AI: Yes I have some knowledge about cars.
#####
You: Do you eat Pizza?
AI: I don't eat pizza. I am an AI that is not able to eat.
#####
You: Have you ever been to the moon?
AI: I have never been to the moon. What about you?
#####
You: What is your name?
AI: My name is Pixel. What is your name?
#####
You: What is your favorite movie?
AI: My favorite movie is The Matrix. Follow the white rabbit :)
#####
"""

initial_prompt_2 = """You: {}
AI: """
initial_prompt = initial_prompt_1 + initial_prompt_2
separator = "#####"

init_api()
save_history_to_file(initial_prompt_1)

while True:
    prompt = input("You: ")
    # Reorder the stored history so that the segments most similar to the
    # new prompt end up last in the file.
    sort_history(load_history_from_file(), prompt)
    history = load_history_from_file()
    # The last 5 segments are now the 5 most relevant ones.
    best_history = get_latest_n_from_history(history, 5)
    full_user_prompt = initial_prompt_2.format(prompt)
    full_prompt = best_history + "\n" + full_user_prompt
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=full_prompt,
        temperature=1,
        max_tokens=100,
        stop=[" You:", " AI:"],
    )
    response_text = response.choices[0].text.strip()
    history += "\n" + full_user_prompt + response_text + "\n" + separator + "\n"
    save_history_to_file(history)

    print("AI: " + response_text)


This tutorial is part of my book "OpenAI GPT For Python Developers".


Whether you’re building a chatbot, an AI (voice) assistant, a semantic search engine, a classification system, a recommendation engine, a web app providing AI-generated data, or any other sort of natural language/image/voice processing and generation platform, this guide will help you reach your goals.

If you know the basics of the Python programming language and are open to learning a few more techniques, like using Pandas DataFrames and some NLP libraries, you have all the necessary tools to start building intelligent systems with OpenAI tools.


Written by eon01 | I help developers learn and grow by keeping them up with what matters!
Published by HackerNoon on 2023/03/07