How To Customize an OpenAI Chatbot With Embeddings

Written by hacker7182015 | Published 2023/03/03

TL;DR: In this article, we learn how to leverage prompt engineering and embeddings to have an OpenAI chatbot, built with React and Node.js, respond correctly to specific contextual prompts and questions.

In the last two weeks, we built a chatbot with React, Node.js, and OpenAI. When asking a general question like "What is the typical weather in Dubai?", the chatbot responds with a very relevant answer.

If we ask our chatbot something very specific (i.e. something that requires some initial context or knowledge) like "Who is Hassan Djirdeh?", our chatbot will either respond by saying it doesn't know or respond with a completely incorrect answer.

Though that would be really cool, I'm not an Abu Dhabi-based entrepreneur, investor, or advisor. We see the above issue because the OpenAI language models we use don't have access to the internet and don't perform web searches in real time. Instead, the models get their information from the large corpus of text data they were trained on.

In today's article, we'll explore how we can have our chatbot respond correctly to the specific contextual questions it's asked.

Note: this article is only a pseudo-tutorial. We won't build out this capability step by step as we did in the last two articles; instead, we'll explain the overall approach. A link to a running code example is shared at the end if you're interested in trying out the full implementation.

Fine-tuning

There are two capabilities OpenAI offers for building a custom chatbot experience: fine-tuning and embeddings.

Fine-tuning can help produce higher-quality results from OpenAI by training a model on large amounts of our own information upfront. The steps to do this involve the following (a sketch of the training data format follows the list):

  • Preparing and uploading initial training data.

  • Training the new model.

  • Using the newly trained model in our API requests.
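
As a rough sketch, (legacy) fine-tuning expects the training data as a JSONL file, where each line pairs a prompt with the completion we'd like the model to learn. The records below are made-up examples, not the article's actual data:

{"prompt": "Who is Hassan Djirdeh? ->", "completion": " Hassan Djirdeh is a front-end engineer based in Toronto, Canada.\n"}
{"prompt": "What is Frontend Fresh? ->", "completion": " A weekly newsletter sharing front-end tips, tutorials, and articles.\n"}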

OpenAI recommends having at least a couple hundred examples in the fine-tuning dataset. We won't be going with this approach and will instead go with the approach better suited to our use case: using embeddings.

Prompt Engineering & Embeddings

Before discussing what embeddings are, it's helpful to first understand the concept of Prompt Engineering when interacting with OpenAI. This very helpful doc in OpenAI's cookbook goes into more detail, but we'll summarize it below.

As we've seen, when asking OpenAI a question that requires some specific context:

Who is Hassan Djirdeh?

It can either hallucinate an answer or tell us "I don't know".

To help OpenAI answer the question, we can provide extra contextual information in the prompt itself.

Hassan Djirdeh is a front-end engineer based in Toronto, Canada. He is currently working on producing a newsletter called Frontend Fresh where he shares tips, tutorials, and articles on a weekly basis.

Who is Hassan Djirdeh?

This time, we get a useful response back from OpenAI because our prompt contains the information relevant to the question being asked.

How can we leverage this capability to help us build a more custom chatbot experience? We can do so with the concept known as embeddings.

Embeddings

OpenAI provides a capability known as text embeddings to measure the relatedness of text strings.

For every block of text, chapter, or subject — we can send that information to OpenAI's Embedding service to receive back its embedding data (i.e. a vector list of floating-point numbers).

Request

curl https://api.openai.com/v1/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"input": "Hassan Djirdeh is a software engineer...",
       "model":"text-embedding-ada-002"}'

Response

{
  "data": [
    {
      "embedding": [
        -0.006929283495992422,
        -0.005336422007530928,
        ...
        -4.547132266452536e-05,
        -0.024047505110502243
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  // ...
}
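
In our Node.js server, fetching an embedding could look something like the sketch below. It uses the openai npm package (the v3-era API that matches the createCompletion call later in this article); the getEmbedding helper name is our own:

const { Configuration, OpenAIApi } = require("openai");

const openai = new OpenAIApi(
  new Configuration({ apiKey: process.env.OPENAI_API_KEY })
);

// Fetch the embedding (a vector of floating-point numbers) for a block of text
const getEmbedding = async (text) => {
  const response = await openai.createEmbedding({
    model: "text-embedding-ada-002",
    input: text,
  });
  return response.data.data[0].embedding;
};

We'd run this once per text blurb and persist the resulting vectors wherever is convenient (an Airtable base in this article, but any store works).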

Once we have the embedding data for the different blurbs of contextual information we want our chatbot to know, we need to store it somewhere. Note that this is a one-time thing — we get the embedding data for text blurbs once and only update the embeddings if the contextual information changes.

Here's an example of me storing that information in an Airtable base (which you can see here).

Next, when a user provides a prompt, we get the embedding data of that specific prompt. We can then run a simple cosine similarity check between the embedding data of the prompt and the embedding data of the information stored in our database.

Let's break this down with an easy-to-understand example. In our saved data table, assume we have embedding data for 4 different sets of information.

Text Blurb                                   | Embeddings Data
---------------------------------------------|-----------------------------------
The Frontend Fresh Newsletter was created... | [0.000742552, -0.0049907574, ...]
By being part of the newsletter, readers...  | [-0.027452856, -0.0023051118, ...]
In the first four to five emails...          | [-0.007873567, -0.014787777, ...]
Hassan Djirdeh is a front-end engineer...    | [-0.008849341, 0.011449747, ...]

Three of the text blocks above contain information about the newsletter, and one contains information about the author of the newsletter.

When the user submits the prompt "Who is Hassan Djirdeh?", we receive the embedding data for that specific prompt as well.

Prompt                 | Embeddings Data
-----------------------|----------------------------------
Who is Hassan Djirdeh? | [-0.009949322, 0.044449654, ...]

We then compute the cosine similarity between each embedding in our database and the embedding of the prompt question. This gives us a number between -1 and 1 for each comparison (in practice, close to 0 to 1 for these embeddings), where a higher score means the texts are more similar.

Text Blurb                                   | Cosine similarity to prompt
---------------------------------------------|----------------------------
The Frontend Fresh Newsletter was created... | 0.5
By being part of the newsletter, readers...  | 0.4
In the first four to five emails...          | 0.6
Hassan Djirdeh is a front-end engineer...    | 0.9
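
The cosine similarity computation itself is only a few lines of JavaScript. Here's a minimal sketch (the cosineSimilarity helper name is our own), assuming both vectors have the same length:

// Cosine similarity: the dot product of the two vectors
// divided by the product of their magnitudes
const cosineSimilarity = (a, b) => {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
};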

The text blurb whose embedding has the highest cosine score is the one most similar to what the user is asking. We then take that blurb and embed it within a "super" prompt that we send to OpenAI.

const initialPrompt = "Who is Hassan Djirdeh?";

/* After some work, we determine the following contextual
   information to be closest in similarity to the prompt
   question above. */
const textWithHighestScore = "Hassan Djirdeh is a front-end engineer...";

// Build the final "super" prompt
const finalPrompt = `
  Info: ${textWithHighestScore}
  Question: ${initialPrompt}
  Answer:
`;

// Ask OpenAI to answer the prompt
const response = await openai.createCompletion({
  model: "text-davinci-003",
  prompt: finalPrompt,
  max_tokens: 64,
});

Now when we ask our chatbot a question that requires some context, the final prompt is built behind the scenes before being sent to OpenAI, and we get a more relevant answer back!
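
Putting the pieces together, a rough sketch of the chatbot endpoint on our Node/Express server might look like the following. The route name, the storedEmbeddings array, and the loadStoredEmbeddings helper are hypothetical, and getEmbedding and cosineSimilarity come from the sketches above:

// Hypothetical: load the previously saved records, each pairing a
// text blurb with its embedding vector, e.g.
// [{ text: "Hassan Djirdeh is...", embedding: [...] }, ...]
const storedEmbeddings = loadStoredEmbeddings();

app.post("/api/chat", async (req, res) => {
  const { prompt } = req.body;

  // 1. Get the embedding data of the user's prompt
  const promptEmbedding = await getEmbedding(prompt);

  // 2. Find the stored blurb most similar to the prompt
  const best = storedEmbeddings.reduce(
    (top, record) => {
      const score = cosineSimilarity(promptEmbedding, record.embedding);
      return score > top.score ? { text: record.text, score } : top;
    },
    { text: "", score: -Infinity }
  );

  // 3. Build the "super" prompt and ask OpenAI to answer it
  const finalPrompt = `
    Info: ${best.text}
    Question: ${prompt}
    Answer:
  `;
  const response = await openai.createCompletion({
    model: "text-davinci-003",
    prompt: finalPrompt,
    max_tokens: 64,
  });

  res.json({ answer: response.data.choices[0].text });
});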

That's pretty much it! This is a simple introduction to how we can leverage prompt engineering and embeddings to build a more customized chatbot with OpenAI.

For more details on fine-tuning and embeddings, be sure to check out OpenAI's detailed documentation.

Closing thoughts

  • You can find the source code for updating the Node/Express server of our chatbot with prompt engineering and embeddings here.
  • One great advantage of storing the embeddings data in a spreadsheet database separate from our app is that we can change that data in real time with no downtime for the app.
  • Subscribe to https://www.frontendfresh.com/ for more tutorials like this to hit your inbox on a weekly basis!

This article is the third sent to the frontendfresh.com newsletter. Subscribe to the Frontend Fresh newsletter to get front-end engineering tips, tutorials, and projects sent to your inbox every week!


