Gptrim: Reduce Your GPT Prompt Size by 50% For Free!

Written by vladpublish | Published 2023/04/18
Tech Story Tags: artificial-intelligence | ai | gpt | large-language-models | prompt-engineering | gptrim | gpt-prompt-size | hackernoon-top-story | hackernoon-es | hackernoon-hi | hackernoon-zh | hackernoon-vi | hackernoon-fr | hackernoon-pt | hackernoon-ja

TL;DR: gptrim reduces your prompts by 40-60%. GPT is trained to predict human language, so even text that is highly condensed and compressed is still readable to it. The biggest limit on what you can achieve comes from the context window, i.e. the total number of tokens that GPT can see at one time. Reducing the tokens in your prompts means a bigger context window and less money spent doing the same job.

Introducing gptrim, a free web app that will reduce the size of your prompts by 40%-60% while preserving most of the original information for GPT to process. gptrim is also a Python library.

How It Works

Paste your GPT prompt in gptrim. Copy the trimmed text and give it to GPT.

The trimmed text looks like gibberish. But GPT understands it!

Here's the prompt that you can use to check the compression quality:

This is an instance of compressed text. 
Rewrite it so that it has perfect grammar and is understandable by a human.
Try to interpret it as faithfully as possible. 
Do not paraphrase or add anything to the text.

The Problem: The Context Window Is Too Damn Small!

If you've played around with GPT, you know that the biggest limit on what you can achieve comes from the context window, i.e. the total number of tokens that GPT can see at one time.

Here's where things stand at the time of this article (OpenAI overview):

  • The size of the context window is measured in tokens. 1000 tokens correspond to approximately 750 words (a token-counting sketch follows this list).

  • The GPT-3.5 API has a context window of 4k tokens or about 6 Word pages.

  • With the GPT-4 API you can get a context window of size 8k or 32k, depending on how much you're willing to pay.

  • Even if you're willing to pay, the GPT-4 API is in limited beta and most people, myself included, can't access it. Hey OpenAI, still waiting on that invite.

  • When you're using the APIs, every single token costs you.

  • You can use GPT-4 interactively in ChatGPT, for a monthly fee. Unfortunately, chat messages can only fit a small number of characters. Both ChatGPT and I sometimes cross that limit and our messages get interrupted.
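If you want to check these numbers for your own prompts, here is a minimal token-counting sketch. It assumes OpenAI's tiktoken package; the sample prompt and model name are placeholders for illustration.

# pip install tiktoken
import tiktoken

def count_tokens(text: str, model: str = "gpt-3.5-turbo") -> int:
    """Return the number of tokens `text` consumes for the given OpenAI model."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

prompt = "Summarize the following blog post in three bullet points: ..."
print(count_tokens(prompt))  # how many of your 4k (or 8k/32k) tokens this prompt uses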

In practice, this makes it very hard to build applications that work on large amounts of text.

Reducing the tokens in your prompts means:

  1. You can have a bigger context window and build cooler things.
  2. You will spend less money doing the same job.

In general, you can never have too much context window. Even if you are willing to pay for 32k, you will still want more.

The Solution: Forget About Readability

GPT is trained to predict human language, and it is much better at predicting human language than any human will ever be. If you give it text that is highly condensed and compressed, it will still be able to read it.

Consider the use of spaces. Humans need spaces between words because we need to see the text. ButifIwrotewithoutspacesyoucouldprobablystillunderstandme.

My guess is that 95% of spaces in a text are just there to make reading easier on the eyes. GPT doesn't care. Every space you eliminate is one extra token that you can use to convey information.
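If you want to see what this does to a concrete sentence, a quick check (again assuming tiktoken, with cl100k_base as the encoding used by GPT-3.5 and GPT-4) looks like this:

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by GPT-3.5 and GPT-4

text = "Humans need spaces between words because we need to see the text."
no_spaces = text.replace(" ", "")

# Compare how many tokens the sentence costs with and without its spaces.
print(len(enc.encode(text)), len(enc.encode(no_spaces)))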

Can we do better than just removing spaces? Of course. I asked GPT. (I was frustrated because I was trying to feed it large blog posts and kept running up against the limit.) It came up with a Python function that does the following:

  • Tokenizes the text

  • Removes stopwords

  • Applies the Porter stemming algorithm

  • Removes a few common words: 'the', 'a', 'an', 'in', 'on', 'at', 'for', 'to', 'of'

  • Removes all spaces and jumbles the words together

You can read the code here. It's very simple! This is standard NLP preprocessing stuff. But I haven't seen anyone use it for this purpose yet.
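The linked code is the authoritative version; what follows is only a rough sketch of those steps, assuming NLTK for tokenization, stopwords, and Porter stemming.

# pip install nltk
import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("stopwords", quiet=True)

# A few extra common words to drop on top of NLTK's stopword list.
EXTRA_WORDS = {"the", "a", "an", "in", "on", "at", "for", "to", "of"}

def trim(text: str) -> str:
    """Drop stopwords, stem what remains, and remove all spaces."""
    stemmer = PorterStemmer()
    stop = set(stopwords.words("english")) | EXTRA_WORDS
    words = word_tokenize(text)
    kept = [stemmer.stem(w) for w in words if w.lower() not in stop]
    return "".join(kept)  # jumble the stems together with no spaces

print(trim("This is an instance of compressed text that GPT can still read."))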

A couple of weeks ago Twitter discovered "Shoggoth Tongue". You can get GPT to write highly compressed text in an idiosyncratic language that its own instances can understand. This is extremely fascinating. However, it is not effective as a method to save money on GPT, because you still need to use GPT for the compression.

gptrim does not need GPT to compress text, which makes it quick and free.

How Can I Use This?

gptrim rewrites your prompts so they are ~50% shorter. You can simply paste the shortened prompt in ChatGPT or feed it to your API. GPT will then follow your instructions. No special explanation is needed. GPT won't see anything weird about your text!
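For API use, the workflow might look like the sketch below. I am assuming that the gptrim package exposes a trim() function, and I am using the openai client that was current when this was written; check both projects for the exact interface.

# pip install gptrim openai
import openai             # pre-1.0 openai client, current at the time of writing
from gptrim import trim   # assumes the library exposes a trim() function

openai.api_key = "sk-..."  # your API key

long_prompt = "Summarize the following blog post in three bullet points: ..."
short_prompt = trim(long_prompt)  # same instructions, roughly half the characters

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": short_prompt}],
)
print(response["choices"][0]["message"]["content"])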

How Well Does It Work?

I haven't tested this extensively. From what I've seen, GPT can recover most of the original meaning. This is also true for GPT-3.5.

The best way to verify compression quality is to ask GPT to decompress the text. I've shared a prompt for that at the top of the article.

The compression is not perfect. For some sentences, the meaning gets lost or misinterpreted. I don't recommend using this for applications where nuance is crucial (e.g. medical diagnosis).

Future Steps

This project was hacked together in an evening. It was very much a collaborative effort. I came up with the idea, and GPT wrote the trimming function. It also did the heavy lifting for writing the Flask web app.

There are several improvements that could be added:

  • Publish a Python library to do this programmatically.

  • Measure savings in GPT tokens, not character counts (see the sketch after this list).

  • Compute dollar savings based on OpenAI's pricing.

  • Run more experiments. Can we get GPT to answer in trimmed language, think to itself in trimmed language, and only decompress the text as a final step?
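For the second and third items, a rough sketch of the measurement could look like this (the per-token price is a placeholder, not OpenAI's actual pricing, and trim() is assumed as above):

import tiktoken
from gptrim import trim  # assumed as above

PRICE_PER_1K_TOKENS = 0.002  # placeholder price in USD; check OpenAI's pricing page

def report_savings(original: str, model: str = "gpt-3.5-turbo") -> None:
    """Print token savings and an approximate dollar figure for a trimmed prompt."""
    enc = tiktoken.encoding_for_model(model)
    trimmed = trim(original)
    before, after = len(enc.encode(original)), len(enc.encode(trimmed))
    saved = before - after
    print(f"tokens: {before} -> {after} ({saved} saved)")
    print(f"approx. savings: ${saved / 1000 * PRICE_PER_1K_TOKENS:.5f}")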

Finally, there must be better methods to compress text for GPT, without using GPT. I look forward to new ideas in this space.

Let's Connect!

I like to build stuff with AI and write about it. Find me on LinkedIn and Twitter.

