CassIO: The Best Library for Generative AI, Inspired by OpenAI

If you’re a frequent user of ChatGPT, you know its tendency to wander off into what are known as hallucinations: collections of statistically plausible words that have no basis in reality. A few months ago, a prompt about using Apache Cassandra for large language models (LLMs) and LangChain resulted in a curious response. ChatGPT reported that not only was Cassandra a good tool choice when creating LLMs, but OpenAI also used Cassandra with an MIT-licensed Python library they called CassIO.

Into the rabbit hole we went, and through more prompting, ChatGPT described many details about how CassIO was used. It even included some sample code and a website. Subsequent research found no evidence of CassIO outside of ChatGPT responses, but the seed was sown. If this library didn’t exist, it needed to, and we started work on it shortly after.

Best hallucination ever.

Will the real CassIO please stand up?

What was this great idea that ChatGPT (and, by association, OpenAI) inspired? A great Python library enables developers to do more with less. DataStax and Anant joined forces to develop CassIO, which makes integrating Cassandra with generative artificial intelligence and other machine learning workloads seamless. Its principal purpose is to abstract access to the Cassandra database, including its vector search capabilities, by offering a set of ready-to-use tools that minimize the need for additional code. Developers can then focus on designing and implementing their AI systems, knowing that CassIO takes care of the underlying database complexities, and they get a proven database with affordable scale and low latency. The essence of CassIO is simplifying the implementation process.

CassIO's strength lies in its agnosticism toward specific AI frameworks. It doesn't concern itself with the specific implementation details of interfaces like LangChain, LlamaIndex, Microsoft Semantic Kernel, or various other generative AI toolkits. Instead, it provides a set of "thin adapters" that conform to the framework's interfaces while using the capabilities of CassIO. This enables CassIO to bridge the gap between your AI application and the database, thus enabling the application to leverage the power of Cassandra without getting entangled in its details.
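To make the "thin adapter" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: ChatHistoryStore is a hypothetical framework-facing interface (not a real LangChain or LlamaIndex class), and InMemoryTable is a stand-in for a Cassandra table rather than CassIO's actual API.

```python
from __future__ import annotations

from typing import Protocol


class ChatHistoryStore(Protocol):
    """Hypothetical interface an AI framework might expect (illustration only)."""

    def add_message(self, session_id: str, text: str) -> None: ...

    def get_messages(self, session_id: str) -> list[str]: ...


class InMemoryTable:
    """Stand-in for a Cassandra table partitioned by key.

    A real implementation would issue CQL statements instead of using a dict.
    """

    def __init__(self) -> None:
        self._rows: dict[str, list[str]] = {}

    def append(self, partition: str, value: str) -> None:
        self._rows.setdefault(partition, []).append(value)

    def read(self, partition: str) -> list[str]:
        return list(self._rows.get(partition, []))


class TableBackedHistory:
    """Thin adapter: matches the framework interface, delegates storage to the table."""

    def __init__(self, table: InMemoryTable) -> None:
        self._table = table

    def add_message(self, session_id: str, text: str) -> None:
        self._table.append(session_id, text)

    def get_messages(self, session_id: str) -> list[str]:
        return self._table.read(session_id)


# The application only sees the framework-shaped interface.
history: ChatHistoryStore = TableBackedHistory(InMemoryTable())
history.add_message("session-1", "What is Cassandra?")
history.add_message("session-1", "A distributed database.")
```

The adapter holds no logic of its own, which is the point: supporting another framework would mean writing another small class like TableBackedHistory, while the storage layer stays untouched.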

Integration with LangChain

LangChain automates the majority of management tasks and interactions with LLMs. It provides support for memory, vector-based similarity search, advanced prompt templating abstraction, and a wealth of other features. CassIO integrates seamlessly with LangChain, extending Cassandra-specific tools to streamline tasks such as:

A memory module for LLMs that uses Cassandra for storage, which can remember recent exchanges in a chat interaction, or even keep a summary of the entire past conversation.
A feature to cache LLM responses on Cassandra, thereby saving on latency and tokens where possible.
Automatic injection of data from Cassandra into a prompt or within a longer LLM conversation.
Support for "partialing" of prompts, leaving some input unspecified for future supply.
Automatic injection of data from a Feast feature store (potentially backed by Cassandra) into a prompt.

These components work together to streamline the process of incorporating data into prompts and ensure smooth interaction between the LLM and the database.
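As a rough illustration of the response cache mentioned above, the sketch below keys each response on a hash of the model name and the exact prompt text. It is not CassIO or LangChain code: the class name, the get_or_call method, the fake_llm function and the dict standing in for a Cassandra table are all assumptions made for this example.

```python
import hashlib


class ExactMatchLLMCache:
    """Caches responses by exact (model, prompt) pair.

    The dict is a stand-in for a Cassandra table keyed by the hash.
    """

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model, prompt):
        # A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
        raw = f"{model}\x00{prompt}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def get_or_call(self, model, prompt, llm_call):
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: pay for the model call once, then remember the answer.
        self.misses += 1
        response = llm_call(prompt)
        self._store[key] = response
        return response


def fake_llm(prompt):
    """Placeholder for a real (slow, token-consuming) model call."""
    return f"echo: {prompt}"


cache = ExactMatchLLMCache()
first = cache.get_or_call("demo-model", "What is CassIO?", fake_llm)
second = cache.get_or_call("demo-model", "What is CassIO?", fake_llm)
```

The second call never reaches the model, which is where the latency and token savings come from. The limitation is that any change in wording, however small, produces a different key and therefore a miss.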

Integration with vector search

Vector search capabilities recently arrived in Cassandra and DataStax Astra DB (read about the news here), adding a key feature to an already popular database for transactional data. Cassandra's reputation for high scale means that you have a single place to store and process data, without costly operations to move it around. The addition of vector search has opened doors to a suite of "semantically aware" tooling made available in CassIO, such as:

A cache of LLM responses that are not dependent on the exact phrasing of a query.
A "semantic index" that can store a knowledge base and retrieve relevant parts to construct the best answer to a given question. This tool can be adapted to suit many specific needs and can be configured to retrieve diverse information to maximize the actual information flowing into the answer.
A "semantic memory" element for LLM chat interactions, which can retrieve relevant past exchanges even if they occurred in the distant past.
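A cache that does not depend on exact phrasing can be sketched as a nearest-neighbor lookup with a similarity threshold. The code below is a toy under stated assumptions: toy_embed is a bag-of-words counter that only captures word overlap (a real system would use an embedding model), the in-memory list stands in for a vector-enabled Cassandra table, and the SemanticCache class and its 0.7 threshold are hypothetical rather than part of CassIO.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Bag-of-words counts; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Returns a stored response when a new prompt is close enough to an old one."""

    def __init__(self, threshold=0.7):
        self.threshold = threshold
        self._entries = []  # (vector, response) pairs; stand-in for a vector table

    def put(self, prompt, response):
        self._entries.append((toy_embed(prompt), response))

    def get(self, prompt):
        query = toy_embed(prompt)
        best_score, best_response = 0.0, None
        # Linear scan for clarity; a vector index would do this lookup at scale.
        for vector, response in self._entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


semantic_cache = SemanticCache()
semantic_cache.put("what is apache cassandra", "A distributed NoSQL database.")
reworded = semantic_cache.get("What is Apache Cassandra exactly?")
unrelated = semantic_cache.get("How do I bake bread?")
```

Here the reworded question still finds the cached answer, while the unrelated one falls below the threshold and returns None. The same nearest-match lookup is the basic building block behind the semantic index and semantic memory items in the list, with stored documents or past exchanges taking the place of cached responses.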

The combination of CassIO and LangChain continues to expand and refine these capabilities over time to meet the ever-evolving needs of LLM management. The current state of the art is in chaining prompts to get more accurate responses from LLMs. In a recent paper describing a technique called tree-of-thought, vector search plays a critical role in persisting state from one prompt to the next. As these ideas move from academia to production, Cassandra will serve as an important part of the implementation.

Next prompt: What's ahead for CassIO

As an evolving tool, CassIO is growing rapidly, with new developments and updates frequently added. At the time of writing, CassIO supports LangChain, with LlamaIndex coming soon. The long-term goal of this project is to support high-scale memory for autonomous AI agents such as the JARVIS project. Agents with LLMs are an exciting development that will have an incredible impact on many industries with complex task handling. These agents will need to keep track of many aspects of data and interactions, and Cassandra is the right database for the job: reliable and performant.

An upcoming boot camp, “NoCode, Data & AI: LLM Bootcamp with Cassandra,” will offer developers a chance to work hands-on with the library to build a chatbot. Look for more activities like this coming to a city near you! We encourage users exploring CassIO to file issues, participate in the forums and help us improve this rapidly materializing hallucination.

Who knows how history will judge this moment? Was it a leak of internal information from OpenAI? Or, thinking a bit more darkly, is this the first step of AI to get humans to do its bidding? Either way, developers now have a simple-to-use library to tap into the near-infinite scale of Cassandra when striking off into the world of generative AI.

ChatGPT has given us a gift, so what are you going to build with this? I’m going to be diving into vector search in an upcoming webinar (register here!), and if you just want to get in and start working today, DataStax Astra has some great tutorials.

By Patrick McFadin, DataStax

Patrick McFadin is the co-author of the O’Reilly book 'Managing Cloud Native Data on Kubernetes.' He currently works at DataStax in developer relations and as a contributor to the Apache Cassandra project. Patrick has worked as chief evangelist for Apache Cassandra (he’s also a newly minted Cassandra committer!) and as a consultant for DataStax, where he had a great time building some of the largest deployments in production.