From Reds to RediSearch

Redis search got a lot more interesting

Have you heard? Redis 4.0 is out. That means modules. Not just in the theoretical, shouldn’t-run-in-production sense. It means primetime. Throw’em on your production server and go.

Note: Redis (the database) and Node.js (the JS runtime) both use the term “module” for different concepts — the language herein has to be a bit laboured to be clear and differentiate.

Recently, I wrote about Reds, the Node.js/Redis search engine module. Reds is quite good: I’ve built entire web services around it and they run really quickly. However, Node.js is JavaScript, and JavaScript, while decently speedy, is still interpreted and sits on an entirely different layer from the database (Redis). Modules, however, are built in a compiled language and have no network layer between logic and data. They have to be faster, right? Well, that was my task: work out how much faster a module is than a script. Let’s find out!

Enter: RediSearch

I’ve been maintaining Reds for years now and I’m pretty familiar with the code as well as the use of the script. I’ve also been eyeing the development of RediSearch for a few months (this white paper on RediSearch got me interested). It has really evolved from something simple to something really powerful:

RediSearch has its own module system for extensibility (trigger Xzibit image macro about modules inside modules)
During RedisConf 17, a preview demo of a clustered implementation of RediSearch was shown that has indexed over a billion (yes, with a b) documents at 100k documents/sec.

With the exception of the metaphone indexing, RediSearch has everything Reds has and a ton more. My goal was to see if I could map out the Reds functionality to RediSearch and create a (more-or-less) syntax compatible Node.js module: RedRediSearch. The syntax compatibility would allow the existing users of Reds to do a fairly painless rip-and-replace.

The small differences

Unfortunately, you’re not going to be able to just point RediSearch at the Reds data and magically gain performance; you’ll need to re-index your data. RediSearch is a Redis module that uses a different (and, frankly, more efficient) custom data type, whereas Reds uses native zsets. Aside from that, it’s painless. And indexing is pretty quick, as you’ll find out.

The “natural language processing” functions can’t map over, although you will likely not need them, or the same effect can be achieved through a few tweaks to your index (I’ll cover that later). The other thing is that creating an index is not conceptually the same: in Reds, reds.createSearch only prepares Node.js for accepting data. In RediSearch, however, creating an index is an actual Redis call (FT.CREATE) and consequently needs a callback. Aside from that, searching and indexing use the same syntax. Adapting a few of my scripts was as easy as adjusting 3–4 lines. I also added a function to confirm the existence of the module. I’m just that nice of a person.
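To make the callback point concrete, here is a minimal sketch of index creation as an asynchronous call. It assumes a node_redis-style client exposing send_command(command, args, callback). The helper name createIndex and the one-field schema ('payload' as a TEXT field) are illustrative assumptions, not RedRediSearch's actual internals.

```javascript
// Hypothetical helper (not part of RedRediSearch): creates a RediSearch index
// with a single text field. `client` is assumed to be a node_redis-style
// client exposing send_command(command, args, callback).
function createIndex(client, key, callback) {
  // FT.CREATE is a real round trip to Redis, so the result arrives later
  // and has to be delivered through a callback.
  client.send_command('FT.CREATE', [key, 'SCHEMA', 'payload', 'TEXT'], function (err) {
    if (err) { return callback(err); }
    callback(null, key);
  });
}

// Usage sketch (needs a live Redis with the module loaded):
// createIndex(client, 'pets', function (err, key) {
//   if (err) { throw err; }
//   /* only index and query once this has fired */
// });
```

The practical consequence is that any indexing or querying has to wait until that callback has fired, whereas with Reds you could start using the search object immediately.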

What do you get from RediSearch over Reds?

You get better performance. Running the Reds benchmark over the same data on the same machine, I’m getting between 6.6x and 13x better indexing performance out of RediSearch than out of Reds (depending on the size of the documents). Querying is faster too: between 1.2x and 7.4x faster. The whole benchmark completes 12x faster. That’s a huge improvement for a drop-in replacement.

Yowza! Note: “ops” here are whole documents indexed or search queries run per second, not Redis operations per second!

This is on a modest laptop with a stock, non-tuned install of OSS Redis.

The other thing you gain by switching to RedRediSearch is a richer, more flexible query system.

Reds had the ability to do two types of search: AND and OR. That is, you could only query a document that had one of the query keywords or all the query keywords. Useful, but not very flexible.

With RediSearch you’ve got a rich query language that allows you to do all Reds can do but also exact phrase searches, negated searches, prefix searches, optional keywords and combinations. Out of the box, RedRediSearch can accommodate syntax compatible searches:

//defaults to an "AND" search
search
  .query('cat dog')
  .end(function(err, documents){/* ... */});

//using the `type` fn to get an "OR" search
search
  .query('cat dog')
  .type('or')
  .end(function(err, documents){/* ... */});

But you can also leverage the cool query language by specifying a direct search like this:

//documents without 'dog' but with 'cat'
search
  .query('cat -dog')
  .type('direct')
  .end(function(err, documents){/* ... */});

//documents with 'cat' and optionally with 'dog', but having 'dog' will boost placement
search
  .query('cat ~dog')
  .type('direct')
  .end(function(err, documents){/* ... */});

//complex and combinations are also possible
search
  .query('(cat|dog) (felix|lassie)') //must have cat or dog AND felix or lassie
  .type('direct')
  .end(function(err, documents){/* ... */});
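If you would rather not assemble these query strings by hand, a small builder can do it. This is only a sketch: buildDirectQuery is a hypothetical helper, not something RedRediSearch ships, and it does no escaping of special characters in the terms.

```javascript
// Hypothetical helper (not part of RedRediSearch): builds a 'direct' query
// string from keyword lists, using the -, ~ and (a|b) operators shown above.
function buildDirectQuery(parts) {
  var required = parts.required || []; // every term must match
  var anyOf = parts.anyOf || [];       // array of groups; one term per group must match
  var optional = parts.optional || []; // may match; a match boosts placement
  var excluded = parts.excluded || []; // must not match
  var pieces = required.slice();
  anyOf.forEach(function (group) { pieces.push('(' + group.join('|') + ')'); });
  optional.forEach(function (term) { pieces.push('~' + term); });
  excluded.forEach(function (term) { pieces.push('-' + term); });
  return pieces.join(' ');
}

console.log(buildDirectQuery({ required: ['cat'], excluded: ['dog'] }));
// prints: cat -dog
console.log(buildDirectQuery({ anyOf: [['cat', 'dog'], ['felix', 'lassie']] }));
// prints: (cat|dog) (felix|lassie)
```

The result can be passed straight to search.query(...).type('direct').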

Now, while all this is super cool and useful — it’s actually a dumbed down client for RediSearch. RediSearch can do a ton more things:

filter
sort
index and return hashes
search with related geo information
search with words in between (‘slop’)
search ranges of numeric fields rather than just text

And probably more features that I’ve yet to fully understand or discover.
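Several of the features listed above are exposed as extra arguments to a raw FT.SEARCH call. The sketch below builds such an argument list; buildSearchArgs is a hypothetical helper, and the index and field names ('pets', 'age', 'location') are made up. FILTER, GEOFILTER, SLOP and SORTBY are options from the FT.SEARCH documentation, but check that your RediSearch version supports the ones you need.

```javascript
// Hypothetical helper: builds the argument list for a raw FT.SEARCH call.
// Index and field names used below are made up for illustration.
function buildSearchArgs(index, query, opts) {
  opts = opts || {};
  var args = [index, query];
  if (opts.numericFilter) {
    // FILTER {numeric_field} {min} {max}
    var f = opts.numericFilter;
    args.push('FILTER', f.field, String(f.min), String(f.max));
  }
  if (opts.geoFilter) {
    // GEOFILTER {geo_field} {lon} {lat} {radius} m|km|mi|ft
    var g = opts.geoFilter;
    args.push('GEOFILTER', g.field, String(g.lon), String(g.lat), String(g.radius), g.unit);
  }
  if (typeof opts.slop === 'number') {
    // SLOP {n}: how many other words may sit between the query terms
    args.push('SLOP', String(opts.slop));
  }
  if (opts.sortBy) {
    // SORTBY {field} ASC|DESC
    args.push('SORTBY', opts.sortBy.field, opts.sortBy.descending ? 'DESC' : 'ASC');
  }
  return args;
}

var exampleArgs = buildSearchArgs('pets', 'cat dog', {
  numericFilter: { field: 'age', min: 1, max: 5 },
  slop: 2
});
console.log(exampleArgs.join(' '));
// prints: pets cat dog FILTER age 1 5 SLOP 2
```

With a node_redis client, the list would be sent as client.send_command('FT.SEARCH', exampleArgs, callback).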

You can also use RediSearch to manage suggestions (aka auto complete). The suggestions are completely divorced from the indexing process: you can add and remove items for the suggestion list, then get all items that start with the same letters. You can even do “fuzzy” searching where exact matching isn’t needed. Here is a short example:
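What follows is a minimal sketch rather than a definitive implementation. It assumes a node_redis-style client with send_command(command, args, callback); the helper names and the key 'pet-suggestions' are made up for illustration. FT.SUGADD and FT.SUGGET are the RediSearch suggestion commands.

```javascript
// Hypothetical helpers: build arguments for the RediSearch suggestion commands.

// FT.SUGADD {key} {string} {score}
function sugAddArgs(key, text, score) {
  return [key, text, String(score)];
}

// FT.SUGGET {key} {prefix} [FUZZY] [MAX {num}]
function sugGetArgs(key, prefix, opts) {
  opts = opts || {};
  var args = [key, prefix];
  if (opts.fuzzy) { args.push('FUZZY'); }          // tolerate small typos in the prefix
  if (opts.max) { args.push('MAX', String(opts.max)); }
  return args;
}

// Adds one suggestion. `client` is assumed to be a node_redis-style client.
function addSuggestion(client, key, text, callback) {
  client.send_command('FT.SUGADD', sugAddArgs(key, text, 1), callback);
}

// Fetches up to five fuzzy completions for what the user has typed so far.
function suggest(client, key, typed, callback) {
  client.send_command('FT.SUGGET', sugGetArgs(key, typed, { fuzzy: true, max: 5 }), callback);
}

// Usage sketch (needs a live Redis with the module loaded):
// addSuggestion(client, 'pet-suggestions', 'labrador', function (err) {
//   suggest(client, 'pet-suggestions', 'lab', function (err, items) { /* ... */ });
// });
```

Removing an entry works the same way with FT.SUGDEL, which takes the key and the exact string to delete.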

The idea with the suggestions is to maintain the suggestion list based off the queries coming into your search, adding results that aren’t search misses. When a user starts typing, the auto complete can be fetched asynchronously from the backend and presented to the user as they type.

RediSearch is indeed a very full toolkit.

Interacting with RediSearch

Earlier, I mentioned that you can make use of more than just what is contained in the Node.js module. Node_redis has the capability to send commands directly to Redis and it’s very useful when needing some special parameters and, boy, does RediSearch have extra parameters. Reds and RedRediSearch are effectively just wrappers sending commands directly to Redis. You can absolutely use them together if a feature doesn’t exist (yet) in the Node.js module.

As an example, let’s say you want to have a minimal set of “stop words” that are ignored in indexing and querying. The existing (Reds) syntax uses a callback function for stop words, but before you initially create the index, you can manually call FT.CREATE with something like:

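client.send_command(
  'FT.CREATE',
  //index options such as STOPWORDS go before SCHEMA; everything after SCHEMA is read as field definitions
  [myKey, 'STOPWORDS', '2', 'bunny', 'rabbits', 'SCHEMA', 'payload', 'TEXT'],
  function(err, response) {/* ... */}
);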

This would create two stop words, ‘bunny’ and ‘rabbits’, that would be ignored entirely. So, if a feature is not yet supported, it’s easy to break free from the existing confines and leverage the full power of the RediSearch module.

Using the Redis module

First you need to install Redis 4.0 (if you don’t already have it) and then the module. I’m using 0.19.3, but check on RediSearch.io or the GitHub releases page for the latest version.

$ wget https://github.com/RedisLabsModules/RediSearch/archive/v0.19.3.tar.gz
...Download messages...

$ tar -xvzf v0.19.3.tar.gz
...decompression messages...

This should yield a directory called /RediSearch-0.19.3. Switch into that directory and we’ll make everything.

/RediSearch-0.19.3$ make all
...build messages...

Now, at the end of your Redis 4.0+ redis.conf file, add in the module loading command:

loadmodule /path/to/RediSearch-0.19.3/src/redisearch.so

Your path might need to be tweaked if you decompressed the tarball in a different place or used a different version, but the most relevant parts are the file (redisearch.so) and the last directory (src).

At this point you should reload the redis-server and test out your module install with redis-cli:

> ft.create
(error) ERR wrong number of arguments for 'ft.create' command

The error is good: it means your server knows how many arguments FT.CREATE needs. If the module wasn’t loaded, this would happen:

> ft.create
(error) ERR unknown command 'ft.create'
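The same check can be scripted from Node.js. This is a rough sketch and not RedRediSearch's own confirmation function: confirmRediSearch is a hypothetical helper, and it assumes a node_redis-style client whose error objects carry the server's reply text in their message property.

```javascript
// Hypothetical helper: reports whether the FT.* commands are registered.
// Assumes a node_redis-style client with send_command(command, args, callback)
// and error objects whose `message` holds the server's reply text.
function confirmRediSearch(client, callback) {
  // Deliberately send FT.CREATE with no arguments, like the redis-cli check.
  client.send_command('FT.CREATE', [], function (err) {
    if (!err) {
      // A bare FT.CREATE should never succeed, so treat this as a failure.
      return callback(new Error('unexpected success from bare FT.CREATE'));
    }
    // "unknown command" means the module is missing; any other error
    // (such as "wrong number of arguments") means the command exists.
    var unknown = /unknown command/i.test(String(err.message));
    callback(null, !unknown);
  });
}

// Usage sketch (needs a live Redis):
// confirmRediSearch(client, function (err, loaded) {
//   console.log(loaded ? 'RediSearch is available' : 'RediSearch is not loaded');
// });
```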

After verifying the installation of the module, all we need to do is use the Node.js module. Using the RedRediSearch module is pretty straightforward if you’re used to Reds:

Should I switch from Reds to RediSearch?

Generally, I would say “yes, absolutely.” However, you may need to hold off. If you’re using open-source Redis and your data fits into a single instance, go for it right now. If you’re using Redis Enterprise Pack, you’ll need to be a little patient as they are putting the finishing touches on RediSearch + Redis Enterprise Pack, which will enable indexes that span across your entire cluster (think of the possibilities!). If you’re using Redis Enterprise Cloud, stick with Reds, as modules and (consequently) RediSearch aren’t available. However, this will be an option soon with Redis Cloud Private, so stay tuned for that. In all these situations, [Red]RediSearch or Reds, you’re covered because your app-level logic will stay the same. Hopefully RedRediSearch will continue to evolve to encompass all the features of RediSearch as well as maintain backwards compatibility with the Reds syntax. You can pick up the module on GitHub or through NPM.