HarperDB is More Than Just a Database: Here's Why

Written by margohdb | Published 2021/08/22
Tech Story Tags: database | distributed-systems | edge | cloud | data-engineering | data-management | data-analytics | databases

TLDR HarperDB can be thought of as a data mesh, fabric, or surface rather than just a database. A team at Lumen Technologies chose HarperDB for a Department of Defense project that required moving, contextualizing, and converging data. HarperDB can be deployed on devices as small as single-board computers like a Raspberry Pi or Tinker Board, all the way up to large-scale servers, cloud machines, or supercomputers. It is designed to provide a holistic solution that makes data sync and management easy. For example, gaming and media industries can benefit from HarperDB’s high performance and low latency, with clear implications for both the organization and the end user.

I recently had a fascinating conversation on our podcast with Ron Lewis, the Director of Innovation and Engineering at Lumen Technologies. Ron brought up the notion that HarperDB is more than just a database, and for certain users or projects, HarperDB is not serving as a database at all. How can this be possible?

Database, Explained

Well, what really is a database? Wikipedia states that “In computing, a database is an organized collection of data stored and accessed electronically from a computer system.” Another site states that “A database is a systematic collection of data. They support electronic storage and manipulation of data. Databases make data management easy.”
So at its core, yes, HarperDB is certainly a database and can fulfill this functionality (after all, that’s what the DB stands for). But it can do so much more. For example, there are many cases where organizations keep their existing database system(s) in place and use HarperDB to extend their current functionality or for a different capability altogether.
Especially when it comes to solving complex enterprise data management challenges, the answer rarely (if ever) comes down to this database vs. that database.
There’s much more to it.
There are many different moving parts related to capturing the right data, getting data to where it needs to be in a timely manner, analyzing and acting on that data, etc. This is really where HarperDB shines.

HarperDB: A Runway for Launching Industry 4.0 Technology

Ron mentioned in our podcast discussion:
“The reason I see HarperDB as a disruptive technology is because you often call HarperDB a database, but it’s not really a database. It’s maybe what some people call a data mesh or data fabric... I see HarperDB more as a data surface, especially with Functions. The whole idea is to be able to converge and contextualize data to support decision making.”
The new Custom Functions Ron refers to will enable users to define their own API endpoints within HarperDB, ultimately expanding HarperDB from a distributed database to a distributed application development platform with integrated persistence.
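To make the idea of user-defined endpoints with integrated persistence concrete, here is a minimal sketch in Python. It is not HarperDB's Custom Functions API: the `DataNode` class, its `route` decorator, and the `/hot-readings` path are all hypothetical names invented for illustration. The only point it tries to show is that the endpoint handler runs next to the data, so no separate API proxy sits in front of the data engine.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Handler = Callable[["DataNode", Dict[str, Any]], Any]


class DataNode:
    """Hypothetical stand-in for a node that both stores data and serves endpoints."""

    def __init__(self) -> None:
        self.tables: Dict[str, List[Record]] = {}
        self.routes: Dict[str, Handler] = {}

    def insert(self, table: str, record: Record) -> None:
        # Append a record to an in-memory table, creating the table if needed.
        self.tables.setdefault(table, []).append(record)

    def route(self, path: str) -> Callable[[Handler], Handler]:
        """Register a user-defined endpoint that runs next to the data."""
        def register(handler: Handler) -> Handler:
            self.routes[path] = handler
            return handler
        return register

    def call(self, path: str, params: Dict[str, Any]) -> Any:
        # Dispatch a request to the handler registered for this path.
        return self.routes[path](self, params)


node = DataNode()
node.insert("readings", {"sensor": "pump-1", "temp_c": 71.5})
node.insert("readings", {"sensor": "pump-1", "temp_c": 88.0})
node.insert("readings", {"sensor": "pump-2", "temp_c": 64.2})


@node.route("/hot-readings")
def hot_readings(n: DataNode, params: Dict[str, Any]) -> List[Record]:
    # Contextualize raw rows at the data layer: keep only readings above a threshold.
    threshold = params.get("threshold", 80.0)
    return [r for r in n.tables.get("readings", []) if r["temp_c"] > threshold]


result = node.call("/hot-readings", {"threshold": 80.0})
```

In a real deployment the handler would be exposed over HTTP and the tables would be persistent; the sketch keeps both in memory so it stays self-contained.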
So, now we’re thinking of HarperDB as a data mesh, fabric, or surface instead of a database. That’s a lot of buzzwords! Let’s take a step back.
When I asked Ron what initially drew him and his team to HarperDB, he provided some great insight. Ron mentioned that they were working on a project for the Department of Defense (DoD) that required moving, contextualizing, and converging data and they needed something super fast and intuitive.
They were essentially looking for something easy to use and easy to deploy that’s also flexible and scalable. Once Ron connected with HarperDB, he discovered that it could be deployed on devices as small as single-board computers like a Raspberry Pi or Tinker Board, all the way up to large-scale servers, cloud machines, or supercomputers.
This piqued his interest, as he needed the ability to do large-scale analytics and move the data between devices in a simple manner.
At a basic level, we quickly realized that Ron and the HarperDB team were asking the same questions:
  • When we look at how much data has to move, and how much is being created every hour from operational technology (OT) data onsite and elsewhere, how do we manage, transport, and take advantage of all of that data?
  • How do we get the data to where it needs to be in the most efficient manner possible?

Extended Functionality

Ron said that with HarperDB, he and his team could “define the data movement and do all these crazy cool things.” As they looked at different military adaptations, they were able to take data running inside an integrated controller (OT) environment and expose it without needing a human-machine interface (HMI).
They could then securely move that OT data into a highly scalable enterprise analytics domain, powered by HarperDB running on cloud compute nodes.
There are many use cases similar to this, where HarperDB can provide a holistic solution that makes data sync and management easy. In the defense space, HarperDB’s bidirectional data movement enables the collection and movement of data and logic in real-time, shifting decision-making throughout the network as needed.
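As a rough illustration of what bidirectional data movement between an edge device and the cloud can look like, here is a minimal Python sketch. It is not HarperDB's replication mechanism: the `Node` class, the `sync` function, the integer logical timestamps, and the last-write-wins merge rule are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

# Each key maps to (logical timestamp, value); the timestamp scheme is an assumption.
Versioned = Tuple[int, Any]


@dataclass
class Node:
    name: str
    data: Dict[str, Versioned] = field(default_factory=dict)

    def write(self, key: str, value: Any, ts: int) -> None:
        self.data[key] = (ts, value)

    def merge_from(self, other: "Node") -> None:
        """Pull entries from `other`, keeping whichever version is newer."""
        for key, (ts, value) in other.data.items():
            if key not in self.data or ts > self.data[key][0]:
                self.data[key] = (ts, value)


def sync(a: Node, b: Node) -> None:
    """Bidirectional sync: afterwards both nodes hold the same data."""
    a.merge_from(b)
    b.merge_from(a)


edge = Node("edge")
cloud = Node("cloud")

edge.write("pump-1/temp_c", 88.0, ts=1)         # data created at the edge
cloud.write("pump-1/alert_above", 85.0, ts=2)   # config pushed down from the cloud
cloud.write("pump-1/temp_c", 70.0, ts=0)        # stale copy held in the cloud

sync(edge, cloud)
```

After `sync`, the cloud holds the fresh reading from the edge and the edge holds the alert threshold set in the cloud. Last-write-wins is only one possible conflict rule; a production system would also need to handle clock skew, deletes, and partial connectivity.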
Gaming and media industries benefit from HarperDB’s high performance and low latency, with clear implications for both the organization and the end user. Retail and ticketing companies can recognize and block bad bots in real time with HarperDB’s global replication and edge persistence.
The list goes on!

Why HarperDB?

Ron explained, “We started looking at all the different databases that are scalable like Couchbase and a bunch of others, but we ended up focusing on HarperDB because of the flexibility.
Then, Stephen came up with the idea of Functions because a lot of what we did required us to put an API proxy in front of the data engine. He said, how about I make your life simpler? It’s just amazing how HarperDB checks all the boxes.”
Ron continued, “If you think about how databases communicate and the different models, I love the way HarperDB does it through the native integration of all of these components. No matter what it’s running on or where, HarperDB is disruptive because I’m able to move the different types of data, and different types of assets like functionality, from place to place seamlessly without having to worry about the interoperability of different data engines, nor do I have to worry about the size and scale.
Databases are not typically designed as persistent vs. non-persistent; they tend to be scaled vertically instead of horizontally. HarperDB scales beautifully; a containerized version of HarperDB tied to persistent storage allows me to scale HarperDB to meet my performance goals.
The workload it can perform is amazing, and the ability to actually scale horizontally is amazing as well because it’s not typical for database engines.”
HarperDB is therefore well suited to complex enterprise data challenges: the database engine is small and flexible enough to run on a microcontroller in an onboard system, can be extended to edge bare metal or another edge computing environment for higher-fidelity analysis, and can also be moved to the cloud, all at the speed of the Internet.
HarperDB can scale vertically and horizontally while meeting performance needs. It really is more than just a database.
To sum it up, Ron stated:
“From a data driven ecosystem, HarperDB is paving the path forward moving from mesh to fabric to actual data surface and providing that contextualization of data right out of the database engine, which will be key to a fundamental shift in application behavior.”

Predictions for the Future

To wrap up our conversation, I asked Ron about the future of technology. He mentioned a few key things:
  • The move from cloud to edge is almost certain. Applications will become more distributed, with functionality spread across edge workloads that are managed and deployed from a cloud orchestrator.
  • As applications change, data will be contextualized more at the database engine or persistence layer than at the application or business layer, and HarperDB is leading the charge on that.
There you have it, folks. The future is all about data. There is a constant need for organizations to have their data where they need it, with the ability to orchestrate data where it’s both being created and consumed.
If you’re not evaluating your data and how you’re handling your data assets, where will you be 1, 5, or 10 years from now?

Written by margohdb | On the innovative team @ HarperDB. Podcast host. Tech blogger. DevRel. Women in tech
Published by HackerNoon on 2021/08/22