PostgreSQL vs. MongoDB: Evaluating Database Structure, Speed, and More

Written by karllhughes | Published 2021/10/13
Tech Story Tags: mongo | postgres | mongodb | postgresql | databases | postgresql-vs.-mongodb | coding-tutorial | choosing-a-database

TLDR It’s been over 10 years since MongoDB was released, but the NoSQL vs. NoSQL debate rages on. NoSQL databases are flexible, fast to write to, and created with the modern web in mind. On the other hand, SQL databases (of which Postgres is one of the most popular) are still the backbone of applications today. In this article, I’ll compare Postgres and MongoDB in terms of speed, usability, deployment options, and scalability.via the TL;DR App

It’s been over 10 years since MongoDB was released, but the SQL vs. NoSQL debate rages on.

On one hand, NoSQL databases (of which Mongo is one specific type, a document store) are flexible, fast to write to, and created with the modern web in mind.

On the other hand, SQL databases (of which Postgres is one of the most popular) are still the backbone of applications today. Everyone knows how to use them and many incorporate NoSQL-like optimizations, making the line a bit blurry.

Either way, whenever you start a new application, you have to choose a database. This can be a scary prospect as it’s nearly impossible to change your core data model once you’ve built something significant. That’s why the debate is important, even now.

In this article, I’ll compare PostgreSQL and MongoDB. You’ll see how they stack up in speed, usability, deployment options, and scalability.

While it’s a complex topic (and I’m sure I’ll miss a few things that you can deride me for in the comments), I hope this gives you a great starting point for picking your next database.

Overview

PostgreSQL is a traditional, open-source RDBMS (relational database management system). It is not owned by a private corporation or entity and the source code is available free of charge. It has earned a strong reputation for reliability, extensibility, feature robustness, and performance.

A relational database like PostgreSQL is a collection of data items organized in tables. A table consists of rows, and each row contains the same set of columns. PostgreSQL uses primary keys to uniquely identify each row (a.k.a. record) in a table, and foreign keys to assure the referential integrity between two related tables.

PostgreSQL is ACID-compliant, transactional, has updatable and materialized views. It also supports triggers, functions, stored procedures, and foreign keys.

In PostgreSQL, database schemas and models need to be defined ahead of time, and data must match this schema to be stored in the database.

MongoDB is a document-oriented database that stores data in JSON-like documents with dynamic schema. You can store your records without worrying about the data structure such as the number of fields or types of fields to store values. MongoDB documents are essentially JSON objects.

In MongoDB, collections of documents do not require a predefined structure and columns can vary for different documents. MongoDB offers an expressive query language and strong consistency.

Database Structure

PostgreSQL stores data in the form of tables. Each table will have a predefined set of columns and a table is a collection of rows. Each row has a set of values corresponding to each column.

MongoDB stores data in the form of collections. Each collection will have a set of keys and corresponding values. A collection will have a set of documents.

These naming differences might seem petty, but they’re important to grasp as using the wrong names can confuse other engineers on your team.

PostgreSQL

MongoDB

Table

Collection

Column

Key

Value

Value

Records

Documents

Speed and Indexes

Without an index, any database will have to to a full-table-scan, essentially searching through all records one by one, in the order they appear to find the matching query. This operation becomes extremely time-consuming as your table size increases.

An index sets up a column in a specified order to help optimize query performance by fetching results quickly. But indexes also add overhead to the database system as a whole, so they should be used sensibly.

This video gives a great overview of indexes from a more technical perspective.

PostgreSQL Indexes

PostgreSQL includes built-in support for regular B-tree, Hash, GiST, and GIN. Each index type uses a different algorithm that is best suited to different types of queries. By default, the CREATE INDEX command creates B-tree indexes, which are usually a good option. Indexes in PostgreSQL also support the following:

  • Expression indexes - created with an index of the result of an expression or function, instead of simply the value of a column
  • Partial indexes - index only a part of a table
  • Multifield/Compound indexes - index multiple fields of a table
  • Geospatial Indexes - index to support geospatial queries (via PostGIS plugin)

MongoDB Indexes

Indexes are preferred in MongoDB. If an index is missing, every document within the collection must be searched to select the documents that were requested in the query. This can greatly hamper performance and read times.

Indexes in MongoDB also support:

  • Single Field Indexes - index on a single field of the collection
  • Compound Indexes - index multiple fields of the collection
  • Geospatial Indexes - index to support geospatial queries
  • Text Indexes - index type that supports searching for string content in a collection
  • Partial Indexes - only index the documents in a collection that meet a specified filter expression

Query Language

One of the biggest differences between using PostgreSQL and MongoDB in practice is the query syntax.

To demonstrate this, let’s look at how to create a customers table with id, name, email columns and compare basic CRUD operations using PostgreSQL and MongoDB.

Create customers table/collection

PostgreSQL : In PostgreSQL, we need to define table schema before inserting records.

CREATE TABLE customers (
    id int,
    name varchar(255),
    email varchar(255),
    PRIMARY KEY (id)
);

MongoDB: In MongoDB, there is no need to pre-define collection schema as it allows dynamic schema.

Insert records into the customers table/collection

PostgreSQL :

INSERT INTO `customers` (`id`, `name`, `email`) 
VALUES (1, 'Mr. XYZ', 'xyz@domain.com');

MongoDB :

db.customers.insert(
   {id: 1, name: "Mr. XYZ", email: "xyz@domain.com"}
)

Update records in customers table/collection

PostgreSQL :

UPDATE customers set name='Mr. ABC' where name='Mr.XYZ';

MongoDB :

db.customers.update({"name": "Mr. XYZ"}, {$set: {"name": 'Mr. ABC'}}, {multi: true})

Delete records in customers table/collection

PostgreSQL :

DELETE from customers where id=1;

MongoDB :

db.customers.remove({ id: 1 })

Select records from customers table/collection

PostgreSQL :

select * from customers;

MongoDB :

db.customers.find()

If you are familiar with javascript and JSON, you will find MongoDB Query Language to be simpler and easy to learn whereas if you’re familiar with traditional SQL(Structured Query Language), you’ll likely find PostgreSQL more natural.

Deployment

PostgreSQL Deployment Options

PostgreSQL is available on Windows, macOS, and Linux, and all the major cloud providers offer support for it. Many also provide zero-downtime migration if you ever need to move your data and offer a pay-as-you-go pricing model.

MongoDB Deployment Options

MongoDB similarly runs on all major operating systems and most of the big cloud providers. MongoDB also offers its own proprietary service called Atlas to deploy and scale your instance in the cloud.

In either case, managed hosting is probably a good idea for small teams. Having your database go down can be really scary, so unless you have the expertise in-house, you’re probably better off paying for managed hosting.

Replication

Replication lets you create multiple copies of your data to improve:

  • Fault tolerance - In the event of a primary database server failure, the replica server can take over, since it already contains the primary server’s data.
  • Data migration - This allows you to upgrade servers or deploy the same system for another customer.
  • OLTP Performance - Allows a dedicated instance for read-only queries (like internal analytics) that doesn’t harm the performance of the primary database.
  • Query performance - Spreading read queries across copies will improve overall performance.

PostgreSQL Replication

There are two methods for replication in PostgreSQL:

Synchronous replication or 2-safe replication. In synchronous replication, transactions on the primary database are declared complete only when those changes have been replicated to all the replicas. Unless both databases crash simultaneously, data won't be lost.

Asynchronous replication means that transactions on the primary server can be declared complete when the changes have been done on at least the primary server. These changes are then replicated to the replicas when they’re available. The replica servers can remain out-of-sync for a certain duration, which is called a replication lag.

Both synchronous and asynchronous modes both have their pros and cons. In synchronous mode, write latency will increase but the data replication is always in sync. In asynchronous mode, writes will be faster, but there is a slight chance of data inconsistency, especially if the primary database goes down.

MongoDB Replication

A replica set in MongoDB offers redundancy and high availability. Using a similar model (primary and secondary nodes), you can replicate data across multiple servers to ensure no downtime or lost data in the case of an outage.

Typically, the primary node will receive all write operations and records changes data sets in its operation log.

Meanwhile, the secondaries replicate the primary's oplog and apply the operations to their data sets. If the primary is unavailable, an eligible secondary will hold an election to elect itself the new primary.

Security

MongoDB uses role-based access control and flexible permissions that admins can set. All data is encrypted with TLS in transit, and it’s possible to encrypt documents in a collection at rest using a master key.

PostgreSQL supports the same encryption features as MongoDB and its authentication model is similar. Users can be granted roles and privileges, giving them permissions over particular database operations and against particular data sets.

Community and Support

Interestingly, both PostgreSQL and MongoDB are open-source, but their supporting organizations are financed much differently.

PostgreSQL is maintained by PostgreSQL Global Development Group and is financially supported by sponsors (mostly large tech companies). The PostgreSQL community has been active for over 20 years so there are innumerable tutorials, guides, and extensions available.

MongoDB is run by a for-profit corporation, and while also open-source, the company offers commercial support and hosting services to fund the project. Most of their paid services are geared towards enterprise development, so most small-time projects will never need to pay for MongoDB. The MongoDB community is also very active, so finding resources to solve specific problems is usually pretty easy, although slightly less so than PostgreSQL in my experience.

Choosing the Right Database

PostgreSQL has been around longer, so there’s probably more large-scale enterprise applications using it, but that doesn’t mean it’s always better.

MongoDB can be a great choice if you need scalability and caching for real-time analytics; however, it is not built for transactional data (accounting systems, etc.). MongoDB is frequently used for mobile apps, content management, real-time analytics, and applications involving the Internet of Things.

Ultimately, there’s no “right” option for every project, but here’s how I’d think about picking between PostgreSQL and MongoDB:

  • How certain are you about your data model? Postgres is fantastic and fast when you know your schema up-front. If you need flexibility or you’re in a domain that requires a lot of flexibility, MongoDB might be a better option. (Although PosgreSQL’s JSON functions are a nice middle ground.)
  • Is your application read or write heavy? Generally, PostgreSQL is better for read-heavy ops while MongoDB can be faster with writes.
  • Which does your team know better? Ultimately, developer experience is a huge factor and the database you can work with most comfortably might be best for large-scale projects.

I hope this has given you some insights into choosing between these two popular databases. If you have questions, leave a comment, and let’s keep the conversation going.


Written by karllhughes | Former startup CTO turned writer. Founder of draft.dev
Published by HackerNoon on 2021/10/13