How to Build a Node.js Weather App with Cassandra

Written by kovidrathee | Published 2021/09/22

TL;DR: In this tutorial, you will build a minimal Node.js application that tracks real-time weather data from WeatherAPI.com and ingests it into Apache Cassandra. With the data in Cassandra, you can go on to push weather alerts or notifications and analyze weather patterns. It is a single-threaded application with configurable batch querying and ingestion capability. The application fetches and ingests temperature, wind speed, humidity, visibility, and wind direction. The configuration settings and the location types accepted by the application are discussed later in the tutorial.

In this tutorial, you will build a minimal Node.js application that tracks real-time weather data from WeatherAPI.com and ingests it into Apache Cassandra. After reading it, you can build on the application to create a weather app that pushes weather alerts or notifications and analyzes weather patterns.

Prerequisites

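Before getting started with the tutorial, you'll need everything the steps below rely on: Git, Node.js with npm, a WeatherAPI.com API key, and a DataStax Astra DB account.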

Clone the WeatherAPI SDK

We will be using the official SDK for WeatherAPI. To get started, clone the GitHub repository using the following git clone command:

git clone git@github.com:weatherapicom/weatherapi-Node-js

Go to the repository, install its dependencies, and run npm link so that the SDK can be linked with our weatherapi Node.js tutorial application:

cd weatherapi-Node-js
npm install
sudo npm link

Running npm link with sudo ensures that npm has access to the global npm packages directory, where it creates the symlinks. You are now ready to clone the next repository.

Clone the Cassandra Node.js application

This repository hosts a minimal Node.js application that fetches and ingests real-time weather data into Cassandra from WeatherAPI.com. The application uses the official Node.js SDK to make API calls to the weather API. It is a single-threaded application with configurable batch querying and ingestion capability. The application uses the Node.js config package to store the configurable settings.

The application will fetch and ingest weather data: temperature, wind speed, humidity, visibility, and wind direction. We will discuss the configuration settings and location types accepted by the application later in the tutorial.

You can clone the Cassandra Node.js WeatherAPI.com application using the git clone command:

cd ..
git clone git@github.com:kovid-r/cassandra-nodejs-weatherapi.git
cd cassandra-nodejs-weatherapi

After cloning the repository, you should install the Node.js dependencies specified in the package.json and package-lock.json files. npm will take care of installing the dependencies using the following command in the repository directory:

npm ci

To link the SDK node modules, run the following command:

npm link weatherapilib

Please note that you will need to rerun this command whenever you make a change in the application dependencies and run npm ci. These instructions are also present in the README file in the repository.

Create a database

In this tutorial, we will use DataStax's multi-cloud DBaaS, built on Apache Cassandra, called Astra DB. You will need to sign up for DataStax Astra DB to spin up a Cassandra database.

For learning and playing around with Cassandra, you won't need any more than the free $25 credit per month that Astra already offers its users. This translates to a free monthly allowance of roughly 40 GB of storage, 30 million reads, and 5 million writes, which is sufficient for small production applications. You can sign up with Google or GitHub, and no credit card is required.

Click on the Create Database option. This will lead you to the following page, where you can create a database on any of the three leading cloud providers globally: Google Cloud, AWS, and Azure. You will need to provide the database name (weather) and the keyspace name (realtime). You can read more about creating databases in Astra's official documentation.


Initialize the database

To access the database for the first time, use the in-browser CQL shell. You can initialize the database by running the following command in the shell. This command creates a table that will store all the different fields from the WeatherAPI.com API.

-- Create a table to store realtime weather data
CREATE TABLE realtime.weather (
    id UUID PRIMARY KEY, 
    city TEXT, 
    region TEXT, 
    country TEXT,
    lat DOUBLE,
    lon DOUBLE,
    tzid TEXT,
    temp_c DOUBLE,
    temp_f DOUBLE,
    condition TEXT,
    wind_mph DOUBLE,
    wind_kph DOUBLE,
    wind_degree DOUBLE,
    wind_direction TEXT,
    pressure_mb DOUBLE,
    pressure_in DOUBLE,
    precipitation_mm INT,
    precipitation_in INT,
    humidity INT,
    cloud INT,
    feelslike_c DOUBLE,
    feelslike_f DOUBLE,
    visibility_km DOUBLE,
    visibility_m DOUBLE,
    uv INT,
    gust_mph DOUBLE,
    gust_kph DOUBLE,
    apicall_epoch BIGINT,
    apicall_dt TEXT,
    lastupdated_epoch BIGINT,
    lastupdated_dt TEXT
    );
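To make the schema more concrete, here is a minimal sketch of how one WeatherAPI.com response could be flattened into a row for realtime.weather. The helper names toWeatherRow and buildInsert are hypothetical and do not come from the tutorial repository. The response field names (tz_id, wind_dir, precip_mm, vis_km, vis_miles, and so on) are assumptions based on the WeatherAPI.com current-weather JSON, so check them against a real response. Mapping vis_miles to visibility_m and rounding values for the INT columns are assumptions as well.

```javascript
const { randomUUID } = require('crypto'); // available in Node.js 14.17+

// Flatten one "current weather" response into an object whose keys match the
// columns of realtime.weather. Response field names are assumptions.
function toWeatherRow(response, now = new Date()) {
  const { location, current } = response;
  return {
    id: randomUUID(),
    city: location.name,
    region: location.region,
    country: location.country,
    lat: location.lat,
    lon: location.lon,
    tzid: location.tz_id,
    temp_c: current.temp_c,
    temp_f: current.temp_f,
    condition: current.condition.text,
    wind_mph: current.wind_mph,
    wind_kph: current.wind_kph,
    wind_degree: current.wind_degree,
    wind_direction: current.wind_dir,
    pressure_mb: current.pressure_mb,
    pressure_in: current.pressure_in,
    // The table declares these columns as INT, so the readings are rounded.
    precipitation_mm: Math.round(current.precip_mm),
    precipitation_in: Math.round(current.precip_in),
    humidity: current.humidity,
    cloud: current.cloud,
    feelslike_c: current.feelslike_c,
    feelslike_f: current.feelslike_f,
    visibility_km: current.vis_km,
    visibility_m: current.vis_miles, // assumed: "_m" stands for miles
    uv: Math.round(current.uv),
    gust_mph: current.gust_mph,
    gust_kph: current.gust_kph,
    apicall_epoch: Math.floor(now.getTime() / 1000),
    apicall_dt: now.toISOString(),
    lastupdated_epoch: current.last_updated_epoch,
    lastupdated_dt: current.last_updated,
  };
}

// Build a parameterized INSERT statement from a row object.
function buildInsert(row) {
  const columns = Object.keys(row);
  const placeholders = columns.map(() => '?').join(', ');
  return {
    query: `INSERT INTO realtime.weather (${columns.join(', ')}) VALUES (${placeholders})`,
    params: columns.map((column) => row[column]),
  };
}
```

With the DataStax Node.js driver, the result could be passed to client.execute(query, params, { prepare: true }), which lets the driver map the JavaScript values to the column types.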

Download the secure bundle

You need to download the DataStax Secure Bundle zip file to your Node.js application directory. Make sure that your browser doesn't unzip the file while downloading it. You will need to provide the path to this file in the configuration later. For more information about the Secure Bundle, head over to DataStax Astra's official documentation.

Generate an authentication token

The next step is to generate an authentication token. To do that, click the Billing option on the top menu bar. You will land on a screen shown in the image below:

Go to the Token Management option in the left menu bar, choose the R/W User role in the Select Role drop-down menu, and press Generate Token.

You will see the following screen with your authentication token. Download the Client ID, Client Secret, and Token in a CSV. You can also copy them directly.

Now that you have cloned the repositories, created the database and its table, and generated the token, you're nearly ready to give your application a run! For more information about tokens, please check the official documentation for managing tokens in Astra.

Use npm config for authentication

You'll see that we are using the config package to keep token information out of the application code. The configuration consists of two JSON files:

  • default.json - contains secrets and application parameters.

  • development.json - contains locations for fetching weather data.

The first file default.json looks like the following:

{
    "apiKey": "Place your WeatherAPI.com API Key here!",
    "dbSettings": {
        "cloud": {
            "secureConnectBundle": "/path/to/secure-connect-bundle.zip"
        },
        "credentials": {
            "username": "Place your Astra clientID here!",
            "password": "Place your Astra clientSecret here!"
        }
    },
    "requestBatchSize": 1000,
    "insertBatchSize": 1000,
    "getWeatherDataFreqMs": 100,
    "getWeatherDataDurationMs": 60000,
    "language": "en"
}

Make sure you put the API keys and secrets for Astra and WeatherAPI.com before running the application. Also, be careful about changing this file and committing the changes back to your remote repository. Let's be honest; we've all done that. To prevent that mistake from happening, you can add this file to .gitignore.
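The dbSettings object above has the same shape as the connection options that the DataStax Node.js driver (cassandra-driver) accepts for Astra: cloud.secureConnectBundle plus credentials. As a rough sketch of how the settings could be wired up, consider the following. The helpers buildClientOptions and createClient are hypothetical names, not functions from the repository, and the realtime keyspace default comes from the database created earlier.

```javascript
// Returns true when a setting is missing or still holds the placeholder text
// from default.json.
function isUnset(value) {
  return (
    typeof value !== 'string' ||
    value === '' ||
    value.startsWith('Place your') ||
    value.startsWith('/path/to/')
  );
}

// Turn the "dbSettings" section of default.json into driver options.
// Hypothetical helper; the tutorial application may do this differently.
function buildClientOptions(dbSettings, keyspace = 'realtime') {
  const { cloud, credentials } = dbSettings;
  const values = [cloud.secureConnectBundle, credentials.username, credentials.password];
  // Fail early instead of letting the driver report a confusing error later.
  if (values.some(isUnset)) {
    throw new Error('Fill in dbSettings in default.json before starting');
  }
  return {
    cloud: { secureConnectBundle: cloud.secureConnectBundle },
    credentials: { username: credentials.username, password: credentials.password },
    keyspace,
  };
}

// The packages are required lazily, so this file loads even before the
// dependencies have been installed.
function createClient() {
  const config = require('config');
  const { Client } = require('cassandra-driver');
  return new Client(buildClientOptions(config.get('dbSettings')));
}
```

Checking for leftover placeholder text is a design choice for this sketch: it turns a forgotten secret into an immediate, readable error.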

The second file is just a single-key JSON file with an extensive array of locations in heterogeneous formats - lat/long combinations, city names, zip codes, IP addresses, IATA codes, and so on.

{
    "placeCodes": ["95227", "95228", "95230", "Melbourne", "Delhi", "New York", "Cape Town", "London", "Tokyo", "Singapore", "-44.31208,-75.81100", "DXB", "CT3", "LE14", "BN41", "Dubai", "Moscow", "Lahore", "Karachi", "Kabul", "M4B 1B3", "K8N 5W6", "Rohtak", "Hobart", "Shimla", "Cleveland", "Santa Clara", "90210", "Dalhousie", "Sydney", "L7B 0G6", "L9C 1J4", "R7N 1Y1", "J4M 1B5", "J7B 0A2", "V1J 6W3", "S0H 3P0", "T2W 0Z5", "G1S 4T5", "V3V 3H7", "T3M 2E6", "R3L 1E5", "H9R 3J3", "H9X 4A1", "G1H 7C3", "G0A 3C0", "K8N 5R1", "G1R 5B9", "J8L 2C9", "H3X 1N8", "M9B 2N2", "J9J 1N7", "T3G 2W9", "J6Z 2X2", "E4X 2H6", "T3E 3L1", "E1N 2B7", "V1J 1Y9", "J0B 3E2", "V2P 1E6", "R2V 2P8", "L6Y 1C7", "CAN", "ATL", "CTU", "DFW", "LHR", "PHX", "DEN", "HND", "DEL", "IST", "MEX", "GRU", "Bangalore", "Venice", "Vienna", "Prague", "Tallinn", "Delft", "Auckland", "Wellington", "59858", "59859", "59860", "59863", "59864", "59865", "59866", "59867", "59868", "59870", "59871", "59872", "59873", "59874", "59875", "59901", "59910", "59911", "59912", "176.13.69.63", "129.134.0.3", "129.134.0.1", "Shanghai", "Sao Paolo", "Beijing", "Mexico City", "Cairo", "Ahmedabad", "Mumbai", "Okaka", "Kolkata", "Manila", "Lagos", "Lima", "Paris", "Jakarta", "Chennai", "Pune", "Shillong", "Srinagar", "Rajkot", "Seoul", "Chicago", "Tehran", "Dallas", "Wuhan", "Ho Chi Minh City", "Hong Kong", "Riyadh", "Baghdad", "Surat", "Madrid", "Houston", "Hawaii", "Miami", "Philedelphia", "Atlanta", "Barcelona", "209.85.231.104", "207.46.170.123", "Seattle", "72.30.2.43", "72.30.2.43", "208.80.152.2", "143.166.83.38", "143.166.83.38", "143.166.83.38", "72.21.211.176", "72.247.244.88", "212.58.241.131", "212.58.241.131", "207.97.227.239", "74070", "74071", "74072", "74073", "74074", "74075", "74078", "74079", "74080", "74081", "74082", "74083", "74084", "74085", "74103", "74104", "74105", "74106", "74107", "74108", "74110", "74112", "74114", "74115", "74116", "74117", "74119", "74120", "74126", "74127", "74128", "74129", 
"74130", "74131", "74132", "74133", "74134", "74135", "74136", "74137", "74145", "74146", "74301", "74330", "74331", "74332"]
}
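The requestBatchSize setting suggests that locations are queried in batches. A minimal sketch of that idea follows. The helpers chunk and fetchInBatches are hypothetical, and fetchOne stands in for whatever function calls the WeatherAPI SDK for a single place code; the application's own batching logic may differ.

```javascript
// Split a list of place codes into batches of at most `size` entries.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Process the batches one after another. `fetchOne` is a caller-supplied
// async function that returns the weather for a single place code.
async function fetchInBatches(placeCodes, size, fetchOne) {
  const results = [];
  for (const batch of chunk(placeCodes, size)) {
    // Requests within a batch run concurrently; a failed lookup is recorded
    // rather than aborting the whole batch.
    const settled = await Promise.allSettled(batch.map((code) => fetchOne(code)));
    settled.forEach((outcome, i) => results.push({ placeCode: batch[i], outcome }));
  }
  return results;
}
```

The placeCodes array contains a few repeated entries. If each location should be queried only once per cycle, the list can be deduplicated first with [...new Set(placeCodes)].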

Run the Node.js application

If you have configured everything correctly, you should be able to run the following command from your Node.js application directory now:

npm start

Once your application starts fetching data and inserting it into Cassandra, you'll see some basic log messages on your terminal, as shown below:

If you're going to be ingesting a lot of data, you can keep track of your quota usage on the Astra website, as shown in the image below:

Astra also provides a browser-based Grafana dashboard for you to analyze Astra's performance with reads, writes, and latency. You can find this dashboard under the Health tab as shown below:

You can also log into your CQL console to fetch some data from the realtime.weather table, as shown in the CQL queries and image below:

select id, city, country, lat, lon, 
       temp_c, temp_f, feelslike_c, feelslike_f, 
       wind_kph, wind_direction, condition,
       lastupdated_dt
  from realtime.weather 
 limit 50;
 
select id, city, country, lat, lon, 
       temp_c, temp_f, feelslike_c, feelslike_f, 
       wind_kph, wind_direction, condition,
       lastupdated_dt
  from realtime.weather 
 where country in ('USA','Canada','UK')
 limit 50
 allow filtering;

Cassandra is a great place for fast ingestion. Since it often acts as the system of record for customer-facing applications, Cassandra is frequently the source database for ETL pipelines, Change Data Capture (CDC) messages, or data streamed to other analytics, warehousing, or search functions in your IT landscape. As long as the reads are within the best practices of querying in Cassandra, you'll get compelling performance.

Querying

Now that you have ingested weather data into the table, let's look at a few simple queries to retrieve data from the table. One of the first queries you'll want to write is to look at the weather for a given city.

One of the best features of Cassandra is that it explicitly prevents you from writing bad queries. A query that filters only on city fails because there is no index on the city column. You can explicitly do a full scan by using the allow filtering clause; however, it will be a very costly operation. A better way to do this is by creating a storage-attached index (SAI) on the city column. Once you add the index, you won't need to add the allow filtering clause to your query, and Cassandra won't scan the whole table, saving compute resources. Notice that you won't be able to query the index immediately after creating it; it takes a while for the index to become available. This minimal lag will also depend on the read consistency you have configured for your database.
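As a minimal sketch of the index and the query described above, the statements could be issued from Node.js as follows. The index name weather_city_idx and the helpers cityQuery and weatherForCity are made up for this example, and client is assumed to be a connected cassandra-driver Client.

```javascript
// CQL for a storage-attached index on the city column. The index name is
// hypothetical.
const CREATE_CITY_INDEX =
  'CREATE CUSTOM INDEX IF NOT EXISTS weather_city_idx ' +
  "ON realtime.weather (city) USING 'StorageAttachedIndex'";

// Build a parameterized lookup by city. No ALLOW FILTERING clause is needed
// once the index exists.
function cityQuery(city, limit = 50) {
  return {
    query:
      'SELECT city, country, temp_c, feelslike_c, condition, lastupdated_dt ' +
      'FROM realtime.weather WHERE city = ? LIMIT ?',
    params: [city, limit],
  };
}

// `client` is assumed to be a connected cassandra-driver Client.
async function weatherForCity(client, city) {
  const { query, params } = cityQuery(city);
  const result = await client.execute(query, params, { prepare: true });
  return result.rows;
}
```

The index would be created once, for example with client.execute(CREATE_CITY_INDEX), before weatherForCity is called.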

Let's look at another query. Instead of querying based on the city, this time, let's say you want to get the latest records from the table based on the last updated time. Note that in this example, we have used a different method of creating a storage-attached index that is specific to Astra DB.
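A sketch of such a "latest records" query might look like the following. It rests on several assumptions: lastupdated_epoch holds Unix seconds, the index is created with the generic CREATE CUSTOM INDEX statement rather than the Astra-specific method mentioned above, and the index and helper names are hypothetical. Because the query is not restricted to one partition, the rows are sorted in the application instead of with ORDER BY.

```javascript
// CQL for a storage-attached index on the last-updated timestamp. The index
// name is hypothetical.
const CREATE_UPDATED_INDEX =
  'CREATE CUSTOM INDEX IF NOT EXISTS weather_lastupdated_idx ' +
  "ON realtime.weather (lastupdated_epoch) USING 'StorageAttachedIndex'";

// Build a range query for rows updated within the last `minutes` minutes.
// Assumes lastupdated_epoch is stored in Unix seconds.
function updatedSinceQuery(minutes, nowMs = Date.now(), limit = 50) {
  const cutoff = Math.floor(nowMs / 1000) - minutes * 60;
  return {
    query:
      'SELECT city, country, temp_c, condition, lastupdated_epoch, lastupdated_dt ' +
      'FROM realtime.weather WHERE lastupdated_epoch >= ? LIMIT ?',
    params: [cutoff, limit],
  };
}

// Sort rows newest first without mutating the input. Number() also handles
// the Long objects the driver returns for BIGINT columns, via toString().
function newestFirst(rows) {
  return [...rows].sort(
    (a, b) => Number(b.lastupdated_epoch) - Number(a.lastupdated_epoch)
  );
}
```

A caller could run the query with client.execute(query, params, { prepare: true }) and pass result.rows to newestFirst.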

Conclusion

In this tutorial, you learned how to use Cassandra as a time-series database to capture real-time (10 ms) weather data for numerous locations worldwide using a Node.js weather tracking application. You also learned how to spin up a DataStax Astra database in the cloud using the cloud provider of your preference. Currently, DataStax Astra supports Google Cloud, AWS, and Azure.

As mentioned earlier in the tutorial, Cassandra makes a great candidate for a wide range of applications, especially for append-only, write-intensive loads, which is why Cassandra works so well with high-frequency, high-volume applications.

Go ahead, give it a try, and please reach out if you have any questions or feedback at kovid.rathee@protonmail.com or connect with me on LinkedIn.


Written by kovidrathee | I build infrastructure, platform, and data engineering solutions. I love writing about interesting things in tech.
Published by HackerNoon on 2021/09/22