Data Pipelines: OpenWeatherMap-Airflow [A How-To Guide]

Written by ashish-ghimire | Published 2020/02/18


In this article, we will learn how to develop an ETL (Extract, Transform, Load) pipeline using Apache Airflow. Here is a list of the things we will do in this article:

  • Call an API
  • Set up the database
  • Set up Airflow

Call an API

We will create a module, getWeather.py, and inside it we will create a get_weather() function that calls the API.
We will then create a directory, data/, where we will save the daily data obtained from the API. We do this in the createDirectory() function.
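
A minimal sketch of what getWeather.py could look like is given here. It is not the code from the repository. It assumes the OpenWeatherMap current-weather endpoint, an API key read from an environment variable named OWM_API_KEY, a placeholder default city, and one JSON file per day inside data/. The build_url() helper and the fetch parameter are hypothetical additions that let the function run without network access.

```python
import json
import os
from datetime import date
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: OpenWeatherMap's current-weather endpoint.
API_URL = "https://api.openweathermap.org/data/2.5/weather"


def createDirectory(path="data"):
    """Create the directory for the daily files if it does not exist yet."""
    os.makedirs(path, exist_ok=True)
    return path


def build_url(city, api_key):
    """Build the request URL (hypothetical helper, not from the article)."""
    return API_URL + "?" + urlencode({"q": city, "appid": api_key})


def get_weather(city="London", api_key=None, out_dir="data", fetch=None):
    """Call the API and save the JSON response as <out_dir>/<YYYY-MM-DD>.json.

    The default city is a placeholder, and OWM_API_KEY is an assumed
    environment variable name.
    """
    if api_key is None:
        api_key = os.environ.get("OWM_API_KEY", "")
    if fetch is None:
        # Default fetcher: a plain HTTP GET that decodes the JSON body.
        def fetch(url):
            with urlopen(url, timeout=30) as response:
                return json.load(response)

    payload = fetch(build_url(city, api_key))
    createDirectory(out_dir)
    out_path = os.path.join(out_dir, date.today().isoformat() + ".json")
    with open(out_path, "w") as f:
        json.dump(payload, f)
    return out_path
```

Calling get_weather() with no arguments writes today's response to data/. Passing a custom fetch callable makes it possible to try the function offline.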

Set Up the Database

We will create a module, createTable.py, and inside it we will create a make_database() function that creates the database.
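
The article does not say which database engine is used, so the following is only a sketch of createTable.py under the assumption of a local SQLite file. The file name weather.db, the table name, and the columns are all hypothetical. The code in the repository may target a different database and schema.

```python
import sqlite3


def make_database(db_path="weather.db"):
    """Create the database file and the weather table if they do not exist.

    Assumption: SQLite stands in for whatever engine the pipeline uses;
    the table name and columns below are illustrative only.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                """
                CREATE TABLE IF NOT EXISTS weather (
                    city          TEXT,
                    country       TEXT,
                    temperature   REAL,
                    humidity      REAL,
                    date_recorded TEXT
                )
                """
            )
    finally:
        conn.close()
    return db_path
```

Because the statement uses CREATE TABLE IF NOT EXISTS, make_database() can be called on every pipeline run without failing when the table is already there.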

Set Up Airflow

To use Airflow, you first have to set it up; the Airflow installation documentation explains how.
Once Airflow has been set up, we will define our DAG.
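
As a rough illustration, a DAG file for this pipeline might look like the sketch below. It assumes Airflow 2.x import paths, and that getWeather.py and createTable.py are importable from the DAGs folder with functions that can be called without arguments. The DAG id, start date, retry settings, daily schedule, and task order are all assumptions, not the configuration used in the repository.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator  # Airflow 2.x path

# Modules described earlier in this article; assumed to be importable
# from the DAGs folder.
from createTable import make_database
from getWeather import get_weather

# Hypothetical defaults applied to every task in the DAG.
default_args = {
    "owner": "airflow",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="weather_etl",             # hypothetical name
    default_args=default_args,
    start_date=datetime(2020, 2, 1),  # placeholder start date
    schedule_interval="@daily",       # called `schedule` from Airflow 2.4 on
    catchup=False,                    # do not backfill past days
) as dag:
    create_table = PythonOperator(
        task_id="make_database",
        python_callable=make_database,
    )
    fetch_weather = PythonOperator(
        task_id="get_weather",
        python_callable=get_weather,
    )

    # Assumed order: make sure the table exists before fetching the data.
    create_table >> fetch_weather
```

Airflow picks up DAG objects defined at module level in files placed in its DAGs folder, which is why the DAG is built directly in the file rather than inside a function.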
Now we can run our DAG from Apache Airflow.
The complete code for this article can be found in this GitHub repository.
Special thanks to Michael Harmon. This article was developed using his publication, which you can find here.

Published by HackerNoon on 2020/02/18