A Better Guide to Build Apache Superset From Source

Written by kharekartik | Published 2019/09/22

In this article, we'll walk through how to build Apache Superset from source. The official documentation is hard to follow for a new contributor, so this is my attempt to simplify it.
First, you'll need the following installed on your system:
  • Python 3.6 or 3.7
  • NodeJS
  • NPM
  • Yarn package manager for NodeJS
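Before going further, it can help to confirm that these tools are actually on your PATH. The following is a minimal sketch, not part of the official build process; the function names `have`, `check_prereqs` and `supported_python` are made up for this example.

```shell
#!/bin/sh
# Succeeds when the named command can be found on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool; return non-zero if any tool is missing.
check_prereqs() {
    missing=0
    for tool in "$@"; do
        if have "$tool"; then
            echo "ok       $tool"
        else
            echo "missing  $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Succeeds only for the Python versions this guide targets (3.6.x or 3.7.x).
supported_python() {
    case "$1" in
        3.6|3.6.*|3.7|3.7.*) return 0 ;;
        *) return 1 ;;
    esac
}
```

After loading these functions, run `check_prereqs python3 node npm yarn` to see what is missing, and `supported_python "$(python3 -c 'import platform; print(platform.python_version())')"` to check the interpreter version.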
Let's first install the OS-level dependencies. Most of these should already be present on your system.

MacOS:

brew install pkg-config libffi openssl python
env LDFLAGS="-L$(brew --prefix openssl)/lib" CFLAGS="-I$(brew --prefix openssl)/include" pip install cryptography==2.4.2

Debian/Ubuntu:

sudo apt-get install build-essential libssl-dev libffi-dev python3.6-dev python-pip libsasl2-dev libldap2-dev

Fedora/RHEL-Derivatives:

sudo yum upgrade python36u-setuptools
sudo yum install gcc gcc-c++ libffi-devel python36u-devel python36u-pip python-wheel openssl-devel libsasl2-devel openldap-devel
On RHEL, you might find that the python36u-devel package doesn't exist. In that case, search for the python3-devel package that matches your architecture using
yum search python3 | grep devel
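If you are not sure which of the three sets of commands above applies to your machine, a small helper can tell you. This is only a sketch: the function names and the group labels (`macos`, `debian`, `rhel`) are invented for this example, and the list of distribution IDs is an assumption that covers only the most common cases.

```shell
#!/bin/sh
# Map a kernel name (as printed by `uname -s`) and an os-release ID to one of
# the three platform groups covered above. Prints "unknown" for anything else.
platform_group() {
    kernel="$1"
    distro_id="$2"
    case "$kernel" in
        Darwin) echo "macos"; return ;;
    esac
    case "$distro_id" in
        debian|ubuntu) echo "debian" ;;
        fedora|rhel|centos) echo "rhel" ;;
        *) echo "unknown" ;;
    esac
}

# Print the ID field from /etc/os-release, or nothing if the file is absent.
current_distro_id() {
    if [ -r /etc/os-release ]; then
        ( . /etc/os-release && echo "${ID:-}" )
    fi
}
```

Calling `platform_group "$(uname -s)" "$(current_distro_id)"` prints the group for the current machine.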
Once these dependencies are set up, you are ready for the next step.
The Superset repository contains both the frontend and the backend, and each needs to be built separately. Let's start by building the backend.

Clone the repository

git clone https://github.com/apache/incubator-superset.git
cd incubator-superset/

Create a virtual environment

A Python virtual environment isolates your dependencies from the rest of the system. This avoids dependency conflicts and is therefore recommended.
python3 -m venv path/to/new/virtual/env
Once you have created the virtual environment, activate it using
source path/to/new/virtual/env/bin/activate
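It is easy to forget the activation step and end up installing packages outside the virtual environment. The sketch below checks for that. It relies on the fact that the activate script exports the VIRTUAL_ENV variable; the function names are made up for this example.

```shell
#!/bin/sh
# Succeeds only when a virtual environment is active in this shell.
# The activate script exports VIRTUAL_ENV, so an empty value means "not active".
venv_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Print the path of the active environment, or a warning if there is none.
describe_venv() {
    if venv_active; then
        echo "virtualenv active: $VIRTUAL_ENV"
    else
        echo "no virtualenv active"
    fi
}
```

Running `describe_venv` right before any `pip install` command is a cheap way to confirm where the packages will go.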

Install the dependencies

Now you can install all the required dependencies.
pip install -r requirements.txt
pip install -r requirements-dev.txt
Some of the dependencies in requirements.txt commonly cause errors at this stage. To avoid them, install the specific versions listed below
pip install numpy==1.17
pip install sqlalchemy==1.2.18
pip install pandas==0.23.4
pip install markupsafe==1.0
pip install mysqlclient
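If you rebuild often, you may prefer to keep these pins in one file instead of retyping five commands. Here is one possible way to do that, assuming nothing beyond the versions listed above; the function name `write_pins` and the default file name `superset-pins.txt` are arbitrary choices for this sketch.

```shell
#!/bin/sh
# Write the pinned versions listed above to a requirements-style file
# and print the path of the file that was written.
# The default file name is an arbitrary choice for this sketch.
write_pins() {
    pins_file="${1:-superset-pins.txt}"
    cat > "$pins_file" <<'EOF'
numpy==1.17
sqlalchemy==1.2.18
pandas==0.23.4
markupsafe==1.0
mysqlclient
EOF
    echo "$pins_file"
}
```

With the virtual environment active, `pip install -r "$(write_pins)"` installs the same set of packages in one go.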

Install superset

pip install -e .
This uses the setup.py file to install Superset in editable mode.
Now, let's proceed to build the frontend.
First, change into the superset/assets/ directory
cd superset/assets/
Once that is done, we can start building the Superset UI.

Pull the dependencies

First, we pull all the required Node.js dependencies using Yarn. To do that, just run
yarn
Yarn will download and install all the dependencies listed in the package.json file.

Build UI

To finally build the front-end, use
npm run build
Voilà! We are done installing Superset from source. Now you can run Superset using
superset run
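The backend and frontend steps above can be collected into a couple of shell functions. This is a rough sketch rather than an official script: it assumes you are in the root of the cloned repository with the virtual environment already active, and the names `run`, `build_backend`, `build_frontend` and the `DRY_RUN` variable are invented for this example. With `DRY_RUN=1` set, the functions only print the commands they would run.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Backend: install the Python dependencies, then Superset itself in editable mode.
build_backend() {
    run pip install -r requirements.txt &&
    run pip install -r requirements-dev.txt &&
    run pip install -e .
}

# Frontend: fetch the Node.js packages and build the UI.
# The body is a subshell, so the caller's working directory is left unchanged.
build_frontend() (
    run cd superset/assets/ &&
    run yarn &&
    run npm run build
)
```

Set `DRY_RUN=1` and call `build_backend` and `build_frontend` to preview the commands; unset it to run them for real. Each function stops at the first failing step.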

Run Tests

To run all the tests, use
tox
Let us look at some of the common errors which you might encounter during the build process.

1. Error: flask_appbuilder.base: ‘NoneType’ object has no attribute ‘name’
Solution:
superset db upgrade
superset init
2. Error: Failure while creating virtualenv.
  • virtualenv is installed with python3.6 but you are using python3.7 to create venv. 
    Solution: Reinstall virtualenv for python3.7
  • libffi.so missing. 
    Solution: Install python3-devel to fix that
3. Error: flask command not found
Solution:
pip install flask-cli
4. Error: Yarn: There appears to be trouble with your network connection. Retrying...
Solution:
You are probably installing these dependencies on a restricted network, such as an office or college. Set up a valid proxy so that Yarn can download dependencies.
export http_proxy=http://host:port/
export https_proxy=https://host:port/
yarn config set proxy http://host:port/
yarn config set https-proxy https://host:port/
If you encounter any errors other than the ones mentioned above, please refer to the official build guide.
Connect with me on LinkedIn or Facebook or drop a mail to kharekartik@gmail.com

Written by kharekartik | Software Developer by choice!!
Published by HackerNoon on 2019/09/22