How to Create Abstract Syntax Tree with Acorn

Written by pwnbykenny | Published 2020/12/06
Tech Story Tags: javascript | parser | syntaxing | abstract-syntax-tree | ast | nodejs | design-patterns | optimization

0. Preface

Acorn, a JavaScript parser, and the abstract syntax tree (AST) it produces are useful tools. They help us edit source code automatically and efficiently. This post shows you how to build and traverse the AST of JavaScript code.

2. Install the JavaScript Parser — Acorn

To quote from the GitHub repository, Acorn is a tiny, fast JavaScript parser, written completely in JavaScript, released under an MIT license. Acorn can generate abstract syntax trees for JavaScript code. It has 3 modules: the main JavaScript parser named “acorn”, the error-tolerant parser named “acorn-loose”, and the syntax tree walker named “acorn-walk”. This post focuses on the main parser. In this section, we introduce its installation.
The installation is easy. You just need to run this command in the Linux terminal: “npm install acorn”. A folder named “node_modules” is then created in the current directory, and the executable acorn file is placed in this directory: “node_modules/acorn/bin/”. The next section introduces how to use it.

3. Use Acorn to Create an AST

We will create a JavaScript file named “hello.js” under this directory: “node_modules/acorn/bin/”. The content of this file is “var str = 'hello';”. Next, we run this command in the terminal: “acorn hello.js”. If your shell cannot find the acorn command, running “node acorn hello.js” from that directory should work as well. The command prints the AST of the JavaScript code in hello.js to the terminal. Here is the output:
{
  "type": "Program",
  "start": 0,
  "end": 19,
  "body": [
    {
      "type": "VariableDeclaration",
      "start": 0,
      "end": 18,
      "declarations": [
        {
          "type": "VariableDeclarator",
          "start": 4,
          "end": 17,
          "id": {
            "type": "Identifier",
            "start": 4,
            "end": 7,
            "name": "str"
          },
          "init": {
            "type": "Literal",
            "start": 10,
            "end": 17,
            "value": "hello",
            "raw": "'hello'"
          }
        }
      ],
      "kind": "var"
    }
  ],
  "sourceType": "script"
}

4. Understand the Structure of an AST

This section uses the AST in section 3 as an example. The most important thing about an AST is its nodes. After all, a tree is made of nodes. In the AST above, each node starts with a “{” and ends with a “}”. The structure of each node is a little bit different according to its node type.
But what all the nodes have in common is three properties: type, start, and end. The “type” property indicates the type of the node. For example, it can be “Identifier”, which means that the node holds a name such as a variable name or a function name. The “start” and “end” properties are zero-based character offsets that indicate where in the source code the node starts and ends. The start offset is inclusive and the end offset is exclusive. For example, the “Identifier” node holds the name “str”, which starts at offset 4 and ends just before offset 7, so it covers the characters at offsets 4, 5, and 6.
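To make the offsets concrete, here is a small sketch that cuts each node's text out of the source. The two node objects are copied by hand from the AST in section 3, and textOf is a hypothetical helper name of my own, not part of Acorn.

```javascript
// Source text of hello.js (without the trailing newline).
const source = "var str = 'hello';";

// Two nodes copied by hand from the AST in section 3.
const identifier = { type: 'Identifier', start: 4, end: 7, name: 'str' };
const literal = { type: 'Literal', start: 10, end: 17, value: 'hello', raw: "'hello'" };

// Hypothetical helper: return the piece of source text that a node covers.
// String.prototype.slice takes an inclusive start and an exclusive end,
// which matches the meaning of the node's start and end offsets.
function textOf(node, src) {
  return src.slice(node.start, node.end);
}

console.log(textOf(identifier, source)); // str
console.log(textOf(literal, source));    // 'hello'
```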
Besides these 3 properties, the other properties depend on the node type. For example, the “Identifier” node has a “name” property because it records a name. The “Literal” node has two additional properties, “value” and “raw”, because it records literals such as strings and numbers. There are too many node types to list here. But you can always find out the structure of a node type by using acorn to parse some JavaScript code that contains it, just like the hello.js example, and reading the structure in the resulting AST.
After understanding the nodes’ structures, you can traverse the AST by accessing each node’s properties. Each node connects to its child nodes by its properties. The following section gives you an example of how to traverse an AST.
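Because child nodes are just property values, a tree can also be walked without knowing the node types in advance. The sketch below is one possible way to do that, not something Acorn provides in this form: the AST is a hand-trimmed copy of the one in section 3, and isNode and walk are hypothetical helper names.

```javascript
// The AST from section 3, trimmed to the properties the walk needs.
const ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'str' },
      init: { type: 'Literal', value: 'hello', raw: "'hello'" }
    }]
  }],
  sourceType: 'script'
};

// Hypothetical helper: treat any object with a string "type" as a node.
function isNode(value) {
  return value !== null && typeof value === 'object' && typeof value.type === 'string';
}

// Visit a node, then follow every property that holds a node or an array of nodes.
function walk(node, visit) {
  visit(node);
  for (const value of Object.values(node)) {
    if (Array.isArray(value)) {
      for (const item of value) {
        if (isNode(item)) walk(item, visit);
      }
    } else if (isNode(value)) {
      walk(value, visit);
    }
  }
}

const visited = [];
walk(ast, (node) => visited.push(node.type));
console.log(visited.join(' > '));
```

This generic walk trades precision for convenience: it reaches every node, but it cannot treat node types differently the way a per-type visitor can.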

5. Use Node.js to Traverse an AST

In case you don’t know Node.js, let me give you a simple description of it. Node.js is a JavaScript runtime built on V8, the JavaScript engine used in Chrome. The difference is where the code runs: V8 inside Chrome runs JavaScript in a browser (front end), while Node.js runs JavaScript outside the browser, typically on a server (back end). After installing Node.js, you can use it to execute JavaScript code like this: node hello.js.
In this section, I will give you some code to traverse the AST in section 3. This is just a simple example for you to know how the traversal works. The code is saved to a file named “ast.js” under this directory “node_modules/acorn/bin/”, and it’s executed by Node.js using this command: node ast.js.
class Visitor {
        /* Deal with nodes in an array */
        visitNodes(nodes) { for (const node of nodes) this.visitNode(node); }
        /* Dispatch each type of node to a function */
        visitNode(node) {
                switch (node.type) {
                        case 'Program': return this.visitProgram(node);
                        case 'VariableDeclaration': return this.visitVariableDeclaration(node);
                        case 'VariableDeclarator': return this.visitVariableDeclarator(node);
                        case 'Identifier': return this.visitIdentifier(node);
                        case 'Literal': return this.visitLiteral(node);
                }
        }
        /* Functions to deal with each type of node */
        visitProgram(node) { return this.visitNodes(node.body); }
        visitVariableDeclaration(node) { return this.visitNodes(node.declarations); }
        visitVariableDeclarator(node) {
                this.visitNode(node.id);
                /* init is null when there is no initializer, e.g. "var x;" */
                return node.init ? this.visitNode(node.init) : undefined;
        }
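        /* Leaf nodes: print what was found so the traversal is visible */
        visitIdentifier(node) { console.log('Identifier:', node.name); return node.name; }
        visitLiteral(node) { console.log('Literal:', node.value); return node.value; }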
}

/* Import necessary modules */
var acorn = require('acorn');
var fs = require('fs');
/* Read the hello.js file as text */
var hello = fs.readFileSync('hello.js', 'utf8');
/* Use acorn to generate the AST of hello.js.
   Recent Acorn releases expect ecmaVersion to be set; 2020 is an arbitrary choice here. */
var ast = acorn.parse(hello, { ecmaVersion: 2020 });
/* Create a Visitor object and use it to traverse the AST */
var visitor = new Visitor();
visitor.visitNode(ast);
The comments in the code explain each step. If you compare the code with the AST in section 3, you will find it easy to understand: each visit function follows the properties that lead to the child nodes of its node type.
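The preface mentions editing source code. One simple way to do that, sketched here under my own assumptions rather than taken from Acorn, is to use a node's start and end offsets to replace its text in the original source. The Literal node is copied by hand from section 3, and replaceNodeText is a hypothetical helper name.

```javascript
// Source text of hello.js (without the trailing newline).
const source = "var str = 'hello';";

// The Literal node copied by hand from the AST in section 3.
const literal = { type: 'Literal', start: 10, end: 17, value: 'hello', raw: "'hello'" };

// Hypothetical helper: keep everything before and after the node,
// and put the replacement text where the node used to be.
function replaceNodeText(src, node, replacement) {
  return src.slice(0, node.start) + replacement + src.slice(node.end);
}

const edited = replaceNodeText(source, literal, "'world'");
console.log(edited); // var str = 'world';
```

When several nodes are replaced in one pass, applying the replacements from the end of the file toward the start keeps the earlier offsets valid.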

6. Summary

This post introduced how to use the JavaScript parser Acorn to create the abstract syntax tree of a JavaScript program, and how to use Node.js to traverse that tree. Please put all the files under the directory specified in this post. Otherwise, the commands and the script may fail to find hello.js or the acorn module. If you like this post or find it useful, please help me share it on your social media ~ Thank you!
For more content like this, you can visit my personal website: PhD & Automatic Exploitation of JIT Compilers - Pwn By Kenny. Welcome!

Written by pwnbykenny | A Ph.D. A Hacker. My personal website: https://pwnbykenny.com
Published by HackerNoon on 2020/12/06