My Takeaways After Attending Google Summer of Code 2019

This summer, my proposal was selected by the Open Source Robotics Foundation for Google Summer of Code 2019. It was an awesome learning experience!

Open Source Robotics Foundation

Open Source Robotics Foundation (OSRF), or Open Robotics, is an independent non-profit organisation that works on open software and hardware for use in robotics.

Their two open-source software projects, ROS and Gazebo, are among the most widely adopted frameworks for developing robots and their simulations. Both have done wonders for the robotics community, driving academic research and product development in robotics all over the world.

Open Robotics offers consulting, R&D help, open-source software services and custom engineering for robotics to industry and government.

My project

My GSoC proposal was for the Gazebo Documentation Index project. It was mentored by the best, Jose Luis Rivero, who is a Software Engineer at OSRF. My proposal can be accessed here. The documentation index that I developed can be accessed here and the corresponding GitHub repo can be found here.

Gazebo is open-source software built and maintained by OSRF to accelerate research and development through robotic simulation. It makes it easy to test and validate algorithms in real time on robots in simulated environments, without having to deploy physical vehicles every time.

This saves cost, resources and time.

At present, the learning resources for Gazebo are distributed over the internet — the official documentation, the Q&A website and ROS-Answers. Some noteworthy help can also be found through examples and explanations in the comments of Bitbucket issues (Gazebo’s repo) and Gazebo’s API.

There are also third-party sources that provide video tutorials and blog posts that are helpful in learning Gazebo. All of this information is scattered across the internet, with only some of the sources linking to each other.

It can be a bit overwhelming to keep track of one’s learning when the content is distributed as in this case.

The aim of this project was to bring all the learning material under one webpage in the form of a documentation index that contains links to the content where the respective information is hosted.

A documentation index is a platform where links to relevant learning resources for a software system are indexed to allow users to find any help at one place.

HTML, Intel and Oracle, among many others, have such doc indexes in place.

A documentation index is a neat way to accommodate links to all the relevant learning content into one webpage. It is very convenient since almost any help is just a page quicksearch away. The user can think of related keywords or categories and then look through the index to find the relevant information. Such a platform can act as a one-stop place to get all relevant information about Gazebo.

A compilation of the best learning resources from across the internet, including tutorials, third-party blog posts, Bitbucket issues’ comments and the Gazebo answers website. All relevant content under one roof.

My GSoC project was to build such a documentation index for Gazebo.

Proposed timeline

The following is the timeline I presented in my proposal:

Discussion with the mentors about the frontend and backend frameworks to be used, along with the data format in which the index links will be stored.
Accumulating the data for the index’s links of where the relevant information is hosted and finalising the structuring of data across the chosen data format files.
Making index data files available for collaborative maintenance along with rules and recommendations for any contributions to the documentation index in the form of pull requests.
Simultaneously, developing a prototype that can work with a limited number of links.
Writing test cases to validate the webpage requirements.
A demo to the mentors and other relevant people in the organisation, followed by a discussion about areas of improvement in terms of performance and design.
Working on final version and completion within a stipulated time period.
After a demo to the mentors, if the implementation is successful, discussion on build and deployment pipelines.
CI (Continuous Integration) can be set up with Travis CI for automated testing and code building.
After all the aforementioned requirements are satisfied, the website can be put into production.

Getting started

As part of my GSoC proposal, I was asked to complete and submit a task that could demonstrate my experience with web development and a basic understanding of Gazebo. Creating a minimal stub of the doc index website was one of these tasks. My mentor had prepared a basic design for a couple of webpages that I was supposed to emulate.

Choosing the stack

While developing these pages, I started thinking about how I could implement the documentation index so that it would be easy to build, easy to maintain (adding, removing and updating index entries through open-source collaboration), lightweight and fast.

My preferred toolkit for building websites was a Vue.js frontend + a Python (Flask) or Node.js (Express.js) backend + Heroku for serving a PostgreSQL DB and for deployment.

But this had to be approached differently.

For example, the doc index would have been difficult to maintain if the index entries were stored in a conventional database. That would have required a lot of pipeline setup, introduced sync issues and made it hard for any Gazebo developer to contribute without prior knowledge of the stack.

Also, I realised that the website would only require two (or three) major template pages, so going for a comprehensive front-end framework would have been overkill.

I was looking for a more integrated approach (Oh, Steve Jobs would have been so happy to hear that), that would be easier to maintain.

Jekyll had everything I was looking for. With its static site generation, templating engine and easy data management using YAML-based front matter in Markdown files, Jekyll was the perfect choice.

The best thing about Jekyll was how the data of the doc index could reside in these adorable Markdown files that were easy to maintain and collaborate on using GitHub PRs. This eliminated the need for a backend server and a database. Because GitHub Pages supports Jekyll, hosting wasn’t an issue anymore either.
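
To make the idea concrete, here is a minimal Python sketch of how an index entry stored as a Markdown file with front matter could be read. The field names (title, link, description, starred) and the example URL are assumptions made for illustration, not the project's actual schema, and the parser only handles flat key: value pairs, whereas Jekyll itself parses full YAML.

```python
# Hypothetical index entry; field names are illustrative, not the project's schema.
ENTRY = """---
title: Building a world
link: https://example.org/tutorials/build-world
description: Step-by-step guide to creating a simulation world.
starred: true
---
Optional Markdown body.
"""


def split_front_matter(text):
    """Return (metadata dict, body) for a Jekyll-style Markdown file.

    Only flat `key: value` pairs are handled; real front matter is full YAML.
    """
    lines = text.splitlines()
    # Front matter must start on the very first line with '---'
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        # Position of the closing '---' in the full list of lines
        end = lines[1:].index("---") + 1
    except ValueError:
        return {}, text
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            # Split at the first colon only, so URLs in values stay intact
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:])
    return meta, body


meta, body = split_front_matter(ENTRY)
print(meta["title"], "->", meta["link"])
```

Because every entry is plain text, adding or fixing one is an ordinary file edit that can be reviewed in a pull request.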

I developed the prototype using Jekyll and after the project started, I discussed the advantages of using Jekyll with my mentor, who agreed that this was the way to go.

Some important index entries that my mentor had listed were added to the website.

Structuring the doc index

The next step was to decide how to structure the doc index so that it was easy to scale and, again, easy to maintain. We came up with a structure that is two levels deep, classified into categories and subcategories. The first level of abstraction is the category level and the second level is the subcategory level.

Category >> Subcategory >> Index entries

Each category comprises subcategories and each subcategory comprises the corresponding index items. Each category and index item also contains a brief description. More important index items can be shown starred.

This hierarchical classification, shown below, has met the requirements of the project well.

- category 1
    - subcategory 1
        - item 1 
        - item 2
    - subcategory 2
        - item 1 
        - item 2

- category 2 
- category 3
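
The same hierarchy can be sketched as a small data model. This is only an illustration in Python: the class and field names are hypothetical, and the real index keeps its data in Jekyll data files rather than in Python objects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    title: str
    url: str
    description: str = ""
    starred: bool = False  # more important items are shown starred


@dataclass
class Subcategory:
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Category:
    name: str
    description: str = ""
    subcategories: List[Subcategory] = field(default_factory=list)


def render(categories):
    """Return the index as indented text lines, one per category, subcategory and item."""
    lines = []
    for cat in categories:
        lines.append(f"- {cat.name}")
        for sub in cat.subcategories:
            lines.append(f"    - {sub.name}")
            for item in sub.items:
                star = " *" if item.starred else ""
                lines.append(f"        - {item.title}{star}")
    return lines


# Placeholder data with example.org URLs, purely for demonstration
index = [
    Category(
        name="category 1",
        description="placeholder description",
        subcategories=[
            Subcategory("subcategory 1", [
                Item("item 1", "https://example.org/1", starred=True),
                Item("item 2", "https://example.org/2"),
            ]),
        ],
    ),
    Category(name="category 2"),
]

print("\n".join(render(index)))
```

Keeping the depth fixed at two levels means a page template only ever needs two nested loops, which is part of what keeps the site simple to maintain.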

All links page

To let users find all relevant content on one webpage, an ‘All links’ page was also added. Users can then find help with a browser quicksearch (Ctrl + F).
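
Conceptually, such a page is just the two-level index flattened into a single list. The Python sketch below shows one way to do that under assumed data: the nested dictionary layout, the placeholder URLs and the sort order are all hypothetical, and the actual site builds this page with Jekyll templates.

```python
# Hypothetical index data: category -> subcategory -> list of (title, url).
INDEX = {
    "category 1": {
        "subcategory 1": [
            ("item 1", "https://example.org/a"),
            ("item 2", "https://example.org/b"),
        ],
        "subcategory 2": [("item 1", "https://example.org/c")],
    },
    "category 2": {},
}


def all_links(index):
    """Flatten the two-level index into one list of rows, sorted by title."""
    rows = []
    for category, subcategories in index.items():
        for subcategory, items in subcategories.items():
            for title, url in items:
                rows.append({
                    "title": title,
                    "url": url,
                    # Keep the origin so a search hit still shows where it lives
                    "path": f"{category} >> {subcategory}",
                })
    # Sort by title, then by path so that equal titles have a stable order
    return sorted(rows, key=lambda row: (row["title"].lower(), row["path"]))


for row in all_links(INDEX):
    print(row["title"], "|", row["path"], "|", row["url"])
```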

Testing

Test cases had to be written to automate the procedure of verifying that everything worked fine after any update to the code (commits and PRs). The two primary things to test are:

Integrity of index data structure
Validity of external links in index items

For the first part, I wrote a spec file in Ruby that parses all the data files and expects the structure of the index data to be maintained.
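
The actual spec is written in Ruby and is not reproduced here. As a rough illustration of the kind of check it performs, the following Python sketch walks a parsed index and reports entries that break an assumed schema. The required keys (title, url, description) and the nested layout are assumptions for this example only.

```python
REQUIRED_ITEM_KEYS = {"title", "url", "description"}  # assumed schema


def validate_index(index):
    """Return a list of human-readable problems; an empty list means the structure holds."""
    problems = []
    for c, category in enumerate(index):
        if not category.get("name"):
            problems.append(f"category {c}: missing name")
        for s, sub in enumerate(category.get("subcategories", [])):
            if not sub.get("name"):
                problems.append(f"category {c}, subcategory {s}: missing name")
            for i, item in enumerate(sub.get("items", [])):
                # Compare the keys present on the item against the required set
                missing = REQUIRED_ITEM_KEYS - set(item)
                if missing:
                    problems.append(
                        f"category {c}, subcategory {s}, item {i}: "
                        f"missing {sorted(missing)}"
                    )
    return problems


good = [{
    "name": "category 1",
    "subcategories": [{
        "name": "subcategory 1",
        "items": [{"title": "item 1", "url": "https://example.org", "description": "d"}],
    }],
}]
bad = [{"name": "category 1", "subcategories": [{"name": "", "items": [{"title": "item 1"}]}]}]

print(validate_index(good))
print(validate_index(bad))
```

Returning every problem at once, instead of stopping at the first one, gives a contributor the full list to fix in a single pass.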

For testing the validity of external links, we used the popular html-proofer library, which takes care of that.
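
The project relies on html-proofer for this, so the snippet below is not how the site is tested. It is only a small Python sketch of the underlying idea: request each external URL and flag the ones that fail. The function names are hypothetical, and the fetch parameter exists so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status of a HEAD request, or None if the URL is unreachable."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404
        return error.code
    except (OSError, ValueError):
        # Covers DNS failures, timeouts, refused connections and malformed URLs
        return None


def broken_links(urls, fetch=http_status):
    """Return the URLs whose status is missing or is 400 and above."""
    broken = []
    for url in urls:
        status = fetch(url)
        if status is None or status >= 400:
            broken.append(url)
    return broken


# Offline demonstration with a fake fetcher instead of real HTTP requests
fake_statuses = {"https://example.org/ok": 200, "https://example.org/gone": 404}
urls = ["https://example.org/ok", "https://example.org/gone", "https://example.org/unknown"]
print(broken_links(urls, fetch=fake_statuses.get))
```

Some servers reject HEAD requests, so a real checker usually falls back to GET. Handling that and many other cases is one reason to use a dedicated tool.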

These test cases have been automated using Travis CI, which runs them every time there is a commit or a pull request in the repo.

User interface

For the user interface of the website, I made improvements to the minimal stub created by my mentor, trying to keep it clean and simple so that the prime focus was on the doc index entries.

For ease of collaboration, a ‘Suggest edits’ button was added that takes you directly to the GitHub editor, where any changes can be made to the index and a PR can be opened.

Alpha launch

With this, we were able to launch the alpha version of the Gazebo Documentation Index, inviting developers and users to contribute to the growth of the platform. The alpha release announcement can be accessed here.

What next

This gave us a decent setup to expand on and to start refining the doc index in terms of the data it provides, that is, the amount of documentation it covers.

Other than suggestions from the community about the items that can be added to the doc index, we decided to develop a suggestions-tool — an application that can suggest index entries by scraping the issues on Gazebo’s Bitbucket repository.

There is a lot of information in the issues and their comments that is not otherwise documented. These issues can also be a reminder of relevant topics that people are facing difficulties with.

The suggestions-tool can be accessed here.

The suggestions-tool scrapes content from Bitbucket issues, performs NLP-based keyword extraction and provides a list of keywords for each issue. These keywords help in getting the gist of the issue’s content. This information can be used to check whether the corresponding topics have been covered in the doc index. If they have, the issues can be marked in the suggestions-tool to keep track of what has already been covered.
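
The exact NLP method used by the tool is not described in this post. As a stand-in, the Python sketch below uses plain word-frequency counting with a small stopword list, which is about the simplest form of keyword extraction. The stopword list, the function name and the sample issue text are all made up for illustration.

```python
import re
from collections import Counter

# A tiny, hand-picked stopword list; a real tool would use a fuller one
STOPWORDS = {
    "the", "an", "is", "in", "on", "to", "of", "and",
    "when", "with", "it", "for", "my", "after",
}


def extract_keywords(text, top_n=3):
    """Return the top_n most frequent non-stopword terms in the text."""
    # Tokens start with a letter and are at least two characters long
    words = re.findall(r"[a-z][a-z0-9_]+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    # Sort by descending count, then alphabetically so ties are deterministic
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in ranked[:top_n]]


issue = (
    "Camera sensor crashes when the camera plugin loads. "
    "The plugin log shows a camera error."
)
print(extract_keywords(issue))  # ['camera', 'plugin', 'crashes']
```

Even a crude ranking like this is often enough to tell at a glance whether an issue concerns a topic the index already covers.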

After a careful evaluation of the requirements, the app was developed using Vue.js + Flask + PostgreSQL, with both the app and the database hosted on Heroku.

Future work

I had a great time working on this project. I will continue to work on it at least until it gains popularity and reaches a self-sufficient state.

The next step is to get more information about this out to the users, and to get people to refer to this in answers to questions on the community page or the Bitbucket issues page.

These are very early days for the Documentation Index, and it has serious potential to become an important documentation resource that brings together all the relevant learning material in one place, in an organised fashion, helping beginners and professionals save time and resources and find the right help immediately.