How to make wise choices in selecting libraries

Written by gauthamsanthosh | Published 2018/11/09
Tech Story Tags: python | libraries | bigquery | github


Choosing a library is like choosing a wife: you have to stick with it, and changing one often involves a complicated process that is very expensive. So the best way to choose is by analysis.

Nowadays most analysis of libraries focuses on their features, so we won’t get into that. We will work through this problem with an example: choosing a model server library for machine learning. A model server library allows you to easily make a server for your machine learning model.

Select the library which gives you the most features

This is the fairly obvious choice: pick the one that gives you the most features. You look at your requirements and shortlist the libraries that give you at least the minimum features you need.

So I have shortlisted 4 libraries:

  1. TensorFlow Serving
  2. Clipper.ai
  3. Model Server for Apache MXNet
  4. DeepDetect

We can look at the release date and the number of stars on GitHub.

Now you compare features and select the best. But this is where I used to get stuck. I used to find 3 very good libraries and couldn’t figure out why one was better than the other. One may have more stars simply because it’s older. That doesn’t mean much, does it?

See downloads of each library

For a data-driven comparison, downloads can be taken into account. Google’s BigQuery cloud platform can be used to analyze the downloads of each library’s Python package, to make an informed decision about which one to use.

See how many people downloaded this package.

Example of how to see total downloads of the tensorflow-serving-api library using Google BigQuery

You can also count downloads over a shorter period, let’s say the last month or so.
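
As a rough sketch of the kind of query involved, the helper below builds a BigQuery Standard SQL string that counts downloads for one package, optionally limited to the last N days. The table name `bigquery-public-data.pypi.file_downloads`, the `file.project` column, and the helper name `build_download_query` are assumptions for illustration and are not taken from the original example, so check the current public dataset before relying on them.

```python
import re
from typing import Optional

# Assumption: the public PyPI downloads table on BigQuery.
PYPI_TABLE = "bigquery-public-data.pypi.file_downloads"


def build_download_query(package: str, days: Optional[int] = None) -> str:
    """Build a SQL string that counts downloads of one PyPI package.

    If `days` is given, only downloads from the last `days` days are counted.
    """
    # Accept only characters that appear in package names, so the name
    # can be embedded in the SQL string safely.
    if not re.fullmatch(r"[A-Za-z0-9._-]+", package):
        raise ValueError(f"unexpected package name: {package!r}")

    query = (
        "SELECT COUNT(*) AS num_downloads\n"
        f"FROM `{PYPI_TABLE}`\n"
        f"WHERE file.project = '{package}'"
    )
    if days is not None:
        query += (
            "\n  AND DATE(timestamp) >= "
            f"DATE_SUB(CURRENT_DATE(), INTERVAL {int(days)} DAY)"
        )
    return query


# Total downloads, then downloads for roughly the last month.
print(build_download_query("tensorflow-serving-api"))
print(build_download_query("tensorflow-serving-api", days=30))
```

The printed query can be pasted into the BigQuery console. Restricting the date range also reduces how much data the query scans.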

See how many open source projects are using them

This one is pretty simple and obvious, but most people don’t do it. See how many people are using this package on GitHub.

Projects using the libraries can be found on GitHub with the search term “filename:requirements.txt library-name”. This searches “requirements.txt”, a file that many Python projects on GitHub include to list their dependencies, for the library.

This can be used to measure how many projects use the library. It was not available for DeepDetect as it is not a Python package. So sad 😢

Search in GitHub to see how many people are using these libraries.
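
To repeat the search for several libraries without retyping it, the search term above can be turned into a URL. This is a minimal sketch: the helper name `github_requirements_search_url` is made up for illustration, and the `q` and `type=code` query parameters reflect how GitHub's search page is assumed to accept a code search, which may change over time.

```python
from urllib.parse import urlencode


def github_requirements_search_url(library_name: str) -> str:
    """Return a GitHub code-search URL for requirements.txt files
    that mention `library_name`."""
    # Same search term as in the text: filename:requirements.txt library-name
    search_term = f"filename:requirements.txt {library_name}"
    # urlencode escapes the colon and the space for use in a URL.
    return "https://github.com/search?" + urlencode(
        {"q": search_term, "type": "code"}
    )


print(github_requirements_search_url("tensorflow-serving-api"))
```

Opening the printed URL in a browser shows the matching files. The result count is the number to compare across libraries.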

SEO helps

If a page has a higher SEO ranking, that suggests the library is used and visited by more people. That can help you make rational decisions as well.

Page Authority from MOZ.com, which is based on backlinks and link metrics, can also be used to see which of these libraries are more popular. It takes into account how many sites link to these libraries.

Conclusion

To make a rational choice, check whether a library is used by a large number of people. Selecting one requires more than just looking up blog posts and GitHub stars. It also depends on how big the ecosystem is.

Another thing I can suggest is to see how active the library’s community is. If it’s dead, probably don’t use it, because when you hit a bug you will be stuck for a very long time. You can check this by seeing how many questions are tagged with the library on Stack Overflow.
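
The Stack Overflow check can also be scripted. The sketch below assumes the public Stack Exchange API's `/tags/{tags}/info` endpoint (version 2.3) and that its JSON response has an `items` list whose entries carry `name` and `count` fields. The helper names and the tag names in the usage line are hypothetical, so substitute the tags your shortlisted libraries actually use.

```python
from typing import Dict, Iterable
from urllib.parse import quote

# Assumption: Stack Exchange API root and version.
API_ROOT = "https://api.stackexchange.com/2.3"


def tag_info_url(tags: Iterable[str]) -> str:
    """Build the URL that asks for info on several tags at once.

    The endpoint takes tags separated by semicolons.
    """
    joined = quote(";".join(tags), safe=";")
    return f"{API_ROOT}/tags/{joined}/info?site=stackoverflow"


def question_counts(response: dict) -> Dict[str, int]:
    """Map each tag name to its question count from a decoded JSON response."""
    return {item["name"]: item["count"] for item in response.get("items", [])}


# Hypothetical tag names, used only to show the URL shape.
print(tag_info_url(["tensorflow-serving", "mxnet"]))
```

Fetching the printed URL with any HTTP client and passing the decoded JSON to `question_counts` gives one number per tag to compare.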

Thank you for reading 😅. If you liked the article, give it a clap 👏.

Do consider buying me a coffee at https://www.buymeacoffee.com/gautham if you loved the article.

If you wish to have a chat, DM me at https://twitter.com/gauthamzzz.

I am a Master’s student at the Indian Institute of Information Technology, Allahabad. My website is http://gauthamzz.com.


Published by HackerNoon on 2018/11/09