5 Best Practices for Integrating with External APIs

Written by vgukasov | Published 2021/11/11
Tech Story Tags: go | golang | api | api-integration | prometheus | grafana | batching | api-rate-limiting


The modern world requires fast and cheap delivery of value to the end user. That’s why IT companies test dozens of hypotheses per week. For fast experiments, we usually prefer a ready-made solution over building our own, so there is always a need to integrate with external services via their APIs. Today I’d like to talk about best practices for these integrations.

#1 Timeouts

Timeouts are a crucial part of your fault tolerance. You should set one for every external call. Otherwise, an external service can hang and your application will hang with it. For example, in Go your code would look something like this:

import (
  "net/http"
  "time"
)

type Service struct {
  httpClient *http.Client
}

func NewService() *Service {
  httpClient := &http.Client{
    Timeout: 5 * time.Second, // set up your own timeout
  }

  return &Service{
    httpClient: httpClient,
  }
}

func (s *Service) CallAPI(req *http.Request) error {
  res, err := s.httpClient.Do(req)
  if err != nil {
    ...
  }

  ...
}
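The client-wide timeout above applies to every call. When a single call needs its own deadline, a per-request context is one option. The following is a minimal sketch under stated assumptions: `callWithTimeout` and `timedOut` are hypothetical names, and the slow service is simulated with a local `httptest` server rather than a real API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// callWithTimeout sends a GET request that is cancelled once timeout elapses.
// (hypothetical helper, not part of the original article)
func callWithTimeout(client *http.Client, url string, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return fmt.Errorf("build request: %w", err)
	}

	res, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("do request: %w", err)
	}
	defer res.Body.Close()

	return nil
}

// timedOut starts a local test server that waits serverDelayMs before
// responding and reports whether a call with timeoutMs hit the deadline.
func timedOut(serverDelayMs, timeoutMs int) bool {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(time.Duration(serverDelayMs) * time.Millisecond)
	}))
	defer srv.Close()

	err := callWithTimeout(srv.Client(), srv.URL, time.Duration(timeoutMs)*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("slow server timed out:", timedOut(200, 20))
	fmt.Println("fast server timed out:", timedOut(0, 2000))
}
```

The context deadline and `http.Client.Timeout` can be combined; whichever expires first ends the request.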

#2 Fallback Logic

Any external service (even Google or Amazon) can go down. You should plan fallback logic for 5xx or otherwise unexpected responses. For instance, you can return a default response object or run a fallback job.

import (
  "fmt"
  "io"
  "log"
  "net/http"
  "time"
)

type Service struct {
  httpClient *http.Client
}

type CallResponse struct {
  Payload string
}

func NewService() *Service {
  httpClient := &http.Client{
    Timeout: 5 * time.Second,
  }

  return &Service{
    httpClient: httpClient,
  }
}

func (s *Service) CallAPI(req *http.Request) (CallResponse, error) {
  res, err := s.httpClient.Do(req)
  if err != nil {
    return CallResponse{}, fmt.Errorf("do request: %w", err)
  }
  // always close the body so the underlying connection can be reused
  defer res.Body.Close()

  content, err := io.ReadAll(res.Body)
  if err != nil {
    return CallResponse{}, fmt.Errorf("read response body: %w", err)
  }

  // gracefully handle server-side failures (5xx) with a default response
  if res.StatusCode >= http.StatusInternalServerError {
    log.Printf("external service returned bad response. Code: %d. Content: %s\n", res.StatusCode, string(content))

    return CallResponse{Payload: "default"}, nil
  }

  ...
}
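A default object is not the only possible fallback. One alternative "fallback job" is to serve the last successful response while the external service is down. The sketch below is an illustration only: `cachedFallback`, `up`, and `down` are hypothetical names, and the API call is replaced by plain functions so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// cachedFallback remembers the last successful payload and serves it
// when a later call fails. (hypothetical type, for illustration)
type cachedFallback struct {
	mu   sync.Mutex
	last string
	ok   bool // true once at least one call has succeeded
}

// Get runs fetch. On success it caches and returns the payload. On failure
// it returns the last good payload, or def if nothing has been cached yet.
func (c *cachedFallback) Get(fetch func() (string, error), def string) string {
	payload, err := fetch()

	c.mu.Lock()
	defer c.mu.Unlock()

	if err == nil {
		c.last, c.ok = payload, true
		return payload
	}
	if c.ok {
		return c.last
	}
	return def
}

// up and down stand in for a healthy and a failing external call.
func up() (string, error)   { return "fresh", nil }
func down() (string, error) { return "", errors.New("external service unavailable") }

func main() {
	c := &cachedFallback{}
	fmt.Println(c.Get(down, "default")) // nothing cached yet, so the default is used
	fmt.Println(c.Get(up, "default"))   // successful call, payload is cached
	fmt.Println(c.Get(down, "default")) // failure, served from the cache
}
```

Whether stale data is acceptable depends on the use case; for some data a fixed default is the safer choice.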

#3 Batching

Every extra API call adds overhead for both you and the external system. Pore over the API docs to find batch methods that fit your needs.

For example, suppose one call that creates a single item takes 20 ms. Creating 10 items synchronously would then take 200 ms (in practice even more, because external services usually start throttling your requests under load). With a batch API method, you can create all 10 items in a single request that takes 50 ms.
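Batch endpoints usually cap the number of items per request, so a long list has to be split first. Here is a minimal sketch of that step, assuming a cap of 10 items; `chunk` is a hypothetical helper, and the actual batch call is left as a comment because it depends on the API in question.

```go
package main

import "fmt"

// chunk splits items into consecutive batches of at most size elements.
// It returns nil when size is not positive.
func chunk(items []string, size int) [][]string {
	if size <= 0 {
		return nil
	}

	var batches [][]string
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		batches = append(batches, items[start:end])
	}
	return batches
}

func main() {
	items := make([]string, 25)
	for i := range items {
		items[i] = fmt.Sprintf("item-%d", i)
	}

	for i, batch := range chunk(items, 10) {
		// a single batch API request for the whole slice would go here
		fmt.Printf("batch %d: %d items\n", i, len(batch))
	}
}
```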

As your request count grows, the difference becomes much more prominent, and batching can save you a tremendous amount of execution time. If the API has no batch method, try to parallelize your requests.
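When parallelizing, it helps to cap the number of in-flight requests so you don't overwhelm the external service. The following sketch uses a buffered channel as a semaphore; `runParallel` and `fakeCreate` are hypothetical names, and the 20 ms sleep only imitates the single-item call from the example above.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runParallel calls fn for every item with at most limit calls in flight
// and returns the resulting errors in input order.
func runParallel(items []string, limit int, fn func(string) error) []error {
	if limit < 1 {
		limit = 1
	}

	errs := make([]error, len(items))
	sem := make(chan struct{}, limit) // counting semaphore
	var wg sync.WaitGroup

	for i, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit calls are already running
		go func(i int, item string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot
			errs[i] = fn(item)       // each goroutine writes only its own index
		}(i, item)
	}

	wg.Wait()
	return errs
}

// fakeCreate imitates a 20 ms API call that creates one item.
func fakeCreate(item string) error {
	time.Sleep(20 * time.Millisecond)
	return nil
}

func main() {
	items := []string{"a", "b", "c", "d", "e", "f", "g", "h", "i", "j"}

	start := time.Now()
	errs := runParallel(items, 5, fakeCreate)
	fmt.Printf("%d calls finished in %v\n", len(errs), time.Since(start))
}
```

With a limit of 5, the ten simulated calls run in two waves, so the total time is much closer to 40 ms than to the 200 ms of the sequential version.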

#4 Rate limiting

Most services have API limits. Investigate them and calculate whether your request volume fits within them. In Go, the go.uber.org/ratelimit library can help you control the number of API calls.

import (
  "net/http"
  "time"

  "go.uber.org/ratelimit"
)

type Service struct {
  limiter    ratelimit.Limiter
  httpClient *http.Client
}

func NewService() *Service {
  httpClient := &http.Client{
	Timeout: 5 * time.Second,
  }

  return &Service{
    httpClient: httpClient,
    limiter:    ratelimit.New(10), // 10 is the max RPS that external API can handle
  }
}

func (s *Service) CallAPI(req *http.Request) error {
  s.limiter.Take() // blocks until the next call fits within the max RPS

  res, err := s.httpClient.Do(req)
  if err != nil {
    ...
  }

  ...
}
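If adding a dependency is not an option, the same blocking behavior can be approximated with the standard library. This is a simplified stand-in, not the library's algorithm: `tickLimiter` and `elapsedMs` are hypothetical names, and a `time.Ticker` simply releases one call per interval.

```go
package main

import (
	"fmt"
	"time"
)

// tickLimiter lets one call through per interval.
// (stdlib-only sketch, much simpler than a production rate limiter)
type tickLimiter struct {
	ticker *time.Ticker
}

// newTickLimiter creates a limiter that allows roughly rps calls per second.
func newTickLimiter(rps int) *tickLimiter {
	if rps < 1 {
		rps = 1
	}
	return &tickLimiter{ticker: time.NewTicker(time.Second / time.Duration(rps))}
}

// Take blocks until the next tick, i.e. until the next call is allowed.
func (l *tickLimiter) Take() { <-l.ticker.C }

// Stop releases the ticker's resources.
func (l *tickLimiter) Stop() { l.ticker.Stop() }

// elapsedMs measures how long n rate-limited calls take, in milliseconds.
func elapsedMs(rps, n int) int64 {
	l := newTickLimiter(rps)
	defer l.Stop()

	start := time.Now()
	for i := 0; i < n; i++ {
		l.Take()
		// the external API call would go here
	}
	return time.Since(start).Milliseconds()
}

func main() {
	fmt.Printf("5 calls at 100 RPS took about %d ms\n", elapsedMs(100, 5))
}
```

Unlike a token bucket, this sketch never allows bursts, and the first call also waits one interval.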

#5 Metrics and alerts

Even if an external service returns successful responses, it can still have occasional performance issues. For cases like these, you should set up metrics and alerts on your side so that you can see when it happens and react quickly.

On my team, we prefer widely used solutions such as Prometheus and Grafana:

import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

var (
	// ExternalServiceHTTPCallHistogram observes http call duration in seconds
	ExternalServiceHTTPCallHistogram = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Namespace: "namespace",
			Subsystem: "subsystem",
			Name:      "external_service_call_duration_in_seconds",
			Help:      "http call duration in seconds to an external service",
		}, []string{"path", "method"},
	)
)

func init() {
	// register the histogram so that it is exposed for scraping
	prometheus.MustRegister(ExternalServiceHTTPCallHistogram)
}

type Service struct {
  httpClient *http.Client
}

func NewService() *Service {
  httpClient := &http.Client{
	Timeout: 5 * time.Second,
  }

  return &Service{
    httpClient: httpClient,
  }
}

func (s *Service) CallAPI(req *http.Request) error {
  // save the request starting time point
  start := time.Now()
  // do the API call
  res, err := s.httpClient.Do(req)
  // calculate how much time the request took
  spentSeconds := time.Since(start).Seconds()
  // record the measurement so that Prometheus can scrape it
  ExternalServiceHTTPCallHistogram.WithLabelValues(req.URL.Path, req.Method).Observe(spentSeconds)
  ...
}

With the data in Prometheus, we can set up Grafana alerts that fire when an external service is down or its responses are taking too long.


Written by vgukasov | Senior SWE at Akma Trading
Published by HackerNoon on 2021/11/11