Technology in the Oil and Gas Industry: An MLOps Perspective

The oil and gas (O&G) industry generated approximately $3.3 trillion in revenue in 2019 and is one of the largest industries in the world. Its upstream, midstream, and downstream processes constantly generate large amounts of data, and the industry depends heavily on sophisticated technologies to draw new business insights from that data, for example to prevent equipment malfunctions and improve operational efficiency.

In recent times, the industry has become a trend-setter in technology and is moving towards automation, and hence towards a dependence on artificial intelligence (AI).

Surprising, right?

But the reasons are clear:

(a) The O&G industry is turning to innovation and technology in a bid to boost efficiency and buoy profits.

(b) The global spotlight on sustainable practices that reduce emissions and water consumption has pushed O&G companies to seek new ways to optimize upstream operations and eliminate practices that waste time and money.

The tech revolution already started in one of the largest industries in the world. In Alberta, the heartland of the O&G industry in Canada, industry leaders are beginning to modernize business models through a savvy combination of data analytics, AI and machine learning — and the results are revolutionary. — jwnenergy.com

Big players like Royal Dutch Shell continuously look for new ways to keep up with the growing demand for dwindling resources. Shell’s software development team keeps its software up to date and enhanced with new features. Shell’s software revolution shows in collaboration in the cloud, in “software as a service” (SaaS) solutions, in faster-moving environments, and in a stronger competitive edge gained through DevOps.

So the question is: what next?

How can the industry keep up and stay abreast of developing technologies, with an eye on the future?

The oil and gas industry faces an unprecedented shift in its workforce as automation and artificial intelligence (AI) continue to transform the way companies operate. Now more than ever, oil and gas organizations are using technology to drive down production costs to improve margins as they fight prolonged drops in oil prices. -EY report, 2020

A simple Google search reveals how O&G companies are increasingly developing AI and automation tools for drilling operations, facility operations, trades and equipment, marine, and geology.

The Gap

The increasing dependence of O&G enterprises on deep learning and machine learning models requires a robust data pipeline. The upstream business generates voluminous sets of data. The data size usually reaches the petabyte (= 1024 terabytes) or exabyte (= 1024 petabytes) scale. The complete set of data consists of approximately 40,000 files.

Solution: A robust and scalable MLOps pipeline.

MLOps: The future of Machine Learning

MLOps is the standardization and streamlining of machine learning lifecycle management. It refers to the concept of automating the lifecycle of machine learning models from data preparation and model building to production deployment and maintenance.

MLOps is not just a machine learning platform or technology; it requires a change in mindset that brings the best practices of software development to the development of machine learning models.

In this blog post, I introduce the concepts and benefits of MLOps and show how enterprises, especially in O&G, can effectively use machine learning models in production.

There are three key reasons that managing machine learning life cycles at scale is challenging:

(a) Not only is data constantly changing, but business needs shift as well.

(b) Results need to be continually relayed back to the business to ensure that the model in production, running on production data, aligns with expectations.

(c) The machine learning life cycle involves people from the business, data science, and IT teams, and none of these groups use the same tools or share the same fundamental skills to serve as a baseline for communication.

MLOps borrows heavily from DevOps, which streamlines the practice of software changes and updates. Both center on robust automation and trust between teams.

One critical difference between MLOps and DevOps…

Deploying software code into production is fundamentally different from deploying machine learning models into production. While software code is relatively static, the data that machine learning models depend on is always changing.

Pushing machine learning models into production without MLOps infrastructure is risky because the performance of a machine learning model can often only be evaluated in the production environment.

Prediction models are only as good as the data they are trained on, which means the training data must be a good reflection of the data encountered in the production environment. If the production environment changes, then the model performance is likely to decrease rapidly.
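
One common way to notice that production data has drifted away from the training data is to compare the distribution of a feature in both samples. The sketch below computes the Population Stability Index (PSI), a standard drift score. It is an illustration rather than part of the pipeline described in this post: the function name, the ten quantile bins, the synthetic data, and the 0.25 alert threshold (a widely used rule of thumb) are all assumptions.

```python
import bisect
import math
import random


def population_stability_index(reference, production, bins=10):
    """Return the PSI between a reference (training) and a production sample."""
    ordered = sorted(reference)
    # Interior bin edges at the reference quantiles, so each bin holds
    # roughly the same share of the reference data.
    edges = [ordered[(len(ordered) * i) // bins] for i in range(1, bins)]

    def fractions(sample):
        counts = [0] * bins
        for value in sample:
            counts[bisect.bisect_right(edges, value)] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(count / len(sample), 1e-6) for count in counts]

    ref_frac = fractions(reference)
    prod_frac = fractions(production)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref_frac, prod_frac))


# Synthetic stand-ins for one input feature (assumed data, not real measurements).
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
production_stable = [rng.gauss(0.0, 1.0) for _ in range(5000)]
production_shifted = [rng.gauss(1.5, 1.0) for _ in range(5000)]

psi_stable = population_stability_index(training, production_stable)
psi_shifted = population_stability_index(training, production_shifted)

DRIFT_THRESHOLD = 0.25  # assumed rule-of-thumb alert level
print(f"stable PSI:  {psi_stable:.4f}, drift: {psi_stable > DRIFT_THRESHOLD}")
print(f"shifted PSI: {psi_shifted:.4f}, drift: {psi_shifted > DRIFT_THRESHOLD}")
```

In an MLOps setting, a score like this would typically be computed on a schedule, and crossing the threshold would trigger an alert or a retraining job.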

AI engineers are constantly focusing on implementing state-of-the-art AI models and MLOps technology to make both upstream and downstream O&G operations flexible and scalable.

Seismic Interpretation ML models

Seismic interpretation plays a crucial role in various disciplines, such as civil engineering, geohazard assessment, and energy exploration. It is a core process in O&G enterprises.

Here I demonstrate a state-of-the-art algorithm for interpreting seismic sections and give an overview of how ML models are integrated into the MLOps pipeline.

The MLOps pipeline can be broadly divided into four tasks:

Data Collection: Seismic sections are noisy, semi-structured datasets of seismic events collected from a specific region. The time-section data are in SEG-Y format. Detailed ETL pipelines process the data from the SEG-Y format into NumPy arrays. A nice article on ETL processing can be found here.
Data Aggregation: This step combines data engineering and data science knowledge, with the goal of ensuring the quality, security, and integrity of the data.
Model Training: Deep learning is an increasingly popular subset of machine learning. Deep learning models are built using neural networks.
Deployment: Once the model is trained, you need to evaluate the results. It’s important to understand what happens when a model gets deployed. Once deployed, you access the model through a RESTful API.
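
The transform step at the end of the ETL stage can be sketched as follows. This is only an illustration under stated assumptions: the SEG-Y file is taken to be already loaded into a 3-D NumPy array ordered as (inline, crossline, time sample), a synthetic random volume stands in for real data, and the helper name `get_inline_section` and the zero-based inline index are hypothetical.

```python
import numpy as np


def get_inline_section(volume: np.ndarray, inline: int) -> np.ndarray:
    """Return one standardized inline section, shaped (time samples, crosslines)."""
    if volume.ndim != 3:
        raise ValueError("expected a 3-D volume: (inline, crossline, time)")
    if not 0 <= inline < volume.shape[0]:
        raise IndexError(f"inline {inline} is outside 0..{volume.shape[0] - 1}")
    # Transpose so that time runs down the rows, as in a displayed section.
    section = volume[inline].T.astype(np.float32)
    # Standardize amplitudes to zero mean and unit variance for the network.
    std = section.std()
    return (section - section.mean()) / (std if std > 0 else 1.0)


# Synthetic stand-in for a loaded volume: 50 inlines, 80 crosslines, 128 samples.
rng = np.random.default_rng(0)
volume = rng.normal(loc=3.0, scale=2.0, size=(50, 80, 128))

section = get_inline_section(volume, inline=25)
print(section.shape, section.dtype)
```

Keeping the extraction and normalization in one small function makes it easy to apply exactly the same preprocessing at training time and at serving time.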

FastAPI is a modern, high-performance Python web framework that is well suited to building RESTful APIs. I deploy our state-of-the-art model using FastAPI; the API returns accuracy metrics for the predicted section of a given inline number.

Step 1: Import FastAPI

import uvicorn
from typing import Dict, List
from fastapi import FastAPI, Depends
from pydantic import BaseModel
from model import test  # local project module that provides predict_section

Step 2: Create a FastAPI “instance”

app = FastAPI(
    title="EarthAdaptNet Image Segmentation",
    description="Obtain semantic segmentation images (seismic facies) of the input seismic image via EarthAdaptNet implemented in PyTorch.",
)

Step 3: Create and define a path operation

@app.get('/get_metrics/{inline}')
def get_predict_section(inline: int):
    # Run the model on one inline section and collect its segmentation metrics.
    PA, CA, MCA, FWIoU, MIoU, IoU = test.predict_section(inline)
    # Per-class values are keyed by class index (six classes, 0 to 5).
    return SegmentationResponse(Pixel_Accuracy=PA, Class_Accuracy=dict(zip(range(6), list(CA))),
                                Mean_Class_Accurcy=MCA, Frequency_Weighted_IoU=FWIoU, Mean_IoU=MIoU,
                                IoU=dict(zip(range(6), list(IoU))))
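
The path operation returns a `SegmentationResponse`, whose definition is not shown in this post. A minimal sketch of what such a Pydantic model could look like follows; the field names and types are inferred from the response body shown further down, so this is an assumption rather than the actual class used in the project.

```python
from typing import Dict

from pydantic import BaseModel


class SegmentationResponse(BaseModel):
    """Hypothetical response schema for the /get_metrics/{inline} endpoint."""

    Pixel_Accuracy: float
    Class_Accuracy: Dict[int, float]  # class index -> accuracy
    Mean_Class_Accurcy: float  # spelling kept to match the API's field name
    Frequency_Weighted_IoU: float
    Mean_IoU: float
    IoU: Dict[int, float]  # class index -> intersection over union


# Example with made-up values for a two-class case.
example = SegmentationResponse(
    Pixel_Accuracy=0.9,
    Class_Accuracy={0: 0.95, 1: 0.85},
    Mean_Class_Accurcy=0.9,
    Frequency_Weighted_IoU=0.8,
    Mean_IoU=0.75,
    IoU={0: 0.8, 1: 0.7},
)
print(example.Mean_IoU)
```

Passing `response_model=SegmentationResponse` to `@app.get(...)` would additionally make FastAPI validate the returned data and include the schema in the generated API docs.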

Step 4: To run the app

cd Desktop/PETAI_APP
uvicorn main:app --reload

URL with the inline number as a path parameter: http://127.0.0.1:8000/get_metrics/25

Response body of the GET request for inline 25:

{
  "Pixel_Accuracy": 0.7156722888870242,
  "Class_Accuracy": {
    "0": 0.9334791845017281,
    "1": 0.7392144602967089,
    "2": 0.8806730185476577,
    "3": 0.6101912875743449,
    "4": 0.15720375106564366,
    "5": 1
  },
  "Mean_Class_Accurcy": 0.7201269503310139,
  "Frequency_Weighted_IoU": 0.5497397470771686,
  "Mean_IoU": 0.4587748502388708,
  "IoU": {
    "0": 0.5905325443786982,
    "1": 0.5601257644930657,
    "2": 0.7114118141601777,
    "3": 0.539856360662732,
    "4": 0.153496115427303,
    "5": 0.19722650231124808
  }
}
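
As a quick sanity check on this response, the two mean metrics should equal the averages of their per-class values. The sketch below verifies that using only the standard library and the numbers from the response body; the variable and helper names are illustrative.

```python
import json

# Response body copied from the GET request for inline 25.
response_text = """
{
  "Pixel_Accuracy": 0.7156722888870242,
  "Class_Accuracy": {"0": 0.9334791845017281, "1": 0.7392144602967089,
                     "2": 0.8806730185476577, "3": 0.6101912875743449,
                     "4": 0.15720375106564366, "5": 1},
  "Mean_Class_Accurcy": 0.7201269503310139,
  "Frequency_Weighted_IoU": 0.5497397470771686,
  "Mean_IoU": 0.4587748502388708,
  "IoU": {"0": 0.5905325443786982, "1": 0.5601257644930657,
          "2": 0.7114118141601777, "3": 0.539856360662732,
          "4": 0.153496115427303, "5": 0.19722650231124808}
}
"""
metrics = json.loads(response_text)


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


mean_class_accuracy = mean(metrics["Class_Accuracy"].values())
mean_iou = mean(metrics["IoU"].values())
weakest_class = min(metrics["IoU"], key=metrics["IoU"].get)

# Both averages agree with the reported means up to floating-point rounding.
assert abs(mean_class_accuracy - metrics["Mean_Class_Accurcy"]) < 1e-9
assert abs(mean_iou - metrics["Mean_IoU"]) < 1e-9
print(f"class with the lowest IoU: {weakest_class}")
```

For inline 25, class 4 has both the lowest IoU (about 0.153) and the lowest class accuracy (about 0.157), so it is the class that pulls the mean metrics down the most.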

I welcome feedback and constructive criticism. Please visit us at PETAI to learn more about our research on AI and MLOps. I can be reached through LinkedIn. The code for this study can be found here.

References: