Tackling Environmental Issues with Software: How Remote GPU Reduces the Impact of GPUs

Written by stego | Published 2021/11/15

TL;DR: Demand for accelerated computing brings a large environmental impact. Remote GPU software abstracts the GPU from the application host, sharply reducing that impact by enabling true sharing, wider resource pooling, and higher utilization of GPUs.

Software Lets Us Make Less GPU, But Use More

Understanding how the things that we buy and use affect our environment is not a simple exercise. Unless you're foraging for food and making tools by hand, everything you consume has a complex environmental impact that's probably larger than you think.

For example, that little piece of sashimi you're eating in SOHO might have been caught thousands of miles out in the Pacific and brought back to Japan (the boat burning fuel both ways), flown on an airplane (burning fuel), and delivered to the restaurant in a truck (burning fuel). Everyone involved - fisherman, airline crew, truck driver, chef, wait staff - had to get to and from work. And if you're eating an apex predator like a tuna, there's an additional set of downstream ecosystem impacts.

Computer chips are no different. In fact, manufacturing at such a staggering level of microscopic precision comes with a surprisingly large investment of energy and other resources, plus other ecological impacts. And beyond fabrication, chips must be distributed, installed, and ultimately powered inside computers - all activities that impact the environment.

As the juggernauts of the Metaverse, Machine Learning, Artificial Intelligence, and Gaming push forward, GPU acceleration will become even more critical and widespread - with corresponding environmental pressures mounting.

The Impact of GPU Manufacturing

The environmental impacts of modern chip fabrication, or "fab," can be grouped as follows (note: I don't mean to pick on TSMC in particular, but they are the largest chip manufacturer):

  • Construction of fab facilities: a full accounting of the environmental impact of a massive new factory is beyond this article, but the projected cost of one modern TSMC fab is close to $20B - you can imagine the scope of the environmental impact of such a construction project.

  • Carbon footprint of powering those facilities: Leading manufacturers like TSMC are making strides toward renewable energy, but the majority of their electricity still comes from traditional fossil fuel power plants. And these chip fabs use a shocking (yes, I know…) amount of power - TSMC uses about 15 TWh annually, more than the entire country of Paraguay.

  • Other (non-CO2) greenhouse gases: According to the EPA, "Semiconductor manufacturing processes use high [global-warming-potential] fluorinated compounds including perfluorocarbons (e.g., CF4, C2F6, C3F8 and c-C4F8), hydrofluorocarbons (CHF3, CH3F and CH2F2), nitrogen trifluoride (NF3) and sulfur hexafluoride (SF6)…[up to] 80 percent of the fluorinated GHGs pass through the manufacturing tool chambers unreacted and are released into the air."

  • Water usage: TSMC uses 150,000 tons of water per day in Taiwan alone - equivalent to the household consumption of 500,000 residents. In a monsoon climate this isn't necessarily a problem until a drought arrives, but it could be a lasting issue in places like Arizona.

  • Toxic chemicals: The fab process comprises hundreds of carefully orchestrated physical and chemical steps. Most of the chemicals used are protected as trade secrets and do not even need to be disclosed. The ones we do know about include a vast array of carcinogenic, mutagenic, and teratogenic metals, solvents, and polymers. In addition to the health impacts on workers inside, the resulting toxic waste ultimately leaves factories one way or another, all too often ending up in soil and groundwater.

  • Mining of precious metals: The 7.8B people on Earth generate an average of 13 pounds of e-waste each per year. Only 20% of this is recycled, so most chip manufacturing depends on newly mined materials. Mining the metals required for chip production scars landscapes, and the chemicals used in refining end up in soil, groundwater, and surface water. Demand for metals in electronics keeps rising, and these impacts are rising with it.

Most of this impact is attributable to CPUs rather than discrete GPUs. But despite their smaller numbers, GPUs are much larger and heavier, thus carrying a higher per-unit impact - and with 41M discrete GPUs shipped in 2020, the impact is sizeable.

Other Environmental Impacts of GPUs

  • Distribution: New GPUs must, of course, be delivered around the world by ships, planes, and trucks.

  • Installation: Even if all you need is acceleration capacity, you can't just plug a GPU into your network - GPUs need to run inside computers. In the data center, that means "racking and stacking" whole servers to house the GPUs, with corresponding hardware that comes with its own environmental impact.

  • Operation: GPUs draw a lot of power - and they run hot, so even more power is needed to keep them within operating temperature limits. They also consume a surprising amount of electricity when idle but "on" (ready for a workload). Overall, the data centers they run in require constant cooling and other environmental controls that consume as much power as the servers themselves.

Water, power, fossil fuels, greenhouse gases, toxic chemicals - GPU manufacturing, delivery, and operation come with the whole suite of environmental impacts. And it's ironic, for example, that the massive computing capacity that goes into things like climate simulation is also a substantial contributor to climate change.

Imagine instead that we could do way more with way fewer GPUs. It might sound too good to be true.

Make Less, Use More

GPU utilization today is woefully low, averaging less than 15%, as we've covered previously. Such dire underutilization is, without question, a colossal waste of resources - but it's also needlessly destructive to the environment.

If we could find a way to raise utilization to, say, 90%, we could do six times the computing with the GPUs we have, do the same amount of computing with 1/6 the number of GPUs, or strike some balance in between. That is an enormous amount of leverage.
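The leverage arithmetic can be checked with a minimal Python sketch. The function name is made up for illustration, and the 15% and 90% figures are simply this article's inputs, not measurements.

```python
def utilization_leverage(current: float, target: float) -> float:
    """Return how many times more work the same GPUs do at the target utilization."""
    if not (0 < current <= 1 and 0 < target <= 1):
        raise ValueError("utilization must be a fraction in (0, 1]")
    return target / current


# Article's illustrative figures: 15% today, 90% as the goal.
leverage = utilization_leverage(0.15, 0.90)

print(f"Same GPUs: about {leverage:.0f}x the computing")
print(f"Same computing: about {1 / leverage:.2f} of the GPUs")
```

Any point between those two extremes is also available: a fleet half the size, for instance, would still deliver roughly three times today's computing under the same assumptions.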


Let's imagine two futures five years from now where the world requires 10x the accelerated computing capacity it has today:

Scenario 1: Status Quo (15% Utilization)

Let's assume a ~20% annual increase in computing capacity per GPU, based on the recent slowing of Moore's Law. Over five years, this gives us about 2.5x of our 10x (1.2⁵ ≈ 2.49). We still need 4x on top of that 2.5x to reach our 10x.

So we will need 4x as many GPU cards, with a corresponding 4x increase in nasty environmental impacts. Let's be generous and say that innovation toward cleaner and more efficient manufacturing, mining, delivery systems, etc., reduces these impacts by 10% (a 1.1x effect).

10x compute increase / 2.5x "Moore's Law" / 1.1x cleantech innovation → 3.6x

In Scenario 1, we're still left with a ~3.6x environmental impact to get our 10x compute capacity.
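For readers who want to replay the Scenario 1 arithmetic, here is a small Python sketch. The constant names are invented for illustration, and the values are the article's own rounded assumptions rather than measured data.

```python
# Scenario 1 (status quo), using the article's rounded factors.
COMPUTE_GROWTH = 10.0   # target: 10x today's accelerated computing capacity
PER_GPU_GAIN = 2.5      # the article's rounding of 1.2 ** 5
CLEANTECH_GAIN = 1.1    # assumed gain from cleaner manufacturing and delivery

# How many times more GPUs are needed once per-GPU gains are accounted for.
gpu_multiplier = COMPUTE_GROWTH / PER_GPU_GAIN

# Environmental impact multiplier after the cleantech improvement.
impact_status_quo = gpu_multiplier / CLEANTECH_GAIN

print(f"Unrounded five-year per-GPU gain: {1.2 ** 5:.2f}x")
print(f"GPUs needed: {gpu_multiplier:.1f}x")
print(f"Environmental impact: {impact_status_quo:.1f}x")
```

Changing any of the three constants shows how sensitive the result is to each assumption; the per-GPU gain dominates.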

Scenario 2: Breakthrough (90% Utilization)

  • "Moore's Law": 2.5x, as above
  • Innovations to reduce environmental impacts: 1.1x, as above
  • The breakthrough innovation raises utilization from 15% to 90%: 6x
  • That 6x can't be applied completely to all environmental impacts, because raising utilization also raises the energy required for operation - but we do know that most of the environmental impact of computing comes from manufacturing. Let's say 1/3 of the environmental impact comes from operation, so our 6x utilization drives a 2x effect in the opposite direction (1/3 × 6 = 2), which enters the calculation as a 0.5x factor

10x compute increase / 2.5x "Moore's Law" / 1.1x cleantech innovation / 6x utilization / 0.5x operational impact → 1.2x

1.2x! In Scenario 2, incredibly, we can meet this 10x accelerated computing future while barely increasing our environmental impact.
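The Scenario 2 chain can be replayed the same way. This is only a sketch of the article's back-of-the-envelope model: the names are hypothetical, and the one-third operational share is the assumption stated above, not a measured figure.

```python
# Scenario 2 (breakthrough), using the article's rounded factors.
COMPUTE_GROWTH = 10.0            # target: 10x today's capacity
PER_GPU_GAIN = 2.5               # the article's rounding of 1.2 ** 5
CLEANTECH_GAIN = 1.1             # assumed cleaner manufacturing and delivery
UTILIZATION_GAIN = 0.90 / 0.15   # 15% -> 90% utilization, about 6x
OPERATIONAL_SHARE = 1 / 3        # assumed share of impact that comes from operation

# Busier GPUs draw more power: 1/3 of the impact scaled by 6x gives a 2x
# penalty, which the article writes as dividing by 0.5.
operational_penalty = OPERATIONAL_SHARE * UTILIZATION_GAIN

impact_breakthrough = (
    COMPUTE_GROWTH / PER_GPU_GAIN / CLEANTECH_GAIN / UTILIZATION_GAIN
) * operational_penalty

# Compute delivered per unit of environmental impact, relative to today.
compute_per_impact = COMPUTE_GROWTH / impact_breakthrough

print(f"Environmental impact: {impact_breakthrough:.1f}x")
print(f"Compute per unit of impact: {compute_per_impact:.2f}x")
```

Under these assumptions the impact multiplier comes out near 1.2x, about a third of the status quo figure.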

Sounds great - but how?

This might sound too good to be true, but we don't need to bend the laws of physics to achieve this stunning result of roughly 8x (10x compute with 1.2x environmental impact) - we just need to unlock an additional 75 percentage points of utilization (from 15% to 90%) in GPU capacity that already exists.

This is not a trivial undertaking. If it were, someone would have already done it. But the breakthrough innovation has arrived.

By using software to abstract the PCIe connection between a GPU and its application host, two fundamental changes to GPU usage - both necessary to drive the 6x increase in utilization - become possible:

  1. Accessing a GPU remotely over a network. Removing the physical connection between the application host and the GPU enables wider and more efficient pooling of resources and lets any host dynamically attach and detach remote GPUs.

  2. Dynamically sharing a GPU. Today's limited splitting options just create smaller fixed partitions, which are themselves underutilized. General-purpose software abstraction is much more efficient: it enables a new sharing paradigm where multiple clients (consumers of GPU) can keep a single GPU working harder.
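To make the pooling idea concrete, here is a toy Python model. Everything in it is hypothetical: the host names, the hourly demand, and the helper functions are invented for illustration and say nothing about how any real remote GPU product is implemented. It only models hosts attaching to whole GPUs from a shared pool (the first change above); finer-grained sharing of a single GPU is not modeled.

```python
from typing import Dict, List

# Hypothetical demand: 1 means the host needs a GPU during that hour.
demand: Dict[str, List[int]] = {
    "host_a": [1, 0, 0, 0, 0, 0],
    "host_b": [0, 1, 0, 0, 0, 0],
    "host_c": [0, 0, 1, 1, 0, 0],
    "host_d": [0, 0, 0, 0, 1, 0],
}


def dedicated_gpus(demand: Dict[str, List[int]]) -> int:
    """One GPU physically installed in every host."""
    return len(demand)


def pooled_gpus(demand: Dict[str, List[int]]) -> int:
    """GPUs needed if hosts attach on demand: the peak concurrent demand."""
    return max(sum(hour) for hour in zip(*demand.values()))


def utilization(demand: Dict[str, List[int]], gpu_count: int) -> float:
    """Fraction of available GPU-hours that are actually busy."""
    busy_hours = sum(sum(schedule) for schedule in demand.values())
    total_hours = len(next(iter(demand.values())))
    return busy_hours / (gpu_count * total_hours)


for label, count in (("dedicated", dedicated_gpus(demand)),
                     ("pooled", pooled_gpus(demand))):
    print(f"{label}: {count} GPU(s), utilization {utilization(demand, count):.0%}")
```

In this contrived schedule no two hosts need a GPU at the same time, so one pooled GPU serves all four. Real workloads overlap, so the gain would depend on how bursty and how correlated the demand is.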

We've already outlined how our team has done this, deployed our solution with customers, and envisioned a world where remote GPU is widely adopted.

As we move into an uncertain environmental future, the last thing we need is rampant waste. We hope you'll join us in our mission to bring the world to 10x compute capacity with only a marginal increase in environmental impact - driven by the simple idea of doing more with what we already have.

Steve Golik is co-founder of Juice Labs, a startup with a vision to make computing power flow as easily as electricity.


Published by HackerNoon on 2021/11/15