TopRatedTech

Tech News, Gadget Reviews, and Product Analysis for Affiliate Marketing


Can we make AI less power-hungry? These researchers are working on it.

For the tests, his group used setups with Nvidia's A100 and H100 GPUs, the ones most commonly used in data centers today, and measured how much energy they consumed running various large language models (LLMs), diffusion models that generate images or videos from text input, and many other types of AI systems.

The largest LLM included in the leaderboard was Meta's Llama 3.1 405B, an open-source chat-based AI with 405 billion parameters. It consumed 3352.92 joules of energy per request running on two H100 GPUs. That's around 0.93 watt-hours, significantly less than the 2.9 watt-hours quoted for ChatGPT queries. These measurements confirmed the improvements in the energy efficiency of hardware. Mixtral 8x22B was the largest LLM the team managed to run on both the Ampere and Hopper platforms. Running the model on two Ampere GPUs resulted in 0.32 watt-hours per request, compared to just 0.15 watt-hours on one Hopper GPU.
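The watt-hour figures above follow directly from the joule measurements reported by the leaderboard; a minimal sketch of the conversion (one watt-hour is 3600 joules):

```python
def joules_to_watt_hours(joules: float) -> float:
    """Convert energy in joules to watt-hours (1 Wh = 3600 J)."""
    return joules / 3600.0

# Llama 3.1 405B: 3352.92 J per request on two H100 GPUs, per the leaderboard.
llama_405b_wh = joules_to_watt_hours(3352.92)
print(round(llama_405b_wh, 2))  # ~0.93 Wh per request
```

The same conversion underlies the Mixtral 8x22B comparison: the per-request joule totals on the Ampere and Hopper runs work out to roughly 0.32 Wh and 0.15 Wh respectively.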

What remains unknown, however, is the performance of proprietary models like GPT-4, Gemini, or Grok. The ML Energy Initiative team says it is extremely hard for the research community to start coming up with solutions to the energy efficiency problems when we don't even know exactly what we're dealing with. We can make estimates, but Chung insists they must be accompanied by error-bound analysis. We don't have anything like that today.

The most pressing issue, according to Chung and Chowdhury, is the lack of transparency. "Companies like Google or OpenAI have no incentive to talk about power consumption. If anything, releasing actual numbers would harm them," Chowdhury said. "But people should understand what is actually happening, so maybe we should somehow coax them into releasing some of those numbers."

Where the rubber meets the road

"Energy efficiency in data centers follows a trend similar to Moore's law, only operating at a very large scale, instead of on a single chip," Nvidia's Harris said. The power consumption per rack, a unit used in data centers housing between 10 and 14 Nvidia GPUs, is going up, he said, but the performance-per-watt is getting better.



