DeepMind repurposes game-playing AIs to optimize code and infrastructure

DeepMind's Alpha series of AIs has provided a few world-firsts, like AlphaGo beating the world champion at Go. Now these AIs, originally trained to play games, have been put to work on other tasks and are showing a surprising facility for them.

Originally, AlphaGo was trained on human gameplay. AlphaGo Zero then learned only by playing against itself, and AlphaZero did the same while also mastering chess and shogi. MuZero did all that and more without even being told the rules of the game; being told them, if you think about it, may limit the way a system "thinks" about how to accomplish its task.

At Google, a system called Borg manages task assignment at data centers — basically parsing requests and allocating resources at light speed so the enormous tech company can do work and research at scale. But Borg "uses manually-coded rules for scheduling tasks to manage this workload. At Google scale, these manually-coded rules cannot consider the variety of ever-changing workload distributions," creating inefficiencies that are as logically inevitable as they are difficult to track.
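To make that limitation concrete, here is a minimal sketch of a hand-coded placement rule. It is not Borg's actual logic, which the source does not describe; the Machine class, the first_fit function, and the toy workload are all hypothetical, chosen only to show how a fixed rule can strand capacity.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical machine with a single resource: free CPU cores."""
    name: str
    cpu_free: float
    tasks: list = field(default_factory=list)


def first_fit(tasks, machines):
    """Place each (task_name, cpu_needed) on the first machine with room.

    The rule is fixed in code and never adapts to the workload it sees.
    Returns the names of tasks that could not be placed.
    """
    unplaced = []
    for task_name, cpu_needed in tasks:
        for machine in machines:
            if machine.cpu_free >= cpu_needed:
                machine.cpu_free -= cpu_needed
                machine.tasks.append(task_name)
                break
        else:
            # No machine had enough free CPU for this task.
            unplaced.append(task_name)
    return unplaced


# Hypothetical workload: 8 cores of demand against 8 cores of capacity.
machines = [Machine("m1", 4.0), Machine("m2", 4.0)]
tasks = [("a", 2.0), ("b", 1.0), ("c", 3.0), ("d", 2.0)]
unplaced = first_fit(tasks, machines)

print(unplaced)                          # tasks left waiting
print([m.cpu_free for m in machines])    # cores left idle per machine
```

In this toy case the rule leaves task "d" waiting while one core sits idle on each machine, even though pairing "a" with "d" and "b" with "c" would have used every core. A learned policy is meant to find placements like that from the workload data rather than from a rule someone wrote in advance.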

But AlphaZero, exposed to Borg data, began to identify patterns in data center usage and incoming tasks, and produced new ways to predict and manage that load. When applied in production, it "reduce[d] the amount of underused hardware by up to 19%," which sounds a bit cherry-picked but even if half true is a huge improvement "at Google scale."

Similarly, MuZero was put to work looking at YouTube streams to see if it could help with compression, a complex software domain where small optimizations yield large results. It was reportedly able to reduce the bitrate of videos by 4%, which again at YouTube's scale is pretty major. MuZero is even getting into the weeds of compression, like frame grouping.

AlphaDev, a relative of AlphaZero, likewise improved on the standard sorting algorithms in the library Google was using. It also produced a better hashing function for small byte ranges (9 to 16 bytes), which reportedly ran about 30% faster.
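For a sense of what there is to optimize in a sorting routine, consider a sort for exactly three items. DeepMind's published AlphaDev results reportedly concern short, fixed-length routines of this kind, tuned at the level of individual instructions. The sketch below is not AlphaDev's routine; it is a textbook three-input sorting network, with hypothetical function names, shown only to illustrate that such a sort is a fixed sequence of compare-and-swap steps where shaving a single step matters.

```python
from itertools import permutations


def compare_exchange(values, i, j):
    """Swap values[i] and values[j] if they are out of order."""
    if values[i] > values[j]:
        values[i], values[j] = values[j], values[i]


def sort3(values):
    """Sort a 3-element list in place with a fixed sequence of comparisons."""
    compare_exchange(values, 0, 1)
    compare_exchange(values, 1, 2)
    compare_exchange(values, 0, 1)
    return values


# Check the network against every ordering of three distinct items.
all_sorted = all(sort3(list(p)) == [1, 2, 3] for p in permutations([1, 2, 3]))
print(all_sorted)
```

Because the sequence of steps never depends on the data, a routine like this can be checked exhaustively and compared instruction by instruction, which is what makes it a tractable target for a game-style search.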

These improvements aren't going to change the world on their own; incremental changes to developer systems are being made all the time. What's interesting is that an AI that developed a problem-solving method focused on winning games was able to learn and generalize its approach in totally unrelated fields like compression.

There's still a long, long way to go before we have "general-purpose AI," whatever that really means, but it's promising that there is a certain amount of flexibility in the ones we have already created. Not just because we can apply them to different fields, but because it suggests flexibility and robustness within the fields they already work in.